Enhance the performance of data access layers



In J2EE applications we usually access enterprise data through JDBC, and JDBC used badly hurts the performance of the whole system. This article draws on John Goodson's article "Performance Tips for The Data Tier (JDBC)", and I hope it helps our development work. It covers four areas: using the database metadata methods appropriately, retrieving only the data the application needs, selecting the functions that optimize performance, and managing connections and data updates.

1. Use the database metadata methods appropriately

Compared with other JDBC methods, the metadata methods execute slowly, so they should be used sparingly and carefully.

1.1. Cache the result sets returned by metadata methods. Because calling a metadata method generates a result set at considerable cost, the result set it produces should be cached rather than regenerated by repeating the query; this improves JDBC performance. For example, call getTypeInfo() once in the application, save the result set, and reuse it afterwards (see the sketch after section 1.2 below).

1.2. Avoid null arguments and search patterns. Passing null arguments or search patterns to a metadata method produces a time-consuming query, and because unneeded data travels across the network, it also increases network traffic and lowers the performance of the whole system. Since metadata methods are slow anyway, call them as efficiently as possible by supplying non-null arguments. Yet code like the first of the following two calls often appears in our applications:

  ResultSet rs = dbmd.getTables(null, null, "WSTable", null);

  ResultSet rs = dbmd.getTables("cat1", "johng", "WSTable", new String[] {"TABLE"});

In the first getTables() call, the application presumably only wants to know whether the table WSTable exists, but the JDBC driver must interpret the request quite differently: return all tables, views, system tables, synonyms, temporary tables, and aliases named "WSTable" in any catalog and any schema. The second getTables() call reflects much more precisely what the application needs to know; the driver interprets it as: return only the tables named "WSTable" owned by "johng" in the catalog "cat1". Obviously the driver can process the second request far more efficiently than the first. The more information you provide to a metadata method, the better the accuracy and the performance of the result.
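As a rough sketch of the caching advice in 1.1 (the class and field names here are made up for illustration), the helper below calls getTypeInfo() once and hands out the cached copy on every later request:

import java.sql.*;
import java.util.ArrayList;
import java.util.List;

public class TypeInfoCache {
    private static List<String> typeNames;   // cached result of getTypeInfo()

    // Call getTypeInfo() only once; later callers receive the cached copy.
    static synchronized List<String> getTypeNames(Connection conn) throws SQLException {
        if (typeNames == null) {
            typeNames = new ArrayList<>();
            DatabaseMetaData dbmd = conn.getMetaData();
            try (ResultSet rs = dbmd.getTypeInfo()) {
                while (rs.next()) {
                    typeNames.add(rs.getString("TYPE_NAME"));
                }
            }
        }
        return typeNames;
    }
}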

1.3. Determine table characteristics with a dummy query. Avoid using getColumns() to find out the characteristics of a table; prepare a dummy query and use getMetaData() instead. Consider an application that lets the user choose which columns to select. Should the application call getColumns() to return the column information, or prepare a dummy query and call getMetaData()?

Case 1: the getColumns() method

  ResultSet rs = dbmd.getColumns(..., "UnknownTable", ...);
  // This call to getColumns() generates a query against the system catalogs,
  // possibly a join, which must be prepared, executed, and returned as a result set.
  rs.next();
  String cname = rs.getString(4);
  // ...
  // The result column information for UnknownTable has now been obtained.

Case 2: the getMetaData() method

  // Prepare a dummy query.
  PreparedStatement ps = conn.prepareStatement(
      "SELECT * FROM UnknownTable WHERE 1 = 0");
  // The query is never executed on the server - only prepared.
  ResultSetMetaData rsmd = ps.getMetaData();
  int numcols = rsmd.getColumnCount();
  // ...
  int ctype = rsmd.getColumnType(n);
  // The result column information has now been obtained.

In both cases a query is sent to the server, but in case 1 the query must be prepared and executed, the result description information formulated, and a result set sent back to the client. In case 2 only a simple query has to be prepared and only the result description information is returned. Clearly, case 2 is the better-performing approach.

To complicate the discussion a little, consider a database that does not support preparing a SQL statement natively. The performance of case 1 does not change, while the cost of case 2 rises slightly because the dummy query must be evaluated instead of merely prepared. But since the WHERE clause of the query always evaluates to false, the statement executes without generating any result rows or touching table data, so case 2 still outperforms case 1.

In short, always use result set metadata to obtain column information such as column names, column data types, and precision and scale. Use getColumns() only when the requested information cannot be obtained from result set metadata (for example, column default values).

2. Retrieve only the data you need

2.1. Do not retrieve long data unless it is necessary. Retrieving long data ties up the network and degrades performance. Most users do not need to see long data; when a user does need it, the application can retrieve it at that point. Yet our code is full of queries such as: SELECT * FROM ...

If the selected table contains long data columns, the performance of such a query can be very poor. A better pattern is to fetch the long columns only when the user actually asks for them, as sketched below.
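A minimal sketch of that idea, assuming a hypothetical employees table with a long photo column: the listing query touches only the short columns, and the photo is fetched by a separate query only when the user asks to see it.

import java.sql.*;

public class LazyLongData {
    // List employees without touching the long "photo" column (hypothetical schema).
    static void listEmployees(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name FROM employees")) {
            while (rs.next()) {
                System.out.println(rs.getLong("id") + " " + rs.getString("name"));
            }
        }
    }

    // Fetch the long column only when the user actually wants to see it.
    static byte[] loadPhoto(Connection conn, long id) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement("SELECT photo FROM employees WHERE id = ?")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes("photo") : null;
            }
        }
    }
}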

Besides, do you really need every column in the table? If not, why transfer them over the network and waste its bandwidth? For example, look at the JDBC code below:

  ResultSet rs = stmt.executeQuery(
      "SELECT * FROM employees WHERE ssid = '999-99-2222'");
  rs.next();
  String name = rs.getString(4);

JDBC is not intelligent. When you write code like this, it does not know which columns you actually need, so it returns all of them. It therefore pays, when developing, to list exactly the columns you need in the SELECT statement. If the employees table contains a long data column such as a photo, the performance of this query is likely to suffer badly. Although getClob() and getBlob() support retrieving such long data, not every database supports them. So remember: retrieve long data only at the moment you actually need it.

2.2. Reduce the size of the retrieved data. Sometimes long data must be retrieved. Even then, most users probably do not need to see 100 KB (or more) of text on screen. To reduce network traffic and improve performance, you can keep the amount of retrieved data within a manageable range by calling setMaxRows(), setMaxFieldSize(), and the driver-dependent setFetchSize() (a short sketch follows section 2.4). Another way to reduce the size of the retrieved data is to reduce the number of columns. If the driver lets you define the packet size, use the smallest packet size that meets your needs. Remember: return only the rows and columns you need. If you return five columns when you only need two, performance suffers, especially when the unneeded columns contain long data.

2.3. Choose the right data types. Retrieving and sending certain data types is expensive. When designing the database schema, choose the data types that can be processed most efficiently. For example, integer data is processed faster than floating-point and decimal data. Floating-point data is defined according to database-specific internal formats, usually compressed, and it must be decompressed and converted into a different format before the database wire protocol can handle it.

2.4. Retrieving result sets. Because databases offer only limited support for scrollable cursors, most JDBC drivers cannot implement them natively. Unless you are certain that the database supports scrollable result sets, do not call rs.last() and rs.getRow() to find out how many rows have been returned: calling rs.last() forces the driver to retrieve all of the data over the network just to reach the last row. Instead, either count the rows by iterating through the result set, or obtain the count by submitting a SELECT statement with COUNT. In general, do not write code that depends on the number of rows in a result set, because to obtain that number the driver must read every row.
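Here is a small sketch of the points in 2.2 and 2.4, again against an assumed employees table: capping the rows and field sizes the driver returns, and counting rows with COUNT(*) instead of calling rs.last().

import java.sql.*;

public class ResultSizeTips {
    static void capResultSize(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(100);          // never return more than 100 rows
            stmt.setMaxFieldSize(4096);    // truncate very large column values at 4 KB
            stmt.setFetchSize(50);         // hint: fetch 50 rows per network round trip
            try (ResultSet rs = stmt.executeQuery("SELECT id, name FROM employees")) {
                while (rs.next()) {
                    // process only the columns that were asked for
                }
            }
        }
    }

    // Get the number of rows without forcing the driver to fetch them all.
    static long countEmployees(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM employees")) {
            rs.next();
            return rs.getLong(1);
        }
    }
}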

3. Select the functions that optimize performance

3.1. Use parameter markers as stored procedure arguments. When calling a stored procedure, use parameter markers for the arguments whenever possible rather than literal values. A JDBC driver can execute a stored procedure call either as any other SQL query or by optimizing it into a remote procedure call (RPC) to the server. Executing the call as an ordinary SQL query forces the database server to parse the statement, validate the argument types, and convert the arguments into the correct data types, which is obviously not the most efficient way to make the call. The SQL statement is always sent to the database server as a string, for example "{call getCustName (12345)}". In that case, even though the programmer intended the argument of getCustName to be an integer, it is still passed to the database inside a string. The server parses the statement, isolates the single argument value 12345, and converts the string "12345" into an integer before executing the procedure as SQL. Calling the stored procedure through an RPC to the database server avoids the overhead of the SQL string entirely.

Case 1: the driver cannot optimize the call into a server-side RPC. The call is parsed, the argument types are validated, and the arguments are converted to the correct types before execution.

  CallableStatement cstmt = conn.prepareCall("{call getCustName (12345)}");
  ResultSet rs = cstmt.executeQuery();

Case 2: the driver can optimize the call into a server-side RPC. Because the application avoids the overhead of literal arguments and the driver can call the stored procedure in the database via RPC, execution time is greatly shortened.

  CallableStatement cstmt = conn.prepareCall("{call getCustName (?)}");
  cstmt.setLong(1, 12345);
  ResultSet rs = cstmt.executeQuery();

JDBC optimizes performance differently for different usage patterns, so choose between PreparedStatement and Statement objects according to how the statement will be used. If a SQL statement will be executed only once, use a Statement object; if it will be executed two or more times, use a PreparedStatement. Sometimes a statement pool is used to improve performance. With a statement pool, if a query will be executed once and probably never again, use a Statement; if the query will be executed rarely but may run again during the lifetime of the statement pool, use a PreparedStatement; under the same circumstances without a statement pool, use a Statement.

3.2. Use batches rather than repeated PreparedStatement executions for bulk updates. To update a large amount of data, an application usually prepares an INSERT statement and executes it many times, producing a large number of network round trips. To reduce the number of JDBC calls and improve performance, you can send several operations to the database at once using the addBatch() method of the PreparedStatement object. For example, compare case 1 and case 2 below.

Case 1: executing a PreparedStatement multiple times

  PreparedStatement ps = conn.prepareStatement(
      "INSERT INTO employees VALUES (?, ?, ?)");
  for (int n = 0; n < 100; n++) {
      ps.setString(1, name[n]);
      ps.setLong(2, id[n]);
      ps.setInt(3, salary[n]);
      ps.executeUpdate();
  }

Case 2: using a batch

  PreparedStatement ps = conn.prepareStatement(
      "INSERT INTO employees VALUES (?, ?, ?)");
  for (int n = 0; n < 100; n++) {
      ps.setString(1, name[n]);
      ps.setLong(2, id[n]);
      ps.setInt(3, salary[n]);
      ps.addBatch();
  }
  ps.executeBatch();

In case 1, a PreparedStatement is used to execute an INSERT statement many times: 100 inserts require 101 network round trips, one to prepare the statement and another 100 to perform each insert. When addBatch() is used, as in case 2, only two round trips are needed, one to prepare the statement and one to execute the batch. Although batching costs more database CPU, the performance gained from the reduced network round trips is substantial. Remember: the key to good JDBC performance is reducing the network traffic between the JDBC driver and the database server.

3.3. Choose the right cursor. Choosing the right cursor improves both the flexibility and the performance of the application. This section summarizes the performance characteristics of three kinds of cursors.

A forward-only cursor provides excellent performance for reading all the rows of a table sequentially; nothing retrieves table data faster. However, it cannot be used when the application must process rows non-sequentially.

For applications that require a high level of concurrency control on the database and the ability to scroll both forward and backward, an insensitive cursor provided by the JDBC driver is ideal. The first request to an insensitive cursor usually fetches all of the rows (or, when the driver reads "lazily", some of the rows) and stores them on the client. The first request is therefore very slow, especially when long data is retrieved; subsequent requests need no network traffic at all (or, with a lazy driver, only limited traffic) and are very fast. Because the first request is slow, an insensitive cursor should not be used for a single request for one row of data, and because memory is easily exhausted when long data is returned, developers should also avoid insensitive cursors in that situation. Some implementations of insensitive cursors cache the data in a temporary table on the database server and avoid these problems, but most cache it locally in the application. (The sketch below shows how each cursor type is requested through the JDBC API.)
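For reference, this is how the three cursor types are requested through the standard JDBC API (the sensitive type is described next); whether a driver supports a given type natively or emulates it, and what it actually hands back, varies by driver, and the employees table here is only an assumed example.

import java.sql.*;

public class CursorTypes {
    static void demo(Connection conn) throws SQLException {
        // Forward-only, read-only: the fastest way to read every row sequentially.
        try (Statement fwd = conn.createStatement(
                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // ... use fwd for straight-through reads
        }

        // Scroll-insensitive: scrollable, but most drivers satisfy it by caching the
        // rows on the client, so the first fetch can be slow and memory-hungry.
        try (Statement ins = conn.createStatement(
                 ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY)) {
            // ... use ins when you must scroll both ways over a modest result set
        }

        // Scroll-sensitive (often keyset-driven): sees concurrent changes, but each
        // scroll may cost a network round trip.
        try (Statement sen = conn.createStatement(
                 ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
             ResultSet rs = sen.executeQuery("SELECT id, name FROM employees")) {
            // A driver that cannot provide the requested type downgrades it; check.
            System.out.println("actual result set type: " + rs.getType());
        }
    }
}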

A sensitive cursor, sometimes called a keyset-driven cursor, uses identifiers that already exist in the database, such as a ROWID. As you scroll through the result set, the data corresponding to those identifiers is retrieved. Because each request generates network traffic, performance can be very slow; however, returning non-sequential rows does not affect performance further. To illustrate, consider an application that returns 1000 rows of data. At execute time, or when the first row is requested, the JDBC driver does not execute the SELECT statement supplied by the application. Instead, it replaces the SELECT list of the query with a key identifier, for example ROWID, executes the modified query, and retrieves all 1000 key values from the database for its own use. Each subsequent request from the application for a row then goes to the JDBC driver, which looks up the key value for that row in its local cache, constructs a query with a WHERE clause similar to "WHERE ROWID = ?", executes this modified query, and retrieves the single result row from the server. Sensitive cursors are the preferred scrollable cursor model for dynamic situations, when the application cannot afford to cache the data the way an insensitive cursor does.

3.4. Use the get methods effectively. JDBC provides many ways to retrieve data from a result set, such as getInt(), getString(), and getObject(). getObject() is the most generic, but it gives the worst performance when non-default mappings are not specified, because the driver must do extra processing to determine the type of the value being retrieved and choose an appropriate mapping. So always use the get method specific to the data type. To improve performance further, supply the column number of the column being retrieved, for example getString(1), getLong(2), and getInt(3), rather than the column name. If column numbers are not specified, network traffic is unaffected, but the cost of conversion and lookup rises. For example, suppose you use getString("foo"): the driver may have to convert the column identifier foo to uppercase (if necessary) and compare it against every column name in the column list; supplying the column number saves a large part of that processing. Suppose you have a result set of 15 columns and 100 rows and the column names are not included in the result set, and you are interested in three columns: employeeName (a string), employeeNumber (a long integer), and salary (an integer). If you specify getString("employeeName"), getLong("employeeNumber"), and getInt("salary"), each column name must be converted to the matching case of the name in the database metadata, and the lookups increase accordingly. If instead you specify getString(1), getLong(2), and getInt(15), performance improves significantly.

3.5. Retrieve automatically generated keys. Many databases have a hidden column (also called a pseudo-column) for every row in a table. Because the pseudo-column typically describes the physical disk address of the data, using it in a query is the fastest way to access the row. Before JDBC 3.0, an application could only retrieve the value of the pseudo-column by executing a SELECT statement immediately after inserting the data. For example:

  // Insert the row.
  int rowCount = stmt.executeUpdate(
      "INSERT INTO LocalGeniusList (name) VALUES ('Karen')");
  // Now get the disk address - ROWID - of the newly inserted row.
  ResultSet rs = stmt.executeQuery(
      "SELECT ROWID FROM LocalGeniusList WHERE name = 'Karen'");

This approach has two major drawbacks. First, retrieving the pseudo-column requires a separate query to be sent over the network to the server. Second, because the table may have no primary key, the query condition may not identify the row uniquely; in that case multiple pseudo-column values are returned and the application may not be able to tell which one belongs to the most recently inserted row. An optional feature of the JDBC 3.0 specification lets the application retrieve the automatically generated key information of a row at the moment the row is inserted.

For example:

  int rowCount = stmt.executeUpdate(
      "INSERT INTO LocalGeniusList (name) VALUES ('Karen')",
      Statement.RETURN_GENERATED_KEYS);   // insert the row and return the key
  ResultSet rs = stmt.getGeneratedKeys(); // the key is automatically available

Even when the table has no primary key, this gives the application a value that uniquely identifies the row in the fastest possible way. The ability to retrieve pseudo-columns while accessing data gives JDBC developers both flexibility and better performance.

4. Manage connections and data updates

4.1. Manage connections. The quality of connection management directly affects application performance. Optimize your application by using one connection with multiple Statement objects rather than multiple connections, and avoid connecting to the data source again after the initial connection has been established. A bad coding habit is to connect, execute some SQL, and disconnect, over and over. A Connection object can have multiple Statement objects associated with it; because a Statement object is simply memory that stores information about a SQL statement, one connection can manage several SQL statements. In addition, a connection pool can improve performance significantly, especially for applications that connect across a network or through the web. A connection pool lets connections be reused: closing a connection does not close the physical link to the database but returns the connection to the pool, and when an application requests a connection, an active one is reused from the pool, avoiding the network I/O of creating a new connection.

4.2. Manage commits in transactions. Committing a transaction is slow because of the disk I/O and potential network I/O it requires, which is why Connection.setAutoCommit(false) is often used to turn automatic commit off. What does a commit actually involve? The database server must flush every data page that contains updated or new data back to disk. This is usually a sequential write to a log file, but it is disk I/O nonetheless. By default, auto-commit is on when you connect to a data source, and because of the disk I/O each operation then requires, auto-commit mode usually weakens performance. Moreover, most databases do not provide a native auto-commit mode; for such servers, the JDBC driver must send both a COMMIT and a BEGIN TRANSACTION to the server for every operation. Although using transactions helps performance, do not overdo it: holding locks on rows for a long time to keep other users away from them reduces throughput. Committing transactions at short intervals maximizes concurrency (a short sketch follows section 4.3).

4.3. Choose the right transaction model. Many systems support distributed transactions, that is, transactions that span multiple connections. Because of the logging and the network I/O among all the components involved (the JDBC driver, the transaction monitor, and the database system), distributed transactions are much slower than ordinary transactions. Avoid distributed transactions unless you really need them, and use local transactions whenever possible. Note that many Java application servers provide a default transaction behavior that uses distributed transactions. For the best system performance, design the application to work under a single Connection object and avoid distributed transactions unless they are necessary.
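A rough sketch of the advice in 4.1 and 4.2, with hypothetical table and column names: one connection is reused for a single prepared statement, auto-commit is turned off, and the whole batch of work is committed once.

import java.sql.*;

public class ManualCommit {
    static void applyRaises(Connection conn, long[] ids, int[] raises) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);          // group the updates into one transaction
        try (PreparedStatement ps = conn.prepareStatement(
                 "UPDATE employees SET salary = salary + ? WHERE id = ?")) {
            for (int i = 0; i < ids.length; i++) {
                ps.setInt(1, raises[i]);
                ps.setLong(2, ids[i]);
                ps.executeUpdate();
            }
            conn.commit();                  // one commit for all of the work
        } catch (SQLException e) {
            conn.rollback();                // keep the data consistent on failure
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);  // restore the previous setting
        }
    }
}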
4.4. Use the updateXXX methods. Although programmatic updates are not suitable for every kind of application, developers should try to use programmatic updates and deletes, that is, update data with the updateXXX() methods of the ResultSet object. This lets a developer change data without building complex SQL statements. To push the change to the database, call the updateRow() method before moving the cursor off the row being updated in the result set.
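A minimal sketch of such a programmatic update, assuming an employees table and a driver that supports updatable result sets:

import java.sql.*;

public class ProgrammaticUpdate {
    static void giveRaise(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement(
                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE);
             ResultSet rs = stmt.executeQuery(
                 "SELECT id, salary FROM employees WHERE salary < 50000")) {
            while (rs.next()) {
                rs.updateInt("salary", rs.getInt("salary") + 1000); // change the column value
                rs.updateRow();   // must be called before moving off the current row
            }
        }
    }
}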
