Principles of optimizing SQL in Oracle

xiaoxiao  2021-03-06

1. A statement being parsed must be exactly identical to the statement already in the shared pool for the parsed version to be reused.
2. Keep variable names as consistent as possible.
3. Use outer joins judiciously.
4. Minimize multi-level nesting.
5. More useful
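Why identical statement text matters can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, and the table emp and the helper names are hypothetical, introduced only for illustration. The point it shows is general: building SQL with literal values produces a different statement text for every value, while a bind placeholder keeps the text the same, which is what lets a statement cache such as the shared pool match it.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate statement reuse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])


def literal_statements(ids):
    """Embed each value in the SQL text: every text differs from the others."""
    return {f"SELECT name FROM emp WHERE id = {i}" for i in ids}


def bound_statements(ids):
    """Use a placeholder: the SQL text is identical for every value."""
    return {"SELECT name FROM emp WHERE id = ?" for _ in ids}


def fetch_names(ids):
    """Run the single parameterized statement once per value."""
    sql = "SELECT name FROM emp WHERE id = ?"
    return [conn.execute(sql, (i,)).fetchone()[0] for i in ids]


print(len(literal_statements([1, 2, 3])))  # 3 distinct statement texts
print(len(bound_statements([1, 2, 3])))    # 1 statement text
```

In Oracle the placeholder would be a named bind variable such as :id instead of ?, which is also why consistent variable names matter for matching.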

General steps for optimizing statements:
1. Tune the SGA so that its use is optimal.
2. Optimize the SQL statement itself; tools include EXPLAIN, SQL TRACE, and others.
3. Adjust the database structure.
4. Adjust the project structure.

Experience in writing statements:
1. Use indexes for queries on large tables.
2. Use IN, EXISTS, and the like sparingly.
3. Use set operations.

1. For columns in queries on large tables, avoid conversions such as TO_CHAR, TO_DATE, and TO_NUMBER.
2. Where an index exists, use it, and write the indexed conditions first. Create additional indexes if necessary.
3. Avoid full table scans; make the conditions as restrictive as possible so that the data being queried is found quickly.

How to make your SQL run faster
Bank Changchun Branch, Computer Department, Ren Liang

---- When using SQL, people tend to fall into a trap: they focus on whether the result is correct and overlook the performance differences that can exist between different ways of writing the same query. These differences are especially noticeable in large or complex database environments, such as online transaction processing (OLTP) or decision support systems (DSS). In my work I have found that poor SQL usually comes from inappropriate index design, insufficient join conditions, and non-optimizable WHERE clauses. Once these are properly optimized, running speed improves significantly. I summarize each of the three aspects below:

---- To make the explanation more concrete, the running time of the SQL in every example was measured; times of no more than 1 second are shown as (<1 second).

---- Test environment
---- Host: HP LH II
---- Clock speed: 330 MHz
---- Memory: 128 MB
---- Operating system: OperServer 5.0.4
---- Database: Sybase 11.0.3

First, unreasonable index design
---- Example: the table record has 620000 rows. Consider how the following SQL statements run under different indexes:
---- 1. A non-clustered index on date

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (25 seconds)
select date, sum(amount) from record group by date (55 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh') (27 seconds)
---- Analysis:
---- date has a large number of duplicate values. Under a non-clustered index, the data is stored on the data pages in physically random order, so a range search must perform a table scan to find all the rows within the range.

---- 2. A clustered index on date

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (14 seconds)
select date, sum(amount) from record group by date (28 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh')
---- Analysis:
---- Under a clustered index, the data is physically ordered on the data pages and duplicate values are stored together. A range search can therefore locate the start and end of the range first and scan only the data pages within it, which avoids a large scan and speeds up the query.

---- 3. A composite index on place, date, amount

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (26 seconds)
select date, sum(amount) from record group by date (27 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh')
---- Analysis:
---- This is an unreasonable composite index because its leading column is place. The first and second SQL statements do not reference place, so they cannot use the index. The third SQL statement does use place, and every column it references is contained in the composite index, so the index covers the query and it runs very fast.

---- 4. A composite index on date, place, amount

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (<1 second)
select date, sum(amount) from record group by date (11 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh') (<1 second)
---- Analysis:
---- This is a reasonable composite index. With date as the leading column, every SQL statement can use the index, and the first and third statements are covered by the index, so performance is optimal.
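The effect of the leading column can be checked on a small scale. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of Sybase, an empty table with the same column names, and a hypothetical helper query_plan. It asks SQLite for its plan for the first query under each of the two composite indexes.

```python
import sqlite3


def query_plan(index_columns, sql):
    """Create the table with one composite index and return SQLite's plan text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE record (date TEXT, place TEXT, amount INTEGER)")
    conn.execute(f"CREATE INDEX idx_record ON record ({index_columns})")
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in plan_rows)


RANGE_QUERY = (
    "SELECT count(*) FROM record "
    "WHERE date > '19991201' AND date < '19991214' AND amount > 2000"
)

# Leading column place: the date range cannot be used to seek into the index.
plan_place_first = query_plan("place, date, amount", RANGE_QUERY)
# Leading column date: the date range narrows the index search directly.
plan_date_first = query_plan("date, place, amount", RANGE_QUERY)

print(plan_place_first)
print(plan_date_first)
```

With date first, the plan should report a SEARCH on the covering index restricted by the date range; with place first, it should report a SCAN. The exact wording of the plan text varies between SQLite versions, and other engines may plan differently.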

---- 5. Summary:

---- The index created by default is a non-clustered index, but it is not always the best choice; sound index design rests on analyzing and predicting the various queries. In general:

---- 1. For columns that have many duplicate values and often appear in range queries (between, >, <, >=, <=) or in group by, consider a clustered index;
---- 2. When several columns are often accessed together and each contains duplicate values, consider a composite index;
---- 3. A composite index should, as far as possible, cover the key queries, and its leading column must be the most frequently used column.

Second, insufficient join conditions
---- Example: the table card has 7896 rows and a non-clustered index on card_no; the table account has 191122 rows and a non-clustered index on account_no. Consider how two SQL statements run under different join conditions:

select sum(a.amount) from account a, card b where a.card_no = b.card_no (20 seconds)
---- Change the SQL to:
select sum(a.amount) from account a, card b where a.card_no = b.card_no and a.account_no = b.account_no (<1 second)
---- Analysis:
---- Under the first join condition, the best query plan is to make account the outer table and card the inner table, using the index on card. The number of I/Os can be estimated with the following formula:

---- 22541 pages of the outer table account + (191122 rows of the outer table account * 3 pages to search in the inner table card for each outer row) = 595907 I/Os

---- Under the second join condition, the best query plan is to make card the outer table and account the inner table, using the index on account. The number of I/Os can be estimated with the following formula:

---- 1944 pages of the outer table card + (7896 rows of the outer table card * 4 pages to search in the inner table account for each outer row) = 33528 I/Os

---- As can be seen, the truly best plan is executed only when the join conditions are complete.
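The two estimates above follow the same pattern: read the outer table once, then probe the inner table once per outer row. A minimal sketch of that arithmetic, using the page and row counts given in the text (the function name nested_loop_io is hypothetical):

```python
def nested_loop_io(outer_pages, outer_rows, inner_pages_per_lookup):
    """Estimated I/Os: scan the outer table once, then probe the inner table per outer row."""
    return outer_pages + outer_rows * inner_pages_per_lookup


# First join condition: account is the outer table, card the inner table.
account_outer = nested_loop_io(outer_pages=22541, outer_rows=191122, inner_pages_per_lookup=3)
# Second join condition: card is the outer table, account the inner table.
card_outer = nested_loop_io(outer_pages=1944, outer_rows=7896, inner_pages_per_lookup=4)

print(account_outer)  # 595907
print(card_outer)     # 33528
print(round(account_outer / card_outer, 1))  # 17.8
```

By this estimate the second plan needs roughly 18 times fewer I/Os, which is in line with the drop from 20 seconds to under 1 second.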

---- Summary:

---- 1. Before a multi-table operation is actually executed, the query optimizer lists several possible join plans based on the join conditions and picks the one with the lowest system overhead. Join conditions should take full account of the tables that have indexes and the tables with many rows. The choice of inner and outer table can be determined by the formula: number of matching rows in the outer table * number of lookups for each probe of the inner table; the plan with the smallest product is the best.

---- 2. How to view the execution plan: ... 310, 302).

Third, non-optimizable WHERE clauses
---- 1. Example: the columns in the WHERE conditions of the following SQL statements all have suitable indexes, yet execution is very slow:

select * from record where substring(card_no, 1, 4) = '5378' (13 seconds)
select * from record where amount / 30 < 1000 (11 seconds)
select * from record where convert(char(10), date, 112) = '19991201' (10 seconds)
---- Analysis:
---- The result of any operation on a column in the WHERE clause is computed row by row while the SQL runs, so the query has to scan the table and cannot use the index on that column. If the result can be obtained when the query is compiled, the SQL optimizer can use the index and avoid the table scan. The SQL is therefore rewritten as follows:
select * from record where card_no like '5378%' (<1 second)
select * from record where amount < 1000 * 30 (<1 second)
select * from record where date = '1999/12/01' (<1 second)

---- You will find that the SQL is clearly faster!
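The difference between an operation on the column and an operation on the constant can be observed in a query plan. The following is a sketch under assumptions, not a reproduction of the Sybase test: it uses Python's built-in sqlite3 module, a four-row sample table, SQLite's substr in place of substring, and an explicit range on card_no in place of like, because SQLite only turns like into an index range under particular collation settings. The helper names plan and fetch are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (card_no TEXT, amount INTEGER)")
conn.execute("CREATE INDEX idx_amount ON record (amount)")
conn.execute("CREATE INDEX idx_card_no ON record (card_no)")
conn.executemany(
    "INSERT INTO record VALUES (?, ?)",
    [("53781234", 1500), ("53789999", 45000), ("62250000", 29999), ("53770001", 30000)],
)


def plan(sql):
    """Return SQLite's query plan for a statement as one line of text."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in plan_rows)


def fetch(sql):
    """Return the result rows in a stable order so two queries can be compared."""
    return sorted(conn.execute(sql).fetchall())


# Operation applied to the column: the index on the column cannot be used.
SLOW_AMOUNT = "SELECT * FROM record WHERE amount / 30 < 1000"
SLOW_CARD = "SELECT * FROM record WHERE substr(card_no, 1, 4) = '5378'"
# Operation moved to the constant side: the column is compared as stored.
FAST_AMOUNT = "SELECT * FROM record WHERE amount < 1000 * 30"
FAST_CARD = "SELECT * FROM record WHERE card_no >= '5378' AND card_no < '5379'"

for sql in (SLOW_AMOUNT, FAST_AMOUNT, SLOW_CARD, FAST_CARD):
    print(plan(sql))
```

On this sample data each rewritten query returns the same rows as its slow counterpart; the slow forms should be planned as a SCAN and the rewritten forms as an index SEARCH.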

---- 2. Example: the table stuff has 200000 rows and a non-clustered index on id_no. Consider the following SQL:

select count(*) from stuff where id_no in ('0', '1') (23 seconds)
---- Analysis:
---- The 'in' in the WHERE condition is logically equivalent to 'or', so the parser converts id_no in ('0', '1') into id_no = '0' or id_no = '1'. We would expect it to look up each or clause separately and add the results, which would use the index on id_no. In fact (according to showplan) it adopts the "OR strategy": it first fetches the rows that satisfy each or clause into a worktable in the temporary database, then builds a unique index to remove duplicates, and finally computes the result from this temporary table. The actual process therefore does not use the index on id_no, and the completion time also depends on the performance of the tempdb database.

---- Practice shows that the more rows the table has, the worse the worktable performs; when stuff has 620000 rows, the execution time reaches 220 seconds! It is better to separate the or clauses:

select count(*) from stuff where id_no = '0'
select count(*) from stuff where id_no = '1'
---- This gives two results, which are then added together. Because each statement uses the index, the execution time is only 3 seconds, and only 4 seconds with 620000 rows. Or, better still, write a simple stored procedure:
create proc count_stuff as
declare @a int
declare @b int
declare @c int
declare @d char(10)
begin
select @a = count(*) from stuff where id_no = '0'
select @b = count(*) from stuff where id_no = '1'
end
select @c = @a + @b
select @d = convert(char(10), @c)
print @d
---- The result is computed directly, and the execution time is the same as above!
---- Summary:
---- As can be seen, "optimizable" means that the WHERE clause uses an index, while "non-optimizable" means that a table scan or extra overhead occurs.

---- 1. Any operation on a column causes a table scan, including database functions and calculated expressions; when writing a query, move the operation to the right-hand side of the comparison wherever possible.

---- 2. in and or clauses often use a worktable, which defeats the index; if they do not produce a large number of duplicate values, consider splitting the clause apart; the split clauses should include an indexed column.

---- 3. Make good use of stored procedures, which make SQL more flexible and efficient.
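The idea of splitting an in clause into separate indexed lookups and adding the counts, as in the stuff example, can be sketched as follows. This is an illustration under assumptions: it uses Python's built-in sqlite3 module and a small sample table rather than Sybase, so it shows only that the two forms give the same count, not the timing difference. The function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id_no TEXT)")
conn.execute("CREATE INDEX idx_id_no ON stuff (id_no)")
# Sample data: three rows with '0', two with '1', four with '2'.
conn.executemany("INSERT INTO stuff VALUES (?)", [("0",)] * 3 + [("1",)] * 2 + [("2",)] * 4)


def count_with_in():
    """Single statement using in, as in the original slow query."""
    sql = "SELECT count(*) FROM stuff WHERE id_no IN ('0', '1')"
    return conn.execute(sql).fetchone()[0]


def count_split(values=("0", "1")):
    """One indexed equality count per value, then add the results.

    The values must be distinct, otherwise rows are counted twice.
    """
    sql = "SELECT count(*) FROM stuff WHERE id_no = ?"
    return sum(conn.execute(sql, (v,)).fetchone()[0] for v in values)


print(count_with_in())  # 5
print(count_split())    # 5
```

Whether the split form is actually faster depends on how the database handles in and or; the worktable behavior described above is specific to the Sybase version tested.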

---- These examples show that the essence of SQL optimization is to use statements the optimizer can recognize, reduce the number of I/Os spent on table scans, and avoid table scans wherever possible. In fact, SQL performance optimization is a complex process. The points above cover only the application level; a deeper study would also involve resource allocation at the database layer, flow control at the network layer, and the overall design of the operating system layer.

When reposting, please cite the original address: https://www.9cbs.com/read-111861.html
