Indexing for performance

xiaoxiao 2021-03-06

First, how an index makes your SQL run faster:

1. A unique index (UNIQUE) ensures the uniqueness of the data
2. Speeds up data retrieval
3. Speeds up joins between tables
4. Reduces grouping and sorting time
5. Lets the query optimizer improve system performance
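As a minimal sketch of the first point, the snippet below uses Python's built-in sqlite3 module rather than Sybase or SQL Server, so it only illustrates the idea. The table `card`, its columns, and the index name `ux_card_no` are hypothetical.

```python
import sqlite3

def try_duplicate_insert():
    """Return True if the unique index rejects a duplicate card_no."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE card (card_no TEXT, holder TEXT)")
    # The unique index is what enforces uniqueness on card_no.
    conn.execute("CREATE UNIQUE INDEX ux_card_no ON card (card_no)")
    conn.execute("INSERT INTO card VALUES ('5378001', 'first')")
    try:
        conn.execute("INSERT INTO card VALUES ('5378001', 'second')")
    except sqlite3.IntegrityError:
        return True  # the duplicate was rejected by the unique index
    finally:
        conn.close()
    return False

rejected = try_duplicate_insert()
print(rejected)
```

The second insert fails because it repeats a card_no that is already indexed, which is the guarantee item 1 describes.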

Second, when to create an index:

1. On columns that are searched frequently
2. On the primary key
3. On columns that are often used in WHERE clauses
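To see the effect of indexing a column used in a WHERE clause, here is a small sketch that asks SQLite (not Sybase or SQL Server, whose plan output looks different) how it would run the same query before and after the index exists. The table `record`, the index name `ix_record_card_no`, and the helper `plan` are assumptions made for illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row, joined."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (card_no TEXT, amount INTEGER)")
query = "SELECT * FROM record WHERE card_no = '5378'"

# Without an index the only option is to scan the whole table.
plan_without_index = plan(conn, query)

# With an index on the searched column the planner can seek into it.
conn.execute("CREATE INDEX ix_record_card_no ON record (card_no)")
plan_with_index = plan(conn, query)

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan of the table; the second reports a search that uses `ix_record_card_no`. The exact wording of the plan text varies between SQLite versions.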

Third, when not to create an index:

1. Columns that are rarely used or referenced in queries
2. Columns that hold only a few distinct values
3. Columns defined as text, image, or bit
4. Tables where UPDATE performance matters far more than SELECT performance

Fourth, common commands:

1. sp_helpindex: reports information about the indexes on a table or view
2. DBCC SHOWCONTIG: displays fragmentation information for the data and indexes of the specified table
3. DBCC DBREINDEX: rebuilds one or more indexes on a table in the specified database
4. DBCC INDEXDEFRAG: defragments the clustered and secondary indexes of the specified table or view

Fifth, optimizing indexes:

1. Rebuild the index (DBCC DBREINDEX)
2. Use the Index Tuning Wizard
3. Defragment the clustered and secondary indexes of the specified table or view (DBCC INDEXDEFRAG)

---- People tend to fall into a trap when using SQL: they focus on whether the result is correct and ignore the performance differences that can exist between different ways of writing the same query. These differences are especially noticeable in large or complex database environments (such as online transaction processing, OLTP, or decision support systems, DSS). In my work I have found that poor SQL usually comes from inappropriate index design, insufficient join conditions, and WHERE clauses that cannot be optimized. Once these are properly optimized, the running speed improves significantly. Below I summarize these three aspects in turn.

---- To make the explanation more intuitive, the run time of every SQL statement in the examples was measured; times of no more than 1 second are shown as (<1 second).

---- Test environment
---- Host: HP LH II
---- Clock frequency: 330 MHz
---- Memory: 128 MB
---- Operating system: OpenServer 5.0.4
---- Database: Sybase 11.0.3

First, unreasonable index design

---- Example: the table record has 620,000 rows. Consider how the following SQL statements run under different indexes:

---- 1. A nonclustered index on date

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (25 seconds)
select date, sum(amount) from record group by date (55 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh') (27 seconds)

---- Analysis:
---- date has a large number of duplicate values. Under a nonclustered index the data is stored on the data pages in physically random order, so a range search must perform a table scan to find all the rows within the range.

---- 2. A clustered index on date

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (14 seconds)
select date, sum(amount) from record group by date (28 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh') (14 seconds)

---- Analysis:
---- Under a clustered index the data is stored on the data pages in physical order, and duplicate values are stored together. A range search can therefore find the starting point of the range first and scan only the data pages within the range, which avoids a large scan and improves query speed.

---- 3. A composite index on place, date, amount

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (26 seconds)
select date, sum(amount) from record group by date (27 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh') (<1 second)

---- Analysis:
---- This is an unreasonable composite index because its leading column is place. The first and second statements do not reference place, so they cannot use the index. The third statement uses place, and every column it references is contained in the composite index, which forms index coverage, so it is very fast.

---- 4. A composite index on date, place, amount

select count(*) from record where date > '19991201' and date < '19991214' and amount > 2000 (<1 second)
select date, sum(amount) from record group by date (11 seconds)
select count(*) from record where date > '19990901' and place in ('bj', 'sh') (<1 second)

---- Analysis:
---- This is a reasonable composite index. It uses date as the leading column, so every statement can use the index, and the first and third statements are covered by the index, so performance is optimal.

---- 5. Summary:
---- An index is nonclustered by default, but that is not always the best choice. Reasonable index design is based on analysis and prediction of the various queries. In general:
---- 1. For columns that have a large number of duplicate values and are often used in range queries (between, >, <, >=, <=) or in order by and group by, consider a clustered index;
---- 2. For multiple columns that are often accessed together, each containing duplicate values, consider a composite index;
---- 3. A composite index should try to give the critical queries index coverage, and its leading column must be the most frequently used column.
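The effect of the leading column can be sketched with SQLite's EXPLAIN QUERY PLAN. This is only an illustration under stated assumptions: SQLite's planner is not Sybase's, the table is empty, and the index name `ix_combo` and the helper `plan_with_index` are hypothetical. It compares the two column orders for the first query in the examples above.

```python
import sqlite3

QUERY = ("SELECT count(*) FROM record "
         "WHERE date > '19991201' AND date < '19991214' AND amount > 2000")

def plan_with_index(columns):
    """Create record with one composite index and return the plan text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE record (date TEXT, place TEXT, amount INTEGER)")
    conn.execute("CREATE INDEX ix_combo ON record (%s)" % columns)
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return " | ".join(row[3] for row in rows)

# place leads: the date range cannot be used to seek into the index.
place_first = plan_with_index("place, date, amount")

# date leads: the range on date seeks into the index, and the index
# also holds amount, so the query is covered.
date_first = plan_with_index("date, place, amount")

print(place_first)
print(date_first)
```

With place as the leading column the plan is a full scan; with date leading it becomes a range search on a covering index, which matches the reasoning in items 3 and 4.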
Second, insufficient join conditions

---- Example: the table card has 7,896 rows and a nonclustered index on card_no; the table account has 191,122 rows and a nonclustered index on account_no. Consider how two SQL statements execute under different join conditions:

select sum(a.amount) from account a, card b where a.card_no = b.card_no (20 seconds)

---- Change the SQL to:

select sum(a.amount) from account a, card b where a.card_no = b.card_no and a.account_no = b.account_no (<1 second)

---- Analysis:
---- Under the first join condition, the best query plan makes account the outer table and card the inner table and uses the index on card. The number of I/Os can be estimated with the following formula:
---- 22,541 pages of the outer table account + (191,122 rows of the outer table account * 3 pages searched in the inner table card for each row of the outer table) = 595,907 I/Os
---- Under the second join condition, the best query plan makes card the outer table and account the inner table and uses the index on account. The number of I/Os can be estimated with the following formula:
---- 1,944 pages of the outer table card + (7,896 rows of the outer table card * 4 pages searched in the inner table account for each row of the outer table) = 33,528 I/Os
---- As you can see, the truly best plan is executed only when the join condition is sufficient.

---- Summary:
---- 1. Before a multi-table operation is actually executed, the query optimizer lists several possible join plans based on the join conditions and picks the one with the lowest system overhead. The join conditions should take full account of which tables have indexes and which tables have many rows. The choice of inner and outer tables can be determined by the formula: number of matching rows in the outer table * number of lookups per row in the inner table; the smallest product is the best plan.
---- 2. To view the execution plan, use set showplan on to turn on the showplan option; this shows the join order and which indexes are used. For more detailed information, run dbcc traceon(3604, 310, 302) with the sa role.
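The two I/O estimates above can be reproduced with a few lines of arithmetic. The function name `nested_loop_io` is made up for this sketch; the page and row counts are the ones quoted in the analysis.

```python
def nested_loop_io(outer_pages, outer_rows, inner_pages_per_lookup):
    """Estimated I/Os: read the outer table once, then probe the inner
    table's index once for every outer row."""
    return outer_pages + outer_rows * inner_pages_per_lookup

# First join condition: account is the outer table, card the inner table.
account_outer = nested_loop_io(22541, 191122, 3)

# Second join condition: card is the outer table, account the inner table.
card_outer = nested_loop_io(1944, 7896, 4)

print(account_outer, card_outer)
```

This prints 595907 and 33528, so the second plan needs roughly one eighteenth of the I/O of the first.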

Third, WHERE clauses that cannot be optimized

---- 1. Example: the columns in the WHERE clauses of the following SQL statements all have appropriate indexes, but execution is very slow:

select * from record where substring(card_no, 1, 4) = '5378' (13 seconds)
select * from record where amount / 30 < 1000 (11 seconds)
select * from record where convert(char(10), date, 112) = '19991201' (10 seconds)

---- Analysis:
---- The result of any operation on a column in the WHERE clause is computed row by row while the SQL runs, so the query has to perform a table scan and cannot use the index on that column. If the result can be obtained when the query is compiled, the SQL optimizer can optimize it, use the index, and avoid the table scan. So rewrite the SQL as follows:

select * from record where card_no like '5378%' (<1 second)
select * from record where amount < 1000 * 30 (<1 second)
select * from record where date = '1999/12/01' (<1 second)

---- You will find the SQL is noticeably faster!

---- 2. Example: the table stuff has 200,000 rows and a nonclustered index on id_no. Consider the following SQL:

select count(*) from stuff where id_no in ('0', '1') (23 seconds)

---- Analysis:
---- 'in' in a WHERE condition is logically equivalent to 'or', so the parser converts in ('0', '1') into id_no = '0' or id_no = '1' and executes that. We would expect it to look up each or clause separately and add the results, which would use the index on id_no. In fact (according to showplan) it adopts the "OR strategy": it first fetches the rows that satisfy each or clause into a worktable in the temporary database, then builds a unique index to remove duplicate rows, and finally computes the result from this temporary table. So the actual process does not use the index on id_no, and the completion time is also affected by the performance of the tempdb database.

---- Practice shows that the more rows the table has, the worse the worktable performs: when stuff has 620,000 rows, the execution time reaches 220 seconds! It is better to separate the or clauses:

select count(*) from stuff where id_no = '0'
select count(*) from stuff where id_no = '1'

---- This gives two results, which are then added together. Because each statement uses the index, the execution time is only 3 seconds, and only 4 seconds with 620,000 rows.
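The split-and-add rewrite can be checked for correctness with a small sketch. It uses SQLite, which handles IN differently from the Sybase "OR strategy", so it confirms only that the two forms return the same count, not the timing difference. The sample rows and the helper `count_where` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id_no TEXT)")
conn.execute("CREATE INDEX ix_stuff_id_no ON stuff (id_no)")
# Hypothetical sample data: 3 rows with '0', 2 with '1', 4 with '2'.
conn.executemany("INSERT INTO stuff VALUES (?)",
                 [("0",)] * 3 + [("1",)] * 2 + [("2",)] * 4)

def count_where(value):
    """Count rows for a single id_no value (one equality predicate)."""
    sql = "SELECT count(*) FROM stuff WHERE id_no = ?"
    return conn.execute(sql, (value,)).fetchone()[0]

# Original form: one statement with IN.
combined = conn.execute(
    "SELECT count(*) FROM stuff WHERE id_no IN ('0', '1')").fetchone()[0]

# Rewritten form: one indexed equality lookup per value, then add.
split_total = count_where("0") + count_where("1")

print(combined, split_total)
```

Adding the separate counts is safe here because the two values are distinct, so no row can be counted twice.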

Or, better, write a simple stored procedure:

create proc count_stuff as
declare @a int
declare @b int
declare @c int
declare @d char(10)
begin
select @a = count(*) from stuff where id_no = '0'
select @b = count(*) from stuff where id_no = '1'
end
select @c = @a + @b
select @d = convert(char(10), @c)
print @d

---- This computes the result directly, and the execution time is as fast as above!

---- Summary:
---- As you can see, "optimizable" means that the WHERE clause uses an index, and "not optimizable" means that a table scan or extra overhead occurs.
---- 1. Any operation on a column causes a table scan, including database functions and computed expressions. When querying, move the operation to the right side of the equals sign whenever possible.
---- 2. in and or clauses often use a worktable, which makes the index ineffective. If they do not produce a large number of duplicate values, consider splitting the clause apart; the split clauses should include an index.
---- 3. Make good use of stored procedures, which make SQL more flexible and efficient.

---- From these examples it can be seen that the essence of SQL optimization is to use statements the optimizer can recognize, reduce the number of I/Os from table scans, and avoid table searches as far as possible. In fact, SQL performance optimization is a complex process. The points above are only its application-level aspect; deeper study also involves resource allocation at the database layer, flow control at the network layer, and the overall design of the operating system layer.
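Point 1 of the summary can be illustrated with SQLite's EXPLAIN QUERY PLAN. This is a sketch under assumptions: SQLite stands in for Sybase, the table is empty, and the index name `ix_record_amount` and the helper `plan` are hypothetical. The two predicates are the amount examples from this section.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row, joined."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (card_no TEXT, amount INTEGER)")
conn.execute("CREATE INDEX ix_record_amount ON record (amount)")

# The column is wrapped in an expression: every row must be evaluated.
computed_column = plan(conn, "SELECT * FROM record WHERE amount / 30 < 1000")

# The arithmetic is moved to the constant side: the index can be used.
bare_column = plan(conn, "SELECT * FROM record WHERE amount < 1000 * 30")

print(computed_column)
print(bare_column)
```

The first plan is a table scan even though amount is indexed; the second is a search on `ix_record_amount`.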

Choosing between clustered and nonclustered indexes

<1:> A clustered index is one in which the physical order of the rows is the same as the order of the index. The lowest (leaf) level of the index contains the actual data pages. A table can have only one clustered index. Because UPDATE and DELETE statements require relatively many read operations, a clustered index can often speed them up. A table that has at least one index should have a clustered index. Consider a clustered index in the following cases:

For example, a column whose number of distinct values is limited (but not very small): the state column of a customer table holds about 50 different state abbreviations, so a clustered index can be used.

For example, a column queried for a range of values, that is, a column operated on with between, >, >=, <, <=, and so on:
select * from sales where order_date between '5/1/93' and '6/1/93'

For example, a column for which queries return a large number of rows:
select * from phonebook where last_name = 'smith'

When a large number of rows are being inserted into a table, avoid creating a clustered index on a column whose values increase monotonically (for example, an identity column). If you do, INSERT performance drops sharply, because every inserted row must go to the last data page of the table. While one row is being inserted (and that data page is locked), all other inserts must wait until the current insert finishes. The leaf-level pages of a clustered index contain the actual data pages, and the order of the data pages on disk is the same as the logical order of the clustered index.

<2:> A nonclustered index is one in which the physical order of the rows differs from the order of the index. The leaf level of a nonclustered index contains pointers to the rows on the data pages. A table can have multiple nonclustered indexes. Consider a nonclustered index in the following cases:

On a column with many distinct values, such as the part_id column of a part table:
select * from employee where emp_id = 'pcm9809f'

Columns in the ORDER BY clause of a query can be considered for a clustered index.

Pay attention to index fragmentation after a period of use. There are two types of index fragmentation:

First, external fragmentation occurs when the index pages are not stored in their logical order. For example, suppose the index pages are currently allocated in this order:
Page 1: 2 4 6 8
Page 2: 10 12 14 16
Page 3: 18 20 22 24

When we insert new data, such as 5, the system may allocate a new index page to make room:
Page 1: 2 4 5
Page 2: 10 12 14 16
Page 3: 18 20 22 24
Page 4: 6 8
If we then query the values 4-10, an additional page must be read to return 6 and 8.
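The cost of this out-of-order page can be sketched in a few lines. The dictionaries below model the page layouts before and after inserting 5, assuming four keys per page as the data above suggests; the function `pages_read` is hypothetical and is not a model of any real storage engine.

```python
def pages_read(pages, low, high):
    """Return the page numbers visited, in key order, for low <= key <= high."""
    hits = sorted((key, number) for number, keys in pages.items()
                  for key in keys if low <= key <= high)
    visited = []
    for _, number in hits:
        # Record a page each time the scan has to move to a different one.
        if not visited or visited[-1] != number:
            visited.append(number)
    return visited

before = {1: [2, 4, 6, 8], 2: [10, 12, 14, 16], 3: [18, 20, 22, 24]}
after = {1: [2, 4, 5], 2: [10, 12, 14, 16], 3: [18, 20, 22, 24], 4: [6, 8]}

print(pages_read(before, 4, 10))
print(pages_read(after, 4, 10))
```

Before the insert the range 4-10 touches pages 1 and 2 in order. Afterwards it touches pages 1, 4, and 2: one more page, and the scan has to jump ahead to page 4 and then back to page 2.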

Second, internal fragmentation occurs when index pages do not make full use of their allocated space. Internal fragmentation increases the space the index occupies.

Third, we can use DBCC SHOWCONTIG to check index fragmentation.
Usage: DBCC SHOWCONTIG WITH ALL_INDEXES
Returned results:
DBCC SHOWCONTIG scanning 'authors' table...
Table: 'authors' (1977058079); index ID: 1, database ID: 5
TABLE level scan performed.
- Pages Scanned: 1
- Extents Scanned: 0
- Avg. Pages per Extent: 0
- Scan Density [Best Count:Actual Count]: 100.00% [1:1]
- Logical Scan Fragmentation: 0.00% (fragmentation report)
- Extent Scan Fragmentation: 6002.0
- Avg. Page Density (full): ...

Fourth, the solutions are:
1. Drop and re-create the index
2. DBCC DBREINDEX
3. DBCC INDEXDEFRAG

When reprinting, please credit the original address: https://www.9cbs.com/read-56370.html
