Index and Index Adjustment Wizard

xiaoxiao, 2021-03-06

An index is a way to speed up searching for data in a table.

A database index is similar to the index of a book. In a book, the index lets readers find the information they need quickly without paging through the whole book. In a database, the index lets the database program locate data in a table quickly without scanning the entire table. A book index is a list of topics and their page numbers; a database index is a list of data values and the locations where they are stored in the table. Indexes can greatly reduce the time a database management system spends looking up data. The Index Adjustment Wizard is a tool that uses the query optimizer to analyze the workload of a set of queries and recommends the best mix of indexes to speed up database queries. SQL Server can now use index intersection and index union to apply multiple indexes on a table within a single query.

Indexes and the Index Adjustment Wizard have the following characteristics:

● Indexes speed up data retrieval.
● Indexes increase the time taken by maintenance operations such as inserts, updates, and deletes.
● Indexes are created on tables and cannot be created on views.
● There are two types of index: clustered and non-clustered.
● The order of a clustered index is the same as the physical order of the base table; the order of a non-clustered index differs from it.
● Indexes can be created directly or indirectly.
● Indexes can be named in optimizer hints.
● When an SQL statement is executed by the query processor, only one index can be used per table.
● With the Index Adjustment Wizard, multiple indexes can be used on one table.
● The Index Adjustment Wizard takes a workload as the object to be tuned.

Advantages and disadvantages of indexes

Why create an index? Because an index can greatly improve system performance. First, a unique index guarantees the uniqueness of each row in a table. Second, an index greatly speeds up data retrieval, which is the main reason for creating one. Third, it speeds up joins between tables, which is particularly useful for enforcing referential integrity. Fourth, when a query uses GROUP BY and ORDER BY clauses, an index can significantly reduce the time spent grouping and sorting. Fifth, with an index in place, optimizer hints can be used during query processing to improve performance.

One might ask: if indexes have so many advantages, why not create one on every column in the table? The idea is reasonable but one-sided. Despite their advantages, indexing every column is unwise, because indexes also have costs. First, creating and maintaining indexes takes time, and that time grows with the amount of data. Second, each index occupies physical space in addition to the space used by the table itself, and a clustered index needs even more. Third, whenever rows in the table are inserted, deleted, or updated, the indexes must be maintained as well, which slows down data modification.

Indexes are built on particular columns of a table, so when creating one you should consider carefully which columns are suitable and which are not. In general, index the following kinds of columns:

● Columns that are searched often, to speed up searches.
● Primary key columns, to enforce uniqueness and to organize the arrangement of data in the table.
● Columns that are often used in joins (mainly foreign keys), to speed up joins.
● Columns that are often searched by range, because the index is sorted and the specified range is therefore contiguous.
● Columns that often need to be sorted, because the index is already sorted and queries can use it to reduce sorting time.
● Columns that appear frequently in WHERE clauses, to speed up evaluation of the conditions.

Likewise, some columns should not be indexed. In general, these columns have the following features. First, columns that are rarely used or referenced in queries should not be indexed. Because they are rarely used, an index on them does little for query speed; instead, the extra index slows down maintenance and increases space requirements. Second, columns with very few distinct values should not be indexed. Because such a column has so few values, for example the gender column of a personnel table, the rows in the result set make up a large proportion of the rows in the table, that is, a large share of the table has to be read anyway. An index does not noticeably speed up retrieval. Third, columns defined with the text, image, or bit data types should not be indexed, because the data in them is either very large or has very few distinct values. Fourth, do not create an index when modification performance matters far more than retrieval performance. The two are at odds: adding indexes improves retrieval performance but reduces modification performance, and removing indexes does the opposite.

Methods of creating an index

There are several ways to create an index, which fall into direct and indirect creation. To create an index directly, use the CREATE INDEX statement or the Create Index Wizard. To create one indirectly, define a primary key constraint or a unique key constraint on the table, which creates an index at the same time. Although both methods create an index, what they create differs in its details.

Direct creation is the most basic method. It is the most flexible and lets you tailor the index to your needs. When you create an index this way, many options are available, such as the fill factor of the pages, sorting, and statistics handling, that can be used to optimize the index. You can specify the index type, its uniqueness, and whether it is composite: that is, you can create a clustered or a non-clustered index, and you can create it on one column or on two or more columns.

Indirect creation produces an index as a side effect of defining a constraint. A primary key constraint is a logical mechanism for maintaining data integrity: it prevents two records in the table from having the same primary key. When a primary key constraint is created, the system automatically creates a unique clustered index. Although the primary key constraint is an important logical structure, the physical structure that corresponds to it is a unique clustered index. In other words, in the physical implementation there is no primary key constraint, only a unique clustered index. Similarly, creating a unique key constraint also creates an index, in this case a unique non-clustered index. With indirect creation, therefore, the type and characteristics of the index are essentially fixed, and there is relatively little room for customization.
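As a rough illustration of indirect creation, the sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server. This is an assumption made only so the example can be run anywhere; the employee table and its columns are hypothetical, and SQLite does not distinguish clustered from non-clustered indexes the way SQL Server does. What it does show is that declaring constraints, without any CREATE INDEX statement, leaves unique indexes behind.

```python
import sqlite3

def constraint_indexes():
    """Return (index name, is_unique) for indexes SQLite created on its own."""
    conn = sqlite3.connect(":memory:")
    # No CREATE INDEX statement is issued; only constraints are declared.
    conn.execute(
        "CREATE TABLE employee ("
        " emp_no TEXT PRIMARY KEY,"
        " email TEXT UNIQUE,"
        " name TEXT)"
    )
    rows = conn.execute("PRAGMA index_list('employee')").fetchall()
    conn.close()
    # Each row starts with (seq, name, unique, ...).
    return sorted((row[1], bool(row[2])) for row in rows)

indexes = constraint_indexes()
for index_name, is_unique in indexes:
    print(index_name, "unique" if is_unique else "not unique")
```

Both constraints produce an automatically named unique index, and neither name nor options were chosen by the user, which is the limited room for customization mentioned above.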

When a primary key or unique key constraint is defined on a table that already has a standard index created with the CREATE INDEX statement, the index created by the constraint overrides the previously created standard index. That is, an index created by a primary key or unique key constraint takes priority over an index created with the CREATE INDEX statement.

Index characteristics

An index can have two characteristics: it can be unique, and it can be composite.

A unique index ensures that all data in the indexed column is unique and contains no duplicate values. If a table has a primary key constraint or a unique key constraint, SQL Server automatically creates a unique index when the table is created or modified. However, if uniqueness must be guaranteed, create a primary key constraint or a unique key constraint rather than a unique index. When creating a unique index, keep these rules in mind:

● When you create a primary key constraint or a unique key constraint, SQL Server automatically creates a unique index.
● If the table already contains data, SQL Server checks the existing data for duplicates when the index is created.
● Whenever an INSERT statement adds data or an UPDATE statement modifies data, SQL Server checks for duplicates; if a duplicate value is found, SQL Server cancels the statement and returns an error message.
● Every row in the table must have a unique value, so that each entity can be uniquely identified.
● Create unique indexes only on columns that can guarantee entity integrity. For example, do not create a unique index on the name column of a personnel table, because different people can have the same name.

A composite index is an index created on two or more columns. When two or more columns are searched together as a key, it is best to create a composite index on them. When creating a composite index, keep these rules in mind:

● At most 16 columns can be combined into a single composite index, and the total length of the columns cannot exceed 900 bytes; that is, the composite key cannot be too long.
● All columns in a composite index must come from the same table; a composite index cannot span tables.
● The order of the columns matters, so choose it carefully. In principle, the most selective column should be defined first. An index on (col1, col2) is not the same as an index on (col2, col1), because the column order differs.
● For the query optimizer to use a composite index, the WHERE clause of the query must reference the first column of the index.
● A composite index is very useful when a table has multiple key columns.
● A composite index can improve query performance and reduce the number of indexes created on a table.
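The two characteristics can be tried out in a small sketch. It again uses Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability; the staff table, its columns, and the index names are hypothetical, and SQLite's planner is not SQL Server's optimizer). It shows a unique index rejecting a duplicate value, and a composite index being chosen only when the WHERE clause references its first column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (badge INTEGER, dept TEXT, name TEXT)")
# Unique index, created directly: rejects a second row with the same badge.
conn.execute("CREATE UNIQUE INDEX ux_staff_badge ON staff (badge)")
# Composite index on two columns; dept is the leading column.
conn.execute("CREATE INDEX ix_staff_dept_name ON staff (dept, name)")
conn.execute("INSERT INTO staff VALUES (1, 'sales', 'Li')")

def insert_is_rejected(badge, dept, name):
    """Try an insert and report whether the unique index refused it."""
    try:
        conn.execute("INSERT INTO staff VALUES (?, ?, ?)", (badge, dept, name))
        return False
    except sqlite3.IntegrityError:
        return True

def plan(sql):
    """Join the detail column of EXPLAIN QUERY PLAN into one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

duplicate_rejected = insert_is_rejected(1, "hr", "Wang")    # same badge again
same_name_allowed = not insert_is_rejected(2, "hr", "Li")   # names may repeat
plan_leading = plan("SELECT * FROM staff WHERE dept = 'sales'")
plan_second = plan("SELECT * FROM staff WHERE name = 'Li'")

print(duplicate_rejected, same_name_allowed)
print(plan_leading)
print(plan_second)
```

The name column is deliberately left without a unique index, matching the personnel-table rule above, and the second plan does not mention ix_staff_dept_name because name is not the leading column.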

Index type

Indexes are divided into two types according to whether the order of the index matches the physical order of the table. In a clustered index, the physical order of the table is the same as the index order; in a non-clustered index, it is not.

Architecture of clustered index

The structure of an index resembles a tree. The bottom of the tree is called the leaf level, the other parts of the tree are called non-leaf levels, and the root of the tree belongs to the non-leaf levels. A clustered index is likewise a tree made up of a leaf level and non-leaf levels, with the leaf level at the bottom. In a clustered index, the data pages of the table are the leaf level, and the index pages above the leaf level, which hold the index data, are the non-leaf levels. In a clustered index, the data values are always arranged in ascending order.

A clustered index should be created on columns that are searched often or accessed in order. When creating a clustered index, consider these factors:

● Each table can have only one clustered index, because the rows of a table have only one physical order; the physical order of the rows in the table is the same as the order of the rows in the index.
● Create the clustered index before creating any non-clustered index, because a clustered index changes the physical order of the rows in the table; the rows are arranged in a definite order, and that order is maintained automatically.
● The uniqueness of key values is maintained either explicitly, with the UNIQUE keyword, or implicitly, with an internal unique identifier. These identifiers are used by the system itself and cannot be accessed by users.
● The average size of a clustered index is about 5% of the table, but the actual size varies with the size of the indexed columns.
● While an index is being created, SQL Server temporarily uses disk space in the current database. Creating a clustered index requires about 1.2 times the size of the table, so make sure enough space is available.

When the system accesses data in a table, it first determines whether an index exists on the relevant column and whether that index is useful for the data being retrieved. If so, the system uses the index to access the records in the table. The search starts at the root of the index tree. Beginning at the root, the search value is compared with each key value on the page to determine whether it is greater than or equal to that key. This step is repeated until a key value larger than the search value is found, or until the search value is greater than or equal to all the key values on the index page.
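The search procedure just described can be sketched with a toy two-level index in Python. Everything here is a simplification for illustration: the page layout, the LEAF_PAGES and ROOT names, and the three-row pages are assumptions, not SQL Server's on-disk format. The point is only to show the comparison against key values on the root page and the difference in pages read compared with a full scan.

```python
from bisect import bisect_right

# Leaf pages hold (key, row) pairs in ascending key order, as in a clustered index.
LEAF_PAGES = [
    [(10, "row-10"), (20, "row-20"), (30, "row-30")],
    [(40, "row-40"), (50, "row-50"), (60, "row-60")],
    [(70, "row-70"), (80, "row-80"), (90, "row-90")],
]
# Each root entry is (lowest key on the child page, child page number).
ROOT = [(page[0][0], page_no) for page_no, page in enumerate(LEAF_PAGES)]

def index_lookup(search_value):
    """Follow root -> leaf and return (row or None, pages read)."""
    pages_read = 1  # the root page
    keys = [key for key, _ in ROOT]
    # Last root key that is <= search_value; stop at the first larger key.
    slot = bisect_right(keys, search_value) - 1
    if slot < 0:
        return None, pages_read
    leaf = LEAF_PAGES[ROOT[slot][1]]
    pages_read += 1
    for key, row in leaf:
        if key == search_value:
            return row, pages_read
    return None, pages_read

def table_scan(search_value):
    """Read every page in order and return (row or None, pages read)."""
    found, pages_read = None, 0
    for page in LEAF_PAGES:
        pages_read += 1
        for key, row in page:
            if key == search_value:
                found = row
    return found, pages_read

print(index_lookup(80))  # ('row-80', 2)
print(table_scan(80))    # ('row-80', 3)
```

With only three leaf pages the saving is one page read; the gap widens as the table grows, because the scan reads every page while the lookup reads one page per tree level.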

Architecture of non-clustered index

The structure of a non-clustered index is also a tree and is very similar to that of a clustered index, but there are significant differences.

In a non-clustered index, the leaf level contains only key values, not data rows. A non-clustered index represents the logical order of the rows. There are two architectures: a non-clustered index created on a table without a clustered index, and a non-clustered index created on a table that has one.

A table without a clustered index is also called a heap. When a non-clustered index is created on a heap, the index pages use row identifiers to point to records in the data pages. A row identifier stores the location of the data. Heaps are maintained using Index Allocation Map (IAM) pages, which contain information about the extents in which the heap is stored. The system table sysindexes holds a pointer to the first IAM page associated with the heap. The system uses the IAM pages to browse the heap and to find space where new rows can be inserted. The data pages, and the records within them, are in no particular order and are not linked together; the only connection between the data pages is the order recorded in the IAM. When a non-clustered index is created on a heap, the leaf level contains row identifiers that point to the data pages. A row identifier specifies the logical order of the rows and consists of a file ID, a page number, and a row ID. Row identifiers are unique. The order of the leaf-level pages of a non-clustered index differs from the physical order of the data in the table. The key values are kept in ascending order at the leaf level.

When a non-clustered index is created on a table with a clustered index, the index pages use the cluster key to point into the clustered index. The cluster key stores the location information of the data. If a table has a clustered index, the leaf level of a non-clustered index contains cluster key values rather than physical row identifiers. When the system accesses data through such a non-clustered index, it first finds the pointer to the clustered index in the non-clustered index and then uses the clustered index to find the data.
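The two architectures differ only in what the leaf level of the non-clustered index stores. The following Python sketch models that difference with plain dictionaries; the data, the names, and the use of dictionaries in place of pages are all assumptions made for illustration.

```python
# Heap: rows live at physical addresses (file ID, page number, row ID) in no order.
heap = {
    (1, 100, 0): {"emp_id": 7, "name": "Zhao"},
    (1, 100, 1): {"emp_id": 3, "name": "Chen"},
    (1, 101, 0): {"emp_id": 5, "name": "Bai"},
}
# Non-clustered index on name over the heap: key value -> row identifier.
name_index_on_heap = sorted((row["name"], rid) for rid, row in heap.items())

# The same rows under a clustered index on emp_id: rows are reached by cluster key.
clustered = {row["emp_id"]: row for row in heap.values()}
# Non-clustered index on name over the clustered table: key value -> cluster key.
name_index_on_clustered = sorted(
    (row["name"], row["emp_id"]) for row in heap.values()
)

def find_via_heap(name):
    """Leaf entry gives a row identifier, which leads straight to the data page."""
    for key, rid in name_index_on_heap:
        if key == name:
            return heap[rid]
    return None

def find_via_clustered(name):
    """Leaf entry gives a cluster key, which is then looked up in the clustered index."""
    for key, cluster_key in name_index_on_clustered:
        if key == name:
            return clustered[cluster_key]
    return None

print(name_index_on_heap[0])       # ('Bai', (1, 101, 0))
print(name_index_on_clustered[0])  # ('Bai', 5)
print(find_via_clustered("Chen"))
```

In both cases the index entries are in ascending key order even though the heap addresses are not, which is the difference between logical and physical order described above.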

Non-clustered indexes are very useful when data is retrieved in a variety of ways. When creating a non-clustered index, keep in mind that an index is non-clustered by default, and that a table can have at most 249 non-clustered indexes but only one clustered index.

The system can access data in a table in two ways. The first is a table scan: the system places a pointer on the data page where the table's first data is stored and then scans every data page occupied by the table, page by page and in order, until all records in the table have been examined. During the scan, each record that satisfies the query condition is selected, and at the end all the records that satisfy the condition are returned. The second is an index lookup. An index is a tree structure that stores key values together with pointers to the data pages containing the records with those keys. During an index lookup, the system follows the tree structure, using the keys and pointers in the index, to find the records that satisfy the query condition. At the end, all the records found are returned.

When accessing data, SQL Server first determines whether the table has an index. If there is no index, SQL Server uses a table scan. Otherwise, the query processor uses the distribution statistics to generate an optimized execution plan for the query, aiming for efficient data access, and decides whether to use a table scan or an index.
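The choice between a table scan and an index can be observed in a small experiment. The sketch below uses Python's sqlite3 module in place of SQL Server (an assumption for runnability; the orders table and index name are hypothetical, and SQLite's EXPLAIN QUERY PLAN is only a loose analogue of SQL Server's plan output). The same query is planned before and after an index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "cust%d" % (i % 100), i * 1.5) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer = 'cust42'"

def plan(sql):
    """Join the detail column of EXPLAIN QUERY PLAN into one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_without_index = plan(QUERY)   # no index exists: a scan is the only option
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer)")
plan_with_index = plan(QUERY)      # the planner can now search the index
matching_rows = conn.execute(QUERY).fetchall()

print(plan_without_index)
print(plan_with_index)
print(len(matching_rows))
```

The result set is the same either way; only the access method changes.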

Index option

When creating an index, you can specify options that optimize its performance. These include the FILLFACTOR option, the PAD_INDEX option, and the SORTED_DATA_REORG option.

The FILLFACTOR option can optimize the performance of insert and update statements. When an index page becomes full, SQL Server must spend time splitting the page to make room for new rows. With FILLFACTOR, you can leave a percentage of free space on each leaf-level index page to reduce the time spent on page splits. When creating an index on a table that already contains data, use FILLFACTOR to specify how full each leaf-level index page should be; the default is 0, which is equivalent to 100%. When an index is created, the internal (non-leaf) index nodes always keep some free space, enough to hold one or two rows. Do not use this option when creating an index on an empty table, because it has no practical effect there. In addition, the value is applied only when the index is created and is not maintained dynamically afterwards, so it should be used only when creating an index on a table that contains data.
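The effect of leaving free space on leaf pages can be sketched with a small simulation in Python. The page capacity of 10 rows, the half-and-half split rule, and the function names are assumptions chosen for illustration, not SQL Server's actual page format or split algorithm. The simulation packs 100 existing keys into pages at a given fill factor, inserts 20 new keys, and counts how many page splits result.

```python
from bisect import bisect_right, insort

PAGE_CAPACITY = 10  # hypothetical number of index rows that fit on one leaf page

def build_leaf_pages(sorted_keys, fillfactor):
    """Pack keys into leaf pages, leaving (100 - fillfactor)% of each page free."""
    percent = 100 if fillfactor == 0 else fillfactor  # 0 is treated like 100
    per_page = max(1, PAGE_CAPACITY * percent // 100)
    return [sorted_keys[i:i + per_page] for i in range(0, len(sorted_keys), per_page)]

def insert_key(pages, key):
    """Insert key into the right page; return 1 if the page had to split, else 0."""
    first_keys = [page[0] for page in pages]
    slot = max(0, bisect_right(first_keys, key) - 1)
    page = pages[slot]
    if len(page) < PAGE_CAPACITY:
        insort(page, key)
        return 0
    # The page is full: move its upper half to a new page, then insert the key.
    half = len(page) // 2
    new_page = page[half:]
    del page[half:]
    pages.insert(slot + 1, new_page)
    insort(new_page if key >= new_page[0] else page, key)
    return 1

def count_splits(fillfactor, new_keys):
    pages = build_leaf_pages(list(range(0, 1000, 10)), fillfactor)
    return sum(insert_key(pages, key) for key in new_keys)

new_keys = list(range(5, 1000, 50))  # 20 inserts spread across the key range
splits_full = count_splits(100, new_keys)
splits_80 = count_splits(80, new_keys)
print(splits_full, splits_80)  # 10 0
```

In this model every fully packed page splits once, while pages built at 80% absorb the same inserts without splitting, at the cost of using 13 pages instead of 10 for the initial data.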

The PAD_INDEX option applies the FILLFACTOR value to the internal index nodes as well, so that they are filled to the same degree as the leaf-level nodes. Specifying PAD_INDEX without FILLFACTOR has no practical effect, because the value used by PAD_INDEX is determined by FILLFACTOR.

When a clustered index is created, the SORTED_DATA_REORG option skips the sort step and so reduces the time needed to build the index. When you create or rebuild a clustered index on a table that has become fragmented, use SORTED_DATA_REORG to compact the data pages. The option is also used when a fill factor needs to be applied to an index again. When using SORTED_DATA_REORG, consider these factors:

● SQL Server checks that each key value is higher than the previous one; if it is not, the index cannot be created.
● SQL Server needs about 1.2 times the table's space to reorganize the data physically.
● The option speeds up index creation by skipping the sort.
● The data is physically copied from the table.
● The space occupied by deleted rows is reclaimed.
● All non-clustered indexes are rebuilt.
● To fill the leaf-level pages to a certain percentage, use the FILLFACTOR and SORTED_DATA_REORG options together.

Index maintenance

After an index has been created, frequent inserts, deletes, and updates fragment the index pages, so the index must be maintained.

Use the DBCC SHOWCONTIG statement to display fragmentation information for a table's data and indexes. When DBCC SHOWCONTIG runs, SQL Server walks all the index pages at the leaf level to determine whether the table or the specified index is heavily fragmented. The statement can also show how full the data pages and index pages are. Run DBCC SHOWCONTIG on a table after it has been heavily modified or had a lot of data added, or when queries against it are very slow. When running DBCC SHOWCONTIG, consider these factors: the statement requires the ID of the table or of the index, which can be obtained from the system table sysindexes; and you should decide how often to run it, daily, weekly, or monthly, according to how active the table is.

Use the DBCC DBREINDEX statement to rebuild one or more indexes on a table. DBCC DBREINDEX should also be used to rebuild the indexes that back a primary key constraint or unique key constraint. In addition, the statement reorganizes the storage of the leaf-level index pages, removes fragmentation, and recomputes index statistics. When using DBCC DBREINDEX, consider these factors: the system refills each leaf-level page according to the specified fill factor; use DBCC DBREINDEX to rebuild the indexes of primary key or unique key constraints; use the SORTED_DATA_REORG option to build a clustered index more quickly, although DBCC DBREINDEX cannot be used if the key values are not in sorted order; and DBCC DBREINDEX does not support system tables. You can also use the Database Maintenance Plan Wizard to automate index rebuilding.

Statistics are samples of column data stored in SQL Server. They are usually kept for indexed columns, but statistics can also be created for non-indexed columns. SQL Server maintains distribution statistics for the key values of each index and uses them to determine which index is useful for a query. Query optimization depends on the accuracy of these distribution statistics: the query optimizer uses the samples to decide whether to use a table scan or an index. When the data in a table changes, SQL Server updates the statistics automatically. Index statistics are updated automatically once the key values in the index have changed significantly, and the frequency of updates is determined by the amount of data in the index and the amount of change. For example, if a table has 10,000 rows and 1,000 of them are modified, the statistics may need to be updated; if only 50 rows are modified, the current statistics are kept. Besides these automatic updates, users can update statistics manually by running the UPDATE STATISTICS statement or the sp_updatestats system stored procedure. UPDATE STATISTICS can update the statistics for all indexes on a table or for a specified index.

Use the SHOWPLAN and STATISTICS IO statements to analyze index and query performance and to tune queries and indexes more effectively. The SHOWPLAN statement displays each step the query optimizer uses when joining tables and indicates which index is used to access the data. Use it to view the query plan for a particular query. When using SHOWPLAN, consider these factors: the output of SET SHOWPLAN_ALL is more detailed than the output of SET SHOWPLAN_TEXT, but the application must be able to handle the output that SET SHOWPLAN_ALL returns; and the SHOWPLAN setting applies to a single session only, so after reconnecting to SQL Server you must execute the SHOWPLAN statement again.

The STATISTICS IO statement reports the amount of input and output needed to return the result set, showing the logical and physical I/O for the specified query. You can use this information to decide whether a query should be rewritten or an index redesigned.
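How distribution statistics feed the choice between a table scan and an index can be sketched in a few lines of Python. This is a deliberately naive model: the exact per-value histogram, the function names, and above all the 10% cutoff are assumptions for illustration, not how SQL Server's optimizer actually costs a plan.

```python
from collections import Counter

def collect_statistics(values):
    """Distribution statistics sketch: how many rows hold each key value."""
    return Counter(values), len(values)

def estimated_fraction(stats, value):
    """Estimate the share of rows that a search for value would return."""
    histogram, row_count = stats
    return histogram.get(value, 0) / row_count if row_count else 0.0

def choose_access_method(stats, value, threshold=0.10):
    """Pick the index when few rows should match; threshold is an assumed cutoff."""
    if estimated_fraction(stats, value) <= threshold:
        return "index"
    return "table scan"

gender_stats = collect_statistics(["F", "M"] * 500)   # 1000 rows, 2 distinct values
badge_stats = collect_statistics(list(range(1000)))   # 1000 rows, all distinct

print(choose_access_method(gender_stats, "F"))  # table scan
print(choose_access_method(badge_stats, 42))    # index
```

If the table changes heavily and the statistics are not refreshed, the estimates drift away from the real data and the choice can go wrong, which is why the accuracy of the statistics matters.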

Like the SHOWPLAN statements, optimizer hints can be used to tune query performance. Hints offer relatively small improvements, and if the indexing strategy changes, a hint may become useless. When using optimizer hints, keep these rules in mind: specify the index by name; an index_id of 0 forces a table scan, and an index_id of 1 forces an index lookup; and because a hint overrides the query optimizer, the hint must be revised whenever the data or the environment changes.
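The behavior of hints can be tried with SQLite's own hint syntax, used here through Python's sqlite3 module as a stand-in for SQL Server. This is an assumption for runnability: NOT INDEXED and INDEXED BY are SQLite clauses, not the SQL Server hint syntax discussed above, and the parts table and index names are hypothetical. The sketch forces a scan, forces a named index, and then shows the hint failing once the index it names has been dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no INTEGER, color TEXT, weight REAL)")
conn.execute("CREATE INDEX ix_parts_color ON parts (color)")
conn.execute("CREATE INDEX ix_parts_weight ON parts (weight)")

def plan(sql):
    """Join the detail column of EXPLAIN QUERY PLAN into one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Hint 1: forbid index use, which forces a scan of the table.
forced_scan = plan("SELECT * FROM parts NOT INDEXED WHERE color = 'red'")
# Hint 2: name the index that the query must use.
forced_index = plan(
    "SELECT * FROM parts INDEXED BY ix_parts_weight "
    "WHERE color = 'red' AND weight > 10"
)

# The indexing strategy changes: the hinted index is dropped.
conn.execute("DROP INDEX ix_parts_weight")
try:
    plan("SELECT * FROM parts INDEXED BY ix_parts_weight WHERE weight > 10")
    hint_still_valid = True
except sqlite3.OperationalError:
    hint_still_valid = False

print(forced_scan)
print(forced_index)
print(hint_still_valid)  # False
```

The last step is the practical risk named above: a hint overrides the planner's own judgment, so it has to be revisited whenever the indexes it depends on change.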

Index Adjustment Wizard

The Index Adjustment Wizard is a tool that analyzes a series of database queries and recommends a set of indexes to optimize the performance of the queries as a whole. For the analysis, you need to specify the following:

● The query statements. This is the workload to be optimized.
● The database that contains the tables on which indexes can be created to improve query performance.
● The tables to be used in the analysis.
● The constraints to be considered in the analysis, such as the maximum disk space the indexes may use.

The workload can come from two sources: a trace captured with SQL Server, or a file containing SQL statements. The Index Adjustment Wizard always works from a defined workload. If the workload does not reflect normal operation, the indexes the wizard recommends will not be the best-performing indexes for the actual workload. The wizard calls the query analyzer and evaluates the performance of each query in the workload against all possible combinations of indexes. It then recommends the indexes that improve the performance of the queries across the whole workload. If you do not have a workload for the wizard to analyze, you can create one right away with the diagram tool (SQL Server Profiler). Once you have a trace that describes normal database activity, the wizard can analyze this workload and recommend an index configuration that improves database performance. After the analysis you can review a series of reports, and you can have the wizard create the recommended indexes immediately, schedule the work as a job, or generate a file containing the SQL statements that create the indexes.

The Index Adjustment Wizard lets you select and create an ideal combination of indexes and statistics for a SQL Server database without needing an expert's understanding of the database structure, the workload, or SQL Server internals. In short, the wizard can do the following:

● Recommend the best mix of indexes for a given workload by using the query optimizer to analyze the queries in it.
● Analyze the effects of the proposed changes, including index usage, the distribution of queries among tables, and the performance of the queries in the workload.
● Recommend ways to tune the database for a small set of queries.
● Allow the recommendations to be customized through advanced options such as disk space constraints, the maximum number of query statements, and the maximum number of columns per index.

Diagram

The diagram tool (SQL Server Profiler) captures a continuous picture of server activity in real time. You select the items and events to monitor, including Transact-SQL statements, batches, object usage, locks, security events, and errors. It can filter these events so that only those of interest to the user are shown. Captured trace events can be replayed on the same server or on another server, re-executing the recorded commands. By concentrating on these events, it becomes much easier to monitor and debug problems in SQL Server.

Query processor

The query processor is a multi-purpose tool. In it you can interactively enter and execute Transact-SQL statements and see the statements and their result sets in the same window; you can execute several Transact-SQL statements at once, or execute only part of a script file. The query processor also provides a graphical display of the execution plan for a query, which reports the data retrieval methods chosen, and it can suggest indexes that would improve the execution of that query. Such a suggestion covers only the indexes for a single query statement and can improve the performance of that statement only.

The system creates a distribution page for each index. Statistics are the distribution information, stored on the distribution page, about the key values of one or more indexes on a table. When a query is executed, the system uses this distribution information to decide which of the table's indexes to use, in order to improve query speed and performance. The query processor relies on these distribution statistics to generate the execution plan for a query, and how well the plan is optimized depends on how accurate the statistics are. If the statistics closely match the physical contents of the index, the query processor can generate a highly optimized plan. Conversely, if the statistics differ considerably from what is actually stored in the index, the plan it generates is less well optimized.

The query processor extracts the distribution information for index keys from the statistics. Besides letting users run UPDATE STATISTICS manually, the query processor can collect these statistics automatically. This ensures that the query processor works from up-to-date statistics, keeps execution plans highly optimized, and reduces the need for maintenance. Execution plans generated this way do have limitations, however. For example, an execution plan improves the performance of a single query statement but may affect the performance of the system as a whole either positively or negatively, so to improve overall query performance you should use the Index Adjustment Wizard.

Conclusion

In earlier versions of SQL Server, a query could use only one index per table. SQL Server 7.0 enhances index operations: it uses index intersection and index union algorithms to apply multiple indexes within one query. Shared row identifiers are used to join two indexes on the same table. If a table has a clustered index, and therefore a cluster key, all non-clustered indexes on the table use the cluster key as the row locator instead of a physical record identifier. If the table has no clustered index, non-clustered indexes continue to use the physical record identifier to point to the data page. In both cases the row locator is very stable. When a leaf page of the clustered index splits, the non-clustered indexes do not need to be modified, because the row locator remains valid. If the table has no clustered index, page splits do not occur. In earlier versions, non-clustered indexes used a physical record identifier, consisting of the page number and row number, as the row locator. If a page of the clustered index (a data page) split, many rows moved to a new data page and received new physical record identifiers, and every non-clustered index then had to be updated with them, which took a great deal of time and resources.
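The idea of joining two indexes on the same table through shared row identifiers can be sketched with Python sets. The table contents, the dictionary-of-sets index layout, and the function names are assumptions for illustration; the sketch shows only the set logic of intersection and union, not SQL Server's implementation.

```python
# Hypothetical table: row identifier -> row.
rows = {
    1: {"city": "Beijing", "grade": "A"},
    2: {"city": "Shanghai", "grade": "A"},
    3: {"city": "Beijing", "grade": "B"},
    4: {"city": "Beijing", "grade": "A"},
}

def build_index(column):
    """Non-clustered index sketch: key value -> set of row identifiers."""
    index = {}
    for rid, row in rows.items():
        index.setdefault(row[column], set()).add(rid)
    return index

city_index = build_index("city")
grade_index = build_index("grade")

def index_intersection(city, grade):
    """city = ? AND grade = ?: keep row identifiers found in both indexes."""
    return sorted(city_index.get(city, set()) & grade_index.get(grade, set()))

def index_union(city, grade):
    """city = ? OR grade = ?: keep row identifiers found in either index."""
    return sorted(city_index.get(city, set()) | grade_index.get(grade, set()))

print(index_intersection("Beijing", "A"))  # [1, 4]
print(index_union("Shanghai", "B"))        # [2, 3]
```

Only the rows whose identifiers survive the set operation need to be fetched from the table, which is what lets two narrow indexes together answer a query that neither could answer efficiently alone.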

The Index Adjustment Wizard is a good tool for both experienced and new users. Experienced users can use it to create a basic index configuration and then adjust and customize that configuration. New users can use it to quickly create optimized indexes.

When reposting, please cite the original address: https://www.9cbs.com/read-84242.html