SELECT Returns the order of records
ZTE Chongqing Institute tour waves Wu Yonghong
Keywords: Select, order, optimization, backup, scanning, index
Thesis:
When we execute the SELECT statement, the order of records returned by Select has a big impact on our programming, and has a big relationship for database recording backup and SQL performance optimization. Therefore, it is necessary to clarify the order of SELECT returns the record. This article discusses the order in which Oracle / Sybase / SQL Server returns the record of Oracle / Sybase / SQL Server, discusses the features of three databases, and focuses on these differences on data query and record backup.
Abbreviate language:
IAM: INDEX Allocation Map
PFS: Page Free Space
1 Introduction
When we execute the SELECT statement, the order of records returned by Select has a big impact on our programming, and has a big relationship for database recording backup and SQL performance optimization. Therefore, it is necessary to clarify the order of SELECT returns the record.
Select Returns the order of records is a big relationship with the database type, so the following is discussed separately by the database type. This paper mainly discusses the order of Oracle / Sybase / SQL Server returns records, discusses the features of three databases from the principle, and focuses on these differences on data query and record backup.
2. Oracle
The following assumes that the database query optimization method is Rule-based approach, Oracle uses two ways of access to the table:
a full table scan (Full Table Scan)
The full table scan is sequentially accessing each record in the table. Oracle uses a multi-data block (Database Block) to optimize full mete scan.
b. Access to the RowID
You can use ROWID-based access methods, improve access table efficiency, and RowID contains physical location information recorded in the table. Oracle uses indexing (index) to implement the link between data and storage data. Usually the index provides a way to quickly access the ROWID, so those queries based on the index column can be improved. It is usually manifested as scanning by an index. (Index Scan)
2.1 full table scanning
If the SELECT statement cannot use an index, Oracle reads the data block in full table scan, and returns the record for the returned result set, Oracle press RowID. So SELECT * from myTable and Select * from myTable Order by RowID effect is the same.
You can get the RowID pseudo column by Select Rowid from Table, the data type is the RowID type. Returning using the query statement is the extended RowID of the ROWID. The ROWID in the extended format consists of 18 characters. These 18 characters can be divided into 4 groups in the format of oooooo.fff.bbbbb.ss. Replica represents data object numbers, data file numbers, data block numbers, block numbers, records, or blocks of the pieces.
It must be explained that the larger the rear recorded ROWID is, and the record RowID that is inserted is also small. The two arguments are given below:
1. The line number of the post-inserted record block may be large or small
According to our trial, it is assumed that there are three records in the table that assume the file number. According to the block number, the line number is arranged as follows:
108 0
108 1
108 2
After deleting a middle record, get
108 0
108 2
Add a record, you may get
108 0
108 1 <--- Newly added record
108 2
May also be
108 0
108 2
108 3 <--- Newly added record
Both situations may occur, depending on the allocation algorithm within the Oracle block. See Document 2 on the more in-depth analysis of this situation.
2. The block number of the post-inserted record may be large, it is possible
The block number inserted into the record is not linear, but is controlled by FreeList. See Document 1 for the theory and algorithms of FreeELIST.
Therefore, the following conclusions can be drawn for full mete scans:
1. Select * from myTable in Oracle does not guarantee that the return of the record is in the order of insertion, but press the RowID order.
The order of ROWID is consistent with the "physical order" stored in the record line. In the case where there is no index, SELECT is a full mete scan, which is "physical order". At this time, SELECT returns to record the fastest in "Physical Order".
2. For records that have been inserted, its ROWID does not change.
If the full surface scan mode, ROWNUM directly uses ROWNUM as the selection condition, according to the conclusion 1, the records obtained twice may be different. If SQL has time condition or other condition as a SQL statement assists (discharges the currently inserted value), then ROWNUM is used as the selection condition, the order returned and recorded is the same.
Conclusion 2 The characteristics can be used in the clear-backup mechanism for some log tables. For some log tables in order to improve INSERT performance, there may be no index, and these log tables are cleared and backed up during the stored procedure. Use the Insert Into Select first to select some records into the backup table, and then delete the records in the log table with the DELETE statement. Control the number of rows of operations through ROWNUM, avoid returning segments, through time conditions to implement conclusions 2, to ensure the same record.
2.2 Scan by index
For a range of in-range in accordance with the index, the scan of the index leaf node is manifested inside Oracle, and the index node is usually sorted and there is a pointer between the leaf nodes for easy scans. Since SELECT is scanned by the index, the returned record is arranged in "index".
With the above features, the following can be applied to the following applications:
1. The returned record can be sorted by the index.
Use an index in Oracle to make the returned records to be sorted without using the order by. For different sorting methods, different indexes can be done, and the index can be controlled by different scanning methods by hint / * * / indication to achieve different effects. Such as / * index_name * / or / * index_desc (Table index_name) * / indicator Press index to scan or press index descending scan to implement the return records in order of field or in descending order.
For example, there is index index_a, and there is index B on A on A.
Select * from t, record
AB194321131058112
SELECT / * INDEX (T Index_a) * / * from T where A> 0 or
Select * from t where a> 0 ORDER BY A
AB310581121943211
From the execution plan, press the index scan and press the index RowID method.
SELECT / * INDEX_DESC (T Index_B) * / * from T where b> 0 or
Select * from t where b> 0 Order by B
AB211112583101943
From the execution plan, press the index scan and press the index RowID method.
2. By using the fields such as time, the water number, etc., the record implementation can be returned in the order of insertion, and the above-described characteristics can be used to explain the backup problems in Second.1. When the log list has an index, select the index field that define the scan range, so that the post-inserted record is behind, such as time or water numbers, etc., which guarantees INSERT and DELETE when the ROWNUM is controlled. The operation of the operation is exactly consistent, while the index-based scan guarantees SQL performance.
3.sybase
Regardless of whether you have used the index behind WHER, Sybase may be adjusted based on the use of the index. Even if there is no WHERE statement, it is possible to use an index, even if there is a WHERE statement, it may not be indexed. Of course, if the table itself has not created any index, it will definitely not use the index.
3.1 Table without an index
A table without an index is called a stack of tables. A pile table has a corresponding record in the sysIndexes table whose Indid = 0. The first page represents the homepage of the stack of tables, root represents the tail of the stack table. All data pages in the stack form form a two-way linked list from sysindex.first <-> sysindex.root.
For insert records, all data plugged into the stack table will be added to the tail of the table. Sybase uses the Indid (= 0) and root value of the sysindex table to find the last data page for the table. If there is space on this page, a new record line is inserted at the end of the data. If there is no space available on the last page, if there is a space available in the expansion unit, this is used; if the last page is already the last page of the expansion unit, start using a new The expansion unit, always chain to the tail of the linked list, while updating the value of sysindex.root.
For record deletion, when a record is deleted, the page is followed by the recording after the recording is moved to the front of the page, and all unused space remains adjacent to the bottom of the page. This page will be detached from the data chain of the stack in all rows in a page.
For updates, the stack table is followed by the following principles:
• If the length of the line does not change, it is directly updated on the original line, and there is no mobile within the page.
· If the length of the line changes, and the idle space of the page is sufficient. The line is still the same location on the page, but other rows will be moved up or down to hold the continuous connection of the page.
· If this page cannot accommodate rows. In the AllPages-Locked list, the guild will be deleted, and the "new" line is inserted into the last page. In the Data-Only-LockedThe list, the line is inserted into another page, and the steering pointer is used in the original position to refer to the page, so that the ID position of the guarantee is unchanged.
For scanning, press the SysIndex.First <-> sysindex.root Link Method to read the data page.
For a stack, according to the above insertion, deletion, update, scanning characteristics, the following conclusions can be obtained:
1. For a list of non-any indexes, if you do not use Update, or make sure that Update does not generate an insert, you can use SELECT to complete the natural sort, and record it in the order of insertion.
3.2 Table with an index
For the Sybase execution plan without an indexed table, SELECT returns the order of the record and the sequence returned by the stack scan.
For tables that Sybase execute the planned index, SELECT returns a record in the order of the index field. Sybase will index the index to B tree. Each page in the index contains a page, and the top is followed by the index line. Each index line contains a key value and a pointer to the lower level page or data row. Each page of the index is called an index node. The top node of the B tree is called the root node. The underlying node of the index is called a leaf node. The page links in each level index are in the two-way link list. For tables with indexes, get the following conclusions:
1. Control the query mode by controlling the index, thereby controlling the return order.
As we can specify the use of an index by (index index_name) to reach the search index_name. You can also use (index 0) to indicate forced to use the index, so that the returned recording order is in a stack of tables.
2. How to not force the specified index, whether you use an index in WHER in WHER, Sybase may be adjusted based on the use of the index. Since Sybase is adjusted to the use of the index based on the cost execution plan, it is not possible to use the non-clustered index to complete the natural sorting of the recording. At this time, it is best to add the Order By to ensure the accuracy of sorting.
3. If the fields need to be sorted is the clustered index, you can use the index to complete the sorting. At this time, whether the plan is performed, Sybase returns a record in the clustered index character sequence. For clustered index tables, when inserted into data, the movement of the page inside the record (value larger record) is caused by moving Sybase to ensure that the physical order of the data is consistent with the cluster index order.
4.MS SQL Server
SQL Server may be adjusted based on the use of the index in your SELECT statement. Even if there is no WHERE statement, it is possible to use an index, even if there is a WHERE statement, it may not be indexed. Of course, if the table itself has not created any index, it will definitely not use the index.
4.1 Table without index
Table without index is called a stack or a stack. A pile uses an IAM management expansion panel, and multiple IAM forms an IAM chain. The stack has a row in sysindexes, whose indid = 0. The sysindexes.firstiam column points to the IAM page of the IAM page chain, and the IAM page link management is assigned to the spatial space. SQL Server 2000 uses IAM pages to browse in a pile. The data pages and rows in the pile of collections do not have any specific order, nor is it linked together. The only logical connection between the data pages is the connection recorded in the IAM page.
For insert operation, when SQL Server 2000 needs to insert a new row and the current page does not have a free space, it uses IAM and PFS pages to find pages with sufficient space to accommodate the row. SQL Server uses the IAM page to find the extension panel assigned to the object. For each extension panel, SQL Server searches for a PFS page to see if there is a page with enough space to accommodate this line.
SQL Server is only assigned a new extension panel to the object when you quickly find a page with enough space to accommodate the positive inserted row in an existing expansion panel. SQL Server uses the proportional allocation algorithm to allocate the expansion panel from the available extension panel within the file group. If a file group has two files, one of the available space is twice the other, then each page is assigned a page from the latter, and assigns two pages from the former. This means that each file in the file group should have an approximate space use percentage.
For delete operations, in the stack table, even if the record is deleted, the page is not moving in the page.
For data updates, SQL Server can use a variety of ways. The update may be on-site, or it may be done in the way that first deletes and inserts, can also be managed by querying the processor or storage engine. However, in the stack table, always use the field update method, and the SQL Server 2000 uses steering pointer to ensure that the updated page will be used to ensure that the update is unchanged. You can find a table scan or serial read by scanning the IAM page to find the extended panel of the page that accommodates this pileset. Because IAM represents them in the order in which the expansion panel is present in the data file, this means that the serial stacking scan will be performed along each file.
According to the insertion, update, deletion, scanning principle of the above-mentioned stacks, the following conclusions can be obtained:
1. Setting the scanning order using the IAM page means that the rows in the stack is generally returned in the order inserted.
2. For the already existing record, the location of the record (database number, file number, page number, line number) will not change.
Conclusion 2 can be applied to the backup - clear mechanism. If the log table is a listless list without an index, you can exclude the currently inserted records through the fields such as time, water number, so that the result set and sequence of SELECT and DELETE returns completely, and then control each operation through SET ROWCOUNT. The number of records that make the backup-clear operation can be safely.
4.2 Table with an index
For the SQL Server execution plan without an indexed table, the order of select returns the order of the record and the sequence returned by the stack scan.
For tables that SQL Server perform a planned index, SELECT returns a record in the order of the index field. SQL Server will index the index to B tree. Each page in the index contains a page, and the top is followed by the index line. Each index line contains a key value and a pointer to the lower level page or data row. Each page of the index is called an index node. The top node of the B tree is called the root node. The underlying node of the index is called a leaf node. The page links in each level index are in the two-way link list.
For tables with indexes, get the following conclusions:
1. You can control the query mode by controlling the index, thereby controlling the return order.
If we can specify the use of an index with the use of with (index_name) to reach the index ingx_name.
2. How to do not force the specified index, whether you use an index in WHER in WHER, SQL Server may be adjusted based on the use of the index, even if there is no WHERE statement, even if there is a WHERE statement, it is possible without indexing . SQL Server may be adjusted based on the use of indexes in WHEREE, and SQL Server may be adjusted based on the use of the index, even if there is no WHERE statement, it is possible to use indexes, even if there is a WHERE statement may not have to index. SELECT and DELETE execution plans with the same WHERE statement are likely to be different.
Therefore, SELECT and DELETE have not consistently inconsistent, and if you want to select the previous N record, the resulting record set although the number of strips is consistent but the content is inconsistent. Although we can force the SELECT to use with the index by with (index_name)), but DELETE cannot force the specified index because delete involves deletion of the index itself.
In this case, if the performance of the database is good, there is not much data to be backed up, do not use set rowcount to control the number of strips. However, if you need to control the number of removable clauses, you can control a smaller range in the WHERE condition, such as the time range is controlled to hours, and the day data is backed by 24 hours.
Either DTS is backed up.
3. If the fields need to be sorted is the clustered index, you can use the index to complete the sorting. At this time, whether the plan is executed, SQL Server returns a record in the clustered index pavement. References and information:
1. "Oracle Freelist and HWM Principles and Related Performance Optimization", travel waves
2. "Inquiry on the storage and reorganization of data in block", http://www.itpub.net
3. "How to extract records in physical order?", Http://www.itpub.net
4. "How to find the last line of a table? Physical Insertion Order ", http://www.itpub.net
5. "Oracle 9i for Windows NT / 2000 Data System Training Course", Tsinghua University Press
6. "Microsoft SQL Server 2000 Technology Insider", Peking University Press
7. "Heaps of Data: Tables without Clustered Indexes"
The above article can be found on my Blog website http://blog.9cbs.net/youbo2004.