Optimization of a SQL statement

xiaoxiao2021-03-06  41

By Hellen

First, the SQL and access scheme before optimization

1. Original SQL:

Select * from (Select Row_Number () over (Order by Inner_Set.Order_Date DESC) AS Row_no, Inner_Set. * From

(Select ORDER_ID, ORDER_DATE

From

(Select ORDER_ID, ORDER_TYPE_ID, STATUS_ID, OWNER_ID, ORDER_DATE, CREATED_BY from ORDER_HEADER WHERE 1 = 1 And ORDER_TYPE_ID = 'Drpr_order' Order by Order_Date DESC) ORD

WHERE 1 = 1 and

(1 = 2 or EXISTS

(Select ORDER_ID from Order_Attribute Where Attr_Name = 'Order_Corporation_ID' AND (1 = 2 or attr_value = '10000' or attr_value = '11000') AND ORD.Order_ID = Order_ID)

Or exists (SELECT ORDER_ID from Order_attribute where attr_name = 'business_mode' and attr_value = 'bm_ddcg' and ord.order_id = order_id)))

And exists (select work_effort_id from work_effort where workflow_Activity_id = 'purchase_approve' and current_status_id = 'wf_running' and oddy_id = source_reference_id)

and exists (select oi.order_id from Order_Item oi, product prdt where 1 = 1 and oi.product_id = prdt.product_id and prdt.product_Name like '% Washing Machine%') ORDER BY ord.ORDER_DATE DESC) AS INNER_SET) AS OUTER_SET WHERE Outer_Set.Row_no> 0 and outr_set.row_no <= 15

2. Original statement SQL analysis:

figure 1

Second, the optimization process:

Analysis 1:

Analysis: Figure 1 WorkeffORT Table: Used TBSCAN, unused index, low efficiency,

Try Adex PK_EFFORT (Workflow_Activity_ID, Current_Status_ID)

Processing: SQL is unchanged, after the worryffort adds, SQL is explained below

Compared:

Work_effort index status

Find time

total cost

effectiveness

Unconcerned

122, 451.53

240, 639.38

low

Plus

120,614.60

238, 177.41

improve

Analysis 2: Order by Ord.Order_Date DESC

There are two places mentioned that it is obvious to repeat, remove the first Order by Ord.Order_Date DESC

Go off repeated order by

total cost

effectiveness

Go out

238, 177.41 after removal

158, 733.84

improve

Analysis 3:

Analysis: This query uses Select * from Order_Header WHERE ..., where statement adds a lot of judgment on the basis of the basic table Order_Header.

Processing: Take reduced the size of the basic table Order_Header. SQL is changed to:

Select * from (Select Row_Number () over (Order by Inner_Set.Order_Date DESC) AS Row_no, Inner_Set. * From

(Select ORDER_ID, ORDER_DATE

From

(Select Ord.order_ID, Order_Type_ID, ORD.STATUS_ID, OWNER_ID, ORDER_DATE, CREATED_BY

From Order_Header ORD, Work_Effort Effort

Where ord.order_type_id = 'drpr_order' and

Effort.Workflow_Activity_ID = 'Purchase_Approve' and current_status_id = 'wf_running'

And ord.order_id = Effort.Source_reference_id

TMP

WHERE 1 = 1 and

(1 = 2 or EXISTS (Select ORDER_ID FROM ORDER_ATTRIBUTE WHERE ATTR_NAME = 'ORDER_CORPORATION_ID' AND (1 = 2 or attr_value = '10000'

OR attr_value = '11000' or attr_value = '20000' or attr_value = '30000' or attr_value = '31000' or attr_value = '50000' OR

Attr_Value = '60000' or attr_value = '90000') And Tmp.Order_id = Order_ID)

OR EXISTS (SELECT ORDER_ID FROM ORDER_ATTRIBUTE WHERE ATTR_NAME = 'business_mode' and attr_value = 'bm_ddcg' and tmp.order_id = order_id)))

And exists (select oi.order_id from order_item oi, Product prdt where 1 = 1 and oi.product_id = prdt.product_id and prdt.Product_name Like '% Haier washing machine%')

Order by tmp.order_date dec) AS inner_set) as outr_set where outer_set.row_no> 0 and outr_set.row_no <= 15

Compared:

ORDER_HEADER processing

total cost

effectiveness

Original SQL

158, 733.84

low

Now SQL

146, 423.36

improve

Analysis 4:

Analysis: Analysis 3 SQL has repetitions for the conditions of Order_Attribute;

Processing: Modify SQL

SELECT *

From

(Select row_number () over (Order by Inner_Set.Order_Date

DESC) as row_no, inner_set. *

FROM

Select ORDER_ID, ORDER_DATE

From

(Select header.order_id, order_type_id, header.status_id,

Owner_id, order_date, created_by

From Order_Header Header, Work_Effort Effort

Where 1 = 1 and Header.Order_type_id = 'Drpr_Order' and

Workflow_Activity_ID = 'Purchase_Approve' and current_status_id = 'wf_running'

And Header.Order_id = Effort.Source_reference_id

) ORD

Where 1 = 1 and (1 = 2 or)

(Select ORDER_ID

From Order_Attribute

WHERE ((attr_name = 'order_corporation_id' and (1 = 2 OR)

Attr_Value = '10000' or attr_value = '11000' OR

Attr_Value = '20000' or attr_value = '30000' OR

Attr_Value = '31000' or attr_value = '50000' OR

Attr_Value = '60000' or attr_value = '90000')))

OR (attr_name = 'business_mode' and attr_value = 'bm_ddcg'

)) and ord.order_id = order_id)) And exists (SELECT

oi.order_id

From Order_Item Oi, Product PrDT

Where 1 = 1 and oi.product_id = prdt.product_id and prdt.product_name

Like '% Haier%')

Order by ord.order_date dec) AS inner_set) as outer_set

Where outri_set.row_no> 0 and outr_set.row_no <= 15

Compared:

SQL modification

total cost

effectiveness

Original SQL

146, 423.36

low

Now SQL

136, 035.56

improve

Analysis 5:

Increase the SQL execution efficiency in analysis 4 after the memory of this machine: 26, 783.24

Analyze the time of IXScan, Table ORDER_ITEM index lookup time 11,778 is extremely long, far higher than the lookup time of other tables. So analyze the table order_item. Table ORDER_ITEM has an index in Product_ID

Select oi.order_id from Order_Item Oi, Product PrDT

Where 1 = 1 and oi.product_id = prdt.product_id and prdt.product_name

There are 130,000 records in the Like '% Haier washing machine%', and the Product_ID has an index in Product_ID.

View the statistics of the Order_item table. It is seen that the statistics of the table are not updated.

deal with:

1. See the statistics of the Order_item table

DB2 => reorgchk current statistics on table db2inst1.order_Item

Table statistics:

F1: 100 * Overflow / Card <5

F2: 100 * Tsize / (FPAGES-1) * (TablePagesize-76)> 70

F3: 100 * NPAGES / FPAGES> 80

Creat Name Card OV NP FP Tsize F

1

Fly

2 f

3 REORG

-------------------------------------------------- ------------------------------

DB2INST1 Order_Item 138064 81 7116 7116 27888928 0 97 100 ---

-------------------------------------------------- ------------------------------

Index statistics:

F4: Clusterratio or normal CLUSTERFAACTOR> 80

F5: 100 * (Keys * (isize 8) * 4) / (NLEAF * Indexpagesize> 50

F6: (100-PCTFREE) * (ISIZE 12) ** (NLEVELS-2) * (Indexpages

Ize-96) / (keys * (isize 8) * 4) <100

Creator Name Card Leaf Lvls Isize Keys F

4

Fly

5

Fly

6 REORG

-------------------------------------------------- ------------------------------

Table: DB2Inst1.order_Item_Item

DB2INST1 Order_Item_HDR 138064 369 3 10 48245 97 81 53 ---

DB2INST1 Order_Item_ORTYP 138064 192 2 6 1 100 70 0 ---

DB2INST1 Order_Item_Product 138064 201 3 12 2465 43 71 101 * - *

DB2INST1 Order_Item_STTS 138064 193 2 17 26 88 69 0 ---

DB2Inst1 Order_Item_SYST 138064 192 2 6 1 100 70 0 ---

DB2INST1 PK_ORDER_ITEM 138064 783 3 15 138064 97 99 16 ---

-------------------------------------------------- ---------------------------- Clusterratio or normal ClusterFactor (F4) will indicate the index

REORG, the index is not in the same sequence. One or more indexes when multiple indexes are defined in the table

May be marked as needed. Specifies the most important index of the REORG order.

2. Reelate index:

REORG TAB2INST1.ORDER_ITEM INDEX DB2INST1.ORDER_ITEM_PRODUCT USE TEMPSPACE1

3. Make Statistics Runstats on Table DB2Inst1.Order_Item and INDEXES ALL

4. Run Run Reorgchk

Index statistics:

F4: Clusterratio or normal CLUSTERFAACTOR> 80

F5: 100 * (Keys * (isize 8) * 4) / (NLEAF * Indexpagesize> 50

F6: (100-PCTFREE) * (ISIZE 12) ** (NLEVELS-2) * (Indexpages

Ize-96) / (keys * (isize 8) * 4) <100

Creator Name Card Leaf Lvls Isize Keys F

4

Fly

5

Fly

6 REORG

-------------------------------------------------- ------------------------------

Table: DB2Inst1.order_Item_Item

DB2INST1 Order_Item_HDR 138086 366 3 10 48266 28 81 53 * -

DB2INST1 Order_Item_ORTYP 138086 193 2 6 1 100 69 0 ---

DB2INST1 Order_Item_Product 138086 202 3 12 2465 99 71 101 - *

DB2INST1 Order_Item_STTS 138086 194 2 17 26 83 69 0 ---

DB2Inst1 Order_Item_SYST 138086 193 2 6 1 100 69 0 ---

DB2INST1 PK_ORDER_ITEM 138086 726 3 15 138086 26 106 16 * -

-------------------------------------------------- ------------------------------

Clusterratio or normal ClusterFactor (F4) will indicate the index

REORG, the index is not in the same sequence. One or more indexes when multiple indexes are defined in the table

May be marked as needed. Specifies the most important index of the REORG order.

5. Rebind All database packages so that static SQL can take advantage of the latest system statistics.

DB2RBIND SPIDER81-L logfile.out -u db2inst1 -p ibmdb2

result:

Comparison: total cost

effectiveness

Before REORG

26, 783.24

After REORG

48, 721.15

Worse!

analysis:

Product is originally IXSCAN, now TBScan. Reorg may change the information related to the table.

The associated SQL is:

Select oi.order_id from Order_Item Oi, Product PrDT

Where 1 = 1 and oi.product_id = prdt.product_id and prdt.product_name

Like '% Haier washing machine%'

Processing: Create index product_prdname for field product_name on the Product table.

Analyze SQL

Analysis: The index is here, but the time 33, 794.18 and TBSCAN's 34, 156 have not improved significantly, why?

Run: Reorgchk Current Statistics on Table DB2Inst1.Product

Table statistics:

F1: 100 * Overflow / Card <5

F2: 100 * Tsize / (FPAGES-1) * (TablePagesize-76)> 70

F3: 100 * NPAGES / FPAGES> 80

Creat Name Card OV NP FP Tsize F

1

Fly

2 f

3 REORG

-------------------------------------------------- ------------------------------

DB2INST1 Product 103168 2 8688 8688 33839104 0 96 100 ---

-------------------------------------------------- ------------------------------

Index statistics:

F4: Clusterratio or normal CLUSTERFAACTOR> 80

F5: 100 * (Keys * (isize 8) * 4) / (NLEAF * Indexpagesize> 50

F6: (100-PCTFREE) * (ISIZE 12) ** (NLEVELS-2) * (Indexpages

Ize-96) / (keys * (isize 8) * 4) <100

Creator Name Card Leaf Lvls Isize Keys F

4

Fly

5

Fly

6 REORG

-------------------------------------------------- ------------------------------

Table: db2inst1.product

DB2INST1 PK_PRODUCT - - - - - - -

DB2Inst1 Product_PrDName - - - - - - -

-------------------------------------------------- ------------------------------

Clusterratio or normal ClusterFactor (F4) will indicate indexes require REORG, which is not in the same sequence as the base form. One or more indexes when multiple indexes are defined in the table

May be marked as needed. Specifies the most important index of the REORG order.

The result is surprising, why is the index no content?

Run Runstats on Table DB2Inst1.Product and Indexes All

Table statistics:

F1: 100 * Overflow / Card <5

F2: 100 * Tsize / (FPAGES-1) * (TablePagesize-76)> 70

F3: 100 * NPAGES / FPAGES> 80

Creat Name Card OV NP FP Tsize F

1

Fly

2 f

3 REORG

-------------------------------------------------- ------------------------------

DB2INST1 Product 103168 2 8688 8688 33839104 0 96 100 ---

-------------------------------------------------- ------------------------------

Index statistics:

F4: Clusterratio or normal CLUSTERFAACTOR> 80

F5: 100 * (Keys * (isize 8) * 4) / (NLEAF * Indexpagesize> 50

F6: (100-PCTFREE) * (ISIZE 12) ** (NLEVELS-2) * (Indexpages

Ize-96) / (keys * (isize 8) * 4) <100

Creator Name Card Leaf Lvls Isize Keys F

4

Fly

5

Fly

6 REORG

-------------------------------------------------- ------------------------------

Table: db2inst1.product

DB2INST1 PK_PRODUCT 103168 480 3 10 103168 95 94 35 ---

DB2INST1 Product_PrdName 103168 271 3 23 16107 53 76 48 * -

-------------------------------------------------- ------------------------------

Clusterratio or normal ClusterFactor (F4) will indicate the index

REORG, the index is not in the same sequence. One or more indexes when multiple indexes are defined in the table

May be marked as needed. Specifies the most important index of the REORG order.

Run SQL

Construction indexing of Product

total cost

effectiveness

Before index

48, 721.15

After the index

22, 207.1

It is also an increase in 26,783.24 of the analysis 5.

Analysis 6:

Analysis: Put the SQL of the Analysis 4 in the program, find that the number of records is actually the same as the product name and the product name query! Take a closer look at SQL, it is not associated with the Order_Header when ORDER_ITEM and PRODUCT are associated. Processing: Modify SQL as follows: Red word is new

SELECT *

From

(Select row_number () over (Order by Inner_Set.Order_Date

DESC) as row_no, inner_set. *

FROM

Select ORDER_ID, ORDER_DATE

From

(Select header.order_id, order_type_id, header.status_id,

Owner_id, order_date, created_by

From Order_Header Header, Work_Effort Effort

Where 1 = 1 and Header.Order_type_id = 'Drpr_Order' and

Workflow_Activity_ID = 'Purchase_Approve' and current_status_id = 'wf_running'

And Header.Order_id = Effort.Source_reference_id

) ORD

Where 1 = 1 and (1 = 2 or)

(Select ORDER_ID

From Order_Attribute

WHERE ((attr_name = 'order_corporation_id' and (1 = 2 OR)

Attr_Value = '10000' or attr_value = '11000' OR

Attr_Value = '20000' or attr_value = '30000' OR

Attr_Value = '31000' or attr_value = '50000' OR

Attr_Value = '60000' or attr_value = '90000')))

OR (attr_name = 'business_mode' and attr_value = 'bm_ddcg'

)) and ord.order_id = order_id)) And exists (SELECT

oi.order_id

From Order_Item Oi, Product PrDT

Where 1 = 1 and oi.product_id = prdt.product_id and prdt.product_name

Like '% Haier%' and oi.order_id = ord.order_id)

Order by ord.order_date dec) AS inner_set) as outer_set

Where outri_set.row_no> 0 and outr_set.row_no <= 15

SQL access map:

total cost

effectiveness

Before the change

22, 207.1

After the change

6,819.9

Analysis of the access map: Obviously, the index made by the ProductName of the Product table is useless, consider removing the ProductName index, and the total cost of SQL is unchanged. So remove the InductName index.

Final processing method:

1. Work_effORT Adex: PK_EFFORT2. SQL is the SQL of Analysis 6

Suggestions for optimization:

1. Try to reduce the use of Order by, Exsist

2. Construction index where a = b on two table connections

3. For WHERE (a = b or c = d) and E = f case, the index of the E and F after And

4. When the expression is changed, the necessary REORG and RUNSTATS are required.

Contact me:

Mousehellen@hotmail.com

转载请注明原文地址:https://www.9cbs.com/read-75392.html

New Post(0)