Author: Allan (allan@itpub.net)
Oracle Database often encounters a high CPU utilization, this time is mostly a SQL statement in the database. This SQL statement has greatly consumed CPU resources, causing the entire system. Of course, the reason for the SQL statement that causes severe performance is multifaceted, and specific reasons should be analyzed. The following is how to diagnose and solve the high CPU utilization.
Operating system: SOLAIRS8
Database: Oracle
9.2.0
.4
Problem Description: Field engineers report the database very slow, almost all application operations are not working properly.
First land the host, execute the TOP to find that the CPU resources are almost exhausted, there are many processes that occupy the CPU, and the memory and I / O are not high, the specific:
Last Pid: 26136; Load Averages: 8.89, 8.91, 8.12
216 Processes: 204 Sleeping, 8 Running, 4 On CPU
CPU States: 0.6% iDLE, 97.3% User, 1.8% Kernel, 0.2% iowait, 0.0% SWAP
Memory: 8192m Real, 1166M Free, 14M Swap in Use, 8179M Swap Free
Pid Username THR PRI NICE SIZE
RES
State
Time CPU Command
25725 Oracle 1 50 0 4550M 4508M CPU2
12:23
11.23% Oracle
25774 Oracle 1 41 0 4550M 4508M Run
14:25
10.66% Oracle
26016 Oracle 1 31 0 4550M 4508M RUN
5:41
10.37% Oracle
26010 Oracle 1 41 0 4550M 4508M Run
4:40
9.81% Oracle
26014 Oracle 1 51 0 4550M 4506M CPU6
4:19
9.76% Oracle
25873 Oracle 1 41 0 4550M 4508M Run
12:10
9.45% Oracle
25723 Oracle 1 50 0 4550M 4508M Run
15:09
9.40% Oracle
26121 Oracle 1 41 0 4550M 4506M CPU0
1:13
9.28% Oracle
So first check the alarm log Alert file for the database, there is nothing to find out, the log shows that the database is running normally, and the rule of the database itself exists.
Then check what the Oracle process is doing this occupied CPU resources, use the following SQL statement:
SELECT SQL_TEXT, SPID, V $ Session.program, Process from
V $ SQLAREA, V $ SESSION, V $ Process
Where v $ sqlarea.address = V $ session.sql_address
And v $ sqlarea.hash_value = v $ session.sql_hash_value
And v $ session.paddr = v $ process.addr
And v $ process.spid in (pid);
Use the PID in the PID in TOP to use the PID in the PID in the CPU, get the SQL statement executed by the corresponding Oracle process, found that the process that occupies the CPU resource is executing the same SQL statement: SELECT D.DOMAINNAME, D .mswitchdomainid, a.serviceid, a.Servicecode, A.USERTYPE, A.STATUS, A.NotifyStatus, to_char (A.Datecreated, 'YYYY-MM-DD HH24: MI: SS') Datecreated, VIPFLAG, STATUS2, CUSTOMERTYPE, CUSTOMERID FROM service a, gatewayloc b, subbureaunumber c, mswitchdomain d WHERE b.mswitchdomainid = d.mswitchdomainid and b.gatewaysn = c.gatewaysn aND a.ServiceCode like c.code || '%' and a.serviceSpecID = 1 and a .status! = '4' and a.status! = '10' and a.Servicecode Like '010987654321%' and subsidiaryid = 99999999
Basically, this SQL has caused a large number of system CPU resources. What is the reason for this SQL such a large amount of CPU resources, let's take a look at the database's process waiting for the event:
SQL> SELECT SID, Event, P1, P1text from V $ session_wait;
Sid Event P1 P1Text
---------- -------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------
12 latch free 4.3982e 12 Address
36 Latch Free 4.3982E 12 Address
37 latch free 4.3982e 12 address
84 Latch Free 4.3982E 12 Address
102 latch free 4.3982e 12 address
101 Latch Free 4.3982E 12 Address
85 latch free 4.3982e 12 address
106 Latch Free 4.3982E 12 Address
155 Latch Free 4.3982E 12 Address
151 Latch Free 4.3982E 12 Address
149 Latch Free 4.3982E 12 Address
147 Latch Free 4.3982E 12 Address
1 PMON Timer 300 Duration
From the above query we can see that most of the Latch Free is waiting for events, and then check what these Latch wait is generated:
SQL> SELECT SPID FROM V $ Process Where Addr in
(SELECT PADDR from V $ Session WHERE SID IN (84, 102, 101, 106, 155, 151));
Spid
----------------
25774
26010
25873
25725
This shows that Latch Free wait for the event that the SQL statement above is waiting, which takes up a lot of CPU resources. Let's take a look at the waiting type of Latch waiting, according to the following SQL statement: SQL> SELECT LATCH #, Name, Gets, Misses, Sleeps
From v $ latch
Where Sleeps> 0
ORDER by Sleeps;
Latch # name Gets Misses Sleeps
---------- -------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------
15 Messages 96876 20 1
159 Library Cache Pin Allocation 407322 43 1
132 DML Lock Allocation 194533 213 2
4 session allocation 304897 48 3
115 redo allocation 238031 286 4
17 enqueue hash chains 277510 85 5
7 session idle bit 2727264 314 16
158 Library Cache Pin 3881788 5586 58
156 Shared Pool 2771629 6184 662
157 Library Cache 5637573 25246 801
98 Cache Buffers Chains 1722750424 758400 109837
From the above query, you can see that the most important Latch wait is Cache Buffers, this latch Waiting indicates that data stocks compete in separate blocks These Latch, let's see this lative lative Latch and its corresponding type:
SQL> SELECT Addr, Latch #, Gets, Misses, Sleeps
From v $ latch_children
Where Sleeps> 0
And latch # = 98
ORDER by Sleeps DESC;
AddR Latch # gets miss Sleeps
-------------------------------------------------------------------------------- --------
000004000A3DFD10 98 10840661 82891 389
000004000A698C70 98 159510 2 244
0000040009B21738 98 104269771 34926 209
0000040009B227A8 98 107604659 35697 185
000004000A3E0D70 98 5447601 18922 156000004000A6C2BD0 98 853375 7 134
0000040009B24888 98 85538409 25752 106
...............
Then we come to see more than SLEEP more objects:
SQL> Select Distinct A.OWNER, A.SEGMENT_NAME, A.SEGMENT_TYPE FROM
DBA_EXTENTS A,
(Select DBARFIL, DBABLK
From x $ bh
WHERE HLADDR IN
(SELECT ADDR
From (SELECT ADDR
From v $ latch_children
Order by Sleeps DESC)
WHERE ROWNUM <5)) B
WHERE A.RELATIVE_FNO = B. DBARFIL
And a.block_id <= b.dbablk and a.block_id a.blocks> b.dbablk;
Owner segment_name segment_type
-------------------------------------------------- -------------------------
Test i_service_servicespecid index
Test i_service_subsidiaryid Index
Test Service Table
Test mswitchdomain table
TEST i_SERVICE_SC_S INDEX
.......................
We see several objects in the SQL statement in the beginning of the SQL statement, so let's take a look at the beginning of the SQL execution plan:
SQL> Set Autotrace TRACE EXPLAIN
SQL> Select D. DomainName, D. MswitchDomainid, A.ServiceId, A.Servicecode, A.USERTYPE, A.STATUS, A.NOTIFYSTATUS, TO_CHAR (A.DATECREATED, 'YYYY-MM-DD HH24: MI: SS') DATECREATED, VIPFLAG, STATUS2, CUSTOMERTYPE, CUSTOMERID FROM service a, gatewayloc b, subbureaunumber c, mswitchdomain d WHERE b.mswitchdomainid = d.mswitchdomainid and b.gatewaysn = c.gatewaysn aND a.ServiceCode like c.code || '%' And A.Servicespecid = 1 and A.Status! = '4' and a.status! = '10' and a.Servicecode Like '010987654321%' and subsidiaryid = 999999999; Execution PLAN
-------------------------------------------------- ------------
0 Select Statement Optimizer = Choose
1 0 NESTED LOOPS
2 1 nested loops
3 2 NESTED LOOPS
4 3 Table Access (Full) of 'SubbureAunumber'
5 3 Table Access (by index rowid) of 'GatewayLoc'
6 5 Index (Unique Scan) of 'PK_GATEWAYLOC' (UNIQUE)
7 2 Table Access (by index rowid) of 'mswitchdomain'
8 7 Index (Unique Scan) of 'PK_MSwitchDomain' (Unique)
9 1 Table Access (by Index Rowid) of 'Service'
10 9 and-equal
11 10 Index (Range Scan) of 'i_service_serviceSpecid' (Non
-Unique)
12 10 INDEX (Range Scan) of 'i_service_subsidiaryID' (Non-
Unique)
Depending on the beginning of the object to cause the Latch Free waiting in the Latch Free waiting, I feel that the index on the service table has problems, and it seems to have too many scans, so the same SQL statement is the same in other land. Execute on the database to view the appropriate execution plan:
SQL> Set Autotrace TRACE EXPLAIN
SQL> Select D. DomainName, D. MswitchDomainid, A.ServiceId, A.Servicecode, A.USERTYPE, A.STATUS, A.NOTIFYSTATUS, TO_CHAR (A.DATECREATED, 'YYYY-MM-DD HH24: MI: SS') DATECREATED, VIPFLAG, STATUS2, CUSTOMERTYPE, CUSTOMERID FROM service a, gatewayloc b, subbureaunumber c, mswitchdomain d WHERE b.mswitchdomainid = d.mswitchdomainid and b.gatewaysn = c.gatewaysn aND a.ServiceCode like c.code || '%' And A.Servicespecid = 1 and A.Status! = '4' and a.status! = '10' and a.Servicecode Like '010987654321%' and subsidiaryid = 999999999; Execution PLAN
-------------------------------------------------- ------------
0 Select Statement Optimizer = Choose
1 0 Table Access (By Index Rowid) of 'Service'
2 1 nested loops
3 2 NESTED LOOPS
4 3 Nested Loops
5 4 Table Access (Full) of 'SubbureAunumber'
6 4 Table Access (By Index Rowid) of 'Gatewayloc'
7 6 INDEX (UNIQUE SCAN) of 'PK_GATEWAYLOC' (UNIQUE)
8 3 Table Access (by index rowid) of 'mswitchdomain'
9 8 index (unique scan) of 'pk_mswitchdomain' (unique)
10 2 INDEX (RANGE SCAN) of 'i_service_sc_s' (non-unique)
Compare two execution plans, find index i_service_servicespecid and i_service_subsidiaryid should not be taken, so compare the number of indexes on both local service tables:
SQL> SELECT INDEX_NAME from User_indexes where Table_Name = 'Service'
Index_name
------------------------------
I_service_accountnum
I_service_cid
I_service_dateactivated
I_service_pricepland
I_service_sc_s
I_Service_ServiceCode
I_service_servicespecid
I_service_subsidiaryidID
PK_SERVICE_SID
SQL> SELECT INDEX_NAME from User_indexes where Table_Name = 'Service'; Index_Name
------------------------------
I_service_accountnum
I_service_cid
I_service_dateactivated
I_service_sc_s
I_Service_ServiceCode
PK_SERVICE_SID
Discover how many I_Service_PricePlanid, i_service_servicespecid, i_service_servicespecid, i_service_servicespecid, i_service_servicespecid, i_service_servicespeiRiD, i_service_service_service_servicespecid, i_service_service, and these indexes have caused the SQL statement to use the index that should not be used, causing the Latch Free waiting and CPU occupied A very high culprit, so I delete the three indexes, re-executing the corresponding SQL statement, soon, the result is that the utilization rate of the CPU will fall immediately as normal, and the observations are as follows:
Last Pid: 26387; Load Averages: 1.61, 1.38, 1.21
195 Processes: 194 Sleeping, 1 On CPU
CPU States: 96.2% iDLE, 1.6% User, 1.7% Kernel, 0.5% iowait, 0.0% SWAP
Memory: 8192m REAL, 1183M Free, 14M Swap in Use, 8179M Swap Free
Pid Username THR PRI NICE SIZE
RES
State
Time CPU Command
26383 Oracle 1 59 0 4550M 4506M Sleep 0:12 4.52% Oracle
409 root 15 59 0 7168K 7008K Sleep 173.1H 0.53% PICLD
25653 Oracle 1 59 0 4550M 4508M SLEEP
2:12
0.48% Oracle
26384 Root 1 59 0 2800K 1912K CPU2 0:00 0.21% TOP-3.5B8-Sun4u
25569 Oracle 1 59 0 4550M 4508M Sleep 0:12 0.09% Oracle
25717 Oracle 1 59 0 4550M 4507M Sleep 0:07 0.05% Oracle
25571 Oracle 1 59 0 4550M 4507M Sleep 0:10 0.04% Oracle
25681 Oracle 1 59 0 4550M 4508M Sleep 0:10 0.04% Oracle
25544 Oracle 1 58 0 4554M 4501M Sleep 0:14 0.03% Oracle
25703 Oracle 1 59 0 4550M 4506M Sleep 0:23 0.03% Oracle
..................
For the case where the CPU utilization is too high, if the SQL statement is relatively low, it can be diagnosed and solved according to this idea. Of course, the specific problem has to be analyzed, there are many ways to solve the problem, but here is Throwing the jade, as long as it can eventually meet the purpose of our solving the problem.
About the Author:
Photo Temporary Net Name COOLYL
9CBS Emag Oracle Electronics Magazine Editor's ITPUB Oracle Management Master. Good at the maintenance of the database, has its own unique experience for the installation, adjustment, and backup of the database. At the same time, there is also a major training for large companies in China, and there is a certain training experience. Database maintenance and support work for many large projects has been done, which has considerable practical experience in Oracle's maintenance, and is good at solving problems on site. He has worked for a large software company to do Oracle databases, customers all over the country, especially telecommunications, and government industries. He is currently based in North China Branch, DBA, responsible for the maintenance of more than 40 database systems in North China, rich in large database management experience. The editor of the "Oracle Database DBA Special Technology" and the main author.
Mail address: allan@itpub.net