About the Author:
Zhang Lezhen, usually used the net name to kamus, and has used SERAPHIM, now serves in a large software company in Beijing, Oracle Database DBA, mainly responsible for the core trading system database management and maintenance of the securities industry.
Eat care of Oracle technology and related operating system technology, out of the major database technology forums, currently China's largest Oracle technology forum www.itpub.net database management version of the Lord.
My email address: kamus@itpub.net
My msn: kamus@hotmail.com
I hope to discuss Oracle related technologies with friends.
All articles, no permission, no reprint.
table of Contents
First, preface ... 4
Second, the idea ... 4
Third, VMSTAT scripts and steps ... 4
1. Install Statspack. 4
2. Create a Stats $ VMSTAT Table ... 4
3. Create a VMSTAT directory ... 6
4. Create get_vmstat.ksh scripts ... 6
5. Create Run_VMSTAT.KSH scripts ... 8
6. Create a crontab job, perform Run_VMSTAT.KSH scripts in schedul ... 9
7. Analyze data ... 9
1) Abnormal report ... 9
2) Hourly Trend Report ... 13
3) Week Trend Report ... 14
4) Long-term trend report ... 14
Fourth, use Excel to generate trend graphs ... 15
Five, reference information ... 15
I. Introduction
As the Oracle database administrator in the UNIX system, the performance of the monitoring operating system is undoubtedly very important, fortunately, the UNIX system provides a large number of monitoring commands, such as VMSTAT, IOSTAT, SAR, TOP, etc., these monitoring commands The character interface output results, coupled with the powerful analysis function of the shell in the UNIX system, so we only need to write some scripts to implement automatic background monitoring, and automatically send mail to DBA when there is a problem.
Some of the similar monitoring scripts may have been visible everywhere, but a new idea mentioned in this article is to use the Oracle database to store the results of the monitor, and then use the database of natural retrieval advantages, which is more analyzed than the previous shell analysis. Even the monitoring results of the artificial analysis are more relaxed, smart, and can also achieve more lasting and broader monitoring.
Most of this article Source from Donald K. Burleson's Oracle9i Unix Administration Handbook, but modifications for some errors and unreasonable places in the script, and add some functions, this article is based on Sun Solaris 8 SPARC 64bit Oracle
9.2.0
.5.
The first edition of this article mainly monitors memory consumption and the VMSTAT command waiting for the memory, which will add other commands such as iostat and other commands.
Second, the idea
First install StatsPack, then manually create a table for storing the VMSTAT command output, then make a shell timing execute Vmstat, insert all the results into the database table, and finally the data in the table is analyzed by the SQL timing, exceeding the pre-specified door The limit is alarm. At the same time, according to the data in the table, we can also make system performance trend reports.
Third, VMSTAT scripts and steps
1. Install Statspack
SQLPlus "/ as sysdba"
SQL> @? / Rdbms / admin / spcreate.sql
After setting the password of the Perfstat user, due to 10g
I chose the sysaux table space as the default table space of the Perfstat user, and not to create a table space.
2. Create a Stats $ VMSTAT table
SQL> Connect Perfstat / Perfstat
SQL> @ crete_vmstat_tab.sql
# CREATE_VMSTAT_TAB.SQL
Drop Table Stats $ Vmstat; Create Table Stats $ Vmstat (Start_Date Date, Duration Number, Server_Name Varchar2 (20), Running_Queue Number,
Waiting_Queue Number,
Swap_in number,
Swap_out number, kbytes_page_in number, kbytes_page_out number,
Page_scan number, user_cpu number, system_cpu number, iDLE_CPU Number, Wait_CPU Number, TableSpace SysauxStorage (Initial
10
M
NEXT
1
M
Pctincrease 0);
Comment on column stats $ vmstat.start_date is 'Monitoring time'; Comment On Column Stats $ Vmstat.duration IS 'Monitor Powder';
Comment on column stats $ vmstat.server_name is 'server name'; Comment On Column Stats $ VMSTAT.RUNNING_QUEUE IS 'Execution Queue';
Comment On Column Stats $ VMSTAT.WAITING_QUEUE IS 'Waiting for Queue';
Comment on column stats $ vmstat.swap_in is' The average exchange of LWPs per second;
Comment On Column Stats $ VMSTAT.SWAP_OUT IS 'Exchange Process Number;
Comment on column stats $ VMSTAT.KBYTES_PAGE_IN IS 'Change Page Bytes (K)'; Comment On Column Stats $ VMSTATTAT.KBYTES_PAGE_OUT IS 'Discharge Bihandte (K)';
Comment On Column Stats $ VMSTAT.PAGE_SCAN IS 'Change The number of page daemon scanned pages';
Comment on column stats $ vmstat.user_cpu is 'user action occupied CPU'; Comment On Column Stats $ VMSTAT.System_CPU IS 'system operation occupied CPU';
Comment on column stats $ vmstat.idle_cpu is 'CPU idle rate'; Comment On Column Stats $ VMSTATAT.WAIT_CPU IS 'CPU waiting rate (AIX unique)';
3. Create a VMSTAT directory
Create a VMSTAT directory for all relevant scripts in the Oracle User Home. $ CD
$ MKDIR VMSTAT
4. Create get_vmstat.ksh scripts
The upgrade is used to execute the VMSTAT command timing and store the result into the database.
# get_vmstat.ksh
#! / bin / ksh
# ----------------------------------------
# First set the environment variable, modify it according to the actual environment..
# ----------------------------------------
Oracle_sid = kamusdb
Export Oracle_sid
Oracle_Home = `Cat / VAR / OPT / ORACLE / ORATAB | GREP / ^ $ Oracle_sid: | cut -f2 -d ':'
Export Oracle_home
PATH = $ oracle_home / bin: $ PATH
Export Path
Server_name = `uname -a | awk '{print $ 2}'`
Typeset -u server_name
Export Server_name
# ----------------------------------------
# Run a VMSTAT (300 seconds) every 5 minutes, which can be modified according to the actual situation..
# ----------------------------------------
SAMPLE_TIME = 300
# ----------------------------------------
# Script Once run, it will not stop unless the operating system is turned off.
# -S parameter represents the case of monitoring SWAP space, report Si, SO column
# MSG $$ $$ Represents an arbitrary 2 digit, the system is automatically generated
# ----------------------------------------
While True
DO
Vmstat -s $ {sample_time} 2> / TMP / MSG $$
# ----------------------------------------
# Solaris system VMSTAT does not have a WAIT CPU statistics, so we deposit 0 in that column.
# 1, $ 2, $ 6, $ 7 and other numbers indicate the sequence in the VMSTAT output,
# VMSTAT output in each UNIX system may not be the same,
# Therefore, modify these column numbers can deal with different operating systems.
# ----------------------------------------
CAT / TMP / MSG $$ | SED 1,3d | awk '{Printf ("% S% S% S% S% S% S% S% S% S% S / N", $ 1, $ 2, $ 6, $ 7 $ 8, $ 9, $ 12, $ 20, $ 21)} '| While Read Runque WaitQue Swapin Swapout Page_IN Page_OUT Page_Scan User_CPU System_CPU Idle_CPU
DO
$ Oracle_home / bin / sqlplus perfstat / perfstat << EOF
INSERT INTO Perfstat.stats / $ VMSTAT
VALUES
Sysdate,
$ SAMPLE_TIME,
'$ Server_name',
$ Robnque,
$ WAITQUE,
$ SWAPIN,
$ SWAPOUT,
$ Page_in,
$ Page_OUT,
$ Page_scan,
$ USER_CPU,
$ SYSTEM_CPU,
$ IDLE_CPU,
0
);
COMMIT;
Exit
EOF
DONE
DONE
# ----------------------------------------
# Delete temporary files
# ----------------------------------------
RM / TMP / MSG $$
5. Create Run_VMSTAT.KSH scripts
This script is placed in crontab, which is used to check the get_vmstat.ksh script is not working properly. If you are running, then don't make any action, if you are not running, then run the get_vmstat.ksh script. The meaning of this script is to prevent restarting the operating system from forgetting to run the get_vmstat.ksh script.
# Run_VMSTAT.KSH
#! / bin / ksh
# ----------------------------------------
# First set the environment variable, modify it according to the actual environment..
# ----------------------------------------
Script_path = `echo ~ oracle / vmstat`
Export script_path
Oracle_sid = kamusdb
Export Oracle_sid
Oracle_Home = `Cat / VAR / OPT / ORACLE / ORATAB | GREP $ ORACLE_SID: | CUT -F2 -D ':'`
Export Oracle_home
PATH = $ oracle_home / bin: $ PATH
Export Path
# ----------------------------------------
# Further control, stop monitoring within the system without operation
# 8 监 监 监 监 之间 之间 之间 监 监 监
# Other time if get_vmstat is not running, run it
# ----------------------------------------
Hour = `Date "% h "`
Check_stat = `ps -ef | grep get_vmstat | grep -v grep | wc -l`
vmstat_num = `expr $ check_stat`
IF [$ hour -gt 19] || [$ Hour -LT 8]
THEN
IF [$ VMSTAT_NUM -GT 0]
THEN KILL -9 `PS-EF | GREP GET_VMSTAT | GREP -V GREP | AWK '{Print $ 2}'`> / dev / null
Fi
Else
IF [$ VMSTAT_NUM -LE 0]
THEN NOHUP $ Script_Path / Get_VMSTAT.KSH> / DEV / NULL 2> & 1 & 1 & 1 & 1
Fi
Fi
6. Create a crontab job, perform Run_VMSTAT.KSH script timing
This job runs every half hour.
$ crontab -l> Oracle.cron
$ echo '00, 30 * * * * /export/home/oracle/vmstat/run_vmstat.ksh >> /export/run.lst 2> &
1'
>> Oracle.cron
$ crontab oracle.cron
7. Analyze data
The steps that have been scheduled to run VMSTAT and store VMSTAT results in the database have been completed. Nature, just collecting statistics is far less than enough, let's analyze the collection of information, generate the performance report of the operating system. The report is divided into the following categories, and the script is implemented separately.
Abnormal report: showing the time period exceeding the threshold (VMSTAT_ALERT.KSH VMSTAT_ALERT.SQL)
Time Trend Report: Shows the average system average system per hour (rpt_vmstat_hr.sql)
Week Trend Report: Shows the average daily system (rpt_vmstat_dy.sql)
Long-term trend report: Display a long-term trend line (rpt_vmstat.sql) of system performance)
1) Abnormal report
# vmstat_alert.sql
Rem ----------------------------------------
REM This SQL is used to report an exception of each server in an Oracle environment.
Rem reports the average of each hour based on the information obtained by the get_vmstat.ksh script.
The REM DBA discovers an abnormal time period, and you can check the detailed data every 5 minutes.
REM accepts a parameter in this script that indicates the time span that needs to be reported.
REM The threshold for Runing Queue should be set to the number of CPUs,
REM indicates that if there is a lot of tasks waiting to be executed, this usually means that the CPU load is too heavy.
REM is set to 1 for the threshold value of Page SCAN (SR).
REM represents the alarm as long as the Page Daemon scan page appears, which usually indicates that the memory is insufficient.
REM is set to 70 for CPU utilization, indicating that more than 70% utilization is alarm.
REM, for example, running VMSTAT_ALERT 7
REM indicates that the execution queue is greater than 4 before outputting the current date.
The REM SR is more than 1, and the CPU utilization exceeds 70% report.
Rem ----------------------------------------
set lines 80; set pages 999; set feedback off; set verify off; column my_date heading 'date hour' format a20column c2 heading waitq format 9999column c3 heading pg_in format 9999column c4 heading pg_ot format 9999column c5 heading usr format 9999column c6 heading sys format 9999column c7 heading idl format 9999column c8 heading wt format 9999ttitle 'run queue> CPUs | May indicate an overloaded CPU | When runqueue exceeds the number of CPUs | on the server, tasks are waiting for service.'; select server_name, to_char (start_date, ' YY / MM / DD HH24 ') my_date, avg (running_queue) c2, avg (kbytes_page_in) c3, avg (kbytes_page_out) c4, avg (user_cpu) c5, avg (system_cpu) c6, avg (idle_cpu) c7fromperfstat.stats $ vmstatWHERErunning_queue> 4and start_date> sysdate- & 1group by server_name, to_char (start_date, 'yy / mm / dd hh24') Order by server_name, to_char (start_date, 'yy / mm / dd hh24'); TTITLE 'PAGE_SCAN> 1 | May Indi cate overloaded memory | Whenever Unix performs a page-in, the RAM memory | on the server has been exhausted and swap pages are being used '; select server_name, to_char (start_date,' YY / MM / DD HH24 ') my_date, avg. (running_queue) c2, avg (kbytes_page_in) c3, avg (kbytes_page_out) c4, avg (user_cpu) c5, avg (system_cpu) c6, avg (idle_cpu) c7fromperfstat.stats $ vmstatWHEREpage_scan> 1and start_date> sysdate- & 1group by server_name, to_char ( START_DATE, 'YY / MM / DD HH24') Order by Server_Name, TO_CHAR (START_DATE, 'YY / MM / DD HH24'); TTITLE 'USER SYSTEM CPU>
. 70% | Indicates periods with a fully-loaded CPU subssystem | Periods of 100% utilization are only a | concern when runqueue values exceeds the number of CPs on the server '; select server_name, to_char (start_date,' YY / MM /. DD HH24 ') my_date, avg (running_queue) c2, avg (kbytes_page_in) c3, avg (kbytes_page_out) c4, avg (user_cpu) c5, avg (system_cpu) c6, avg (idle_cpu) c7fromperfstat.stats $ vmstatWHERE (user_cpu system_cpu) > 70nd start_date> sysdate- & 1group by server_name, to_char (start_date, 'yy / mm / dd hh24') Order by server_name, to_char (start_date, 'yy / mm / dd hh24'); # vmstat_alert.ksh
# ----------------------------------------
# You can join this shell into cron, run 7 o'clock a day.
# ----------------------------------------
#! / bin / ksh
# ----------------------------------------
# First set the environment variable, modify it according to the actual environment..
# Accept a parameter input, indicating the currently reported database SID
# ----------------------------------------
Oracle_sid = $ 1
Export Oracle_sid
Oracle_Home = `Cat / VAR / OPT / ORACLE / ORATAB | GREP $ ORACLE_SID: | CUT -F2 -D ':'`
Export Oracle_home
PATH = $ oracle_home / bin: $ PATH
Export Path
Script_path = `echo ~ oracle / vmstat`
Export script_path
SQLPlus perfstat / perfstat <
Spool /TMP/VMSTAT_ / ORACLE_SID.LST
@ $ Script_path / vmstat_alert 7 4
Spool OFF;
EXIT;
!
# ----------------------------------------
# Check the output result of VMSTAT_ALERT.SQL
# Normal situation should only contain the following 2 lines
# SQL> @ / export / home / oracle / vmstat / vmstat_alert 7
# SQL> Spool OFF;
# If more than 3 lines indicate an abnormal value, then direct mail to DBA
# ----------------------------------------
Check_stat = `CAT /TMP/VMSTAT_$$Oracle_sid.lst|wc -l`;
Oracle_num = `EXPR $ Check_stat`
IF [$ oracle_num -gt 3]
THEN
CAT / TMP/VMSTAT_$$Oracle_sid.lst|mailx -s "System Vmstat Alert" Kamus@itpub.net Some_other_dba@mail.address.netfi
1. Create a crontab job, execute the vmstat_alert.ksh script at 7 o'clock a day.
$ crontab -l> Oracle.cron
$ echo '00 7 * * * /export/Home/oracle/vmstat/vmstat_alert.ksh kamusdb >> /EXPORT/HOME/Oracle/vmstat/runalert.lst 2> &
1'
>> Oracle.cron
$ crontab oracle.cron
2) Daily Trend Report
# rpt_vmstat_hr.sql
Rem ----------------------------------------
REM This SQL is used to report average CPU usage in each server in an Oracle environment.
REM accepts a parameter to specify the date that needs to be reported, format YYYYMMDD
Rem ----------------------------------------
Connect Perfstat / Perfstat; SET Pages 9999; Set Feedback Off; Set Verify OFF;
column server_name heading 'server' format a10column my_hour heading 'hour' format a10column c2 heading runq format 9999column c3 heading pg_in format 9999column c4 heading pg_ot format 9999column c5 heading usr format 9999column c6 heading sys format 9999column c7 heading idl format 9999column c8 heading wt format 9999select server_name, to_char (start_date, 'HH24') my_hour, avg (runing_queue) c2, avg (kbytes_page_in) c3, avg (kbytes_page_out) c4, avg (user_cpu system_cpu) c5, avg (system_cpu) c6, avg (idle_cpu) c7 from Stats $ VMSTAT WHERE TRUNC (START_DATE) = TO_DATE (& 1, 'YYYYMMDD') Group by server_name, to_char (start_date, 'hh24') Order by Server_name, TO_CHAR (START_DATE, 'HH24');
3) Week Trend Report
# rpt_vmstat_dy.sql
Rem ----------------------------------------
REM This SQL is used to report the daily average CPU usage of each server in the Oracle environment this week.
Rem ---------------------------------------- CONNECT Perfstat / Perfstat; Set Pages 9999; Set feedback off; set verify OFF;
column server_name heading 'server' format a10column my_day heading 'day' format a20column c2 heading runq format 9999column c3 heading pg_in format 9999column c4 heading pg_ot format 9999column c5 heading usr format 9999column c6 heading sys format 9999column c7 heading idl format 9999column c8 heading wt format 9999select server_name, to_char (start_date, 'day') my_day, avg (runing_queue) c2, avg (kbytes_page_in) c3, avg (kbytes_page_out) c4, avg (user_cpu system_cpu) c5, avg (idle_cpu) c7 from stats $ vmstat where trunc ( start_date)> = trunc (next_day (sysdate, 'MONDAY')) - 7 and trunc (start_date) 4) Long-term trend report # rpt_vmstat.sql Rem ---------------------------------------- REM This SQL is used to report average CPU usage of each server day in the Oracle environment. REM report is all of the collected data Rem ---------------------------------------- Connect Perfstat / Perfstat; SET Pages 9999; Set Feedback Off; Set Verify OFF; Column Server_name heading 'Server' Format A10COLUMN MY_DATE Heading 'Date' Format A20COLUMN C2 Heading Runq Format 9999 Column C3 Heading PG_IN FORMAT 9999 Column C4 Heading PG_OT FORMAT 9999 Column C5 Heading USR Format 9999 Column C6 Heaping Sys Format 9999 Column C7 Heading Idl Format 9999 Column C8 Heading WT Format 9999