Original source:
http://www.eygle.com/statspack/statspack14-logfilesync.htm When a user submits (stritals) or rollback, the session's Redo information needs to be written to the Redo logfile. The user process will notify the LGWR to write Out, LGWR is completed after the task will notify the user process. This wait event means that the user process waits for the write completion notification of LGWR.
For rollback operations, the event records from the user issues the ROLLBACK command to the time of rollback.
If you wait too much, it may illustrate that the write efficiency of LGWR is low, or the system is submitted too frequent. For this problem, you can follow: log file parallel Write Waiting User Commits, User Rollback and other statistics can be used to observe submission or rollback frequency
Solution: 1. Improve LGWR performance Try to use a quick disk, do not store the Redo log file on the disk of RAID 5 2. Use Bulletin Submit 3. Use NOLogging / UnRecoverable options
The average REDO write size can be calculated by the following formula:
Avg.redo write size = (redo blockwritten / redo write) * 512 BYTES
If the system generates a lot of REDO, it is less written, and the LGWR is too frequently activated. It may cause too much REDO-related Latch competition, and Oracle may not be able to use Piggyback's functionality.
We extract some data from a StatsPack to study this problem.
Main information
DB Name DB ID Instance Instal Release Ops Host
------------ ------------------------- ---- --------------
DB 1222010599 Oracle 1 8.1.7.4.5 No Sun
Snap ID Snap Time sessions
--------------------------------
Begin Snap: 3473 13-OCT-04 13:43:00 540
END SNAP: 3475 13-OCT-04 14:07:28 540
ELAPSED: 24.47 (MINS)
Cache Sizes
~~~~~~~~~~~
DB_BLOCK_BUFFERS: 102400 LOG_BUFFER: 20971520
DB_BLOCK_SIZE: 8192 Shared_Pool_Size: 600M
Load Profile
~~~~~~~~~~~~ per second per transaction
-----------------------------
Redo Size: 28, 458.11 2,852.03
......
2. Waiting for the event
Event Waits Timeouts Time (CS) / TXN
-------------------------------------------------- ----------------------
Log File Sync 14,466 2 4,150 3 1.0
DB File Sequential Read 17, 202 0 2,869 2 1.2latch Free 24,841 13,489 2,072 1 1.7
Direct Path Write 121 0 1,455 120 0.0
DB File Parallel Write 1,314 0 1,383 11 0.1
Log File Sequential Read 1,540 0 63 0 0.1
....
Log File Switch Completion 1 0 3 30 0.0
Refresh ControlFile Command 23 0 1 0 0.0
LGWR Wait For Redo Copy 46 0 0 0 0.0
....
Log File Single Write 4 0 0 0.0
We see that Log File Sync and DB File Parallel Write are waiting at the same time. Obviously log file sync is waiting for DB File Parallel Write to complete.
The disk IO will definitely exist in the bottleneck, the actual user's REDO and data files simultaneously on the RAID disk, and there is performance problem. Need adjustment.
3. Statistics
Statistic Total Per Second Per Trans
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------------
....
Redo Blocks Written 93, 853 63.9 6.4
Redo Buffer Allocation Retries 1 0.0 0.0
Redo Entries 135, 837 92.5 9.3
Redo Log Space Requests 1 0.0 0.0
Redo Log Space Wait Time 3 0.0 0.0
Redo Order Marks 0 0.0 0.0
Redo Size 41, 776, 508 28, 458.1 2,852.0
Redo Synch Time 4,174 2.8 0.3
Redo Synch Writes 14,198 9.7 1.0redo WaStage 4,769,200 3,248.8 325.6
Redo Write Time 3,698 2.5 0.3
Redo Writer Latching Time 0 0.0 0.0
Redo Writes 14,572 9.9 1.0
....
Sorts (Disk) 4 0.0 0.0
Sorts (Memory) 179, 856 122.5 12.3
Sorts (Rows) 2,750,980 1,874.0 187.8
....
Transaction Rollbacks 36 0.0 0.0
Transaction Tables Consistent REA 0 0.0 0.0
Transaction Tables Consistent REA 0 0.0 0.0
User Calls 1, 390, 718 947.4 94.9
User Commits 14,136 9.6 1.0
User Rollbacks 512 0.4 0.0
Write Clones Created in Backgroun 0 0.0 0.0
Write Clones Created in Foregroun 11 0.0 0.0 0.0
-------------------------------------------------- -----------
Avg.redo write size = (redo blockwritten / redo write) * 512 BYTES
= (93, 853/14, 572) * 512
= 3K
This average is too small, indicating that the system is submitted too frequently.
Latch Sleep Breakdown for DB: DPSHDB Instance: DPSHDB SNAPS: 3473 -3475
-> Ordered by Misses DESC
Get Spin &
Latch Name Requests Misses Sleeps Sleeps 1-> 4
------------------------------------------------ - - ------------------------ Row Cache Objects 12,257,850 113,299 64 113235/64/0 /
0/0
Shared Pool 3,690,715 60,279 15,857 52484/588/65
46/661/0
Library Cache 4,912,465 29,454 8,876 23823/2682/2
733/216/0
Cache Buffers CHAINS 10, 314, 526 2,856 33 2823/33/0/0 /
0
Redo Writing 76, 550 937 1 936/1/0/0
Session IDle Bit 2,871,949 225 1 224/1/0/0
Messages 107, 950 159 2 157/2/0/0
Session Allocation 184, 386 44 6 38/6/0/0/0
Checkpoint Queue Latch 96, 583 1 1 0/1/0/0/0
-------------------------------------------------- -----------
Due to frequent submission of transitions, LGWR is excessively frequently activated, and we have seen the Redo Writing Latch Competition.
About Redo Writing You can find a detailed introduction at STEVE sites: http://www.ixora.com.au/notes/lgwr_latching.htm
The transfer is as follows:
When LGWR wakes up, it first takes the redo writing latch to update the SGA variable that shows whether it is active. This prevents other Oracle processes from posting LGWR needlessly. LGWR then takes the redo allocation latch to determine how much redo might be available to write (subject to the release of the redo copy latches). If none, it takes the redo writing latch again to record that it is no longer active, before starting another rdbms ipc message wait. If there is redo to write, LGWR then inspects the latch recovery areas for the redo copy latches (without taking the latches) to determine whether there are any incomplete copies into the log buffer. for incomplete copies above the sync RBA, LGWR just defers the writing of that block and subsequent log buffer blocks. for incomplete copies below the sync RBA, LGWR sleeps on a LGWR wait for redo copy wait event, and is posted when the required copy latches have been released. The time taken by LGWR to take the redo writing and re do allocation latches and to wait for the redo copy latches is accumulated in the redo writer latching time statistic. (Prior to release 8i, foreground processes held the redo copy latches more briefly because they did not retain them for the application of the change vectors. THEREFORE, LGWR Would INSTETETEMPT TO Assure Itself That there.
After each redo write has completed, LGWR takes the redo allocation latch again in order to update the SGA variable containing the base disk block for the log buffer. This effectively frees the log buffer blocks that have just been written, so that they may be reused .
Author: eygle, Oracle technology followers, Oracle technical forum itpub.www.eygle.com from China is the biggest author's personal site you may contact the author by Guoqiang.Gai@gmail.com welcome to explore technical exchanges and links. Exchange. Original article:
http://www.eygle.com/statspack/statspack14-logfilesync.htm