ORA-600: resolving the fault and digging out the root cause

zhaozj2021-02-16 125

Resolving the ORA-00600 fault

At 12:40 on July 16 the database server suddenly crashed. After logging in as root, running ls reported a "Bad Error" error, df showed that there was enough disk space, su - oracle still left root's environment variables in place, and lsnrctl and the other Oracle-related executables could not be found. The server had to be restarted.


Starting the database (it had already reached the MOUNT state) reported:

ORA-00600: internal error code, arguments: [kcratr1_lostwrt], [], [], [], [], []

An ORA-600 is nothing unusual by now, but how could I get the database open? A search on MetaLink showed that this problem exists across the 9i releases. If the database reports this ORA-600 at startup, that is, with the arguments shown above, it can be recovered by starting it to the MOUNT state and then running:

SQL> RECOVER DATABASE;

SQL> ALTER DATABASE OPEN;

Before I could even catch my breath, a pile of email alerts had arrived. Most of them were about corrupt blocks:

ORA-01578: ORACLE data block corrupted (file # 2, block # 247726)

ORA-01110: data file 2: '/home/oracle/oradata/esal/ts_cybercafe06.dbf'

ORA-01578: ORACLE data block corrupted (file # 5, block # 243660)

ORA-01110: data file 5: '/home/oracle/oradata/esal/ts_cybercafe01.dbf'

Only two data files were affected, each with a few corrupt blocks. On checking, the corrupt blocks turned out to belong to two indexes, which was a relief: dropping or rebuilding the indexes would solve the problem. Still not fully reassured, I checked all the data files with DBV, and it found no corrupt blocks.
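One common way to map a block reported by ORA-01578 to the segment that owns it is a lookup in DBA_EXTENTS. The following is a minimal sketch using the file and block numbers from the first error above; the owner and index name in the rebuild statement are hypothetical placeholders, not names from this system.

```sql
-- Which segment owns block 247726 of data file 2?
SELECT owner, segment_name, segment_type
  FROM dba_extents
 WHERE file_id = 2
   AND 247726 BETWEEN block_id AND block_id + blocks - 1;

-- If segment_type is INDEX, the index can be rebuilt.
-- some_owner.some_index is a hypothetical name; substitute the query result.
ALTER INDEX some_owner.some_index REBUILD;
```

A plain rebuild may read the existing index and run into the same corrupt block; in that case dropping and recreating the index is the usual fallback.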

I quickly took a full backup of the database with RMAN and FTPed it to a backup machine. After the database was recovered, I checked its bdump and udump directories and found that some of the files in the udump directory also had problems. The ls command returned the following:

# ls

ls: esal_ora_18367.trc: Input/output error

ls: esal_ora_18371.trc: Input/output error

ls: esal_ora_18377.trc: Input/output error

ls: esal_ora_18373.trc: Input/output error

ls: esal_ora_18379.trc: Input/output error

I could not tell whether this meant bad blocks on the disk. The system runs on RAID 0+1, so in theory there should be no bad blocks. A Google search showed that most people recommend scanning the partition with e2fsck. To be on the safe side, I only renamed the directory and created a new udump directory, and planned to back the database up to another machine at the end of the month.

http://lists.debian.org/debian-user/1999/11/msg00430.html

Clearly, the server crash was caused by the database. Analysis of the alertsid.log file showed that the database had been behaving very abnormally since 12:40 on July 16. Normally it generates one 10M log every 10 to 15 minutes, but from 12:40 onward it generated two 10M logs per minute. Between 12:40 and the crash at 17:20, the database produced about 5G of logs in total. Normally, when the system is very slow, customers call to complain, but on July 16 nobody complained. In between (around 3:00) I had logged in to the database once and used top to look at how it was running; alertsid.log showed no errors, and I paid no attention to the rate at which logs were being generated.
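The figures are consistent with each other: 12:40 to 17:20 is 280 minutes, and 280 minutes at 2 x 10M per minute comes to about 5600M, in line with the roughly 5G observed. The same rate can also be read from the database instead of the alert log. The query below is a sketch, assuming archiving goes to a single destination (with several destinations each log appears once per destination and the totals are inflated) and an arbitrary two-day look-back window.

```sql
-- Archived redo volume per hour, to spot a sudden jump in log generation.
SELECT TO_CHAR(completion_time, 'YYYY-MM-DD HH24')   AS hour_bucket,
       COUNT(*)                                      AS logs_archived,
       ROUND(SUM(blocks * block_size) / 1024 / 1024) AS mb_archived
  FROM v$archived_log
 WHERE completion_time > SYSDATE - 2   -- assumed look-back window
 GROUP BY TO_CHAR(completion_time, 'YYYY-MM-DD HH24')
 ORDER BY hour_bucket;
```

With the normal rate of one 10M log every 10 to 15 minutes, an ordinary hour shows about 40M to 60M; an hour at two logs per minute shows about 1200M.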

What caused such a large volume of logs? If I simply blamed the ORA-600 on a database bug, the company would certainly be unhappy and would think I was looking for an excuse for the incident. My guess was that a large amount of data was being processed in the database (this does happen sometimes), so I decided to mine some of the archived logs from between 12:40 and 17:20 with LogMiner, hoping to find the real culprit.

Generate a dictionary:

SQL> EXEC DBMS_LOGMNR_D.BUILD(dictionary_filename => 'esal.ora', dictionary_location => '/home/oracle/Soft');

LogMnr Dictionary Procedure started

LogMnr Dictionary File Opened

TABLE: OBJ$ recorded in LogMnr Dictionary File

TABLE: TAB$ recorded in LogMnr Dictionary File

TABLE: COL$ recorded in LogMnr Dictionary File

TABLE: TS$ recorded in LogMnr Dictionary File

TABLE: IND$ recorded in LogMnr Dictionary File

TABLE: USER$ recorded in LogMnr Dictionary File

TABLE: TABPART$ recorded in LogMnr Dictionary File

TABLE: INDPART$ recorded in LogMnr Dictionary File

TABLE: TABSUBPART$ recorded in LogMnr Dictionary File

TABLE: TABCOMPART$ recorded in LogMnr Dictionary File

TABLE: TYPE$ recorded in LogMnr Dictionary File

TABLE: COLTYPE$ recorded in LogMnr Dictionary File

TABLE: ATTRIBUTE$ recorded in LogMnr Dictionary File

TABLE: ENCRYPTION_PROFILE$ ORA-00942: table or view does not exist

TABLE: ENCRYPTED_OBJ$ ORA-00942: table or view does not exist

TABLE: LOB$ recorded in LogMnr Dictionary File

TABLE: CDEF$ recorded in LogMnr Dictionary File

TABLE: CCOL$ recorded in LogMnr Dictionary File

TABLE: ICOL$ recorded in LogMnr Dictionary File

TABLE: ATTRCOL$ recorded in LogMnr Dictionary File

Procedure executed successfully - LogMnr Dictionary Created

PL/SQL procedure successfully completed.

The two ORA-00942 errors seem to be a 9.2.0.4 problem and should not affect the LogMiner run. A Google search showed that other people have hit the same error (http://www.dbazine.com/code/mhsys-logminer.log.txt). If you do have encrypted objects, try applying a patch.

I decided to start the analysis from 12:40: add the archived logs generated from that time onward to LogMiner, then start LogMiner. Because I was connected to the database through SecureCRT, analysis and testing were not very convenient. If an ORA-03113 occurs and you have to reconnect, the script below must be run again, so once LogMiner has been started it is best to run CREATE TABLE logmnr2 AS SELECT * FROM v$logmnr_contents and then analyze logmnr2. A further advantage is that you can add indexes to logmnr2.

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15079.dbf', options => dbms_logmnr.new);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15080.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15081.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15082.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15083.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15084.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.start_logmnr(dictfilename => '/home/oracle/Soft/esal.ora');
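The step of copying the LogMiner output into an ordinary table can be sketched as follows. The table name logmnr2 comes from the text; the index name and the choice of indexed column are assumptions made for illustration.

```sql
-- Must run in the same session that called dbms_logmnr.start_logmnr,
-- because v$logmnr_contents is only populated for that session.
CREATE TABLE logmnr2 AS SELECT * FROM v$logmnr_contents;

-- Hypothetical index to speed up grouping and filtering by operation type.
CREATE INDEX logmnr2_op_idx ON logmnr2 (operation);

-- Quick sanity check on how many redo records were captured.
SELECT COUNT(*) FROM logmnr2;
```

Once the copy exists, a dropped connection no longer costs anything: the table can be queried from any new session without re-adding the log files.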

OK, the data is loaded, but how should it be analyzed, and what should I look for? Although only six logs had been added, logmnr2 held more than 180,000 records. How do you find an anomaly among so many records? Could statistics reveal it? The analysis showed that DELETE and ROLLBACK were very rare, while INTERNAL, UNSUPPORTED and START accounted for a considerable share; computed exactly, SELECT 130304/188789 FROM dual gives .690209705. The next step was to collect the same statistics for the system under normal load, in the hope of finding a difference.
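The per-operation breakdown described above can be produced with a single grouped query. This is a sketch against the logmnr2 copy; the column aliases are illustrative.

```sql
-- Count of redo records per operation type, with each type's share of the total.
SELECT operation,
       COUNT(*)                                    AS cnt,
       ROUND(COUNT(*) / SUM(COUNT(*)) OVER (), 4)  AS share
  FROM logmnr2
 GROUP BY operation
 ORDER BY cnt DESC;
```

Running the same query on logs from a normal period gives a baseline to compare the shares against.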

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15736.dbf', options => dbms_logmnr.new);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15737.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15738.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15739.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.add_logfile(logfilename => '/home/oracle/oradata/esal/archive/1_15740.dbf', options => dbms_logmnr.addfile);

EXEC dbms_logmnr.start_logmnr(dictfilename => '/home/oracle/Soft/esal.ora');

How could I find out what the database had actually been doing? Using the SQL_REDO column of the logmnr2 table, I spooled the SQL_REDO of each log to an OS file and found that these archived logs consisted almost entirely of INSERT INTO operations on agent_game_card_gm132.
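Besides reading the spooled SQL_REDO text, grouping by segment shows directly which table dominates the redo. The sketch below assumes the logmnr2 copy described earlier; the spool file name is hypothetical.

```sql
-- Which segments and operations account for most of the redo records?
SELECT seg_owner, seg_name, operation, COUNT(*) AS cnt
  FROM logmnr2
 GROUP BY seg_owner, seg_name, operation
 ORDER BY cnt DESC;

-- Dump the reconstructed statements to an OS file for reading (SQL*Plus).
SET PAGESIZE 0
SET LINESIZE 4000
SET TRIMSPOOL ON
SPOOL logmnr2_sql_redo.txt   -- hypothetical file name
SELECT sql_redo FROM logmnr2 ORDER BY scn;
SPOOL OFF
```

If one table accounts for nearly all INSERT rows in the first query, the spool file only needs to be skimmed to confirm what was being loaded.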

Examining agent_game_card_gm132 then showed that from 12:40 until the machine crashed, during the system's peak period, someone had loaded more than 2 million records into the agent_game_card_gm132 table, which generated the 5G of logs. After verification, it turned out that the data load from Qingyuan XX Technology had caused the system crash.
