Problem Description: On-site staff reported that the production database had stopped unexpectedly, and that a disk array might be damaged because its indicator lights were off. I dialed in remotely as an emergency, checked the database, and had it restarted successfully within 5 minutes. The problem itself was very simple, but when I read the alert log afterwards I found that nearly an hour had passed between the array failure and the call to me. During that time the on-site staff, not knowing the cause, had repeatedly switched the cluster over in an attempt to get the database to start. Sweating!

Problem Interpretation: The production database keeps three copies of the control file, stored under /global/oracle, /global/backup1 and /global/backup2. The online redo logs and the archived logs are each kept in two copies, stored under /global/backup1 and /global/backup2. /global/oracle, /global/backup1 and /global/backup2 are three separate arrays. Because Oracle requires all copies of the control file to stay consistent, the database shuts down if any one of the three control files cannot be accessed. The online redo logs and archived logs behave differently: as long as one copy can still be accessed, the database keeps running normally. So if any one array fails, the database stops unexpectedly because of the control file.

Solution process:
1. The database stops unexpectedly.
2. Check the $ORACLE_HOME/alert_dpshdb.log file to confirm the cause of the stop. Today's content, for example, was:

    Tue Oct 26 16:36:40 2004
    Errors in file /export/oracle/product/817/admin/dpshdb/bdump/dpshdb_ckpt_16541.trc:
    ORA-00206: error in writing (block 3, # blocks 1) of controlfile
    ORA-00202: controlfile: '/global/backup1/oradata/dpshdb/control02.ctl'
    ORA-27063: skgfospo: number of bytes read/written is incorrect
    SVR4 Error: 6: No such device or address
    Additional information: -1
    Additional information: 8192
    Tue Oct 26 16:36:40 2004
    Errors in file /export/oracle/product/817/admin/dpshdb/bdump/dpshdb_lgwr_16539.trc:
    ORA-00345: redo log write error block 38713 count 2
    ORA-00312: online log 1 thread 1: '/global/backup1/oradata/dpshdb/redo1b.log'
    ORA-27063: skgfospo: number of bytes read/written is incorrect
    SVR4 Error: 5: I/O error
    Additional information: -1
    Additional information: 1024

The first part shows that the control file control02.ctl could not be written; the second part shows that the redo log member redo1b.log could not be written. The database then shut itself down, as shown below, at 16:36:43. That is 3 seconds after the array failure was first detected at 16:36:40 (the CKPT timeout is known to be 3 seconds):

    Tue Oct 26 16:36:43 2004
    Errors in file /export/oracle/product/817/admin/dpshdb/udump/dpshdb_ora_10206.trc:
    ORA-00221: error on write to controlfile
    Instance terminated by CKPT, pid = 16541

From these log entries we can tell that the /global/backup1 array has failed.
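The reasoning in step 2 (collect the file names reported next to the errors, then see which array they share) can be sketched as a small script. This is only an illustration, not part of the original procedure: the function name `failed_mount_points`, the `depth` parameter, and the assumption that an array corresponds to the first two path components (such as /global/backup1) are all made up for this example.

```python
import re
from collections import Counter

# Matches the quoted file name on ORA-00202 (control file) and
# ORA-00312 (online log) lines of the alert log.
FILE_LINE = re.compile(r"ORA-(00202|00312):.*?'([^']+)'")


def failed_mount_points(alert_text, depth=2):
    """Count reported files per mount point (first `depth` path components).

    Assumption for this sketch: each array is mounted at a path that is
    exactly `depth` components deep, e.g. /global/backup1.
    """
    counts = Counter()
    for line in alert_text.splitlines():
        match = FILE_LINE.search(line)
        if match:
            parts = match.group(2).strip("/").split("/")
            counts["/" + "/".join(parts[:depth])] += 1
    return counts


# Lines taken from the alert log excerpt in step 2.
SAMPLE = """\
ORA-00206: error in writing (block 3, # blocks 1) of controlfile
ORA-00202: controlfile: '/global/backup1/oradata/dpshdb/control02.ctl'
ORA-00345: redo log write error block 38713 count 2
ORA-00312: online log 1 thread 1: '/global/backup1/oradata/dpshdb/redo1b.log'
"""

if __name__ == "__main__":
    for mount, hits in failed_mount_points(SAMPLE).items():
        print(mount, hits)
```

Run against the excerpt, it reports two failing files, both under /global/backup1, which matches the conclusion drawn by reading the log by hand.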
3. Edit the initialization parameter file $ORACLE_HOME/dbs/initdpshdb.ora so that the failed control file and the failed archive destination are ignored.

Original content:

    control_files = ("/global/oracle/oradata/dpshdb/control01.ctl","/global/backup1/oradata/dpshdb/control02.ctl","/global/backup2/oradata/dpshdb/control03.ctl")
    log_archive_dest_1 = "location=/global/backup1/oradata/dpshdb/arch"

Modified content:

    control_files = ("/global/oracle/oradata/dpshdb/control01.ctl","/global/backup2/oradata/dpshdb/control03.ctl")
    # log_archive_dest_1 = "location=/global/backup1/oradata/dpshdb/arch"

4. Start the database. It is now available again as normal.
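The control_files change in step 3 amounts to dropping every entry that lives on the failed array. A minimal sketch of that filtering follows, assuming the failed mount point is already known from the alert log. The helper names `surviving_paths` and `control_files_line` are hypothetical, and the script only prints a candidate line; it does not touch initdpshdb.ora.

```python
def surviving_paths(paths, failed_mount):
    """Return the paths that are not located under the failed mount point."""
    prefix = failed_mount.rstrip("/") + "/"
    return [p for p in paths if not p.startswith(prefix)]


def control_files_line(paths):
    """Format a control_files entry in the style used in the parameter file."""
    quoted = ",".join('"%s"' % p for p in paths)
    return "control_files = (%s)" % quoted


# The three control file copies listed in the original parameter file.
CONTROL_FILES = [
    "/global/oracle/oradata/dpshdb/control01.ctl",
    "/global/backup1/oradata/dpshdb/control02.ctl",
    "/global/backup2/oradata/dpshdb/control03.ctl",
]

if __name__ == "__main__":
    remaining = surviving_paths(CONTROL_FILES, "/global/backup1")
    print(control_files_line(remaining))
```

The trailing slash added to the prefix keeps a mount such as /global/backup1 from also matching a sibling like /global/backup10.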
5. To keep every redo log group multiplexed, after the database is started we need to drop the failed members and add new redo log members. This can be done while the database is running, but it is best to avoid heavy update activity at the same time, so that the redo logs do not switch too quickly and cause the drop of a redo log member to fail.

    alter database drop logfile member '/global/backup1/oradata/dpshdb/redo1b.log';
    alter database add logfile member '/global/oracle/oradata/dpshdb/redo1c.log' to group 1;
    alter database drop logfile member '/global/backup1/oradata/dpshdb/redo2b.log';
    alter database add logfile member '/global/oracle/oradata/dpshdb/redo2c.log' to group 2;
    alter database drop logfile member '/global/backup1/oradata/dpshdb/redo3b.log';
    alter database add logfile member '/global/oracle/oradata/dpshdb/redo3c.log' to group 3;
    alter database drop logfile member '/global/backup1/oradata/dpshdb/redo4b.log';
    alter database add logfile member '/global/oracle/oradata/dpshdb/redo4c.log' to group 4;

Before executing ALTER DATABASE DROP LOGFILE MEMBER, check the V$LOG view and confirm that the STATUS column of the redo log group the file belongs to is INACTIVE, not CURRENT or ACTIVE. Otherwise the drop will report an error. If the drop fails, run the following statement twice:

    ALTER SYSTEM SWITCH LOGFILE;

Then drop the old member and add the new one again.
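The rule in step 5 (only touch a group whose V$LOG status is INACTIVE, and switch logs before retrying the others) can be sketched as a statement generator. This is an illustration under stated assumptions: the function name `swap_statements` is hypothetical, the statuses in the example are invented rather than taken from the incident, and the redoNb.log / redoNc.log naming simply follows the statements shown above. The script only builds strings; it does not connect to the database.

```python
def swap_statements(group_status, old_dir, new_dir):
    """Build DROP/ADD statements for groups that are safe to change.

    group_status maps a group number to its STATUS as read from V$LOG.
    Only INACTIVE groups get statements; every other group is returned
    in `deferred` so it can be retried after ALTER SYSTEM SWITCH LOGFILE.
    """
    statements, deferred = [], []
    for group in sorted(group_status):
        if group_status[group].upper() != "INACTIVE":
            deferred.append(group)
            continue
        statements.append(
            "alter database drop logfile member '%s/redo%db.log';"
            % (old_dir, group))
        statements.append(
            "alter database add logfile member '%s/redo%dc.log' to group %d;"
            % (new_dir, group, group))
    return statements, deferred


if __name__ == "__main__":
    # Hypothetical V$LOG snapshot, for illustration only.
    status = {1: "CURRENT", 2: "INACTIVE", 3: "INACTIVE", 4: "ACTIVE"}
    stmts, deferred = swap_statements(
        status,
        "/global/backup1/oradata/dpshdb",
        "/global/oracle/oradata/dpshdb",
    )
    print("\n".join(stmts))
    print("retry after a log switch:", deferred)
```

With the snapshot above, statements are produced for groups 2 and 3, while groups 1 and 4 are deferred until a log switch has made them INACTIVE.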