Recover datafile corruption




 

 This Article show how to recover from a block failure.

In this case partitioned table will be created and single block will be corrupted.

 

·         Create Table

 

CREATE TABLE “TEST_TAB_2”

   (   “ID” NUMBER(10,0) NOT NULL ENABLE,

       “DESCRIPTION” VARCHAR2(50 BYTE) NOT NULL ENABLE,

       “CREATED_DATE” DATE NOT NULL ENABLE

   )

  TABLESPACE “USERS”

  PARTITION BY RANGE (“CREATED_DATE”)

 (PARTITION “TEST_TAB_Q1”  VALUES LESS THAN (TO_DATE(‘ 2008-01-01 00:00:00’, ‘SYYYY-MM-DD HH24:MI:SS’, ‘NLS_CALENDAR=GREGORIAN’)) SEGMENT CREATION IMMEDIATE

  TABLESPACE “USERS” ,

 PARTITION “TEST_TAB_Q2”  VALUES LESS THAN (TO_DATE(‘ 2008-04-01 00:00:00’, ‘SYYYY-MM-DD HH24:MI:SS’, ‘NLS_CALENDAR=GREGORIAN’)) SEGMENT CREATION DEFERRED

  TABLESPACE “USERS” ,

 PARTITION “TEST_TAB_Q3”  VALUES LESS THAN (TO_DATE(‘ 2008-07-01 00:00:00’, ‘SYYYY-MM-DD HH24:MI:SS’, ‘NLS_CALENDAR=GREGORIAN’)) SEGMENT CREATION DEFERRED

  TABLESPACE “USERS” ,

 PARTITION “TEST_2009”  VALUES LESS THAN (TO_DATE(‘ 2009-01-01 00:00:00’, ‘SYYYY-MM-DD HH24:MI:SS’, ‘NLS_CALENDAR=GREGORIAN’)) SEGMENT CREATION DEFERRED

  TABLESPACE “USERS” ,

 PARTITION “TEST_TAB_Q4”  VALUES LESS THAN (MAXVALUE) SEGMENT CREATION DEFERRED

  TABLESPACE “USERS” ) ;

 

 

 

 

·         Extracting block rowid

 

 SQL>select * from

 (select distinct dbms_rowid.rowid_block_number(rowid)

 from TEST_TAB_2)

 where rownum < 6;

 

 

 

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)

————————————

                            146

                            147

                            148

                            149

                            150

 

 

·         Introducing corruption on  block 148

 

 

 

 

 

dd of=/app/oracle/admin/TEST_DB/data/undo/users01.dbf bs=8192 seek=148 conv=notrunc count=1 if=/dev/zero

 

1+0 records in

1+0 records out

8192 bytes (8.2 kB) copied, 7.9165e-05 s, 103 MB/s

 

SQL> alter system flush buffer_cache;

 

System altered.

 

 

 

·         Querying corrupted table gives ORA-1578 

 

 

SQL>  select count(*) from TEST_TAB_2;

select count(*) from TEST_TAB_2

                          *

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 5, block # 148)

ORA-01110: data file 5: ‘/app/oracle/admin/TEST_DB/data/undo/users01.dbf’

 

 

·         No backup validation was performed , so we are not able to find out any information from querying v$backup_corruption

 

SQL> select * from V$backup_corruption;

 

no rows selected

 

 

·         Errors are trapped in database alert log.

 

Mon Nov 04 16:39:16 2013

ALTER SYSTEM: Flushing buffer cache

Mon Nov 04 16:39:57 2013

Hex dump of (file 5, block 148) in trace file /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/trace/TEST_DB_ora_3061.trc

Corrupt block relative dba: 0x01400094 (file 5, block 148)

Completely zero block found during multiblock buffer read

Reading datafile ‘/app/oracle/admin/TEST_DB/data/undo/users01.dbf’ for corruption at rdba: 0x01400094 (file 5, block 148)

Reread (file 5, block 148) found same corrupt data (no logical check)

Errors in file /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/trace/TEST_DB_ora_3061.trc  (incident=54539):

ORA-01578: ORACLE data block corrupted (file # 5, block # 148)

ORA-01110: data file 5: ‘/app/oracle/admin/TEST_DB/data/undo/users01.dbf’

Incident details in: /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/incident/incdir_54539/TEST_DB_ora_3061_i54539.trc

Mon Nov 04 16:39:58 2013

Corrupt Block Found

         TSN = 5, TSNAME = USERS

         RFN = 5, BLK = 148, RDBA = 20971668

         OBJN = 69138, OBJD = 69138, OBJECT = TEST_TAB_2, SUBOBJECT = TEST_TAB_Q1

         SEGMENT OWNER = TEST, SEGMENT TYPE = Table Partition

ORA-01578: ORACLE data block corrupted (file # 5, block # 148)

ORA-01110: data file 5: ‘/app/oracle/admin/TEST_DB/data/undo/users01.dbf’

Incident details in: /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/incident/incdir_54540/TEST_DB_ora_3061_i54540.trc

Hex dump of (file 5, block 148) in trace file /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/incident/incdir_54539/TEST_DB_m000_3070_i54539_a.trc

Corrupt block relative dba: 0x01400094 (file 5, block 148)

Completely zero block found during validation

Reread of blocknum=148, file=/app/oracle/admin/TEST_DB/data/undo/users01.dbf. found same corrupt data

Reread of blocknum=148, file=/app/oracle/admin/TEST_DB/data/undo/users01.dbf. found same corrupt data

Reread of blocknum=148, file=/app/oracle/admin/TEST_DB/data/undo/users01.dbf. found same corrupt data

Reread of blocknum=148, file=/app/oracle/admin/TEST_DB/data/undo/users01.dbf. found same corrupt data

Reread of blocknum=148, file=/app/oracle/admin/TEST_DB/data/undo/users01.dbf. found same corrupt data

 

 

·         Now perform database’s backup validate

 

 

RMAN> connect target /

 

connected to target database: TEST_DB (DBID=4194814192)

 

RMAN> run {BACKUP VALIDATE DATABASE;}

 

Starting backup at 04-NOV-13

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=47 device type=DISK

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

input datafile file number=00003 name=/app/oracle/admin/TEST_DB/data/undo/undo01.dbf

input datafile file number=00001 name=/app/oracle/admin/TEST_DB/data/system/system01.dbf

input datafile file number=00002 name=/app/oracle/admin/TEST_DB/data/sysaux/sysaux01.dbf

input datafile file number=00004 name=/app/oracle/admin/TEST_DB/data/data/xdb01.dbf

input datafile file number=00005 name=/app/oracle/admin/TEST_DB/data/undo/users01.dbf

 

channel ORA_DISK_1: backup set complete, elapsed time: 00:01:15

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

—- —— ————– ———— ————— ———-

1    OK     0              18233        65536           35562774 

  File Name: /app/oracle/admin/TEST_DB/data/system/system01.dbf

  Block Type Blocks Failing Blocks Processed

  ———- ————– —————-

  Data       0              36706          

  Index      0              7494           

  Other      0              3103           

 

….

…..

……..

……………

 

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

—- —— ————– ———— ————— ———-

5    FAILED 0              12581        12800           35562081 

  File Name: /app/oracle/admin/TEST_DB/data/undo/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ———- ————– —————-

  Data       0              20             

  Index      0              13             

  Other      1              186                     <<– block corruption

 

validate found one or more corrupt blocks

See trace file /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/trace/TEST_DB_ora_3114.trc for details

channel ORA_DISK_1: starting full datafile backup set

channel ORA_DISK_1: specifying datafile(s) in backup set

including current control file in backup set

including current SPFILE in backup set

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

List of Control File and SPFILE

===============================

File Type    Status Blocks Failing Blocks Examined

———— —— ————– —————

SPFILE       OK     0              2              

Control File OK     0              676             

Finished backup at 04-NOV-13

 

 

 

RMAN> list failure;

 

using target database control file instead of recovery catalog

List of Database Failures

=========================

 

Failure ID Priority Status    Time Detected Summary

———- ——– ——— ————- ——-

1142       HIGH     OPEN      30-OCT-13     Datafile 5: ‘/app/oracle/admin/TEST_DB/data/data/users01.dbf’ contains one or more corrupt blocks

 

 

·         Try to run blockrecover with no database backup will fail

 

RMAN>  run {blockrecover datafile 5 block 148;}

 

Starting recover at 04-NOV-13

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=41 device type=DISK

 

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of recover command at 11/04/2013 16:47:57

RMAN-06026: some targets not found – aborting restore

RMAN-06023: no backup or copy of datafile 5 found to restore

 

·         It is possible to use rman advise to recover our failure

 

RMAN> advise failure;

 

List of Database Failures

=========================

 

Failure ID Priority Status    Time Detected Summary

———- ——– ——— ————- ——-

1142       HIGH     OPEN      30-OCT-13     Datafile 5: ‘/app/oracle/admin/TEST_DB/data/data/users01.dbf’ contains one or more corrupt blocks

 

analyzing automatic repair options; this may take some time

using channel ORA_DISK_1

analyzing automatic repair options complete

 

Mandatory Manual Actions

========================

no manual actions available

 

Optional Manual Actions

=======================

1. Shut down, mount the database and try flush redo using ALTER SYSTEM FLUSH REDO TO ‘standby name’ command.  Then perform a Data Guard role change (failover).  Available standbys: TEST_DBDG.

 

Automated Repair Options

========================

Option Repair Description

—— ——————

1      Restore and recover datafile 5 

  Strategy: The repair includes complete media recovery with no data loss

  Repair script: /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/hm/reco_379520345.hm

 

·         Check preview of failure recover

 

RMAN> repair failure preview;

 

Strategy: The repair includes complete media recovery with no data loss

Repair script: /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/hm/reco_379520345.hm

 

contents of repair script:

   # restore and recover datafile

   sql ‘alter database datafile 5 offline’;

   restore datafile 5;

   recover datafile 5;

   sql ‘alter database datafile 5 online’;

 

 

·         In this case we use rman repair script to recover block failure

 

RMAN> repair failure noprompt;

 

Strategy: The repair includes complete media recovery with no data loss

Repair script: /app/oracle/admin/TEST_DB/diag/rdbms/TEST_DB/TEST_DB/hm/reco_379520345.hm

 

contents of repair script:

   # restore and recover datafile

   sql ‘alter database datafile 5 offline’;

   restore datafile 5;

   recover datafile 5;

   sql ‘alter database datafile 5 online’;

executing repair script

 

sql statement: alter database datafile 5 offline

 

Starting restore at 04-NOV-13

using channel ORA_DISK_1

 

creating datafile file number=5 name=/app/oracle/admin/TEST_DB/data/undo/users01.dbf

restore not done; all files read only, offline, or already restored

Finished restore at 04-NOV-13

 

Starting recover at 04-NOV-13

using channel ORA_DISK_1

 

starting media recovery

media recovery complete, elapsed time: 00:00:01

 

Finished recover at 04-NOV-13

 

sql statement: alter database datafile 5 online

repair failure complete

 

 

·         From alertlog

 

Mon Nov 04 16:49:30 2013

alter database datafile 5 offline

Completed: alter database datafile 5 offline

alter database recover datafile list clear

Completed: alter database recover datafile list clear

alter database recover if needed

 datafile 5

Media Recovery Start

Serial Media Recovery started

Recovery of Online Redo Log: Thread 1 Group 1 Seq 9 Reading mem 0

  Mem# 0: /app/oracle/admin/TEST_DB/data/redo/redo01a.log

Media Recovery Complete (TEST_DB)

Completed: alter database recover if needed

 datafile 5

alter database datafile 5 online

Completed: alter database datafile 5 online

Mon Nov 04 16:51:45 2013

Thread 1 advanced to log sequence 10 (LGWR switch)

  Current log# 2 seq# 10 mem# 0: /app/oracle/admin/TEST_DB/data/redo/redo02a.log

Mon Nov 04 16:51:46 2013

 

 

 

·         Now it is possible to query recovered table successfully

 

SQL> select count(*) from TEST_TAB_2;

 

  COUNT(*)

———-

      2009

 

You may also like...

Leave a Reply

Your email address will not be published. Required fields are marked *