Role of SCN in Instance Recovery & Media Recovery
SCN after an Instance Crash
Checkpoint is an event that happens at a particular time(=SCN). When this happens, all datafiles and control files are updated with SCN at that time. A successful checkpoint guarantees that all database changes up to the checkpoint SCN have been recorded in the datafiles. As a result, only those changes made after the checkpoint need to be applied during recovery.
Oracle stores the SCN associated with the Checkpoint in four places: three
of them in the control file( accessed thru v$datafile) and one in the datafile header for each datafile ( thru v$datafile_header)
a.)System checkpoint SCN in the control file.(
v$database.current_scn)
b.)Datafile Checkpoint SCN individually in the control file for each datafile (v$datafile.chckpoint_change#)
c.)Checkpoint SCN value in the header of each datafile, which also referred as
Start SCN ( v$datafile_header.checkpoint_change#)
d.)The stop SCN is held in the control file for each datafile (
v$datafile.last_cahnge#)
Now let us see how the SCN that is stored in different places by various events are useful during recovery scenarios
Under the normal conditions, where database is running without any problem, the System Checkpoint SCN, Datafile Checkpoint SCN and Start SCN have equal values. Whereas, Stop SCN will always be NULL under normal conditions.
During a clean shutdown of the database, since it is an event, a
checkpoint occurs and the stop SCN value is set to Start SCN of datafile
header. Now, all the 4 SCN values are same.
(Just remember !!! if it is a clean shutdown – all the four SCN’s will be
same.)
Now let us see what happens when the database is started –
Upon startup, Start SCN value is checked with Datafile Checkpoint SCN, if they both match then Start SCN value is checked with Stop SCN and if they do match, Oracle opens the database without any problem. Since in our case as mentioned above all the 4 values are same and so we have a clean startup. Once the database is opened, stop SCN of the control file will be resetted to NULL again.
So, as explained above, this is how the SCN i.e stored at various places behaves under normal conditions. Now let us know how SCN’s behaves under abnormal conditions like instance crash or media failure.
When an Instance Crash occurs, since a clean shutdown didn’t happend, Stop
SCN value is not initiated with the start SCN value and it remains as NULL.
Upon startup, Oracle checks the Start SCN with Stop SCN and found that they are
not in match. Oracle determines that instance requires a crash recovery.
Instance recovery happens automatically at database startup without database
administrator (DBA) intervention.
.
In this session we are going to demonstrate the behavior of the checkpoints after an instance crash, the following SQL creates a table (which performs an implicit commit) and inserts a row of data into it without a commit:
SQL> create table dummytable (x number) tablespace test; Table created. SQL> insert into dummytable values(100); 1 row created.
During subsequent startup after instance crash Oracle will realize before
opening database that value for Last_Change# column for datafiles are not
showing SCN value , which means END Checkpoint SCN was not done, which would be
indication for Oracle that instance recovery is required.
Let’s mimic instance failure and then startup mount to examine the situation
SQL> shu abort; ORACLE instance shut down. SQL> startup mount; ORACLE instance started. Total System Global Area 939495424 bytes Fixed Size 2218952 bytes Variable Size 704644152 bytes Database Buffers 226492416 bytes Redo Buffers 6139904 bytes Database mounted. SQL> select checkpoint_change# from v$database; CHECKPOINT_CHANGE# ------------------ 2125209 SQL> select name,checkpoint_change#,last_change# from v$datafile where name like '%system%'; NAME CHECKPOINT_CHANGE# LAST_CHANGE# --------------------------------------------- ------------------ ------------ /u02/app/oracle/oradata/mask11g/system01.dbf 2125209 SQL> select name,checkpoint_change# from v$datafile_header where name like '%system%'; NAME CHECKPOINT_CHANGE# --------------------------------------------- ------------------ /u02/app/oracle/oradata/mask11g/system01.dbf 2125209
Recovery from an Instance Crash
Upon the next instance startup that takes place after SHUTDOWN ABORT or a DBMS crash, the Oracle DBMS detects that the stop SCN for datafiles is not set in the control file during startup. Oracle then performs crash recovery. During crash recovery, Oracle applies redo log records from the online redo logs in a process referred to as roll forward to ensure that all transactions committed before the crash are applied to the datafiles. Following roll forward, active transactions that did not commit are identified from the rollback segments and are undone before the blocks involved in the active transactions can be accessed. This process is referred to as roll back. In our example, the following transaction was active but not committed at the time of the SHUTDOWN ABORT, so it needs to be rolled back:
SQL> insert into dummytable values(100);
After instance startup, the dummytable table exists, but remains empty. Instance recovery happens automatically at database startup without database administrator (DBA) intervention. It may take a while because of the need to apply large amounts of outstanding redo changes to data blocks for transactions that completed and those that didn’t complete and require roll back.
Recovery from a Media Failure
Up until this point, the checkpoint start SCN in the datafile header has always matched the datafile checkpoint SCN number held in the control file. This is reasonable because during a checkpoint, the datafile checkpoint SCN in the control file and the start SCN in the datafile header are both updated, along with the system checkpoint SCN. The following SQL shows the start SCN from the datafile header and datafile checkpoint SCN from the control file for the same file:
SQL> select 'controlfile' "SCN
location",name,checkpoint_change# from v$datafile where name like
'%TEST01%'
union
select 'file header',name,checkpoint_change# from v$datafile_header where name
like '%TEST01%';
SCN location NAME
CHECKPOINT_CHANGE#
------------ -------------------------------------------- ------------------
controlfile /u02/app/oracle/oradata/mask11g/TEST01.dbf 2160246
file header /u02/app/oracle/oradata/mask11g/TEST01.dbf 2160246
Unlike the v$datafile view, there is no stop SCN column in the
v$datafile_header view because v$datafile_header is not used at instance startup
time to indicate that an instance crash occurred. However, the
v$datafile_header does provide the Oracle DBMS with the information it requires
to perform media recovery. At instance startup, the datafile checkpoint SCN in
the control file and the start SCN in the datafile header are checked for
equality. If they don’t match, it is a signal that media recovery is required.
For example, media recovery is required if a media failure has occurred and the
original datafile has been replaced with a backup copy. In this case, the start
SCN in the backup copy is less than the checkpoint SCN value in the control
file, and Oracle requests archived redo logs—generated at the time of previous
log switches—in order to reapply the changes required to bring the datafile up
to the current point in time.
In order to recover the database from a media failure, we must run the database in ARCHIVELOG mode to ensure that all database changes from the online redo logs are stored permanently in archived redo log files. In order to enable ARCHIVELOG mode, we must run the command ALTERDATABASE ARCHIVELOG when the database is in a mounted state.
So to examine the media recovery our database must be in archivelog mode so we can use tablespace or datafile level failure for our test. Before trying the media recovery scenario let’s take backup of tablespace.
SQL> alter tablespace TEST begin backup;
Tablespace altered.
Check the SCN of datafile at this point, as post begin backup datafile header will be frozen.
SQL> select 'controlfile' "SCN
location",name,checkpoint_change# from v$datafile where name like
'%TEST01%'
union
select 'file header',name,checkpoint_change# from v$datafile_header where name
like '%TEST01%';
SCN location NAME
CHECKPOINT_CHANGE#
------------ -------------------------------------------- ------------------
controlfile /u02/app/oracle/oradata/mask11g/TEST01.dbf 2160333
file header /u02/app/oracle/oradata/mask11g/TEST01.dbf 2160333
Let’s take a backup of datafile test01.dbf
SQL> !cp /u02/app/oracle/oradata/mask11g/TEST01.dbf
/u02/app/oracle/oradata/mask11g/TEST01.dbf.bkp
SQL> !ls -lrt /u02/app/oracle/oradata/mask11g/TEST01*
-rw-r----- 1 oracle oinstall 52436992 Sep 2 17:27 /u02/app/oracle/oradata/mask11g/TEST01.dbf
-rw-r----- 1 oracle oinstall 52436992 Sep 2 17:27
/u02/app/oracle/oradata/mask11g/TEST01.dbf.bkp
SQL> alter tablespace TEST end backup;
Tablespace altered.
Now create another table into Test tablespace.
SQL> create table dummy tablespace test as select * from
dba_extents;
Table created.
Remove datafile and force system to sync OS filesystem cache
# rm /u02/app/oracle/oradata/mask11g/TEST01.dbf
# sync
# sync
Let’s try DBWR to trigger the I/O
SQL> alter system flush buffer_cache;
System altered.
Report this ad
At this point we have checked and no error visible in alert log which means DB still able to access cached copy of datafile. Let’s try to access table, and we are able to see the result what we were expecting.
SQL> select count(*) from dummy;
select count(*) from dummy
*
ERROR at line 1:
ORA-00376: file 17 cannot be read at this time
ORA-01110: data file 17: '/u02/app/oracle/oradata/mask11g/TEST01.dbf'
We can identify files that need recovery by querying v$recover_file.
SQL> select file#,change# from v$recover_file;
FILE# CHANGE#
---------- ----------
17 0
At this stage datafile is no longer available for query so CHANGE# coloumn showing 0. Once we place the file from backup, we will see latest information from header of restored datafile.
# mv /u02/app/oracle/oradata/mask11g/TEST01.dbf.bkp /u02/app/oracle/oradata/mask11g/TEST01.dbf
Let’s query the status of recovery SCN , which must be the SCN number at which backup of datafile was taken, which is 2160333.
SQL> select file#,change# from v$recover_file;
FILE# CHANGE#
---------- ----------
17 2160333
Lets compare the required SCN from controlfile and actual SCN from datafile header to compare why Oracle is showing this file as restored from backup.
SQL> select 'controlfile' "SCN
location",name,checkpoint_change# from v$datafile where name like
'%TEST01%'
union
select 'file header',name,checkpoint_change# from v$datafile_header where name
like '%TEST01%';
SCN location NAME
CHECKPOINT_CHANGE#
------------ -------------------------------------------- ------------------
controlfile /u02/app/oracle/oradata/mask11g/TEST01.dbf 2166885
file header /u02/app/oracle/oradata/mask11g/TEST01.dbf 2160333
We can clearly see the expected SCN for datafile and available SCN from restored datafile, that’s how Oracle DBMS detects the requirement of media recovery.
SQL> alter database recover tablespace test;
Database altered.
SQL> alter tablespace test online;
Tablespace altered.
Let’s try to access table having extents in restored datafile.
SQL> select count(*) from dummy;
COUNT(*)
----------
9306
No comments:
Post a Comment