Beiträge getaggt mit Backup & Recovery
The Recovery Area: Why it is recommended
With 10g, the concept of the Recovery Area – associated with the parameter DB_RECOVERY_FILE_DEST – was introduced. The picture below illustrates it:
The name Recovery Area indicates that this is the place to find everything needed to do a successful recovery in case of a damage (or a logical mistake) in the Database Area.
The Recovery Area was designed in the first place for customers who use Disk Storage as their primary backup media. Meanwhile, due to the ever decreasing price of Disk Storage, this should be the majority of customers. In spite of that, there seems to be some kind of reluctance to actually make use of the multiple benefits, the Recovery Area delivers.
This posting aims to show how deal with the Recovery Area in order to
a) Protect Controlfiles & Online Logfiles
b) Implement a robust Backup Strategy easily
c) Manage Flashback Logs
Let’s start with
a) How does the Recovery Area help to protect Controlfiles & Online Logfiles?
It should be consensus that the Controlfiles and especially the Online Logfiles are sensitive and must be mirrored. If we use the parameter DB_CREATE_FILE_DEST (introduced in 9i already) together with DB_RECOVERY_FILE_DEST, this is done automatically during the Database creation. It is typically but not necessarily so that we use DB_CREATE_FILE_DEST together with ASM. My demo Database is without ASM here:
[oracle@uhesse ~]$ cat skripte/initprima.ora compatible=11.2.0.2 db_block_size=8192 db_name='prima' db_create_file_dest='/u01/app/oracle/oradata' db_recovery_file_dest='/u02/fra' db_recovery_file_dest_size=2g memory_target=640m undo_management=auto undo_tablespace=undotbs1 db_domain='' diagnostic_dest='/u01/app/oracle/diag'
That is my minimalistic init.ora. Notice the absence of the parameter CONTROL_FILES. Now look how the two parameters make it easy for me to create my Database with mirrored Control- and Online Logfiles:
[oracle@uhesse ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.2.0 Production on Tue Aug 23 11:48:08 2011 Copyright (c) 1982, 2010, Oracle. All rights reserved. Connected to an idle instance. SQL> create spfile from pfile='/home/oracle/skripte/initprima.ora'; File created. SQL> startup nomount ORACLE instance started. Total System Global Area 668082176 bytes Fixed Size 2229480 bytes Variable Size 398461720 bytes Database Buffers 264241152 bytes Redo Buffers 3149824 bytes SQL> create database undo tablespace undotbs1 default tablespace users default temporary tablespace temp character set al32utf8; Database created.
That was it already and it took less than a minute. Some more minutes will it take to run catalog & catproc, though. Let’s investigate the new Database:
SQL> select status from v$instance; STATUS ------------ OPEN SQL> select name,bytes/1024/1024 as mb from v$datafile union select name,bytes/1024/1024 from v$tempfile; NAME MB ---------------------------------------------------------------------- ---------- /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf 100 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf 100 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_temp_75d3oynt_.tmp 100 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf 10 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf 100
We see above the effect of specifying the Database Area with DB_CREATE_FILE_DEST: The system created a subdirectory according to the DB_UNIQUE_NAME (derived from DB_NAME in this case) as well as another subdirectory beneath according to the file type.
Now to the mirroring of our most sensitive files:
SQL> select name,is_recovery_dest_file from v$controlfile; NAME IS_ ---------------------------------------------------------------------- --- /u01/app/oracle/oradata/PRIMA/controlfile/o1_mf_756xwk4p_.ctl NO /u02/fra/PRIMA/controlfile/o1_mf_756xwk70_.ctl YES SQL> select member,is_recovery_dest_file from v$logfile; MEMBER IS_ ---------------------------------------------------------------------- --- /u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_1_756xwk95_.log NO /u02/fra/PRIMA/onlinelog/o1_mf_1_756xwkfr_.log YES /u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_2_756xwkn5_.log NO /u02/fra/PRIMA/onlinelog/o1_mf_2_756xwl1t_.log YES
We see only with this approach a ‚YES‘ in the IS_REVOVERY_DEST_FILE column of v$controlfile. If you specify the CONTROL_FILES parameter manually (with one file pointing to the Recovery Area) before the CREATE DATABASE command, this does not count. I admit that in earlier versions of 10g, the systems handling of the CONTROL_FILES parameter was not optimal, because it did not write the automatically created Controlfiles into the spfile. But the newer releases do:
SQL> select sid,name,value from v$spparameter where isspecified='TRUE'; SID NAME VALUE ------ ------------------------------ ---------------------------------------------------------------------- * processes 100 * memory_target 671088640 * control_files /u01/app/oracle/oradata/PRIMA/controlfile/o1_mf_756xwk4p_.ctl * control_files /u02/fra/PRIMA/controlfile/o1_mf_756xwk70_.ctl * db_block_size 8192 * compatible 11.2.0.2 * db_create_file_dest /u01/app/oracle/oradata * db_recovery_file_dest /u02/fra * db_recovery_file_dest_size 2147483648 * undo_management auto * undo_tablespace undotbs1 * db_domain * db_name prima * diagnostic_dest /u01/app/oracle/diag 14 rows selected.
A valid reason not to go with this method (mirroring Controlfiles & Online Logfiles into the Recovery Area) would be if the devices there a significantly slower than the ones that make up the Database Area. We may still use the Recovery Area to keep our Backups, Archivelogs & Flashback Logs there, though. Because the devices underneath /u01 and /u02 have similar performance characteristics, it is perfectly OK as it is now on my demo system. Apart from performance aspects, it is of course of key importance to have the Recovery Area placed on different physical devices than the Database Area.
We have two V$-Views that reflect the usage of the Recovery Area:
SQL> select * from V$RECOVERY_FILE_DEST; NAME SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES ------------------------------ ----------- ---------- ----------------- --------------- /u02/fra 2147483648 218316800 0 3
The SPACE* columns have the unit bytes, unfortunately. SPACE_LIMIT is derived from the dynamic parameter DB_RECOVERY_FILE_DEST_SIZE that was set here to the (very small) value 2G. SPACE_RECLAIMABLE is derived from the size of the Backups/Archivelogs, being obsolete according to our RMAN retention policy. The next view is newer and shows also the file types, consuming space in the Recover Area:
SQL> select * from V$RECOVERY_AREA_USAGE; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- CONTROL FILE .36 0 1 REDO LOG 9.77 0 2 ARCHIVED LOG 0 0 0 BACKUP PIECE 0 0 0 IMAGE COPY 0 0 0 FLASHBACK LOG 0 0 0 FOREIGN ARCHIVED LOG 0 0 0 7 rows selected.
Because the Database is neither in Archivelog mode nor in Flashback mode yet and we did not take backups, the only files in the Recovery Area are the mirrors of the Online Logs and the Controlfile. We wil change that with the next step.
b) How does the Recovery Area help to implement a robust Backup Strategy?
We will put the Database in Archivelog mode and see that the Archivelogs are being created automatically into the Recovery Area:
SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. SQL> startup mount ORACLE instance started. Total System Global Area 668082176 bytes Fixed Size 2229480 bytes Variable Size 444599064 bytes Database Buffers 218103808 bytes Redo Buffers 3149824 bytes Database mounted. SQL> alter database archivelog; Database altered. SQL> alter database open; Database altered. SQL> alter system switch logfile; System altered. SQL> select name from v$archived_log; NAME ---------------------------------------------------------------------- /u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_7_75d56p79_.arc 1 row selected. SQL> select * from v$recovery_area_usage; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- CONTROL FILE .36 0 1 REDO LOG 9.77 0 2 ARCHIVED LOG 1.46 0 1 BACKUP PIECE 0 0 0 IMAGE COPY 0 0 0 FLASHBACK LOG 0 0 0 FOREIGN ARCHIVED LOG 0 0 0 7 rows selected.
Notice that I did not need to specify a single initialization parameter of the LOG_ARCHIVE* range for this; easy, isn’t it? Now we come to the almost only remaining difficulty: Specifying an appropriate value for DB_RECOVERY_FILE_DEST_SIZE. That parameter sets a logical space limit, our Database is allowed to consume inside of the Recovery Area. We must set it large enough for our Backups & Archivelogs to fit in, according to our retention policy. Let’s have a look how large our backups get for this Database:
SQL> select name,bytes/1024/1024 as mb from v$datafile union select name,bytes/1024/1024 from v$tempfile; NAME MB ---------------------------------------------------------------------- ---------- /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf 100 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf 200 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_temp_75d3oynt_.tmp 100 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf 180 /u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf 100 5 rows selected. SQL> select sum(bytes)/1024/1024 as mb from v$datafile; MB ---------- 580 1 row selected.
After catalog & catproc did run, the files grew little but still this is a very tiny Database. With Image Copies, my full Backup would take 580 MB in size, while Backup Sets would be smaller, depending on the amount of empty space inside of the datafiles that Backup Sets would not contain. We are going to take an ordinary default online backup:
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
[oracle@uhesse ~]$ rman target /
Recovery Manager: Release 11.2.0.2.0 - Production on Thu Aug 25 11:40:26 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: PRIMA (DBID=1992392973)
RMAN> show all;
using target database control file instead of recovery catalog
RMAN configuration parameters for database with db_unique_name PRIMA are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0.2/db_1/dbs/snapcf_prima.f'; # default
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP on;
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored
RMAN> backup database;
Starting backup at 25-AUG-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=29 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T114118_75d65z8m_.bkp tag=TAG20110825T114118 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11
Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760102882_75d6633f_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11
We could see that the Backup Set as well as the Controlfile Autobackup (a must to switch on!) were taken into the Recovery Area. Now to our calculation about the Recovery Area sizing:
SQL> select * from v$recovery_area_usage; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- CONTROL FILE .36 0 1 REDO LOG 9.77 0 2 ARCHIVED LOG 1.46 1.46 1 BACKUP PIECE 19.07 0 2 IMAGE COPY 0 0 0 FLASHBACK LOG 0 0 0 FOREIGN ARCHIVED LOG 0 0 0 7 rows selected. SQL> alter system set DB_RECOVERY_FILE_DEST_SIZE=1g; System altered. SQL> select * from v$recovery_area_usage; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- CONTROL FILE .73 0 1 REDO LOG 19.53 0 2 ARCHIVED LOG 2.93 2.93 1 BACKUP PIECE 38.14 0 2 IMAGE COPY 0 0 0 FLASHBACK LOG 0 0 0 FOREIGN ARCHIVED LOG 0 0 0 7 rows selected.
With this (very small) value of DB_RECOVERY_FILE_DEST_SIZE, it should still be possible for RMAN to take another full backup into the Recovery Area – because only about 50 % of space is consumed by not obsolete files – after which the previous backup becomes obsolete, together with the Archivelogs, belonging to that backup. Our goal is to just take backups (every night, for example) and let the obsolete backups & archivelogs get deleted automatically. We must keep in mind that it is of key importance that archive logs can be created always to avoid archiver stuck problems, though. Let’s look at the desired behavior:
RMAN> backup database; Starting backup at 25-AUG-11 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=1 device type=DISK channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf channel ORA_DISK_1: starting piece 1 at 25-AUG-11 channel ORA_DISK_1: finished piece 1 at 25-AUG-11 piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T115312_75d6w8bd_.bkp tag=TAG20110825T115312 comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03 Finished backup at 25-AUG-11 Starting Control File and SPFILE Autobackup at 25-AUG-11 piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760103595_75d6wcng_.bkp comment=NONE Finished Control File and SPFILE Autobackup at 25-AUG-11 RMAN> exit Recovery Manager complete. [oracle@uhesse ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.2.0 Production on Thu Aug 25 11:53:23 2011 Copyright (c) 1982, 2010, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production SQL> set lines 200 SQL> select * from v$recovery_area_usage; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- CONTROL FILE .73 0 1 REDO LOG 19.53 0 2 ARCHIVED LOG 0 0 0 BACKUP PIECE 38.89 .74 3 IMAGE COPY 0 0 0 FLASHBACK LOG 0 0 0 FOREIGN ARCHIVED LOG 0 0 0 7 rows selected.
We took another backup and the previous backup got obsolete. Automatically, obsolete files got deleted because of space pressure inside of the Recovery Area:
[oracle@uhesse ~]$ adrci ADRCI: Release 11.2.0.2.0 - Production on Thu Aug 25 11:56:19 2011 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. ADR base = "/u01/app/oracle/diag" adrci> show alert -tail 2011-08-25 11:24:26.308000 +02:00 Starting background process CJQ0 CJQ0 started with pid=25, OS id=2495 2011-08-25 11:24:38.156000 +02:00 Thread 1 advanced to log sequence 8 (LGWR switch) Current log# 2 seq# 8 mem# 0: /u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_2_75d3og50_.log Current log# 2 seq# 8 mem# 1: /u02/fra/PRIMA/onlinelog/o1_mf_2_75d3ogms_.log Archived Log entry 1 added for thread 1 sequence 7 ID 0x76c1760d dest 1: 2011-08-25 11:34:24.262000 +02:00 Starting background process SMCO SMCO started with pid=23, OS id=2529 2011-08-25 11:39:10.821000 +02:00 Stopping background process CJQ0 2011-08-25 11:46:00.516000 +02:00 ALTER SYSTEM SET db_recovery_file_dest_size='1G' SCOPE=BOTH; 2011-08-25 11:53:14.365000 +02:00 Deleted Oracle managed file /u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_7_75d56p79_.arc 2011-08-25 11:53:15.647000 +02:00 Deleted Oracle managed file /u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T114118_75d65z8m_.bkp
Although it worked well in this scenario, the logical limit is set a little too optimistic here; a peak in Archivelog creation could easily lead to Archiver Stuck problems. Therefore, we increase the Recovery Area to a safer size and look at the consequences of that:
SQL> alter system set db_recovery_file_dest_size=2g; System altered. RMAN> backup database; Starting backup at 25-AUG-11 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=30 device type=DISK channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf channel ORA_DISK_1: starting piece 1 at 25-AUG-11 channel ORA_DISK_1: finished piece 1 at 25-AUG-11 piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T120213_75d7f5wk_.bkp tag=TAG20110825T120213 comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03 Finished backup at 25-AUG-11 Starting Control File and SPFILE Autobackup at 25-AUG-11 piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760104136_75d7f8wb_.bkp comment=NONE Finished Control File and SPFILE Autobackup at 25-AUG-11 RMAN> backup database; Starting backup at 25-AUG-11 using channel ORA_DISK_1 channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf channel ORA_DISK_1: starting piece 1 at 25-AUG-11 channel ORA_DISK_1: finished piece 1 at 25-AUG-11 piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T120231_75d7fqcq_.bkp tag=TAG20110825T120231 comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03 Finished backup at 25-AUG-11 Starting Control File and SPFILE Autobackup at 25-AUG-11 piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760104154_75d7ftjw_.bkp comment=NONE Finished Control File and SPFILE Autobackup at 25-AUG-11 SQL> select * from v$recovery_area_usage; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- CONTROL FILE .36 0 1 REDO LOG 9.77 0 2 ARCHIVED LOG 0 0 0 BACKUP PIECE 58.05 38.75 7 IMAGE COPY 0 0 0 FLASHBACK LOG 0 0 0 FOREIGN ARCHIVED LOG 0 0 0 7 rows selected.
After increasing the logical limit, we took two new backups. Although our retention policy is still redundancy 1, the accordingly obsolete backups did not get deleted yet, because there is no space pressure. Keep in mind that the system will consume almost the complete space that we allow with DB_RECOVERY_FILE_DEST_SIZE, therefore! The background of that is that we may make use of the old obsolete backups for a conventional Database Point In Time Recovery (DBPITR) without Flashback Logs. From now on, we will just take our (nightly) backups and let the system take care about obsolete backups & archivelogs. We should monitor V$RECOERY_AREA_USAGE or rely on an Enterprise Manager Alert that raises automatically if space in the Recovery Area gets scarce.
In other words: We just need to figure out once an appropriate logical limit and the system takes care of itself subsequently, while we just monitor it. This approach is much easier than other methods used before the introduction of the Recovery Area.
c) How to manage Flashback Logs together with the Recovery Area?
If we want to be able to do Flashback Database, we must have a Recovery Area to place the Flashback logs into:
SQL> alter database flashback on; Database altered. SQL> select flashback_on,open_mode from v$database; FLASHBACK_ON OPEN_MODE ------------------ -------------------- YES READ WRITE SQL> show parameter flashback_retention_target NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ db_flashback_retention_target integer 1440
From now on, we will be able to do a very fast DBPITR with Flashback Database, by default one day (1440 minutes) back into the past. Notice that this target is not mandatory and will not lead to a „Flashback Stuck“ if space is consumed already by non obsolete backups & archivelogs. In order to show that, I will again reduce the logical limit to create a space pressure:
SQL> alter system set db_recovery_file_dest_size=700m; System altered. SQL> select * from v$recovery_area_usage; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- CONTROL FILE 1.07 0 1 REDO LOG 28.57 0 2 ARCHIVED LOG 13.72 0 1 BACKUP PIECE 58.63 1.08 4 IMAGE COPY 0 0 0 FLASHBACK LOG 3.35 0 3 FOREIGN ARCHIVED LOG 0 0 0 7 rows selected. SQL> insert /*+ append */ into sales select * from sales; 3200000 rows created. SQL> commit; Commit complete.
The above insert would normally create Flashback Logs but it runs in spite of the full Recovery Area. There is no such thing as „Flashback Stuck“ (unless you create a guaranteed restore point, that is). That scenario is somewhat artificial because the command above creates almost no Redo Protocol (the table sales is on NOLOGGING) and also this is the only activity on the system. With the Recovery Area 100% full, the Database would normally get to a hold immediately – not because of the impossibility to create Flashback Logs, but because of Archiver Stuck:
SQL> connect / as sysdba Connected. SQL> alter system switch logfile; System altered. SQL> r 1* alter system switch logfile Waits ... Second terminal: [oracle@uhesse ~]$ adrci ADRCI: Release 11.2.0.2.0 - Production on Thu Aug 25 12:36:18 2011 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. ADR base = "/u01/app/oracle/diag" adrci> show alert -tail 2011-08-25 12:36:24.236000 +02:00 ************************************************************************ You have following choices to free up space from recovery area: 1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard, then consider changing RMAN ARCHIVELOG DELETION POLICY. 2. Back up files to tertiary device such as tape using RMAN BACKUP RECOVERY AREA command. 3. Add disk space and increase db_recovery_file_dest_size parameter to reflect the new space. 4. Delete unnecessary files using RMAN DELETE command. If an operating system command was used to delete files, then use RMAN CROSSCHECK and DELETE EXPIRED commands. ************************************************************************ ARC1: Error 19809 Creating archive log file to '/u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_9_%u_.arc'
Exactly. LGWR cannot write into the Online Logs anymore because no more Archivelogs can be stored in the Recovery Area. Remedy:
adrci> exit
[oracle@uhesse ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.2.0 Production on Thu Aug 25 12:39:53 2011
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
SQL> alter system set db_recovery_file_dest_size=2g;
System altered.
That resolved the Archiver Stuck problem immediately.
In short: Although it is desireable to have enough space to store flashback logs, it is not mandatory. That is extremely different to archive logs. We should try to make our Recovery Area large enough to store our backups & archive logs according to our Backup Retention, and our Flashback Logs according to our Flashback Retention Target. The space required is partly depending on the Database Load. Therefore, it would be a good idea to start with a somewhat gracefully large DB_RECOVERY_FILE_DEST_SIZE and a relatively short DB_FLASHBACK_RETENTION_TARGET and then monitor V$RECOVERY_AREA_USAGE to come up with an appropriate value for the two parameters.
Coming to the end of this posting, let me try to anticipate the most common objectives against the usage of the Recovery Area:
1. Our Database is too large to go with Disk Storage as the Primary Backup Location!
Agreed. You will only take backups to tape. In spite of that, you may consider to use the Recovery Area to put mirrors of your Controlfiles and Online Logfiles on and to create your Archivelogs into. You must use it if you want to be able to do Flashback Database anyway.
2. We don’t use RMAN!
Agreed. The Recovery Area is not for you. You should have a very good reason not to use RMAN, though.
3. This is too good to be true! We have never done it like that!
Believe it or not, but that is (in slightly other wording, of course) the most common objection that I hear, typically from seasoned DBAs who have crafted their own scripts & methods in 8i already and now hesitate to change that. I have much understanding for the „never touch a running system“ axiom. It should not mislead us to discredit new concepts, just because they are different from what we are used to, though. Keep an open mind for (not so new, meanwhile) New Features that are designed to help you to do your job and consider using it at least for the next Database that you create 🙂
Addendum: See this short posting from Gavin Soorma, showing „Flashback Stuck“ can happen – but only with Guaranteed Restore Points. Keep in mind to drop Guaranteed Restore Points after the critical operation that caused you to create them is over, therefore!
Real-Time Query and Automatic Block Media Recovery in 11gR2
Yesterday, I posted about dealing with Block Corruption in general. Special case is Block Corruption in a Data Guard Environment, where we introduced the New Feature Automatic Block Media Recovery (ABMR). Yes, another cool abbrevation from the the Oracle Realm 🙂
To demonstrate this, I have created a Data Guard Configuration as described in my whitepaper 11g Data Guard on the command line, that you may get from the Downloads page. It’s still 11gR1 but can easily be adopted for 11gR2. Will publish an 11gR2 version of it there soon. My scenario looks like this:
SYS@prima > select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production PL/SQL Release 11.2.0.2.0 - Production CORE 11.2.0.2.0 Production TNS for Linux: Version 11.2.0.2.0 - Production NLSRTL Version 11.2.0.2.0 - Production DGMGRL> show configuration; Configuration - myconf Protection Mode: MaxAvailability Databases: prima - Primary database physt - Physical standby database Fast-Start Failover: DISABLED Configuration Status: SUCCESS DGMGRL> show database physt; Database - physt Role: PHYSICAL STANDBY Intended State: APPLY-ON Transport Lag: 0 seconds Apply Lag: 0 seconds Real Time Query: ON Instance(s): physt Database Status: SUCCESS
The first 11g New Feature related to Block Corruption in a Data Guard Environment is less striking than ABMR and similar as restoring blocks from Flashback Logs: If we have a Physical Standby Database present, the corrupted blocks can be restored from there very fast. I corrupted the emp table of scott in the same way as in the previous posting. After using RMAN to validate as shown also already, I do the Blockrecovery
RMAN> blockrecover corruption list;
Starting recover at 17-NOV-10
using channel ORA_DISK_1
finished standby search, restored 1 blocks
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 17-NOV-10
I need to emphasize that this restore of blocks from the Physical Standby does not require Real-Time Query there – so we don’t need to purchase Active Data Guard for that. But ABMR does. First ABMR scenario: Block Corruption happens on the Primary Database. I do the same steps as shown in the previous posting to corrupt the block containing the emp table of scott. Now scott connects and selects on the emp table. After a short (about 1 sec) break, the select produces the correct result set! The alert log file of the Primary Database records the following:
Wed Nov 17 08:54:29 2010 Hex dump of (file 4, block 131) in trace file /home/oracle/prima/diag/rdbms/prima/prima/trace/prima_ora_5169.trc Corrupt block relative dba: 0x01000083 (file 4, block 131) Bad header found during buffer read Data in bad block: type: 0 format: 2 rdba: 0xffc00000 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x00000000 check value in block header: 0xff66 block checksum disabled Reading datafile '/home/oracle/prima/users01.dbf' for corruption at rdba: 0x01000083 (file 4, block 131) Reread (file 4, block 131) found same corrupt data (no logical check) Starting background process ABMR Wed Nov 17 08:54:29 2010 ABMR started with pid=33, OS id=5174 Automatic block media recovery service is active. Automatic block media recovery requested for (file# 4, block# 131) Wed Nov 17 08:54:30 2010 Automatic block media recovery successful for (file# 4, block# 131) Automatic block media recovery successful for (file# 4, block# 131) WARNING: AutoBMR fixed mismatched on-disk single block ffc00000 with in-mem rdba 1000083.
An end user will probably not even notice the Block Corruption on the Primary Database. Second ABMR scenario: Block Corruption happens on the Physical Standby Database. I destroy the block of the emp table in the same way as on the Primary before. Scott now connects on the Physical Standby and does a select on the emp table. Again a short break – no error message but the correct result set is returned! Alert Log of the Physical Standby records:
Wed Nov 17 08:58:10 2010 Hex dump of (file 4, block 131) in trace file /home/oracle/physt/diag/rdbms/physt/physt/trace/physt_ora_5281.trc Corrupt block relative dba: 0x01000083 (file 4, block 131) Bad header found during buffer read Data in bad block: type: 0 format: 2 rdba: 0xffc00000 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x00000000 check value in block header: 0xff66 block checksum disabled Reading datafile '/home/oracle/physt/users01.dbf' for corruption at rdba: 0x01000083 (file 4, block 131) Reread (file 4, block 131) found same corrupt data (no logical check) Automatic block media recovery requested for (file# 4, block# 131) Wed Nov 17 08:58:11 2010 Automatic block media recovery requested for (file# 4, block# 131) Automatic block media recovery successful for (file# 4, block# 131) Automatic block media recovery successful for (file# 4, block# 131)
Conclusion: We have just seen another benefit of Real-Time Query. Apart from it’s main purpose of making it possible to offload Read-Only (or even Read-Mostly) Applications to the Physical Standby Database, it is now in 11gR2 also leveraged to resolve Block Corruption fast and transparently. We call this 11gR2 New Feature Automatic Block Media Recovery (ABMR)
Dealing with Oracle Database Block Corruption

Media errors don’t always destroy files completely. Sometimes, only small parts of the file are damaged respectively corrupted. It may even not be noticed by end users or admins for a while. This article shows how to detect block corruption and recover from it. The demo is done on 11g but the shown techniques work in the same way for 12c also. I have corrupted blocks on my demo database affecting the emp table of the user scott:
SQL> select * from scott.emp;
select * from scott.emp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 131)
ORA-01110: data file 4: '/home/oracle/prima/users01.dbf'
This shows that not the whole tablespace is affected:
SQL> select * from scott.dept; DEPTNO DNAME LOC ---------- -------------- ------------- 10 ACCOUNTING NEW YORK 20 RESEARCH DALLAS 30 SALES CHICAGO 40 OPERATIONS BOSTON SQL> select table_name,tablespace_name from dba_tables where owner='SCOTT'; TABLE_NAME TABLESPACE_NAME ------------------------------ ------------------------------ DEPT USERS EMP USERS
Whenever we get these kind of error messages, we need to check all the blocks. Typically, error messages about block corruption come up during an RMAN backup, but I like to defer that a little to show an 11g New Feature before. Checking all blocks now:
RMAN> validate check logical database; Starting validate at 16-NOV-10 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=107 device type=DISK channel ORA_DISK_1: starting validation of datafile channel ORA_DISK_1: specifying datafile(s) for validation input datafile file number=00001 name=/home/oracle/prima/system01.dbf input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf input datafile file number=00004 name=/home/oracle/prima/users01.dbf channel ORA_DISK_1: validation complete, elapsed time: 00:00:01 List of Datafiles ================= File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 1 OK 0 17594 38400 277491 File Name: /home/oracle/prima/system01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 13854 Index 0 4487 Other 0 2465 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 2 OK 0 20381 25600 277631 File Name: /home/oracle/prima/sysaux01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 869 Index 0 957 Other 0 3393 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 3 OK 0 541 22784 277631 File Name: /home/oracle/prima/undotbs01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 0 Index 0 0 Other 0 22243 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 4 FAILED 0 1133 1280 271968 File Name: /home/oracle/prima/users01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 10 Index 0 0 Other 1 137 validate found one or more corrupt blocks See trace file /home/oracle/prima/diag/rdbms/prima/prima/trace/prima_ora_18316.trc for details channel ORA_DISK_1: starting validation of datafile channel ORA_DISK_1: specifying datafile(s) for validation including current control file for validation including current SPFILE in backup set channel ORA_DISK_1: validation complete, elapsed time: 00:00:01 List of Control File and SPFILE =============================== File Type Status Blocks Failing Blocks Examined ------------ ------ -------------- --------------- SPFILE OK 0 2 Control File OK 0 612 Finished validate at 16-NOV-10
We have already a couple of 11g New Features here: The syntax has changed from backup validate (since 9i) to just validate (11g) – probably to make clear that this does not perform a backup but a check of corrupted blocks instead. Before 11g, the command did not show the verbose list of checked respectively corrupted blocks like we see above.
The addition check logical will also check for logical block corruption, which is not done by default.
Checking all the blocks here is more efficient than doing an immediate recovery of the one block mentioned in the error message above. There may be many more not spotted yet. Same is true for an ordinary backup that would interrupt at the first spotted corrupted block as we will see later on.
The validate command populated the view v$database_block_corruption, that is now internally read by RMAN in order to repair all the found corrupted blocks. The next 11g New Feature here is: It will take the block out of the Flashback Logs, if present there!
RMAN> blockrecover corruption list; Starting recover at 16-NOV-10 using channel ORA_DISK_1 searching flashback logs for block images finished flashback log search, restored 1 blocks starting media recovery media recovery complete, elapsed time: 00:00:01 Finished recover at 16-NOV-10
I was so bold that I did not even take a backup before – to make sure this new feature must be used:
RMAN> list backup; specification does not match any backup in the repository
I’m going to take a backup now, but before that, I cause again block corruption. So we will see that RMAN stops at the first noticed corrupted block. No Third-Party-Tool would recognize the block corruption, BTW, so we have another reason to actually use RMAN here. If we say backup check logical database instead of just backup database, RMAN will also check for logical block corruption during the backup.
[oracle@uhesse-pc skripte]$ rman target sys/oracle@prima
Recovery Manager: Release 11.2.0.2.0 - Production on Tue Nov 16 15:22:14 2010
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: PRIMA (DBID=1967518488)
RMAN> backup database;
Starting backup at 16-NOV-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=110 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
channel ORA_DISK_1: starting piece 1 at 16-NOV-10
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/16/2010 15:22:22
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/prima/users01.dbf
continuing other job steps, job failed will not be re-run
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 16-NOV-10
channel ORA_DISK_1: finished piece 1 at 16-NOV-10
piece handle=/home/oracle/flashback/PRIMA/backupset/2010_11_16/o1_mf_ncsnf_TAG20101116T152221_6g54wzkb_.bkp tag=TAG20101116T152221 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/16/2010 15:22:22
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/prima/users01.dbf
Again the same sequence as above validate check logical database & blockrecover corruption list will solve the problem. During the whole process, the users tablespace remains online and usable, except the emp table of scott.
Conclusion: We have a powerful tool with RMAN to spot and repair corrupted blocks by using intact versions of the corrupted blocks from backup (since 9i already) or even from Flashback Logs (since 11g) – which is probably faster – while keeping up the availability of the affected tablespace.
