Backup & Recovery | Pickleball spielen

The Recovery Area: Why it is recommended

Veröffentlicht von Uwe Hesse in TOI am September 22, 2011

With 10g, the concept of the Recovery Area – associated with the parameter DB_RECOVERY_FILE_DEST – was introduced. The picture below illustrates it:

The name Recovery Area indicates that this is the place to find everything needed to do a successful recovery in case of a damage (or a logical mistake) in the Database Area.

The Recovery Area was designed in the first place for customers who use Disk Storage as their primary backup media. Meanwhile, due to the ever decreasing price of Disk Storage, this should be the majority of customers. In spite of that, there seems to be some kind of reluctance to actually make use of the multiple benefits, the Recovery Area delivers.

This posting aims to show how deal with the Recovery Area in order to
a) Protect Controlfiles & Online Logfiles
b) Implement a robust Backup Strategy easily
c) Manage Flashback Logs

Let’s start with
a) How does the Recovery Area help to protect Controlfiles & Online Logfiles?

It should be consensus that the Controlfiles and especially the Online Logfiles are sensitive and must be mirrored. If we use the parameter DB_CREATE_FILE_DEST (introduced in 9i already) together with DB_RECOVERY_FILE_DEST, this is done automatically during the Database creation. It is typically but not necessarily so that we use DB_CREATE_FILE_DEST together with ASM. My demo Database is without ASM here:

[oracle@uhesse ~]$ cat skripte/initprima.ora 

compatible=11.2.0.2
db_block_size=8192
db_name='prima'
db_create_file_dest='/u01/app/oracle/oradata'
db_recovery_file_dest='/u02/fra'
db_recovery_file_dest_size=2g
memory_target=640m
undo_management=auto
undo_tablespace=undotbs1
db_domain=''
diagnostic_dest='/u01/app/oracle/diag'

That is my minimalistic init.ora. Notice the absence of the parameter CONTROL_FILES. Now look how the two parameters make it easy for me to create my Database with mirrored Control- and Online Logfiles:

[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Tue Aug 23 11:48:08 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> create spfile from pfile='/home/oracle/skripte/initprima.ora';

File created.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  668082176 bytes
Fixed Size            2229480 bytes
Variable Size          398461720 bytes
Database Buffers      264241152 bytes
Redo Buffers            3149824 bytes
SQL> create database undo tablespace undotbs1
     default tablespace users
     default temporary tablespace temp
     character set al32utf8;

Database created.

That was it already and it took less than a minute. Some more minutes will it take to run catalog & catproc, though. Let’s investigate the new Database:

SQL> select status from v$instance;

STATUS
------------
OPEN

SQL> select name,bytes/1024/1024 as mb from v$datafile union
     select name,bytes/1024/1024 from v$tempfile;

NAME                                                                           MB
---------------------------------------------------------------------- ----------
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf             100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf             100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_temp_75d3oynt_.tmp               100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf            10
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf              100

We see above the effect of specifying the Database Area with DB_CREATE_FILE_DEST: The system created a subdirectory according to the DB_UNIQUE_NAME (derived from DB_NAME in this case) as well as another subdirectory beneath according to the file type.
Now to the mirroring of our most sensitive files:

SQL> select name,is_recovery_dest_file from v$controlfile;

NAME                                                                   IS_
---------------------------------------------------------------------- ---
/u01/app/oracle/oradata/PRIMA/controlfile/o1_mf_756xwk4p_.ctl           NO
/u02/fra/PRIMA/controlfile/o1_mf_756xwk70_.ctl                         YES

SQL> select member,is_recovery_dest_file from v$logfile;

MEMBER                                                                 IS_
---------------------------------------------------------------------- ---
/u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_1_756xwk95_.log           NO
/u02/fra/PRIMA/onlinelog/o1_mf_1_756xwkfr_.log                         YES
/u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_2_756xwkn5_.log           NO
/u02/fra/PRIMA/onlinelog/o1_mf_2_756xwl1t_.log                         YES

We see only with this approach a ‚YES‘ in the IS_REVOVERY_DEST_FILE column of v$controlfile. If you specify the CONTROL_FILES parameter manually (with one file pointing to the Recovery Area) before the CREATE DATABASE command, this does not count. I admit that in earlier versions of 10g, the systems handling of the CONTROL_FILES parameter was not optimal, because it did not write the automatically created Controlfiles into the spfile. But the newer releases do:

SQL> select sid,name,value from v$spparameter where isspecified='TRUE';

SID    NAME                           VALUE
------ ------------------------------ ----------------------------------------------------------------------
*      processes                      100
*      memory_target                  671088640
*      control_files                  /u01/app/oracle/oradata/PRIMA/controlfile/o1_mf_756xwk4p_.ctl
*      control_files                  /u02/fra/PRIMA/controlfile/o1_mf_756xwk70_.ctl
*      db_block_size                  8192
*      compatible                     11.2.0.2
*      db_create_file_dest            /u01/app/oracle/oradata
*      db_recovery_file_dest          /u02/fra
*      db_recovery_file_dest_size     2147483648
*      undo_management                auto
*      undo_tablespace                undotbs1
*      db_domain
*      db_name                        prima
*      diagnostic_dest                /u01/app/oracle/diag

14 rows selected.

A valid reason not to go with this method (mirroring Controlfiles & Online Logfiles into the Recovery Area) would be if the devices there a significantly slower than the ones that make up the Database Area. We may still use the Recovery Area to keep our Backups, Archivelogs & Flashback Logs there, though. Because the devices underneath /u01 and /u02 have similar performance characteristics, it is perfectly OK as it is now on my demo system. Apart from performance aspects, it is of course of key importance to have the Recovery Area placed on different physical devices than the Database Area.

We have two V$-Views that reflect the usage of the Recovery Area:

SQL> select * from V$RECOVERY_FILE_DEST;

NAME                           SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
------------------------------ ----------- ---------- ----------------- ---------------
/u02/fra                        2147483648  218316800                 0               3

The SPACE* columns have the unit bytes, unfortunately. SPACE_LIMIT is derived from the dynamic parameter DB_RECOVERY_FILE_DEST_SIZE that was set here to the (very small) value 2G. SPACE_RECLAIMABLE is derived from the size of the Backups/Archivelogs, being obsolete according to our RMAN retention policy. The next view is newer and shows also the file types, consuming space in the Recover Area:

SQL> select * from V$RECOVERY_AREA_USAGE;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .36                         0               1
REDO LOG                           9.77                         0               2
ARCHIVED LOG                          0                         0               0
BACKUP PIECE                          0                         0               0
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

Because the Database is neither in Archivelog mode nor in Flashback mode yet and we did not take backups, the only files in the Recovery Area are the mirrors of the Online Logs and the Controlfile. We wil change that with the next step.

b) How does the Recovery Area help to implement a robust Backup Strategy?

We will put the Database in Archivelog mode and see that the Archivelogs are being created automatically into the Recovery Area:

SQL> shutdown immediate
 Database closed.
 Database dismounted.
 ORACLE instance shut down.
 SQL> startup mount
 ORACLE instance started.
Total System Global Area  668082176 bytes
 Fixed Size                  2229480 bytes
 Variable Size             444599064 bytes
 Database Buffers          218103808 bytes
 Redo Buffers                3149824 bytes
 Database mounted.
 SQL> alter database archivelog;
Database altered.
SQL> alter database open;
Database altered.
SQL> alter system switch logfile;
System altered.
SQL> select name from v$archived_log;
NAME
 ----------------------------------------------------------------------
 /u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_7_75d56p79_.arc
1 row selected.
SQL> select * from v$recovery_area_usage;
FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
 -------------------- ------------------ ------------------------- ---------------
 CONTROL FILE                        .36                         0               1
 REDO LOG                           9.77                         0               2
 ARCHIVED LOG                       1.46                         0               1
 BACKUP PIECE                          0                         0               0
 IMAGE COPY                            0                         0               0
 FLASHBACK LOG                         0                         0               0
 FOREIGN ARCHIVED LOG                  0                         0               0
7 rows selected.

Notice that I did not need to specify a single initialization parameter of the LOG_ARCHIVE* range for this; easy, isn’t it? Now we come to the almost only remaining difficulty: Specifying an appropriate value for DB_RECOVERY_FILE_DEST_SIZE. That parameter sets a logical space limit, our Database is allowed to consume inside of the Recovery Area. We must set it large enough for our Backups & Archivelogs to fit in, according to our retention policy. Let’s have a look how large our backups get for this Database:

SQL> select name,bytes/1024/1024 as mb from v$datafile union
     select name,bytes/1024/1024 from v$tempfile;    

NAME                                                                           MB
---------------------------------------------------------------------- ----------
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf             100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf             200
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_temp_75d3oynt_.tmp               100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf           180
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf              100

5 rows selected.

SQL> select sum(bytes)/1024/1024 as mb from v$datafile;

        MB
----------
       580

1 row selected.

After catalog & catproc did run, the files grew little but still this is a very tiny Database. With Image Copies, my full Backup would take 580 MB in size, while Backup Sets would be smaller, depending on the amount of empty space inside of the datafiles that Backup Sets would not contain. We are going to take an ordinary default online backup:

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
[oracle@uhesse ~]$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Thu Aug 25 11:40:26 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (DBID=1992392973)

RMAN> show all;

using target database control file instead of recovery catalog
RMAN configuration parameters for database with db_unique_name PRIMA are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0.2/db_1/dbs/snapcf_prima.f'; # default

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP on;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

RMAN> backup database;

Starting backup at 25-AUG-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=29 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T114118_75d65z8m_.bkp tag=TAG20110825T114118 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760102882_75d6633f_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

We could see that the Backup Set as well as the Controlfile Autobackup (a must to switch on!) were taken into the Recovery Area. Now to our calculation about the Recovery Area sizing:

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .36                         0               1
REDO LOG                           9.77                         0               2
ARCHIVED LOG                       1.46                      1.46               1
BACKUP PIECE                      19.07                         0               2
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

SQL> alter system set DB_RECOVERY_FILE_DEST_SIZE=1g;

System altered.

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .73                         0               1
REDO LOG                          19.53                         0               2
ARCHIVED LOG                       2.93                      2.93               1
BACKUP PIECE                      38.14                         0               2
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

With this (very small) value of DB_RECOVERY_FILE_DEST_SIZE, it should still be possible for RMAN to take another full backup into the Recovery Area – because only about 50 % of space is consumed by not obsolete files – after which the previous backup becomes obsolete, together with the Archivelogs, belonging to that backup. Our goal is to just take backups (every night, for example) and let the obsolete backups & archivelogs get deleted automatically. We must keep in mind that it is of key importance that archive logs can be created always to avoid archiver stuck problems, though. Let’s look at the desired behavior:

RMAN> backup database;

Starting backup at 25-AUG-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T115312_75d6w8bd_.bkp tag=TAG20110825T115312 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760103595_75d6wcng_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

RMAN> exit

Recovery Manager complete.
[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Thu Aug 25 11:53:23 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

SQL> set lines 200
SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .73                         0               1
REDO LOG                          19.53                         0               2
ARCHIVED LOG                          0                         0               0
BACKUP PIECE                      38.89                       .74               3
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

We took another backup and the previous backup got obsolete. Automatically, obsolete files got deleted because of space pressure inside of the Recovery Area:

[oracle@uhesse ~]$ adrci

ADRCI: Release 11.2.0.2.0 - Production on Thu Aug 25 11:56:19 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle/diag"
adrci> show alert -tail
2011-08-25 11:24:26.308000 +02:00
Starting background process CJQ0
CJQ0 started with pid=25, OS id=2495
2011-08-25 11:24:38.156000 +02:00
Thread 1 advanced to log sequence 8 (LGWR switch)
  Current log# 2 seq# 8 mem# 0: /u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_2_75d3og50_.log
  Current log# 2 seq# 8 mem# 1: /u02/fra/PRIMA/onlinelog/o1_mf_2_75d3ogms_.log
Archived Log entry 1 added for thread 1 sequence 7 ID 0x76c1760d dest 1:
2011-08-25 11:34:24.262000 +02:00
Starting background process SMCO
SMCO started with pid=23, OS id=2529
2011-08-25 11:39:10.821000 +02:00
Stopping background process CJQ0
2011-08-25 11:46:00.516000 +02:00
ALTER SYSTEM SET db_recovery_file_dest_size='1G' SCOPE=BOTH;
2011-08-25 11:53:14.365000 +02:00
Deleted Oracle managed file /u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_7_75d56p79_.arc
2011-08-25 11:53:15.647000 +02:00
Deleted Oracle managed file /u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T114118_75d65z8m_.bkp

Although it worked well in this scenario, the logical limit is set a little too optimistic here; a peak in Archivelog creation could easily lead to Archiver Stuck problems. Therefore, we increase the Recovery Area to a safer size and look at the consequences of that:

SQL> alter system set db_recovery_file_dest_size=2g;

System altered.

RMAN> backup database;

Starting backup at 25-AUG-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=30 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T120213_75d7f5wk_.bkp tag=TAG20110825T120213 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760104136_75d7f8wb_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

RMAN> backup database;

Starting backup at 25-AUG-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T120231_75d7fqcq_.bkp tag=TAG20110825T120231 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760104154_75d7ftjw_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .36                         0               1
REDO LOG                           9.77                         0               2
ARCHIVED LOG                          0                         0               0
BACKUP PIECE                      58.05                     38.75               7
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

After increasing the logical limit, we took two new backups. Although our retention policy is still redundancy 1, the accordingly obsolete backups did not get deleted yet, because there is no space pressure. Keep in mind that the system will consume almost the complete space that we allow with DB_RECOVERY_FILE_DEST_SIZE, therefore! The background of that is that we may make use of the old obsolete backups for a conventional Database Point In Time Recovery (DBPITR) without Flashback Logs. From now on, we will just take our (nightly) backups and let the system take care about obsolete backups & archivelogs. We should monitor V$RECOERY_AREA_USAGE or rely on an Enterprise Manager Alert that raises automatically if space in the Recovery Area gets scarce.

In other words: We just need to figure out once an appropriate logical limit and the system takes care of itself subsequently, while we just monitor it. This approach is much easier than other methods used before the introduction of the Recovery Area.

c) How to manage Flashback Logs together with the Recovery Area?

If we want to be able to do Flashback Database, we must have a Recovery Area to place the Flashback logs into:

SQL> alter database flashback on;

Database altered.

SQL> select flashback_on,open_mode from v$database;

FLASHBACK_ON       OPEN_MODE
------------------ --------------------
YES                READ WRITE

SQL> show parameter flashback_retention_target

NAME                                 TYPE         VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target         integer     1440

From now on, we will be able to do a very fast DBPITR with Flashback Database, by default one day (1440 minutes) back into the past. Notice that this target is not mandatory and will not lead to a „Flashback Stuck“ if space is consumed already by non obsolete backups & archivelogs. In order to show that, I will again reduce the logical limit to create a space pressure:

SQL> alter system set db_recovery_file_dest_size=700m;

System altered.

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                       1.07                         0               1
REDO LOG                          28.57                         0               2
ARCHIVED LOG                      13.72                         0               1
BACKUP PIECE                      58.63                      1.08               4
IMAGE COPY                            0                         0               0
FLASHBACK LOG                      3.35                         0               3
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

SQL> insert /*+ append */ into sales select * from sales;

3200000 rows created.

SQL> commit;

Commit complete.

The above insert would normally create Flashback Logs but it runs in spite of the full Recovery Area. There is no such thing as „Flashback Stuck“ (unless you create a guaranteed restore point, that is). That scenario is somewhat artificial because the command above creates almost no Redo Protocol (the table sales is on NOLOGGING) and also this is the only activity on the system. With the Recovery Area 100% full, the Database would normally get to a hold immediately – not because of the impossibility to create Flashback Logs, but because of Archiver Stuck:

SQL> connect / as sysdba
Connected.
SQL> alter system switch logfile;

System altered.

SQL> r
  1* alter system switch logfile

Waits ... Second terminal:

[oracle@uhesse ~]$ adrci

ADRCI: Release 11.2.0.2.0 - Production on Thu Aug 25 12:36:18 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle/diag"
adrci> show alert -tail
2011-08-25 12:36:24.236000 +02:00
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
   then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
   BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
   reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
   system command was used to delete files, then use RMAN CROSSCHECK and
   DELETE EXPIRED commands.
************************************************************************
ARC1: Error 19809 Creating archive log file to '/u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_9_%u_.arc'

Exactly. LGWR cannot write into the Online Logs anymore because no more Archivelogs can be stored in the Recovery Area. Remedy:

adrci> exit
[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Thu Aug 25 12:39:53 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

SQL> alter system set db_recovery_file_dest_size=2g;

System altered.

That resolved the Archiver Stuck problem immediately.

In short: Although it is desireable to have enough space to store flashback logs, it is not mandatory. That is extremely different to archive logs. We should try to make our Recovery Area large enough to store our backups & archive logs according to our Backup Retention, and our Flashback Logs according to our Flashback Retention Target. The space required is partly depending on the Database Load. Therefore, it would be a good idea to start with a somewhat gracefully large DB_RECOVERY_FILE_DEST_SIZE and a relatively short DB_FLASHBACK_RETENTION_TARGET and then monitor V$RECOVERY_AREA_USAGE to come up with an appropriate value for the two parameters.

Coming to the end of this posting, let me try to anticipate the most common objectives against the usage of the Recovery Area:

1. Our Database is too large to go with Disk Storage as the Primary Backup Location!
Agreed. You will only take backups to tape. In spite of that, you may consider to use the Recovery Area to put mirrors of your Controlfiles and Online Logfiles on and to create your Archivelogs into. You must use it if you want to be able to do Flashback Database anyway.

2. We don’t use RMAN!
Agreed. The Recovery Area is not for you. You should have a very good reason not to use RMAN, though.

3. This is too good to be true! We have never done it like that!
Believe it or not, but that is (in slightly other wording, of course) the most common objection that I hear, typically from seasoned DBAs who have crafted their own scripts & methods in 8i already and now hesitate to change that. I have much understanding for the „never touch a running system“ axiom. It should not mislead us to discredit new concepts, just because they are different from what we are used to, though. Keep an open mind for (not so new, meanwhile) New Features that are designed to help you to do your job and consider using it at least for the next Database that you create 🙂

Addendum: See this short posting from Gavin Soorma, showing „Flashback Stuck“ can happen – but only with Guaranteed Restore Points. Keep in mind to drop Guaranteed Restore Points after the critical operation that caused you to create them is over, therefore!

Backup & Recovery, flashback, RMAN

24 Kommentare

Dealing with Oracle Database Block Corruption

Veröffentlicht von Uwe Hesse in TOI am November 16, 2010

recover

Media errors don’t always destroy files completely. Sometimes, only small parts of the file are damaged respectively corrupted. It may even not be noticed by end users or admins for a while. This article shows how to detect block corruption and recover from it. The demo is done on 11g but the shown techniques work in the same way for 12c also. I have corrupted blocks on my demo database affecting the emp table of the user scott:

SQL> select * from scott.emp;
select * from scott.emp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 131)
ORA-01110: data file 4: '/home/oracle/prima/users01.dbf'

This shows that not the whole tablespace is affected:

SQL> select * from scott.dept;

 DEPTNO DNAME          LOC
---------- -------------- -------------
 10 ACCOUNTING     NEW YORK
 20 RESEARCH       DALLAS
 30 SALES          CHICAGO
 40 OPERATIONS     BOSTON

SQL> select table_name,tablespace_name from dba_tables where owner='SCOTT';

TABLE_NAME                     TABLESPACE_NAME
------------------------------ ------------------------------
DEPT                           USERS
EMP                            USERS

Whenever we get these kind of error messages, we need to check all the blocks. Typically, error messages about block corruption come up during an RMAN backup, but I like to defer that a little to show an 11g New Feature before. Checking all blocks now:

RMAN> validate check logical database;

Starting validate at 16-NOV-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=107 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1    OK     0              17594        38400           277491
 File Name: /home/oracle/prima/system01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              13854
 Index      0              4487
 Other      0              2465

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     0              20381        25600           277631
 File Name: /home/oracle/prima/sysaux01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              869
 Index      0              957
 Other      0              3393

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
3    OK     0              541          22784           277631
 File Name: /home/oracle/prima/undotbs01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              0
 Index      0              0
 Other      0              22243

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
4    FAILED 0              1133         1280            271968
 File Name: /home/oracle/prima/users01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              10
 Index      0              0
 Other      1              137

validate found one or more corrupt blocks
See trace file /home/oracle/prima/diag/rdbms/prima/prima/trace/prima_ora_18316.trc for details
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
including current control file for validation
including current SPFILE in backup set
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Control File and SPFILE
===============================
File Type    Status Blocks Failing Blocks Examined
------------ ------ -------------- ---------------
SPFILE       OK     0              2
Control File OK     0              612
Finished validate at 16-NOV-10

We have already a couple of 11g New Features here: The syntax has changed from backup validate (since 9i) to just validate (11g) – probably to make clear that this does not perform a backup but a check of corrupted blocks instead. Before 11g, the command did not show the verbose list of checked respectively corrupted blocks like we see above.

The addition check logical will also check for logical block corruption, which is not done by default.

Checking all the blocks here is more efficient than doing an immediate recovery of the one block mentioned in the error message above. There may be many more not spotted yet. Same is true for an ordinary backup that would interrupt at the first spotted corrupted block as we will see later on.

The validate command populated the view v$database_block_corruption, that is now internally read by RMAN in order to repair all the found corrupted blocks. The next 11g New Feature here is: It will take the block out of the Flashback Logs, if present there!

RMAN> blockrecover corruption list;

Starting recover at 16-NOV-10
using channel ORA_DISK_1
searching flashback logs for block images
finished flashback log search, restored 1 blocks

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 16-NOV-10

I was so bold that I did not even take a backup before – to make sure this new feature must be used:

RMAN> list backup;

specification does not match any backup in the repository

I’m going to take a backup now, but before that, I cause again block corruption. So we will see that RMAN stops at the first noticed corrupted block. No Third-Party-Tool would recognize the block corruption, BTW, so we have another reason to actually use RMAN here. If we say backup check logical database instead of just backup database, RMAN will also check for logical block corruption during the backup.

[oracle@uhesse-pc skripte]$ rman target sys/oracle@prima

Recovery Manager: Release 11.2.0.2.0 - Production on Tue Nov 16 15:22:14 2010

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (DBID=1967518488)

RMAN> backup database;

Starting backup at 16-NOV-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=110 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
channel ORA_DISK_1: starting piece 1 at 16-NOV-10
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/16/2010 15:22:22
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/prima/users01.dbf
continuing other job steps, job failed will not be re-run
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 16-NOV-10
channel ORA_DISK_1: finished piece 1 at 16-NOV-10
piece handle=/home/oracle/flashback/PRIMA/backupset/2010_11_16/o1_mf_ncsnf_TAG20101116T152221_6g54wzkb_.bkp tag=TAG20101116T152221 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/16/2010 15:22:22
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/prima/users01.dbf

Again the same sequence as above validate check logical database & blockrecover corruption list will solve the problem. During the whole process, the users tablespace remains online and usable, except the emp table of scott.

Conclusion: We have a powerful tool with RMAN to spot and repair corrupted blocks by using intact versions of the corrupted blocks from backup (since 9i already) or even from Flashback Logs (since 11g) – which is probably faster – while keeping up the availability of the affected tablespace.

11g New Features, Backup & Recovery, flashback, High Availability, RMAN

25 Kommentare