Posts Tagged RMAN

The Recovery Area: Why it is recommended

With 10g, the concept of the Recovery Area – associated with the parameter DB_RECOVERY_FILE_DEST – was introduced. It was designed in the first place for customers who use Disk Storage as their primary backup media. Meanwhile, due to the ever decreasing price of Disk Storage, this should be the majority of customers. In spite of that, there seems to be some kind of reluctance to actually make use of the multiple benefits, the Recovery Area delivers.

This posting aims to show how deal with the Recovery Area in order to
a) Protect Controlfiles & Online Logfiles
b) Implement a robust Backup Strategy easily
c) Manage Flashback Logs

Let’s start with
a) How does the Recovery Area help to protect Controlfiles & Online Logfiles?

It should be consense that the Controlfiles and especially the Online Logfiles are sensitive and must be mirrored. If we use the parameter DB_CREATE_FILE_DEST (introduced in 9i already) together with DB_RECOVERY_FILE_DEST, this is done automatically during the Database creation. It is typically but not necessarily so that we use DB_CREATE_FILE_DEST together with ASM. My demo Database is without ASM here:

[oracle@uhesse ~]$ cat skripte/initprima.ora 

compatible=11.2.0.2
db_block_size=8192
db_name='prima'
db_create_file_dest='/u01/app/oracle/oradata'
db_recovery_file_dest='/u02/fra'
db_recovery_file_dest_size=2g
memory_target=640m
undo_management=auto
undo_tablespace=undotbs1
db_domain=''
diagnostic_dest='/u01/app/oracle/diag'

That is my minimalistic init.ora. Notice the absence of the parameter CONTROL_FILES. Now look how the two parameters make it easy for me to create my Database with mirrored Control- and Online Logfiles:

[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Tue Aug 23 11:48:08 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> create spfile from pfile='/home/oracle/skripte/initprima.ora';

File created.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  668082176 bytes
Fixed Size            2229480 bytes
Variable Size          398461720 bytes
Database Buffers      264241152 bytes
Redo Buffers            3149824 bytes
SQL> create database undo tablespace undotbs1
     default tablespace users
     default temporary tablespace temp
     character set al32utf8;

Database created.

That was it already and it took less than a minute. Some more minutes will it take to run catalog & catproc, though. Let’s investigate the new Database:

SQL> select status from v$instance;

STATUS
------------
OPEN

SQL> select name,bytes/1024/1024 as mb from v$datafile union
     select name,bytes/1024/1024 from v$tempfile;

NAME                                                                           MB
---------------------------------------------------------------------- ----------
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf             100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf             100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_temp_75d3oynt_.tmp               100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf            10
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf              100

We see above the effect of specifying the Database Area with DB_CREATE_FILE_DEST: The system created a subdirectory according to the DB_UNIQUE_NAME (derived from DB_NAME in this case) as well as another subdirectory beneath according to the file type.
Now to the mirroring of our most sensitive files:

SQL> select name,is_recovery_dest_file from v$controlfile;

NAME                                                                   IS_
---------------------------------------------------------------------- ---
/u01/app/oracle/oradata/PRIMA/controlfile/o1_mf_756xwk4p_.ctl           NO
/u02/fra/PRIMA/controlfile/o1_mf_756xwk70_.ctl                         YES

SQL> select member,is_recovery_dest_file from v$logfile;

MEMBER                                                                 IS_
---------------------------------------------------------------------- ---
/u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_1_756xwk95_.log           NO
/u02/fra/PRIMA/onlinelog/o1_mf_1_756xwkfr_.log                         YES
/u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_2_756xwkn5_.log           NO
/u02/fra/PRIMA/onlinelog/o1_mf_2_756xwl1t_.log                         YES

We see only with this approach a ‘YES’ in the IS_REVOVERY_DEST_FILE column of v$controlfile. If you specify the CONTROL_FILES parameter manually (with one file pointing to the Recovery Area) before the CREATE DATABASE command, this does not count. I admit that in earlier versions of 10g, the systems handling of the CONTROL_FILES parameter was not optimal, because it did not write the automatically created Controlfiles into the spfile. But the newer releases do:

SQL> select sid,name,value from v$spparameter where isspecified='TRUE';

SID    NAME                           VALUE
------ ------------------------------ ----------------------------------------------------------------------
*      processes                      100
*      memory_target                  671088640
*      control_files                  /u01/app/oracle/oradata/PRIMA/controlfile/o1_mf_756xwk4p_.ctl
*      control_files                  /u02/fra/PRIMA/controlfile/o1_mf_756xwk70_.ctl
*      db_block_size                  8192
*      compatible                     11.2.0.2
*      db_create_file_dest            /u01/app/oracle/oradata
*      db_recovery_file_dest          /u02/fra
*      db_recovery_file_dest_size     2147483648
*      undo_management                auto
*      undo_tablespace                undotbs1
*      db_domain
*      db_name                        prima
*      diagnostic_dest                /u01/app/oracle/diag

14 rows selected.

A valid reason not to go with this method (mirroring Controlfiles & Online Logfiles into the Recovery Area) would be if the devices there a significantly slower than the ones that make up the Database Area. We may still use the Recovery Area to keep our Backups, Archivelogs & Flashback Logs there, though. Because the devices underneath /u01 and /u02 have similar performance characteristics, it is perfectly OK as it is now on my demo system. Apart from performance aspects, it is of course of key importance to have the Recovery Area placed on different physical devices than the Database Area.

We have two V$-Views that reflect the usage of the Recovery Area:

SQL> select * from V$RECOVERY_FILE_DEST;

NAME                           SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
------------------------------ ----------- ---------- ----------------- ---------------
/u02/fra                        2147483648  218316800                 0               3

The SPACE* columns have the unit bytes, unfortunately. SPACE_LIMIT is derived from the dynamic parameter DB_RECOVERY_FILE_DEST_SIZE that was  set here to the (very small) value 2G. SPACE_RECLAIMABLE is derived from the size of the Backups/Archivelogs, being obsolete according to our RMAN retention policy. The next view is newer and shows also the file types, consuming space in the Recover Area:

SQL> select * from V$RECOVERY_AREA_USAGE;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .36                         0               1
REDO LOG                           9.77                         0               2
ARCHIVED LOG                          0                         0               0
BACKUP PIECE                          0                         0               0
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

Because the Database is neither in Archivelog mode nor in Flashback mode yet and we did not take backups, the only files in the Recovery Area are the mirrors of the Online Logs and the Controlfile. We wil change that with the next step.

b) How does the Recovery Area help to implement a robust Backup Strategy?

We will put the Database in Archivelog mode and see that the Archivelogs are being created automatically into the Recovery Area:

SQL> shutdown immediate
 Database closed.
 Database dismounted.
 ORACLE instance shut down.
 SQL> startup mount
 ORACLE instance started.
Total System Global Area  668082176 bytes
 Fixed Size                  2229480 bytes
 Variable Size             444599064 bytes
 Database Buffers          218103808 bytes
 Redo Buffers                3149824 bytes
 Database mounted.
 SQL> alter database archivelog;
Database altered.
SQL> alter database open;
Database altered.
SQL> alter system switch logfile;
System altered.
SQL> select name from v$archived_log;
NAME
 ----------------------------------------------------------------------
 /u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_7_75d56p79_.arc
1 row selected.
SQL> select * from v$recovery_area_usage;
FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
 -------------------- ------------------ ------------------------- ---------------
 CONTROL FILE                        .36                         0               1
 REDO LOG                           9.77                         0               2
 ARCHIVED LOG                       1.46                         0               1
 BACKUP PIECE                          0                         0               0
 IMAGE COPY                            0                         0               0
 FLASHBACK LOG                         0                         0               0
 FOREIGN ARCHIVED LOG                  0                         0               0
7 rows selected.

Notice that I did not need to specify a single initialization parameter of the LOG_ARCHIVE* range for this; easy, isn’t it? Now we come to the almost only remaining difficulty: Specifying an appropriate value for DB_RECOVERY_FILE_DEST_SIZE. That parameter sets a logical space limit, our Database is allowed to consume inside of the Recovery Area. We must set it large enough for our Backups & Archivelogs to fit in, according to our retention policy. Let’s have a look how large our backups get for this Database:

SQL> select name,bytes/1024/1024 as mb from v$datafile union
     select name,bytes/1024/1024 from v$tempfile;    

NAME                                                                           MB
---------------------------------------------------------------------- ----------
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf             100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf             200
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_temp_75d3oynt_.tmp               100
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf           180
/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf              100

5 rows selected.

SQL> select sum(bytes)/1024/1024 as mb from v$datafile;

        MB
----------
       580

1 row selected.

After catalog & catproc did run, the files grew little but still this is a very tiny Database. With Image Copies, my full Backup would take 580 MB in size, while Backup Sets would be smaller, depending on the amount of empty space inside of the datafiles that Backup Sets would not contain. We are going to take an ordinary default online backup:

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
[oracle@uhesse ~]$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Thu Aug 25 11:40:26 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (DBID=1992392973)

RMAN> show all;

using target database control file instead of recovery catalog
RMAN configuration parameters for database with db_unique_name PRIMA are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0.2/db_1/dbs/snapcf_prima.f'; # default

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP on;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

RMAN> backup database;

Starting backup at 25-AUG-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=29 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T114118_75d65z8m_.bkp tag=TAG20110825T114118 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760102882_75d6633f_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

We could see that the Backup Set as well as the Controlfile Autobackup (a must to switch on!) were taken into the Recovery Area. Now to our calculation about the Recovery Area sizing:

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .36                         0               1
REDO LOG                           9.77                         0               2
ARCHIVED LOG                       1.46                      1.46               1
BACKUP PIECE                      19.07                         0               2
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

SQL> alter system set DB_RECOVERY_FILE_DEST_SIZE=1g;

System altered.

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .73                         0               1
REDO LOG                          19.53                         0               2
ARCHIVED LOG                       2.93                      2.93               1
BACKUP PIECE                      38.14                         0               2
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

With this (very small) value of DB_RECOVERY_FILE_DEST_SIZE, it should still be possible for RMAN to take another full backup into the Recovery Area – because only about 50 % of space is consumed by not obsolete files – after which the previous backup becomes obsolete, together with the Archivelogs, belonging to that backup. Our goal is to just take backups (every night, for example) and let the obsolete backups & archivelogs get deleted automatically. We must keep in mind that it is of key importance that archive logs can be created always to avoid archiver stuck problems, though. Let’s look at the desired behavior:

RMAN> backup database;

Starting backup at 25-AUG-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T115312_75d6w8bd_.bkp tag=TAG20110825T115312 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760103595_75d6wcng_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

RMAN> exit

Recovery Manager complete.
[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Thu Aug 25 11:53:23 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

SQL> set lines 200
SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .73                         0               1
REDO LOG                          19.53                         0               2
ARCHIVED LOG                          0                         0               0
BACKUP PIECE                      38.89                       .74               3
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

We took another backup and the previous backup got obsolete. Automatically, obsolete files got deleted because of space pressure inside of the Recovery Area:

[oracle@uhesse ~]$ adrci

ADRCI: Release 11.2.0.2.0 - Production on Thu Aug 25 11:56:19 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle/diag"
adrci> show alert -tail
2011-08-25 11:24:26.308000 +02:00
Starting background process CJQ0
CJQ0 started with pid=25, OS id=2495
2011-08-25 11:24:38.156000 +02:00
Thread 1 advanced to log sequence 8 (LGWR switch)
  Current log# 2 seq# 8 mem# 0: /u01/app/oracle/oradata/PRIMA/onlinelog/o1_mf_2_75d3og50_.log
  Current log# 2 seq# 8 mem# 1: /u02/fra/PRIMA/onlinelog/o1_mf_2_75d3ogms_.log
Archived Log entry 1 added for thread 1 sequence 7 ID 0x76c1760d dest 1:
2011-08-25 11:34:24.262000 +02:00
Starting background process SMCO
SMCO started with pid=23, OS id=2529
2011-08-25 11:39:10.821000 +02:00
Stopping background process CJQ0
2011-08-25 11:46:00.516000 +02:00
ALTER SYSTEM SET db_recovery_file_dest_size='1G' SCOPE=BOTH;
2011-08-25 11:53:14.365000 +02:00
Deleted Oracle managed file /u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_7_75d56p79_.arc
2011-08-25 11:53:15.647000 +02:00
Deleted Oracle managed file /u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T114118_75d65z8m_.bkp

Although it worked well in this scenario, the logical limit is set a little too optimistic here; a peak in Archivelog creation could easily lead to Archiver Stuck problems. Therefore, we increase the Recovery Area to a safer size and look at the consequences of that:

SQL> alter system set db_recovery_file_dest_size=2g;

System altered.

RMAN> backup database;

Starting backup at 25-AUG-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=30 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T120213_75d7f5wk_.bkp tag=TAG20110825T120213 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760104136_75d7f8wb_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

RMAN> backup database;

Starting backup at 25-AUG-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_system_75d3on35_.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_undotbs1_75d3oxw5_.dbf
input datafile file number=00002 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_sysaux_75d3os4b_.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/PRIMA/datafile/o1_mf_users_75d3oyp3_.dbf
channel ORA_DISK_1: starting piece 1 at 25-AUG-11
channel ORA_DISK_1: finished piece 1 at 25-AUG-11
piece handle=/u02/fra/PRIMA/backupset/2011_08_25/o1_mf_nnndf_TAG20110825T120231_75d7fqcq_.bkp tag=TAG20110825T120231 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 25-AUG-11

Starting Control File and SPFILE Autobackup at 25-AUG-11
piece handle=/u02/fra/PRIMA/autobackup/2011_08_25/o1_mf_s_760104154_75d7ftjw_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 25-AUG-11

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                        .36                         0               1
REDO LOG                           9.77                         0               2
ARCHIVED LOG                          0                         0               0
BACKUP PIECE                      58.05                     38.75               7
IMAGE COPY                            0                         0               0
FLASHBACK LOG                         0                         0               0
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

After increasing the logical limit, we took two new backups. Although our retention policy is still redundancy 1, the accordingly obsolete backups did not get deleted yet, because there is no space pressure. Keep in mind that the system will consume almost the complete space that we allow with DB_RECOVERY_FILE_DEST_SIZE, therefore! The background of that is that we may make use of the old obsolete backups for a conventional Database Point In Time Recovery (DBPITR) without Flashback Logs. From now on, we will just take our (nightly) backups and let the system take care about obsolete backups & archivelogs. We should monitor V$RECOERY_AREA_USAGE or rely on an Enterprise Manager Alert that raises automatically if space in the Recovery Area gets scarce.

In other words: We just need to figure out once an appropriate logical limit and the system takes care of itself subsequently, while we just monitor it. This approach is much easier than other methods used before the introduction of the Recovery Area.

c) How to manage Flashback Logs together with the Recovery Area?

If we want to be able to do Flashback Database, we must have a Recovery Area to place the Flashback logs into:

SQL> alter database flashback on;

Database altered.

SQL> select flashback_on,open_mode from v$database;

FLASHBACK_ON       OPEN_MODE
------------------ --------------------
YES                READ WRITE

SQL> show parameter flashback_retention_target

NAME                                 TYPE         VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target         integer     1440

From now on, we will be able to do a very fast DBPITR with Flashback Database, by default one day (1440 minutes) back into the past. Notice that this target is not mandatory and will not lead to a “Flashback Stuck” if space is consumed already by non obsolete backups & archivelogs. In order to show that, I will again reduce the logical limit to create a space pressure:

SQL> alter system set db_recovery_file_dest_size=700m;

System altered.

SQL> select * from v$recovery_area_usage;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                       1.07                         0               1
REDO LOG                          28.57                         0               2
ARCHIVED LOG                      13.72                         0               1
BACKUP PIECE                      58.63                      1.08               4
IMAGE COPY                            0                         0               0
FLASHBACK LOG                      3.35                         0               3
FOREIGN ARCHIVED LOG                  0                         0               0

7 rows selected.

SQL> insert /*+ append */ into sales select * from sales;

3200000 rows created.

SQL> commit;

Commit complete.

The above insert would normally create Flashback Logs but it runs in spite of the full Recovery Area. There is no such thing as “Flashback Stuck” (unless you create a guaranteed restore point, that is). That scenario is somewhat artificial because the command above creates almost no Redo Protocol (the table sales is on NOLOGGING) and also this is the only activity on the system. With the Recovery Area 100% full, the Database would normally get to a hold immediately – not because of the impossibility to create Flashback Logs, but because of Archiver Stuck:

SQL> connect / as sysdba
Connected.
SQL> alter system switch logfile;

System altered.

SQL> r
  1* alter system switch logfile

Waits ... Second terminal:

[oracle@uhesse ~]$ adrci

ADRCI: Release 11.2.0.2.0 - Production on Thu Aug 25 12:36:18 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle/diag"
adrci> show alert -tail
2011-08-25 12:36:24.236000 +02:00
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
   then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
   BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
   reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
   system command was used to delete files, then use RMAN CROSSCHECK and
   DELETE EXPIRED commands.
************************************************************************
ARC1: Error 19809 Creating archive log file to '/u02/fra/PRIMA/archivelog/2011_08_25/o1_mf_1_9_%u_.arc'

Exactly. LGWR cannot write into the Online Logs anymore because no more Archivelogs can be stored in the Recovery Area. Remedy:

adrci> exit
[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Thu Aug 25 12:39:53 2011

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production

SQL> alter system set db_recovery_file_dest_size=2g;

System altered.

That resolved the Archiver Stuck problem immediately.

In short: Although it is desireable to have enough space to store flashback logs, it is not mandatory. That is extremely different to archive logs. We should try to make our Recovery Area large enough to store our backups & archive logs according to our Backup Retention, and our Flashback Logs according to our Flashback Retention Target. The space required is partly depending on the Database Load. Therefore, it would be a good idea to start with a somewhat gracefully large DB_RECOVERY_FILE_DEST_SIZE and a relatively short DB_FLASHBACK_RETENTION_TARGET and then monitor V$RECOVERY_AREA_USAGE to come up with an appropriate value for the two parameters.

Coming to the end of this posting, let me try to anticipate the most common objectives against the usage of the Recovery Area:

1. Our Database is too large to go with Disk Storage as the Primary Backup Location!
Agreed. You will only take backups to tape. In spite of that, you may consider to use the Recovery Area to put mirrors of your Controlfiles and Online Logfiles on and to create your Archivelogs into. You must use it if you want to be able to do Flashback Database anyway.

2. We don’t use RMAN!
Agreed. The Recovery Area is not for you. You should have a very good reason not to use RMAN, though.

3. This is too good to be true! We have never done it like that!
Believe it or not, but that is (in slightly other wording, of course) the most common objection that I hear, typically from seasoned DBAs who have crafted their own scripts & methods in 8i already and now hesitate to change that. I have much understanding for the “never touch a running system” axiom. It should not mislead us to discredit new concepts, just because they are different from what we are used to, though. Keep an open mind for (not so new, meanwhile) New Features that are designed to help you to do your job and consider using it at least for the next Database that you create :-)

Addendum: See this short posting from Gavin Soorma, showing “Flashback Stuck” can happen – but only with Guaranteed Restore Points. Keep in mind to drop Guaranteed Restore Points after the critical operation that caused you to create them is over, therefore!

, ,

6 Comments

Clone Database from Snapshot Standby Database

In 11g, we introduced the new feature Snapshot Standby Database for Data Guard. In my current Data Guard course, one of the attendees asked, whether it is possible to create a Database Clone out of such a Snapshot Standby Database. Maybe you want to preserve the changes that have been done during the time, your Physical Standby Database was turned into a READ-WRITE opened (Test-)Database. I answered that this is mot likely possible, but I did not try yet. Now I did :-)

DGMGRL> convert database physt to snapshot standby;
Converting database "physt" to a Snapshot Standby database, please wait...
Database "physt" converted successfully

There is my Snapshot Standby Database. You may reproduce the Data Guard Configuration in place here with the Whitepaper 11g Data Guard on the Command Line from my Downloads page.

DGMGRL> show configuration

Configuration - myconf

 Protection Mode: MaxPerformance
 Databases:
 prima - Primary database
 physt - Snapshot standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

I continue to prepare a Clone Database by creating an spfile and a password file for it (very similar as described in the Whitepaper for the physt Database). Then I start it into NOMOUNT:

uhesse $ sqlplus sys/oracle@clon as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Tue Dec 21 17:00:26 2010

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  313860096 bytes
Fixed Size                  1336232 bytes
Variable Size             209718360 bytes
Database Buffers           96468992 bytes
Redo Buffers                6336512 bytes
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

The version in use is 11gR2, but it should work with 11gR1 the same way. Now RMAN is used to create the Clone from the Snapshot Standby:

RMAN> connect target sys/oracle@physt

connected to target database: PRIMA (DBID=1970640575)

RMAN> connect auxiliary sys/oracle@clon

connected to auxiliary database: CLON (not mounted)

RMAN> duplicate target database to clon from active database;

Starting Duplicate Db at 21-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=96 device type=DISK
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 12/21/2010 17:03:19
RMAN-05541: no archived logs found in target database

RMAN> exit

Oops! I need Archivelogs on the Snapshot Standby. Note that they are in the same directory but with a different Resetlogs Identifier as the Archivelogs created when the Database was a Physical Standby Database.

uhesse $ sqlplus sys/oracle@physt as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Tue Dec 21 17:03:49 2010

Copyright (c) 1982, 2009, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter system switch logfile;

System altered.

SQL> select name from v$archived_log;

NAME
--------------------------------------------------------------------------------
/home/oracle/physt/archive/1_12_738347903.dbf
/home/oracle/physt/archive/1_8_738347903.dbf
/home/oracle/physt/archive/1_7_738347903.dbf
/home/oracle/physt/archive/1_9_738347903.dbf
/home/oracle/physt/archive/1_11_738347903.dbf
/home/oracle/physt/archive/1_10_738347903.dbf
/home/oracle/physt/archive/1_13_738347903.dbf
/home/oracle/physt/archive/1_14_738347903.dbf
/home/oracle/physt/archive/1_15_738347903.dbf
/home/oracle/physt/archive/1_1_738349098.dbf

10 rows selected.

Again we go with the duplicate:

RMAN> connect target sys/oracle@physt

connected to target database: PRIMA (DBID=1970640575)

RMAN> connect auxiliary sys/oracle@clon

connected to auxiliary database: CLON (not mounted)

RMAN> duplicate target database to clon from active database;

Starting Duplicate Db at 21-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=95 device type=DISK

contents of Memory Script:
{
 sql clone "alter system set  db_name =
 ''PRIMA'' comment=
 ''Modified by RMAN duplicate'' scope=spfile";
 sql clone "alter system set  db_unique_name =
 ''CLON'' comment=
 ''Modified by RMAN duplicate'' scope=spfile";
 shutdown clone immediate;
 startup clone force nomount
 backup as copy current controlfile auxiliary format  '/home/oracle/clon/control01.ctl';
 alter clone database mount;
}
executing Memory Script

sql statement: alter system set  db_name =  ''PRIMA'' comment= ''Modified by RMAN duplicate'' scope=spfile

sql statement: alter system set  db_unique_name =  ''CLON'' comment= ''Modified by RMAN duplicate'' scope=spfile

Oracle instance shut down

Oracle instance started

Total System Global Area     313860096 bytes

Fixed Size                     1336232 bytes
Variable Size                209718360 bytes
Database Buffers              96468992 bytes
Redo Buffers                   6336512 bytes

Starting backup at 21-DEC-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 device type=DISK
channel ORA_DISK_1: starting datafile copy
copying current control file
output file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_physt.f tag=TAG20101221T172429 RECID=6 STAMP=738350670
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 21-DEC-10

database mounted

contents of Memory Script:
{
 set newname for datafile  1 to
 "/home/oracle/clon/system01.dbf";
 set newname for datafile  2 to
 "/home/oracle/clon/sysaux01.dbf";
 set newname for datafile  3 to
 "/home/oracle/clon/undotbs01.dbf";
 set newname for datafile  4 to
 "/home/oracle/clon/users01.dbf";
 backup as copy reuse
 datafile  1 auxiliary format
 "/home/oracle/clon/system01.dbf"   datafile
 2 auxiliary format
 "/home/oracle/clon/sysaux01.dbf"   datafile
 3 auxiliary format
 "/home/oracle/clon/undotbs01.dbf"   datafile
 4 auxiliary format
 "/home/oracle/clon/users01.dbf"   ;
 sql 'alter system archive log current';
}
executing Memory Script

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting backup at 21-DEC-10
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/home/oracle/physt/system01.dbf
output file name=/home/oracle/clon/system01.dbf tag=TAG20101221T172436
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/home/oracle/physt/sysaux01.dbf
output file name=/home/oracle/clon/sysaux01.dbf tag=TAG20101221T172436
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/home/oracle/physt/undotbs01.dbf
output file name=/home/oracle/clon/undotbs01.dbf tag=TAG20101221T172436
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/home/oracle/physt/users01.dbf
output file name=/home/oracle/clon/users01.dbf tag=TAG20101221T172436
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
Finished backup at 21-DEC-10

sql statement: alter system archive log current

contents of Memory Script:
{
 backup as copy reuse
 archivelog like  "/home/oracle/physt/archive/1_3_738349098.dbf" auxiliary format
 "/home/oracle/clon/archive/1_3_738349098.dbf"   ;
 catalog clone archivelog  "/home/oracle/clon/archive/1_3_738349098.dbf";
 switch clone datafile all;
}
executing Memory Script

Starting backup at 21-DEC-10
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log copy
input archived log thread=1 sequence=3 RECID=12 STAMP=738350751
output file name=/home/oracle/clon/archive/1_3_738349098.dbf RECID=0 STAMP=0
channel ORA_DISK_1: archived log copy complete, elapsed time: 00:00:15
Finished backup at 21-DEC-10

cataloged archived log
archived log file name=/home/oracle/clon/archive/1_3_738349098.dbf RECID=12 STAMP=738350766

datafile 1 switched to datafile copy
input datafile copy RECID=6 STAMP=738350766 file name=/home/oracle/clon/system01.dbf
datafile 2 switched to datafile copy
input datafile copy RECID=7 STAMP=738350767 file name=/home/oracle/clon/sysaux01.dbf
datafile 3 switched to datafile copy
input datafile copy RECID=8 STAMP=738350767 file name=/home/oracle/clon/undotbs01.dbf
datafile 4 switched to datafile copy
input datafile copy RECID=9 STAMP=738350767 file name=/home/oracle/clon/users01.dbf

contents of Memory Script:
{
 set until scn  226204;
 recover
 clone database
 delete archivelog
 ;
}
executing Memory Script

executing command: SET until clause

Starting recover at 21-DEC-10
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=95 device type=DISK

starting media recovery

archived log for thread 1 with sequence 3 is already on disk as file /home/oracle/clon/archive/1_3_738349098.dbf
archived log file name=/home/oracle/clon/archive/1_3_738349098.dbf thread=1 sequence=3
media recovery complete, elapsed time: 00:00:00
Finished recover at 21-DEC-10

contents of Memory Script:
{
 shutdown clone immediate;
 startup clone nomount;
 sql clone "alter system set  db_name =
 ''CLON'' comment=
 ''Reset to original value by RMAN'' scope=spfile";
 sql clone "alter system reset  db_unique_name scope=spfile";
 shutdown clone immediate;
 startup clone nomount;
}
executing Memory Script

database dismounted
Oracle instance shut down

connected to auxiliary database (not started)
Oracle instance started

Total System Global Area     313860096 bytes

Fixed Size                     1336232 bytes
Variable Size                209718360 bytes
Database Buffers              96468992 bytes
Redo Buffers                   6336512 bytes

sql statement: alter system set  db_name =  ''CLON'' comment= ''Reset to original value by RMAN'' scope=spfile

sql statement: alter system reset  db_unique_name scope=spfile

Oracle instance shut down

connected to auxiliary database (not started)
Oracle instance started

Total System Global Area     313860096 bytes

Fixed Size                     1336232 bytes
Variable Size                209718360 bytes
Database Buffers              96468992 bytes
Redo Buffers                   6336512 bytes
sql statement: CREATE CONTROLFILE REUSE SET DATABASE "CLON" RESETLOGS ARCHIVELOG
 MAXLOGFILES     26
 MAXLOGMEMBERS      5
 MAXDATAFILES      100
 MAXINSTANCES     1
 MAXLOGHISTORY      292
 LOGFILE
 GROUP  1 ( '/home/oracle/clon/log_g1m1.rdo' ) SIZE 100 M  REUSE,
 GROUP  2 ( '/home/oracle/clon/log_g2m1.rdo' ) SIZE 100 M  REUSE
 DATAFILE
 '/home/oracle/clon/system01.dbf'
 CHARACTER SET WE8ISO8859P1


contents of Memory Script:
{
 set newname for tempfile  1 to
 "/home/oracle/clon/temp01.dbt";
 switch clone tempfile all;
 catalog clone datafilecopy  "/home/oracle/clon/sysaux01.dbf",
 "/home/oracle/clon/undotbs01.dbf",
 "/home/oracle/clon/users01.dbf";
 switch clone datafile all;
}
executing Memory Script

executing command: SET NEWNAME

renamed tempfile 1 to /home/oracle/clon/temp01.dbt in control file

cataloged datafile copy
datafile copy file name=/home/oracle/clon/sysaux01.dbf RECID=1 STAMP=738350787
cataloged datafile copy
datafile copy file name=/home/oracle/clon/undotbs01.dbf RECID=2 STAMP=738350787
cataloged datafile copy
datafile copy file name=/home/oracle/clon/users01.dbf RECID=3 STAMP=738350787

datafile 2 switched to datafile copy
input datafile copy RECID=1 STAMP=738350787 file name=/home/oracle/clon/sysaux01.dbf
datafile 3 switched to datafile copy
input datafile copy RECID=2 STAMP=738350787 file name=/home/oracle/clon/undotbs01.dbf
datafile 4 switched to datafile copy
input datafile copy RECID=3 STAMP=738350787 file name=/home/oracle/clon/users01.dbf

contents of Memory Script:
{
 Alter clone database open resetlogs;
}
executing Memory Script

database opened
Finished Duplicate Db at 21-DEC-10

So this was pretty much the same as with an ordinary Clone from a Production Database – as expected. Now we turn the Snapshot Standby back into Physical Standby, thereby losing all the changes that might be done there by using Flashback Database internally. But the Clone Database will remain, now being an independent Database with another Name and DBID than the Primary resp. Snapshot Standby.

DGMGRL> convert database physt to physical standby;
Converting database "physt" to a Physical Standby database, please wait...
Operation requires shutdown of instance "physt" on database "physt"
Shutting down instance "physt"...
Database closed.
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "physt" on database "physt"
Starting instance "physt"...
ORACLE instance started.
Database mounted.
Continuing to convert database "physt" ...
Operation requires shutdown of instance "physt" on database "physt"
Shutting down instance "physt"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "physt" on database "physt"
Starting instance "physt"...
ORACLE instance started.
Database mounted.
Database "physt" converted successfully
DGMGRL> show configuration

Configuration - myconf

 Protection Mode: MaxPerformance
 Databases:
 prima - Primary database
 physt - Physical standby database
 
Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS
SQL> connect sys/oracle@clon as sysdba
Connected.
SQL> select name,dbid,database_role from v$database;

NAME            DBID DATABASE_ROLE
--------- ---------- ----------------
CLON      2202079427 PRIMARY

SQL> connect sys/oracle@physt as sysdba
Connected.
SQL> select name,dbid,database_role from v$database;

NAME            DBID DATABASE_ROLE
--------- ---------- ----------------
PRIMA     1970640575 PHYSICAL STANDBY

SQL> connect sys/oracle@prima as sysdba
Connected.
SQL> select name,dbid,database_role from v$database;

NAME            DBID DATABASE_ROLE
--------- ---------- ----------------
PRIMA     1970640575 PRIMARY

Conclusion: With 11g, we can very comfortably transform our Physical Standby into a fully changeable Snapshot Standby, clone that modified Database into an autonomous Database to keep the changes and afterwards transform the Snapshot Standby back into Physical Standby. During the whole action, Redo Protocol from the Primary Database can still be received – but not applied – on the Standby Site.

 

, , , ,

2 Comments

Database Migration to ASM with short downtime

I made up my mind to start a series of postings related to ASM. We recommend using ASM since 10g and recently introduced many new features with it in 11g. On the other hand, I often encounter customers even in advanced courses who didn’t have much contact with ASM yet. So there may be quite a demand for some explanations. On my Downloads page, there is a brief paper in German called ‘ASM Vortrag’ that summarizes some of the main benefits of 10g ASM. Or look at OTN for more details.

We start our first scenario with a small Database running in a conventional filesystem, setup ASM instance and two ASM diskgroups, then migrate to ASM with the downtime it takes only to shutdown and restart the instance – in other words, the downtime will be in the range of only minutes on a productive system. The example is done on a small Linux server using Oracle Database Enterprise Edition 11.2.0.2, but should work very similar on other platforms and with the 10g version as well.

SQL> select * from v$version;

BANNER
----------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select log_mode from v$database;

LOG_MODE
------------
ARCHIVELOG
SQL> select name from v$datafile union select name from v$tempfile;

NAME
---------------------------------
/home/oracle/prima/sysaux01.dbf
/home/oracle/prima/system01.dbf
/home/oracle/prima/temp01.dbt
/home/oracle/prima/undotbs01.dbf
/home/oracle/prima/users01.dbf

SQL> select name from v$controlfile;

NAME
--------------------------------
/home/oracle/prima/control01.ctl

SQL> select member from v$logfile;

MEMBER
-------------------------------
/home/oracle/prima/log_g1m1.rdo
/home/oracle/prima/log_g2m1.rdo

This is my standard demo Database. I have installed already Grid Infrastructure (Marketing name for the combination of Oracle Restart & ASM) for a standalone server. Also, I have already prepared 16 fake ‘Raw Devices’, each of 250m in size. Yes, my system is tiny – it’s my notebook. Continuing to start the ASM instance an then create the two recommended diskgroups DATA and FRA. We have the option to use the comfortable GUI asmca (11g New Feature, it can also create  quorum failgroups meanwhile) or go with the command line:

[oracle@uhesse-pc ~]$ cat /u01/app/11.2.0/grid/dbs/init+ASM.ora
#init+ASM.ora

instance_type='asm'
asm_diskstring='/dev/raw/raw*'
remote_login_passwordfile='EXCLUSIVE'
diagnostic_dest='/u01/app/oracle/'
asm_diskgroups=data,fra

After connecting to the ASM instance and startup , which will produce an error message, because data and fra do not exist yet, we create them like this:

SQL> CREATE DISKGROUP data NORMAL REDUNDANCY
 FAILGROUP fg1 DISK
 '/dev/raw/raw1' NAME disk01,
 '/dev/raw/raw2' NAME disk02,
 '/dev/raw/raw3' NAME disk03,
 '/dev/raw/raw4' NAME disk04
 FAILGROUP fg2 DISK
 '/dev/raw/raw5' NAME disk05,
 '/dev/raw/raw6' NAME disk06,
 '/dev/raw/raw7' NAME disk07,
 '/dev/raw/raw8' NAME disk08;

Each file, placed on DATA will be mirrored across the fg1 and fg2 on the stripe layer. All drives in fg1 or in fg2 could fail without losing data.

SQL> CREATE DISKGROUP fra EXTERNAL REDUNDANCY
 DISK
 '/dev/raw/raw9'  NAME disk09,
 '/dev/raw/raw10' NAME disk10,
 '/dev/raw/raw11' NAME disk11,
 '/dev/raw/raw12' NAME disk12,
 '/dev/raw/raw13' NAME disk13,
 '/dev/raw/raw14' NAME disk14,
 '/dev/raw/raw15' NAME disk15,
 '/dev/raw/raw16' NAME disk16;

FRA has no redundancy – should one drive fail we would lose all data on FRA. Next is an Online Backup with Image Copies to the DATA diskgroup. No downtime involved.

RMAN> backup as copy database format '+DATA';

Starting backup at 01-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=19 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
output file name=+DATA/prima/datafile/system.256.736599607 tag=TAG20101201T110002 RECID=1 STAMP=736599624
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
output file name=+DATA/prima/datafile/sysaux.257.736599629 tag=TAG20101201T110002 RECID=2 STAMP=736599634
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
output file name=+DATA/prima/datafile/undotbs1.258.736599641 tag=TAG20101201T110002 RECID=3 STAMP=736599646
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
output file name=+DATA/prima/datafile/users.259.736599655 tag=TAG20101201T110002 RECID=4 STAMP=736599655
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
copying current control file
output file name=+DATA/prima/controlfile/backup.260.736599657 tag=TAG20101201T110002 RECID=5 STAMP=736599662
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 01-DEC-10
channel ORA_DISK_1: finished piece 1 at 01-DEC-10
piece handle=+DATA/prima/backupset/2010_12_01/nnsnf0_tag20101201t110002_0.261.736599665 tag=TAG20101201T110002 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
Finished backup at 01-DEC-10

Now we change some dynamic parameters to point to the new Database resp. Recovery Area:

SQL> alter system set db_recovery_file_dest_size=1800m;

System altered.

SQL> alter system set db_recovery_file_dest='+FRA';

System altered.

SQL> alter system set db_create_file_dest='+DATA';

System altered.

Archivelogs will now be created into FRA. We also put our spfile there. The ‘from memory’ clause is an 11g New Feature.

SQL> create spfile='+DATA/spfileprima.ora' from memory;
 
 File created.

We remove our spfile from $ORACLE_HOME/dbs and replace it with a pointer to the new spfile:

SQL> host cat /u01/app/oracle/product/11.2.0/dbhome_1/dbs/initprima.ora
spfile='+DATA/spfileprima.ora'

Attention: Now we need to shutdown and restart the productive instance and have a short downtime:

SQL> startup force nomount

ORACLE instance started.

Total System Global Area  417546240 bytes
Fixed Size                  2227072 bytes
Variable Size             352322688 bytes
Database Buffers           54525952 bytes
Redo Buffers                8470528 bytes

The controlfiles should also be (mirrored) on ASM. Therefore:

SQL> alter system set control_files='+DATA','+FRA' scope=spfile;

System altered.

Restart to make the modified CONTROL_FILES parameter active:

SQL> startup force nomount
ORACLE instance started.

Total System Global Area  417546240 bytes
Fixed Size                  2227072 bytes
Variable Size             352322688 bytes
Database Buffers           54525952 bytes
Redo Buffers                8470528 bytes

We only need to restore controlfiles to the new location, switch to the new datafiles on DATA and recover the latest changes that where done since the online backup:

[oracle@uhesse-pc ~]$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Wed Dec 1 11:13:33 2010

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (not mounted)

RMAN> restore controlfile from '/home/oracle/prima/control01.ctl';

Starting restore at 01-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=99 device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=+DATA/prima/controlfile/current.263.736600443
output file name=+FRA/prima/controlfile/current.256.736600443
Finished restore at 01-DEC-10

RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

RMAN> switch database to copy;

datafile 1 switched to datafile copy "+DATA/prima/datafile/system.256.736599607"
datafile 2 switched to datafile copy "+DATA/prima/datafile/sysaux.257.736599629"
datafile 3 switched to datafile copy "+DATA/prima/datafile/undotbs1.258.736599641"
datafile 4 switched to datafile copy "+DATA/prima/datafile/users.259.736599655"

RMAN> recover database;

Starting recover at 01-DEC-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=99 device type=DISK

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 01-DEC-10

RMAN> alter database open;

Downtime is over! End users can connect while we do some additional work:

SQL> select name from v$datafile union select name from v$tempfile;

NAME
--------------------------------------------------------------------------------
+DATA/prima/datafile/sysaux.257.736599629
+DATA/prima/datafile/system.256.736599607
+DATA/prima/datafile/undotbs1.258.736599641
+DATA/prima/datafile/users.259.736599655
/home/oracle/prima/temp01.dbt

The tempfile was not touched by RMAN during the backup or switch to copy. We need to do that manually:

SQL> alter database  tempfile '/home/oracle/prima/temp01.dbt' drop;

Database altered.

SQL> alter tablespace temp add tempfile size 50m;

Tablespace altered.

SQL> select name from v$datafile union select name from v$tempfile;

NAME
--------------------------------------------------------------------------------
+DATA/prima/datafile/sysaux.257.736599629
+DATA/prima/datafile/system.256.736599607
+DATA/prima/datafile/undotbs1.258.736599641
+DATA/prima/datafile/users.259.736599655
+DATA/prima/tempfile/temp.264.736600915

Everything nice. But our Online Logs are still on the filesystem:

SQL> select member from v$logfile;

MEMBER
-----------------------------------
/home/oracle/prima/log_g1m1.rdo
/home/oracle/prima/log_g2m1.rdo

This can also be fixed online:

SQL> alter database add logfile size 20m;
 
 Database altered.
 
 SQL> alter database add logfile size 20m;
 
 Database altered.

Gave us two new groups mirrored across DATA and FRA. Now we drop the old groups:

SQL> alter system switch logfile;
 
 System altered.
 
 SQL> alter system switch logfile;
 
 System altered.
 
 SQL> alter system checkpoint;
 
 System altered.
 
 SQL> alter database drop logfile group 1;
 
 Database altered.
 
 SQL> alter database drop logfile group 2;
 
 Database altered.
 
 SQL> select member from v$logfile;
 
 MEMBER
 --------------------------------------------------------------------------------
 +DATA/prima/onlinelog/group_3.264.736612757
 +FRA/prima/onlinelog/group_3.257.736612759
 +DATA/prima/onlinelog/group_4.265.736612765
 +FRA/prima/onlinelog/group_4.258.736612769

That was it. We may now do a backup of the database to FRA:

RMAN> backup database;

Starting backup at 01-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=105 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=+DATA/prima/datafile/system.256.736608851
input datafile file number=00002 name=+DATA/prima/datafile/sysaux.257.736608873
input datafile file number=00003 name=+DATA/prima/datafile/undotbs1.258.736608885
input datafile file number=00004 name=+DATA/prima/datafile/users.259.736608899
channel ORA_DISK_1: starting piece 1 at 01-DEC-10
channel ORA_DISK_1: finished piece 1 at 01-DEC-10
piece handle=+FRA/prima/backupset/2010_12_01/nnndf0_tag20101201t145715_0.259.736613837 tag=TAG20101201T145715 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:16
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 01-DEC-10
channel ORA_DISK_1: finished piece 1 at 01-DEC-10
piece handle=+FRA/prima/backupset/2010_12_01/ncsnf0_tag20101201t145715_0.260.736613853 tag=TAG20101201T145715 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 01-DEC-10

Conclusion: It is relatively simple and only needs a short downtime to migrate your Database to ASM, mainly the process is an RMAN Online Backup with Image Copies.

, ,

7 Comments

Real-Time Query and Automatic Block Media Recovery in 11gR2

Yesterday, I posted about dealing with Block Corruption in general. Special case is Block Corruption in a Data Guard Environment, where we introduced the New Feature Automatic Block Media Recovery (ABMR). Yes, another cool abbrevation from the  the Oracle Realm :-)

To demonstrate this, I have created a Data Guard Configuration as described in my whitepaper 11g Data Guard on the command line, that you may get from the Downloads page. It’s still 11gR1 but can easily be adopted for 11gR2. Will publish an 11gR2 version of it there soon. My scenario looks like this:

SYS@prima > select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
DGMGRL> show configuration;

Configuration - myconf

 Protection Mode: MaxAvailability
 Databases:
 prima - Primary database
 physt - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

DGMGRL> show database physt;

Database - physt

 Role:            PHYSICAL STANDBY
 Intended State:  APPLY-ON
 Transport Lag:   0 seconds
 Apply Lag:       0 seconds
 Real Time Query: ON
 Instance(s):
 physt

Database Status:
SUCCESS

The first 11g New Feature related to Block Corruption in a Data Guard Environment is less striking than ABMR and similar as restoring blocks from Flashback Logs: If we have a Physical Standby Database present, the corrupted blocks can be restored from there very fast. I corrupted the emp table of scott in the same way as in the previous posting. After using RMAN to validate as shown also already, I do the Blockrecovery

RMAN> blockrecover corruption list;

Starting recover at 17-NOV-10
using channel ORA_DISK_1
finished standby search, restored 1 blocks

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 17-NOV-10

I need to emphasize that this restore of blocks from the Physical Standby does not require Real-Time Query there – so we don’t need to purchase Active Data Guard for that. But ABMR does. First ABMR scenario: Block Corruption happens on the Primary Database. I do the same steps as shown in the previous posting to corrupt the block containing the emp table of scott. Now scott connects and selects on the emp table. After a short (about 1 sec) break, the select produces the correct result set! The alert log file of the Primary Database records the following:

Wed Nov 17 08:54:29 2010
Hex dump of (file 4, block 131) in trace file /home/oracle/prima/diag/rdbms/prima/prima/trace/prima_ora_5169.trc
Corrupt block relative dba: 0x01000083 (file 4, block 131)
Bad header found during buffer read
Data in bad block:
 type: 0 format: 2 rdba: 0xffc00000
 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000000
 check value in block header: 0xff66
 block checksum disabled
Reading datafile '/home/oracle/prima/users01.dbf' for corruption at rdba: 0x01000083 (file 4, block 131)
Reread (file 4, block 131) found same corrupt data (no logical check)
Starting background process ABMR
Wed Nov 17 08:54:29 2010
ABMR started with pid=33, OS id=5174
Automatic block media recovery service is active.
Automatic block media recovery requested for (file# 4, block# 131)
Wed Nov 17 08:54:30 2010
Automatic block media recovery successful for (file# 4, block# 131)
Automatic block media recovery successful for (file# 4, block# 131)
WARNING: AutoBMR fixed mismatched on-disk single block ffc00000 with in-mem rdba 1000083.

 

An end user will probably not even notice the Block Corruption on the Primary Database. Second ABMR scenario: Block Corruption happens on the Physical Standby Database. I destroy the block of the emp table in the same way as on the Primary before. Scott now connects on the Physical Standby and does a select on the emp table. Again a short break – no error message but the correct result set is returned! Alert Log of the Physical Standby records:

Wed Nov 17 08:58:10 2010
Hex dump of (file 4, block 131) in trace file /home/oracle/physt/diag/rdbms/physt/physt/trace/physt_ora_5281.trc
Corrupt block relative dba: 0x01000083 (file 4, block 131)
Bad header found during buffer read
Data in bad block:
 type: 0 format: 2 rdba: 0xffc00000
 last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000000
 check value in block header: 0xff66
 block checksum disabled
Reading datafile '/home/oracle/physt/users01.dbf' for corruption at rdba: 0x01000083 (file 4, block 131)
Reread (file 4, block 131) found same corrupt data (no logical check)
Automatic block media recovery requested for (file# 4, block# 131)
Wed Nov 17 08:58:11 2010
Automatic block media recovery requested for (file# 4, block# 131)
Automatic block media recovery successful for (file# 4, block# 131)
Automatic block media recovery successful for (file# 4, block# 131)

Conclusion: We have just seen another benefit of Real-Time Query. Apart from it’s main purpose of making it possible to offload Read-Only (or even Read-Mostly) Applications to the Physical Standby Database, it is now in 11gR2 also leveraged to resolve Block Corruption fast and transparently. We call this 11gR2 New Feature Automatic Block Media Recovery (ABMR)

 

 

, , , , , , ,

3 Comments

Dealing with Oracle Database Block Corruption in 11g

When we look at physical damage of datafiles, the effects are not always so drastic that the file is completely destroyed. Sometimes, only small parts of the file are damaged resp. corrupted. It may very well not even noticed for a while. This posting is designed to show how to a) detect block corruption and b) recover from it. First requirement for a demonstration is to actually get a block corruption. The following little script damages one block of which the table emp from user scott consists:

sqlplus -s system/manager <<stop
set echo off
set heading off
set feed off
set verify off

spool rowid.sh
select min(dbms_rowid.rowid_block_number(rowid)) from scott.emp;
spool off

spool blksize.sh
select value from v\$parameter where name ='db_block_size';
spool off

spool file.sh
select file_name from dba_data_files
 where tablespace_name = (select tablespace_name from dba_tables
 where table_name = 'EMP'
 and owner = 'SCOTT');
spool off
exit
stop

FILE=`cat file.sh`
BLKSIZ=`cat blksize.sh`
START1=`cat rowid.sh`;
START2=`expr ${START1} + 1`

dd if=`echo ${FILE}` of=tab1 bs=`echo ${BLKSIZ}` count=`echo ${START1}`
dd if=`echo ${FILE}` of=tab2 bs=`echo ${BLKSIZ}` count=1
dd if=`echo ${FILE}` of=tab3 bs=`echo ${BLKSIZ}` skip=${START2}

cat tab1 > file
cat tab2 >> file
cat tab3 >> file

cp file ${FILE}

You don’t need that kind of script in production :-)

[oracle@uhesse-pc skripte]$ ./blockrec.sh

 131

8192

/home/oracle/prima/users01.dbf
131+0 records in
131+0 records out
1073152 bytes (1.1 MB) copied, 0.002112 seconds, 508 MB/s
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 2.9e-05 seconds, 282 MB/s
1149+0 records in
1149+0 records out
9412608 bytes (9.4 MB) copied, 0.02027 seconds, 464 MB/s

The block that contained all the rows of the little emp table is now damaged. The demo is done on 11.2.0.2 but the mentioned 11g New Features are already present in 11.1.0.6

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

Please notice that not the whole tablespace is affected:

SQL> select * from scott.emp;

 EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
 7369 SMITH      CLERK           7902 17-DEC-80        800                    20
 7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
 7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
 7566 JONES      MANAGER         7839 02-APR-81       2975                    20
 7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
 7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
 7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
 7788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20
 7839 KING       PRESIDENT            17-NOV-81       5000                    10
 7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
 7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
 7900 JAMES      CLERK           7698 03-DEC-81        950                    30
 7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
 7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

14 rows selected.

This is still possible if the block is read from the buffer cache. Therefore:

SQL> alter system flush buffer_cache;

System altered.

SQL> select * from scott.emp;
select * from scott.emp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 131)
ORA-01110: data file 4: '/home/oracle/prima/users01.dbf'

SQL> select * from scott.dept;

 DEPTNO DNAME          LOC
---------- -------------- -------------
 10 ACCOUNTING     NEW YORK
 20 RESEARCH       DALLAS
 30 SALES          CHICAGO
 40 OPERATIONS     BOSTON

SQL> select table_name,tablespace_name from dba_tables where owner='SCOTT';

TABLE_NAME                     TABLESPACE_NAME
------------------------------ ------------------------------
DEPT                           USERS
EMP                            USERS

Whenever we get these kind of error messages, we need to check all the blocks. Typically, we get them during an RMAN backup, but I like to defer that a little to show an 11g New Feature before. Checking all blocks now:

RMAN> validate check logical database;

Starting validate at 16-NOV-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=107 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1    OK     0              17594        38400           277491
 File Name: /home/oracle/prima/system01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              13854
 Index      0              4487
 Other      0              2465

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
2    OK     0              20381        25600           277631
 File Name: /home/oracle/prima/sysaux01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              869
 Index      0              957
 Other      0              3393

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
3    OK     0              541          22784           277631
 File Name: /home/oracle/prima/undotbs01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              0
 Index      0              0
 Other      0              22243

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
4    FAILED 0              1133         1280            271968
 File Name: /home/oracle/prima/users01.dbf
 Block Type Blocks Failing Blocks Processed
 ---------- -------------- ----------------
 Data       0              10
 Index      0              0
 Other      1              137

validate found one or more corrupt blocks
See trace file /home/oracle/prima/diag/rdbms/prima/prima/trace/prima_ora_18316.trc for details
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
including current control file for validation
including current SPFILE in backup set
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Control File and SPFILE
===============================
File Type    Status Blocks Failing Blocks Examined
------------ ------ -------------- ---------------
SPFILE       OK     0              2
Control File OK     0              612
Finished validate at 16-NOV-10

We have already a couple of 11g New Features here: The syntax has changed from backup validate (since 9i) to just validate (11g) – probably to make clear that this does not perform a backup but a check of corrupted blocks instead. Before 11g, the command did not show the verbose list of checked resp. corrupted blocks like we see above. The addition check logical will also check for logical block corruption, which is not done by default. Checking all the blocks here is more efficient than doing an immediate recovery of the one block mentioned in the error message above. There may be many more not spotted yet. Same is true for an ordinary backup that would interrupt at the first spotted corrupted block as we will see later on. The validate command populated the view v$database_block_corruption, that is now internally read by RMAN in order to repair all the found corrupted blocks. The next 11g New Feature here is: It will take the block out of the Flashback Logs, if present there!

RMAN> blockrecover corruption list;

Starting recover at 16-NOV-10
using channel ORA_DISK_1
searching flashback logs for block images
finished flashback log search, restored 1 blocks

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 16-NOV-10

I was so bold that I did not even take a backup before – to make sure this new feature must be used:

RMAN> list backup;

specification does not match any backup in the repository

I’m going to take a backup now, but before that, I trigger again block corruption. So we will see that RMAN stops at the first noticed corrupted block. No Third-Party-Tool would recognize the block corruption, BTW, so we have another reason to actually use RMAN here. If we say backup check logical database instead of just backup database, RMAN will also check for logical block corruption during the backup.

[oracle@uhesse-pc skripte]$ rman target sys/oracle@prima

Recovery Manager: Release 11.2.0.2.0 - Production on Tue Nov 16 15:22:14 2010

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (DBID=1967518488)

RMAN> backup database;

Starting backup at 16-NOV-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=110 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
channel ORA_DISK_1: starting piece 1 at 16-NOV-10
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/16/2010 15:22:22
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/prima/users01.dbf
continuing other job steps, job failed will not be re-run
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 16-NOV-10
channel ORA_DISK_1: finished piece 1 at 16-NOV-10
piece handle=/home/oracle/flashback/PRIMA/backupset/2010_11_16/o1_mf_ncsnf_TAG20101116T152221_6g54wzkb_.bkp tag=TAG20101116T152221 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/16/2010 15:22:22
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/prima/users01.dbf

Again the same sequence as above validate check logical database & blockrecover corruption list will solve the problem. During the whole process, the users tablespace remains online and usable, except the emp table of scott.

Conclusion: We have a powerful tool with RMAN to spot and repair corrupted blocks by using intact versions of the corrupted blocks from backup (since 9i already) or even from Flashback Logs (since 11g) – which is probably faster – while keeping up the availability of the affected tablespace.

, , , ,

5 Comments

Follow

Get every new post delivered to your Inbox.

Join 635 other followers