How to reinstate the old Primary as a Standby after Failover in #Oracle

You have done a failover to your Standby database so it becomes the new Primary. It may be possible to convert the old Primary into a Standby database now instead of having to do a time consuming duplicate again. The old Primary must have been running in flashback mode before the failover. The playground:

DGMGRL> show configuration;

Configuration - myconf

  Protection Mode: MaxAvailability
  Members:
  prima - Primary database
    physt - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 36 seconds ago)

SYS@prima > host echo kaputt > /home/oracle/prima/users01.dbf

SYS@prima > select * from scott.emp;
select * from scott.emp
                    *
ERROR at line 1:
ORA-01115: IO error reading block from file  (block # )
ORA-01110: data file 4: '/home/oracle/prima/users01.dbf'
ORA-27072: File I/O error
Additional information: 4
Additional information: 130

I did just cause a damage on my Primary database in order to have a reason to failover. I took backups on the Primary previously and could do restore and recovery instead of the failover. But failover ist way faster. After all, that is what we have the Standby for🙂

[oracle@uhesse scripts]$ dgmgrl sys/oracle@physt
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected as SYSDBA.
DGMGRL> failover to physt;
Performing failover NOW, please wait...
Failover succeeded, new primary is "physt"

Took less than 10 seconds. Now I want a Standby for my new Primary. If the old Primary has had a power outage only, the Reinstate could be done immediately. But here, one datafile is damaged. Reinstate needs intact datafiles. Therefore first a restore, then reinstate:

[oracle@uhesse scripts]$ rman target sys/oracle@prima

Recovery Manager: Release 12.1.0.2.0 - Production on Thu Sep 15 16:58:24 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

connected to target database (not started)

RMAN> startup mount

Oracle instance started
database mounted

Total System Global Area     838860800 bytes

Fixed Size                     2929936 bytes
Variable Size                490736368 bytes
Database Buffers             339738624 bytes
Redo Buffers                   5455872 bytes

RMAN> restore datafile 4;

Starting restore at 2016-09-15 16:59:04
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=176 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to /home/oracle/prima/users01.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/fra/PRIMA/backupset/2016_09_15/o1_mf_nnndf_TAG20160915T163533_cxomgq0q_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/fra/PRIMA/backupset/2016_09_15/o1_mf_nnndf_TAG20160915T163533_cxomgq0q_.bkp tag=TAG20160915T163533
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 2016-09-15 16:59:05

RMAN> select flashback_on from v$database;

FLASHBACK_ON
------------------
YES

RMAN> exit


Recovery Manager complete.
[oracle@uhesse scripts]$ dgmgrl sys/oracle@physt
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected as SYSDBA.
DGMGRL> show configuration;

Configuration - myconf

  Protection Mode: MaxAvailability
  Members:
  physt - Primary database
    Warning: ORA-16629: database reports a different protection level from the protection mode

    prima - Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated

Fast-Start Failover: DISABLED

Configuration Status:
WARNING   (status updated 17 seconds ago)

DGMGRL> reinstate database prima;
Reinstating database "prima", please wait...
Reinstatement of database "prima" succeeded
DGMGRL> show database prima;

Database - prima

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          0 seconds (computed 1 second ago)
  Average Apply Rate: 44.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    prima

Database Status:
SUCCESS

Optionally, I may switch back now:

DGMGRL> switchover to prima;
Performing switchover NOW, please wait...
Operation requires a connection to instance "prima" on database "prima"
Connecting to instance "prima"...
Connected as SYSDBA.
New primary database "prima" is opening...
Operation requires start up of instance "physt" on database "physt"
Starting instance "physt"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "prima"
DGMGRL> show configuration;

Configuration - myconf

  Protection Mode: MaxAvailability
  Members:
  prima - Primary database
    physt - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 40 seconds ago)

This is one reason why we recommend that you turn on flashback for the databases that are part of a Data Guard configuration. As always: Don’t believe it, test it🙂

3 Comments

How to use DURATION with RMAN backups in #Oracle

The DURATION clause enables you to reduce the performance impact of RMAN backups respectively it sets a certain time limit for the backup.

Let’s suppose your RMAN backup takes one hour now and you take it online while end users work with the database. This reduces the performance impact of the online backup by half approximately:

RMAN> backup duration 02:00 minimize load database;

The first two digits are hours, the second two digits are minutes. Above command tells RMAN to spend 02 hours and 00 minutes with the backup that takes normally one hour. That way, RMAN gets throttled down, causing roughly half the load on the system than otherwise.

In another scenario, let’s suppose that you want to limit the backup run to take only 30 minutes every night, because you want to run a batch job afterwards that must not be impacted by the backup. Without the limit, backup takes one hour. You are fine with backing up only half of your datafiles every night. This command does the trick:

RMAN> backup duration 00:30 partial minimize time database 
      not backed up since time='sysdate-1' filesperset 1;

RMAN will backup as many datafiles as possible within 30 minutes, generating one backuset per datafile. It stops after 30 minutes. Tomorrow, the remaining datafiles are being backed up. Drawback is that it takes longer in case to recover the datafiles that have two days old backups.

This is one little topic from the Oracle Database 12c Backup and Recovery Workshop that I deliver this week in Prague. Great city, by the way🙂

,

3 Comments

How to resolve media failures with the Recovery Advisor in #Oracle

Error messages are showing up because files have been damaged? Database Recovery Advisor to the rescue!

[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Sep 7 11:52:24 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/prima/system01.dbf
/u01/app/oracle/oradata/prima/sysaux01.dbf
/u01/app/oracle/oradata/prima/undotbs01.dbf
/u01/app/oracle/oradata/prima/users01.dbf

SQL> host echo kaputt > /u01/app/oracle/oradata/prima/users01.dbf

SQL> select count(*) from adam.sales;
select count(*) from adam.sales
                          *
ERROR at line 1:
ORA-01115: IO error reading block from file  (block # )
ORA-01115: IO error reading block from file 4 (block # 147)
ORA-27072: File I/O error
Additional information: 4
Additional information: 147

Don’t do that at home, but the above prepared the playground to show how to resolve media errors easily. I’m using 12.1.0.2 here, but the shown functionality is available since 11.1 already.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
[oracle@uhesse ~]$ rman target /

Recovery Manager: Release 12.1.0.2.0 - Production on Wed Sep 7 12:02:47 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (DBID=2131944058)

RMAN> list failure;

using target database control file instead of recovery catalog
Database Role: PRIMARY

no failures found that match specification

If the failure is not already listed, you should run the following check (works in 12c from the RMAN shell as shown, in 11g you need to run it from SQL*Plus):

RMAN> begin
 dbms_hm.run_check ('DB Structure Integrity Check','mycheck');
end;
/

Statement processed

RMAN> list failure;

Database Role: PRIMARY

List of Database Failures
=========================

Failure ID Priority Status    Time Detected       Summary
---------- -------- --------- ------------------- -------
42         HIGH     OPEN      2016-09-07 12:05:04 One or more non-system datafiles are corrupt

The failure is now listed. Next steps:

RMAN> advise failure;

Database Role: PRIMARY

List of Database Failures
=========================

Failure ID Priority Status    Time Detected       Summary
---------- -------- --------- ------------------- -------
42         HIGH     OPEN      2016-09-07 12:05:04 One or more non-system datafiles are corrupt

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
no manual actions available

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Restore and recover datafile 4  
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/app/oracle/diag/rdbms/prima/prima/hm/reco_1884775430.hm

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/prima/prima/hm/reco_1884775430.hm

contents of repair script:
   # restore and recover datafile
   sql 'alter database datafile 4 offline';
   restore ( datafile 4 );
   recover datafile 4;
   sql 'alter database datafile 4 online';

So that’s good to know: ‘The repair includes complete media recovery with no data loss’🙂 Let’s do that:

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/prima/prima/hm/reco_1884775430.hm

contents of repair script:
   # restore and recover datafile
   sql 'alter database datafile 4 offline';
   restore ( datafile 4 );
   recover datafile 4;
   sql 'alter database datafile 4 online';

Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script

sql statement: alter database datafile 4 offline

Starting restore at 2016-09-07 12:08:43
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/prima/users01.dbf
channel ORA_DISK_1: reading from backup piece /u02/reco/PRIMA/backupset/2016_04_26/o1_mf_nnndf_TAG20160426T105118_ckyc07n5_.bkp
channel ORA_DISK_1: piece handle=/u02/reco/PRIMA/backupset/2016_04_26/o1_mf_nnndf_TAG20160426T105118_ckyc07n5_.bkp tag=TAG20160426T105118
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:15
Finished restore at 2016-09-07 12:09:58

Starting recover at 2016-09-07 12:09:58
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 37 is already on disk as file /u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_37_ckyc311q_.arc
archived log for thread 1 with sequence 38 is already on disk as file /u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_38_ckyc373o_.arc
archived log for thread 1 with sequence 39 is already on disk as file /u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_39_ckyc3co9_.arc
archived log file name=/u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_37_ckyc311q_.arc thread=1 sequence=37
archived log file name=/u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_38_ckyc373o_.arc thread=1 sequence=38
media recovery complete, elapsed time: 00:00:01
Finished recover at 2016-09-07 12:10:01

sql statement: alter database datafile 4 online
repair failure complete

The Recovery Advisor did not only resolve the problem but it did also choose the solution with the least impact on availability: Restore and recovery has been done with the instance in status OPEN. Is the error now gone?

RMAN> select count(*) from adam.sales;

  COUNT(*)
----------
  10000000

Problem solved! Think of the Recovery Advisor as your personal first level support: There are problems that are too difficult for this tool to resolve, but it may be able to do a pretty good job as in this example. It doesn’t support RAC and cannot resolve issues at a standby database in a Data Guard  environment. Many more details are here, but essentially, you just need to memorize these three commands: LIST FAILURE, ADVISE FAILURE, REPAIR FAILURE. I think that’s not too much to ask, even with a very high stress level🙂

2 Comments

%d bloggers like this: