Posts tagged with High Availability
The Data Guard Broker: Why it is recommended

When it comes to Data Guard on a recent version, I will always use the Data Guard Broker. Don’t get me wrong: not through Enterprise Manager, but strictly on the command line with DGMGRL. It is for Standby Databases what RMAN is for Backup & Recovery: the recommended way to go. Why? Four reasons at least:
1. The Broker helps during the setup
This demo uses two Linux machines: uhesse1 has the Primary Database prima running. uhesse2 is for the Standby Database physt. The Oracle Net Configuration on uhesse1:
[oracle@uhesse1 ~]$ cat /u01/app/oracle/product/11.2.0/db_1/network/admin/listener.ora
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = uhesse1)(PORT = 1521))
)
)
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = prima_DGMGRL)
(ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1)
(SID_NAME = prima)
)
)
[oracle@uhesse1 ~]$ cat /u01/app/oracle/product/11.2.0/db_1/network/admin/tnsnames.ora
PRIMA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = uhesse1)(PORT = 1521))
)
(CONNECT_DATA =
(SID = prima)
)
)
PHYST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = uhesse2)(PORT = 1521))
)
(CONNECT_DATA =
(SID = physt)
)
)
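The SID_LIST_LISTENER entry above differs between the two hosts only in the database name, so it can be produced from a small template. This is a minimal sketch, assuming that GLOBAL_DBNAME is simply the database name plus _DGMGRL (no db_domain set) and that the SID equals that name unless given separately. static_listener_entry is a made-up helper name, not an Oracle tool:

```shell
#!/bin/sh
# Print a static SID_LIST_LISTENER entry of the kind shown above.
# Arguments (illustrative): $1 = database name, $2 = ORACLE_HOME,
# $3 = SID (optional, defaults to $1).
static_listener_entry() {
  name=$1
  home=$2
  sid=${3:-$1}
  cat <<EOF
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = ${name}_DGMGRL)
      (ORACLE_HOME = ${home})
      (SID_NAME = ${sid})
    )
  )
EOF
}

# Example: the entry for the Standby host
static_listener_entry physt /u01/app/oracle/product/11.2.0/db_1
```

The output is only the static registration part; the LISTENER address section still has to name the right host.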
The GLOBAL_DBNAME with the suffix _DGMGRL is the only special part here: the Broker needs this static listener entry to be able to restart the instance during Role Changes. The Standby configuration is the same, except that its listener.ora has physt instead of prima. The initialization parameters for prima are:
[oracle@uhesse1 ~]$ sqlplus sys/oracle@prima as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jul 8 11:44:05 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> create pfile='/home/oracle/initprima.ora' from spfile;
File created.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
[oracle@uhesse1 ~]$ cat initprima.ora
*.compatible='11.2.0.3'
*.control_files='/home/oracle/prima/control01.ctl'
*.db_block_size=8192
*.db_name='prima'
*.db_recovery_file_dest='/home/oracle/flashback'
*.db_recovery_file_dest_size=5g
*.diagnostic_dest='/home/oracle/prima'
*.remote_login_passwordfile='exclusive'
*.undo_management='auto'
*.undo_tablespace='undotbs1'
I kept that as minimal as possible to give you an easy overview of what is relevant for Data Guard here: defaults almost everywhere. Only a few customizations for the Standby give me:
[oracle@uhesse1 ~]$ cat initphyst.ora
*.compatible='11.2.0.3'
*.control_files='/home/oracle/physt/control01.ctl'
*.db_block_size=8192
*.db_name='prima'
*.db_unique_name=physt
*.db_file_name_convert='prima','physt'
*.log_file_name_convert='prima','physt'
*.db_recovery_file_dest='/home/oracle/flashback'
*.db_recovery_file_dest_size=5g
*.diagnostic_dest='/home/oracle/physt'
*.remote_login_passwordfile='exclusive'
*.undo_management='auto'
*.undo_tablespace='undotbs1'
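The Standby pfile differs from the Primary one in just a few lines: db_unique_name and the two *_name_convert parameters are added, and the paths in control_files and diagnostic_dest point to the Standby directory. As a rough sketch, that edit could be scripted with awk. make_standby_pfile is a hypothetical helper, and it assumes the same directory naming as in this demo (the database name appears as a path component):

```shell
#!/bin/sh
# Derive a Standby pfile from a Primary pfile read on stdin.
# $1 = Primary name (as used in paths), $2 = Standby db_unique_name.
make_standby_pfile() {
  awk -v p="$1" -v s="$2" '
    # After db_name, add the Standby-specific parameters (\047 is a single quote)
    /^\*\.db_name=/ {
      print
      print "*.db_unique_name=" s
      print "*.db_file_name_convert=\047" p "\047,\047" s "\047"
      print "*.log_file_name_convert=\047" p "\047,\047" s "\047"
      next
    }
    # Point control file and diagnostic paths to the Standby directory
    /^\*\.(control_files|diagnostic_dest)=/ { gsub("/" p, "/" s) }
    { print }
  '
}

# Usage (file names as in the demo):
# make_standby_pfile prima physt < initprima.ora > initphyst.ora
```

db_name itself stays 'prima' on purpose, because a Physical Standby shares the db_name of its Primary.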
I copy that pfile and the password file to the Standby host and bring the instance there into NOMOUNT before running the duplicate command:
[oracle@uhesse1 ~]$ scp initphyst.ora uhesse2:/home/oracle
initphyst.ora 100% 431 0.4KB/s 00:00
[oracle@uhesse1 ~]$ scp /u01/app/oracle/product/11.2.0/db_1/dbs/orapwprima uhesse2:/u01/app/oracle/product/11.2.0/db_1/dbs/orapwphyst
orapwprima 100% 1536 1.5KB/s 00:00
[oracle@uhesse1 ~]$ ssh uhesse2 mkdir /home/oracle/physt
[oracle@uhesse1 ~]$ ssh uhesse2 mkdir /home/oracle/flashback
[oracle@uhesse1 ~]$ sqlplus sys/oracle@physt as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jul 8 12:07:30 2013
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to an idle instance.
SQL> create spfile from pfile='/home/oracle/initphyst.ora';
File created.
SQL> startup nomount
ORACLE instance started.
Total System Global Area 238034944 bytes
Fixed Size 2227136 bytes
Variable Size 180356160 bytes
Database Buffers 50331648 bytes
Redo Buffers 5120000 bytes
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
It is most efficient to create the Standby Redo Logs (SRLs) on the Primary at this point, because (from 11g on) RMAN will then duplicate them onto the Standby. SRLs are recommended on both sides and are required on the Standby for LGWR Transport:
SQL> select * from v$log;
GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC
---------- ---------- ---------- ---------- ---------- ---------- ---
STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME
---------------- ------------- --------- ------------ ---------
1 1 9 104857600 512 1 NO
CURRENT 238195 19-JAN-12 2.8147E+14
2 1 8 104857600 512 1 YES
INACTIVE 234561 18-JAN-12 238195 19-JAN-12
SQL> alter database add standby logfile '/home/oracle/prima/srl_g3.rdo' size 100m;
Database altered.
SQL> alter database add standby logfile '/home/oracle/prima/srl_g4.rdo' size 100m;
Database altered.
SQL> alter database add standby logfile '/home/oracle/prima/srl_g5.rdo' size 100m;
Database altered.
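The three statements above follow a simple pattern: the demo has two online log groups of 100m each and gets three SRLs of the same size, which matches the commonly cited rule of one more SRL group than online log groups per thread. A small sketch that prints such statements; gen_srl_ddl and its argument order are illustrative assumptions, and the file naming simply copies the one used above:

```shell
#!/bin/sh
# Print ALTER DATABASE statements for Standby Redo Logs.
# $1 = directory, $2 = first number used in the file names,
# $3 = number of SRLs, $4 = size (same as the online logs).
gen_srl_ddl() {
  dir=$1; first=$2; count=$3; size=$4
  i=0
  while [ "$i" -lt "$count" ]; do
    g=$((first + i))
    echo "alter database add standby logfile '${dir}/srl_g${g}.rdo' size ${size};"
    i=$((i + 1))
  done
}

# Two online log groups in the demo, so three SRLs, numbered 3 to 5
gen_srl_ddl /home/oracle/prima 3 3 100m
```

The printed statements are meant to be reviewed and then run in SQL*Plus on the Primary before the duplicate.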
[oracle@uhesse1 ~]$ rman target sys/oracle@prima auxiliary sys/oracle@physt
Recovery Manager: Release 11.2.0.3.0 - Production on Mon Jul 8 12:08:56 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: PRIMA (DBID=2003897072)
connected to auxiliary database: PRIMA (not mounted)
RMAN> duplicate target database for standby from active database;
Starting Duplicate Db at 08-JUL-13
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=20 device type=DISK
contents of Memory Script:
{
backup as copy reuse
targetfile '/u01/app/oracle/product/11.2.0/db_1/dbs/orapwprima' auxiliary format
'/u01/app/oracle/product/11.2.0/db_1/dbs/orapwphyst' ;
}
executing Memory Script
Starting backup at 08-JUL-13
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=16 device type=DISK
Finished backup at 08-JUL-13
contents of Memory Script:
{
backup as copy current controlfile for standby auxiliary format '/home/oracle/physt/control01.ctl';
}
executing Memory Script
Starting backup at 08-JUL-13
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
copying standby control file
output file name=/u01/app/oracle/product/11.2.0/db_1/dbs/snapcf_prima.f tag=TAG20130708T121004 RECID=3 STAMP=820239005
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
Finished backup at 08-JUL-13
contents of Memory Script:
{
sql clone 'alter database mount standby database';
}
executing Memory Script
sql statement: alter database mount standby database
contents of Memory Script:
{
set newname for tempfile 1 to
"/home/oracle/physt/temp01.dbt";
switch clone tempfile all;
set newname for datafile 1 to
"/home/oracle/physt/system01.dbf";
set newname for datafile 2 to
"/home/oracle/physt/sysaux01.dbf";
set newname for datafile 3 to
"/home/oracle/physt/undotbs01.dbf";
set newname for datafile 4 to
"/home/oracle/physt/users01.dbf";
backup as copy reuse
datafile 1 auxiliary format
"/home/oracle/physt/system01.dbf" datafile
2 auxiliary format
"/home/oracle/physt/sysaux01.dbf" datafile
3 auxiliary format
"/home/oracle/physt/undotbs01.dbf" datafile
4 auxiliary format
"/home/oracle/physt/users01.dbf" ;
sql 'alter system archive log current';
}
executing Memory Script
executing command: SET NEWNAME
renamed tempfile 1 to /home/oracle/physt/temp01.dbt in control file
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
Starting backup at 08-JUL-13
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
output file name=/home/oracle/physt/system01.dbf tag=TAG20130708T121013
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
output file name=/home/oracle/physt/sysaux01.dbf tag=TAG20130708T121013
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
output file name=/home/oracle/physt/undotbs01.dbf tag=TAG20130708T121013
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
output file name=/home/oracle/physt/users01.dbf tag=TAG20130708T121013
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 08-JUL-13
sql statement: alter system archive log current
contents of Memory Script:
{
switch clone datafile all;
}
executing Memory Script
datafile 1 switched to datafile copy
input datafile copy RECID=3 STAMP=820239080 file name=/home/oracle/physt/system01.dbf
datafile 2 switched to datafile copy
input datafile copy RECID=4 STAMP=820239080 file name=/home/oracle/physt/sysaux01.dbf
datafile 3 switched to datafile copy
input datafile copy RECID=5 STAMP=820239080 file name=/home/oracle/physt/undotbs01.dbf
datafile 4 switched to datafile copy
input datafile copy RECID=6 STAMP=820239080 file name=/home/oracle/physt/users01.dbf
Finished Duplicate Db at 08-JUL-13
So far no DGMGRL involved. RMAN gave me a Physical Standby but did not configure Redo Transport from Primary to Standby nor did it start Redo Apply on the Standby. DGMGRL will now do that:
[oracle@uhesse1 ~]$ dgmgrl sys/oracle@prima
DGMGRL for Linux: Version 11.2.0.3.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> help create
Creates a broker configuration
Syntax:
CREATE CONFIGURATION AS
PRIMARY DATABASE IS
CONNECT IDENTIFIER IS ;
DGMGRL> CREATE CONFIGURATION myconf AS PRIMARY DATABASE IS prima CONNECT IDENTIFIER IS prima;
Error:
ORA-16525: the Data Guard broker is not yet available
ORA-06512: at "SYS.DBMS_DRS", line 157
ORA-06512: at line 1
DGMGRL> exit
[oracle@uhesse1 ~]$ oerr ora 16525
16525, 00000, "the Data Guard broker is not yet available"
// *Cause: The Data Guard broker process was either not yet started, was
// initializing, or failed to start.
// *Action: If the broker has not been started, set the DG_BROKER_START
// initialization parameter to true and allow the broker to finish
// initializing before making the request. If the broker failed to
// start, check the Data Guard log for possible errors. Otherwise,
// retry the operation.
Oops, I forgot to set that parameter. That was of course intentional, for didactic reasons 😉
SQL> alter system set dg_broker_start=true;
System altered.
SQL> connect sys/oracle@physt as sysdba
Connected.
SQL> alter system set dg_broker_start=true;
System altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
[oracle@uhesse1 ~]$ dgmgrl sys/oracle@prima "CREATE CONFIGURATION myconf AS PRIMARY DATABASE IS prima CONNECT IDENTIFIER IS prima"
DGMGRL for Linux: Version 11.2.0.3.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
Configuration "myconf" created with primary database "prima"
The built-in help function is so good (I wish RMAN had one too!) that I don’t need the documentation here:
[oracle@uhesse1 ~]$ dgmgrl sys/oracle@prima
DGMGRL for Linux: Version 11.2.0.3.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> help add
Adds a standby database to the broker configuration
Syntax:
ADD DATABASE
[AS CONNECT IDENTIFIER IS ]
[MAINTAINED AS {PHYSICAL|LOGICAL}];
DGMGRL> ADD DATABASE physt AS CONNECT IDENTIFIER IS physt MAINTAINED AS PHYSICAL;
Database "physt" added
DGMGRL> enable configuration;
Enabled.
You should monitor the alert.log of both databases while the configuration is being enabled: the Broker does a lot here; in particular, it configures Redo Transport and Redo Apply.
DGMGRL> show configuration;
Configuration - myconf
Protection Mode: MaxPerformance
Databases:
prima - Primary database
physt - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> show database physt;
Database - physt
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
physt
Database Status:
SUCCESS
That was already it. The hardest part was the Oracle Net Configuration, right? Our heroes RMAN & DGMGRL did the rest.
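The Broker part of the setup above boils down to three DGMGRL commands, so it lends itself to a small script. The sketch below only prints those commands; broker_setup_commands is a hypothetical helper, and it assumes that each connect identifier equals the database name, as in this demo, and that dg_broker_start is already true on both sides:

```shell
#!/bin/sh
# Print the DGMGRL commands used above for the Broker setup.
# $1 = configuration name, $2 = Primary, $3 = Physical Standby.
broker_setup_commands() {
  conf=$1; primary=$2; standby=$3
  cat <<EOF
CREATE CONFIGURATION ${conf} AS PRIMARY DATABASE IS ${primary} CONNECT IDENTIFIER IS ${primary};
ADD DATABASE ${standby} AS CONNECT IDENTIFIER IS ${standby} MAINTAINED AS PHYSICAL;
ENABLE CONFIGURATION;
EOF
}

# The configuration from this demo
broker_setup_commands myconf prima physt
```

Feeding that output to dgmgrl on standard input (for example `broker_setup_commands myconf prima physt | dgmgrl sys/oracle@prima`) should work, but treat that pipeline as untested here.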
2. Role Changes are much easier with the Broker
Without the Broker, Data Guard Role Changes (in versions before 12c) require a complex sequence of steps on both sides, and those steps differ between Logical and Physical Standby. Not so with DGMGRL:
DGMGRL> show configuration;
Configuration - myconf
Protection Mode: MaxPerformance
Databases:
prima - Primary database
physt - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> switchover to physt;
Performing switchover NOW, please wait...
New primary database "physt" is opening...
Operation requires shutdown of instance "prima" on database "prima"
Shutting down instance "prima"...
ORACLE instance shut down.
Operation requires startup of instance "prima" on database "prima"
Starting instance "prima"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "physt"
DGMGRL> show configuration;
Configuration - myconf
Protection Mode: MaxPerformance
Databases:
physt - Primary database
prima - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
Try that with SQL commands and you spend a significant amount of time reading the documentation in order to get these steps right. Furthermore, I don’t need to bother about LOG_ARCHIVE_DEST_2 because the Broker sets it correctly – without manual intervention and without VALID_FOR. The other Role Changes are also one-liners with the Broker:
DGMGRL> failover to physt;
That’s it for the manual failover. And there is also only one command needed for the Snapshot Standby:
DGMGRL> convert database physt to snapshot standby;
Easy, isn’t it?
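Because every Role Change is a single DGMGRL command, a thin wrapper can map an action name to the right one-liner. This is only a sketch: dg_role_command and its action names are made up for illustration, and the conversion back to a Physical Standby is not shown in this post; it is the documented counterpart of the snapshot conversion.

```shell
#!/bin/sh
# Print the DGMGRL one-liner for a Role Change.
# $1 = action (switchover|failover|snapshot|physical), $2 = target database.
dg_role_command() {
  action=$1; db=$2
  case $action in
    switchover) echo "switchover to ${db};" ;;
    failover)   echo "failover to ${db};" ;;
    snapshot)   echo "convert database ${db} to snapshot standby;" ;;
    physical)   echo "convert database ${db} to physical standby;" ;;
    *) echo "unknown action: ${action}" >&2; return 1 ;;
  esac
}

dg_role_command switchover physt
dg_role_command snapshot physt
```

The printed command can be passed to dgmgrl as a quoted argument, the same way the CREATE CONFIGURATION command was passed on the command line earlier.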
3. The Data Guard Broker delivers basic monitoring of the Configuration
The Broker is quite sensitive and spots problems with the Data Guard Configuration fast. It is a good indicator that everything is actually okay when you see this:
DGMGRL> show configuration;
Configuration - myconf
Protection Mode: MaxPerformance
Databases:
prima - Primary database
physt - Snapshot standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
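For scripted monitoring, the status line can be picked out of that output. A minimal sketch, assuming the output layout shown above (the status on the first non-empty line after "Configuration Status:"); dg_config_status and dg_config_ok are hypothetical helper names:

```shell
#!/bin/sh
# Read "show configuration" output on stdin and print the status word.
dg_config_status() {
  awk '
    /Configuration Status:/ { found = 1; next }
    found && NF { print $1; exit }
  '
}

# Exit status 0 only if the configuration reports SUCCESS.
dg_config_ok() {
  [ "$(dg_config_status)" = "SUCCESS" ]
}

# Demo with the output shown above
sample='Configuration - myconf
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS'
printf '%s\n' "$sample" | dg_config_status   # prints SUCCESS

# Possible use against a live system (not run here):
# dgmgrl sys/oracle@prima "show configuration" | dg_config_ok || echo "check Data Guard"
```

Anything other than SUCCESS makes dg_config_ok fail, which is the conservative choice for an alerting script.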
4. Fast-Start Failover requires the Data Guard Broker
Conclusion: In many ways, the Data Guard Broker (or rather DGMGRL) is comparable to RMAN: the tool is recommended because it makes critical tasks easier and less error-prone than a manual approach. Some features even require it, like RMAN for incremental backups and the Broker for Fast-Start Failover. It is the most efficient and reliable way to go. Yes, there have been bugs (some in version 9 for the Broker, and RMAN wasn’t reliable in version 8), but we don’t live in the past. I’ll go with the Broker for Data Guard anytime. The demo shown here should be easy for you to reproduce, so as always: “Don’t believe it, test it!” 🙂
Addendum: I didn’t want to give the impression that you shouldn’t use Enterprise Manager for Data Guard; it is of course perfectly valid to do so. It is just my personal preference to maintain Data Guard on the command line. EM also calls the Broker under the covers, much like it calls RMAN when you manage Backup & Recovery through EM. Check out this great article if you are interested in Data Guard administration with EM 12c Cloud Control: http://www.oracle.com/technetwork/articles/oem/havewala-odg-oem12c-1999410.html
Automatic Block Media Recovery in Action on Video
The little video below shows the 11gR2 New Feature Automatic Block Media Recovery in Action. I have already introduced the feature in this post, but some things are just more impressive when you actually see it happening, don’t you agree?
Drop an ASM Disk that contains a Voting Disk?
That was a question I got during my current Oracle 11gR2 RAC accelerated course in Duesseldorf: What happens if we drop an ASM Disk that contains a Voting Disk? My answer was “I suppose that is not allowed”, but my motto is “Don’t believe it, test it!”, and so that is what I did. That is actually one of the good things about doing a course at Oracle University: here in our course environment, we can just try things out without affecting critical production systems:
[grid@host01 ~]$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 48d3710843274f88bf1eb9b3b5129a7d (ORCL:ASMDISK01) [DATA]
2. ONLINE 354cfa8376364fd2bfaa1921534fe23b (ORCL:ASMDISK02) [DATA]
3. ONLINE 762ad94a98554fdcbf4ba5130ac0384c (ORCL:ASMDISK03) [DATA]
Located 3 voting disk(s).
We are on 11.2.0.1 here. The Voting Disk being part of an ASM Diskgroup was an 11gR2 New Feature that I introduced in this posting already. Now let’s try to drop ASMDISK01:
[grid@host01 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 13 17:18:21 2012
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> select name,group_number from v$asm_diskgroup;
NAME                           GROUP_NUMBER
------------------------------ ------------
DATA                                      1
ACFS                                      2
FRA                                       3
SQL> select name from v$asm_disk where group_number=1;
NAME
------------------------------
ASMDISK01
ASMDISK02
ASMDISK03
ASMDISK04
SQL> alter diskgroup data drop disk 'ASMDISK01';
Diskgroup altered.
It just did it, without any error message! Let’s look further:
SQL> select name from v$asm_disk where group_number=1;
NAME
------------------------------
ASMDISK02
ASMDISK03
ASMDISK04
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
[grid@host01 ~]$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 354cfa8376364fd2bfaa1921534fe23b (ORCL:ASMDISK02) [DATA]
2. ONLINE 762ad94a98554fdcbf4ba5130ac0384c (ORCL:ASMDISK03) [DATA]
3. ONLINE 3f0bf16b6eb64f3cbf440a3c2f0da2fd (ORCL:ASMDISK04) [DATA]
Located 3 voting disk(s).
It just moved the Voting Disk silently to another ASM Disk of that Diskgroup. When I try to drop another ASM Disk from that Diskgroup, the command seems to be silently ignored, because 3 ASM Disks are required here to keep the 3 Voting Disks. Similar behavior with External Redundancy:
[grid@host01 ~]$ asmcmd lsdg
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N      512     4096   1048576  9788      9645     0                9645            0              N             ACFS/
MOUNTED  NORMAL  N      512     4096   1048576  7341      6431     438              2996            0              N             DATA/
MOUNTED  EXTERN  N      512     4096   1048576  4894      4755     0                4755            0              N             FRA/
I will move the Voting Disk to the FRA Diskgroup. It is a bug of 11.2.0.1 that the Voting_files flag is not Y for the DATA Diskgroup here, by the way.
[grid@host01 ~]$ sudo crsctl replace votedisk +FRA
Successful addition of voting disk 4d586fbecf664f8abf01d272a354fa67.
Successful deletion of voting disk 354cfa8376364fd2bfaa1921534fe23b.
Successful deletion of voting disk 762ad94a98554fdcbf4ba5130ac0384c.
Successful deletion of voting disk 3f0bf16b6eb64f3cbf440a3c2f0da2fd.
Successfully replaced voting disk group with +FRA.
CRS-4266: Voting file(s) successfully replaced
[grid@host01 ~]$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 4d586fbecf664f8abf01d272a354fa67 (ORCL:ASMDISK10) [FRA]
Located 1 voting disk(s).
[grid@host01 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 13 17:36:06 2012
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fra drop disk 'ASMDISK10';
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
[grid@host01 ~]$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 0b051cf6e6a14ff1bf31ef7bc66098e0 (ORCL:ASMDISK11) [FRA]
Located 1 voting disk(s).
Not sure whether I would dare to do all that in a production system, though 🙂
Conclusion: We can drop ASM Disks that contain Voting Disks as long as enough Disks are left in the Diskgroup to retain the same number of Voting Disks (each inside a separate Failure Group) afterwards. Apparently, that is. But: “Don’t believe it, test it!”
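To repeat the test, it helps to compare the voting disk locations before and after the drop. The sketch below extracts the disk names from crsctl query css votedisk output and encodes the rule of thumb from the conclusion. Both helper names are made up, and the disk-count check is only the observed behavior on 11.2.0.1, not a documented guarantee:

```shell
#!/bin/sh
# Read "crsctl query css votedisk" output on stdin and print the
# names in parentheses, one per voting disk line.
votedisk_names() {
  awk '/^[ \t]*[0-9]+\.[ \t]/ && match($0, /\([^)]*\)/) {
    print substr($0, RSTART + 1, RLENGTH - 2)
  }'
}

# Exit status 0 if dropping one disk still leaves at least as many
# disks in the Diskgroup as there are Voting Disks.
# $1 = disks currently in the Diskgroup, $2 = number of Voting Disks.
enough_disks_after_drop() {
  disks=$1; voting=$2
  [ $((disks - 1)) -ge "$voting" ]
}

# Demo with the first listing from this post
printf '%s\n' \
  '## STATE File Universal Id File Name Disk group' \
  '1. ONLINE 48d3710843274f88bf1eb9b3b5129a7d (ORCL:ASMDISK01) [DATA]' \
  '2. ONLINE 354cfa8376364fd2bfaa1921534fe23b (ORCL:ASMDISK02) [DATA]' \
  'Located 3 voting disk(s).' | votedisk_names
```

With the numbers from the DATA Diskgroup, `enough_disks_after_drop 4 3` succeeds and `enough_disks_after_drop 3 3` fails, which matches the first drop working and the second one apparently being ignored.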
