Beiträge getaggt mit Oracle Restart

See & modify the attributes of Oracle Restart resources with crsctl

I am presently teaching 11gR2 New Features for DBAs on a Private Event, where one attendee asked how we could retrieve and possibly modify the restart attempts, ohasd is trying for a resource, governed by Oracle Restart. My first guess was that this should be possible similar as with the Grid Infrastructure clusterware utility crsctl, but the commands are not listed in the Online Documentation about Oracle Restart. So I investigated a little and that is what I found out:

$ crsctl status res ora.orcl.db -f
NAME=ora.orcl.db
TYPE=ora.database.type
STATE=OFFLINE
TARGET=ONLINE
ACL=owner:oracle:--x,pgrp:dba:--x,other::r--,group:oinstall:r-x,user:oracle:rwx
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CARDINALITY_ID=0
CHECK_INTERVAL=1
CHECK_TIMEOUT=600
CLUSTER_DATABASE=false
CREATION_SEED=16
CURRENT_RCOUNT=0
DB_UNIQUE_NAME=orcl
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=database) PROPERTY(DB_UNIQUE_NAME= CONCAT(PARSE(%NAME%, ., 2), %USR_ORA_DOMAIN%, .)) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%)
DEGREE=1
DESCRIPTION=Oracle Database resource
ENABLED=1
FAILOVER_DELAY=0
FAILURE_COUNT=0
FAILURE_HISTORY=
FAILURE_INTERVAL=60
FAILURE_THRESHOLD=1
GEN_AUDIT_FILE_DEST=/u01/app/oracle/admin/orcl/adump
GEN_USR_ORA_INST_NAME=orcl
HOSTING_MEMBERS=
ID=ora.orcl.db
INCARNATION=0
INSTANCE_FAILOVER=0
LAST_FAULT=0
LAST_RESTART=0
LAST_SERVER=
LOAD=1
LOGGING_LEVEL=1
MANAGEMENT_POLICY=AUTOMATIC
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
ORACLE_HOME=/u01/app/oracle/acfsmounts/acfs_db1
PLACEMENT=balanced
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=2
ROLE=PRIMARY
SCRIPT_TIMEOUT=60
SERVER_POOLS=
SPFILE=+DATA/orcl/spfileorcl.ora
START_DEPENDENCIES=hard(ora.DATA.dg,ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons,uniform:ora.eons) pullup(ora.DATA.dg,ora.FRA.dg)
START_TIMEOUT=600
STATE_CHANGE_TEMPLATE=
STATE_CHANGE_VERS=0
STATE_DETAILS=
STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DATA.dg,shutdown:ora.FRA.dg)
STOP_TIMEOUT=600
UPTIME_THRESHOLD=1h
USR_ORA_DB_NAME=orcl
USR_ORA_DOMAIN=
USR_ORA_ENV=
USR_ORA_FLAGS=
USR_ORA_INST_NAME=orcl
USR_ORA_OPEN_MODE=open
USR_ORA_OPI=false
USR_ORA_STOP_MODE=immediate
VERSION=11.2.0.1.0

We see amongst others the highlighted Restart Attempts and the Uptime Threshold of the Database resource orcl with their default values. We can also modify them as follows:

$ crsctl modify res ora.orcl.db -attr "RESTART_ATTEMPTS=1"

Now if within 1 hour, the Database Instance fails more than 1 times, ohasd will no longer try to restart it. This can be watched in the ohasd logfile. I open a second terminal to monitor the ohasd logfile and kill the smon of the orcl instance from the first terminal. After the second kill -9 on the smon process, ohasd refuses to restart the instance, with this entry:

$ cat /u01/app/oracle/product/11.2.0/grid/log/edd2r6p0/ohasd/ohasd.log | grep attempt
2011-11-08 16:05:26.944: [   CRSPE][2781559696] No Local Restart of [ora.orcl.db 1 1], restart attempt exhausted

The Uptime Threshold will start anew after the command

$ srvctl start database -d orcl



,

13 Kommentare

Data Guard & Oracle Restart in 11gR2

Just playing around with Data Guard & Oracle Restart in the current release 11gR2 – it’s really cool!  Look at Joel Goodmans introduction into Oracle Restart, if you never heared of it before. At first, I installed Oracle Database 11gR2 Grid Infrastructure for Standalone Server to get Oracle Restart, then Oracle Database 11gR2 Software only. I did not use ASM in this setup – everything is running inside a VM on my notebook with only one single hard drive really, so ASM is of not much help here 🙂

Listener and Database must be registered manually if NETCA and DBCA have not been used to create them. I did that on the command line, therefore the following steps.

First registering the listener:

$ srvctl add listener
-o /u01/app/oracle/product/11.2.0/dbhome_1
$ srvctl add database -d prima
-o /u01/app/oracle/product/11.2.0/dbhome_1
-r PRIMARY

Above added the Primary to the Oracle Local Registry (OLR)

$ srvctl add database -d physt
-o /u01/app/oracle/product/11.2.0/dbhome_1
-r PHYSICAL_STANDBY -s MOUNT

Above added the Physical Standby to the OLR. -s is the startmode which defaults to OPEN. My Standby shall get mounted automatically. I would choose OPEN if I am inclined to use the 11g New Feature Real-Time Query – supposed I had purchased Active Data Guard.

I will now add an Application Service to my Configuration – with the same purpose demonstrated already in this posting, but without having to use DBMS_SERVICE. Instead, Oracle Restart will take care for the start of the services in my Data Guard environment according to the assigned Database Role. The creation of a Database Trigger for that purpose is no longer necessary. Instead, the usage of Oracle Restart is recommended:

$ srvctl add service -d prima
-s prod -l PRIMARY
-e SELECT
-m BASIC

-l is the Database Role in which this service prod is supposed to get started, -e is the Failover Type and -m the Failover Method. Next I configure Failover Retries (-z) and Failover Delays (-w):

$ srvctl modify service -d prima
-s prod
-z 180
-w 1

The same has to be done for the Standby Database also – the -l PRIMARY parameter takes care that the service will not be offered unless this database becomes Primary:

$ srvctl add service -d physt
-s prod
-l PRIMARY
-e SELECT
-m BASIC

$ srvctl modify service -d physt
-s prod
-z 180
-w 1

The nect command is going to start the service actually. We need to do this only for the first time manually – again without needing DBMS_SERVICE for it:

$ srvctl start service -d prima -s prod

Now let’s investigate the properties of our new service:

$ srvctl config service -d prima -s prod
Service name: prod
Service is enabled
Cardinality: SINGLETON
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 180
TAF failover delay: 1
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE

I have prepared a connect descriptor prod in the same way as in this posting shown already. Using it to connect as system user. Now I kill the SMON process of my Primary Instance (Don’t do that at home). Then a select on v$instance in my user system session waits for the reestablishing of the connection (maximum 180 seconds with my settings above) – after less than 5 seconds, the ohasd (Oracle High Availabilty Services Daemon) recognizes this and restarts the Primary Instance. After less than 20 seconds, I see my select on v$instance come through. This was a Transparent Application Failover without another Instance or Database in the game 🙂 Really cool, isn’t it?

Now let’s do a switchover and see how Oracle Restart will take care to start the prod service on the new Primary and how the connection again is reestablished there:

$ dgmgrl sys/oracle@prima
DGMGRL for Linux: Version 11.2.0.1.0 - Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> show configuration
Configuration - myconf
 Protection Mode: MaxAvailability
 Databases:
 prima - Primary database
 physt - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> switchover to physt
Performing switchover NOW, please wait...
New primary database "physt" is opening...
Operation requires shutdown of instance "prima" on database "prima"
Shutting down instance "prima"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "prima" on database "prima"
Starting instance "prima"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "physt"

The switchover was done as usual. Now what about the service and the session?

$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 06-SEP-2010 08:45:25

Copyright (c) 1991, 2009, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.1.0 - Production
Start Date                06-SEP-2010 07:40:29
Uptime                    0 days 1 hr. 4 min. 57 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/11.2.0/dbhome_1/
                           network/admin/listener.ora
Listener Log File         /u01/app/oracle/product/11.2.0/dbhome_1/log/diag/
                          tnslsnr/uhesse/listener/alert/log.xml
Listening Endpoints Summary...
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=uhesse)(PORT=1521)))
Services Summary...
Service "logst_DGMGRL" has 1 instance(s).
 Instance "logst", status UNKNOWN, has 1 handler(s) for this service...
Service "physt" has 1 instance(s).
 Instance "physt", status READY, has 1 handler(s) for this service...
Service "physt_DGB" has 1 instance(s).
 Instance "physt", status READY, has 1 handler(s) for this service...
Service "physt_DGMGRL" has 1 instance(s).
 Instance "physt", status UNKNOWN, has 1 handler(s) for this service...
Service "prima" has 1 instance(s).
 Instance "prima", status READY, has 1 handler(s) for this service...
Service "prima_DGB" has 1 instance(s).
 Instance "prima", status READY, has 1 handler(s) for this service...
Service "prima_DGMGRL" has 1 instance(s).
 Instance "prima", status UNKNOWN, has 1 handler(s) for this service...
Service "prod" has 1 instance(s).
 Instance "physt", status READY, has 1 handler(s) for this service...
The command completed successfully

The service was started on the new Primary by Oracle Restart automatically. Now the session:

SYSTEM@prod > select instance_name from v$instance;
INSTANCE_NAME
----------------
physt
SYSTEM@prod > select username,FAILED_OVER,FAILOVER_METHOD,FAILOVER_TYPE 
              from v$session where username='SYSTEM';
USERNAME                       FAI FAILOVER_M FAILOVER_TYPE
------------------------------ --- ---------- -------------
SYSTEM                         YES BASIC      SELECT

The session was again „unsinkable“ and survived the switchover by failing over to the new Primary. When I reboot my server, ohasd together with DMON now take care that Listener, Primary, Standby and Service all get started in the correct order.

 

, , ,

50 Kommentare