Posts Tagged ASM

Drop an ASM Disk that contains a Voting Disk?

That was a question I got during the Oracle 11gR2 RAC accelerated course I am currently teaching in Duesseldorf: What happens if we drop an ASM Disk that contains a Voting Disk? My answer was: “I suppose that is not allowed”, but my motto is “Don’t believe it, test it!”, and that is what I did. That is actually one of the good things about doing a course at Oracle University: In our course environment, we can simply try things out without affecting critical production systems:

[grid@host01 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   48d3710843274f88bf1eb9b3b5129a7d (ORCL:ASMDISK01) [DATA]
 2. ONLINE   354cfa8376364fd2bfaa1921534fe23b (ORCL:ASMDISK02) [DATA]
 3. ONLINE   762ad94a98554fdcbf4ba5130ac0384c (ORCL:ASMDISK03) [DATA]
Located 3 voting disk(s).
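
By the way, the same information is also visible from the ASM side. Here is a hedged sketch of such a check (V$ASM_DISK carries a VOTING_FILE column in 11.2; output omitted):

SQL> select dg.name as dg_name, d.name as disk_name, d.failgroup, d.voting_file
     from v$asm_disk d join v$asm_diskgroup dg
     on d.group_number = dg.group_number
     where d.voting_file = 'Y';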

We are on 11.2.0.1 here. The Voting Disk being part of an ASM Diskgroup is an 11gR2 New Feature that I already introduced in an earlier posting. Now let’s try to drop ASMDISK01:

[grid@host01 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 13 17:18:21 2012

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> select name,group_number from v$asm_diskgroup;

NAME                           GROUP_NUMBER
------------------------------ ------------
DATA                                      1
ACFS                                      2
FRA                                       3

SQL> select name from v$asm_disk where group_number=1;

NAME
------------------------------
ASMDISK01
ASMDISK02
ASMDISK03
ASMDISK04

SQL> alter diskgroup data drop disk 'ASMDISK01';

Diskgroup altered.

It just did it without any error message! Let’s look further:

SQL> select name from v$asm_disk where group_number=1;

NAME
------------------------------
ASMDISK02
ASMDISK03
ASMDISK04

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
[grid@host01 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   354cfa8376364fd2bfaa1921534fe23b (ORCL:ASMDISK02) [DATA]
 2. ONLINE   762ad94a98554fdcbf4ba5130ac0384c (ORCL:ASMDISK03) [DATA]
 3. ONLINE   3f0bf16b6eb64f3cbf440a3c2f0da2fd (ORCL:ASMDISK04) [DATA]
Located 3 voting disk(s).

It just silently moved the Voting Disk to another ASM Disk of that Diskgroup. When I try to drop another ASM Disk from the same Diskgroup, the command seems to be silently ignored, because 3 ASM Disks are required here to keep the 3 Voting Disks.
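
For illustration, here is a hedged sketch of such an attempt together with a follow-up check (ASMDISK02 is just one of the remaining Disks; the output is omitted, the point being that the Disk simply stays in the Diskgroup):

SQL> alter diskgroup data drop disk 'ASMDISK02';

SQL> select name, failgroup from v$asm_disk where group_number = 1;

The behavior is similar with External Redundancy: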

[grid@host01 ~]$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576      9788     9645                0            9645              0             N  ACFS/
MOUNTED  NORMAL  N         512   4096  1048576      7341     6431              438            2996              0             N  DATA/
MOUNTED  EXTERN  N         512   4096  1048576      4894     4755                0            4755              0             N  FRA/

I will now move the Voting Disk to the FRA Diskgroup. By the way, it is a bug in 11.2.0.1 that the Voting_files column does not show Y for the DATA Diskgroup here.

[grid@host01 ~]$ sudo crsctl replace votedisk +FRA
Successful addition of voting disk 4d586fbecf664f8abf01d272a354fa67.
Successful deletion of voting disk 354cfa8376364fd2bfaa1921534fe23b.
Successful deletion of voting disk 762ad94a98554fdcbf4ba5130ac0384c.
Successful deletion of voting disk 3f0bf16b6eb64f3cbf440a3c2f0da2fd.
Successfully replaced voting disk group with +FRA.
CRS-4266: Voting file(s) successfully replaced
[grid@host01 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   4d586fbecf664f8abf01d272a354fa67 (ORCL:ASMDISK10) [FRA]
Located 1 voting disk(s).
[grid@host01 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Wed Jun 13 17:36:06 2012

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter diskgroup fra drop disk 'ASMDISK10';

Diskgroup altered.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
[grid@host01 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   0b051cf6e6a14ff1bf31ef7bc66098e0 (ORCL:ASMDISK11) [FRA]
Located 1 voting disk(s).

I am not sure whether I would dare to do all that on a production system, though :-)

Conclusion: We can drop ASM Disks that contain Voting Disks as long as enough Disks are left in the Diskgroup to retain the same number of Voting Disks (each inside a separate Failure Group) afterwards. At least that is what it looks like, but remember: “Don’t believe it, test it!”


No DISK_REPAIR_TIME on Exadata Cells

Starting with version 11.2.1.3.1, Exadata Cells use Pro-Active Disk Quarantine, which overrides any setting of DISK_REPAIR_TIME. This and some other topics related to ASM mirroring on Exadata Storage Servers are explained in a recent posting by my dear colleague Joel Goodman. Even if you are familiar with ASM in non-Exadata environments, you may not have used ASM redundancy yet and may therefore benefit from his explanations about it.

Addendum: Maybe the headline is a little misleading, as I just realized. DISK_REPAIR_TIME set on an ASM Diskgroup that is built upon Exadata Storage Cells is still in use and valid. It just does not refer to the Disk level (Griddisk on Exadata) but to the Cell level.

In other words: If a physical disk inside a Cell gets damaged, the Griddisks built upon this damaged disk are dropped from the ASM Diskgroups immediately, without waiting for DISK_REPAIR_TIME, due to Pro-Active Disk Quarantine. But if a whole Cell goes offline (because of a reboot of that Storage Server, for example), the dependent ASM Disks are not dropped from the respective Diskgroups for the duration of DISK_REPAIR_TIME.
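
For reference, here is a hedged sketch of how DISK_REPAIR_TIME is typically inspected and changed on an ASM Diskgroup (the attribute exists for Diskgroups with COMPATIBLE.ASM of 11.1 or higher; the Diskgroup name DATA and the value 8.5h are just examples):

SQL> select group_number, name, value
     from v$asm_attribute
     where name = 'disk_repair_time';

SQL> alter diskgroup data set attribute 'disk_repair_time' = '8.5h';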


Database Migration to ASM with short downtime

I have made up my mind to start a series of postings related to ASM. We have recommended using ASM since 10g and recently introduced many new features for it in 11g. On the other hand, I often encounter customers, even in advanced courses, who have not had much contact with ASM yet. So there may be quite a demand for some explanations. On my Downloads page, there is a brief paper in German called ‘ASM Vortrag’ that summarizes some of the main benefits of 10g ASM. Or look at OTN for more details.

We start our first scenario with a small Database running on a conventional filesystem, set up an ASM instance and two ASM Diskgroups, and then migrate to ASM with only the downtime it takes to shut down and restart the instance; in other words, the downtime will be in the range of only minutes on a production system. The example is done on a small Linux server using Oracle Database Enterprise Edition 11.2.0.2, but it should work very similarly on other platforms and with the 10g version as well.

SQL> select * from v$version;

BANNER
----------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select log_mode from v$database;

LOG_MODE
------------
ARCHIVELOG

SQL> select name from v$datafile union select name from v$tempfile;

NAME
---------------------------------
/home/oracle/prima/sysaux01.dbf
/home/oracle/prima/system01.dbf
/home/oracle/prima/temp01.dbt
/home/oracle/prima/undotbs01.dbf
/home/oracle/prima/users01.dbf

SQL> select name from v$controlfile;

NAME
--------------------------------
/home/oracle/prima/control01.ctl

SQL> select member from v$logfile;

MEMBER
-------------------------------
/home/oracle/prima/log_g1m1.rdo
/home/oracle/prima/log_g2m1.rdo

This is my standard demo Database. I have already installed Grid Infrastructure (the marketing name for the combination of Oracle Restart and ASM) for a standalone server. I have also prepared 16 fake ‘Raw Devices’, each 250 MB in size (a hedged sketch of one way to prepare such devices follows after the parameter file below). Yes, my system is tiny: it is my notebook. We continue by starting the ASM instance and then creating the two recommended Diskgroups DATA and FRA. We have the option to use the comfortable GUI asmca (an 11g New Feature that meanwhile can also create quorum failgroups) or go with the command line:

[oracle@uhesse-pc ~]$ cat /u01/app/11.2.0/grid/dbs/init+ASM.ora
#init+ASM.ora

instance_type='asm'
asm_diskstring='/dev/raw/raw*'
remote_login_passwordfile='EXCLUSIVE'
diagnostic_dest='/u01/app/oracle/'
asm_diskgroups=data,fra
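
As mentioned above, here is a hedged sketch of one way such fake ‘Raw Devices’ could be prepared on Linux (run as root; file names, loop device numbers and the ownership are assumptions of this sketch, and the steps would be repeated for each of the 16 devices):

[root@uhesse-pc ~]# mkdir -p /u01/asmdisks
[root@uhesse-pc ~]# dd if=/dev/zero of=/u01/asmdisks/asmdisk01 bs=1M count=250  # 250 MB backing file
[root@uhesse-pc ~]# losetup /dev/loop1 /u01/asmdisks/asmdisk01                  # attach it to a loop device
[root@uhesse-pc ~]# raw /dev/raw/raw1 /dev/loop1                                # bind a raw device to it
[root@uhesse-pc ~]# chown oracle:dba /dev/raw/raw1                              # make it accessible to the Oracle software owner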

After connecting to the ASM instance, we start it up, which will produce an error message because DATA and FRA do not exist yet. Then we create them like this:

SQL> CREATE DISKGROUP data NORMAL REDUNDANCY
 FAILGROUP fg1 DISK
 '/dev/raw/raw1' NAME disk01,
 '/dev/raw/raw2' NAME disk02,
 '/dev/raw/raw3' NAME disk03,
 '/dev/raw/raw4' NAME disk04
 FAILGROUP fg2 DISK
 '/dev/raw/raw5' NAME disk05,
 '/dev/raw/raw6' NAME disk06,
 '/dev/raw/raw7' NAME disk07,
 '/dev/raw/raw8' NAME disk08;

Each file placed on DATA will be mirrored across fg1 and fg2 at the extent level. All drives in fg1 or all drives in fg2 could fail without losing data.
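
To double-check the resulting layout, here is a hedged sketch of a query against the ASM instance (standard views, output omitted):

SQL> select d.failgroup, d.name, d.total_mb
     from v$asm_disk d join v$asm_diskgroup g
     on d.group_number = g.group_number
     where g.name = 'DATA'
     order by d.failgroup, d.name;

Next comes the FRA Diskgroup, this time with External Redundancy: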

SQL> CREATE DISKGROUP fra EXTERNAL REDUNDANCY
 DISK
 '/dev/raw/raw9'  NAME disk09,
 '/dev/raw/raw10' NAME disk10,
 '/dev/raw/raw11' NAME disk11,
 '/dev/raw/raw12' NAME disk12,
 '/dev/raw/raw13' NAME disk13,
 '/dev/raw/raw14' NAME disk14,
 '/dev/raw/raw15' NAME disk15,
 '/dev/raw/raw16' NAME disk16;

FRA has no ASM redundancy; should one drive fail, we would lose all data on FRA. Next is an Online Backup with Image Copies to the DATA Diskgroup. No downtime is involved.

RMAN> backup as copy database format '+DATA';

Starting backup at 01-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=19 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/home/oracle/prima/system01.dbf
output file name=+DATA/prima/datafile/system.256.736599607 tag=TAG20101201T110002 RECID=1 STAMP=736599624
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/home/oracle/prima/sysaux01.dbf
output file name=+DATA/prima/datafile/sysaux.257.736599629 tag=TAG20101201T110002 RECID=2 STAMP=736599634
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/home/oracle/prima/undotbs01.dbf
output file name=+DATA/prima/datafile/undotbs1.258.736599641 tag=TAG20101201T110002 RECID=3 STAMP=736599646
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
output file name=+DATA/prima/datafile/users.259.736599655 tag=TAG20101201T110002 RECID=4 STAMP=736599655
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile copy
copying current control file
output file name=+DATA/prima/controlfile/backup.260.736599657 tag=TAG20101201T110002 RECID=5 STAMP=736599662
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 01-DEC-10
channel ORA_DISK_1: finished piece 1 at 01-DEC-10
piece handle=+DATA/prima/backupset/2010_12_01/nnsnf0_tag20101201t110002_0.261.736599665 tag=TAG20101201T110002 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
Finished backup at 01-DEC-10

Now we change some dynamic parameters to point to the new Database location and Recovery Area:

SQL> alter system set db_recovery_file_dest_size=1800m;

System altered.

SQL> alter system set db_recovery_file_dest='+FRA';

System altered.

SQL> alter system set db_create_file_dest='+DATA';

System altered.

Archivelogs will now be created in the FRA. We also put our spfile into ASM. The ‘from memory’ clause is an 11g New Feature.

SQL> create spfile='+DATA/spfileprima.ora' from memory;

File created.

We remove our spfile from $ORACLE_HOME/dbs and replace it with a pointer to the new spfile:

SQL> host cat /u01/app/oracle/product/11.2.0/dbhome_1/dbs/initprima.ora
spfile='+DATA/spfileprima.ora'
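
The steps behind that pointer file might look like this (a hedged sketch: the old spfile name follows the usual spfile<SID>.ora convention, and keeping a backup copy is just a precaution):

[oracle@uhesse-pc ~]$ cd /u01/app/oracle/product/11.2.0/dbhome_1/dbs
[oracle@uhesse-pc dbs]$ mv spfileprima.ora spfileprima.ora.bak
[oracle@uhesse-pc dbs]$ echo "spfile='+DATA/spfileprima.ora'" > initprima.ora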

Attention: Now we need to shut down and restart the production instance, which means a short downtime:

SQL> startup force nomount

ORACLE instance started.

Total System Global Area  417546240 bytes
Fixed Size                  2227072 bytes
Variable Size             352322688 bytes
Database Buffers           54525952 bytes
Redo Buffers                8470528 bytes

The controlfiles should also be placed (and mirrored) on ASM. Therefore:

SQL> alter system set control_files='+DATA','+FRA' scope=spfile;

System altered.

Restart to make the modified CONTROL_FILES parameter active:

SQL> startup force nomount
ORACLE instance started.

Total System Global Area  417546240 bytes
Fixed Size                  2227072 bytes
Variable Size             352322688 bytes
Database Buffers           54525952 bytes
Redo Buffers                8470528 bytes

We only need to restore the controlfiles to the new locations, switch to the new datafiles on DATA, and recover the latest changes that were made since the online backup:

[oracle@uhesse-pc ~]$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Wed Dec 1 11:13:33 2010

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (not mounted)

RMAN> restore controlfile from '/home/oracle/prima/control01.ctl';

Starting restore at 01-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=99 device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=+DATA/prima/controlfile/current.263.736600443
output file name=+FRA/prima/controlfile/current.256.736600443
Finished restore at 01-DEC-10

RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

RMAN> switch database to copy;

datafile 1 switched to datafile copy "+DATA/prima/datafile/system.256.736599607"
datafile 2 switched to datafile copy "+DATA/prima/datafile/sysaux.257.736599629"
datafile 3 switched to datafile copy "+DATA/prima/datafile/undotbs1.258.736599641"
datafile 4 switched to datafile copy "+DATA/prima/datafile/users.259.736599655"

RMAN> recover database;

Starting recover at 01-DEC-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=99 device type=DISK

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 01-DEC-10

RMAN> alter database open;

Downtime is over! End users can connect while we do some additional work:

SQL> select name from v$datafile union select name from v$tempfile;

NAME
--------------------------------------------------------------------------------
+DATA/prima/datafile/sysaux.257.736599629
+DATA/prima/datafile/system.256.736599607
+DATA/prima/datafile/undotbs1.258.736599641
+DATA/prima/datafile/users.259.736599655
/home/oracle/prima/temp01.dbt

The tempfile was not touched by RMAN during the backup or the switch to copy. We need to take care of it manually:

SQL> alter database  tempfile '/home/oracle/prima/temp01.dbt' drop;

Database altered.

SQL> alter tablespace temp add tempfile size 50m;

Tablespace altered.

SQL> select name from v$datafile union select name from v$tempfile;

NAME
--------------------------------------------------------------------------------
+DATA/prima/datafile/sysaux.257.736599629
+DATA/prima/datafile/system.256.736599607
+DATA/prima/datafile/undotbs1.258.736599641
+DATA/prima/datafile/users.259.736599655
+DATA/prima/tempfile/temp.264.736600915

Everything looks nice. But our Online Redo Logs are still on the filesystem:

SQL> select member from v$logfile;

MEMBER
-----------------------------------
/home/oracle/prima/log_g1m1.rdo
/home/oracle/prima/log_g2m1.rdo

This can also be fixed online:

SQL> alter database add logfile size 20m;

Database altered.

SQL> alter database add logfile size 20m;

Database altered.

That gave us two new groups with members mirrored across DATA and FRA, because both DB_CREATE_FILE_DEST and DB_RECOVERY_FILE_DEST are set. Now we switch out of the old groups and drop them:

SQL> alter system switch logfile;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> alter system checkpoint;

System altered.

SQL> alter database drop logfile group 1;

Database altered.

SQL> alter database drop logfile group 2;

Database altered.

SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
+DATA/prima/onlinelog/group_3.264.736612757
+FRA/prima/onlinelog/group_3.257.736612759
+DATA/prima/onlinelog/group_4.265.736612765
+FRA/prima/onlinelog/group_4.258.736612769

That was it. We may now do a backup of the database to FRA:

RMAN> backup database;

Starting backup at 01-DEC-10
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=105 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=+DATA/prima/datafile/system.256.736608851
input datafile file number=00002 name=+DATA/prima/datafile/sysaux.257.736608873
input datafile file number=00003 name=+DATA/prima/datafile/undotbs1.258.736608885
input datafile file number=00004 name=+DATA/prima/datafile/users.259.736608899
channel ORA_DISK_1: starting piece 1 at 01-DEC-10
channel ORA_DISK_1: finished piece 1 at 01-DEC-10
piece handle=+FRA/prima/backupset/2010_12_01/nnndf0_tag20101201t145715_0.259.736613837 tag=TAG20101201T145715 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:16
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 01-DEC-10
channel ORA_DISK_1: finished piece 1 at 01-DEC-10
piece handle=+FRA/prima/backupset/2010_12_01/ncsnf0_tag20101201t145715_0.260.736613853 tag=TAG20101201T145715 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 01-DEC-10

Conclusion: It is relatively simple to migrate your Database to ASM, and it needs only a short downtime; essentially, the process is an RMAN Online Backup with Image Copies followed by a switch to those copies.

