How to resolve media failures with the Recovery Advisor in #Oracle

Error messages are showing up because files have been damaged? Database Recovery Advisor to the rescue!

[oracle@uhesse ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Sep 7 11:52:24 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/prima/system01.dbf
/u01/app/oracle/oradata/prima/sysaux01.dbf
/u01/app/oracle/oradata/prima/undotbs01.dbf
/u01/app/oracle/oradata/prima/users01.dbf

SQL> host echo kaputt > /u01/app/oracle/oradata/prima/users01.dbf

SQL> select count(*) from adam.sales;
select count(*) from adam.sales
                          *
ERROR at line 1:
ORA-01115: IO error reading block from file  (block # )
ORA-01115: IO error reading block from file 4 (block # 147)
ORA-27072: File I/O error
Additional information: 4
Additional information: 147

Don’t do that at home, but the above prepared the playground to show how to resolve media errors easily. I’m using 12.1.0.2 here, but the shown functionality is available since 11.1 already.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
[oracle@uhesse ~]$ rman target /

Recovery Manager: Release 12.1.0.2.0 - Production on Wed Sep 7 12:02:47 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PRIMA (DBID=2131944058)

RMAN> list failure;

using target database control file instead of recovery catalog
Database Role: PRIMARY

no failures found that match specification

If the failure is not already listed, you should run the following check (works in 12c from the RMAN shell as shown, in 11g you need to run it from SQL*Plus):

RMAN> begin
 dbms_hm.run_check ('DB Structure Integrity Check','mycheck');
end;
/

Statement processed

RMAN> list failure;

Database Role: PRIMARY

List of Database Failures
=========================

Failure ID Priority Status    Time Detected       Summary
---------- -------- --------- ------------------- -------
42         HIGH     OPEN      2016-09-07 12:05:04 One or more non-system datafiles are corrupt

The failure is now listed. Next steps:

RMAN> advise failure;

Database Role: PRIMARY

List of Database Failures
=========================

Failure ID Priority Status    Time Detected       Summary
---------- -------- --------- ------------------- -------
42         HIGH     OPEN      2016-09-07 12:05:04 One or more non-system datafiles are corrupt

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
========================
no manual actions available

Optional Manual Actions
=======================
no manual actions available

Automated Repair Options
========================
Option Repair Description
------ ------------------
1      Restore and recover datafile 4  
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/app/oracle/diag/rdbms/prima/prima/hm/reco_1884775430.hm

RMAN> repair failure preview;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/prima/prima/hm/reco_1884775430.hm

contents of repair script:
   # restore and recover datafile
   sql 'alter database datafile 4 offline';
   restore ( datafile 4 );
   recover datafile 4;
   sql 'alter database datafile 4 online';

So that’s good to know: ‘The repair includes complete media recovery with no data loss’🙂 Let’s do that:

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/prima/prima/hm/reco_1884775430.hm

contents of repair script:
   # restore and recover datafile
   sql 'alter database datafile 4 offline';
   restore ( datafile 4 );
   recover datafile 4;
   sql 'alter database datafile 4 online';

Do you really want to execute the above repair (enter YES or NO)? yes
executing repair script

sql statement: alter database datafile 4 offline

Starting restore at 2016-09-07 12:08:43
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/prima/users01.dbf
channel ORA_DISK_1: reading from backup piece /u02/reco/PRIMA/backupset/2016_04_26/o1_mf_nnndf_TAG20160426T105118_ckyc07n5_.bkp
channel ORA_DISK_1: piece handle=/u02/reco/PRIMA/backupset/2016_04_26/o1_mf_nnndf_TAG20160426T105118_ckyc07n5_.bkp tag=TAG20160426T105118
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:15
Finished restore at 2016-09-07 12:09:58

Starting recover at 2016-09-07 12:09:58
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 37 is already on disk as file /u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_37_ckyc311q_.arc
archived log for thread 1 with sequence 38 is already on disk as file /u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_38_ckyc373o_.arc
archived log for thread 1 with sequence 39 is already on disk as file /u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_39_ckyc3co9_.arc
archived log file name=/u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_37_ckyc311q_.arc thread=1 sequence=37
archived log file name=/u02/reco/PRIMA/archivelog/2016_04_26/o1_mf_1_38_ckyc373o_.arc thread=1 sequence=38
media recovery complete, elapsed time: 00:00:01
Finished recover at 2016-09-07 12:10:01

sql statement: alter database datafile 4 online
repair failure complete

The Recovery Advisor did not only resolve the problem but it did also choose the solution with the least impact on availability: Restore and recovery has been done with the instance in status OPEN. Is the error now gone?

RMAN> select count(*) from adam.sales;

  COUNT(*)
----------
  10000000

Problem solved! Think of the Recovery Advisor as your personal first level support: There are problems that are too difficult for this tool to resolve, but it may be able to do a pretty good job as in this example. It doesn’t support RAC and cannot resolve issues at a standby database in a Data Guard  environment. Many more details are here, but essentially, you just need to memorize these three commands: LIST FAILURE, ADVISE FAILURE, REPAIR FAILURE. I think that’s not too much to ask, even with a very high stress level🙂

3 Comments

How to fix a problem with the spfile in #Oracle

An invalid entry in the spfile may prevent the instance from starting up:

SQL> alter system set sga_target=500m scope=spfile;

System altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORA-00821: Specified value of sga_target 512M is too small, needs to be at least 1392M

The instance doesn’t come up! This is easy to fix without having to restore the spfile from backup:

SQL> create pfile='/home/oracle/init.ora' from spfile;

File created.

SQL> host vi /home/oracle/init.ora

Now correct the value in the text file. I just removed the sga_target parameter from it here. Then

SQL> create spfile from pfile='/home/oracle/init.ora' ;

File created.

SQL> startup
ORACLE instance started.

Total System Global Area 1342177280 bytes
Fixed Size		    2924112 bytes
Variable Size		  218104240 bytes
Database Buffers	 1107296256 bytes
Redo Buffers		   13852672 bytes
Database mounted.
Database opened.

Problem solved! That fix works for any invalid entry in the spfile, not only for sga_target. I don’t know how often I had to do that over the years🙂

,

3 Comments

Index Competition in #Oracle 12c

Suppose you want to find out which type of index is best for performance with your workload. Why not set up a competition and let the optimizer decide? The playground:

ADAM@pdb1 > select max(amount_sold) from sales where channel_id=9;

MAX(AMOUNT_SOLD)
----------------
            5000

ADAM@pdb1 > @lastplan

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
SQL_ID  3hrvrf1r6kn8s, child number 0
-------------------------------------
select max(amount_sold) from sales where channel_id=9

Plan hash value: 3593230073

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE                      |       |     1 |     6 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| SALES |     1 |     6 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | BSTAR |     1 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CHANNEL_ID"=9)


20 rows selected.

There is a standard B*tree index on the column CHANNEL_ID that speeds up the SELECT above. I think a bitmap index would be better:

ADAM@pdb1 > create bitmap index bmap on sales(channel_id) invisible nologging;

Index created.

ADAM@pdb1 > alter index bstar invisible;

Index altered.

ADAM@pdb1 > alter index bmap visible;

Index altered.

ADAM@pdb1 > select max(amount_sold) from sales where channel_id=9;

MAX(AMOUNT_SOLD)
----------------
            5000

ADAM@pdb1 > @lastplan

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
select max(amount_sold) from sales where channel_id=9

Plan hash value: 2178022915

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |       |       |     3 (100)|          |
|   1 |  SORT AGGREGATE                      |       |     1 |     6 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| SALES |     1 |     6 |     3   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS       |       |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE        | BMAP  |       |       |            |          |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CHANNEL_ID"=9)


21 rows selected.

With this 12c New Feature (two indexes on the same column), I got a smooth transition to the new index type. But this left no choice to the optimizer. What about this?

ADAM@pdb1 > alter index bmap invisible;

Index altered.

ADAM@pdb1 > alter session set optimizer_use_invisible_indexes=true;

Now both indexes are invisible and the optimizer may choose any of them. Turns out that it likes the bitmap index better here. Instead of watching the execution plans, V$SEGMENT_STATISTICS can also be used to find out:

ADAM@pdb1 > select object_name,statistic_name,value
            from v$segment_statistics
            where object_name in ('BSTAR','BMAP')
            and statistic_name in ('physical reads','logical reads');

OBJECT STATISTIC_NAME                      VALUE
------ ------------------------------ ----------
BSTAR  logical reads                       22800
BSTAR  physical reads                       6212
BMAP   logical reads                        1696
BMAP   physical reads                          0

The numbers of BSTAR remain static while BMAP numbers increase. You may also monitor that with DBA_HIST_SEG_STAT across AWR snapshots. Now isn’t that cool?🙂
Couple of things to be aware of here:
Watch out for more than just physical/logical reads – bitmap indexes may cause a locking problem in an OLTP environment.
Don’t keep the two indexes invisible forever – after you saw which one performs better, drop the other one. Invisible indexes need to be maintained upon DML and therefore slow it down.

, ,

7 Comments

%d bloggers like this: