Restoring old Optimizer Statistics for troubleshooting purpose

We deliver a Oracle Database 10g (sic!) New Features course this week in our education center in Düsseldorf. Many customers are still on 10g productively and some even just upgrade from 9i to 10g, still being in the process of evaluating 11g. Many new features in the 10g version are related to the optimizer. Some are quite well known to the public, like the obsolence of the rule based optimizer and the presence of a scheduler job that gathers optimizer statistics every night out-of-the-box.

Others lead a more quiet life, one of them is the possibility to restore old optimizer statistics (i.e. as a troubleshooting measure), should there be issues with the newly gathered ones. Since 10g, we do not simply overwrite old optimizer statistics by gathering new. Instead, the old optimizer statistics are automatically historized and kept for one month by default, so that they can be restored easily if that should be desired. Now did you know that? Here we go with a short demonstration of restoring historic optimizer statistics:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> grant dba to adam identified by adam;
Grant succeeded.

SQL> connect adam/adam
Connected.

SQL> create table sales as select
rownum as id,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id,
5000 as amount_sold,
sysdate as time_id
from dual connect by level<=3e7;
Table created.

SQL> create index sales_idx on sales(id) nologging;
Index created.

SQL> select segment_name,bytes/1024/1024 as mb from user_segments;

SEGMENT_NAME                     MB
------------------------ ----------
SALES                           942
SALES_IDX                       566

I just created a demo user, a table and an index on that table. Notice that the two segments take about 1.5 Gig space, should you like to reproduce the demo yourself. Right now, there are no optimizer statistics for the table:

SQL> select num_rows,blocks from user_tables;

 NUM_ROWS     BLOCKS
---------- ----------
"NULL values here"

I am now going to gather statistics on the table manually – the same would be done automatically by the standard scheduler job during the night:

SQL> exec dbms_stats.gather_table_stats('ADAM','SALES')

PL/SQL procedure successfully completed.

SQL> select num_rows,blocks from user_tables;

 NUM_ROWS     BLOCKS
---------- ----------
 29933962     119585

SQL> select count(*) from sales;

 COUNT(*)
----------
 30000000

As we can see, the statistics are quite accurate, reflecting well the actual size of the table. The index is used for the following query, as we can tell by runtime already:

SQL> set timing on
SQL> select amount_sold from sales where id=4711;

AMOUNT_SOLD
-----------
 5000

Elapsed: 00:00:00.00

I am now going to introduce a problem with the optimizer statistics artificially by just setting them very inaccurate. A real-world problem that is caused by new optimizer statistics is a little harder to come up with – probably you will never encounter it during your career…

SQL> exec dbms_stats.set_table_stats('ADAM','SALES',numrows=>100,numblks=>1)

PL/SQL procedure successfully completed.

SQL> select num_rows,blocks from user_tables;

 NUM_ROWS     BLOCKS
---------- ----------
 100          1

With the above (completely misleading) statistics, the optimizer will think that a full table scan of the sales table is fairly cheap. Please notice that I ask after id 4712 and not 4711 again, because it could happen that the already computed execution plan (index range scan) is still in the library cache available for reuse. I could also flush the shared pool here to make sure that a new execution plan has to be generated for the id 4711.

SQL> select amount_sold from sales where id=4712;

AMOUNT_SOLD
-----------
 5000

Elapsed: 00:00:01.91

We can tell by the runtime of almost 2 seconds here that this was a full table scan. Proof would be to retrieve the execution plan from the library cache. I leave that to your studies. Please be aware that the autotrace feature might be misleading here. For our scope, it is enough to say that we have an issue caused by generation of new optimizer statistics. We want to get back our good old statistics! Therefore, we look at the historic optimizer statistics:

SQL> alter session set NLS_TIMESTAMP_TZ_FORMAT='yyyy-mm-dd:hh24:mi:ss';

Session altered.

SQL> select table_name,stats_update_time from user_tab_stats_history;

TABLE_NAME                     STATS_UPDATE_TIME
------------------------------ --------------------------------------
SALES                          2010-05-18:09:47:16
SALES                          2010-05-18:09:47:38

We see two rows, representing the old statistics of the sales table. The first was from the time, as there where NULL entries  (before the first gather_table_stats). The second row represents the accurate statistics. I am going to restore them:

SQL> begin
dbms_stats.restore_table_stats('ADAM','SALES',
to_timestamp('2010-05-18:09:47:38','yyyy-mm-dd:hh24:mi:ss'));
end;
/
PL/SQL procedure successfully completed.
SQL> select num_rows,blocks from user_tables;
 NUM_ROWS     BLOCKS
---------- ----------
 29933962     119585

The good old statistics are back, but the statements are still in the library cache and may execute with the same misguided execution plan:

SQL> set timing on
SQL> select amount_sold from sales where id=4712;

AMOUNT_SOLD
-----------
 5000

Elapsed: 00:00:01.92

Still full table scan! The following empties the shared pool and forces the computation of new execution plans, using the accurate statistics:

SQL> alter system flush shared_pool;

System altered.

Elapsed: 00:00:00.04
SQL> select amount_sold from sales where id=4712;

AMOUNT_SOLD
-----------
 5000

Elapsed: 00:00:00.01

We fixed the problem by restoring the old stats. Should the problem persist – because apparently, optimizer statistics did not cause it – we can get back the new computed statistics also:

SQL> select table_name,stats_update_time from user_tab_stats_history;

TABLE_NAME                     STATS_UPDATE_TIME
------------------------------ ---------------------------------------------------------------------------
SALES                          2010-05-18:09:47:16
SALES                          2010-05-18:09:47:38
SALES                          2010-05-18:09:52:30

The last line represents the last historized stats.

SQL> begin
dbms_stats.restore_table_stats('ADAM','SALES',
to_timestamp('2010-05-18:09:52:30','yyyy-mm-dd:hh24:mi:ss'));
end;
/  2    3    4    5

PL/SQL procedure successfully completed.

SQL> select num_rows,blocks from user_tables;

 NUM_ROWS     BLOCKS
---------- ----------
 100          1

Conclusion: If we suspect that new Optimizer Statistics are responsible for a performance problem, we can restore the “good old statistics” very easy since 10g.

About these ads

,

  1. #1 by Hemant K Chitale on May 19, 2010 - 04:03

    You say “A real-world problem that is caused by new optimizer statistics is a little harder to come up with – probably you will never encounter it during your career…”

    Oh no. It is frequently seen. Happens with tables that undergo a lot of DML intra-day.

    Hemant K Chitale

  2. #2 by Uwe Hesse on May 19, 2010 - 07:35

    Hemant,
    thank you for your comment! Could you give an example for that with a little more detail?

    If you refer to high volatile tables (statistics are almost outdated right after gathering them) – this could be addressed by deleting statistics, lock (empty) statistics and then go with dynamic sampling. Personally, I would not say that the new statistics caused the problem in the above case, though. They are just too fast outdated.

  3. #3 by Wolfgang Breitling on May 19, 2010 - 19:23

    This “historization” of the old statistics does not happen when statistics are gathered en-passant when rebuilding an index.

    Try the following:

    gather statistics of a table ( with an index, of course )
    wait a few minutes
    rebuild the index
    restore the statistics giving a time just after gathering the statistics but before the rebuild.

    You end up with no statistics at all on the index. Not the ones gathered with the table nor the ones gathered as a result of the rebuild. The statistics are just wiped out.

    If Oracle touts a new “feature” they should make sure it works. I have an SR open for over two years now. No resolution in sight.

  4. #4 by Anand on May 20, 2010 - 05:51

    Hi Uwe,

    Again very simple,clear and easy to understand.Awesome :)

    Regards,
    Anand

  5. #5 by anu on November 22, 2010 - 23:20

    Hi Uwe,
    Great post! Exactly what I was looking for. Thanks a lot.

  6. #6 by Yury Velikanov on February 21, 2012 - 06:51

    Hey Uwe! I regret that I didn’t find this post long long time ago :)
    It would save me plenty of time. Thank you for sharing and thanks to Anand Prakash (@anandprakash86) who pointed me to the post.

    I found it a bit uncomfortable seeing that there is no information about the statistics itself available from %_tab_stats_history view.

    The following SQL will give us a bit more information for making an intelligent decisions:

    select o.name, to_char(h.savtime,’yyyy-mm-dd:hh24:mi:ss’), h.blkcnt, h.rowcnt
    from sys.user$ u, sys.obj$ o, sys.wri$_optstat_tab_history h
    where h.obj# = o.obj# and o.type# = 2 and o.owner# = u.user#
    and o.name=’SALES’ and u.name=’ADAM’
    order by h.savtime;

    20:38:09 SYS@ORCL:2> select o.name, to_char(h.savtime,’yyyy-mm-dd:hh24:mi:ss’), h.blkcnt, h.rowcnt
    20:42:30 SYS@ORCL:2> from sys.user$ u, sys.obj$ o, sys.wri$_optstat_tab_history h
    20:42:30 SYS@ORCL:2> where h.obj# = o.obj# and o.type# = 2 and o.owner# = u.user#
    20:42:30 SYS@ORCL:2> and o.name=’SALES’ and u.name=’ADAM’
    20:42:30 SYS@ORCL:2> order by h.savtime;

    NAME TO_CHAR(H.SAVTIME,’YYYY-MM-DD: BLKCNT ROWCNT
    —————————— ————————————————————————— ———- ———-
    SALES 2012-02-20:20:10:42
    SALES 2012-02-20:20:32:20 3016 142468
    SALES 2012-02-20:20:33:35 1 100
    SALES 2012-02-20:20:34:22 3016 142468

    20:42:30 SYS@ORCL:2>
    20:42:31 SYS@ORCL:2>

    Yury

  7. #7 by Uwe Hesse on February 21, 2012 - 20:25

    Yury, thank you for stopping by and for sharing this information!

  1. Restoring old Optimizer Statistics for troubleshooting purpose … | Make Computers
  2. Chirpir News | Restoring old Optimizer Statistics for troubleshooting purpose …
  3. DIFF_TABLE_STATS_IN_HISTORY Example « The Oracle Instructor
  4. DIFF_TABLE_STATS_IN_HISTORY Example | Oracle Administrators Blog - by Aman Sood

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,760 other followers

%d bloggers like this: