No Smart Scan with AUTO_SAMPLE_SIZE

During the Exadata Database Machine Admin Workshop that I deliver this week in Munich, I got the question whether the gathering of Optimizer Statistics could be done via Smart Scan on Exadata. I didn’t know that and that particular topic seems not to be covered much yet in the docs or other Blogs that I researched briefly. Therefore I came up with this little test case here:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL> alter session set workarea_size_policy=manual;

Session altered.

SQL> alter session set sort_area_size=1000000000;

Session altered.

SQL> create table sales as
select
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2010' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=1e7;

Table created.

We go on with the recommended method to gather statistics and compare that with 30 % and 100 % sample size:

SQL> connect adam/adam
Connected.
SQL> exec dbms_stats.gather_table_stats('ADAM','SALES',estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE)

PL/SQL procedure successfully completed.

SQL> select name,value/1024/1024 as mb from v$mystat natural join v$statname
    where name='cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan                0

SQL> exec dbms_stats.delete_table_stats('ADAM','SALES')

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('ADAM','SALES',estimate_percent=>30)

PL/SQL procedure successfully completed.

SQL> select name,value/1024/1024 as mb from v$mystat natural join v$statname
    where name='cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan                0

SQL> exec dbms_stats.delete_table_stats('ADAM','SALES')

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('ADAM','SALES',estimate_percent=>100)

PL/SQL procedure successfully completed.

SQL> select name,value/1024/1024 as mb from v$mystat natural join v$statname
    where name='cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan       477.690498

Conclusion: It is possible to gather Optimizer Statistics via Smart Scan, but only with a 100 % sample size. Now the question is, whether AUTO_SAMPLE_SIZE will gather statistics faster although running without Smart Scan. The quality of the resulting stats will be very similar in both cases. If you are using ESTIMATE_PERCENT < 100 presently, consider to change to AUTO_SAMPLE_SIZE or 100 % on Exadata, as this may complete faster there with Smart Scan than your non-AUTO_SAMPLE_SIZE value does. As always: Don’t believe it, test it :)

 

About these ads

,

  1. #1 by alexandruneda on July 4, 2012 - 19:31

    Excellent! Very useful, thank you!

  2. #2 by mwidlake on July 4, 2012 - 23:07

    Hi Uwe,

    As I understand it, you only get the benefit of the faster one-pass NDV (number of distinct values) algorithm if you leave the sample size at auto. So going to 100% and gaining the benefit of Smart scan may well be cancelled out by losing the NDV functionality.

    As you say, something to test!

  3. #3 by Uwe Hesse on July 4, 2012 - 23:15

    Alexandru, you’re welcome :)

  4. #4 by Uwe Hesse on July 4, 2012 - 23:20

    Martin, I suppose that AUTO_SAMPLE_SIZE will also be (much) faster than 100 % sample size even with Smart Scan on most production systems that run on Exadata. So the question is probably kind of academic, but I couldn’t resist to check it and share the answer :)

  5. #5 by mwidlake on July 4, 2012 - 23:57

    I was having an extended discussion with Julian Dyke (during a user group meeting as we both presented!) about the benefits or not of smart scan on exadata. It’s jolly fast but of course nothing gets put in the buffer cache by it. So maybe it is not so great in some circumstances. The more we understand, the better choices we can make :-) I’ve only been using it for 4 months so I claim no expertise.

  6. #6 by Bosson on July 11, 2012 - 13:41

    slt moi j’aimerais avoir une bourse d’étude afin de pouvoir une bonne formation concernant les bases de données sur ” ORACLE”

  7. #7 by Yury Pudovchenko on October 24, 2012 - 10:02

    Hello, Uwe,

    0. Thanks for interesting point !
    1. I use the DB Control/Performancs/SQL Monitor to analyze smart scan. In my experiments I see the smart scan in DB Control for 10% and for 30% and for 100% and not for AUTO_SAMPLE_SIZE .
    2. The times I obtained:
    10% = 148 sec, cell offloading = -5.26%
    100% = 1434 sec, cell offloading = -22%
    AUTO_SAMPLE_SIZE = 181 sec , No cel offloading

  8. #8 by Yury Pudovchenko on October 24, 2012 - 10:47

    Hello,

    Add the degree=>DBMS_STATS.AUTO_DEGREE to the estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE
    and you will have the much more faster time + SMART SCAN !

  9. #9 by Yury Pudovchenko on October 24, 2012 - 11:39

    But if I run estimate_percent=>100,degree=>DBMS_STATS.AUTO_DEGREE then I don’t see smart scan !
    I detect smart scan as the column in SQL Monitor.

  10. #10 by Yury Pudovchenko on October 24, 2012 - 11:48

    And – important thing – I see the negative smart scan % (less than 0), while gathering stats run in 1 thread ( no parallel), because of many sorting on the disk.

    Thanks Uwe for interesting problem ! :)

  11. #11 by Uwe Hesse on October 29, 2012 - 15:33

    Yury, thank YOU for taking the effort to share your findings here :-)

  12. #12 by Rich Headrick on February 2, 2013 - 01:09

    I just logged a bug with development on this issue. It is not the expected behavior.

  13. #13 by Uwe Hesse on February 4, 2013 - 09:26

    Thank you for stopping by, Rich, and for logging that bug!

  1. Exadata « Oracle Scratchpad

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 2,888 other followers

%d bloggers like this: