Posts tagged with Performance Tuning
Exadata Part I: Smart Scan
As many of you will know, Oracle has introduced a new flagship in the core area: The Exadata Database Machine is a combined hardware and software offering, consisting (in the case of a Full Rack) of an 8-node cluster (each node with 2 six-core CPUs and 72 GB RAM) and 14 Storage Servers (each with 2 six-core CPUs and 12 SAS or 12 SATA drives *), connected (RAC Interconnect & Storage) over InfiniBand. The operating system is Oracle Enterprise Linux; the database version is 11gR2 Enterprise Edition, both mandatory.
I have the privilege of access to our Oracle University Database Machine – I really do love my job 🙂 But I am limited to only one database node with two Storage Servers – which explains the relatively slow speed during the demonstrations compared to a Full Rack.
The idea of this posting is to show that there is more to the Exadata Database Machine than sophisticated and well-combined hardware: there is actually some (database) intelligence built into the Storage Layer that enables it to do column and predicate filtering before it sends results to the Database Layer. This functionality is called Smart Scan:
SQL> create table sales as
select
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2010' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=2e7;
Table created.
SQL> select bytes/1024/1024/1024 as gb from user_segments;
GB
----------
1.0625
The above created a roughly 1 GB demo table, which is now blown up to about 4 GB:
SQL> alter table sales nologging;
Table altered.
SQL> insert /*+ append */ into sales select * from sales;
20000000 rows created.
SQL> commit;
Commit complete.
SQL> insert /*+ append */ into sales select * from sales;
40000000 rows created.
SQL> commit;
Commit complete.
SQL> select bytes/1024/1024/1024 as gb from user_segments;
GB
----------
4.17822266
SQL> exec dbms_stats.gather_table_stats('ADAM','SALES')
PL/SQL procedure successfully completed.
The playing field is set up. First, we will look at the scan speed without Smart Scan:
SQL> alter session set cell_offload_processing=false;
Session altered.
SQL> set timing on
SQL> select /* NO_SMART_SCAN */ count(*) from sales where channel_id=1;
COUNT(*)
----------
16000000
Elapsed: 00:00:22.96
SQL> select /* NO_SMART_SCAN */ count(*) from sales where channel_id=1;
COUNT(*)
----------
16000000
Elapsed: 00:00:22.85
SQL> set timing off
The statement is run twice to show that caching does not play a significant role in this situation. We needed about 23 seconds to scan about 4 GB, with the Storage Servers delivering the full amount of data to the Database Layer. Now the very same statement with the standard functionality Smart Scan:
SQL> alter session set cell_offload_processing=true;
Session altered.
SQL> set timing on
SQL> select /* WITH_SMART_SCAN */ count(*) from sales where channel_id=1;
COUNT(*)
----------
16000000
Elapsed: 00:00:05.58
SQL> select /* WITH_SMART_SCAN */ count(*) from sales where channel_id=1;
COUNT(*)
----------
16000000
Elapsed: 00:00:05.64
SQL> set timing off
Runtime was reduced to less than 6 seconds with Smart Scan. Let’s examine the execution plans for the two statements. I included the comment (it is NOT a hint) to retrieve the statements and their execution plans more easily from the Library Cache:
SQL> select sql_id from v$sql where sql_text like '%NO_SMART_SCAN%' and sql_text not like '%like%';
SQL_ID
-------------
81vy8y7fkvzta
SQL> select plan_table_output from table (dbms_xplan.display_cursor('81vy8y7fkvzta'));
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 81vy8y7fkvzta, child number 0
-------------------------------------
select /* NO_SMART_SCAN */ count(*) from sales where channel_id=1
Plan hash value: 1047182207
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 147K(100)| |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS STORAGE FULL| SALES | 16M| 46M| 147K (1)| 00:29:30 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("CHANNEL_ID"=1)
An ordinary full table scan, as expected. Now the second trial:
SQL> select sql_id from v$sql where sql_text like '%WITH_SMART_SCAN%' and sql_text not like '%like%';
SQL_ID
-------------
91w7a8mx8c977
SQL> select plan_table_output from table (dbms_xplan.display_cursor('91w7a8mx8c977'));
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 91w7a8mx8c977, child number 0
-------------------------------------
select /* WITH_SMART_SCAN */ count(*) from sales where channel_id=1
Plan hash value: 1047182207
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 147K(100)| |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS STORAGE FULL| SALES | 16M| 46M| 147K (1)| 00:29:30 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - storage("CHANNEL_ID"=1)
filter("CHANNEL_ID"=1)
The Storage Layer filtered on the predicate channel_id=1 before transmitting the result to the Database Layer, which is the reason for the reduced runtime. Apart from Smart Scan, there are more conventional ways to speed up the statement, like running it in parallel:
SQL> select /*+ parallel(sales) */ count(*) from sales where channel_id=1;
COUNT(*)
----------
16000000
Elapsed: 00:00:02.83
SQL> set timing off
SQL> select sql_id from v$sql where sql_text like '% parallel(sales) %' and sql_text not like '%like%';
SQL_ID
-------------
ckq93tnythxa7
SQL> select plan_table_output from table (dbms_xplan.display_cursor('ckq93tnythxa7'));
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID ckq93tnythxa7, child number 0
-------------------------------------
select /*+ parallel(sales) */ count(*) from sales where channel_id=1
Plan hash value: 3130505568
------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3409 (100)| | | | |
| 1 | SORT AGGREGATE | | 1 | 3 | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 3 | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 3 | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 16M| 46M| 3409 (1)| 00:00:01 | Q1,00 | PCWC | |
|* 6 | TABLE ACCESS STORAGE FULL| SALES | 16M| 46M| 3409 (1)| 00:00:01 | Q1,00 | PCWP | |
------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - storage(:Z>=:Z AND :Z<=:Z AND "CHANNEL_ID"=1)
filter("CHANNEL_ID"=1)
Parallel Query combined with Smart Scan was even faster, delivering
SQL> select 4.17/2.83 as GB_PER_SEC from dual;
GB_PER_SEC
----------
1.47349823
a scan rate of about 1.5 Gigabyte per Second. With an Oracle Database at hand, I don’t need a calculator 🙂
To be honest, I have no idea what the weird storage filter :Z>=:Z AND :Z<=:Z means here.
Conclusion: Exadata is not only strong hardware but also database intelligence on the Storage Layer. Smart Scan is the capability of the Storage Layer to filter columns and predicates before sending the result to the Database Layer.
Addendum:
Using Enterprise Manager (Grid Control or Database Control flavor even with Exadata), we have an easier way to determine Smart Scans:
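Without the GUI, you can also check from the command line whether Smart Scan actually kicked in. A minimal sketch using session statistics (the statistic names below are the 11gR2 names and may differ in other releases):

```sql
-- Bytes eligible for offload vs. bytes actually returned
-- by Smart Scan for the current session:
select s.name, round(m.value/1024/1024) as mb
from   v$mystat m join v$statname s
on     m.statistic# = s.statistic#
where  s.name in
 ('cell physical IO bytes eligible for predicate offload',
  'cell physical IO interconnect bytes returned by smart scan');
```

If the second value is much smaller than the first, the Storage Layer filtered the data before shipping it to the Database Layer.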
* Second Addendum: Meanwhile (May 2011), the High Capacity drives are also SAS – no more SATA
Reorganizing Tables in Oracle – is it worth the effort?

This topic seems to be some kind of "evergreen", since it comes up regularly in my courses and also in the OTN Discussion Forum. I have therefore decided to cover it briefly here in order to be able to point to this post in the future.
Short answer: Probably not
If the intention of the reorganization is to regain space or to "defragment" the table, the operation is very likely just a waste of effort and resources. I will try to illustrate my point with a simple demonstration that anybody with access to an Oracle Database can easily reproduce. My demo database is 11gR2, but the same can be done with 10g as well. If you are on an even older version, there is no SHRINK SPACE available, so you would have to use MOVE instead. I will now prepare a little demo table with 1 million rows – the time_id population was the hardest part of that demo for me 🙂
SQL> create table sales as
select
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2010' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=1e6;
Table created.
SQL> select time_id, count(*)
from sales group by time_id
order by 1;
TIME_ID COUNT(*)
--------- ----------
01-JAN-10 83333
01-FEB-10 83334
01-MAR-10 83334
01-APR-10 83334
01-MAY-10 83334
01-JUN-10 83333
01-JUL-10 83333
01-AUG-10 83333
01-SEP-10 83333
01-OCT-10 83333
01-NOV-10 83333
01-DEC-10 83333
12 rows selected.
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 54
The table contains about 83000 rows per month. Now I will delete the first quarter of rows:
SQL> delete from sales where time_id<to_date('01.04.2010','dd.mm.yyyy');
250001 rows deleted.
SQL> commit;
Commit complete.
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 54
This is the starting point of a possible reorganization: although 250k rows got deleted, the table consumes the same space as before. In other words, the High Water Mark did not move. A reorganization would move the High Water Mark and regain the space that was consumed by the 250k rows, as shown in the picture below:
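The position of the High Water Mark can also be inspected with the DBMS_SPACE package. A minimal sketch (run as the table owner; note that UNUSED_SPACE reports the never-used blocks above the High Water Mark, which stay unchanged by the delete – the emptied blocks remain below it):

```sql
set serveroutput on
declare
 v_total_blocks  number; v_total_bytes  number;
 v_unused_blocks number; v_unused_bytes number;
 v_file_id number; v_block_id number; v_last_block number;
begin
 dbms_space.unused_space(
  segment_owner             => user,
  segment_name              => 'SALES',
  segment_type              => 'TABLE',
  total_blocks              => v_total_blocks,
  total_bytes               => v_total_bytes,
  unused_blocks             => v_unused_blocks,
  unused_bytes              => v_unused_bytes,
  last_used_extent_file_id  => v_file_id,
  last_used_extent_block_id => v_block_id,
  last_used_block           => v_last_block);
 -- blocks below the HWM (formatted at some point, maybe empty now):
 dbms_output.put_line('Below HWM: '||(v_total_blocks - v_unused_blocks));
 -- blocks above the HWM (never used):
 dbms_output.put_line('Above HWM: '||v_unused_blocks);
end;
/
```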
The question is: is that necessary? If inserts take place again after the deletion, the space gets reused without any need to reorganize:
SQL> insert into sales
select
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,3)+1),2,'0') || '.2011' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=2.5e5;
250000 rows created.
SQL> commit;
Commit complete.
SQL> select time_id ,count(*)
from sales group by time_id
order by 1;
TIME_ID COUNT(*)
--------- ----------
01-APR-10 83334
01-MAY-10 83334
01-JUN-10 83333
01-JUL-10 83333
01-AUG-10 83333
01-SEP-10 83333
01-OCT-10 83333
01-NOV-10 83333
01-DEC-10 83333
01-JAN-11 83333
01-FEB-11 83334
01-MAR-11 83333
12 rows selected.
I inserted a new quarter of rows. The table remains the same size as before:
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 54
That is exactly the point I would like to emphasize: if you have inserts following deletes, reorganization of tables is not necessary! Only if that is not the case (the table no longer gets inserts after the deletion) may you want to reorganize it:
SQL> delete from sales where time_id<to_date('01.07.2010','dd.mm.yyyy');
250001 rows deleted.
SQL> commit;
Commit complete.
SQL> alter table sales enable row movement;
Table altered.
SQL> alter table sales shrink space;
Table altered.
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 40.25
The space consumed by the table has now been reduced and is usable for other segments. Although the table is not locked during the SHRINK SPACE operation and users can do DML on the table as usual, the operation is not "for free" in terms of resource consumption: it does a lot of I/O, it creates a lot of before images that consume space in the UNDO tablespace, and it modifies many blocks, so a lot of redo (and therefore probably many archived logs) gets generated.
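If that resource consumption is a concern on a busy system, the shrink can be split into two steps: SHRINK SPACE COMPACT does the (expensive) row movement while leaving the High Water Mark in place, and a plain SHRINK SPACE afterwards just moves the High Water Mark in a short final step. A sketch:

```sql
SQL> alter table sales shrink space compact;  -- moves the rows, HWM untouched
SQL> alter table sales shrink space;          -- now only adjusts the HWM
```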
If you really think that you need that kind of reorganization regularly, you should probably evaluate the Partitioning Option instead:
SQL> create table sales_part
(product char(25), channel_id number,
cust_id number, amount_sold number, time_id date)
partition by range (time_id)
interval (numtoyminterval(3,'month'))
(partition p1 values less than (to_date('01.04.2010','dd.mm.yyyy')))
;
The above command creates a table that is partitioned by quarter. We could simply drop a partition instead of deleting lots of rows, and we would never need to reorganize to regain space. If you are on 10g, INTERVAL PARTITIONING is not available; you can use RANGE PARTITIONING instead, with the additional effort that you need to create the necessary range partitions manually.
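Getting rid of a quarter then becomes a quick Data Dictionary operation instead of a mass delete. A sketch against the interval-partitioned table above (the FOR clause addresses the system-generated partition by any value inside it; the partition must of course exist, i.e. rows for that quarter were inserted):

```sql
-- drops the whole Q2/2010 partition in one dictionary operation:
SQL> alter table sales_part drop partition
     for (to_date('15.05.2010','dd.mm.yyyy'));
```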
Conclusion: Before you decide to reorganize a table, make sure that this is really necessary, because likely it isn’t 🙂
Watch me explain the above on YouTube:
Performance Monitoring: Active Session History at work
Teaching an Oracle Database 10g Performance Tuning course this week, I introduced the 10g New Feature Active Session History (ASH) to the students. That was one major improvement – together with the Automatic Workload Repository (AWR) and the Automatic Database Diagnostic Monitor (ADDM) – of the 10g version. Way better than STATSPACK was before!
Imagine you are a DBA on a production system and get an emergency call like "The Database is dead slow!". You are supposed to spot the cause as soon as possible. This is where ASH kicks in: the wait events of active sessions are sampled into the ASH buffer every second. It is accessed most comfortably with the Enterprise Manager GUI from the Performance Page (button ASH Report there). Or, with little effort, from the command line like this:
-----------------------------------------
--
-- Top 10 CPU consumers in last 5 minutes
--
-----------------------------------------
SQL> select * from (
     select session_id, session_serial#, count(*)
     from v$active_session_history
     where session_state= 'ON CPU'
     and sample_time > sysdate - interval '5' minute
     group by session_id, session_serial#
     order by count(*) desc
     ) where rownum <= 10;
--------------------------------------------
--
-- Top 10 waiting sessions in last 5 minutes
--
--------------------------------------------
SQL> select * from (
     select session_id, session_serial#, count(*)
     from v$active_session_history
     where session_state='WAITING'
     and sample_time > sysdate - interval '5' minute
     group by session_id, session_serial#
     order by count(*) desc
     ) where rownum <= 10;
These two queries should spot the most incriminating sessions of the last 5 minutes. But who is that, and what SQL was running?
--------------------
--
-- Who is that SID?
--
--------------------
set lines 200
col username for a10
col osuser for a10
col machine for a10
col program for a10
col resource_consumer_group for a10
col client_info for a10
SQL> select serial#, username, osuser, machine, program,
     resource_consumer_group, client_info
     from v$session where sid=&sid;
-------------------------
--
-- What did that SID do?
--
-------------------------
SQL> select distinct sql_id, session_serial#
     from v$active_session_history
     where sample_time > sysdate - interval '5' minute
     and session_id=&sid;
----------------------------------------------
--
-- Retrieve the SQL from the Library Cache:
--
----------------------------------------------
col sql_text for a80
SQL> select sql_text from v$sql where sql_id='&sqlid';
You may spot the cause of the current performance problem in a very short time with the technique shown. But beware: you need to purchase the Diagnostic Pack in order to be allowed to use AWR, ADDM and ASH 🙂
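Whether those views are enabled at all is controlled by an initialization parameter (introduced in 11g); a quick way to check:

```sql
SQL> show parameter control_management_pack_access
-- Possible values are NONE, DIAGNOSTIC and DIAGNOSTIC+TUNING;
-- with NONE, the AWR/ADDM/ASH infrastructure is disabled.
```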


