Posts tagged with partitioning
Partition-Pruning: Do & Don’t

This is about how to write SQL in a way that supports Partition-Pruning – and what should be avoided. The playing field looks as follows:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL> select table_name,partitioning_type,partition_count from user_part_tables where table_name='SALES_YEAR';
TABLE_NAME PARTITION PARTITION_COUNT
------------------------------ --------- ---------------
SALES_YEAR RANGE 1048575
SQL> select segment_name,partition_name,sum(bytes)/1024/1024 as mb
from user_segments where segment_name='SALES_YEAR'
group by rollup (segment_name,partition_name)
order by 1,2;
SEGMENT_NAME PARTITION_NAME MB
------------------------------ ------------------------------ ----------
SALES_YEAR P1 16
SALES_YEAR SYS_P181 32
SALES_YEAR SYS_P182 32
SALES_YEAR SYS_P183 32
SALES_YEAR SYS_P184 32
SALES_YEAR SYS_P185 32
SALES_YEAR SYS_P186 32
SALES_YEAR SYS_P187 32
SALES_YEAR SYS_P188 32
SALES_YEAR SYS_P189 32
SALES_YEAR SYS_P190 32
SALES_YEAR SYS_P191 32
SALES_YEAR SYS_P192 32
SALES_YEAR SYS_P193 32
SALES_YEAR SYS_P194 32
SALES_YEAR SYS_P195 32
SALES_YEAR SYS_P196 32
SALES_YEAR SYS_P197 32
SALES_YEAR SYS_P198 32
SALES_YEAR SYS_P199 32
SALES_YEAR SYS_P200 32
SALES_YEAR SYS_P201 32
SALES_YEAR SYS_P202 32
SALES_YEAR SYS_P203 32
SALES_YEAR SYS_P204 32
SALES_YEAR SYS_P205 32
SALES_YEAR SYS_P206 32
SALES_YEAR SYS_P207 24
SALES_YEAR 872
872
30 rows selected.
SQL> select to_char(order_date,'yyyy'),count(*) from sales_year group by to_char(order_date,'yyyy') order by 1;
TO_C COUNT(*)
---- ----------
1985 158000
1986 365000
1987 365000
1988 366000
1989 365000
1990 365000
1991 365000
1992 366000
1993 365000
1994 365000
1995 365000
1996 366000
1997 365000
1998 365000
1999 365000
2000 366000
2001 365000
2002 365000
2003 365000
2004 366000
2005 365000
2006 365000
2007 365000
2008 366000
2009 365000
2010 365000
2011 365000
2012 346000
28 rows selected.
My moderately sized table is interval-partitioned by year on ORDER_DATE and currently has 28 partitions (because it is interval-partitioned, PARTITION_COUNT in USER_PART_TABLES shows the possible maximum of 1048575 instead of the actual count). Now imagine we want the summarized AMOUNT_SOLD of the year 2011. What about this statement?
SQL> set timing on
SQL> select sum(amount_sold) from sales_year where to_char(order_date,'yyyy')='2011';
SUM(AMOUNT_SOLD)
----------------
1825000000
Elapsed: 00:00:05.15
SQL> select plan_table_output from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
SQL_ID cv54q4mt7ajjr, child number 0
-------------------------------------
select sum(amount_sold) from sales_year where to_char(order_date,'yyyy')='2011'
Plan hash value: 3345868052
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 24384 (100)| | | |
| 1 | SORT AGGREGATE | | 1 | 22 | | | | |
| 2 | PARTITION RANGE ALL| | 287K| 6181K| 24384 (2)| 00:00:07 | 1 |1048575|
|* 3 | TABLE ACCESS FULL | SALES_YEAR | 287K| 6181K| 24384 (2)| 00:00:07 | 1 |1048575|
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(TO_CHAR(INTERNAL_FUNCTION("ORDER_DATE"),'yyyy')='2011')
It produces the required result, but it uses a Full Table Scan across all partitions. Much better instead:
SQL> select sum(amount_sold) from sales_year where order_date between to_date('01.01.2011','dd.mm.yyyy') and to_date('31.12.2011','dd.mm.yyyy');
SUM(AMOUNT_SOLD)
----------------
1825000000
Elapsed: 00:00:00.11
SQL> select plan_table_output from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
SQL_ID 6rwm3z7rhgmd6, child number 0
-------------------------------------
select sum(amount_sold) from sales_year where order_date between to_date('01.01.2011','dd.mm.yyyy') and to_date('31.12.2011','dd.mm.yyyy')
Plan hash value: 767904852
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1033 (100)| | | |
| 1 | SORT AGGREGATE | | 1 | 22 | | | | |
| 2 | PARTITION RANGE SINGLE| | 378K| 8128K| 1033 (16)| 00:00:01 | 27 | 27 |
|* 3 | TABLE ACCESS FULL | SALES_YEAR | 378K| 8128K| 1033 (16)| 00:00:01 | 27 | 27 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(("ORDER_DATE">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "ORDER_DATE"<=TO_DATE(' 2011-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
The same result but much faster, scanning only one partition!
Conclusion: It is quite important not to put functions around the partition key in the WHERE clause here. Personally, the first SQL looks easier to me and needs less coding, but it is obviously not as good as the second. It might be worth spending some time thinking and adding a few more characters to the code to make Partition-Pruning possible. Don’t believe it, test it! With some big enough tables, I mean 🙂
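By the way, a half-open date range is a variant worth considering here. The following is just a sketch against the same SALES_YEAR table: it also keeps the partition key free of functions, so pruning still works, and unlike BETWEEN with '31.12.2011' it would also catch rows carrying a time component later on December 31:

select sum(amount_sold)
from sales_year
where order_date >= date '2011-01-01'
and order_date < date '2012-01-01';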
Materialized Views & Partition Change Tracking
During the 11g Data Warehouse Administration course that I delivered this week in Munich, I showed this demonstration about Partition Change Tracking (PCT) that I’d like to share with the Oracle Community. You may have a look here for an introduction into Materialized Views if that topic is new for you. Often, the base tables of Materialized Views are partitioned, which gives us additional options for the Refresh as well as for the Rewrite:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> grant dba to adam identified by adam;
Grant succeeded.
SQL> connect adam/adam
Connected.
SQL> create table sales
(product varchar2(50),
channel_id number,
cust_id number,
amount_sold number(10,2),
time_id date)
partition by list (channel_id)
(partition c0 values (0),
partition c1 values (1),
partition c2 values (2),
partition c3 values (3),
partition c4 values (4)
);
Table created.
SQL> alter table sales nologging;
Table altered.
SQL> insert /*+ append */ into sales
select
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2010' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=1e6;
1000000 rows created.
SQL> commit;
Commit complete.
The above was my demo setup, done with 11.2.0.1. Most of what is shown should work with lower versions as well, though. We will now see a Materialized View that contains the partition key of the base table, which makes it easy for the system to associate the rows of the Materialized View with the partitions. Otherwise, we would need to establish that association with a Partition Marker.
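Just as a sketch (the MV name MV_PMARKER is made up and not part of the demo): if the MV did not contain the partition key, DBMS_MVIEW.PMARKER could provide that association instead:

create materialized view mv_pmarker
enable query rewrite as
select dbms_mview.pmarker(s.rowid) as pmark,
       sum(s.amount_sold) as sum_sold
from sales s
group by dbms_mview.pmarker(s.rowid);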
SQL> create materialized view mv_pct enable query rewrite as
select channel_id,sum(amount_sold) from sales group by channel_id;
Materialized view created.
The MV is now fresh and may get used for Query Rewrite:
SQL> select mview_name,last_refresh_type,staleness from user_mviews;
MVIEW_NAME LAST_REF STALENESS
------------------------------ -------- -------------------
MV_PCT COMPLETE FRESH
SQL> set autotrace on explain
SQL> select channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
0 1000000000
1 1000000000
2 1000000000
3 1000000000
4 1000000000
Execution Plan
----------------------------------------------------------
Plan hash value: 470332451
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 130 | 3 (0)| 00:00:01 |
| 1 | MAT_VIEW REWRITE ACCESS FULL| MV_PCT | 5 | 130 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Now we modify one row:
SQL> set autotrace off
SQL> update sales set amount_sold=1 where rownum<2;
1 row updated.
SQL> commit;
Commit complete.
This makes the MV stale, right? Well, actually not all of it:
SQL> select mview_name,last_refresh_type,staleness from user_mviews;
MVIEW_NAME LAST_REF STALENESS
------------------------------ -------- -------------------
MV_PCT COMPLETE NEEDS_COMPILE
SQL> select detail_partition_name, freshness from user_mview_detail_partition;
DETAIL_PARTITION_NAME FRESH
------------------------------ -----
C4 FRESH
C3 FRESH
C2 FRESH
C1 FRESH
C0 STALE
The second dictionary view, USER_MVIEW_DETAIL_PARTITION, is new in 11g; the behavior shown by the next query is not. Only partition C0 is stale, which is why queries that do not need channel_id 0 can still use the MV:
SQL> set autotrace on explain
SQL> select sum(amount_sold) from sales where channel_id=1;
SUM(AMOUNT_SOLD)
----------------
1000000000
Execution Plan
----------------------------------------------------------
Plan hash value: 1805549181
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 26 | | |
|* 2 | MAT_VIEW REWRITE ACCESS FULL| MV_PCT | 1 | 26 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("MV_PCT"."CHANNEL_ID"=1)
Note
-----
- dynamic sampling used for this statement (level=2)
The Rewrite is still possible although part of the MV is stale. We call that PCT Rewrite. Furthermore, we can now do a PCT Fast Refresh, although there is no MV Log on the base table SALES. The PCT Refresh does not need to scan the whole base table but only the stale partitions, which is in this case about 5 times faster than a Complete Refresh, because we have 5 equally sized partitions. The refresh method value 'P' is new in 11g, but PCT refresh also works with 'F' or '?' in earlier versions.
SQL> set autotrace off
SQL> exec dbms_mview.refresh('MV_PCT','P')
PL/SQL procedure successfully completed.
SQL> select mview_name,last_refresh_type,staleness from user_mviews;
MVIEW_NAME LAST_REF STALENESS
------------------------------ -------- -------------------
MV_PCT FAST_PCT FRESH
Conclusion: Partitioning delivers some extra benefits when used together with Materialized Views. In particular, Rewrite & Refresh can then be done on the partition level. As always: "Don’t believe it, test it!" 🙂
CELL_PARTITION_LARGE_EXTENTS now obsolete
During the Exadata course that I am currently delivering in Munich, I noticed that the fairly new parameter CELL_PARTITION_LARGE_EXTENTS is already obsolete:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL> alter system set cell_partition_large_extents=true;
alter system set cell_partition_large_extents=true
*
ERROR at line 1:
ORA-25138: CELL_PARTITION_LARGE_EXTENTS initialization parameter has been made obsolete
This parameter was introduced in 11.2.0.1 especially for the Exadata Database Machine, because the recommended Allocation Unit Size (AU_SIZE) for diskgroups built upon Exadata Cells is 4 MB. Large segments should therefore use a multiple of 4 MB already for their initial extents. Although the parameter has been made obsolete, the effect that was achievable with it is still present:
SQL> create table t (n number) partition by range (n) (partition p1 values less than (2));
Table created.
SQL> insert into t values (1);
1 row created.
SQL> select bytes/1024/1024 as mb from user_extents where segment_name='T';
MB
----------
8
I inserted a row before checking USER_EXTENTS because of the 11g new feature Deferred Segment Creation:
SQL> drop table t purge;
Table dropped.
SQL> create table t (n number);
Table created.
SQL> select bytes/1024/1024 as mb from user_extents where segment_name='T';
no rows selected
SQL> insert into t values (1);
1 row created.
SQL> select bytes/1024/1024 as mb from user_extents where segment_name='T';
MB
----------
.0625
Notice that only partitioned tables are affected by the 8 MB initial extent behavior. The new hidden parameter _PARTITION_LARGE_EXTENTS (defaults to true!) is now responsible for that:
SQL> alter session set "_partition_large_extents"=false;
Session altered.
SQL> drop table t purge;
Table dropped.
SQL> create table t (n number) partition by range (n) (partition p1 values less than (2));
Table created.
SQL> insert into t values (1);
1 row created.
SQL> select bytes/1024/1024 as mb from user_extents where segment_name='T';
MB
----------
.0625
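If you want to check the current value of such a hidden parameter, a commonly used sketch is the following query against the undocumented X$ tables (run as SYS and at your own risk; column and table names are not part of the supported API):

select i.ksppinm as name, v.ksppstvl as value
from x$ksppi i, x$ksppcv v
where i.indx = v.indx
and i.ksppinm = '_partition_large_extents';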
Notice that the setting of CELL_PARTITION_LARGE_EXTENTS with alter session is silently overridden by the underscore parameter:
SQL> drop table t purge;
Table dropped.
SQL> alter session set cell_partition_large_extents=true;
Session altered.
SQL> create table t (n number) partition by range (n) (partition p1 values less than (2));
Table created.
SQL> insert into t values (1);
1 row created.
SQL> select bytes/1024/1024 as mb from user_extents where segment_name='T';
MB
----------
.0625
The underscore parameter was still set to false here.
SQL> drop table t purge;
Table dropped.
SQL> alter session set "_partition_large_extents"=true;
Session altered.
SQL> alter session set cell_partition_large_extents=false;
Session altered.
SQL> create table t (n number) partition by range (n) (partition p1 values less than (2));
Table created.
SQL> insert into t values (1);
1 row created.
SQL> select bytes/1024/1024 as mb from user_extents where segment_name='T';
MB
----------
8
Conclusion: With 11.2.0.2, partitioned tables get initial extents of 8 MB in size, which is particularly useful in Exadata environments where the ASM AU_SIZE will be 4 MB. But ordinary databases are affected as well – which is probably a good thing if we assume that partitioned tables will be large anyway and will therefore benefit from a large initial extent size, too.
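For small partitioned tables where 8 MB per partition would be wasteful, an explicit STORAGE clause should take precedence over the default. Just a sketch (the table name T_SMALL is made up; worth verifying on your own system):

create table t_small (n number)
partition by range (n)
(partition p1 values less than (2) storage (initial 64k));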
Addendum: During my present Exadata course (03-JUN-2012), I saw a similar parameter for partitioned indexes: _INDEX_PARTITION_LARGE_EXTENTS, which defaults to FALSE, though. A brief test:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL> create table parti (n number) partition by range (n) (partition p1 values less than (2));
Table created.
SQL> insert into parti values (1);
1 row created.
SQL> commit;
Commit complete.
SQL> create index parti_idx on parti(n) local;
Index created.
SQL> select bytes from user_extents where segment_name='PARTI_IDX';
BYTES
----------
65536
SQL> drop index parti_idx;
Index dropped.
SQL> alter session set "_index_partition_large_extents"=true;
Session altered.
SQL> create index parti_idx on parti(n) local;
Index created.
SQL> select bytes from user_extents where segment_name='PARTI_IDX';
BYTES
----------
8388608
So this parameter also gives us 8 MB sized extents for partitioned indexes, but not by default.
