This is about how to write SQL in a way that supports Partition-Pruning – and what should be avoided. The playing field looks as follows:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL> select table_name,partitioning_type,partition_count from user_part_tables where table_name='SALES_YEAR';
TABLE_NAME PARTITION PARTITION_COUNT
------------------------------ --------- ---------------
SALES_YEAR RANGE 1048575
SQL> select segment_name,partition_name,sum(bytes)/1024/1024 as mb
from user_segments where segment_name='SALES_YEAR'
group by rollup (segment_name,partition_name)
order by 1,2;
SEGMENT_NAME PARTITION_NAME MB
------------------------------ ------------------------------ ----------
SALES_YEAR P1 16
SALES_YEAR SYS_P181 32
SALES_YEAR SYS_P182 32
SALES_YEAR SYS_P183 32
SALES_YEAR SYS_P184 32
SALES_YEAR SYS_P185 32
SALES_YEAR SYS_P186 32
SALES_YEAR SYS_P187 32
SALES_YEAR SYS_P188 32
SALES_YEAR SYS_P189 32
SALES_YEAR SYS_P190 32
SALES_YEAR SYS_P191 32
SALES_YEAR SYS_P192 32
SALES_YEAR SYS_P193 32
SALES_YEAR SYS_P194 32
SALES_YEAR SYS_P195 32
SALES_YEAR SYS_P196 32
SALES_YEAR SYS_P197 32
SALES_YEAR SYS_P198 32
SALES_YEAR SYS_P199 32
SALES_YEAR SYS_P200 32
SALES_YEAR SYS_P201 32
SALES_YEAR SYS_P202 32
SALES_YEAR SYS_P203 32
SALES_YEAR SYS_P204 32
SALES_YEAR SYS_P205 32
SALES_YEAR SYS_P206 32
SALES_YEAR SYS_P207 24
SALES_YEAR 872
872
30 rows selected.
SQL> select to_char(order_date,'yyyy'),count(*) from sales_year group by to_char(order_date,'yyyy') order by 1;
TO_C COUNT(*)
---- ----------
1985 158000
1986 365000
1987 365000
1988 366000
1989 365000
1990 365000
1991 365000
1992 366000
1993 365000
1994 365000
1995 365000
1996 366000
1997 365000
1998 365000
1999 365000
2000 366000
2001 365000
2002 365000
2003 365000
2004 366000
2005 365000
2006 365000
2007 365000
2008 366000
2009 365000
2010 365000
2011 365000
2012 346000
28 rows selected.
My moderately sized table is Interval partitioned (therefore PARTITION_COUNT in USER_PART_TABLES shows the possible maximum number) by the year on ORDER_DATE with 28 partitions. Now imagine we want to have the summarized AMOUNT_SOLD of the year 2011. What about this statement?
SQL> set timing on SQL> select sum(amount_sold) from sales_year where to_char(order_date,'yyyy')='2011'; SUM(AMOUNT_SOLD) ---------------- 1825000000 Elapsed: 00:00:05.15 SQL> select plan_table_output from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------------------------- SQL_ID cv54q4mt7ajjr, child number 0 ------------------------------------- select sum(amount_sold) from sales_year where to_char(order_date,'yyyy')='2011' Plan hash value: 3345868052 --------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | --------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 24384 (100)| | | | | 1 | SORT AGGREGATE | | 1 | 22 | | | | | | 2 | PARTITION RANGE ALL| | 287K| 6181K| 24384 (2)| 00:00:07 | 1 |1048575| |* 3 | TABLE ACCESS FULL | SALES_YEAR | 287K| 6181K| 24384 (2)| 00:00:07 | 1 |1048575| --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter(TO_CHAR(INTERNAL_FUNCTION("ORDER_DATE"),'yyyy')='2011')
It produces the required result, but using a Full Table Scan across all partitions. Very much better instead:
SQL> select sum(amount_sold) from sales_year where order_date between to_date('01.01.2011','dd.mm.yyyy') and to_date('31.12.2011','dd.mm.yyyy'); SUM(AMOUNT_SOLD) ---------------- 1825000000 Elapsed: 00:00:00.11 SQL> select plan_table_output from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------ SQL_ID 6rwm3z7rhgmd6, child number 0 ------------------------------------- select sum(amount_sold) from sales_year where order_date between to_date('01.01.2011','dd.mm.yyyy') and to_date('31.12.2011','dd.mm.yyyy') Plan hash value: 767904852 ------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 1033 (100)| | | | | 1 | SORT AGGREGATE | | 1 | 22 | | | | | | 2 | PARTITION RANGE SINGLE| | 378K| 8128K| 1033 (16)| 00:00:01 | 27 | 27 | |* 3 | TABLE ACCESS FULL | SALES_YEAR | 378K| 8128K| 1033 (16)| 00:00:01 | 27 | 27 | ------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter(("ORDER_DATE">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "ORDER_DATE"<=TO_DATE(' 2011-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
The same result but much faster, scanning only one partition!
Conclusion: It is quite important to have no functions around the partition key in the WHERE-clause here. Personally, the first SQL looks easier to me and has less coding, but it is obviously not as good as the second. Might be worth to spend some time thinking and adding some more characters to the code to make Partition-Pruning possible. Don’t believe it, test it! With some big enough tables, I mean 🙂
#1 von Matthias Rogel am Dezember 13, 2012 - 18:15
Hi Uwe,
date literals are useful, why does nobody uses them ?
select sum(amount_sold) from sales_year where order_date between
date’2011-01-01′ and date’2011-12-31′
looks easier than your first *and* your second sql ( and results in partition prunning ), no need to use a type-conversion at all !
Regards
Matthias
#2 von Uwe Hesse am Dezember 13, 2012 - 18:43
I am sorry, Matthias, but your code above fails. There is no such function like date’2011-01-01′ apparently. Should you mean the mere literal ’2011-12-31′ ? This would require implicit type conversion and is very unreliable here, because how is the engine supposed to know what is month and what is day? Probably I just miss your point…
#3 von Christian Antognini am Dezember 13, 2012 - 19:59
Hi Uwe
Matthias is referring to Datetime Literals, i.e. http://docs.oracle.com/cd/E14072_01/server.112/e10592/sql_elements003.htm#BABGIGCJ .
Best,
Chris
#4 von Matthias Rogel am Dezember 13, 2012 - 23:07
Hi Uwe,
exactly,
sokrates@11.2 > select date’2012-12-13′ from dual;
DATE’2012
———
13-DEC-12
1 row selected.
sokrates@11.2 > select timestamp’2012-12-13 22:01:00′ „now“ from dual;
now
————————————————————————–
13-DEC-12 10.01.00.000000000 PM
1 row selected.
work quite good for me.
however, when it comes to partition pruning, you have to be very careful sometimes.
See
sokrates@11.2 > create table hourly_partitioned(d)
2 partition by range(d)
3 interval(interval ‚1‘ hour)
4 (partition p0 values less than (timestamp’2012-12-13 22:00:00′))
5 as
6 select sysdate+level/86400 from dual connect by level select count(*) from user_tab_partitions where table_name=’HOURLY_PARTITIONED‘;
COUNT(*)
———-
29
1 row selected.
sokrates@11.2 > exec dbms_stats.gather_table_stats(null, ‚hourly_partitioned‘)
PL/SQL procedure successfully completed.
sokrates@11.2 > set autotr traceonly explain
sokrates@11.2 > REM now compare the plans of the following 4 selects:
sokrates@11.2 > select * from hourly_partitioned
2 where d between timestamp’2012-12-14 01:00:00′ and timestamp’2012-12-14 01:30:00′;
Execution Plan
———————————————————-
Plan hash value: 3527284174
———————————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————————-
| 0 | SELECT STATEMENT | | 250 | 2000 | 38 (8)| 00:00:01 | | |
| 1 | PARTITION RANGE ALL| | 250 | 2000 | 38 (8)| 00:00:01 | 1 |1048575|
|* 2 | TABLE ACCESS FULL | HOURLY_PARTITIONED | 250 | 2000 | 38 (8)| 00:00:01 | 1 |1048575|
———————————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
2 – filter(INTERNAL_FUNCTION(„D“)>=TIMESTAMP‘ 2012-12-14 01:00:00.000000000′ AND
INTERNAL_FUNCTION(„D“) select * from hourly_partitioned
2 where d between cast(timestamp’2012-12-14 01:00:00′ as date) and cast(timestamp’2012-12-14 01:30:00′ as date);
Execution Plan
———————————————————-
Plan hash value: 2219087052
—————————————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
—————————————————————————————————————-
| 0 | SELECT STATEMENT | | 1802 | 14416 | 40 (13)| 00:00:01 | | |
|* 1 | FILTER | | | | | | | |
| 2 | PARTITION RANGE ITERATOR| | 1802 | 14416 | 40 (13)| 00:00:01 | KEY | KEY |
|* 3 | TABLE ACCESS FULL | HOURLY_PARTITIONED | 1802 | 14416 | 40 (13)| 00:00:01 | KEY | KEY |
—————————————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
1 – filter(CAST(TIMESTAMP‘ 2012-12-14 01:00:00.000000000′ AS date)<=CAST(TIMESTAMP' 2012-12-14
01:30:00.000000000' AS date))
3 – filter("D"=CAST(TIMESTAMP‘
2012-12-14 01:00:00.000000000′ AS date))
REM pruning at runtime
sokrates@11.2 > select * from hourly_partitioned
2 where d between date’2012-12-14′ + 1/24 and date’2012-12-14′ + (1+30/60)/24;
Execution Plan
———————————————————-
Plan hash value: 281903001
————————————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
————————————————————————————————————-
| 0 | SELECT STATEMENT | | 1801 | 14408 | 3 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1801 | 14408 | 3 (0)| 00:00:01 | 5 | 5 |
|* 2 | TABLE ACCESS FULL | HOURLY_PARTITIONED | 1801 | 14408 | 3 (0)| 00:00:01 | 5 | 5 |
————————————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
2 – filter(„D“=TO_DATE(‚
2012-12-14 01:00:00′, ’syyyy-mm-dd hh24:mi:ss‘))
REM pruning at compiletime
sokrates@11.2 > select * from hourly_partitioned
2 where d between to_date(‚2012-12-14 01:00:00‘, ‚yyyy-mm-dd HH24:MI:SS‘) and to_date(‚2012-12-14 01:30:00‘, ‚yyyy-mm-dd HH24:MI:SS‘);
Execution Plan
———————————————————-
Plan hash value: 281903001
————————————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
————————————————————————————————————-
| 0 | SELECT STATEMENT | | 1801 | 14408 | 3 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1801 | 14408 | 3 (0)| 00:00:01 | 5 | 5 |
|* 2 | TABLE ACCESS FULL | HOURLY_PARTITIONED | 1801 | 14408 | 3 (0)| 00:00:01 | 5 | 5 |
————————————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
2 – filter(„D“=TO_DATE(‚
2012-12-14 01:00:00′, ’syyyy-mm-dd hh24:mi:ss‘))
REM pruning at compiletime
#5 von Uwe Hesse am Dezember 14, 2012 - 09:06
Chris, thank you for showing the right place in the documentation where date literals can be found. I was looking into the SQL Reference http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions.htm#i1482196 and did not see any function named ‚date‘ there
#6 von Uwe Hesse am Dezember 14, 2012 - 09:20
Matthias,
Thank you for your comment that brought Datetime Literals to my attention!
When I briefly checked your suggestion yesterday, I did something like
SQL> create table t (d date);
Table created.
SQL> insert into t values (to_date(‚13.12.2012′,’dd.mm.yyyy‘));
1 row created.
SQL> commit;
Commit complete.
SQL> select * from t where d=date’13.12.2012′;
select * from t where d=date’13.12.2012′
*
ERROR at line 1:
ORA-01861: literal does not match format string
That was of course my mistake, because according to the documentation: „The ANSI date literal contains no time portion, and must be specified in the format ‚YYYY-MM-DD‘.“
Which also explains how the engine can tell which is day and which is month: There is just no alternative.
Probably the answer to your question „date literals are useful, why does nobody uses them ?“
is: Almost nobody knows about them – at least I didn’t.
This also taught me another thing: It is not a good idea to answer (non-trivial) comments at the airport, waiting to get boarded 🙂
#7 von Matthias Rogel am Dezember 14, 2012 - 10:12
Uwe,
‚….Probably the answer to your question “date literals are useful, why does nobody uses them ?”
is: Almost nobody knows about them – at least I didn’t. …‘
True, I found them only recently, though they are available since Version 9 and part of Ansi92.
(I think)
I find them extremely useful, cause they can decrease the headaches about datatype conversions in very many cases.
However, it is a bit annoying, that a date-literal
date’2012-12-14′
does not support a time-portion, so you can express the date – take the example
to_date(‚2012-12-14 09:01:53‘, ‚yyyy-mm-dd HH24:MI:SS‘)
–
with literals only with either
(a) date’2012-12-14′ + (9 + ((1 + 53 / 60) / 60)) / 24 or
(b) cast(timestamp’2012-12-14 09:01:53′ as date)
(a) is not very handy when it comes to seconds
(b) is better, but must be explicitly casted to a DATE !! (don’t forget this, it will result in partitioning pruning not to occur for example !!! )
By the way, thanks for your great side, a great resource of cool stuff, one of my favourites concerning DataGuard !
Matthias
#8 von Uwe Hesse am Dezember 14, 2012 - 10:39
Hi Matthias, yes, they look useful. But I suppose I would not use something like a) and b) because in this case, to_date or to_timestamp are not more difficult to use and at the same time more familiar to me. And thank you for the nice words about my Blog! Very much appreciated 🙂
#9 von TIROUMALAI am Dezember 14, 2012 - 11:35
Uwe,
Through this article, I learnt on partitioning; your approach is really nice and well explained.
Thanking you.
Tirou
#10 von Matthias Rogel am Dezember 17, 2012 - 17:34
select date’2012-12-14′ + interval ‚9:1:53‘ hour to second a from dual;
#11 von Uwe Hesse am Dezember 19, 2012 - 09:33
TIROUMALAI, you’re welcome 🙂
Matthias, yes, I like that better than the other format. Thank you for sharing!
#12 von masdoell am September 3, 2014 - 10:41
thanks, i also discovered this after some trial research. I usually do this to replace query which search specific date:
select * from sales_year where order_date between to_date(:inpDate,’dd.mm.yyyy‘) and to_date(:inpDate,’dd.mm.yyyy‘)+1;
do you have any suggestion to replace this solution so i dont have to use two variables for bindvar?
#13 von ctripblog am Mai 9, 2016 - 15:00
HI!, in your SQL „select sum(amount_sold) from sales_year where order_date between to_date(‚01.01.2011′,’dd.mm.yyyy‘) and to_date(‚31.12.2011′,’dd.mm.yyyy‘);“
Are the records that have dates like these: 31/12/2011 14:00:00 included? or just ignored cos it evaluates on 31/12/2011 00:00:00 ?
thanks!