Partition-Pruning: Do & Don’t

This is about how to write SQL in a way that supports Partition-Pruning – and what should be avoided. The playing field looks as follows:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL> select table_name,partitioning_type,partition_count from user_part_tables where table_name='SALES_YEAR';
TABLE_NAME PARTITION PARTITION_COUNT
------------------------------ --------- ---------------
SALES_YEAR RANGE 1048575
SQL> select segment_name,partition_name,sum(bytes)/1024/1024 as mb
from user_segments where segment_name='SALES_YEAR'
group by rollup (segment_name,partition_name)
order by 1,2;
SEGMENT_NAME PARTITION_NAME MB
------------------------------ ------------------------------ ----------
SALES_YEAR P1 16
SALES_YEAR SYS_P181 32
SALES_YEAR SYS_P182 32
SALES_YEAR SYS_P183 32
SALES_YEAR SYS_P184 32
SALES_YEAR SYS_P185 32
SALES_YEAR SYS_P186 32
SALES_YEAR SYS_P187 32
SALES_YEAR SYS_P188 32
SALES_YEAR SYS_P189 32
SALES_YEAR SYS_P190 32
SALES_YEAR SYS_P191 32
SALES_YEAR SYS_P192 32
SALES_YEAR SYS_P193 32
SALES_YEAR SYS_P194 32
SALES_YEAR SYS_P195 32
SALES_YEAR SYS_P196 32
SALES_YEAR SYS_P197 32
SALES_YEAR SYS_P198 32
SALES_YEAR SYS_P199 32
SALES_YEAR SYS_P200 32
SALES_YEAR SYS_P201 32
SALES_YEAR SYS_P202 32
SALES_YEAR SYS_P203 32
SALES_YEAR SYS_P204 32
SALES_YEAR SYS_P205 32
SALES_YEAR SYS_P206 32
SALES_YEAR SYS_P207 24
SALES_YEAR 872
872
30 rows selected.
SQL> select to_char(order_date,'yyyy'),count(*) from sales_year group by to_char(order_date,'yyyy') order by 1;
TO_C COUNT(*)
---- ----------
1985 158000
1986 365000
1987 365000
1988 366000
1989 365000
1990 365000
1991 365000
1992 366000
1993 365000
1994 365000
1995 365000
1996 366000
1997 365000
1998 365000
1999 365000
2000 366000
2001 365000
2002 365000
2003 365000
2004 366000
2005 365000
2006 365000
2007 365000
2008 366000
2009 365000
2010 365000
2011 365000
2012 346000
28 rows selected.
My moderately sized table is interval partitioned by year on ORDER_DATE (which is why PARTITION_COUNT in USER_PART_TABLES shows the possible maximum number) and currently consists of 28 partitions. Now imagine we want the summarized AMOUNT_SOLD of the year 2011. What about this statement?
SQL> set timing on
SQL> select sum(amount_sold) from sales_year where to_char(order_date,'yyyy')='2011';
SUM(AMOUNT_SOLD)
----------------
      1825000000
Elapsed: 00:00:05.15
SQL> select plan_table_output from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
SQL_ID  cv54q4mt7ajjr, child number 0
-------------------------------------
select sum(amount_sold) from sales_year where to_char(order_date,'yyyy')='2011'
Plan hash value: 3345868052
---------------------------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |       |       | 24384 (100)|          |       |       |
|   1 |  SORT AGGREGATE      |            |     1 |    22 |            |          |       |       |
|   2 |   PARTITION RANGE ALL|            |   287K|  6181K| 24384   (2)| 00:00:07 |     1 |1048575|
|*  3 |    TABLE ACCESS FULL | SALES_YEAR |   287K|  6181K| 24384   (2)| 00:00:07 |     1 |1048575|
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(TO_CHAR(INTERNAL_FUNCTION("ORDER_DATE"),'yyyy')='2011')
It produces the required result, but with a Full Table Scan across all partitions. Much better instead:
SQL> select sum(amount_sold) from sales_year
     where order_date between to_date('01.01.2011','dd.mm.yyyy') and to_date('31.12.2011','dd.mm.yyyy');
SUM(AMOUNT_SOLD)
----------------
      1825000000
Elapsed: 00:00:00.11
SQL> select plan_table_output from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
SQL_ID  6rwm3z7rhgmd6, child number 0
-------------------------------------
select sum(amount_sold) from sales_year where order_date between
to_date('01.01.2011','dd.mm.yyyy') and to_date('31.12.2011','dd.mm.yyyy')
Plan hash value: 767904852
------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name       | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |            |       |       |  1033 (100)|          |       |       |
|   1 |  SORT AGGREGATE         |            |     1 |    22 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|            |   378K|  8128K|  1033  (16)| 00:00:01 |    27 |    27 |
|*  3 |    TABLE ACCESS FULL    | SALES_YEAR |   378K|  8128K|  1033  (16)| 00:00:01 |    27 |    27 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("ORDER_DATE">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
               "ORDER_DATE"<=TO_DATE(' 2011-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
The same result but much faster, scanning only one partition!
Conclusion: It is quite important not to put functions around the partition key in the WHERE clause. Personally, the first SQL looks easier to me and requires less typing, but it is obviously not as good as the second. It might be worth spending some time thinking and adding a few more characters to the code to make Partition-Pruning possible. Don’t believe it, test it! With some big enough tables, I mean 🙂
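In case you would like to reproduce the test: something similar to the table above could be created like this. Note that this is just a sketch of mine and not the original DDL; the column list and the first partition boundary are assumptions, since only ORDER_DATE and AMOUNT_SOLD are mentioned above.
-- Sketch only: column list and initial partition boundary are assumptions
create table sales_year (
  order_date  date not null,   -- partition key
  amount_sold number
)
partition by range (order_date)
interval (numtoyminterval(1,'YEAR'))   -- one new partition per year
(
  partition p1 values less than (to_date('01.01.1986','dd.mm.yyyy'))
);
After loading some test data, the Pstart/Pstop columns of the execution plan show immediately whether pruning takes place.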
Backup & Restore one Datafile in Parallel

A lesser-known 11g New Feature is the option to back up and restore a single large datafile with multiple channels in parallel, which can speed up these processes dramatically. This posting gives an example of it.
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> select file#,bytes/1024/1024 as mb from v$datafile;
FILE# MB
---------- ----------
1 300
2 200
3 179
4 2136
My demo system is on 11gR2, but the feature was already there in 11gR1 – it is easy to miss it and simply keep the old backup scripts in place as with 10g, though, where one channel could only read one datafile. bk is the same service that we have seen in a previous posting. I will now just back up & restore datafile 4 to show that this can be done with two channels:
[oracle@uhesse1 ~]$ time rman target sys/oracle@uhesse1/bk cmdfile=backup_par.rmn
Recovery Manager: Release 11.2.0.3.0 - Production on Wed Dec 12 21:20:49 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: PRIMA (DBID=2003897072)
RMAN> configure device type disk parallelism 2;
2> backup datafile 4 section size 1100m;
3>
using target database control file instead of recovery catalog
old RMAN configuration parameters:
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters:
CONFIGURE DEVICE TYPE DISK PARALLELISM 2 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters are successfully stored
Starting backup at 2012-12-12:21:20:50
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=9 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
backing up blocks 1 through 140800
channel ORA_DISK_1: starting piece 1 at 2012-12-12:21:20:51
channel ORA_DISK_2: starting full datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00004 name=/home/oracle/prima/users01.dbf
backing up blocks 140801 through 273408
channel ORA_DISK_2: starting piece 2 at 2012-12-12:21:20:51
channel ORA_DISK_2: finished piece 2 at 2012-12-12:21:21:46
piece handle=/home/oracle/flashback/PRIMA/backupset/2012_12_12/o1_mf_nnndf_TAG20121212T212051_8dkss3kr_.bkp tag=TAG20121212T212051 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:55
channel ORA_DISK_1: finished piece 1 at 2012-12-12:21:22:06
piece handle=/home/oracle/flashback/PRIMA/backupset/2012_12_12/o1_mf_nnndf_TAG20121212T212051_8dkss3bm_.bkp tag=TAG20121212T212051 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:15
Finished backup at 2012-12-12:21:22:06
Recovery Manager complete.
real 1m17.681s
user 0m1.356s
sys 0m0.129s
The script backup_par.rmn contains these lines:
[oracle@uhesse1 ~]$ cat backup_par.rmn
configure device type disk parallelism 2;
backup datafile 4 section size 1100m;
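If you would like to double-check afterwards that the backup really went into a multisection backup set, the controlfile views can be queried. Just a small sketch of mine; the exact figures will of course look different on your system:
-- MULTI_SECTION=YES marks a multisection backup set; SECTION_SIZE is given in blocks
select bs.recid, bs.multi_section, bd.file#, bd.section_size
from   v$backup_set bs
join   v$backup_datafile bd
on     bd.set_stamp = bs.set_stamp and bd.set_count = bs.set_count
where  bd.file# = 4;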
As the RMAN output above shows, the two channels were running in parallel, each taking about one minute to back up its section into a separate backup piece. The restore of a single datafile can now also be done in parallel:
[oracle@uhesse1 ~]$ time rman target sys/oracle@uhesse1/bk cmdfile=restore_par.rmn
Recovery Manager: Release 11.2.0.3.0 - Production on Wed Dec 12 21:23:28 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: PRIMA (DBID=2003897072)
RMAN> configure device type disk parallelism 2;
2> sql "alter database datafile 4 offline";
3> restore datafile 4;
4> recover datafile 4;
5> sql "alter database datafile 4 online";
6>
7>
using target database control file instead of recovery catalog
old RMAN configuration parameters:
CONFIGURE DEVICE TYPE DISK PARALLELISM 2 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters:
CONFIGURE DEVICE TYPE DISK PARALLELISM 2 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters are successfully stored
sql statement: alter database datafile 4 offline
Starting restore at 2012-12-12:21:23:30
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=9 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=24 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to /home/oracle/prima/users01.dbf
channel ORA_DISK_1: restoring section 1 of 2
channel ORA_DISK_1: reading from backup piece /home/oracle/flashback/PRIMA/backupset/2012_12_12/o1_mf_nnndf_TAG20121212T212051_8dkss3bm_.bkp
channel ORA_DISK_2: starting datafile backup set restore
channel ORA_DISK_2: specifying datafile(s) to restore from backup set
channel ORA_DISK_2: restoring datafile 00004 to /home/oracle/prima/users01.dbf
channel ORA_DISK_2: restoring section 2 of 2
channel ORA_DISK_2: reading from backup piece /home/oracle/flashback/PRIMA/backupset/2012_12_12/o1_mf_nnndf_TAG20121212T212051_8dkss3kr_.bkp
channel ORA_DISK_2: piece handle=/home/oracle/flashback/PRIMA/backupset/2012_12_12/o1_mf_nnndf_TAG20121212T212051_8dkss3kr_.bkp tag=TAG20121212T212051
channel ORA_DISK_2: restored backup piece 2
channel ORA_DISK_2: restore complete, elapsed time: 00:02:05
channel ORA_DISK_1: piece handle=/home/oracle/flashback/PRIMA/backupset/2012_12_12/o1_mf_nnndf_TAG20121212T212051_8dkss3bm_.bkp tag=TAG20121212T212051
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:02:16
Finished restore at 2012-12-12:21:25:46
Starting recover at 2012-12-12:21:25:46
using channel ORA_DISK_1
using channel ORA_DISK_2
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 2012-12-12:21:25:48
sql statement: alter database datafile 4 online
Recovery Manager complete.
real 2m20.137s
user 0m1.229s
sys 0m0.187s
This is the script I have used for the restore:
[oracle@uhesse1 ~]$ cat restore_par.rmn
configure device type disk parallelism 2;
sql "alter database datafile 4 offline";
restore datafile 4;
recover datafile 4;
sql "alter database datafile 4 online";
Conclusion: Multisection backup & restore can be very useful for the processing of large (bigfile) datafiles with multiple channels in parallel. If you have not done it yet, you should definitely give it a try! As always: Don’t believe it, test it 🙂
Addendum: With 12c, this feature was enhanced to also support image copies.
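A sketch of how that could look on 12c follows; I have not run this on the 11g demo system above, so take it as an assumption rather than a tested example:
# 12c: multisection backup of a single large datafile as image copy (sketch, not tested above)
configure device type disk parallelism 2;
backup as copy datafile 4 section size 1100m;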
Impressions from the #UKOUG2012
I’d like to share my personal impressions of this year’s UK Oracle User Group Conference. They are of course purely subjective and by no means comprehensive. First things first: It was a great event, in my opinion! Fantastic speakers, including many Oracle celebrities, a very nice location and great service by the UKOUG. I especially liked the opportunity to meet so many guys in person that I had previously communicated with via social media 🙂
Every single presentation I attended was worth it, to say the least. My goal in the following list of presentations is to highlight just one thing about each that I think is worth memorizing, although it was of course not the only important thing in that talk.
On Monday, I saw Julian Dyke with 'Is RAT Worth Catching?' Since I have taught Real Application Testing in many 11g New Features & Performance Tuning courses, I was curious what he had to say about it. Although he did not appear very enthusiastic about it – the price is of course a concern here – a key point was: Database Replay is indeed good for capture & replay of realistic production workload and will show the impact of a change pretty accurately, measured in DB Time.
Next was Alex Gorbachev with 'Oracle ASM New Features'. It was about 12c, so everything here is under Oracle's safe harbor statement:
'All information provided outlines our general product direction. It's intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code or functionality, and should not be relied upon in making a purchasing decision. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.' Having said that, one major thing I found interesting was the potential upcoming of clusters where not every node needs an ASM instance running, thereby reducing the need for ASM instance communication even in large clusters with very many nodes.
Another highlight was Christian Antognini with 'Shareable Cursors'. I especially liked his many live demonstrations. Very instructive! One takeaway: Starting with 11.2.0.3, a parent cursor that has more than 100 child cursors is obsoleted, which is controlled by _cursor_obsolete_threshold.
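A quick way to see how close the cursors on your own system get to that threshold could be a query like the following – just a sketch of mine that counts child cursors per parent cursor:
-- Parent cursors with many child cursors; values approaching 100 would be
-- candidates for obsoletion with the default _cursor_obsolete_threshold
select sql_id, count(*) as child_cursors
from   v$sql
group  by sql_id
having count(*) > 50
order  by child_cursors desc;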
On Tuesday I saw Emre Baransel with 'Oracle Data Guard Deep Dive'. I knew the information here before, but I wanted to see him present it after we had chatted so often via Twitter already 🙂 One of many things to memorize: There is no point in using ARCH redo transport any more for Maximum Performance, because ASYNC also has no performance impact but delivers far better protection than ARCH.
Next came Joel Goodman with 'Global Resource Management in RAC'. One key point out of many important things my colleague said: The resource mastership of an Oracle block is determined by an internal algorithm, but blocks can be dynamically remastered at the segment level, so that another node becomes the master of a certain segment when blocks from that segment are frequently requested by that node.
James Morle spoke about 'Building a Winning Oracle Database Architecture'. One remarkable point in my opinion: It is probably much less painful to follow the mainstream (regarding Oracle database architecture), even if that is boring and uncool. Better to be boring & reliable than cool & chaotic – now doesn't that pretty much reflect the image of a DBA? 🙂
Larry Carpenter was then talking about 'Oracle Data Guard Zero Data Loss Protection at Any Distance'. Again, that is 12c stuff and falls under the same safe harbor disclaimer as above. Larry is one of the very few people whose Oracle book I have on my shelf, and probably the best source of knowledge about Data Guard you can find. It looks like there will be something called 'Far Sync' standby databases in the future that receive redo in SYNC mode and cascade it with ASYNC (not with archive shipping) to potentially extremely distant remote standby databases.
Then came Tom Kyte with 'What's New in Security in the Latest Generation of Database Technology'. Again, see the safe harbor disclaimer above, as it is 12c. One cool thing here: We will probably get a standard procedure that analyzes the privileges a given user actually uses, making it easy to avoid granting too powerful or too many privileges.
On Wednesday, I attended Dan Norris with 'Exadata X3: The Fourth Generation'. Although I knew that presentation, I was curious to see him in person, after we had had some conversation about Exadata before. One key point here: The Write-Back Cache stores writes mirrored across cells and persistent across cell reboots. I mention that here because I have seen some misconceptions about exactly this in the Exadata community.
Then came Andy Colvin with 'Exadata Zero Downtime Migration'. It turned out that it was not exactly zero, but about 2 minutes of downtime, because they did it with a Data Guard switchover. Particularly striking in my opinion: Even with primary databases of multi-TB size, duplicate from active database can reasonably be done – it may take days to complete, but who cares, as long as there is no downtime on the primary during that time 🙂
Finally, I saw Tanel Poder with an amazing presentation about 'Troubleshooting the Most Complex Performance Issues I've Seen'. Very instructive indeed! One thing to memorize out of many: If a wait event is not properly instrumented by Oracle development, it may falsely show up as CPU time. You could then drill down with pstack to reveal the root cause.
By the way, my own presentation was about Active Data Guard – a great way to get additional benefits from your DR solution. Key points can be found here, here and here 🙂

