Posts Tagged Data Warehouse
Materialized Views & Partition Change Tracking
During the 11g Data Warehouse Administration course that I delivered this week in Munich, I have shown this demonstration about Partition Change Tracking (PCT) that I’d like to share with the Oracle Community. You may have a look here for an Introduction into Materialized Views if that topic is new for you. Often, the Base Tables of Materialized Views are partitioned, which gives use additional options for the Refresh as well as for the Rewrite:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> grant dba to adam identified by adam;
Grant succeeded.
SQL> connect adam/adam
Connected.
SQL> create table sales
(product varchar2(50),
channel_id number,
cust_id number,
amount_sold number(10,2),
time_id date)
partition by list (channel_id)
(partition c0 values (0),
partition c1 values (1),
partition c2 values (2),
partition c3 values (3),
partition c4 values (4)
);
Table created.
SQL> alter table sales nologging;
Table altered.
SQL> insert /*+ append */ into sales
select
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2010' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=1e6;
1000000 rows created.
SQL> commit;
Commit complete.
The above was my demo setup, done with 11.2.0.1 Most of the shown should work with lower versions also, though. We will now see a Materialized View that contains the Partition Key of the Base Table, which makes it easily possible for the system to associate the rows of the Materialized View with the Partitions. Otherwise, we would need to explain that with a Partition Marker.
SQL> create materialized view mv_pct enable query rewrite as select channel_id,sum(amount_sold) from sales group by channel_id; Materialized view created.
The MV is now fresh and may get used for Query Rewrite:
SQL> select mview_name,last_refresh_type,staleness from user_mviews; MVIEW_NAME LAST_REF STALENESS ------------------------------ -------- ------------------- MV_PCT COMPLETE FRESH SQL> set autotrace on explain SQL> select channel_id,sum(amount_sold) from sales group by channel_id; CHANNEL_ID SUM(AMOUNT_SOLD) ---------- ---------------- 0 1000000000 1 1000000000 2 1000000000 3 1000000000 4 1000000000 Execution Plan ---------------------------------------------------------- Plan hash value: 470332451 --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 130 | 3 (0)| 00:00:01 | | 1 | MAT_VIEW REWRITE ACCESS FULL| MV_PCT | 5 | 130 | 3 (0)| 00:00:01 | --------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2)
Now we modify one row:
SQL> set autotrace off SQL> update sales set amount_sold=1 where rownum<2; 1 row updated. SQL> commit; Commit complete.:
This makes the MV stale right? Well actually not all of it:
SQL> select mview_name,last_refresh_type,staleness from user_mviews; MVIEW_NAME LAST_REF STALENESS ------------------------------ -------- ------------------- MV_PCT COMPLETE NEEDS_COMPILE SQL> select detail_partition_name, freshness from user_mview_detail_partition; DETAIL_PARTITION_NAME FRESH ------------------------------ ----- C4 FRESH C3 FRESH C2 FRESH C1 FRESH C0 STALE
The second Dictionary View is new in 11g, not the behavior of the next query. Only Partition C0 is stale, which is why queries that do not need channel_id 0 can still use the MV:
SQL> set autotrace on explain
SQL> select sum(amount_sold) from sales where channel_id=1;
SUM(AMOUNT_SOLD)
----------------
1000000000
Execution Plan
----------------------------------------------------------
Plan hash value: 1805549181
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 26 | | |
|* 2 | MAT_VIEW REWRITE ACCESS FULL| MV_PCT | 1 | 26 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("MV_PCT"."CHANNEL_ID"=1)
Note
-----
- dynamic sampling used for this statement (level=2)
The Rewrite is still possible although part of the MV is stale. We call that PCT Rewrite. Furthermore, we can now do a PCT Fast Refresh, although there is no MV Log on the Base Table sales. The PCT Refresh does not need to scan the whole Base Table but only the stale partitions, which is in this case about 5 times faster than a Complete Refresh, because we have 5 equally sized partitions. The value ‘P‘ is new in 11g, but it also works with ‘F’ or ‘?’ in earlier versions.
SQL> set autotrace off
SQL> exec dbms_mview.refresh('MV_PCT','P')
PL/SQL procedure successfully completed.
SQL> select mview_name,last_refresh_type,staleness from user_mviews;
MVIEW_NAME LAST_REF STALENESS
------------------------------ -------- -------------------
MV_PCT FAST_PCT FRESH
Conclusion: Partitioning delivers some extra benefits when used together with Materialized Views. Especially, Rewrite & Refresh can be done on the Partition layer then. As always: “Don’t believe it, test it!”
Auto DOP: Differences of parallel_degree_policy=auto/limited
Recently, I delivered a Seminar about Parallel Processing in 11g where I came across some interesting findings, that I’d like to share with the Oracle Community. See my introduction into the 11g New Feature Auto DOP here, if that topic is completely new for you. There are big differences in the handling of Hints resp. Table-assigned parallel degrees, depending on the setting of parallel_degree_policy.
The parameter defaults to MANUAL, which gives you the known behavior of versions before 11g. LIMITED will only assign a system computed degree of parallelism (DOP) for tables, decorated with a parallel degree of DEFAULT, while prallel_degree_policy=AUTO will consider to assign a system computed DOP to all tables. Let’s see some details:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for Linux: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - Production
I did my tests on 11.2.0.1 and 11.2.0.2.
SQL> grant dba to adam identified by adam; Grant succeeded. SQL> connect adam/adam@prima Connected. SQL> set pages 300 SQL> show parameter parallel NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ fast_start_parallel_rollback string LOW parallel_adaptive_multi_user boolean TRUE parallel_automatic_tuning boolean FALSE parallel_degree_limit string CPU parallel_degree_policy string MANUAL parallel_execution_message_size integer 16384 parallel_force_local boolean FALSE parallel_instance_group string parallel_io_cap_enabled boolean FALSE parallel_max_servers integer 20 parallel_min_percent integer 0 parallel_min_servers integer 0 parallel_min_time_threshold string AUTO parallel_server boolean FALSE parallel_server_instances integer 1 parallel_servers_target integer 8 parallel_threads_per_cpu integer 2 recovery_parallelism integer 0 SQL> create table t as select * from dual; Table created.
That is my test case. All the red parameters have default values. The table t is of course way too small to justify a parallel operation; especially, it will not meet the parallel_min_time_threshold of estimated runtime (about 10 seconds with AUTO). The setting parallel_degree_policy=MANUAL would leave the system as dumb as in earlier versions regarding an appropriate DOP, though. It would give me any DOP I demand with Hints or Parallel Degree on the table. See how that is different with AUTO/LIMITED:
SQL> alter system set parallel_degree_policy=auto; System altered. SQL> select /*+ parallel (t,8) */ * from t; D - X SQL> select * from v$pq_sesstat; STATISTIC LAST_QUERY SESSION_TOTAL ------------------------------ ---------- ------------- Queries Parallelized 0 0 DML Parallelized 0 0 DDL Parallelized 0 0 DFO Trees 0 0 Server Threads 0 0 Allocation Height 0 0 Allocation Width 0 0 Local Msgs Sent 0 0 Distr Msgs Sent 0 0 Local Msgs Recv'd 0 0 Distr Msgs Recv'd 0 0 11 rows selected.
Although this was a valid hint, I got no parallel operation! That is different with LIMITED:
SQL> alter system set parallel_degree_policy=limited; System altered. SQL> select /*+ parallel (t,8) */ * from t; D - X SQL> select * from v$pq_sesstat; STATISTIC LAST_QUERY SESSION_TOTAL ------------------------------ ---------- ------------- Queries Parallelized 1 1 DML Parallelized 0 0 DDL Parallelized 0 0 DFO Trees 1 1 Server Threads 8 0 Allocation Height 8 0 Allocation Width 1 0 Local Msgs Sent 26 26 Distr Msgs Sent 0 0 Local Msgs Recv'd 26 26 Distr Msgs Recv'd 0 0 11 rows selected.
Same statement, now I got my (not sensible) DOP. There is a new hint in 11g on the statement level, though, that is also delivering my requested DOP with AUTO:
SQL> connect adam/adam@prima Connected. SQL> alter system set parallel_degree_policy=auto; System altered. SQL> select /*+ parallel (8) */ * from t; D - X SQL> select * from v$pq_sesstat; STATISTIC LAST_QUERY SESSION_TOTAL ------------------------------ ---------- ------------- Queries Parallelized 1 1 DML Parallelized 0 0 DDL Parallelized 0 0 DFO Trees 1 1 Server Threads 8 0 Allocation Height 8 0 Allocation Width 1 0 Local Msgs Sent 26 26 Distr Msgs Sent 0 0 Local Msgs Recv'd 26 26 Distr Msgs Recv'd 0 0 11 rows selected.
That is the only way to overrule the Auto DOP with parallel_degree_policy=AUTO. Similar that is with Parallel Degree on the table:
SQL> connect adam/adam@prima Connected. SQL> alter system set parallel_degree_policy=auto; System altered. SQL> alter table t parallel; Table altered. SQL> select * from t; D - X SQL> select * from v$pq_sesstat; STATISTIC LAST_QUERY SESSION_TOTAL ------------------------------ ---------- ------------- Queries Parallelized 0 0 DML Parallelized 0 0 DDL Parallelized 0 0 DFO Trees 0 0 Server Threads 0 0 Allocation Height 0 0 Allocation Width 0 0 Local Msgs Sent 0 0 Distr Msgs Sent 0 0 Local Msgs Recv'd 0 0 Distr Msgs Recv'd 0 0 11 rows selected. SQL> alter table t parallel 8; Table altered. SQL> select * from t; D - X SQL> select * from v$pq_sesstat; STATISTIC LAST_QUERY SESSION_TOTAL ------------------------------ ---------- ------------- Queries Parallelized 0 0 DML Parallelized 0 0 DDL Parallelized 0 0 DFO Trees 0 0 Server Threads 0 0 Allocation Height 0 0 Allocation Width 0 0 Local Msgs Sent 0 0 Distr Msgs Sent 0 0 Local Msgs Recv'd 0 0 Distr Msgs Recv'd 0 0 11 rows selected. SQL> alter system set parallel_degree_policy=limited; System altered. SQL> alter table t parallel; Table altered. SQL> select * from t; D - X SQL> select * from v$pq_sesstat; STATISTIC LAST_QUERY SESSION_TOTAL ------------------------------ ---------- ------------- Queries Parallelized 0 0 DML Parallelized 0 0 DDL Parallelized 0 0 DFO Trees 0 0 Server Threads 0 0 Allocation Height 0 0 Allocation Width 0 0 Local Msgs Sent 0 0 Distr Msgs Sent 0 0 Local Msgs Recv'd 0 0 Distr Msgs Recv'd 0 0 11 rows selected. SQL> alter table t parallel 8; Table altered. SQL> select * from t; D - X SQL> select * from v$pq_sesstat; STATISTIC LAST_QUERY SESSION_TOTAL ------------------------------ ---------- ------------- Queries Parallelized 1 1 DML Parallelized 0 0 DDL Parallelized 0 0 DFO Trees 1 1 Server Threads 8 0 Allocation Height 8 0 Allocation Width 1 0 Local Msgs Sent 26 26 Distr Msgs Sent 0 0 Local Msgs Recv'd 26 26 Distr Msgs Recv'd 0 0 11 rows selected.
You saw a behavior like in the below table described:
| parallel_degree_policy | parallel (t,8) | Parallel (8) | degree DEFAULT | degree 8 |
| manual | 8 | 8 | 4 | 8 |
| limited | 8 | 8 | 1 | 8 |
| auto | 1* | 8 | 1 | 1 |
The default degree with parallel_degree_policy=MANUAL is cpu_count * parallel_threads_per_cpu; 4 in my case.
Apart from the shown differences between AUTO and LIMITED, only AUTO enables the also New Features Parallel Statement Queueing and In-Memory Parallel Execution. My personal impression is that LIMITED works like we have hoped that parallel_automatic_tuning would but never did
Conclusion: parallel_degree_policy=LIMITED will give you the DOP you request and compute an appropriate DOP only with a parallel degree of DEFAULT as an attribute of the table. Use this parameter if you trust that your applications/designers know why they use a certain DOP. parallel_degree_policy=AUTO will overrule any specific DOP you gave – except the new 11g parallel (n) Hint – and consider to do things in parallel for all tables even without a Hint or Degree.
Addendum: See this nice related posting by Gwen Shapira, especially the part about the I/O calibration.
* Second Addendum: With 11.2.0.3, the hint /*+ parallel (t,8) */ determines the DOP to 8, regardless of the parallel_degree_policy setting. Everything else is the same as shown, especially the different behavior of the values AUTO and LIMITED with the parallel degree of the table t explicitly set to 8.
Exadata Part III: Compression
With this posting, we look at the New Feature Hybrid Columnar Compression that is only available with the Exadata Database Machine, offering stunning compression ratios together with high query speed. Meanwhile we have not less than 6 different kinds of Table Data Compression methods:
- BASIC compression, introduced in Oracle 8 already and only recommended for Data Warehouse
- OLTP compression, introduced in Oracle 11 and recommended for OLTP Databases as well
- QUERY LOW compression (Exadata only), recommended for Data Warehouse with Load Time as a critical factor
- QUERY HIGH compression (Exadata only), recommended for Data Warehouse with focus on Space Saving
- ARCHIVE LOW compression (Exadata only), recommended for Archival Data with Load Time as a critical factor
- ARCHIVE HIGH compression (Exadata only), recommended for Archival Data with maximum Space Saving
Method 1 and 2 are working by compressing the rows inside of Oracle Blocks – the more redundant values inside the block, the better the compression ratio. OLTP compression might be used with Exadata Database Machine, because Exadata V2 is designed to support OLTP as well as Data Warehouse. Methods 3 to 6 use Hybrid Columnar Compression (HCC), which is designed to achieve better compression ratios by organizing sets of rows into compression units and then detaching the columns from the rows and collecting them. It is more likely that the same columns store similar (redundant) data, therefore the compression ratio improves with HCC.
I will give examples for the 6 compression methods now. Keep in mind that I have only limited access to our Oracle University Exadata Database Machine, restricting me to just one Database Server and two Storage Servers – speed on a Full Rack will probably be faster. Also, the demo table is somewhat artificial and may therefore deliver better compression ratios compared to Real World tables.
In short: Although you should get the picture from the examples, it will be different on your site and it requires thorough testing before you take any kind of compression into production.
The (uncompressed) demonstration table is the same as in the previous posting and was created with this statement:
SQL> create table sales as select rownum as id, rownum + 1 as flag, 'Oracle Enterprise Edition' as product, mod(rownum,5) as channel_id, mod(rownum,1000) as cust_id , 5000 as amount_sold, trunc(sysdate - 10000 + mod(rownum,10000)) as order_date, trunc(sysdate - 9999 + mod(rownum,10000)) as ship_date from dual connect by level<=2e7;
It is 1.5 GB in size. Let’s look how small we can compress and how fast we can load!
BASIC method
SQL> create table comp_classic compress as select * from sales where 1=2; Table created. SQL> alter table comp_classic nologging; Table altered. SQL> insert /*+ append */ into comp_classic select * from sales; 20000000 rows created. Elapsed: 00:00:39.80 SQL> commit; Commit complete.
OLTP method. Notice that we cannot assume direct loads in an OLTP environment. Therefore conventional insert:
SQL> create table comp_oltp compress for oltp as select * from sales where 1=2; Table created. SQL> insert into comp_oltp select * from sales; 20000000 rows created. Elapsed: 00:02:16.56 SQL> commit; Commit complete.
QUERY LOW method. We expect this method to have the fastest Load Time:
SQL> create table comp_query_low compress for query low as select * from sales where 1=2; Table created. SQL> alter table comp_query_low nologging; Table altered. SQL> insert /*+ append */ into comp_query_low select * from sales; 20000000 rows created. Elapsed: 00:00:26.96 SQL> commit; Commit complete.
QUERY HIGH method
SQL> create table comp_query_high compress for query high as select * from sales where 1=2; Table created. SQL> alter table comp_query_high nologging; Table altered. SQL> insert /*+ append */ into comp_query_high select * from sales; 20000000 rows created. Elapsed: 00:00:33.99 SQL> commit; Commit complete.
ARCHIVE LOW method
SQL> create table comp_archive_low compress for archive low as select * from sales where 1=2; Table created. SQL> alter table comp_archive_low nologging; Table altered. SQL> insert /*+ append */ into comp_archive_low select * from sales; 20000000 rows created. Elapsed: 00:00:35.14 SQL> commit; Commit complete.
ARCHIVE HIGH method. We expect this method to have the slowest Load Time and the best Compression Ratio:
SQL> create table comp_archive_high compress for archive high as select * from sales where 1=2; Table created. SQL> alter table comp_archive_high nologging; Table altered. SQL> insert /*+ append */ into comp_archive_high select * from sales; 20000000 rows created. Elapsed: 00:04:13.13 SQL> commit; Commit complete.
That was indeed slow! Now let’s look at the outcome:
SQL> select segment_name,bytes/1024/1024 as mb from user_segments order by 1; SEGMENT_NAME MB ------------------------------ ---------- COMP_ARCHIVE_HIGH 5 COMP_ARCHIVE_LOW 8 COMP_CLASSIC 736 COMP_OLTP 864 COMP_QUERY_HIGH 8 COMP_QUERY_LOW 304 SALES 1535.5 7 rows selected.
We have condensend 1.5 Gig into 5 Megabyte with the ARCHIVE HIGH method!
Summary: Except for ARCHIVE HIGH, we have faster Load Times with HCC than with Block Compression. Every HCC method delivers a better compression ratio than the Block Compression methods. Now how about Query Times?
SQL> set timing on
SQL> connect adam/adam
Connected.
SQL> select count(*) from sales where channel_id=1;
COUNT(*)
----------
4000000
Elapsed: 00:00:02.07
SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
where name = 'cell physical IO interconnect bytes returned by smart scan'; 2
NAME MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan 46.807457
We take the FTS (with Smart Scan) on the uncompressed table as Baseline. I always reconnect to initialize v$mystat. Now Query Performance of BASIC compression:
SQL> connect adam/adam
Connected.
SQL> select count(*) from comp_classic where channel_id=1;
COUNT(*)
----------
4000000
Elapsed: 00:00:01.79
SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
where name = 'cell physical IO interconnect bytes returned by smart scan'; 2
NAME MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan 123.650536
Runtime was faster as uncompressed, Smart Scan was possible. Now OLTP compression:
SQL> connect adam/adam
Connected.
SQL> select count(*) from comp_oltp where channel_id=1;
COUNT(*)
----------
4000000
Elapsed: 00:00:01.80
SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
where name = 'cell physical IO interconnect bytes returned by smart scan'; 2
NAME MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan 130.22644
Dito. Now QUERY LOW (HCC) compression:
SQL> connect adam/adam
Connected.
SQL> select count(*) from comp_query_low where channel_id=1;
COUNT(*)
----------
4000000
Elapsed: 00:00:00.86
SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
where name = 'cell physical IO interconnect bytes returned by smart scan'; 2
NAME MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan 2.68922424
Fastest Runtime yet. Smart Scan was possible. Now QUERY HIGH (HCC):
SQL> connect adam/adam
Connected.
SQL> select count(*) from comp_query_high where channel_id=1;
COUNT(*)
----------
4000000
Elapsed: 00:00:01.27
SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
where name = 'cell physical IO interconnect bytes returned by smart scan'; 2
NAME MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan 0
Still faster than uncompressed, although not using Smart Scan. Now ARCHIVE LOW (HCC):
SQL> connect adam/adam
Connected.
SQL> select count(*) from comp_archive_low where channel_id=1;
COUNT(*)
----------
4000000
Elapsed: 00:00:01.13
SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
where name = 'cell physical IO interconnect bytes returned by smart scan'; 2
NAME MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan 0
Faster than uncompressed (and surprisingly faster than QUERY HIGH) without Smart Scan. Finally ARCHIVE HIGH (HCC):
SQL> connect adam/adam
Connected.
SQL> select count(*) from comp_archive_high where channel_id=1;
COUNT(*)
----------
4000000
Elapsed: 00:00:02.93
SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
where name = 'cell physical IO interconnect bytes returned by smart scan'; 2
NAME MB
---------------------------------------------------------------- ----------
cell physical IO interconnect bytes returned by smart scan 0
Slowest Query Performance (the only one slower than uncompressed in this experiment) without Smart Scan.
Conclusion: HCC delivers (far) better Compression Ratios than the Block Compression methods. Load Time increases from QUERY LOW (best) over QUERY HIGH and ARCHIVE LOW (both moderate) to ARCHIVE HIGH (longest Load Time). Query Performance decreases similar from QUERY LOW (best) to ARCHIVE HIGH (longest Query Time). All HCC methods except ARCHIVE HIGH delivered better Query Performance than uncompressed, though. In short: HCC is most likely able to save large amounts of space in your Data Warehouse without decreasing Query Performance much if at all. It will probably compress your Archival Data significantly.

Partitioning a table online with DBMS_REDEFINITION
If there is a requirement to change the structure of a table that is already in use productively, it may be impossible to get a maintenance downtime for that table, because it is constantly in use. That can be the case for all kind of structural changes of a table, particulary for the change from an ordinary heap table into a partitioned table, which I am going to take here as an example, because I am getting asked frequently in my courses how to achieve it. In order to demonstrate that, I will create a demonstration user with a non-partitioned table with privileges and additional dependent objects on it:
SQL> grant dba to adam identified by adam; Grant succeeded. SQL> connect adam/adam Connected. SQL> create table original as select rownum as id, mod(rownum,5) as channel_id, 5000 as amount_sold, mod (rownum,1000) as cust_id, sysdate as time_id from dual connect by level<=1e6; Table created. SQL> create index original_id_idx on original(id) nologging; Index created. SQL> grant select on original to hr; Grant succeeded.
The challenge is now to change this table into a partitioned one while it is used with DML & queries by end users. For this purpose, we introduced already in 9i (if I recall it right) the package DBMS_REDEFINITION. First step would be to ask, whether it can be used in this case:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for Linux: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - Production
SQL> begin dbms_redefinition.can_redef_table (uname=>'ADAM', tname=>'ORIGINAL', options_flag=>DBMS_REDEFINITION.CONS_USE_ROWID); end; /
PL/SQL procedure successfully completed.
Because there is no Primary Key on the original table, I have to use CONS_USE_ROWID, else I could use CONS_USE_PK. There are no objections against the online redefinition of the table here – else an error message would appear. Next step is to create an interim table of the structure, desired for the original table. In my case, I create it interval partitioned (an 11g New Feature). I could also change storage attributes and add or remove columns during that process.
SQL> create table interim (id number, channel_id number(1), amount_sold number(4), cust_id number(4), time_id date) partition by range (cust_id) interval (10) (partition p1 values less than (10)); Table created.
My original table has 1000 distinct cust_ids, so this will lead to 100 partitions – each partion will contain 10 distinct cust_ids. One benefit of that would be the possibility of partition pruning, should there be statements, specifying the cust_id in the where-condition. These statements will be about 100 times faster as a full table scan. The next step will basically insert all the rows from the orginal table into the interim table (thereby automatically generating 99 partitions), while DML during that period is recorded:
SQL> set timing on SQL> BEGIN DBMS_REDEFINITION.START_REDEF_TABLE (uname=>'ADAM', orig_table=>'ORIGINAL', int_table=>'INTERIM', options_flag=>DBMS_REDEFINITION.CONS_USE_ROWID); end; / PL/SQL procedure successfully completed. Elapsed: 00:00:22.76
If this step takes a long time to run it might be beneficial to use the SYNC_INTERIM_TABLE procedure occasionally from another session. That prevents a longer locking time for the last step, the calling of FINISH_REDEF_TABLE. Next step is now to add the dependent objects/privileges to the interim table:
SQL> set timing off
SQL> vari num_errors number
BEGIN
DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS
(uname=>'ADAM',
orig_table=>'ORIGINAL',
int_table=>'INTERIM',
num_errors=>:num_errors);
END;
/
PL/SQL procedure successfully completed.
SQL> print num_errors
NUM_ERRORS
----------
0
There was no problem with this step. Until now the original table is still an ordinary heap table – only the interim table is partitioned:
SQL> select table_name from user_part_tables; TABLE_NAME ------------------------------ INTERIM
In the last step, the two tables change their names and the recorded DML that occured in the meantime gets used for actualization:
SQL> begin dbms_redefinition.finish_redef_table (uname=>'ADAM', orig_table=>'ORIGINAL', int_table=>'INTERIM'); end; / PL/SQL procedure successfully completed.
We will now determine that the original table is partitioned and the dependencies are still there:
SQL> select table_name,partitioning_type from user_part_tables;
TABLE_NAME PARTITION
------------------------------ ---------
ORIGINAL RANGE
SQL> select count(*) from user_tab_partitions;
COUNT(*)
----------
100
SQL> select grantee,privilege from user_tab_privs_made where table_name='ORIGINAL';
GRANTEE PRIVILEGE
------------------------------ ----------------------------------------
HR SELECT
SQL> select index_name,table_name from user_indexes;
INDEX_NAME TABLE_NAME
------------------------------ ------------------------------
ORIGINAL_ID_IDX ORIGINAL
TMP$$_ORIGINAL_ID_IDX0 INTERIM
The interim table can now be dropped. We changed the table into a partitioned table without any end user noticing it!
Result Cache: Another brilliant 11g New Feature
I have just finished an Oracle Database 11g New Features course in Vienna. There are many amazing New Features in the 11g version, one of them is the possibility to cache the result sets of statements, that access large tables but return relatively few rows. Think of it like automagically created materialized views inside the SGA. Example:
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for Linux: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - Production
SQL> select bytes/1024/1024 as mb from user_segments where segment_name='SALES'; MB ---------- 560
SQL> alter table sales result_cache (mode force);
Table altered.
The ALTER TABLE statement is a new feature of 11g Release 2. In Release 1, you can control the feature only with the following parameters:
SQL> show parameter result_cache_m
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
result_cache_max_result integer 5
result_cache_max_size big integer 2080K
result_cache_mode string MANUAL
Another possibility, already introduced in 11g R1 is the RESULT_CACHE hint. Now let’s look at the effect of the ALTER TABLE statement:
SQL> set timing on
SQL> select channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926004170
9 4438820.16
Elapsed: 00:00:03.72
This was the first time, the sales table was accessed after the ALTER TABLE above. The runtime signalizes we have got a full table scan here (there are no indexes on the table anyway). Of course, blocks of the table are now cached in the database buffer cache – as in previous versions. But now, also the result set is cached!
SQL> select channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926004170
9 4438820.16
Elapsed: 00:00:00.01
That is obvious by runtime already, so I omit AUTOTRACE here. If the table gets changed, the result set gets “stale”, similar like a materialized view would:
SQL> update sales set amount_sold=1 where rownum<2;
1 row updated.
Elapsed: 00:00:00.02
SQL> commit;
Commit complete.
Elapsed: 00:00:00.00
SQL> select channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926002938
9 4438820.16
Elapsed: 00:00:03.08
Second access after the DML will again use the (newly cached) result set – even if the statement is slightly different, in the same way as materialized views can be used for query rewrite, even if the SELECT differs from the query that built the materialized view. That gives me the opportunity to introduce the new SUPERFAST hint
SQL> select /*+ superfast */ channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926002938
9 4438820.16
Elapsed: 00:00:00.00
The “hint” is just a strange commentar for the engine and is simply ignored, of course. The speed comes from using the cached result set, stored in the SGA by the previous SELECT with the 3 seconds runtime.
Addendum: See these two fine postings by Alex Fatkulin that talk about the 11gR2 improvements regarding the latch contention issue that surfaced with the Result Cache feature in 11gR1:
11GR2 Result Cache Scalability
Result Cache Latch in 11GR2: Shared Mode Gets
Thanks for the comments below that mentioned possible drawbacks of Result Caching! Much appreciated
Recent Comments