Posts Tagged 10g New Features
DIFF_TABLE_STATS_IN_HISTORY Example
In 10g, we introduced the automatic retention of old Optimizer Statistics. We might want to compare the Optimizer Statistics, presently in use with the old stats. This posting gives an example that shows how to do that. The Online Documentation for the function has no example for that – probably it is considered too trivial
SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> connect scott/tiger Connected. SQL> alter session set nls_timestamp_tz_format='yyyy-mm-dd:hh24:mi:ss'; Session altered. SQL> alter session set nls_date_format='yyyy-mm-dd:hh24:mi:ss'; Session altered. SQL> select num_rows,last_analyzed from user_tables where table_name='DEPT'; NUM_ROWS LAST_ANAL ---------- ---------
There are no Optimizer Statistics yet for the DEPT table of SCOTT. We change that:
SQL> exec dbms_stats.gather_table_stats('SCOTT','DEPT')
PL/SQL procedure successfully completed.
SQL> select num_rows,last_analyzed from user_tables where table_name='DEPT';
NUM_ROWS LAST_ANALYZED
---------- -------------------
4 2012-04-23:16:14:38
This is my demo setup. Now we have some DML on the table and gather statistics afterwards:
SQL> insert into dept values (50,'TEST','TEST');
1 row created.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats('SCOTT','DEPT')
PL/SQL procedure successfully completed.
SQL> select num_rows,last_analyzed from user_tables where table_name='DEPT';
NUM_ROWS LAST_ANALYZED
---------- -------------------
5 2012-04-23:16:17:42
The above stats are actually in use, while the old stats got recorded. They show up in dba_tab_stats_history with the time they got stored there:
SQL> select table_name,stats_update_time from user_tab_stats_history;
TABLE_NAME STATS_UPDATE_TIME
------------------------------ ---------------------------------------------------------------------------
DEPT 2012-04-23:16:14:38
DEPT 2012-04-23:16:17:42
I got two rows above; the first represents the NULL stats before the first gather, the second row are the stats before the second gather. We can now compare the present stats with the old stats in this way:
SQL> select * from table(dbms_stats.diff_table_stats_in_history(
ownname => user,
tabname => upper('&tabname'),
time1 => systimestamp,
time2 => to_timestamp('&time2','yyyy-mm-dd:hh24:mi:ss'),
pctthreshold => 0));
Enter value for tabname: dept
old 3: tabname => upper('&tabname'),
new 3: tabname => upper('dept'),
Enter value for time2: 2012-04-23:16:17:42
old 5: time2 => to_timestamp('&time2','yyyy-mm-dd:hh24:mi:ss'),
new 5: time2 => to_timestamp('2012-04-23:16:17:42','yyyy-mm-dd:hh24:mi:ss'),
STATISTICS DIFFERENCE REPORT FOR:
.................................
TABLE : DEPT
OWNER : SCOTT
SOURCE A : Statistics as of 2012-04-23:16:25:16
SOURCE B : Statistics as of 2012-04-23:16:17:42
PCTTHRESHOLD : 0
TABLE / (SUB)PARTITION STATISTICS DIFFERENCE:
.............................................
OBJECTNAME TYP SRC ROWS BLOCKS ROWLEN SAMPSIZE
...............................................................................
DEPT T A 5 8 18 5
B 4 4 20 4
I had to shorten and adjust the above report to make it look nice on WordPress – there is also a section about Column Statistics in it that I left out here.
Conclusion: We can not only restore old Optimizer Statistics easy, we can also compare the old stats with the present stats beforehand.
Optimizer Stats: Treat some tables different than AutoTask does
This posting was inspired by an OTN thread. Since 10g, we have an Automatic Maintenance Task (AutoTask) that gathers Optimizer Statistics during the night. This task uses DBMS_STATS.GATHER_DATABASE_STATS with default values. For whatever reason, we may consider these defaults not appropriate for some tables. We have 2 options:
- Write our own procedure/job to gather statistics different (10g way)
- Change the Statistic Preferences for these tables (11g way)
We look at the 11g New Feature first, because it is the more efficient way to go:
SQL> create table adam.a as
select
rownum as id,
rownum + 1 as flag,
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
trunc(sysdate - 10000 + mod(rownum,10000)) as order_date,
trunc(sysdate - 9999 + mod(rownum,10000)) as ship_date
from dual connect by level<=1e6;
Table created.
SQL> create table adam.b as select * from adam.a;
Table created.
SQL> select table_name,num_rows,sample_size,to_char(last_analyzed,'hh24:mi:ss') from user_tables;
TABLE_NAME NUM_ROWS SAMPLE_SIZE TO_CHAR(
------------------------------ ---------- ----------- --------
A
B
SQL> exec dbms_stats.gather_schema_stats('ADAM')
PL/SQL procedure successfully completed.
SQL> select table_name,num_rows,sample_size,to_char(last_analyzed,'hh24:mi:ss') from user_tables;
TABLE_NAME NUM_ROWS SAMPLE_SIZE TO_CHAR(
------------------------------ ---------- ----------- --------
A 1000000 1000000 10:03:11
B 1000000 1000000 10:03:12
You may notice that DBMS_STATS analyzed all the blocks/rows by default – that is because the tables are rather small in size. I want to change that for table b:
SQL> exec dbms_stats.set_table_prefs('ADAM','B','ESTIMATE_PERCENT','50')
PL/SQL procedure successfully completed.
Next, I will delete more than 10 Percent of both tables, so that DBMS_STATS will consider both as stale and collect new stats for them. Then I call again GATHER_SCHEMA_STATS as before, but b is treated different:
SQL> delete from a where rownum<200000;
199999 rows deleted.
SQL> delete from b where rownum<200000;
199999 rows deleted.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_schema_stats('ADAM')
PL/SQL procedure successfully completed.
SQL> select table_name,num_rows,sample_size,to_char(last_analyzed,'hh24:mi:ss') from user_tables;
TABLE_NAME NUM_ROWS SAMPLE_SIZE TO_CHAR(
------------------------------ ---------- ----------- --------
A 800001 800001 10:10:06
B 801196 400598 10:10:09
Notice the different sample size of b. Now to the old (10g) way to achieve the same. If we would just let the AutoTask collect all the stats (including our “different tables”) and after that collect stats again on those tables, we would do the doubled work for them. Therefore we lock the stats (10g New Feature). I drop & recreate the tables a & b the same as before. Then:
SQL> exec dbms_stats.lock_table_stats('ADAM','B')
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_schema_stats('ADAM')
PL/SQL procedure successfully completed.
SQL> create procedure mygather
as
begin
dbms_stats.unlock_table_stats('ADAM','B');
dbms_stats.gather_table_stats('ADAM','B',estimate_percent=>50);
dbms_stats.lock_table_stats('ADAM','B');
end;
/
Procedure created.
SQL> exec mygather
PL/SQL procedure successfully completed.
SQL> select table_name,num_rows,sample_size,to_char(last_analyzed,'hh24:mi:ss') from user_tables;
TABLE_NAME NUM_ROWS SAMPLE_SIZE TO_CHAR(
------------------------------ ---------- ----------- --------
A 1000000 1000000 10:18:27
B 999662 499831 10:18:54
Conclusion: If we are aware of the (New) Features of each Oracle Database version, we are able to do our work more efficiently. One aspect of this general rule is the dealing with Optimizer Statistics, that we can collect different from the AutoTasks default for some tables if we are inclined to do so.
Reorganizing Tables in Oracle – is it worth the effort?
This topic seems to be some kind of “Evergreen”, since it comes up regularly in my courses and also in the OTN Discussion Forum. I decided therefore to cover it briefly here in order to be able to point to this post in the future.
Short answer: Probably not
If the intention of the reorganizing operation is to gain space resp. “defragment” the table, this operation is very likely just a waste of effort & resources. I will try to illustrate my point with a simple demonstration that anybody with access to an Oracle Database can easily reproduce:
SQL> select * from v$version;
BANNER
------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
My demo Database is 11gR2, but the same can be done with 10g also. If you are on an even older version, there is no SHRINK SPACE available, so you would have to use MOVE instead. I prepare a little demo Table with 1Mio rows now – the time_id population was the hardest part of that demo for me
SQL> create table sales as
select
'Oracle Enterprise Edition' as product,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2010' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=1e6;
Table created.
SQL> select time_id ,count(*)
from sales group by time_id
order by 1;
TIME_ID COUNT(*)
--------- ----------
01-JAN-10 83333
01-FEB-10 83334
01-MAR-10 83334
01-APR-10 83334
01-MAY-10 83334
01-JUN-10 83333
01-JUL-10 83333
01-AUG-10 83333
01-SEP-10 83333
01-OCT-10 83333
01-NOV-10 83333
01-DEC-10 83333
12 rows selected.
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 54
The table contains about 83000 rows per month. Now I will delete the first quarter of rows:
SQL> delete from sales where time_id<to_date('01.04.2010','dd.mm.yyyy');
250001 rows deleted.
SQL> commit;
Commit complete.
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 54
This is the starting point of a possible reorganization: Although 250k rows got deleted, the table consumes the same space as before. In other words: The High Water Mark did not move. A reorganization would move the High Water Mark and would regain the space that was consumed by the 250k rows. The question is: Is that necessary? If inserts would take place after the deletion again, then the space would get reused without any need to reorganize:
SQL> insert into sales
select
rownum as id,
mod(rownum,5) as channel_id,
mod(rownum,1000) as cust_id ,
5000 as amount_sold,
to_date
('01.' || lpad(to_char(mod(rownum,3)+1),2,'0') || '.2011' ,'dd.mm.yyyy')
as time_id
from dual connect by level<=2.5e5;
250000 rows created.
SQL> commit;
Commit complete.
SQL> select time_id ,count(*)
from sales group by time_id
order by 1; 2 3
TIME_ID COUNT(*)
--------- ----------
01-APR-10 83334
01-MAY-10 83334
01-JUN-10 83333
01-JUL-10 83333
01-AUG-10 83333
01-SEP-10 83333
01-OCT-10 83333
01-NOV-10 83333
01-DEC-10 83333
01-JAN-11 83333
01-FEB-11 83334
01-MAR-11 83333
12 rows selected.
I inserted a new quarter of rows. The table remains in the same size as before:
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 54
That is exactly the point I like to emphazise: If you have inserts following deletes, reorganization of tables is not necessary! Only if that is not the case (Table gets no longer inserts after deletion), you may reorganize it:
SQL> delete from sales where time_id<to_date('01.07.2010','dd.mm.yyyy');
250001 rows deleted.
SQL> commit;
Commit complete.
SQL> alter table sales enable row movement;
Table altered.
SQL> alter table sales shrink space;
Table altered.
SQL> select segment_name,bytes/1024/1024 as mb from user_segments;
SEGMENT_NAME MB
-------------------- ----------
SALES 40.25
The space consumed by the table got reduced now and is usable for other segments. Although the table is not locked during the SHRINK SPACE operation and users can do DML on the table as usual, the operation is not “for free” in terms of resource consumption. It does a lot of I/O, creates a lot before images that consume space in the UNDO tablespace and the operation modifies many blocks, so lots of redo protocol (and therefore many archivelogs, probably) gets generated.
If you really think that you need that kind of reorganization regulary, you should probably evaluate the Partitioning Option here:
SQL> create table sales_part
(product char(25), channel_id number,
cust_id number, amount_sold number, time_id date)
partition by range (time_id)
interval (numtoyminterval(3,'month'))
(partition p1 values less than (to_date('01.04.2010','dd.mm.yyyy')))
;
Above command create a table that is partitioned by the quarter. We could simply drop a partition instead of deleting lots of rows and we would never need to reorganize to regain space here. If you are on 10g, the feature INTERVAL PARTITIONING is not available there. You can then use RANGE PARTITIONING with the additional effort that you need to create the necessary range partitions manually.
Conclusion: Before you decide to reorganize a table, make sure that this is really necessary, because likely it isn’t
Performance Monitoring: Active Session History at work
Teaching an Oracle Database 10g Performance Tuning course this week, I introduced the 10g New Feature Active Session History (ASH) to the students. That was one major improvement – together with the Automatic Workload Repository (AWR) and the Automatic Database Diagnostic Monitor (ADDM) – of the 10g version. Way better than STATSPACK was before!
Imagine you are a DBA on a production system and get an emergency call like “The Database is dead slow!”. You are supposed to spot the cause as soon as possible. ASH kicks in here: We sample the Wait-Events of active sessions every second into the ASH-Buffer. It is accessed most comfortable with the Enterprise Manager GUI from the Performance Page (Button ASH Report there). Or with little effort from the command line like this:
----------------------------------------- -- -- Top 10 CPU consumers in last 5 minutes -- ----------------------------------------- SQL> select * from ( select session_id, session_serial#, count(*) from v$active_session_history where session_state= 'ON CPU' and sample_time > sysdate - interval '5' minute group by session_id, session_serial# order by count(*) desc ) where rownum <= 10; -------------------------------------------- -- -- Top 10 waiting sessions in last 5 minutes -- -------------------------------------------- SQL> select * from ( select session_id, session_serial#,count(*) from v$active_session_history where session_state='WAITING' and sample_time > sysdate - interval '5' minute group by session_id, session_serial# order by count(*) desc ) where rownum <= 10;
These 2 queries should spot the most incriminating sessions of the last 5 minutes. But who is that and what SQL was running?
-------------------- -- -- Who is that SID? -- -------------------- set lines 200 col username for a10 col osuser for a10 col machine for a10 col program for a10 col resource_consumer_group for a10 col client_info for a10 SQL> select serial#, username, osuser, machine, program, resource_consumer_group, client_info from v$session where sid=&sid; ------------------------- -- -- What did that SID do? -- ------------------------- SQL> select distinct sql_id, session_serial# from v$active_session_history where sample_time > sysdate - interval '5' minute and session_id=&sid; ---------------------------------------------- -- -- Retrieve the SQL from the Library Cache: -- ---------------------------------------------- col sql_text for a80 SQL> select sql_text from v$sql where sql_id='&sqlid';
You may spot the cause of the current performance problem in very short time with the shown technique. But beware: You need to purchase the Diagnostic Pack in order to be allowed to use AWR, ADDM and ASH
Restoring old Optimizer Statistics for troubleshooting purpose
We deliver a Oracle Database 10g (sic!) New Features course this week in our education center in Düsseldorf. Many customers are still on 10g productively and some even just upgrade from 9i to 10g, still being in the process of evaluating 11g. Many new features in the 10g version are related to the optimizer. Some are quite well known to the public, like the obsolence of the rule based optimizer and the presence of a scheduler job that gathers optimizer statistics every night out-of-the-box.
Others lead a more quiet life, one of them is the possibility to restore old optimizer statistics (i.e. as a troubleshooting measure), should there be issues with the newly gathered ones. Since 10g, we do not simply overwrite old optimizer statistics by gathering new. Instead, the old optimizer statistics are automatically historized and kept for one month by default, so that they can be restored easily if that should be desired. Now did you know that? Here we go with a short demonstration of restoring historic optimizer statistics:
SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod PL/SQL Release 10.2.0.1.0 - Production CORE 10.2.0.1.0 Production TNS for Linux: Version 10.2.0.1.0 - Production NLSRTL Version 10.2.0.1.0 - Production SQL> grant dba to adam identified by adam; Grant succeeded. SQL> connect adam/adam Connected. SQL> create table sales as select rownum as id, mod(rownum,5) as channel_id, mod(rownum,1000) as cust_id, 5000 as amount_sold, sysdate as time_id from dual connect by level<=3e7; Table created. SQL> create index sales_idx on sales(id) nologging; Index created. SQL> select segment_name,bytes/1024/1024 as mb from user_segments; SEGMENT_NAME MB ------------------------ ---------- SALES 942 SALES_IDX 566
I just created a demo user, a table and an index on that table. Notice that the two segments take about 1.5 Gig space, should you like to reproduce the demo yourself. Right now, there are no optimizer statistics for the table:
SQL> select num_rows,blocks from user_tables; NUM_ROWS BLOCKS ---------- ---------- "NULL values here"
I am now going to gather statistics on the table manually – the same would be done automatically by the standard scheduler job during the night:
SQL> exec dbms_stats.gather_table_stats('ADAM','SALES')
PL/SQL procedure successfully completed.
SQL> select num_rows,blocks from user_tables;
NUM_ROWS BLOCKS
---------- ----------
29933962 119585
SQL> select count(*) from sales;
COUNT(*)
----------
30000000
As we can see, the statistics are quite accurate, reflecting well the actual size of the table. The index is used for the following query, as we can tell by runtime already:
SQL> set timing on SQL> select amount_sold from sales where id=4711; AMOUNT_SOLD ----------- 5000 Elapsed: 00:00:00.00
I am now going to introduce a problem with the optimizer statistics artificially by just setting them very inaccurate. A real-world problem that is caused by new optimizer statistics is a little harder to come up with – probably you will never encounter it during your career…
SQL> exec dbms_stats.set_table_stats('ADAM','SALES',numrows=>100,numblks=>1) PL/SQL procedure successfully completed. SQL> select num_rows,blocks from user_tables; NUM_ROWS BLOCKS ---------- ---------- 100 1
With the above (completely misleading) statistics, the optimizer will think that a full table scan of the sales table is fairly cheap. Please notice that I ask after id 4712 and not 4711 again, because it could happen that the already computed execution plan (index range scan) is still in the library cache available for reuse. I could also flush the shared pool here to make sure that a new execution plan has to be generated for the id 4711.
SQL> select amount_sold from sales where id=4712; AMOUNT_SOLD ----------- 5000 Elapsed: 00:00:01.91
We can tell by the runtime of almost 2 seconds here that this was a full table scan. Proof would be to retrieve the execution plan from the library cache. I leave that to your studies. Please be aware that the autotrace feature might be misleading here. For our scope, it is enough to say that we have an issue caused by generation of new optimizer statistics. We want to get back our good old statistics! Therefore, we look at the historic optimizer statistics:
SQL> alter session set NLS_TIMESTAMP_TZ_FORMAT='yyyy-mm-dd:hh24:mi:ss';
Session altered.
SQL> select table_name,stats_update_time from user_tab_stats_history;
TABLE_NAME STATS_UPDATE_TIME
------------------------------ --------------------------------------
SALES 2010-05-18:09:47:16
SALES 2010-05-18:09:47:38
We see two rows, representing the old statistics of the sales table. The first was from the time, as there where NULL entries (before the first gather_table_stats). The second row represents the accurate statistics. I am going to restore them:
SQL> begin
dbms_stats.restore_table_stats('ADAM','SALES',
to_timestamp('2010-05-18:09:47:38','yyyy-mm-dd:hh24:mi:ss'));
end;
/
PL/SQL procedure successfully completed.
SQL> select num_rows,blocks from user_tables;
NUM_ROWS BLOCKS
---------- ----------
29933962 119585The good old statistics are back, but the statements are still in the library cache and may execute with the same misguided execution plan:
SQL> set timing on SQL> select amount_sold from sales where id=4712; AMOUNT_SOLD ----------- 5000 Elapsed: 00:00:01.92Still full table scan! The following empties the shared pool and forces the computation of new execution plans, using the accurate statistics:
SQL> alter system flush shared_pool; System altered. Elapsed: 00:00:00.04 SQL> select amount_sold from sales where id=4712; AMOUNT_SOLD ----------- 5000 Elapsed: 00:00:00.01We fixed the problem by restoring the old stats. Should the problem persist - because apparently, optimizer statistics did not cause it - we can get back the new computed statistics also:
SQL> select table_name,stats_update_time from user_tab_stats_history; TABLE_NAME STATS_UPDATE_TIME ------------------------------ --------------------------------------------------------------------------- SALES 2010-05-18:09:47:16 SALES 2010-05-18:09:47:38 SALES 2010-05-18:09:52:30The last line represents the last historized stats.
SQL> begin dbms_stats.restore_table_stats('ADAM','SALES', to_timestamp('2010-05-18:09:52:30','yyyy-mm-dd:hh24:mi:ss')); end; / 2 3 4 5 PL/SQL procedure successfully completed. SQL> select num_rows,blocks from user_tables; NUM_ROWS BLOCKS ---------- ---------- 100 1Conclusion: If we suspect that new Optimizer Statistics are responsible for a performance problem, we can restore the "good old statistics" very easy since 10g.
Recent Comments