How #Exadata benefits your Data Warehouse

A migration of your Data Warehouse to Exadata will most likely deliver the combined benefits of Hybrid Columnar Compression, Smart Scan and Storage Indexes. That means better performance with less storage space consumption. Let's see an example:


SQL> desc sales_noexa
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER
 PRODUCT                                            CHAR(25)
 CHANNEL_ID                                         NUMBER
 CUST_ID                                            NUMBER
 AMOUNT_SOLD                                        NUMBER
 ORDER_DATE                                         DATE
 SHIP_DATE                                          DATE

SQL> select count(*) from sales_noexa;

  COUNT(*)
----------
  80000000

SQL> select sum(bytes)/1024/1024 as mb from user_segments where segment_name='SALES_NOEXA';

        MB
----------
      5520

This table is partitioned by ORDER_DATE with one partition per year. Without Exadata, performance is as follows:

SQL> select count(*),avg(amount_sold) from sales_noexa where channel_id=2;

  COUNT(*) AVG(AMOUNT_SOLD)
---------- ----------------
  16000000             5000

Elapsed: 00:00:11.06
SQL> select sum(amount_sold) from sales_noexa where ship_date = to_date('01.01.2011','dd.mm.yyyy');

SUM(AMOUNT_SOLD)
----------------
        40000000

Elapsed: 00:00:06.07
SQL> select sum(amount_sold) from sales_noexa where order_date = to_date('01.01.2011','dd.mm.yyyy');

SUM(AMOUNT_SOLD)
----------------
        40000000

Elapsed: 00:00:00.15

There is no index, so each statement above was executed as a Full Table Scan. The last one was much faster because of Partition Pruning, which can be confirmed in the execution plan as sketched below.
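
Just the statements of a minimal check (output omitted here); the Pstart/Pstop columns of the plan would show a single partition for the ORDER_DATE predicate:

explain plan for
select sum(amount_sold) from sales_noexa where order_date = to_date('01.01.2011','dd.mm.yyyy');

select * from table(dbms_xplan.display);

Now the same on Exadata: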

SQL> desc sales_exa
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER
 PRODUCT                                            CHAR(25)
 CHANNEL_ID                                         NUMBER
 CUST_ID                                            NUMBER
 AMOUNT_SOLD                                        NUMBER
 ORDER_DATE                                         DATE
 SHIP_DATE                                          DATE

SQL> select count(*) from sales_exa;

  COUNT(*)
----------
  80000000

SQL> select sum(bytes)/1024/1024 as mb from user_segments where segment_name='SALES_EXA';

        MB
----------
      1574

The table is way smaller although it contains the very same rows. That’s because I used Hybrid Columnar Compression to create this table:

create table sales_exa (id number, product char(25), channel_id number, cust_id number, amount_sold number, order_date date, ship_date date)
partition by range (order_date)
(
partition p1990 values less than (to_date('01.01.1991','dd.mm.yyyy')) compress for archive high,
partition p1991 values less than (to_date('01.01.1992','dd.mm.yyyy')) compress for archive high,
partition p1992 values less than (to_date('01.01.1993','dd.mm.yyyy')) compress for archive high,
partition p1993 values less than (to_date('01.01.1994','dd.mm.yyyy')) compress for archive high,
partition p1994 values less than (to_date('01.01.1995','dd.mm.yyyy')) compress for archive high,
partition p1995 values less than (to_date('01.01.1996','dd.mm.yyyy')) compress for archive high,
partition p1996 values less than (to_date('01.01.1997','dd.mm.yyyy')) compress for archive low,
partition p1997 values less than (to_date('01.01.1998','dd.mm.yyyy')) compress for archive low,
partition p1998 values less than (to_date('01.01.1999','dd.mm.yyyy')) compress for archive low,
partition p1999 values less than (to_date('01.01.2000','dd.mm.yyyy')) compress for archive low,
partition p2000 values less than (to_date('01.01.2001','dd.mm.yyyy')) compress for archive low,
partition p2001 values less than (to_date('01.01.2002','dd.mm.yyyy')) compress for query high,
partition p2002 values less than (to_date('01.01.2003','dd.mm.yyyy')) compress for query high,
partition p2003 values less than (to_date('01.01.2004','dd.mm.yyyy')) compress for query high,
partition p2004 values less than (to_date('01.01.2005','dd.mm.yyyy')) compress for query high,
partition p2005 values less than (to_date('01.01.2006','dd.mm.yyyy')) compress for query high,
partition p2006 values less than (to_date('01.01.2007','dd.mm.yyyy')) compress for query low,
partition p2007 values less than (to_date('01.01.2008','dd.mm.yyyy')) compress for query low,
partition p2008 values less than (to_date('01.01.2009','dd.mm.yyyy')) compress for query low,
partition p2009 values less than (to_date('01.01.2010','dd.mm.yyyy')) compress for query low,
partition p2010 values less than (to_date('01.01.2011','dd.mm.yyyy')) compress for query low,
partition p2011 values less than (to_date('01.01.2012','dd.mm.yyyy')),
partition p2012 values less than (to_date('01.01.2013','dd.mm.yyyy')),
partition p2013 values less than (to_date('01.01.2014','dd.mm.yyyy')),
partition p2014 values less than (to_date('01.01.2015','dd.mm.yyyy')),
partition p2015 values less than (to_date('01.01.2016','dd.mm.yyyy')),
partition p2016 values less than (to_date('01.01.2017','dd.mm.yyyy')),
partition p2017 values less than (to_date('01.01.2018','dd.mm.yyyy'))
);
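
The compression attribute of each partition can be double-checked in the data dictionary afterwards, for example like this (output omitted here):

select partition_name, compression, compress_for
from user_tab_partitions
where table_name = 'SALES_EXA'
order by partition_position;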

The older the partitions, the stronger the compression; that saved about 4 GB of storage space in this case. How about the performance?

SQL> set timing on
SQL> select count(*),avg(amount_sold) from sales_exa where channel_id=2;

  COUNT(*) AVG(AMOUNT_SOLD)
---------- ----------------
  16000000             5000

Elapsed: 00:00:05.83
SQL> select sum(amount_sold) from sales_exa where order_date = to_date('01.01.2011','dd.mm.yyyy');

SUM(AMOUNT_SOLD)
----------------
        40000000

Elapsed: 00:00:00.14

The first Full Table Scan is now twice as fast because of Smart Scan, while Partition Pruning shows about the same effect as before. The real whopper is this one:

SQL> select sum(amount_sold) from sales_exa where ship_date = to_date('01.01.2011','dd.mm.yyyy');

SUM(AMOUNT_SOLD)
----------------
        40000000

Elapsed: 00:00:00.22

That statement took more than 6 seconds without Exadata! Why is it so much faster now, almost as fast as with Partition Pruning?

SQL> connect adam/adam
Connected.

SQL> select sum(amount_sold) from sales_exa where ship_date = to_date('01.01.2011','dd.mm.yyyy');

SUM(AMOUNT_SOLD)
----------------
        40000000

SQL> select name,value/1024/1024 as mb from v$statname
     natural join v$mystat where name='cell physical IO bytes saved by storage index';

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index                    1385.64844

I reconnected to initialize v$mystat with fresh values. The above shows that not only did we need less I/O because compression reduced the size of the table, and not only did we get a Smart Scan: additionally, a Storage Index could be used to further limit the amount of data the cells had to scan through, significantly. That is possible because the partitioning on ORDER_DATE leads to a physical sort on disk not only on the ORDER_DATE column but effectively also on the SHIP_DATE column, since SHIP_DATE is usually very close to ORDER_DATE. This ordered storage of the data is the foundation of the successful use of Storage Indexes.

Overall, space consumption went down while performance went up 🙂

The above demonstration was done on our Accenture Enkitec Exadata lab environment, using an X3-2 quarter rack. A more recent Exadata generation, or a half or full rack, would most likely deliver even better results. The tests with SALES_NOEXA were done with CELL_OFFLOAD_PROCESSING=FALSE. That way, I get the hardware-related performance of an Exadata machine without the secret sauce of the Exadata software on the cells that enables them to do Smart Scan in particular.
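
For completeness: turning off the offloading for a session requires nothing more than this, which is how the SALES_NOEXA timings above were taken:

alter session set cell_offload_processing = false;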

Now does the demonstrated behavior apply only to Data Warehouse systems? At least to some degree, the answer is yes: Hybrid Columnar Compression is not suitable for frequent updates; you wouldn't want to use it for a table with an Online Transaction Processing (OLTP) access pattern. You may have some rather static tables in an OLTP system that benefit from it, though. Smart Scan requires Full Table Scans or Full Index Scans done as Direct Reads. Again, that is not the typical access pattern of an OLTP system, although there may be situations where it happens there too. The same goes for Storage Indexes: they can only get involved on behalf of a Smart Scan.
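
Whether particular statements were eligible for offloading can be checked afterwards, for example in V$SQL; a quick sketch (these columns are only populated meaningfully on Exadata, and the LIKE pattern is just an example):

select sql_text, io_cell_offload_eligible_bytes, io_interconnect_bytes
from v$sql
where sql_text like 'select sum(amount_sold) from sales_exa%';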


Combining Resource Consumer Groups with Application Modules in #Oracle

This article contains a complete working example of using the Resource Manager on the command line, for those of you who can't use Enterprise Manager for it. Believe me, I feel your pain 😉

As a good practice, PL/SQL procedures should use DBMS_APPLICATION_INFO to mark their modules and actions. Not only for monitoring purposes, but also to provide a way to tweak the system if things start to get ugly in terms of performance. Here's where the Resource Manager steps in.
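
A minimal sketch of what that looks like inside a procedure; the procedure name and its work are made up here, the DBMS_APPLICATION_INFO calls are the point:

create or replace procedure nightly_load as
begin
  -- mark the session, so monitoring and Resource Manager mapping can pick it up
  dbms_application_info.set_module(module_name => 'B_MODULE', action_name => 'LOAD');
  -- ... the actual work would go here ...
  -- clear the attributes when done
  dbms_application_info.set_module(module_name => null, action_name => null);
end;
/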

Sessions can be assigned to different consumer groups depending on the module. Say we have an application with certain modules that sometimes consume an awful lot of CPU resources or way too many parallel processes. When the problem surfaces, you may not have enough time to fix the code because it's a live production system. The mentioned tweak, if prepared beforehand, may save the day. Let's look at an example:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING_PRI(
    EXPLICIT => 1,
    SERVICE_MODULE_ACTION => 2,
    SERVICE_MODULE => 3,
    MODULE_NAME_ACTION => 4,
    MODULE_NAME => 5,
    SERVICE_NAME => 6,
    ORACLE_USER => 7,
    CLIENT_PROGRAM => 8,
    CLIENT_OS_USER => 9,
    CLIENT_MACHINE => 10,
    CLIENT_ID => 11);

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA();
END;
/

The above sets the priority of MODULE_NAME over ORACLE_USER, which is not the default. The current state of the priorities can be seen in DBA_RSRC_MAPPING_PRIORITY, as shown below.
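
A quick sanity check of the priorities (output omitted here):

select attribute, priority, status
from dba_rsrc_mapping_priority
order by priority;

Now I create two consumer groups: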

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP (
     CONSUMER_GROUP => 'A_GROUP',
     COMMENT        => 'A Group');

  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP (
     CONSUMER_GROUP => 'B_GROUP',
     COMMENT        => 'B Group');

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA();
END;
/

My demo user ADAM gets the right to be a member of these consumer groups:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP (
   GRANTEE_NAME   => 'ADAM',
   CONSUMER_GROUP => 'A_GROUP',
   GRANT_OPTION   =>  FALSE);

  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP (
   GRANTEE_NAME   => 'ADAM',
   CONSUMER_GROUP => 'B_GROUP',
   GRANT_OPTION   =>  FALSE);

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA();
END;
/
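
The grants can be verified in the data dictionary (output omitted):

select grantee, granted_group, grant_option
from dba_rsrc_consumer_group_privs
where grantee = 'ADAM';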

Now for the part where consumer group and module are mapped to each other:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING
     (DBMS_RESOURCE_MANAGER.MODULE_NAME, 'A_MODULE', 'A_GROUP');

  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING
     (DBMS_RESOURCE_MANAGER.MODULE_NAME, 'B_MODULE', 'B_GROUP');

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA();
END;
/
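
The mappings just created show up in DBA_RSRC_GROUP_MAPPINGS (output omitted):

select attribute, value, consumer_group, status
from dba_rsrc_group_mappings
where consumer_group in ('A_GROUP','B_GROUP');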

Next comes the Resource Manager Plan. To show an obvious effect, the restrictions are deliberately rigid; a 95 to 5 percent split strongly favors Group A over Group B:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.CREATE_PLAN(
     PLAN    => 'MYPLAN',
     COMMENT => 'test');

  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE (
     PLAN                     => 'MYPLAN', 
     GROUP_OR_SUBPLAN         => 'SYS_GROUP',    /* built-in group */
     COMMENT                  => 'SYS Group',
     MGMT_P1                  => 100);

  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE (
     PLAN                     => 'MYPLAN', 
     GROUP_OR_SUBPLAN         => 'A_GROUP',
     COMMENT                  => 'A GROUP',
     parallel_degree_limit_p1 => 8 ,          /* RESTRICTION HERE */
     MGMT_P2                  => 95);

  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE (
     PLAN                     => 'MYPLAN', 
     GROUP_OR_SUBPLAN         => 'B_GROUP',
     COMMENT                  => 'B GROUP',
     parallel_degree_limit_p1 => 2 ,          /* RESTRICTION HERE */
     MGMT_P2                  => 5);

  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE (
     PLAN                     => 'MYPLAN', 
     GROUP_OR_SUBPLAN         => 'OTHER_GROUPS', /* built-in group */
     COMMENT                  => 'Others',
     MGMT_P3                  => 100);

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA();

END;
/

So far, no restriction is in place because the plan is not yet active. But everything is now prepared: should Module B consume too much CPU or demand too many parallel processes, the plan can be activated like this:

BEGIN
    DBMS_RESOURCE_MANAGER.SWITCH_PLAN(plan_name => 'MYPLAN');
END;
/
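
Alternatively, the plan can be activated with an ALTER SYSTEM command, which has the same effect:

alter system set resource_manager_plan = 'MYPLAN';

Setting the parameter to an empty string deactivates the plan again.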

Sessions that have the module set are subject to the restrictions as soon as the plan is activated. If a new module is set during an existing session, the session is switched into the matching consumer group. The parallel restrictions take precedence over parallel hints:

SQL> connect adam/adam@prima
Connected.
SQL> select distinct sid from v$mystat;

       SID
----------
         4

SQL> exec dbms_application_info.set_module(module_name => 'A_MODULE',action_name => 'A-ACTION')

PL/SQL procedure successfully completed.

SQL> select resource_consumer_group from v$session where sid=4;

RESOURCE_CONSUMER_GROUP
--------------------------------
A_GROUP

SQL> select /*+ parallel (dual,16) */ * from dual;

D
-
X

SQL> select * from v$pq_sesstat; 

STATISTIC                      LAST_QUERY SESSION_TOTAL     CON_ID
------------------------------ ---------- ------------- ----------
Queries Parallelized                    1             1          0
DML Parallelized                        0             0          0
DDL Parallelized                        0             0          0
DFO Trees                               1             1          0
Server Threads                          8             0          0
Allocation Height                       8             0          0
Allocation Width                        1             0          0
Local Msgs Sent                        24            24          0
Distr Msgs Sent                         0             0          0
Local Msgs Recv'd                      22            22          0
Distr Msgs Recv'd                       0             0          0
DOP                                     8             0          0
Slave Sets                              1             0          0

13 rows selected.

SQL> exec dbms_application_info.set_module(module_name => 'B_MODULE',action_name => 'B-ACTION')

PL/SQL procedure successfully completed.

SQL> select resource_consumer_group from v$session where sid=4;

RESOURCE_CONSUMER_GROUP
--------------------------------
B_GROUP

SQL> select /*+ parallel (dual,16) */ * from dual;

D
-
X

SQL> select * from v$pq_sesstat; 

STATISTIC                      LAST_QUERY SESSION_TOTAL     CON_ID
------------------------------ ---------- ------------- ----------
Queries Parallelized                    1             2          0
DML Parallelized                        0             0          0
DDL Parallelized                        0             0          0
DFO Trees                               1             2          0
Server Threads                          2             0          0
Allocation Height                       2             0          0
Allocation Width                        1             0          0
Local Msgs Sent                         8            32          0
Distr Msgs Sent                         0             0          0
Local Msgs Recv'd                       8            30          0
Distr Msgs Recv'd                       0             0          0
DOP                                     2             0          0
Slave Sets                              1             0          0

13 rows selected.

To test the CPU restrictions, I used scripts like this:

set serveroutput on
declare
    v_starttime timestamp;
    v_endtime   timestamp;
begin
    dbms_application_info.set_module(module_name => 'A_MODULE', action_name => 'A-ACTION');
    v_starttime := current_timestamp;
    -- burn CPU with NULL loops; 100 * 1000 * 10000 = one billion iterations,
    -- adjust the bounds so that one run takes roughly 15 seconds on your system
    for i in 1..100 loop
        for j in 1..1000 loop
            for k in 1..10000 loop
                null;
            end loop;
        end loop;
    end loop;
    v_endtime := current_timestamp;
    -- note: EXTRACT(SECOND ...) returns just the seconds component,
    -- which is fine for runs under one minute
    dbms_output.put_line('Seconds elapsed Module A: '||to_char(extract(second from v_endtime - v_starttime)));
end;
/

With CPU_COUNT set to 1 (remember, this is a dynamic parameter since 11g, and the Instance Caging feature requires a Resource Manager plan to be active), two sessions each running such a script, one setting module A and the other module B, are enough to see the effect. On my system, both sessions need about 15 seconds without the plan, while with the plan active, module A completes in about 10 seconds versus about 20 seconds for module B.
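
For reference, the caging setup mentioned above boils down to this dynamic command, together with the active Resource Manager plan from before:

alter system set cpu_count = 1;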

Apart from the restrictions shown, there are other useful options available, like Active Session Pool, Maximum Estimated Execution Time, Undo Quota and Idle Blocker Time. Each of these can come in handy to tweak or troubleshoot a misbehaving application without having to touch the code, as sketched below. See here for a whole lot more detail.
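
A minimal sketch of how such options could be attached to the existing directive for B_GROUP; the values are made up and would need tuning for a real system (the NEW_* parameters belong to UPDATE_PLAN_DIRECTIVE):

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  DBMS_RESOURCE_MANAGER.UPDATE_PLAN_DIRECTIVE (
     PLAN                      => 'MYPLAN',
     GROUP_OR_SUBPLAN          => 'B_GROUP',
     NEW_ACTIVE_SESS_POOL_P1   => 5,       /* at most 5 concurrently active sessions */
     NEW_QUEUEING_P1           => 60,      /* queue further sessions up to 60 seconds */
     NEW_MAX_EST_EXEC_TIME     => 600,     /* reject statements estimated above 10 minutes */
     NEW_UNDO_POOL             => 102400,  /* undo quota of 100 MB (value is in KB) */
     NEW_MAX_IDLE_BLOCKER_TIME => 300);    /* kill idle sessions that block others after 5 minutes */

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.CLEAR_PENDING_AREA();
END;
/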

The demo was done with 12c but works the same in 11g, probably also in 10g. As always: Don’t believe it, test it! 🙂


orapwd gives OPW-00029 Password complexity failed in #Oracle 12.2

When I tried to create a password file for a 12.2 database, it initially failed with my favorite (simple) password:

[oracle@uhesse dbs]$ orapwd file=$ORACLE_HOME/dbs/orapwprima password=oracle

OPW-00029: Password complexity failed for SYS user : Password must contain at least 8 characters.

There are two options to resolve this: either provide a password that passes the complexity check, like:

[oracle@uhesse dbs]$ orapwd file=$ORACLE_HOME/dbs/orapwprima password=Very_Complex§1

Or create the password file in 12.1 format (the default being the 12.2 format):

[oracle@uhesse dbs]$ orapwd describe file=orapwprima
Password file Description : format=12.2 
[oracle@uhesse dbs]$ rm orapwprima
[oracle@uhesse dbs]$ orapwd file=$ORACLE_HOME/dbs/orapwprima password=oracle format=12
[oracle@uhesse dbs]$ orapwd describe file=orapwprima
Password file Description : format=12 

According to the documentation, the only drawback of the 12.1 format seems to be the lack of support for granting administrative privileges to external users and for enabling SSL and Kerberos authentication for administrative users. Which means I will keep my passwords simple 🙂
