Archive for the category TOI
When to use Bindvariables or Literals
From the many Oracle Database Administration and Performance Tuning courses that I have taught in the past, I can tell that a very basic (as we should assume) knowledge about the appropriate use of bind variables and literals is missing, even among people who have dealt with Oracle Databases for years. So I have decided to cover that extremely important topic as briefly but illustratively as possible.
When a SELECT statement is processed, this is done in the steps 1) Parse 2) Bind 3) Execute 4) Fetch. The parse step contains the computation of an execution plan and is relatively expensive for statements with a short duration. Typically, in an OLTP-like database, statements have a runtime of a fraction of a second, and many users run these statements at the same time. Avoiding the parse step is therefore a huge benefit and reduces the overall runtime greatly. The statements are cached in the Shared Pool, together with their execution plan, and can be re-executed with the same plan if exactly the same statement text occurs again. That's where bind variables (step 2) kick in. They can be used to get exactly identical statements that do not need to be parsed again:
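The bind step can also be seen interactively: in SQL*Plus, a bind variable is declared with the VARIABLE command and referenced with a colon prefix. A minimal sketch (the variable name is chosen for illustration):

```sql
-- Declare a bind variable in the SQL*Plus session and assign a value:
SQL> variable v_id number
SQL> exec :v_id := 42

-- Every execution with a different value of :v_id reuses the same cached
-- cursor in the Shared Pool; only the bind step is repeated, not the parse:
SQL> select :v_id as bound_value from dual;
```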
SQL> create table parseornot as select rownum as id from dual connect by level<=1e4;
Table created.
SQL> alter system flush shared_pool; -- to have a level playing field
System altered.
create or replace procedure goodbinds
as
v_id number;
begin
for i in 1..10000 loop
select id into v_id from parseornot where id=i;
end loop;
end;
/
Procedure created.
Apart from the complete senselessness of the SELECT inside this PL/SQL procedure, it uses a bind variable (the i is one). If called, that SELECT gets parsed once and then executed 10000 times. Please notice also that there is only one possible execution plan for this statement – there is no index on the table.
SQL> set timing on
SQL> exec goodbinds

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.36
SQL> select count(*),count(parse_calls) from v$sql
     where lower(sql_text) like '%from parseornot%' and sql_text not like '%like%';

  COUNT(*) COUNT(PARSE_CALLS)
---------- ------------------
         1                  1

SQL> col sql_text format a70
SQL> select sql_text,executions from v$sql
     where lower(sql_text) like '%from parseornot%' and sql_text not like '%like%';

SQL_TEXT                                                               EXECUTIONS
---------------------------------------------------------------------- ----------
SELECT ID FROM PARSEORNOT WHERE ID=:B1                                      10000
In a case like this, the use of literals (i.e. dynamic SQL with concatenated values) is not advisable:
SQL> alter system flush shared_pool; -- to have a level playing field

System altered.

create or replace procedure badlits
as
v_id number;
begin
for i in 1..10000 loop
execute immediate 'select id from parseornot where id=' || i into v_id;
end loop;
end;
/

Procedure created.

SQL> exec badlits

PL/SQL procedure successfully completed.

Elapsed: 00:00:47.13
Although the above procedure does exactly the same (senseless) thing, it has a much longer runtime. Why? Because:
SQL> select count(*),count(parse_calls) from v$sql
     where lower(sql_text) like '%from parseornot%' and sql_text not like '%like%';

  COUNT(*) COUNT(PARSE_CALLS)
---------- ------------------
     10000              10000

SQL> col sql_text format a70
SQL> select sql_text,executions from v$sql
     where lower(sql_text) like '%from parseornot%' and sql_text not like '%like%'
     and rownum<10;
I limit the output because I don’t want 10000 rows on the screen.
SQL_TEXT                                                               EXECUTIONS
---------------------------------------------------------------------- ----------
select id from parseornot where id=1012                                         1
select id from parseornot where id=7028                                         1
select id from parseornot where id=9988                                         1
select id from parseornot where id=5318                                         1
select id from parseornot where id=8209                                         1
select id from parseornot where id=4986                                         1
select id from parseornot where id=2807                                         1
select id from parseornot where id=5990                                         1
select id from parseornot where id=1159                                         1

9 rows selected.
We have just seen a classic example of why bind variables are good and literals are bad – under certain circumstances. It would be wrong, though, to generalize that and simply always use bind variables, as they can be inappropriate as well. If statements have a relatively long runtime, the percentage of the runtime contributed by the parse step becomes negligible. In Data-Warehouse-like databases, statements have runtimes in the range of minutes, so it doesn't matter much if they get parsed for a fraction of a second. Also, few users are issuing statements, and often statements are constructed ad hoc and will not be repeated (soon). In such an environment, the best possible execution plans are important, and literals are preferable.
Why? Because the execution plan gets computed (step 1) before the content of the bind variable gets bound (step 2). If there are multiple possible execution plans for the statement, depending on the value of the (bind) variable, that is extremely important:
SQL> select channel_id,count(*) from sales group by channel_id;

CHANNEL_ID   COUNT(*)
---------- ----------
         1          1
         2    4128400
         3    8645247
         4    1894656
         9      33184
The above is a typical situation where bind variables are evil and literals are the way to go. There is a (bitmap) index on the column channel_id. The value 1 is extremely selective; the value 3 is extremely unselective. Depending on the value of channel_id in a WHERE condition, different execution plans are optimal:
SQL> select max(amount_sold) from sales where channel_id=1;

MAX(AMOUNT_SOLD)
----------------
         1232.16

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 3388944652

-------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           |     1 |     8 |  3306   (1)| 00:00:40 |
|   1 |  SORT AGGREGATE               |           |     1 |     8 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | SALES     | 17995 |   140K|  3306   (1)| 00:00:40 |
|   3 |    BITMAP CONVERSION TO ROWIDS|           |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE | SALES_BIX |       |       |            |          |
-------------------------------------------------------------------------------------------

SQL> select max(amount_sold) from sales where channel_id=3;

MAX(AMOUNT_SOLD)
----------------
         1782.72

Elapsed: 00:00:03.57

Execution Plan
----------------------------------------------------------
Plan hash value: 1047182207

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |     8 | 19464   (1)| 00:03:54 |
|   1 |  SORT AGGREGATE    |       |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| SALES |  8649K|    65M| 19464   (1)| 00:03:54 |
----------------------------------------------------------------------------
When we use literals (1 or 3), the optimizer can pick the appropriate execution plan during the parse step. Here, bind variables are bad and literals are good:
create or replace procedure badbinds (v_id number)
as
v_max_amount_sold number;
begin
select max(amount_sold) into v_max_amount_sold from sales where channel_id=v_id;
dbms_output.put_line(to_char(v_max_amount_sold));
end;
/

SQL> alter system flush shared_pool;
SQL> alter system flush buffer_cache;
SQL> set timing on
SQL> set serveroutput on
SQL> exec badbinds(3)
1782.72

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.43
SQL> exec badbinds(1)
1232.16

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.97
The use of bind variables prevents the optimizer from using the index in the second call of the procedure, as you can tell by the runtime. With literals, the same outcome takes much less time, because now the index can be used:
create or replace procedure goodlits (v_id number)
as
v_max_amount_sold number;
begin
execute immediate 'select max(amount_sold) from sales where channel_id=' || v_id into v_max_amount_sold;
dbms_output.put_line(to_char(v_max_amount_sold));
end;
/

SQL> alter system flush shared_pool;
SQL> alter system flush buffer_cache;
SQL> exec goodlits(3)
1782.72

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.96
SQL> exec goodlits(1)
1232.16

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.05
When I demonstrated the above the last time, one student came to the conclusion: "So in Oracle, bind variables make it impossible to use indexes!" That is of course a misconception. If there were only one good execution plan for the statement (index access), then it would be chosen with bind variables as well. The point is: if there are multiple good execution plans for one statement, then bind variables will probably lead to suboptimal execution plans. As a developer, you should be aware of the kind of database you are developing statements against and choose bind variables or literals as appropriate. Neither is good or evil per se.
I have intentionally not mentioned CURSOR_SHARING=FORCE/SIMILAR, Bind Peeking and Adaptive Cursor Sharing in order to keep the posting as simple and clear as possible and to get to the point as directly as I could. The three techniques have in common that they can reduce the harm of an inappropriate use of literals or bind variables to some degree.
This will be my last posting in 2009, I guess, and I could hardly find the time for it due to much other work. I wish you all a Merry Christmas & a Happy New Year 2010 🙂
Result Cache: Another brilliant 11g New Feature
I have just finished an Oracle Database 11g New Features course in Vienna. There are many amazing new features in the 11g version; one of them is the possibility to cache the result sets of statements that access large tables but return relatively few rows. Think of it like automagically created materialized views inside the SGA. Example:
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE	11.2.0.1.0	Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> select bytes/1024/1024 as mb from user_segments where segment_name='SALES';

        MB
----------
       560
SQL> alter table sales result_cache (mode force);
Table altered.
The ALTER TABLE statement is a new feature of 11g Release 2. In Release 1, you can control the feature only with the following parameters:
SQL> show parameter result_cache_m
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
result_cache_max_result integer 5
result_cache_max_size big integer 2080K
result_cache_mode string MANUAL
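Even with RESULT_CACHE_MODE left at its default MANUAL, caching can be requested per statement. A minimal sketch with the RESULT_CACHE hint (available since 11g Release 1; the query is just illustrative):

```sql
-- Request result caching for this statement only; with MODE=MANUAL,
-- statements without the hint are not cached:
SQL> select /*+ result_cache */ channel_id, sum(amount_sold)
     from sales
     group by channel_id;
```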
Another possibility, already introduced in 11g Release 1, is the RESULT_CACHE hint. Now let's look at the effect of the ALTER TABLE statement:
SQL> set timing on
SQL> select channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926004170
9 4438820.16
Elapsed: 00:00:03.72
This was the first time the sales table was accessed after the ALTER TABLE above. The runtime indicates we have got a full table scan here (there are no indexes on the table anyway). Of course, blocks of the table are now cached in the database buffer cache – as in previous versions. But now, the result set is cached as well!
SQL> select channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926004170
9 4438820.16
Elapsed: 00:00:00.01
That is obvious from the runtime already, so I omit AUTOTRACE here. If the table gets changed, the result set gets "stale", similar to how a materialized view would:
SQL> update sales set amount_sold=1 where rownum<2;
1 row updated.
Elapsed: 00:00:00.02
SQL> commit;
Commit complete.
Elapsed: 00:00:00.00
SQL> select channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926002938
9 4438820.16
Elapsed: 00:00:03.08
The second access after the DML will again use the (newly cached) result set – even if the statement is slightly different, in the same way that materialized views can be used for query rewrite even if the SELECT differs from the query that built the materialized view. That gives me the opportunity to introduce the new SUPERFAST hint 🙂
SQL> select /*+ superfast */ channel_id,sum(amount_sold) from sales group by channel_id;
CHANNEL_ID SUM(AMOUNT_SOLD)
---------- ----------------
2 421541477
4 219308832
3 926002938
9 4438820.16
Elapsed: 00:00:00.00
The "hint" is just a strange comment to the engine and is simply ignored, of course. The speed comes from using the cached result set, stored in the SGA by the previous SELECT with the 3-second runtime.
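To verify that a result really came from the cache, the dictionary can be consulted. A sketch, assuming the usual privileges on the V$ views:

```sql
-- Cached result sets and their state (Published = usable, Invalid = stale):
SQL> select type, status, name from v$result_cache_objects;

-- A summary report of the result cache memory:
SQL> set serveroutput on
SQL> exec dbms_result_cache.memory_report
```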
Addendum: See these two fine postings by Alex Fatkulin that talk about the 11gR2 improvements regarding the latch contention issue that surfaced with the Result Cache feature in 11gR1:
11GR2 Result Cache Scalability
Result Cache Latch in 11GR2: Shared Mode Gets
Thanks for the comments below that mentioned possible drawbacks of Result Caching! Much appreciated 🙂
Automatic DOP in 11gR2
With 11g Release 2 we got a probably very useful new feature related to parallel query: the automatically determined Degree of Parallelism (DOP). In earlier versions of the Oracle Database, we had to determine the DOP more or less manually, either with a parallel hint or by setting a parallel degree with ALTER TABLE:
select /*+ parallel (sales,2) */ * from sales;
or
alter table sales parallel 2;
There was an automatic computation of the DOP available, derived from the simple formula CPU_COUNT * PARALLEL_THREADS_PER_CPU. That is what is done internally if we say
select /*+ parallel (sales) */ * from sales;
or
alter table sales parallel;
The drawback with these approaches was always that we could hardly be sure whether the DOP was appropriate for the table, the statement and the hardware we are running on. It was mostly a case of trial & error. Especially problematic was the ALTER TABLE approach, as this leads to the parallelization of each and every following SELECT on those tables, even if totally inappropriate. A popular pitfall is the creation of tables with a parallel clause on OLTP systems, because those tables inherit the parallel degree of their creation, which leads to parallel query for every statement accessing the tables afterwards – most likely not desirable for OLTP. Insofar, parallelization was quite dumb (from a system-internal perspective) in versions before 11gR2. Now to the new feature:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL> select bytes/1024/1024 as mb from user_segments where segment_name='SALES';
MB
----------
563
SQL> select degree from user_tables where table_name='SALES';
DEGREE
----------------------------------------
1
For my tiny machine, this table is huge. I allow automatic determination of the DOP with the following new dynamic parameter:
SQL> alter session set parallel_degree_policy=auto;

Session altered.

SQL> set autotrace on explain
SQL> select sum(amount_sold) from sales;

SUM(AMOUNT_SOLD)
----------------
      1571293299

Execution Plan
----------------------------------------------------------
Plan hash value: 3130505568

-----------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |     5 | 10798   (1)| 00:02:10 |
|   1 |  SORT AGGREGATE        |          |     1 |     5 |            |          |
|   2 |   PX COORDINATOR       |          |       |       |            |          |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     5 |            |          |
|   4 |     SORT AGGREGATE     |          |     1 |     5 |            |          |
|   5 |      PX BLOCK ITERATOR |          |    14M|    70M| 10798   (1)| 00:02:10 |
|   6 |       TABLE ACCESS FULL| SALES    |    14M|    70M| 10798   (1)| 00:02:10 |
-----------------------------------------------------------------------------------

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 2 because of degree limit
I have got a moderate DOP, most likely appropriate for my relatively weak hardware, but still speeding up the query on the relatively big table. I could always override the automatic DOP determination by specifying a parallel hint as in earlier versions. [See in my newer posting why the sentence here was crossed out] Also, the parameter defaults to MANUAL, so unless we change it, automatic parallelization will not take place. In order to demonstrate the quite intelligent computation of the DOP, compared to earlier versions, I will access another, much smaller table in the same session:
SQL> select count(*) from customers;

  COUNT(*)
----------
     30501

Execution Plan
----------------------------------------------------------
Plan hash value: 296924608

------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |   218   (0)| 00:00:03 |
|   1 |  SORT AGGREGATE    |           |     1 |            |          |
|   2 |   TABLE ACCESS FULL| CUSTOMERS | 30501 |   218   (0)| 00:00:03 |
------------------------------------------------------------------------

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold
As you can see, in this case the system does not think that parallelization is appropriate for the SELECT. So it is much smarter than the old force parallel query:
SQL> alter session set parallel_degree_policy=manual;
Session altered.
SQL> alter session force parallel query;
Session altered.
SQL> select count(*) from customers;
COUNT(*)
----------
30501
Execution Plan
----------------------------------------------------------
Plan hash value: 1221513835
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 121 (0)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | PX COORDINATOR | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | | |
| 4 | SORT AGGREGATE | | 1 | | |
| 5 | PX BLOCK ITERATOR | | 30501 | 121 (0)| 00:00:02 |
| 6 | TABLE ACCESS FULL| CUSTOMERS | 30501 | 121 (0)| 00:00:02 |
----------------------------------------------------------------------------
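The notes in the plans above ("degree limit" and "parallel threshold") refer to two new parameters that govern Automatic DOP. A sketch of how to inspect them; the interpretation in the comments reflects my reading of the 11gR2 defaults:

```sql
-- Upper bound for the computed DOP (defaults to CPU, i.e. derived from CPU_COUNT):
SQL> show parameter parallel_degree_limit

-- Statements with an estimated serial runtime below this threshold
-- (default AUTO, about 10 seconds) stay serial:
SQL> show parameter parallel_min_time_threshold
```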
Also, we have a remedy now against inappropriate parallel degrees on tables:
SQL> alter session enable parallel query; -- the default, not force
Session altered.
SQL> alter table customers parallel; -- would cause parallel query before 11gR2
Table altered.
SQL> set autotrace on explain
SQL> select count(*) from customers;
COUNT(*)
----------
30501
Execution Plan
----------------------------------------------------------
Plan hash value: 296924608
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 218 (0)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| CUSTOMERS | 30501 | 218 (0)| 00:00:03 |
------------------------------------------------------------------------
Note
-----
- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold
