Easy way to create large demo-tables in #Exasol and #Oracle

If you need a large set of data to test or demonstrate something, this does the trick:

Create demo table in Oracle

create table t as
select 
rownum as id, -- growing unique number
'Just some text' as textcol,
mod(rownum,5) as numcol1, -- numbers 0-4 round-robin
mod(rownum,1000) as numcol2 , -- numbers 0-999 round robin
5000 as numcol3, -- constant
to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'dd.mm.yyyy') as time_id -- dates for every month 
from dual connect by level<=1e4 -- 10000 rows 
; 

Create demo table in Exasol

create table t as
select 
rownum as id, -- growing unique number
'Just some text' as textcol,
mod(rownum,5) as numcol1, -- numbers 0-4 round-robin
mod(rownum,1000) as numcol2 , -- numbers 0-999 round robin
5000 as numcol3, -- constant
to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'dd.mm.yyyy') as time_id -- dates for every month 
from (select 1 from dual connect by level<=1e4) -- 10000 rows 
; 

In Oracle you may get this error message for a high number of rows (like 10 Mio): ORA-30009: Not enough memory for CONNECT BY operation.
This way it works in spite of it:

SQL> alter session set workarea_size_policy=manual;

Session altered.

SQL> alter session set sort_area_size=1000000000;

Session altered.

SQL> create table t as
     select 
     rownum as id, 
     'Just some text' as textcol,
     mod(rownum,5) as numcol1, 
     mod(rownum,1000) as numcol2 , 
     5000 as numcol3, 
     to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'dd.mm.yyyy') as time_id  
     from dual connect by level<=1e7 ; 
Table created.
 
SQL> select count(*) from t;

  COUNT(*)
----------
  10000000

SQL> select bytes/1024/1024 as mb from user_segments where segment_name='T';

        MB
----------
       480

In Exasol, this problem doesn’t surface:

SQL_EXA> create table t as
         select
         rownum as id,
         'Just some text' as textcol,
         mod(rownum,5) as numcol1,
         mod(rownum,1000) as numcol2 ,
         5000 as numcol3,
         to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'dd.mm.yyyy') as time_id
         from (select 1 from dual connect by level<=1e7) ; 
EXA: create table t as... 
Rows affected: 10000000 
SQL_EXA> select mem_object_size/1024/1024 as mb from exa_all_object_sizes where object_name='T';
EXA: select mem_object_size/1024/1024 as mb from exa_all_object_sizes where...

MB
----------------------------------
           50.03943824768066406250

1 row in resultset.

And the resulting table is smaller because Exasol stores it in columnar compressed format 🙂

,

  1. #1 von Martin Preiss am Juli 20, 2018 - 09:00

    Hi Uwe,

    just for the sake of completeness: when I encounter this kind of memory problems in Oracle and don’t want to resize the memory, I usually use the strategy described by Tanel Poder in https://blog.tanelpoder.com/2008/06/08/generating-lots-of-rows-using-connect-by-safely/ (using multiple generator queries and creating a cartesian join).

    Regards

    Martin

  2. #2 von Uwe Hesse am Juli 20, 2018 - 12:47

    Thank you, Martin 🙂

Kommentar verfassen

Trage deine Daten unten ein oder klicke ein Icon um dich einzuloggen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden /  Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden /  Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden /  Ändern )

Verbinde mit %s

Diese Seite verwendet Akismet, um Spam zu reduzieren. Erfahre, wie deine Kommentardaten verarbeitet werden..