Posts Tagged Exasol
With Python integrated into the database, you can combine the power of SQL with one of the most popular programming languages. And you run the program where the data is, inside the database, instead of having to bring the data to the program.
The Exasol database integrates the popular programming languages Python, Java and R as UDF scripting languages, together with the lesser-known but powerful and elegant programming language Lua.
Let’s look at an example of how Python can be used in Exasol, together with SQL:
--/
create or replace python3 scalar script find_books(keyword varchar(2000000)) emits (title varchar(2000000)) as
import urllib.request
import urllib.parse
import json
import ssl

# Disables certificate verification for HTTPS requests (fine for a demo, not for production)
ssl._create_default_https_context = ssl._create_unverified_context

def run(ctx):
    # Query the books API with the URL-encoded keyword
    with urllib.request.urlopen('https://www.googleapis.com/books/v1/volumes?maxResults=40&q=' + urllib.parse.quote_plus(ctx.keyword)) as url:
        s = url.read()
    data = json.loads(s)
    # Emit one row per book title found
    for item in data["items"]:
        ctx.emit(item["volumeInfo"]["title"])
/
We’re using the free Google Books API here to access book titles. That script is then called like this:
select find_books('discworld');
How about sorting that list by title length? That’s something SQL can do very well:
select length(title), title from (select find_books('discworld')) order by 1;
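If you don’t have a database at hand, the following standalone Python sketch mimics what the UDF and the ORDER BY do together. It is only an illustration: SAMPLE_RESPONSE is made-up data in the shape the script above expects, and the helper names extract_titles and order_by_length are my own, not part of Exasol. The .get() fallback is also an addition, in case a response comes back without an "items" key.

```python
import json

# Hypothetical, trimmed-down response in the shape the UDF expects:
# a top-level "items" list whose entries carry volumeInfo.title.
SAMPLE_RESPONSE = json.dumps({
    "items": [
        {"volumeInfo": {"title": "Example Title Three Words"}},
        {"volumeInfo": {"title": "Short"}},
        {"volumeInfo": {"title": "A Longer Title"}},
    ]
})


def extract_titles(raw_json):
    """Return the title of every item, like the ctx.emit() loop in the UDF."""
    data = json.loads(raw_json)
    # .get() is an addition: it avoids a KeyError when "items" is missing.
    return [item["volumeInfo"]["title"] for item in data.get("items", [])]


def order_by_length(titles):
    """Mimic: select length(title), title from (...) order by 1."""
    rows = [(len(title), title) for title in titles]
    # Sort on the first column only, as ORDER BY 1 does.
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for length, title in order_by_length(extract_titles(SAMPLE_RESPONSE)):
        print(length, title)
```

Inside the database, the emit loop and the sort are split between Python and SQL. Here both happen in Python, which is exactly the data movement the UDF approach avoids.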
This should give you an idea of the endless and powerful options you have by combining Python with SQL, both integrated in the database.
By the way: We’re happy to educate you on this for free with our online learning course Exasol Advanced Analytics 🙂
Exasol leads the categories Performance, Platform Reliability and Support Quality for Analytical Database products. And we get a 100% recommendation score from the 782 customers in the survey.
So it’s not one of the big names in the industry that comes out on top of this survey. Not Oracle, not Teradata, not Snowflake, not SAP HANA: it’s Exasol that leads in Analytical Databases!
Customer quote: “Unbelievable query performance with almost zero administration effort. You just have to experience it yourself. Once you see it for yourself, you won’t want to work with any other database.”
- Exasol is the world’s fastest analytical database
- Exasol is reliable and easy to maintain
- Exasol’s services and attitude towards customers are highly appreciated
Compare that with your legacy platform: It’s time to contact us now!
The TPC-H Benchmark is for Decision Support Systems. It’s described in great detail on the TPC.org site, but you may find it quite an effort to generate the data and prepare the SQL for table creation and reporting.
At least I did, which is why I thought having it all ready to download and run would be helpful.
What I have prepared for Oracle and for Exasol is:
- The data files (CSV format) for the 1 GB TPC-H
- The DDL for the TPC-H tables
- The loader commands to populate these tables
- The 22 queries for the TPC-H benchmark
You can download it here:
The data volume is of course quite small for a production data warehouse but ideal for quick testing and self-education. I’m using it together with VirtualBox and VMs on my notebook with 16 GB memory.
See here for a demo – I’m setting up the TPC-H for both Oracle and Exasol and then I do a comparison:
Some remarks about the comparison:
I’m an Exasol employee and the outcome is very positive for Exasol.
Nevertheless, I tried to do a fair comparison. It’s just running the 22 plain SELECT statements, with no tuning and no tweaking of the Exasol database or the underlying VM.
The Oracle version is quite recent (18.3) but not the most recent. The same goes for the Exasol version (6.2), which is not the just-released Exasol 7.0.
As you can see, out of the box the Exasol database is about 6 times faster than the Oracle database for the same workload on the same hardware resources, without any tuning.
I suppose you could get better performance from Oracle for the 22 queries with some effort, like analyzing the workload, adding indexes of the various available types, partitioning the tables, adding SQL Profiles and Optimizer Directives etc.
The point is, that’s all not required with Exasol. I just run the workload twice and everything is self-optimized afterwards.
You could call this an autonomous database 😉
It’s totally easy to reproduce the test for yourself: Just download our free Community Edition; it’s what I’m using in this benchmark.
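If you want to time the runs yourself, a small harness like the one below may help. It is a minimal sketch and not the tooling I used for the demo: execute is a hypothetical callable that you would wire to whatever database driver you use, and time_workload and speedup are names I made up for illustration.

```python
import time


def time_workload(execute, queries, runs=2):
    """Run every query `runs` times and return the total seconds per run.

    `execute` is a hypothetical callable taking one SQL string; it stands
    in for your driver's way of running a statement and fetching the result.
    """
    totals = []
    for _ in range(runs):
        start = time.perf_counter()
        for sql in queries:
            execute(sql)
        totals.append(time.perf_counter() - start)
    return totals


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate ran than the baseline."""
    return baseline_seconds / candidate_seconds
```

Running the workload at least twice matters for the comparison above, because the second run is the one that benefits from the self-optimization. For fair numbers, make sure execute also fetches the full result set on both systems, not just the first rows.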
Keep in mind that this is a Decision Support System benchmark with an analytical workload. Oracle looks much better in a comparison with an OLTP workload.
But for analytics: Exasol is second to none.
To all the other vendors’ presales consultants out there who encounter us on a PoC: Good luck 🙂