After having worked with both databases for quite some time, this is what I consider to be the key differences between Oracle and Exasol. Of course the two have much in common: Both are relational databases with a transaction management system that supports the ACID model and both follow the ANSI SQL standard – both with some enhancements. Coming from Oracle as I do, much in Exasol looks quite familiar. But let’s focus on the differences:
Oracle is leading technology for Online Transaction Processing (OLTP). If you have a high data volume with many users doing concurrent changes, this is where Oracle shines particularly.
Exasol is leading technology for analytical workloads. If you want to do real-time ad hoc reporting on high data volume, this is where Exasol shines particularly.
Data Format & In-Memory processing
Oracle uses a row-oriented data format, which is well suited for OLTP but not so much for analytical workloads. That’s why Hybrid Columnar Compression (only available on Engineered Systems respectively on Oracle proprietary storage) and the In-Memory Column Store (extra charged option) have been added in recent years.
Exasol uses natively a compressed columnar data format and processes this format in memory. That is very good for analytical queries but bad for OLTP because one session that does DML on a table locks that table against DML from other sessions. Read Consistent SELECT is possible for these other sessions, though.
Oracle was designed for OLTP at times when memory was scarce and expensive. Exasol was designed to process analytical workloads in memory.
Oracle started as a non-clustered (single instance) system. Real Application Clusters (RAC) have been added much later. The majority of Oracle installations is still non-clustered. RAC (extra charged option) is rather an exception than the rule. Most RAC installations are 2-node clusters with availability as the prime reason, scalability being rather a side aspect.
Exasol was designed from the start to run on clustered commodity Intel servers. Prime reasons were MPP performance and scalability with availability being rather a side aspect.
This doesn’t matter for most Oracle installations, only for RAC. Here, Oracle uses a shared disk architecture while Exasol uses a shared nothing architecture, which is optimal for performance because every Exasol cluster node can operate on a different part of the data in parallel. Drawback is that after adding nodes to an Exasol cluster, the data has to be re-distributed.
With Exadata, Oracle tries to compensate the performance disadvantage of the shared disk architecture by enabling the storage servers to filter data locally for analytical workloads. This approach leads to better performance than Oracle can deliver on other (non-proprietary) platforms.
Availability & Recoverability
Clearly, Oracle is better in this area. A non-clustered Oracle database running in archive log mode will enable you to recover every single committed transaction you did since you took the last backup. With Exasol, you can only restore the last backup and all changes since then are lost. You can safeguard an Oracle database against site failure with a standby database at large distance without performance impact. Exasol doesn’t have that. With RAC, you can protect an Oracle database against node failure. The database stays up (the Global Resource Directory is frozen for a couple of seconds, though) upon node failure with no data loss.
If an Exasol cluster node fails, this leads to a database restart. Means no availability for a couple of seconds and all sessions get disconnected. But also no data loss. Optionally, Exasol can be configured as Synchronous Dual Data Center – similar to Oracle’s Extended RAC.
Complexity & Manageability
I realized that there’s a big difference between Exasol and Oracle in this area when I was teaching an Exasol Admin class recently: Some seasoned Oracle DBAs in the audience kept asking questions like “We can do this and that in Oracle, how does that work with Exasol?” (e.g. creating Materialized Views or Bitmap Indexes or an extra Keep Cache) and my answer was always like “We don’t need that with Exasol to get good performance”.
Let’s face it, an Oracle database is probably one of the most complex commercial software products ever developed. You need years of experience to administer an Oracle database with confidence. See this recent Oracle Database Administration manual to get an impression. It has 1690 pages! And that’s not yet Real Application Clusters, which is additionally 492 pages. Over 2100 pages of documentation to dig through, and after having worked with Oracle for over 20 years, I can proudly say that I actually know most of it.
In comparison, Exasol is very easy to use and to manage, because the system takes care of itself largely. Which is why our Admin class can have a duration of only two days and attendees feel empowered to manage Exasol afterwards.
That was intentionally so from the start: Exasol customers are not supposed to study the database for years (or pay someone who did) in order to get great performance. Oracle realized that being complex and difficult to manage is an obstacle and came out with the Autonomous Database – but that is only available in the proprietary Oracle Cloud.
Using comparable hardware and processing the same (analytical) workload, Exasol outperforms any competitor. That includes Oracle on Exadata. Our Presales consultants regard Exadata as a sitting duck, waiting to get shot on a POC. I was personally shocked to learn that, after drinking the Oracle Kool-Aid myself for years.
In my opinion, these two points are most important: Exasol is faster and at the same time much easier to manage! I mean anything useless could be easy to manage, so that’s not an asset on its own. But together with delivering striking performance, that’s really a big deal.
This is and has always been a painpoint for Oracle customers: The licensing of an Oracle database is so complex and fine granular that you always wonder “Am I allowed to do this without violating my license? Do we really need these features that we paid for? Are we safe if Oracle does a License Audit?” With Exasol, all features are always included and the two most popular license types are totally easy to understand: You pay either for the data volume loaded into the cluster or for the amount of memory assigned to the database. No sleepless nights because of that!
This topic becomes increasingly important as many of our new customers want to deploy Exasol in the cloud. And you may have noticed that Oracle pushes going cloud seriously over the last years.
Exasol runs with all features enabled in the cloud: You can choose between Amazon Web Services, (AWS), Microsoft Azure and ExaCloud
This is presently the most popular way our customers run Exasol in the cloud. See here for more details.
Microsoft’s cloud can also be used to run Exasol, which gives you the option to choose between two major public cloud platforms. See here for more details.
Hosted and managed by Exasol, ExaCloud is a full database-as-a-service offering. See here for more details.
Hybrid Exasol deployments that combine cloud with on-prem can also be used, just depending on customer requirements.
Oracle offers RAC only on the Oracle Cloud platform, not on public clouds. Various other features are also restricted to be available only in Oracle’s own cloud. The licensing model has been tweaked to favor the usage of Oracle’s own cloud over other public clouds generally.
Customers love Exasol, as the recent Dresner report confirms. We get a perfect recommendation score. I can also tell that from personal encounters: Literally every customer I met is pleased with our product and our services!
Oracle is great for OLTP and okay for analytical workloads – especially if you pay extra for things like Partitioning, RAC, In-Memory Column Store and Exadata. Then the performance you get for your analytical workload might suit your present demand.
Exasol is totally bad for OLTP but best in the world for analytical workloads. Do you think your data volume and your analytic demands will grow?
The technique to recover only certain objects from an ordinary backup is called Virtual Access Restore. Means you create a database from backup that contains only the minimum elements needed to access the objects you request. This database is then removed afterwards.
Let’s see an example. This is my initial setup:
One database in a 2+1 cluster. Yes it’s tiny because it lives on my notebook in VirtualBox. See here how you can get that too.
It uses the data volume v0000 and I took a backup into the archive volume v0002 already.
I have a schema named RETAIL there with the table SALES:
By mistake, that table gets dropped:
And I’m on AUTOCOMMIT, otherwise this could be rolled back in Exasol. Virtual Access Restore to the rescue!
First I need another data volume:
Notice the size of the new volume: It is smaller than the overall size of the backup respectively the size of the “production database”! I did that to prove that space is not much of a concern here.
Then I add a second database to the cluster that uses that volume. The connection port (8564) must be different from the port used by the first database and the DB RAM in total must not exceed the licensed size, which is limited to 4 GB RAM in my case:
I did not start that database because for the restore procedure it has to be down anyway. Clicking on the DB Name and then on the Backups button gets me here:
No backup shown yet because I didn’t take any backups with exa_db2. Clicking on Show foreign database backups:
The Expiration date must be empty for a Virtual Access Restore, so I just remove it and click Apply. Then I select the Restore Type as Virtual Access and click Restore:
This will automatically start the second database:
I connect to exa_db2 with EXAplus, where the Schema Browser gives me the DDL for the table SALES:
I take that to exa_db1 and run it there, which gives me the table back but empty. Next I create a connection from exa_db1 to exa_db2 and import the table
create connection exa_db2 to '192.168.43.11..13:8564' user 'sys' identified by 'exasol'; import into retail.sales from exa at exa_db2 table retail.sales;
This took about 2 Minutes:
The second database and then the second data volume can now be dropped. Problem solved!
Exasol introduced Partitioning in version 6.1. This feature helps to improve the performance of statements accessing large tables. As an example, let’s take these two tables:
Say t2 is too large to fit in memory and may get partitioned therefore.
In contrast to distribution, partitioning should be done on columns that are used for filtering:
ALTER TABLE t2 PARTITION BY WhereCol;
Now without taking distribution into account (on a one-node cluster), the table t2 looks like this:
Notice that partitioning changes the way the table is physically ordered on disk.
A statement like
SELECT * FROM t2 WHERE WhereCol=’A’;
would have to load only the red part of the table into memory. This may show benefits on a one-node cluster as well as on multi-node clusters. On a multi-node cluster, a large table like t2 is distributed across the active nodes. It can additionally be partitioned also. Should the two tables reside on a three-node cluster with distribution on the JoinCol columns and the table t2 partitioned on the WhereCol column, they look like this:
That way, each node has to load a smaller portion of the table into memory if statements are executed that filter on the WhereCol column while joins on the JoinCol column are still local joins.
EXA_(USER|ALL|DBA)_TABLES shows both the distribution key and the partition key if any.
Notice that Exasol will automatically create an appropriate number of partitions – you don’t have to specify that.