Adding a cluster node increases not only the available storage capacity but also the total compute power of your cluster. This kind of scale-out is quite a common operation for Exasol customers.
My example shows how to change an existing 2+1 cluster into a 3+0 cluster. Before you can enlarge the database with an active node, that node has to be a reserve node first. See here how to add a reserve node to a 2+0 cluster. Of course, you can add another reserve node afterwards to change from 3+0 to 3+1. See here if you wonder why you may want to have a reserve node at all.
Initial state – reserve node is present
I start with a 2+1 cluster – 2 active nodes and 1 reserve node:
For later comparison, let’s look at the distribution of rows of one of my tables:
The rows are roughly evenly distributed across the two active nodes.
Before you continue, it would be a good idea to take a backup on a remote archive volume now – just in case.
Shutdown database before volume modification
A data volume used by a database cannot be modified while that database is up, so shut it down first:
After going to the Storage branch in EXAoperation, click on the data volume:
Then click on Edit:
Decrease volume redundancy to 1
Change the redundancy from 2 to 1, then click Apply:
Why is the redundancy reduced from 2 to 1 here? Let’s try to explain that. Initially, I had 2 active nodes with a volume using redundancy 2:
A and B are master segments while A’ and B’ are mirrored segments. If I could add a node to this volume keeping the existing segments, it would look like this:
Of course this would be a bad idea. The redundancy is reduced to 1 before the new node is added to the volume:
At first, the volume then holds only distributed master segments with no mirrors. Afterwards, the redundancy is increased to 2 again:
This way, every master segment can be mirrored on a neighbor node. That’s why the redundancy needs to be reduced to 1.
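The reasoning above can be replayed with a small toy model. This is only a sketch under simplifying assumptions: one master segment per node, and each mirror placed on the next node in a ring. The function name and the segment labels are made up for illustration and do not reflect how the storage service is implemented.

```python
def volume_layout(nodes, redundancy):
    """Return {node: [segments]} for a toy volume model.

    Assumption: every node holds exactly one master segment and, with
    redundancy 2, the mirror of a master lives on the next node in the ring.
    """
    if redundancy not in (1, 2):
        raise ValueError("this toy model only covers redundancy 1 or 2")
    if redundancy == 2 and len(nodes) < 2:
        raise ValueError("mirroring needs at least two nodes")

    layout = {node: [f"master({node})"] for node in nodes}
    if redundancy == 2:
        for i, node in enumerate(nodes):
            neighbor = nodes[(i + 1) % len(nodes)]
            layout[neighbor].append(f"mirror({node})")
    return layout


# The three stages described in the text.
two_nodes = volume_layout(["n11", "n12"], redundancy=2)
masters_only = volume_layout(["n11", "n12", "n13"], redundancy=1)
three_nodes = volume_layout(["n11", "n12", "n13"], redundancy=2)

for title, layout in [
    ("2 nodes, redundancy 2", two_nodes),
    ("3 nodes, redundancy 1", masters_only),
    ("3 nodes, redundancy 2", three_nodes),
]:
    print(title)
    for node, segments in layout.items():
        print(f"  {node}: {', '.join(segments)}")
```

In this model, rebuilding the mirrors from scratch after the third node has joined is what lets every master, including the new one, get a mirror on a neighbor.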
Add new node to volume
After having decreased the volume redundancy to 1, click Edit on the volume detail page again, add n13 as a new master node to the volume, and click Apply:
Increase redundancy to 2
Now click Edit again and increase the redundancy to 2:
The state of the volume now shows as RECOVERING. Don’t worry, it just means that the mirrored segments are being created.
Enlarge the database
Now click on the database link on the EXASolution screen:
Select the Action Enlarge and click Submit:
Enter 1 and click Apply:
The database detail page looks like this now:
Technically, this is a 3+0 cluster now – but the third node doesn’t contain any data yet. If we look at the same table as before, we see that no rows are on the new node:
To change that, a REORGANIZE needs to be done. This can happen at the database, schema, or table level. The easiest to perform is REORGANIZE DATABASE:
This took about 10 minutes on my tiny database. The command redistributes every table across all cluster nodes and can be time-consuming with large data volumes. While a table is being reorganized, it is locked against DML. You can monitor the ongoing reorganization by selecting from EXA_DBA_PROFILE_RUNNING in another session.
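For the three levels mentioned above, the statements could be assembled as in the following sketch. REORGANIZE DATABASE and EXA_DBA_PROFILE_RUNNING come from this article; the SCHEMA and TABLE forms reflect my understanding of the syntax and should be checked against the SQL reference of your version. The helper name and the object names RETAIL and RETAIL.SALES are hypothetical.

```python
def reorganize_statement(scope, name=None):
    """Build a REORGANIZE statement for the database, a schema, or a table.

    Names are inserted as given, without quoting, so only pass trusted
    identifiers.
    """
    scope = scope.upper()
    if scope == "DATABASE":
        if name is not None:
            raise ValueError("REORGANIZE DATABASE takes no object name")
        return "REORGANIZE DATABASE;"
    if scope in ("SCHEMA", "TABLE"):
        if not name:
            raise ValueError(f"REORGANIZE {scope} needs an object name")
        return f"REORGANIZE {scope} {name};"
    raise ValueError(f"unsupported scope: {scope}")


# Query to run from a second session while the reorganization is ongoing.
MONITORING_QUERY = "SELECT * FROM EXA_DBA_PROFILE_RUNNING;"

print(reorganize_statement("database"))
print(reorganize_statement("schema", "RETAIL"))       # hypothetical schema
print(reorganize_statement("table", "RETAIL.SALES"))  # hypothetical table
print(MONITORING_QUERY)
```

Reorganizing table by table or schema by schema keeps each lock shorter, which may matter on a busy system where locking every table in one go is not acceptable.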
Let’s check the distribution of the previous table again:
As you can see above, now there are rows on the added node. Also EXAoperation confirms that the new node is not empty any more:
On a larger database, you would see that the volume usage per node is lower than before and that every node holds roughly the same amount of data. For fault tolerance, you could add another reserve node now.
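To put a number on "roughly the same amount of data", the per-node row counts can be reduced to a single skew value. The sketch below assumes you already have the counts per node, for example from a query that groups by Exasol's IPROC() function. The row counts used here are invented purely for illustration and are not the values from my table.

```python
def distribution_skew(rows_per_node):
    """Return the largest relative deviation from the mean row count.

    0.0 means perfectly even; 1.0 means at least one node is empty
    (or holds twice the mean).
    """
    counts = list(rows_per_node.values())
    if not counts:
        raise ValueError("no nodes given")
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 0.0
    return max(abs(count - mean) for count in counts) / mean


# Invented example numbers, keyed by node number.
before_reorganize = {0: 510, 1: 490, 2: 0}   # new node still empty
after_reorganize = {0: 334, 1: 333, 2: 333}  # rows spread over all nodes

print(f"skew before REORGANIZE: {distribution_skew(before_reorganize):.3f}")
print(f"skew after REORGANIZE:  {distribution_skew(after_reorganize):.3f}")
```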
Summary of steps
- Add a reserve node (if not yet existing)
- Take a backup on a remote archive volume
- Shut down the database
- Decrease volume redundancy to 1
- Add former reserve node as new master node to the volume
- Increase redundancy to 2
- Enlarge database by 1 active node
- Run REORGANIZE to redistribute the data across all nodes
- Add another reserve node (optional)
For development, demonstrations and testing, I need different database environments: Oracle, Postgres and Exasol in the first place. Having them available as VMs on my notebook is quite convenient. I consider my current corporate notebook an upper-middle-class one. It’s a Dell Latitude 7480 with 2 cores, an SSD and 16 GB of memory, running Windows 10. Not too shabby but also not extremely powerful.
After having used VirtualBox for years, an opportunity recently came up to become a bit more familiar with Hyper-V, because one of our customers insisted on using only that for a team training. Yes, I’m a bit biased towards VirtualBox. Why do I prefer it over Hyper-V? Because it’s way faster for what I do with it. In particular, I observed that Hyper-V consumes much more CPU resources for the VMs than VirtualBox does. And that slows down everything, of course.
For example, an Exasol cluster node installation takes more than 30 minutes with Hyper-V compared to 5 minutes with VirtualBox! The setup is the same for both: on my notebook, I create 4 VMs, namely 1 license server with 1500 MB memory and 3 data nodes with 2500 MB memory each. Each VM gets 1 virtual core. That’s no problem for VirtualBox, but Hyper-V struggles and pushes the CPU utilization on my notebook to 100% or close to it during the whole install.
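A quick back-of-the-envelope calculation shows how much of the notebook this setup claims. It is only a sketch: it treats 16 GB as 16384 MB, ignores what Windows and the hypervisor need for themselves, and the VM names are placeholders.

```python
HOST_MEMORY_MB = 16 * 1024  # assumption: 16 GB counted as 16384 MB
HOST_CORES = 2

# Memory per VM in MB, as described above; every VM gets 1 virtual core.
vm_memory_mb = {
    "license-server": 1500,
    "data-node-1": 2500,
    "data-node-2": 2500,
    "data-node-3": 2500,
}

total_vm_memory_mb = sum(vm_memory_mb.values())
remaining_memory_mb = HOST_MEMORY_MB - total_vm_memory_mb
vcpu_per_core = len(vm_memory_mb) / HOST_CORES

print(f"Memory assigned to VMs: {total_vm_memory_mb} MB")
print(f"Memory left for the host: {remaining_memory_mb} MB")
print(f"Virtual cores per physical core: {vcpu_per_core}")
```

Memory is not the bottleneck here, since the VMs take a bit more than half of it. The CPU is oversubscribed, with four virtual cores sharing two physical ones, which fits the observation that the CPU is where the hypervisors differ.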
In general, both Hyper-V and VirtualBox can do the same or very similar things. I’m sure there are use cases where Hyper-V performs well on a notebook too, and probably it’s better suited for dedicated virtualization servers than for a notebook anyway. So don’t get me wrong: I do not say VirtualBox is better than Hyper-V overall.
But if you want to run database sandboxes on your notebook, I strongly recommend using VirtualBox instead of Hyper-V.
After having installed a 2+0 Cluster in Hyper-V, now let’s expand that to a 2+1 Cluster.
Add the node as a VM first
Add another VM in Hyper-V with the same attributes as the existing two data nodes n11 and n12 and name it n13:
- 2500 MB memory, not dynamically extending
- Legacy network adapter, connected to edu-cluster-interconnect
- Network adapter, connected to edu-public-network
- Boot order with legacy network adapter first
- Two hard disks of type VHDX with max. size 100 GB
Give it a static MAC address for the legacy network adapter (08:00:27:58:03:21) and another for the normal network adapter (08:00:27:71:27:26), and make sure they do not conflict with other MAC addresses in your environment.
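If you keep a list of the MAC addresses already in use, the conflict check can be scripted. The following is a minimal sketch: the two new addresses are the ones given above, while the list of existing addresses is a made-up placeholder that you would replace with the real ones from your environment.

```python
import re

# Six colon-separated pairs of hex digits, e.g. 08:00:27:58:03:21
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}")


def normalize_mac(mac):
    """Validate the format and return the address in lower case."""
    if not MAC_PATTERN.fullmatch(mac):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return mac.lower()


def find_conflicts(new_macs, existing_macs):
    """Return new addresses that collide with existing ones or each other."""
    existing = {normalize_mac(mac) for mac in existing_macs}
    seen = set()
    conflicts = []
    for mac in new_macs:
        normalized = normalize_mac(mac)
        if normalized in existing or normalized in seen:
            conflicts.append(normalized)
        seen.add(normalized)
    return conflicts


# Addresses for n13 as given in the text.
n13_macs = ["08:00:27:58:03:21", "08:00:27:71:27:26"]

# Placeholder values only; list the addresses of your other VMs here.
existing_macs = ["08:00:27:AA:BB:01", "08:00:27:AA:BB:02"]

conflicts = find_conflicts(n13_macs, existing_macs)
print("Conflicts:", conflicts if conflicts else "none")
```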
Copy an existing node in EXAoperation
Go to the nodes branch and click on the link under n11:
On the nodes detail page, click on Copy:
Enter 13 as the node number and change the MAC addresses as listed above, then click on Copy Node:
Click on the n0013(13) link and check the disk configuration of the new node. It should look like this:
Don’t forget to set the install flag
On the nodes branch, tick the checkbox of node n13 and Execute the action Set install flag for it:
The state changes to Unknown To install. Now power on n13 in Hyper-V.
After a while the logservice should display the new node getting installed:
This can be time-consuming (it took more than 30 minutes on my notebook), but in the end, it lists: Boot process finished after x seconds. When you then refresh the nodes branch, the state of the node n13 changes to Running To Install.
Now tick the checkbox of n13 again and Execute the action Set active flag:
The state changes to Running Active for n13, same as for the other nodes. Notice that you can’t tell from this page if a node is an active node or a reserve node.
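The sequence of states n13 went through can be written down as a small lookup, which is handy as a checklist when repeating the procedure. This is just a sketch of what I observed on the nodes branch, not an EXAoperation API; the state strings are copied from the page as shown above and the helper function is hypothetical.

```python
# (action, state shown on the nodes branch afterwards), in order.
EXPECTED_STEPS = [
    ("Set install flag", "Unknown To install"),
    ("Power on the VM and wait for the installation", "Running To Install"),
    ("Set active flag", "Running Active"),
]


def next_action(current_state):
    """Return the action that follows the given state, or None when done."""
    states = [state for _, state in EXPECTED_STEPS]
    if current_state not in states:
        raise ValueError(f"unexpected node state: {current_state!r}")
    position = states.index(current_state)
    if position + 1 < len(EXPECTED_STEPS):
        return EXPECTED_STEPS[position + 1][0]
    return None


for _, state in EXPECTED_STEPS:
    print(f"{state!r} -> next: {next_action(state)}")
```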
Add the disk capacity of the new node to the storage service
On the Storage branch, tick the checkbox for n13 and click on Add Unused Disks:
Add the new node as reserve node to the running database
On the database detail page, click Edit and add n13 as a Reserve node, then click Apply:
The database should look like this now:
Notice that you didn’t need to shut down the database to add a reserve node to it.
Now your cluster has been extended to a 2+1 Cluster. The next article will explain how you can enlarge the database and make this a 3+0 Cluster.
Stay tuned 🙂