9.2. Updating the Cluster Configuration

If you need to change the configuration of your cluster (adding or removing nodes, or changing the K-safety value or the number of partitions per server), you can save the database as a snapshot, shut down, edit the deployment file, restart with the new number of servers, and restore the database. (See Chapter 13, Saving & Restoring a VoltDB Database, for information on using save and restore.) This is also the correct approach when benchmarking, where you need to change the number of partitions or other runtime options.

However, if you are simply adding nodes to the cluster to add capacity or increase performance, you can add the nodes while the database is running. Adding nodes "on the fly" is also known as elastic scaling.

9.2.1. Adding Nodes with Elastic Scaling

When you are ready to extend the cluster by adding one or more nodes, you start the VoltDB database process on each new node using the voltdb add command, specifying the name of one of the existing cluster nodes as the host. For example, if you are adding node ServerX to a cluster where ServerA is already a member, you can execute the following command on ServerX:

me@ServerX:~$ voltdb add -l ~/license.xml --host=ServerA 

Once the add action is initiated, the cluster performs the following tasks:

  1. The cluster acknowledges the presence of a new server.

  2. The active application catalog and deployment settings are sent to the new node.

  3. Once sufficient nodes are added, copies of all replicated tables and their share of the partitioned tables are sent to the new nodes.

  4. As the data is redistributed (or rebalanced), the added nodes begin participating as full members of the cluster.

There are some important notes to consider when expanding the cluster using elastic scaling:

  • You must add a sufficient number of nodes to create an integral K-safe unit, that is, K+1 nodes. For example, if the K-safety value for the cluster is two, you must add three nodes at a time to expand the cluster. If the cluster is not K-safe (in other words, it has a K-safety value of zero), you can add one node at a time.

  • When you add nodes to a K-safe cluster, the nodes added first will complete steps #1 and #2 above, but will not complete steps #3 and #4 until the required number of nodes has been added, at which point all nodes rebalance together.

  • While the cluster is rebalancing (Step #3), the database continues to handle incoming requests. However, depending on the workload and amount of data in the database, rebalancing may take a significant amount of time.

  • When using database replication (DR), the master and replica databases must have the same configuration. If you use elasticity to add nodes to the master cluster, replication stops. Once rebalancing is complete on the master database, you can restart the replica with additional nodes matching the new master cluster configuration and restart replication.
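
The K-safety rule in the first two notes can be sketched as a small calculation. This is only an illustration of the arithmetic, not part of VoltDB: the function name nodes_still_needed is hypothetical, and it assumes that nodes are counted in complete units of K+1, with any partial unit waiting at steps #1 and #2.

```python
def nodes_still_needed(k_safety: int, nodes_added: int) -> int:
    """Return how many more nodes must be added before rebalancing can begin.

    Hypothetical helper for illustration only. It assumes nodes join in
    units of K+1 and that an incomplete unit waits until it is filled.
    """
    if k_safety < 0 or nodes_added < 0:
        raise ValueError("k_safety and nodes_added must be non-negative")
    unit = k_safety + 1                  # size of one integral K-safe unit
    remainder = nodes_added % unit       # nodes waiting in an incomplete unit
    return 0 if remainder == 0 else unit - remainder


# K-safety of two: one node has been added, two more are required.
print(nodes_still_needed(2, 1))
# K-safety of zero: a single node is already a complete unit.
print(nodes_still_needed(0, 1))
```

A result of zero means the nodes added so far form complete units, so none of them is left waiting for more nodes to join.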

9.2.2. Configuring How VoltDB Rebalances New Nodes

Once you add the necessary number of nodes (based on the K-safety value), VoltDB rebalances the cluster, moving data from existing partitions to the new nodes. During the rebalance operation, the database remains available and actively processing client requests. How long the rebalance operation takes is dependent on two factors: how often rebalance tasks are processed and how much data each transaction moves.

Rebalance tasks are fully transactional, meaning they operate within the database's ACID-compliant transactional model. Because they involve moving data between two or more partitions, they are also multi-partition transactions. This means that each rebalance work unit can incrementally add to the latency of pending client transactions.

You can control how quickly the rebalance operation completes versus how much rebalance work impacts ongoing client transactions using two attributes of the <elastic> element in the deployment file:

  • The duration attribute sets a target value for the length of time each rebalance transaction will take, specified in milliseconds. The default is 50 milliseconds.

  • The throughput attribute sets a target value for the number of megabytes per second that will be processed by the rebalance transactions. The default is 2 megabytes per second.

When you change the target duration, VoltDB adjusts the amount of data that is moved in each transaction to reach the target execution time. If you increase the duration, the volume of data moved per transaction increases. Similarly, if you reduce the duration, the volume per transaction decreases.

When you change the target throughput, VoltDB adjusts the frequency of rebalance transactions to achieve the desired volume of data moved per second. If you increase the target throughput, the number of rebalance transactions per second increases. Similarly, if you decrease the target throughput, the number of transactions per second decreases.
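
Because throughput is a target in megabytes per second, it gives a rough way to estimate how long a rebalance might take. The sketch below is a back-of-the-envelope calculation only: the function name and the 10,240 MB figure are made up for illustration, and it assumes the target throughput is actually sustained, which depends on the workload.

```python
def estimated_rebalance_seconds(data_to_move_mb: float,
                                throughput_mb_per_s: float = 2.0) -> float:
    """Estimate rebalance time as data volume divided by target throughput.

    Hypothetical helper for illustration. The default of 2.0 mirrors the
    documented default of the throughput attribute.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput_mb_per_s must be positive")
    return data_to_move_mb / throughput_mb_per_s


# Assumed example: 10,240 MB must move to the new nodes.
print(estimated_rebalance_seconds(10240))        # at the default 2 MB/s
print(estimated_rebalance_seconds(10240, 1.0))   # at a target of 1 MB/s
```

Halving the target throughput doubles the estimate, which is the trade-off between completing the rebalance quickly and limiting its impact on client transactions.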

The <elastic> element is a child of the <systemsettings> element. For example, the following deployment file sets the target duration to 15 milliseconds and the target throughput to 1 megabyte per second before starting the database:

<deployment>
   . . .
   <systemsettings>
       <elastic duration="15" throughput="1"/>
   </systemsettings>
</deployment>
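
As a quick sanity check before starting the database, the elastic settings can be read back from a deployment file. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name read_elastic_settings is hypothetical, the fallback values are the defaults stated above, and the example omits the elided parts of the deployment file so that it parses as XML.

```python
import xml.etree.ElementTree as ET

# Defaults as documented for the <elastic> element.
DEFAULT_DURATION_MS = 50
DEFAULT_THROUGHPUT_MB_PER_S = 2


def read_elastic_settings(deployment_xml: str) -> dict:
    """Return the elastic duration and throughput from deployment XML text.

    Hypothetical helper for illustration; missing elements or attributes
    fall back to the documented defaults.
    """
    root = ET.fromstring(deployment_xml)            # root is <deployment>
    elastic = root.find("systemsettings/elastic")   # None if not present
    attrs = elastic.attrib if elastic is not None else {}
    return {
        "duration_ms": int(attrs.get("duration", DEFAULT_DURATION_MS)),
        "throughput_mb_per_s": int(
            attrs.get("throughput", DEFAULT_THROUGHPUT_MB_PER_S)
        ),
    }


example = """<deployment>
   <systemsettings>
       <elastic duration="15" throughput="1"/>
   </systemsettings>
</deployment>"""

print(read_elastic_settings(example))
```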