Apache Cassandra 1.0 Documentation

Managing a Cassandra Cluster

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your product version.

This section discusses routine management and maintenance tasks.

Running Routine Node Repair

The nodetool repair command repairs inconsistencies across all of the replicas for a given range of data. Repair should be run at regular intervals during normal operations, as well as during node recovery scenarios, such as bringing a node back into the cluster after a failure.

Unless your Cassandra applications perform no deletes at all, production clusters require periodic, scheduled repairs on all nodes. The hard requirement for repair frequency is the value of gc_grace_seconds: make sure you run a repair operation on each node at least once within this period. Otherwise, tombstones can be discarded before a delete has reached every replica, allowing deleted data to reappear.

Note

Repair is a resource-intensive operation that consumes significant disk I/O and CPU. Use caution when running node repair on more than one node at a time, and be sure to schedule regular repair operations for low-usage hours.

In systems that seldom delete or overwrite data, it is possible to raise the value of gc_grace_seconds at a minimal cost in extra disk space used. This allows wider intervals for scheduling repair operations with the nodetool utility.
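
For example, one common approach is to drive repairs from cron on each node, staggered so that only one node is repairing at a time. The schedule and log path below are illustrative only; adjust them for your environment and make sure every node is repaired at least once within gc_grace_seconds (10 days by default).

    # Illustrative crontab entry: repair this node's primary ranges
    # every Saturday at 02:00, logging the output.
    0 2 * * 6  nodetool repair -pr >> /var/log/cassandra/repair.log 2>&1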

Adding Capacity to an Existing Cluster

Cassandra allows you to add capacity to a cluster by introducing new nodes in stages or by adding an entire data center. When a new node joins an existing cluster, it needs to know:

  • Its position in the ring and the range of data it is responsible for. This is determined by the initial_token setting when the node first starts up.
  • The nodes it should contact to learn about the cluster and establish the gossip process. This is determined by the seeds setting when the node first starts up; the seeds are the nodes the new node contacts to get ring and gossip information about the other nodes in the cluster.
  • The name of the cluster it is joining and how the node should be addressed within the cluster.
  • Any other non-default settings made in cassandra.yaml on your existing cluster should also be made on the new node before it is started.

You set the Node and Cluster Initialization Properties in the cassandra.yaml file. The location of this file depends on the type of installation; see Cassandra Configuration Files Locations or DataStax Enterprise Configuration Files Locations.
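
As a sketch, the relevant settings in a new node's cassandra.yaml might look like the following. The cluster name, token, and addresses are placeholders; use the values from your existing cluster and your own token calculations.

    cluster_name: 'MyCluster'            # must match the existing cluster
    initial_token: 85070591730234615865843651857942052864   # placeholder; see the token calculation below
    listen_address: 10.10.1.7            # this node's address
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.10.1.1,10.10.1.2"   # existing nodes to contact for gossip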

Calculating Tokens For the New Nodes

When you add a node to a cluster, it needs to know its position in the ring. There are several approaches for calculating tokens for new nodes (a calculation sketch follows this list):

  • Add capacity by doubling the cluster size. Adding capacity by doubling (or tripling or quadrupling) the number of nodes is operationally less complicated when assigning tokens. Existing nodes can keep their existing token assignments, and new nodes are assigned tokens that bisect (or trisect) the existing token ranges. For example, when you generate tokens for 6 nodes, three of the generated token values will be the same as if you had generated them for 3 nodes. You just need to determine the token values that are already in use, and assign the newly calculated token values to the newly added nodes.
  • Recalculate new tokens for all nodes and move nodes around the ring. If doubling the cluster size is not feasible and you need to increase capacity by a non-uniform number of nodes, you must recalculate tokens for the entire cluster. Existing nodes must then have their new tokens assigned using nodetool move. After all nodes have been restarted with their new token assignments, run nodetool cleanup on all nodes to remove unused keys. These operations are resource intensive and should be planned for low-usage times.
  • Add one node at a time and leave the initial_token property empty. When the initial_token is empty, Cassandra splits the token range of the heaviest loaded node and places the new node into the ring at that position. Note that this approach will probably not result in a perfectly balanced ring, but it will alleviate hot spots.
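
As a minimal sketch, assuming the RandomPartitioner (the default in Cassandra 1.0), the i-th of N evenly spaced tokens is i * (2**127 / N). The following command uses a Python one-liner to print tokens for a hypothetical 6-node cluster; change num to match your cluster size.

    $ python -c "num=6; print('\n'.join(str(i * (2**127 // num)) for i in range(num)))"

Note that every other value in the 6-node list matches the 3-node list, which is why doubling the cluster lets existing nodes keep their token assignments.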

Adding Nodes to a Cluster

  1. Install Cassandra on the new nodes, but do not start them.
  2. Calculate the tokens for the nodes based on the expansion strategy you are using. You can skip this step if you want the new nodes to automatically pick a token range when joining the cluster.
  3. Set the configuration for the new nodes.
  4. Set the initial_token according to your token calculations (or leave it unset if you want the new nodes to automatically pick a token range when joining the cluster).
  5. Start Cassandra on each new node. Allow two minutes between node initializations. You can monitor the startup and data streaming process using nodetool netstats.
  6. After the new nodes are fully bootstrapped, assign the new initial_token property value to the nodes that required new tokens, and then run nodetool move <new_token>, one node at a time (see the example commands after this procedure).
  7. After all nodes have their new tokens assigned, run nodetool cleanup on each of the existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before doing the next. Cleanup can safely be postponed to low-usage hours.
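
As an example, the monitoring, move, and cleanup steps above might look like the following; the host addresses and the token value are placeholders.

    $ nodetool -h 10.10.1.7 netstats        # watch the new node bootstrap and stream data
    $ nodetool -h 10.10.1.2 move 28356863910078205288614550619314017621   # assign an existing node its new token
    $ nodetool -h 10.10.1.2 cleanup         # afterwards, remove keys the node no longer owns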

Adding a Data Center to a Cluster

The following steps describe adding a data center to an existing cluster. Before starting this procedure, please read the guidelines in Adding Capacity to an Existing Cluster above.

  1. Ensure that you are using NetworkTopologyStrategy for all of your custom keyspaces.

  2. For each new node, edit the configuration properties in the cassandra.yaml file:

    • Set auto_bootstrap to False.
    • Set the initial_token. Be sure to offset the tokens in the new data center; see Generating Tokens.
    • Set the cluster name.
    • Set any other non-default settings.
    • Set the seed lists. Every node in the cluster must have the same list of seeds and include at least one node from each data center. Typically one to three seeds are used per data center.
  3. If using the PropertyFileSnitch, update the cassandra-topology.properties file on all servers to include the new nodes. You do not need to restart.

    The location of this file depends on the type of installation; see Cassandra Configuration Files Locations or DataStax Enterprise Configuration Files Locations.

  4. Ensure that your clients do not autodetect the new nodes, so that the new nodes are not contacted until you explicitly direct clients to them. For example, in Hector, set hostConfig.setAutoDiscoverHosts(false);

  5. If you are using the QUORUM consistency level for reads or writes, check whether the LOCAL_QUORUM or EACH_QUORUM consistency level better meets your requirements for multiple data centers.

  6. Start the new nodes.

  7. After all nodes are running in the cluster:

    1. Change the strategy_options for your keyspace to the desired replication factor for the new data center. For example: strategy_options={DC1:2,DC2:2}
    2. On each new node, run nodetool repair without the -pr option, one node at a time, as shown in the example below.
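
For example, the keyspace update in the Cassandra CLI followed by a repair on one of the new nodes might look like this; the keyspace name, data center names, replication factors, and node address are placeholders.

    [default@unknown] UPDATE KEYSPACE demo
    WITH strategy_options = {DC1:2,DC2:2};

    $ nodetool repair -h 10.20.1.1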

Changing the Replication Factor

Increasing the replication factor increases the total number of copies of keyspace data stored in a Cassandra cluster.

  1. Update each keyspace in the cluster and change its replication strategy options. For example, to update the number of replicas in the Cassandra CLI when using the SimpleStrategy replica placement strategy:

    [default@unknown] UPDATE KEYSPACE demo
    WITH strategy_options = {replication_factor:3};
    

    Or if using NetworkTopologyStrategy:

    [default@unknown] UPDATE KEYSPACE demo
    WITH strategy_options = {datacenter1:6,datacenter2:6};
    
  2. On each node in the cluster, run nodetool repair for each keyspace that was updated. Wait until repair completes on a node before moving to the next node.
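
For example, to repair the demo keyspace above on one node (the address is a placeholder):

    $ nodetool repair -h 10.46.123.12 demo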

Replacing a Dead Node

To replace a node that has died (due to hardware failure, for example), bring up a new node using the token of the dead node as described in the next procedure. The dead node's token must already be part of the ring.

To replace a dead node:

  1. Confirm that the node is dead using the nodetool ring command on any live node in the cluster.

    Trying to replace a node using a token from a live node results in an exception. The nodetool ring command shows a Down status for the token value of the dead node:


    [Figure (operations_nodering.png): nodetool ring output showing a Down status for the dead node's token, 28356863910078205288614550619314017621.]
  2. Install Cassandra on the replacement node.

  3. Remove any pre-existing Cassandra data on the replacement node:

    sudo rm -rf /var/lib/cassandra/*
    
  4. Configure any non-default settings in the node's cassandra.yaml to match your existing cluster.

  5. Set the initial_token in the cassandra.yaml file to the value of the dead node's token minus one. Using the value shown in the figure in step 1, this is 28356863910078205288614550619314017621 - 1:

    initial_token: 28356863910078205288614550619314017620
    
  6. Start the new node.

  7. After the new node is up, run nodetool repair on each keyspace to ensure the node is fully consistent. For example:

    $ nodetool repair -h 10.46.123.12 keyspace_name -pr
    
  8. Remove the dead node.
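
As a sketch of this final step, assuming the nodetool removetoken command available in this Cassandra version, run it from any live node and pass the dead node's original token from step 1:

    $ nodetool removetoken 28356863910078205288614550619314017621    # removes the dead node's token from the ring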