Apache Cassandra 1.1 Documentation

Expanding a Cassandra AMI cluster

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your product version.

This section contains instructions for expanding a cluster that uses the DataStax Community Edition AMI (Amazon Machine Image) prior to Cassandra 1.2. If your AMI uses Cassandra 1.2, see the 1.2 instructions.

As a best practice when expanding a cluster, DataStax recommends doubling the size of the cluster with each expansion. Calculate the node number and token for each node based on the final ring size. For example, suppose that you ultimately want a 12-node cluster, starting with a three-node cluster. The initial three nodes occupy positions 0, 4, and 8. For the first expansion, you double the number of nodes to six by adding nodes at positions 2, 6, and 10. For the final expansion, you add six more nodes at positions 1, 3, 5, 7, 9, and 11. Using the Token Generating Tool, calculate tokens for 12 nodes and enter the corresponding values in the initial_token property in the cassandra.yaml file. For more information, see Adding Capacity to an Existing Cluster.
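
For example, a quick stand-in for the token generator (a sketch, assuming the RandomPartitioner and a final ring size of 12; change num for a different ring size):

    $ python -c "num = 12; print('\n'.join(str(i * (2 ** 127) // num) for i in range(num)))"

The value printed for i = 2 is the initial_token that appears in the cassandra.yaml example in step 4 below.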

Steps to expand a Cassandra AMI cluster

  1. In the AWS Management Console, create the number of nodes you need in another cluster with a temporary name. See Installing a Cassandra Cluster on Amazon EC2.

    The temporary name prevents the new nodes from joining the cluster with the wrong tokens.

  2. After the nodes are initialized, log in to each node and stop the service:

    sudo service cassandra stop
    
  3. Clear the data in each node:

    1. Check the cassandra.yaml for the location of the data directories:

      data_file_directories:
         - /raid0/cassandra/data
      
    2. Remove the data directories:

      sudo rm -rf /raid0/cassandra/*
      

    You must clear the data because the new nodes have existing data from their initial start with the temporary cluster name and token.
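
    To confirm that a node's data was cleared, a quick check (a sketch; the directory matches the example above, and the command should print nothing):

      $ sudo ls /raid0/cassandra/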

  4. For each node, change the /etc/dse/cassandra/cassandra.yaml settings to match the cluster_name and seeds list of the existing cluster. For example:

    cluster_name: 'NameOfExistingCluster'
    ...
    initial_token: 28356863910078205288614550619314017621
    ...
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "110.82.155.0,110.82.155.3"
    
  5. Assign the correct initial_token to each node; it determines the node's placement within the ring.
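
    For example, a minimal way to set the token on one node (a sketch, assuming cassandra.yaml already contains an initial_token: line; substitute that node's calculated token):

    $ sudo sed -i "s/^initial_token:.*/initial_token: 28356863910078205288614550619314017621/" /etc/dse/cassandra/cassandra.yaml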

  6. If adding nodes to an existing data center:

    1. Set auto_bootstrap: true. (If auto_bootstrap is not in the cassandra.yaml file, it defaults to true.)
    2. Go to step 8.
  7. If adding nodes to a new data center:

    1. Set auto_bootstrap: false.

    2. After all new nodes have joined the ring, do either of the following:

      • Run a rolling nodetool repair on all new nodes in the cluster.
      • Run a rolling nodetool repair -pr on all nodes in the cluster.

      For example:

      $ nodetool repair -h 10.46.123.12 keyspace_name -pr
      
  8. Start each node at two-minute intervals:

    $ sudo service cassandra start
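
    For example, a staggered start over SSH (a sketch; 110.82.155.4 and 110.82.155.5 are hypothetical new-node addresses, and ubuntu is assumed to be the AMI login; adjust for your environment):

    $ for ip in 110.82.155.4 110.82.155.5; do ssh ubuntu@$ip 'sudo service cassandra start'; sleep 120; done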
    
  9. Verify that each node has finished joining the ring:

    nodetool -h <hostname> ring
    

    Each node should show Up, not Joining.
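
    For example, using one of the seed addresses from step 4:

    $ nodetool -h 110.82.155.0 ring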