Apache Cassandra™ 1.2

The cassandra-shuffle utility

Shift a single-token-per-node architecture to virtual nodes (vnodes) without downtime.

The cassandra-shuffle utility splits up all the contiguous partition ranges (formerly token ranges) for each node and then randomly distributes them into virtual nodes throughout the cluster. Shuffling is a two-phase operation. The utility first schedules the range transfers and then begins transferring the scheduled ranges. You can shuffle on a per-data center basis and mix virtual node-enabled and non-virtual node data centers.

For a complete description of how shuffling works, see the blog post "Upgrading an existing cluster to vnodes".
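As a rough illustration of the two phases, the toy model below splits each node's contiguous range into sub-ranges, schedules a random reassignment, and then applies it. It is a sketch, not Cassandra's relocation logic: the even split, the round-robin dealing after a random shuffle, and every function name here are assumptions made for the example. The token bounds are those of Murmur3Partitioner.

```python
import random

# Token bounds of Murmur3Partitioner; other partitioners use different bounds.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_range(start, end, num_tokens):
    """Split the range (start, end] into num_tokens contiguous sub-ranges."""
    width = end - start
    bounds = [start + width * i // num_tokens for i in range(num_tokens + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def split_cluster(node_count, num_tokens):
    """Give each node one contiguous range, then split that range in place."""
    owners = {}
    node_ranges = split_range(MIN_TOKEN, MAX_TOKEN, node_count)
    for node, (start, end) in enumerate(node_ranges):
        for sub in split_range(start, end, num_tokens):
            owners[sub] = node  # still contiguous: same node, more tokens
    return owners


def schedule_shuffle(owners, node_count, seed=0):
    """Phase 1 (assumed model): shuffle the sub-ranges, deal them round-robin."""
    rng = random.Random(seed)
    subs = list(owners)
    rng.shuffle(subs)
    return {sub: i % node_count for i, sub in enumerate(subs)}


def run_transfers(owners, schedule):
    """Phase 2: apply the scheduled moves and return how many ranges moved."""
    moved = 0
    for sub, target in schedule.items():
        if owners[sub] != target:
            owners[sub] = target
            moved += 1
    return moved


if __name__ == "__main__":
    owners = split_cluster(node_count=3, num_tokens=4)
    schedule = schedule_shuffle(owners, node_count=3)
    moved = run_transfers(owners, schedule)
    print(f"{len(owners)} sub-ranges, {moved} moved to another node")
```

Separating scheduling from transferring mirrors the utility's design: the plan can be inspected before any data moves.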

Procedure

In a terminal window:

  1. In the cassandra.yaml file, set the num_tokens parameter.

    A good starting point for this parameter is 256.

  2. Restart the node.

    The node sleeps for RING_DELAY to make sure its view of the ring is accurate, and then splits its current range into the specified number of tokens. Although the range is now split into many tokens, it remains contiguous: the node owns the same data as before, only described by more tokens.

  3. To distribute the tokens, initialize the shuffle operation:
    cassandra-shuffle create
  4. Start the transfers:
    cassandra-shuffle enable
  5. To see what transfers remain at any point:
    cassandra-shuffle ls
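
When many nodes need the same change, step 1 can be scripted. The helper below is a minimal sketch that edits the configuration as plain text. It assumes the file contains either an active or a commented-out num_tokens line (or none at all), and the function name set_num_tokens is hypothetical, not part of any Cassandra tooling.

```python
import re


def set_num_tokens(yaml_text, num_tokens=256):
    """Return yaml_text with num_tokens set, uncommenting the line if needed."""
    # Matches "num_tokens: ..." and "# num_tokens: ..." at the start of a line.
    pattern = re.compile(r"^[ \t]*#?[ \t]*num_tokens:.*$", re.MULTILINE)
    new_line = f"num_tokens: {num_tokens}"
    if pattern.search(yaml_text):
        # Replace only the first match so any later lines stay untouched.
        return pattern.sub(new_line, yaml_text, count=1)
    # No existing entry: append one at the end of the file.
    return yaml_text.rstrip("\n") + "\n" + new_line + "\n"


if __name__ == "__main__":
    sample = "cluster_name: 'Test Cluster'\n# num_tokens: 256\n"
    print(set_num_tokens(sample, 256))
```

In practice you would read cassandra.yaml from wherever your installation keeps it, pass its contents through the function, write the result back, and then restart the node as in step 2.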