Testing Cassandra 1000 Nodes at a Time

By Russell Spitzer -  April 28, 2014 | 6 Comments

Testing at Scale

Working as Software Engineers in Test at Datastax requires that we demonstrate that Datastax Enterprise is able to scale to handle "Big Data" but what does "Big Data" really mean? Some marketing departments would like "Big" to mean 100GB and others may just state that "Big Data" requires "10's of machines" to host. While these might be reasonable thresholds for some definitions of "Big", these definitions are actually quite conservative compared to many actual use cases of Datastax Enterprise. For these truly large systems we want to demonstrate that our users can trust Datastax Enterprise, and to that end we perform what we call the "1000 Node Test".

A snapshot of Opscenter Monitoring 1000 Nodes

1000 Node Test Plan

Every major release of DataStax Enterprise sets in motion a variety of regression, performance, and scalability tests. This effort culminates with running a variety of tests and operations on a 1000 node cluster.

We test that standard administrative operations will continue to work properly even when working with a cluster that has 1000 members and is operating under load. The main test begins by provisioning 1000 nodes on one of our certified cloud providers (more details below) and separating these nodes into two 500 node groups.

Once configured, we start up the first datacenter. This begins the test with a blank 500 node cluster with a single datacenter. A separate set of ~10 nodes is utilized to start feeding data into these first 500 nodes. Inserts are generated using cassandra-stress with NetworkTopologyStrategy placing all of the data into the running datacenter. The second 500 nodes are started as a separate datacenter with auto-bootstrap turned off while inserts are still being performed on the cluster(Adding a Datacenter to an Existing Cluster). Once all of these nodes in the second datacenter are up, we alter the replication strategy of the stress keyspace so that information will be routed to both datacenters. While this is occurring we run nodetool rebuild and repair on each node. Once these operations and the full insert have completed successfully we verify that all of the written data can be read from the second datacenter.

Diagram of the DC Add Test

Following the successful addition of a 500 node datacenter we test the removal of 500 node datacenter. The test again begins with inserting data from a separate set of nodes into 1000 node cluster. The keyspace is then altered so that data only ends up being copied to the second datacenter. One by one, the nodes from the first datacenter are scrubbed and decommissioned from the cluster.  Once the first datacenter has been successfully decommissioned we verify that the insert workload was completed successfully and all of the data we expect is on the second datacenter. This set of functional operations is usually followed by several more specific and feature related tests.

1000NodesRemoveDC (1)

Technical Tips

The most difficult part of this exercise has routinely been the initial provisioning and installation of our software. When you are performing normal system operations at this scale usually trivial concerns become magnified into show stopping bottlenecks. Most cloud services are simply not designed to request a batch of 1000 nodes at once so a batched approach to launching can be useful. Additionally it can be useful to have your own framework around provisioning which can check as the nodes are created whether or not they are actually stable (functioning ssh, mountable drives, working os .) This becomes key when requesting such large numbers of nodes since the chance that you will receive a dud or two increases.

Another key issue we originally ran into was that 1000 nodes simultaneously downloading from our package server caused a great deal of stress and was a large bottleneck. Thanks to some advice from some of our partners at Google Cloud we ended up with far more efficient provisioning. Rather than having each node set itself up separately, we would create a single base node which would have all of the software and dependencies installed on it. This would then be snapshotted using the cloud provider's api. Most cloud providers (GCE, EC2, ect ...) allow for the launching of nodes from snapshotted images which ends up saving us a great deal of time. This brought the time initially required to configure our 1000 nodes down from several days to approximately an hour.  Once all the software is in place we use our own cluster management tools to quickly configure Datastax Enterprise.

 The Future of Testing at Scale

Data is only getting bigger and scaling horizontally has shown again and again that it is the only way to really get great performance for truly "Big Data." At Datastax we always have this in mind and to this end expect us to be testing on even larger clusters with greater volumes of data in the future.

DataStax has many ways for you to advance in your career and knowledge.

You can take free classes, get certified, or read one of our many white papers.

register for classes

get certified

DBA's Guide to NoSQL


  1. sanjay says:

    Great article and insight, Thanks!

  2. John says:

    Can you please post the exact scripts and config for the cluster creation and the stress tools?
    We love Cassandra and these tools can help our deployment.

    1. Russell Spitzer says:

      The stress tools we use are those that come with any installation of Cassandra, cassandra-stress. The scripts for provisioning and creation rely on our internal api so I’m afraid I can’t share those.

  3. Connor Warrington says:

    Great to hear that a large number of nodes are being tested in an automated fashion and quickly.

    Are you also testing the other pieces of DataStax Enterprise at this node level? Hadoop jobs running on a cluster this large?
    Solr indexing and querying?

    1. Russell Spitzer says:

      We have done several one-off experiments with larger search and analytics jobs (~400 nodes) but haven’t finalized our test-plan for an equally grand scale experiment. We hope to have one running regularly in the near future.

  4. Lewis Chan says:

    Very good practice. But some part is not clear. Why to use nodetool to rebuild and repair manually? Aren’t data automatically routed to each datacenter ?


Your email address will not be published. Required fields are marked *

Subscribe for newsletter:

Tel. +1 (650) 389-6000 Offices France GermanyJapan

DataStax Enterprise is powered by the best distribution of Apache Cassandra™.

© 2018 DataStax, All Rights Reserved. DataStax, Titan, and TitanDB are registered trademark of DataStax, Inc. and its subsidiaries in the United States and/or other countries.
Apache Cassandra, Apache, Tomcat, Lucene, Solr, Hadoop, Spark, TinkerPop, and Cassandra are trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.