The /tools/stress directory contains Java-based stress testing utilities for benchmarking and load testing a Cassandra cluster: stress.java and the daemon stressd. Daemon mode keeps the JVM warm between runs, which may be useful for large-scale benchmarking.
Use Apache Ant to build the stress testing tool:
There are three different modes of operation:
You can use these modes with or without the stressd daemon running. For larger-scale testing, the daemon can yield better performance by keeping the JVM warm and preventing potential skew in test results.
If no operation is specified, stress inserts 1M rows.
The options available are:
-o <operation>, --operation <operation>
Sets the operation mode, one of 'insert', 'read', 'rangeslice', or 'indexedrangeslice'
-T <IP>, --send-to <IP>
Sends the command as a request to the stress daemon at the specified IP address. The daemon must already be running at that address.
-n <NUMKEYS>, --num-keys <NUMKEYS>
Number of keys to write or read. Default is 1,000,000.
-l <RF>, --replication-factor <RF>
Replication Factor to use when creating needed column families. Defaults to 1.
-R <strategy>, --replication-strategy <strategy>
Replication strategy to use (only on insert, when the keyspace does not exist). Default is org.apache.cassandra.locator.SimpleStrategy.
-O <properties>, --strategy-properties <properties>
Replication strategy properties in the following format: <dc_name>:<num>,<dc_name>:<num>,... Use with NetworkTopologyStrategy.
Sets replicate_on_write to false for counters. With this setting, only counter add operations at CL=ONE will work.
-e <CL>, --consistency-level <CL>
Consistency Level to use (ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL, ANY). Default is ONE.
-c <COLUMNS>, --columns <COLUMNS>
Number of columns per key. Default is 5.
-d <NODES>, --nodes <NODES>
Nodes to perform the test against (comma separated, no spaces). Default is “localhost”.
-y <TYPE>, --family-type <TYPE>
Sets the ColumnFamily type. One of 'Standard' or 'Super'. If using super, set the -u option also.
Generates column values of an average size rather than a specific size.
-u <SUPERCOLUMNS>, --supercolumns <SUPERCOLUMNS>
Use the number of supercolumns specified. You must set the -y option appropriately, or this option has no effect.
-g <COUNT>, --get-range-slice-count <COUNT>
Sets the number of rows to slice at a time. Default is 1000. This is used only for the rangeslice operation and will NOT work with the RandomPartitioner. You must set the OrderPreservingPartitioner in your storage configuration (note that you will need to wipe all existing data when switching partitioners).
-g <KEYS>, --keys-per-call <KEYS>
Number of keys to get_range_slices or multiget per call. Default is 1000.
Only used for reads. By default, stress performs reads on rows with a Gaussian distribution, which causes some repeats. Setting this option makes the reads completely random instead.
The interval, in seconds, at which progress will be output.
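Several of the options above can be combined in one invocation. The following is a minimal dry-run sketch, not a definitive recipe: it assumes a cluster with two data centers named DC1 and DC2 (hypothetical names and replica counts) and uses only flags listed above. It prints the assembled command instead of running it.

```shell
#!/bin/sh
# Dry-run sketch: assemble a stress command line from the documented options.
STRESS=/tools/stress/bin/stress   # path used elsewhere in this document

# Hypothetical data center names and replica counts, written in the
# <dc_name>:<num>,<dc_name>:<num> format that -O expects.
STRATEGY_PROPS="DC1:2,DC2:1"

# -R selects the replication strategy, -O passes its properties,
# -e sets the consistency level, -n sets the number of keys.
CMD="$STRESS -d 192.168.1.101,192.168.1.102 \
-R org.apache.cassandra.locator.NetworkTopologyStrategy \
-O $STRATEGY_PROPS -e LOCAL_QUORUM -n 100000"

# Print only; run the printed command by hand against a real cluster.
echo "$CMD"
```

Because -R applies only on insert when the keyspace does not exist, a command like this would need to be the first insert against a fresh keyspace for the strategy settings to take effect.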
Usage for the daemon mode is:
/tools/stress/bin/stressd start|stop|status [-h <host>]
During stress testing, you can keep the daemon running and send stress.java commands through it using the -T or --send-to option flag.
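A daemon-mode session might look like the sketch below. It is an illustration under stated assumptions: DAEMON_HOST is a hypothetical address for the machine where stressd runs, the start, status, and stop commands are assumed to be issued on that machine, and the DRY_RUN switch and run helper are conveniences invented for this example. With DRY_RUN=1 (the default here) each command is only printed.

```shell
#!/bin/sh
# Sketch of a daemon-mode session. DRY_RUN=1 prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
STRESSD=/tools/stress/bin/stressd
STRESS=/tools/stress/bin/stress
DAEMON_HOST=192.168.1.100   # hypothetical address of the machine running stressd

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run "$STRESSD" start                                          # start the daemon
run "$STRESSD" status                                         # confirm it is up
run "$STRESS" -d 192.168.1.101 -n 1000000 -T "$DAEMON_HOST"   # send a test through it
run "$STRESSD" stop                                           # shut the daemon down
```

Keeping the daemon running across several -T requests is what lets later runs benefit from the already-warm JVM.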
1M inserts to given host:
/tools/stress/bin/stress -d 192.168.1.101
1M reads from given host:
/tools/stress/bin/stress -d 192.168.1.101 -o read
10M inserts spread across two nodes:
/tools/stress/bin/stress -d 192.168.1.101,192.168.1.102 -n 10000000
10M inserts spread across two nodes using the daemon mode:
/tools/stress/bin/stress -d 192.168.1.101,192.168.1.102 -n 10000000 -T 126.96.36.199
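The examples above can be chained into a load-then-read run. The sketch below is a dry run that prints an insert followed by a read against a Super column family, pairing -y Super with -u as the option descriptions require. The supercolumn count of 10 and the stress_cmd helper are hypothetical choices for illustration, and passing the same -y and -u values to the read is an assumption rather than documented behavior.

```shell
#!/bin/sh
# Dry-run sketch: print an insert followed by a read over the same key count.
STRESS=/tools/stress/bin/stress
NODES=192.168.1.101,192.168.1.102
NUM_KEYS=$((10 * 1000000))   # 10M keys, computed to avoid miscounting zeros
SUPERCOLUMNS=10              # hypothetical value for -u

# Hypothetical helper: $1 is the operation passed to -o.
stress_cmd() {
    echo "$STRESS -d $NODES -n $NUM_KEYS -o $1 -y Super -u $SUPERCOLUMNS"
}

stress_cmd insert   # load the test data first
stress_cmd read     # then read it back
```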