Apache Cassandra 0.7 Documentation

Tuning

Effective tuning depends not only on the types of operations your cluster performs most frequently, but also on the shape of the data itself. For example, Cassandra’s memtables have overhead for index structures on top of the actual data they store. If the size of the values stored in the columns is small compared to the number of columns and rows themselves (sometimes called “skinny rows”), this overhead can be substantial. Thus, the optimal tuning for this type of data is quite different from the optimal tuning for a small number of columns with more data (“fat rows”).

This page discusses the importance of Java garbage collection and gives some guidance on sizing memtables and the Java heap. For cache tuning details, see Maximizing Cache Benefit.

Sizing Memtables

A memtable is a column-family-specific, in-memory data structure that can be easily described as a write-back cache. Memtables are flushed to disk, creating SSTables, whenever one of the configurable thresholds has been exceeded.

Effectively tuning memtable thresholds depends on your data as much as on your write load. Memtable thresholds are configured primarily by memtable_throughput_in_mb and memtable_operations_in_millions. You should increase memtable_throughput_in_mb if:

  1. Your write load includes a high volume of updates on a smaller set of data
  2. You have a steady stream of continuous writes (this will lead to more efficient compaction)

Look instead at adjusting memtable_operations_in_millions if, as previously described, you have large numbers of skinny rows. In that case, tune memtable flushes with this value to avoid consuming too much memory with metadata.

Note that any upwards adjustment of memtable thresholds will take memory away from caching and other internal Cassandra structures, so tune carefully and in small increments.
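
To make the interaction between the two thresholds concrete, here is a minimal sketch of the flush rule described in this section. It is an illustrative model only, not Cassandra's implementation: the function should_flush and its first two parameters are hypothetical, and it considers only the two size-based thresholds named above.

```python
def should_flush(bytes_written, operations,
                 memtable_throughput_in_mb, memtable_operations_in_millions):
    """Return True when either size-based memtable threshold is exceeded.

    Hypothetical helper for illustration; Cassandra's real flush logic
    lives in the server and may consider other conditions as well.
    """
    throughput_limit_bytes = memtable_throughput_in_mb * 1024 * 1024
    operations_limit = memtable_operations_in_millions * 1_000_000
    return (bytes_written > throughput_limit_bytes
            or operations > operations_limit)


# Skinny rows: 400,000 writes of about 50 bytes each is only ~20 MB,
# far below a 64 MB throughput threshold, but it exceeds an operations
# threshold of 0.3 million, so the operation count triggers the flush.
skinny_flush = should_flush(400_000 * 50, 400_000, 64, 0.3)
```

With many small writes the operation count is reached long before the throughput limit, which is why the operations threshold is the more useful control for skinny rows.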

Heap Sizing

As previously mentioned, Cassandra’s default configuration opens the JVM with a heap size of half of the available memory. Many users new to Cassandra are tempted to turn this value up immediately to consume the majority of the underlying system’s RAM. In most cases, doing so is actually detrimental. The reason is that Cassandra, being essentially a database, spends a lot of time interacting with the operating system’s I/O infrastructure (via the JVM, of course). Modern operating systems maintain disk caches for frequently accessed data and are very good at keeping this data in memory. Regardless of how much RAM your hardware has, you should keep the JVM heap size constrained by the following formula and allow the operating system’s file cache to do the rest:

memtable_throughput_in_mb * 3 * (number of Column Families) + 1G + (size of internal caches)
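
As a worked example, the formula can be evaluated with a few lines of Python. This is a sketch under stated assumptions: the function name max_heap_mb and the sample values are hypothetical, the same memtable_throughput_in_mb is assumed for every column family, and the 1G term is taken as 1024 MB.

```python
def max_heap_mb(memtable_throughput_in_mb, column_families, internal_cache_mb):
    """Upper bound for the JVM heap, in MB, per the formula above.

    Hypothetical helper; assumes every column family uses the same
    memtable_throughput_in_mb and treats the fixed 1G term as 1024 MB.
    """
    memtable_mb = memtable_throughput_in_mb * 3 * column_families
    return memtable_mb + 1024 + internal_cache_mb


# Sample values (assumptions): 64 MB throughput, 4 column families,
# and 512 MB of internal caches.
heap_mb = max_heap_mb(64, 4, 512)
print(heap_mb)  # 64 * 3 * 4 + 1024 + 512 = 2304
```

For these sample values the heap would be capped at roughly 2.25 GB, leaving the rest of the machine's RAM to the operating system's file cache.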

GCInspector

Cassandra’s GCInspector will log information about garbage collection whenever a garbage collection takes longer than 200ms. If garbage collections are occurring frequently and are taking a moderate length of time to complete (such as ConcurrentMarkSweep taking a few seconds), this is an indication that there is a lot of garbage collection pressure on the JVM; this needs to be addressed by adding nodes, lowering cache sizes, or adjusting GC options.

After 0.6.7, GCInspector also logs its usual summary whenever messages are dropped to help determine the cause.