Apache Cassandra 1.1 Documentation

Tuning Cassandra

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your Cassandra version.


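Tuning Cassandra and Java resources is recommended in the event of performance degradation, high memory consumption, or the other atypical situations described in this section. After completing tuning operations, follow the recommendations in this section to monitor and test the changes. Tuning Cassandra includes the following tasks:

  • Tuning Bloom Filters
  • Tuning Data Caches
  • Tuning the Java Heap
  • Tuning Java Garbage Collection
  • Tuning Compaction
  • Tuning Column Family Compression
  • Testing Compaction and Compression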

Tuning Bloom Filters

Each SSTable has a Bloom filter. A Bloom filter tests whether an element is a member of a set. False positives are possible, but false negatives are not. In Cassandra, when data is requested, the Bloom filter checks whether the requested row exists in the SSTable before any disk I/O is done. By tuning a Bloom filter, you are setting the chance of false positives: the lower the chance of false positives, the larger the Bloom filter, so setting the false positive ratio too low can cause high memory consumption. Conversely, the higher the Bloom filter setting, the lower the memory consumption. The maximum recommended setting is 0.1, as anything above this value yields diminishing returns.

Bloom filter settings range from 0.000744 (default) to 1.0 (disabled). For example, to run an analytics application that heavily scans a particular column family, you would want to inhibit the Bloom filter on the column family by setting it high. A high setting is appropriate in this case because an application that scans the column family does not ask for keys that don't exist, so the Bloom filter provides little benefit.
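The relationship between the false positive chance and memory can be estimated with the textbook sizing formula for an optimally configured Bloom filter, roughly -ln(p) / (ln 2)^2 bits per key. The following Python sketch is an illustration only, not Cassandra's internal sizing code; the function names and the sample values are assumptions made for this example.

```python
import math


def bloom_bits_per_key(fp_chance):
    """Approximate bits per key for an optimally sized Bloom filter.

    Uses the textbook formula; Cassandra's own sizing may differ.
    """
    if not 0.0 < fp_chance <= 1.0:
        raise ValueError("fp_chance must be in (0, 1]")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_megabytes(num_keys, fp_chance):
    """Approximate Bloom filter size in MB for num_keys keys."""
    return num_keys * bloom_bits_per_key(fp_chance) / 8 / (1024 * 1024)


if __name__ == "__main__":
    # Compare the default setting with two higher settings.
    for p in (0.000744, 0.01, 0.1):
        print(p, round(bloom_bits_per_key(p), 2), "bits per key")
```

Under this estimate, raising the setting from 0.01 to 0.1 roughly halves the bits needed per key, which is consistent with the guidance that a higher setting lowers memory consumption.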

To change the Bloom filter attribute on a column family, use CQL. For example:

ALTER TABLE addamsFamily WITH bloom_filter_fp_chance = 0.01;

After updating the value of bloom_filter_fp_chance on a column family, Bloom filters need to be regenerated by rewriting the column family's SSTables, for example through compaction or by running nodetool upgradesstables.

You do not have to restart Cassandra after regenerating SSTables.

Tuning Data Caches

These caches are built into Cassandra and provide very efficient data caching:

  • Key cache: a cache of the primary key index for a Cassandra table. Enabled by default.
  • Row cache: similar to a traditional cache like memcached. Holds the entire row in memory so reads can be satisfied without using disk. Disabled by default.

If read performance is critical, you can use the built-in caching to remove dedicated caching tools, such as memcached, from the stack entirely. Such deployments eliminate a redundant layer and strengthen cache functionality in the lower tier where the data is already being stored. Caching never needs to be restarted in a completely cold state.

With proper tuning, key cache hit rates of 85% or better are possible with Cassandra, and each hit on a key cache can save one disk seek per SSTable. Row caching, when feasible, can save the system from performing any disk seeks at all when fetching a cached row. When growth in the read load begins to impact your hit rates, you can add capacity to restore optimal levels of caching. Typically, expect a 90% hit rate for row caches. If row cache hit rates are 30% or lower, it may make more sense to leave row caching disabled (the default). Using only the key cache makes the row cache available for other column families that need it.

How Caching Works

When both row and key caches are configured, the row cache returns results whenever possible. In the event of a row cache miss, the key cache might still provide a hit that makes the disk seek much more efficient. This diagram depicts two read operations on a column family with both caches already populated.


Figure: How caching works (how-cache-works.png)

One read operation hits the row cache, returning the requested row without a disk seek. The other read operation requests a row that is not present in the row cache but is present in the key cache. After accessing the row in the SSTable, the system returns the data and populates the row cache with this read operation.
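The two read operations described above can be modeled with a small Python sketch. This is a toy illustration, not Cassandra's implementation: the ToyColumnFamily class, its list-based "SSTable", and the returned status strings are all hypothetical.

```python
class ToyColumnFamily:
    """Toy model of a read path with a row cache and a key cache."""

    def __init__(self, sstable_rows):
        # The "SSTable" is a list of (key, row) pairs; the list index
        # stands in for the row's position on disk.
        self.sstable = list(sstable_rows)
        self.row_cache = {}   # key -> full row
        self.key_cache = {}   # key -> position in the SSTable
        self.index_scans = 0  # number of index lookups performed

    def _scan_index(self, key):
        """Find a key's position the slow way (simulated index lookup)."""
        self.index_scans += 1
        for position, (candidate, _row) in enumerate(self.sstable):
            if candidate == key:
                return position
        return None

    def read(self, key):
        """Return (row, source), consulting the row cache first."""
        if key in self.row_cache:
            return self.row_cache[key], "row cache hit"
        position = self.key_cache.get(key)
        if position is not None:
            source = "key cache hit"
        else:
            position = self._scan_index(key)
            if position is None:
                return None, "not found"
            self.key_cache[key] = position
            source = "cache miss"
        row = self.sstable[position][1]
        self.row_cache[key] = row  # populate the row cache on the way out
        return row, source
```

In this model, a key cache hit skips the index lookup but still reads the row from the SSTable, while a row cache hit touches neither.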

When to Use Key Caching

Because the key cache holds the location of keys in memory on a per-column family basis, increasing the amount of key caching can have an immediate, positive impact on column family reads as soon as the cache warms.

High levels of key caching are recommended for most scenarios. Cases for row caching are more specialized, but whenever it can coexist peacefully with other demands on memory resources, row caching provides the most dramatic gains in efficiency.

Using the default key cache setting, or a higher one, works well in most cases. Tune key cache sizes in conjunction with the Java heap size.

When to Use Row Caching

Row caching saves more time than key caching, but it is extremely space consuming. Row caching is recommended in these cases:

  • Data access patterns follow a normal (Gaussian) distribution.
  • Rows contain heavily-read data and queries frequently return data from most or all of the columns.

General Cache Usage Tips

Some tips for efficient cache use are:

  • Store lower-demand data or data with extremely long rows in a column family with minimal or no caching.
  • Deploy a large number of Cassandra nodes under a relatively light load per node.
  • Logically separate heavily-read data into discrete column families.

Cassandra's memtables have overhead for index structures on top of the actual data they store. If the size of the values stored in the heavily-read columns is small compared to the number of columns and rows themselves (long, narrow rows), this overhead can be substantial. Short, narrow rows, on the other hand, lend themselves to highly efficient row caching.

Enabling the Key and Row Caches

Enable the key and row caches at the column family level using the CQL caching parameter. Unlike earlier Cassandra versions, cache sizes do not need to be specified per table. Just set caching to all, keys_only, rows_only, or none. Cassandra weights the cached data by size and access frequency, and thus makes optimal use of the cache memory without manual tuning. For archived tables, disable caching entirely because these tables are read infrequently.

Setting Cache Options

In the cassandra.yaml file, tune caching by changing the global cache options, such as key_cache_size_in_mb and row_cache_size_in_mb.

Monitoring Cache Tune Ups

Make changes to cache options in small, incremental adjustments, then monitor the effects of each change using Cassandra's monitoring tools before making the next adjustment.

About the Off-Heap Row Cache

Cassandra can store cached rows in native memory, outside the Java heap. This results in both a smaller per-row memory footprint and reduced JVM heap requirements, which helps keep the heap size in the sweet spot for JVM garbage collection performance.

Using the off-heap row cache requires the JNA library to be installed; otherwise, Cassandra falls back on the on-heap cache provider.

Tuning the Java Heap

Because Cassandra is a database, it spends significant time interacting with the operating system's I/O infrastructure through the JVM, so a well-tuned Java heap size is important. Cassandra's default configuration opens the JVM with a heap size that is based on the total amount of system memory:

System Memory        Heap Size
Less than 2GB        1/2 of system memory
2GB to 4GB           1GB
Greater than 4GB     1/4 of system memory, but not more than 8GB
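The sizing rule in the table above can be written as a short Python function. This is a sketch derived from the table only, not a copy of Cassandra's startup script; the function name default_heap_mb is an assumption for this example.

```python
def default_heap_mb(system_memory_mb):
    """Default heap size in MB, following the table above."""
    if system_memory_mb < 2048:
        # Less than 2GB: half of system memory.
        return system_memory_mb // 2
    if system_memory_mb <= 4096:
        # 2GB to 4GB: a fixed 1GB heap.
        return 1024
    # Greater than 4GB: a quarter of system memory, capped at 8GB.
    return min(system_memory_mb // 4, 8192)


if __name__ == "__main__":
    for memory_mb in (1024, 3072, 16384, 65536):
        print(memory_mb, "MB RAM ->", default_heap_mb(memory_mb), "MB heap")
```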

General Guidelines

Many users new to Cassandra are tempted to turn up Java heap size too high, which consumes the majority of the underlying system's RAM. In most cases, increasing the Java heap size is actually detrimental for these reasons:

  • In most cases, the capability of Java 6 to gracefully handle garbage collection above 8GB quickly diminishes.
  • Modern operating systems maintain the OS page cache for frequently accessed data and are very good at keeping this data in memory, but an elevated Java heap size can prevent them from doing so.

To change a JVM setting, modify the cassandra-env.sh file.

Because MapReduce runs outside the JVM, changes to the JVM do not affect Hadoop operations directly.

Tuning Java Garbage Collection

Cassandra's GCInspector class logs information about garbage collection whenever a garbage collection takes longer than 200ms. Garbage collections that occur frequently and take a moderate length of time to complete (such as ConcurrentMarkSweep taking a few seconds) indicate that there is a lot of garbage collection pressure on the JVM. Remedies include adding nodes, lowering cache sizes, or adjusting the JVM options regarding garbage collection.

Tuning Compaction

In addition to consolidating SSTables, the compaction process merges keys, combines columns, discards tombstones, and creates a new index in the merged SSTable.

To tune compaction, set a compaction_strategy for each column family based on its access patterns. The compaction strategies are:

  • Size-Tiered Compaction

    Appropriate for append-mostly workloads, which add new rows, and to a lesser degree, new columns to old rows.

  • Leveled Compaction

    Appropriate for workloads with many updates that change the values of existing columns, such as time-bound data in columns marked for expiration using TTL.

For example, to update a column family to use the leveled compaction strategy using Cassandra CQL:

ALTER TABLE users WITH
  compaction_strategy_class='LeveledCompactionStrategy'
  AND compaction_strategy_options:sstable_size_in_mb = 10;

Tuning Compaction for Size-Tiered Compaction

Control the frequency and scope of minor compactions of a column family that uses the default size-tiered compaction strategy by setting min_compaction_threshold. The size-tiered compaction strategy triggers a minor compaction when the number of similar-sized SSTables on disk reaches the value configured by min_compaction_threshold.

By default, a minor compaction can begin any time Cassandra creates four SSTables on disk for a column family. A minor compaction must begin before the total number of SSTables reaches 32.
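The trigger condition can be pictured with a simplified Python model. This sketch is not Cassandra's compaction code: the function names are hypothetical, and the rule that an SSTable joins a bucket when its size is within 0.5 to 1.5 times the bucket average is an assumption made for this illustration. The max_threshold parameter models the limit of 32 SSTables mentioned above.

```python
def bucket_by_size(sstable_sizes_mb, low=0.5, high=1.5):
    """Group SSTable sizes into buckets of similar size.

    A size joins a bucket when it lies within low..high times the
    bucket's current average (assumed ratios for this sketch).
    """
    buckets = []
    for size in sorted(sstable_sizes_mb):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if low * average <= size <= high * average:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size; start a new one.
            buckets.append([size])
    return buckets


def minor_compaction_candidates(sstable_sizes_mb, min_threshold=4,
                                max_threshold=32):
    """Return the buckets large enough to trigger a minor compaction."""
    return [
        bucket[:max_threshold]
        for bucket in bucket_by_size(sstable_sizes_mb)
        if len(bucket) >= min_threshold
    ]
```

With sizes of 10, 11, 9, 12, and 100 MB, the four small SSTables form one bucket that qualifies under the default threshold of four, while the 100 MB SSTable is left alone. Raising min_threshold to 6 leaves no candidates.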

Configure this value per column family using CQL. For example:

ALTER TABLE users WITH min_compaction_threshold = 6;

This CQL example shows how to change the compaction_strategy_class and set a minimum compaction threshold:

ALTER TABLE users
  WITH compaction_strategy_class='SizeTieredCompactionStrategy'
  AND min_compaction_threshold = 6;

A full compaction applies only to SizeTieredCompactionStrategy. It merges all SSTables into one large SSTable. Generally, a full compaction is not recommended because the large SSTable that is created will not be compacted until the amount of actual data increases four-fold (or min_compaction_threshold). Additionally, during runtime, full compaction is I/O and CPU intensive and can temporarily double disk space usage when no old versions or tombstones are evicted.

To initiate a full compaction for all tables in a keyspace, use the nodetool compact command.

Cassandra provides a startup option for testing compaction strategies without affecting the production workload.

Tuning Column Family Compression

Compression maximizes the storage capacity of Cassandra nodes by reducing the volume of data on disk and disk I/O, particularly for read-dominated workloads. Cassandra quickly finds the location of rows in the SSTable index and decompresses the relevant row chunks.

Write performance is not negatively impacted by compression in Cassandra as it is in traditional databases. In traditional relational databases, writes require overwrites to existing data files on disk. The database has to locate the relevant pages on disk, decompress them, overwrite the relevant data, and finally recompress. In a relational database, compression is an expensive operation in terms of CPU cycles and disk I/O. Because Cassandra SSTable data files are immutable (they are not written to again after they have been flushed to disk), there is no recompression cycle necessary in order to process writes. SSTables are compressed only once when they are written to disk. Writes on compressed tables can show up to a 10 percent performance improvement.

How to Enable and Disable Compression

Compression is enabled by default in Cassandra 1.1. To disable compression, use CQL to set the compression parameters to an empty string:

CREATE TABLE DogTypes (
  block_id uuid,
  species text,
  alias text,
  population varint,
  PRIMARY KEY (block_id)
)
WITH compression_parameters:sstable_compression = '';

To enable or change compression on an existing column family, use ALTER TABLE and set the compression_parameters sstable_compression to SnappyCompressor or DeflateCompressor.

How to Change and Tune Compression

Change or tune data compression on a per-column family basis using CQL to alter a column family and set the compression_parameters attributes:

ALTER TABLE users
  WITH compression_parameters:sstable_compression = 'DeflateCompressor'
  AND compression_parameters:chunk_length_kb = 64;

When to Use Compression

Compression is best suited for column families that have many rows in which each row has the same columns, or at least as many columns, as the other rows. For example, a column family containing user data such as username, email, and state is a good candidate for compression. The greater the similarity of the data across rows, the greater the compression ratio and gain in read performance.

A column family that has rows of different sets of columns, or a few wide rows, is not well-suited for compression. Dynamic column families do not yield good compression ratios.
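The effect of row similarity on the compression ratio can be demonstrated with a small Python experiment. It uses Python's zlib module (a Deflate implementation) as a stand-in and does not reproduce Cassandra's chunked SSTable compression; the sample rows and function names are made up for this illustration.

```python
import random
import zlib


def compression_ratio(data):
    """Uncompressed size divided by Deflate-compressed size."""
    return len(data) / len(zlib.compress(data))


def similar_rows(count):
    """Rows that share the same columns and similar values."""
    return "".join(
        "username=user%d,email=user%d@example.com,state=TX\n" % (i, i)
        for i in range(count)
    ).encode("utf-8")


def dissimilar_rows(count, seed=0):
    """Rows of pseudo-random bytes with no shared structure."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(count * 48))


if __name__ == "__main__":
    print("similar rows:   ", round(compression_ratio(similar_rows(1000)), 2))
    print("dissimilar rows:", round(compression_ratio(dissimilar_rows(1000)), 2))
```

The rows with repeated column names and similar values compress several times over, while the unstructured rows barely compress at all.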

Don't confuse column family compression with compact storage of columns, which is used for backward compatibility of old applications with CQL 3.

Depending on the data characteristics of the column family, compressing its data can result in:

  • 2x-4x reduction in data size
  • 25-35% performance improvement on reads
  • 5-10% performance improvement on writes

After configuring compression on an existing column family, subsequently created SSTables are compressed. Existing SSTables on disk are not compressed immediately. Cassandra compresses existing SSTables when the normal Cassandra compaction process occurs. Force existing SSTables to be rewritten and compressed by using nodetool upgradesstables (Cassandra 1.0.4 or later) or nodetool scrub.

Testing Compaction and Compression

Write survey mode is a Cassandra startup option for testing new compaction and compression strategies. Using write survey mode, experiment with different strategies and benchmark write performance differences without affecting the production workload.

Write survey mode adds a node to a database cluster. The node accepts all write traffic as if it were part of the normal Cassandra cluster, but the node does not officially join the ring.

To enable write survey mode, start a Cassandra node using the option shown in this example:

bin/cassandra -Dcassandra.write_survey=true

Also use write survey mode to try out a new Cassandra version. The nodes you add in write survey mode to a cluster must be of the same major release version as other nodes in the cluster. The write survey mode relies on the streaming subsystem that transfers data between nodes in bulk and differs from one major release to another.

If you want to see how read performance is affected by modifications, stop the node, bring it up as a standalone machine, and then benchmark read operations on the node.