Tuning Cassandra and Java resources is recommended in the event of a performance degradation, high memory consumption, and other atypical situations described in this section. After completion of tuning operations, follow recommendations in this section to monitor and test changes. Tuning Cassandra includes the following tasks:
Each SSTable has a Bloom filter. A Bloom filter tests whether an element is a member of a set. False positive retrieval results are possible, but false negatives are not. In Cassandra when data is requested, the Bloom filter checks if the requested row exists in the SSTable before doing any disk I/O. High memory consumption can result from the Bloom filter false positive ratio being set too low. The higher the Bloom filter setting, the lower the memory consumption. By tuning a Bloom filter, you are setting the chance of false positives; the lower the chances of false positives, the larger the Bloom filter. The maximum recommended setting is 0.1, as anything above this value yields diminishing returns.
Bloom filter settings range from 0.000744 (default) to 1.0 (disabled). For example, to run an analytics application that heavily scans a particular column family, you would want to inhibit the Bloom filter on the column family by setting it high. An analytics application that scans a column family rarely asks for keys that don't exist, so the Bloom filter provides little benefit there and its memory is better used elsewhere.
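The trade-off between false positive chance and memory can be estimated with the textbook Bloom filter sizing formula. The following is a rough sketch that assumes an ideal Bloom filter. Cassandra's own implementation rounds to whole bits and hash counts, so real figures will differ. The function names and the key count are hypothetical.

```python
import math


def bloom_bits_per_key(fp_chance):
    """Bits per key for an ideal Bloom filter with the given false positive chance."""
    if not 0.0 < fp_chance <= 1.0:
        raise ValueError("fp_chance must be in (0, 1]")
    if fp_chance == 1.0:
        # A setting of 1.0 disables the filter, so it needs no memory.
        return 0.0
    # Textbook formula: m/n = -ln(p) / (ln 2)^2
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_megabytes(num_keys, fp_chance):
    """Approximate filter size in MB for num_keys keys."""
    return num_keys * bloom_bits_per_key(fp_chance) / 8 / (1024 * 1024)


if __name__ == "__main__":
    keys = 100_000_000  # hypothetical number of row keys in one SSTable
    for p in (0.000744, 0.01, 0.1, 1.0):
        print(f"fp_chance={p}: {bloom_bits_per_key(p):.1f} bits/key, "
              f"{bloom_megabytes(keys, p):.1f} MB")
```

Running the script prints one line per setting, which makes it easy to see how quickly the estimated filter size falls as bloom_filter_fp_chance rises.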
To change the Bloom filter attribute on a column family, use CQL. For example:
ALTER TABLE addamsFamily WITH bloom_filter_fp_chance = 0.01;
After updating the value of bloom_filter_fp_chance on a column family, Bloom filters need to be regenerated in one of these ways:
You do not have to restart Cassandra after regenerating SSTables.
These caches are built into Cassandra and provide very efficient data caching:
If read performance is critical, you can leverage the built-in caching to remove dedicated caching tools, such as memcached, from the stack entirely. Such deployments remove a redundant layer and strengthen cache functionality in the lower tier where the data is already being stored. Caching never needs to be restarted in a completely cold state.
With proper tuning, key cache hit rates of 85% or better are possible with Cassandra, and each hit on a key cache can save one disk seek per SSTable. Row caching, when feasible, can save the system from performing any disk seeks at all when fetching a cached row. When growth in the read load begins to impact your hit rates, you can add capacity to restore optimal levels of caching. Typically, expect a 90% hit rate for row caches. If row cache hit rates are 30% or lower, it may make more sense to leave row caching disabled (the default). Using only the key cache makes the row cache available for other column families that need it.
When both row and key caches are configured, the row cache returns results whenever possible. In the event of a row cache miss, the key cache might still provide a hit that makes the disk seek much more efficient. This diagram depicts two read operations on a column family with both caches already populated.
One read operation hits the row cache, returning the requested row without a disk seek. The other read operation requests a row that is not present in the row cache but is present in the key cache. After accessing the row in the SSTable, the system returns the data and populates the row cache with this read operation.
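The two read operations described above can be sketched as a toy model. This is only an illustration of the lookup order (row cache, then key cache, then the SSTable), not Cassandra's actual read path. The class name, the seek counters, and the in-memory stand-in for an SSTable are all hypothetical.

```python
class ColumnFamilyReader:
    """Toy model of a read path with a row cache and a key cache."""

    def __init__(self, rows):
        # Stand-ins for the on-disk SSTable data file and its index.
        self.data_file = list(rows.values())
        self.index = {key: pos for pos, key in enumerate(rows)}
        self.row_cache = {}   # row key -> full row
        self.key_cache = {}   # row key -> position of the row in the data file
        self.index_seeks = 0  # seeks spent locating a key in the index
        self.data_seeks = 0   # seeks spent reading the row itself

    def read(self, key):
        # 1. Row cache hit: return the row without touching the disk.
        if key in self.row_cache:
            return self.row_cache[key]
        # 2. Key cache miss: pay for an index lookup to find the position.
        if key not in self.key_cache:
            self.index_seeks += 1
            if key not in self.index:
                return None
            self.key_cache[key] = self.index[key]
        # 3. Read the row at the known position and populate the row cache.
        self.data_seeks += 1
        row = self.data_file[self.key_cache[key]]
        self.row_cache[key] = row
        return row


if __name__ == "__main__":
    reader = ColumnFamilyReader({"gomez": {"age": 42}, "morticia": {"age": 40}})
    reader.row_cache["gomez"] = {"age": 42}             # already in the row cache
    reader.key_cache["morticia"] = reader.index["morticia"]  # only in the key cache
    reader.read("gomez")
    reader.read("morticia")
    print(reader.index_seeks, reader.data_seeks)
```

In this model the first read costs no seeks, and the second read costs a single data seek because the key cache already holds the row's position.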
Because the key cache holds the location of keys in memory on a per-column family basis, turning this value up can have an immediate, positive impact on column family reads as soon as the cache warms.
High levels of key caching are recommended for most scenarios. Cases for row caching are more specialized, but whenever it can coexist peacefully with other demands on memory resources, row caching provides the most dramatic gains in efficiency.
Using the default key cache setting, or a higher one, works well in most cases. Tune key cache sizes in conjunction with the Java heap size.
Row caching saves more time than key caching, but it is extremely space consuming. Row caching is recommended in these cases:
Some tips for efficient cache use are:
Cassandra's memtables have overhead for index structures on top of the actual data they store. If the size of the values stored in the heavily-read columns is small compared to the number of columns and rows themselves (long, narrow rows), this overhead can be substantial. Short, narrow rows, on the other hand, lend themselves to highly efficient row caching.
Enable the key and row caches at the column family level using the CQL caching parameter. Unlike earlier Cassandra versions, cache sizes do not need to be specified per table. Just set caching to all, keys_only, rows_only, or none. Cassandra weights the cached data by size and access frequency, and thus makes optimal use of the cache memory without manual tuning. For archived tables, disable caching entirely because these tables are read infrequently.
In the cassandra.yaml file, tune caching by changing these options:
Make changes to cache options in small, incremental adjustments, then monitor the effects of each change using one of the following tools:
Cassandra can store cached rows in native memory, outside the Java heap. This results in both a smaller per-row memory footprint and reduced JVM heap requirements, which helps keep the heap size in the sweet spot for JVM garbage collection performance.
Using the off-heap row cache requires the JNA library to be installed; otherwise, Cassandra falls back on the on-heap cache provider.
Because Cassandra is a database, it spends significant time interacting with the operating system's I/O infrastructure through the JVM, so a well-tuned Java heap size is important. Cassandra's default configuration opens the JVM with a heap size that is based on the total amount of system memory:
| System Memory | Heap Size |
| Less than 2GB | 1/2 of system memory |
| 2GB to 4GB | 1GB |
| Greater than 4GB | 1/4 of system memory, but not more than 8GB |
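The table can be expressed as a small calculation. The sketch below only reproduces the rules in the table; the function name is hypothetical, and the startup script may differ in detail from this simplification.

```python
def default_max_heap_mb(system_memory_mb):
    """Default heap size in MB for a given amount of system memory, per the table."""
    # Half of system memory, capped at 1GB (covers the first two rows).
    half_capped = min(system_memory_mb // 2, 1024)
    # A quarter of system memory, capped at 8GB (covers the last row).
    quarter_capped = min(system_memory_mb // 4, 8192)
    # The larger of the two candidates matches every row of the table.
    return max(half_capped, quarter_capped)


if __name__ == "__main__":
    for memory_mb in (1024, 3072, 16384, 65536):
        print(f"{memory_mb} MB system memory -> {default_max_heap_mb(memory_mb)} MB heap")
```

Taking the larger of the two capped values avoids special-casing the 2GB and 4GB boundaries, where both rules give the same answer.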
Many users new to Cassandra are tempted to turn up Java heap size too high, which consumes the majority of the underlying system's RAM. In most cases, increasing the Java heap size is actually detrimental for these reasons:
To change a JVM setting, modify the cassandra-env.sh file.
Because MapReduce runs outside the JVM, changes to the JVM do not affect Hadoop operations directly.
Cassandra's GCInspector class logs information about garbage collection whenever a garbage collection takes longer than 200ms. Garbage collections that occur frequently and take a moderate length of time to complete (such as ConcurrentMarkSweep taking a few seconds) indicate that there is a lot of garbage collection pressure on the JVM. Remedies include adding nodes, lowering cache sizes, or adjusting the JVM options regarding garbage collection.
In addition to consolidating SSTables, the compaction process merges keys, combines columns, discards tombstones, and creates a new index in the merged SSTable.
To tune compaction, set a compaction_strategy for each column family based on its access patterns. The compaction strategies are:
Appropriate for append-mostly workloads, which add new rows, and to a lesser degree, new columns to old rows.
Appropriate for workloads with many updates that change the values of existing columns, such as time-bound data in columns marked for expiration using TTL.
For example, to update a column family to use the leveled compaction strategy using Cassandra CQL:
ALTER TABLE users WITH compaction_strategy_class='LeveledCompactionStrategy' AND compaction_strategy_options:sstable_size_in_mb = 10;
Control the frequency and scope of a minor compaction of a column family that uses the default size-tiered compaction strategy by setting the min_compaction_threshold. The size-tiered compaction strategy triggers a minor compaction when the number of similarly sized SSTables on disk reaches the value configured by min_compaction_threshold.
By default, a minor compaction can begin any time Cassandra creates four SSTables on disk for a column family. A minor compaction must begin before the total number of SSTables reaches 32.
Configure this value per column family using CQL. For example:
ALTER TABLE users WITH min_compaction_threshold = 6;
This CQL example shows how to change the compaction_strategy_class and set a minimum compaction threshold:
ALTER TABLE users WITH compaction_strategy_class='SizeTieredCompactionStrategy' AND min_compaction_threshold = 6;
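The effect of min_compaction_threshold can be illustrated with a simplified model of size-tiered bucketing. This is a sketch under stated assumptions, not Cassandra's implementation: the rule that SSTables belong together when their size is within 50% of the bucket average is an assumption made for illustration, and the function names and SSTable sizes are hypothetical.

```python
def bucket_by_size(sstable_sizes_mb, low=0.5, high=1.5):
    """Group SSTables whose size falls within [low, high] times a bucket's average."""
    buckets = []
    for size in sorted(sstable_sizes_mb):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if low * average <= size <= high * average:
                bucket.append(size)
                break
        else:
            # No existing bucket is similar enough, so start a new one.
            buckets.append([size])
    return buckets


def minor_compaction_candidates(sstable_sizes_mb, min_threshold=4):
    """Return the buckets that hold at least min_threshold SSTables."""
    return [bucket for bucket in bucket_by_size(sstable_sizes_mb)
            if len(bucket) >= min_threshold]


if __name__ == "__main__":
    sizes = [10, 11, 12, 10, 200]  # hypothetical SSTable sizes in MB
    print(minor_compaction_candidates(sizes, min_threshold=4))
    print(minor_compaction_candidates(sizes, min_threshold=6))
```

With the default threshold of four, the four small SSTables qualify for a minor compaction. Raising the threshold to six, as in the CQL example, delays compaction until more similarly sized SSTables accumulate.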
Initiate a major compaction through nodetool compact. A major compaction merges all SSTables into one. Though major compaction can free disk space used by accumulated SSTables, during runtime it temporarily doubles disk space usage and is I/O and CPU intensive. After running a major compaction, automatic minor compactions are no longer triggered on a frequent basis. Consequently, you have to run major compactions manually on a routine basis. Expect read performance to improve immediately following a major compaction, and then to continually degrade until you invoke the next major compaction. For this reason, DataStax does not recommend major compaction.
Cassandra provides a startup option for testing compaction strategies without affecting the production workload.
Compression maximizes the storage capacity of Cassandra nodes by reducing the volume of data on disk and disk I/O, particularly for read-dominated workloads. Cassandra quickly finds the location of rows in the SSTable index and decompresses the relevant row chunks.
Write performance is not negatively impacted by compression in Cassandra as it is in traditional databases. In traditional relational databases, writes require overwrites to existing data files on disk. The database has to locate the relevant pages on disk, decompress them, overwrite the relevant data, and finally recompress. In a relational database, compression is an expensive operation in terms of CPU cycles and disk I/O. Because Cassandra SSTable data files are immutable (they are not written to again after they have been flushed to disk), there is no recompression cycle necessary in order to process writes. SSTables are compressed only once when they are written to disk. Writes on compressed tables can show up to a 10 percent performance improvement.
Compression is enabled by default in Cassandra 1.1. To disable compression, use CQL to set the compression parameters to an empty string:
CREATE TABLE DogTypes ( block_id uuid, species text, alias text, population varint, PRIMARY KEY (block_id) ) WITH compression_parameters:sstable_compression = '';
To enable or change compression on an existing column family, use ALTER TABLE and set the compression_parameters sstable_compression to SnappyCompressor or DeflateCompressor.
Change or tune data compression on a per-column family basis using CQL to alter a column family and set the compression_parameters attributes:
ALTER TABLE users WITH compression_parameters:sstable_compression = 'DeflateCompressor' AND compression_parameters:chunk_length_kb = 64;
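The role of chunk_length_kb can be illustrated with a small sketch of chunked compression, in which a read decompresses only the chunks that cover the requested bytes. This is an illustration of the idea only. It uses Python's zlib as a stand-in for DeflateCompressor and does not reflect Cassandra's on-disk format; the function names are hypothetical.

```python
import zlib


def compress_chunks(data, chunk_length_kb=64):
    """Compress data in independent fixed-size chunks."""
    chunk_size = chunk_length_kb * 1024
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def read_range(chunks, offset, length, chunk_length_kb=64):
    """Read length bytes at offset, decompressing only the chunks that cover them."""
    chunk_size = chunk_length_kb * 1024
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    raw = b"".join(zlib.decompress(chunks[i]) for i in range(first, last + 1))
    start = offset - first * chunk_size
    return raw[start:start + length]


if __name__ == "__main__":
    data = bytes(range(256)) * 1024          # 256 KB of sample data
    chunks = compress_chunks(data, chunk_length_kb=64)
    row = read_range(chunks, offset=70000, length=100)
    print(len(chunks), row == data[70000:70100])
```

A smaller chunk length means less data to decompress per read, while a larger one gives the compressor more context to work with.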
Compression is best suited for column families that have many rows and each row has the same columns, or at least as many columns, as other rows. For example, a column family containing user data such as username, email, and state, is a good candidate for compression. The greater the similarity of the data across rows, the greater the compression ratio and gain in read performance.
A column family that has rows of different sets of columns, or a few wide rows, is not well-suited for compression. Dynamic column families do not yield good compression ratios.
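The effect of row similarity on compression ratio can be demonstrated with a short experiment. This sketch uses zlib as a stand-in for Cassandra's compressors. The row layout is hypothetical, and random bytes are used as an extreme stand-in for rows that share little structure, so treat the result as directional rather than a prediction for real tables.

```python
import random
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


def similar_rows(count):
    """Rows that all share the same columns: username, email, state."""
    states = ["TX", "CA", "NY"]
    rows = (f"username=user{i};email=user{i}@example.com;state={states[i % 3]}\n"
            for i in range(count))
    return "".join(rows).encode("ascii")


def dissimilar_rows(count, seed=0):
    """Random bytes, an extreme stand-in for rows with little in common."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(count * 60))


if __name__ == "__main__":
    print(f"similar rows:    {compression_ratio(similar_rows(1000)):.2f}")
    print(f"dissimilar rows: {compression_ratio(dissimilar_rows(1000)):.2f}")
```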
Don't confuse column family compression with compact storage of columns, which is used for backward compatibility of old applications with CQL 3.
Depending on the data characteristics of the column family, compressing its data can result in:
After configuring compression on an existing column family, subsequently created SSTables are compressed. Existing SSTables on disk are not compressed immediately. Cassandra compresses existing SSTables when the normal Cassandra compaction process occurs. Force existing SSTables to be rewritten and compressed by using nodetool upgradesstables (Cassandra 1.0.4 or later) or nodetool scrub.
Write survey mode is a Cassandra startup option for testing new compaction and compression strategies. Using write survey mode, experiment with different strategies and benchmark write performance differences without affecting the production workload.
Write survey mode adds a node to a database cluster. The node accepts all write traffic as if it were part of the normal Cassandra cluster, but the node does not officially join the ring.
To enable write survey mode, start a Cassandra node using the option shown in this example:
bin/cassandra -Dcassandra.write_survey=true
Also use write survey mode to try out a new Cassandra version. The nodes you add in write survey mode to a cluster must be of the same major release version as other nodes in the cluster. Write survey mode relies on the streaming subsystem, which transfers data between nodes in bulk and differs from one major release to another.
If you want to see how read performance is affected by modifications, stop the node, bring it up as a standalone machine, and then benchmark read operations on the node.