Consolidating SSTables and configuring compression of tables in the Cassandra database can improve performance. This section discusses these configuration topics and how to test your configuration changes to compaction and compression.
In the background, Cassandra periodically merges SSTables together into larger SSTables using a process called compaction. Compaction merges row fragments, removes expired tombstones, and rebuilds primary and secondary indexes. Because the SSTables are sorted by row key, this merge is efficient (no random disk I/O). After a newly merged SSTable is complete, the input SSTables are marked as obsolete and eventually deleted by the JVM garbage collection (GC) process. However, during compaction, there is a temporary spike in disk space usage and disk I/O.
Cassandra 1.2 tracks the times that tombstones can be dropped for TTL-configured and deleted columns and performs compaction when columns exceed a CQL-configurable threshold. Also, as of 1.2, you can better manage tombstone removal and avoid manually performing user-defined compaction to recover disk space. The CQL-configurable threshold sets the minimum time to wait after an SSTable creation time before considering the SSTable for tombstone compaction.
The capability to perform multiple, independent leveled compactions in parallel promotes full I/O utilization when using SSD hardware, which is not bottlenecked by I/O. Cassandra's leveled compaction strategy creates SSTables of a fixed, relatively small size that are grouped into levels. Each level (L0, L1, L2 and so on) is 10 times as large as the previous. Cassandra executes compactions in parallel between different levels, and performs multiple compactions per level. To configure this feature, set the multithreaded_compaction setting to true in the cassandra.yaml configuration file and set the compaction_strategy as described in Configuring compaction below.
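The tenfold growth between levels can be made concrete with a little arithmetic. The following Python sketch only illustrates the ratio stated above; the function name and the base_mb starting size are hypothetical, and it does not reflect Cassandra's internal bookkeeping.

```python
from typing import List


def level_sizes_mb(base_mb: int, levels: int, growth: int = 10) -> List[int]:
    """Return illustrative size budgets for L0, L1, L2, and so on.

    base_mb is an assumed size for the first level; each later
    level is `growth` times as large as the one before it.
    """
    return [base_mb * growth ** n for n in range(levels)]


# Example: four levels starting from an assumed 10 MB first level.
for level, size in enumerate(level_sizes_mb(10, 4)):
    print(f"L{level}: {size} MB")
```

Because each level is an order of magnitude larger than the previous one, only a few levels are needed to hold a large amount of data.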
Compaction impacts reads in two ways. During compaction, temporary increases in disk I/O and disk utilization can impact read performance for reads that are not fulfilled by the cache. However, after a compaction has been completed, off-cache read performance improves because there are fewer SSTable files on disk that need to be checked to complete a read request.
In addition to consolidating SSTables, the compaction process merges keys, combines columns, discards tombstones, and creates a new index in the merged SSTable.
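The merge step described above can be sketched in a few lines of Python. This is a simplified illustration, not Cassandra's implementation: it assumes one cell per row key in each SSTable, models a tombstone as a cell whose value is None, and uses a hypothetical gc_before cutoff to decide which tombstones have expired.

```python
import heapq
from typing import List, NamedTuple, Optional


class Cell(NamedTuple):
    key: str               # row key; every input SSTable is sorted by this
    timestamp: int         # write time; the newest write wins
    value: Optional[str]   # None marks a tombstone (a deletion)


def compact(sstables: List[List[Cell]], gc_before: int) -> List[Cell]:
    """Merge key-sorted SSTables into one, keeping the newest cell per key.

    Tombstones older than gc_before (an assumed cutoff) are dropped
    from the merged output.
    """
    merged: List[Cell] = []
    # heapq.merge streams the pre-sorted inputs in key order, so the
    # inputs are read sequentially rather than with random access.
    for cell in heapq.merge(*sstables, key=lambda c: c.key):
        if merged and merged[-1].key == cell.key:
            if cell.timestamp > merged[-1].timestamp:
                merged[-1] = cell   # a newer fragment replaces the older one
        else:
            merged.append(cell)
    # Discard tombstones that have passed the cutoff.
    return [c for c in merged
            if not (c.value is None and c.timestamp < gc_before)]
```

Because every input is already sorted by row key, the merged output is also sorted and can be written as a new SSTable in a single pass.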
There are two compaction strategies that you can configure on a table: SizeTieredCompactionStrategy, which is the default, and LeveledCompactionStrategy.
To set compaction, construct a property map using CQL. Set compaction properties using a map collection:
name = { 'name' : value, 'name' : value, 'name' : value ... }
In this string, italics indicates optional.
To create or update a table to set the compaction strategy, use the ALTER or CREATE TABLE statements. For example:
ALTER TABLE users WITH
compaction =
{ 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 10 };
For the list of options and more information, see CQL 3 table storage properties.
Control the frequency and scope of a minor compaction of a table that uses the default size-tiered compaction strategy by setting the CQL 3 min_threshold attribute. The size-tiered compaction strategy triggers a minor compaction when the number of similar-sized SSTables on disk reaches the value configured by min_threshold. Configure this value per table using CQL. For example:
ALTER TABLE users
WITH compaction =
{ 'class' : 'SizeTieredCompactionStrategy', 'min_threshold' : 6 };
By default, a minor compaction can begin any time Cassandra creates four SSTables on disk for a table. A minor compaction must begin before the total number of SSTables reaches 32.
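To make the threshold behavior concrete, here is a rough Python sketch of how a size-tiered trigger could be evaluated. It is a simplification under stated assumptions, not Cassandra's algorithm: the function name, the bucketing rule (an SSTable joins a bucket when its size is within 0.5 to 1.5 times the bucket average), and the max_threshold parameter name are all illustrative choices. Only the defaults of 4 and 32 come from the text above.

```python
from typing import List


def pick_minor_compaction(sizes_mb: List[int],
                          min_threshold: int = 4,
                          max_threshold: int = 32,
                          bucket_low: float = 0.5,
                          bucket_high: float = 1.5) -> List[int]:
    """Return the SSTable sizes chosen for a minor compaction, or [].

    SSTables are grouped into buckets of similar size (assumed rule);
    the first bucket holding at least min_threshold SSTables is
    compacted, capped at max_threshold SSTables.
    """
    buckets: List[List[int]] = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    for bucket in buckets:
        if len(bucket) >= min_threshold:
            return bucket[:max_threshold]
    return []
```

In this sketch, raising min_threshold from 4 to 6 means that four similar-sized SSTables no longer trigger a compaction, which makes minor compactions less frequent but larger in scope.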
Initiate a major compaction through nodetool compact. A major compaction merges all SSTables into one. Though major compaction can free disk space used by accumulated SSTables, during runtime it temporarily doubles disk space usage and is I/O and CPU intensive. After running a major compaction, automatic minor compactions are no longer triggered on a frequent basis. Consequently, you have to manually run major compactions on a routine basis. Expect read performance to improve immediately following a major compaction, and then to continually degrade until you invoke the next major compaction. For this reason, DataStax does not recommend major compaction.
Cassandra provides a startup option for testing compaction strategies without affecting the production workload.
For information about compaction metrics, see Compaction metrics.
Compression maximizes the storage capacity of Cassandra nodes by reducing the volume of data on disk and disk I/O, particularly for read-dominated workloads. Cassandra quickly finds the location of rows in the SSTable index and decompresses the relevant row chunks.
Write performance is not negatively impacted by compression in Cassandra as it is in traditional databases. In traditional relational databases, writes require overwrites to existing data files on disk. The database has to locate the relevant pages on disk, decompress them, overwrite the relevant data, and finally recompress. In a relational database, compression is an expensive operation in terms of CPU cycles and disk I/O. Because Cassandra SSTable data files are immutable (they are not written to again after they have been flushed to disk), there is no recompression cycle necessary in order to process writes. SSTables are compressed only once when they are written to disk. Writes on compressed tables can show up to a 10 percent performance improvement.
Compression is enabled by default in Cassandra 1.1. To disable compression, use CQL to set the compression parameters to an empty string:
CREATE TABLE DogTypes (
block_id uuid,
species text,
alias text,
population varint,
PRIMARY KEY (block_id)
)
WITH compression = { 'sstable_compression' : '' };
To enable or change compression on an existing table, use ALTER TABLE and set the compression algorithm sstable_compression to SnappyCompressor or DeflateCompressor.
Change or tune data compression on a per-table basis using CQL to alter a table and set the compression attributes:
ALTER TABLE users
WITH compression = { 'sstable_compression' : 'DeflateCompressor', 'chunk_length_kb' : 64 };
Compression is best suited for tables that have many rows in which each row has the same columns, or at least as many columns, as other rows. For example, a table containing user data such as username, email, and state is a good candidate for compression. The greater the similarity of the data across rows, the greater the compression ratio and gain in read performance.
A table that has rows of different sets of columns is not well-suited for compression. Dynamic tables do not yield good compression ratios.
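The effect of data similarity on compression ratio can be demonstrated outside Cassandra. The sketch below uses Python's zlib module, which implements DEFLATE, as a stand-in for DeflateCompressor, and compresses data in independent chunks sized by a chunk_length_kb parameter. The function name and the sample data are made up for illustration, and random bytes serve as an extreme case of dissimilar data; actual ratios in Cassandra depend on the real data and SSTable format.

```python
import os
import zlib


def compression_ratio(data: bytes, chunk_length_kb: int = 64) -> float:
    """Compress data chunk by chunk and return original size / compressed size."""
    chunk = chunk_length_kb * 1024
    compressed = sum(len(zlib.compress(data[i:i + chunk]))
                     for i in range(0, len(data), chunk))
    return len(data) / compressed


# Rows that share the same columns and similar values (hypothetical user data).
similar_rows = b"".join(b"user%06d,user%06d@example.com,TX\n" % (i, i)
                        for i in range(20000))
# Random bytes of the same length: an extreme case of dissimilar data.
dissimilar = os.urandom(len(similar_rows))

print("similar rows:", round(compression_ratio(similar_rows), 2))
print("dissimilar data:", round(compression_ratio(dissimilar), 2))
```

Running the sketch with different chunk_length_kb values gives a feel for how chunk size changes the ratio. In general, smaller chunks give the compressor less context to work with, while larger chunks mean more data must be decompressed to read a single row.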
Don't confuse table compression with compact storage of columns, which is used for backward compatibility of old applications with CQL 3.
Depending on the data characteristics of the table, compressing its data can result in:
After configuring compression on an existing table, subsequently created SSTables are compressed. Existing SSTables on disk are not compressed immediately. Cassandra compresses existing SSTables when the normal Cassandra compaction process occurs. Force existing SSTables to be rewritten and compressed by using nodetool upgradesstables (Cassandra 1.0.4 or later) or nodetool scrub.
Write survey mode is a Cassandra startup option for testing new compaction and compression strategies. In write survey mode, you can test out new compaction and compression strategies on that node and benchmark the write performance differences, without affecting the production cluster.
Write survey mode adds a node to a database cluster. The node accepts all write traffic as if it were part of the normal Cassandra cluster, but the node does not officially join the ring.
To enable write survey mode, start a Cassandra node using the option shown in this example:
bin/cassandra -Dcassandra.write_survey=true
Also use write survey mode to try out a new Cassandra version. The nodes you add in write survey mode to a cluster must be of the same major release version as other nodes in the cluster. Write survey mode relies on the streaming subsystem, which transfers data between nodes in bulk and differs from one major release to another.
If you want to see how read performance is affected by modifications, stop the node, bring it up as a standalone machine, and then benchmark read operations on the node.