Apache Cassandra 1.2 Documentation

Keyspace and table storage configuration

Cassandra stores storage configuration attributes in the system keyspace. You set storage configuration attributes on a per-keyspace or per-table basis, either programmatically or through a client interface such as CQL. The attribute names documented in this section are the names as they are stored in the system keyspace within Cassandra. A few attributes have slightly different names in CQL.

Keyspace attributes

A keyspace must have a user-defined name, a replica placement strategy, and options that specify the number of copies per data center or node.

Attribute            Default Value
name                 N/A (a user-defined value is required)
placement_strategy   SimpleStrategy
strategy_options     N/A (container attribute)
durable_writes       true

name

Required. The name for the keyspace.

placement_strategy

Required. Called strategy_class in CQL. Determines how Cassandra distributes replicas for a keyspace among nodes in the ring.

Values are:

  • SimpleStrategy or org.apache.cassandra.locator.SimpleStrategy
  • NetworkTopologyStrategy or org.apache.cassandra.locator.NetworkTopologyStrategy

NetworkTopologyStrategy requires a properly configured snitch to be able to determine rack and data center locations of a node. For more information about replication placement strategy, see About data replication.

strategy_options

Specifies configuration options for the chosen replication strategy class. The replication factor option is the total number of replicas across the cluster. A replication factor of 1 means that there is only one copy of each row on one node. A replication factor of 2 means there are two copies of each row, where each copy is on a different node. All replicas are equally important; there is no primary or master replica. As a general rule, the replication factor should not exceed the number of nodes in the cluster. However, you can increase the replication factor and then add the desired number of nodes.

When the replication factor exceeds the number of nodes, writes are rejected, but reads are served as long as the desired consistency level can be met.

To set a placement strategy and options using CQL, see CREATE KEYSPACE. For more information about configuring the replication placement strategy for a cluster and data centers, see Choosing keyspace replication options.

durable_writes

(Default: true) When set to false, data written to the keyspace bypasses the commit log. Be careful using this option because you risk losing data. Do not set this attribute on a keyspace that uses the SimpleStrategy. Change the durable_writes attribute using CQL.

Table attributes

The following attributes can be declared per table.

Option                                Default Value
bloom_filter_fp_chance                0.01 or 0.1, depending on compaction strategy
bucket_high                           1.5
bucket_low                            0.5
caching                               keys_only
column_metadata                       n/a (container attribute)
column_type                           Standard
comment                               n/a
compaction_strategy                   SizeTieredCompactionStrategy
compaction_strategy_options           n/a (container attribute)
comparator                            BytesType
compare_subcolumns_with [1]           BytesType
compression_options                   sstable_compression='SnappyCompressor'
default_validation_class              n/a
dclocal_read_repair_chance            0.0
gc_grace_seconds                      864000 (10 days)
key_validation_class                  n/a
max_compaction_threshold [2]          32
max_threshold [3]                     32
min_compaction_threshold [2]          4
min_threshold [3]                     4
memtable_flush_after_mins [1]         n/a
memtable_operations_in_millions [1]   n/a
memtable_throughput_in_mb [1]         n/a
min_sstable_size                      50 MB
name                                  n/a
read_repair_chance                    0.1 or 1 (see description below)
replicate_on_write                    true
sstable_size_in_mb                    5 MB
tombstone_compaction_interval         1 day
tombstone_threshold                   0.2

[1] Ignored in Cassandra 1.2, but can still be declared for backward compatibility.
[2] Used by Thrift and CQL 2; ignored in CQL 3.
[3] The CQL 3 attribute name for the max_compaction_threshold and min_compaction_threshold Cassandra storage options.

bloom_filter_fp_chance

(Default: 0.01 for SizeTieredCompactionStrategy, 0.1 for LeveledCompactionStrategy) Desired false-positive probability for SSTable Bloom filters. When data is requested, the Bloom filter checks whether the requested row exists before doing any disk I/O. Valid values are 0 to 1.0. A setting of 0 enables the unmodified (effectively the largest possible) Bloom filter. A setting of 1.0 disables the Bloom filter. The higher the setting, the less memory Cassandra uses. The maximum recommended setting is 0.1, as anything above this value yields diminishing returns. For detailed information, see Tuning Bloom filters.
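To give a feel for the memory trade-off, the following Python sketch applies the textbook sizing formula for an optimally configured Bloom filter. It is an approximation only: the function names are hypothetical, and Cassandra's own filter sizing may differ from this formula.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Textbook estimate of bits needed per key for a target false-positive rate."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("this estimate needs fp_chance strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_megabytes(num_keys: int, fp_chance: float) -> float:
    """Approximate filter size in MB for num_keys row keys (hypothetical helper)."""
    return num_keys * bloom_bits_per_key(fp_chance) / 8 / (1024 * 1024)


if __name__ == "__main__":
    for chance in (0.01, 0.1):
        size = bloom_megabytes(100_000_000, chance)
        print(f"fp_chance={chance}: roughly {size:.0f} MB per 100 million keys")
```

Under this estimate, raising the setting from 0.01 to 0.1 roughly halves the memory the filter needs.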

bucket_high

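(Default: 1.5) Size-tiered compaction considers an SSTable to belong to a bucket if its size falls within the interval [average_size * bucket_low, average_size * bucket_high], where average_size is the average size of the SSTables in that bucket. With the default bucket_low and bucket_high values, SSTables whose sizes diverge by 50% or less from the average share a bucket.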
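The interval can be worked through with a small Python sketch. The function names are hypothetical, and treating the interval bounds as inclusive is an assumption taken from the bracket notation above.

```python
def bucket_bounds(average_size, bucket_low=0.5, bucket_high=1.5):
    """Return the (low, high) size interval for a bucket with this average size."""
    return average_size * bucket_low, average_size * bucket_high


def fits_bucket(sstable_size, average_size, bucket_low=0.5, bucket_high=1.5):
    """True if an SSTable of sstable_size falls inside the bucket's interval."""
    low, high = bucket_bounds(average_size, bucket_low, bucket_high)
    return low <= sstable_size <= high  # inclusive bounds are an assumption


if __name__ == "__main__":
    # Sizes in MB; a bucket averaging 100 MB accepts SSTables of 50 to 150 MB.
    print(bucket_bounds(100))
    print(fits_bucket(140, 100), fits_bucket(160, 100))
```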

bucket_low

(Default: 0.5) See bucket_high for a description.

caching

(Default: keys_only) Optimizes the use of cache memory without manual tuning. Set caching to one of the following values:

  • all
  • keys_only
  • rows_only
  • none

Cassandra weights the cached data by size and access frequency. In Cassandra 1.1 and later, use this parameter to specify a key cache or row cache, rather than a table cache as in earlier versions.

chunk_length_kb

(Default: 64 KB) On disk, SSTables are compressed by block to allow random reads. This subproperty of compression defines the size (in KB) of the block. Values larger than the default might improve the compression rate, but they increase the minimum amount of data that must be read from disk when a read occurs. The default value (64) is a good middle ground for compressing tables. Adjust the compression size to account for read/write access patterns (how much data is typically requested at once) and the average size of rows in the table.
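The effect of block size on reads can be estimated with simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and offsets are assumed to be positions in the uncompressed data.

```python
def chunks_read(offset_bytes, length_bytes, chunk_length_kb=64):
    """Number of compressed blocks that a read of length_bytes at offset_bytes touches."""
    if length_bytes < 1:
        raise ValueError("length_bytes must be at least 1")
    chunk = chunk_length_kb * 1024
    first = offset_bytes // chunk
    last = (offset_bytes + length_bytes - 1) // chunk
    return last - first + 1


def bytes_decompressed(offset_bytes, length_bytes, chunk_length_kb=64):
    """Uncompressed bytes that must be produced to serve the read."""
    touched = chunks_read(offset_bytes, length_bytes, chunk_length_kb)
    return touched * chunk_length_kb * 1024


if __name__ == "__main__":
    # Reading a 1 KB row still requires decompressing one full block.
    print(bytes_decompressed(0, 1024, chunk_length_kb=64))
    print(bytes_decompressed(0, 1024, chunk_length_kb=4))
```

Small rows read at random favor smaller blocks; large sequential reads lose less with larger ones.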

column_metadata

(Default: N/A - container attribute) Column metadata defines these attributes of a column:

Attribute          Description
name               Binds a validation_class and (optionally) an index to a column.
validation_class   Type used to check the column value.
index_name         Name for the secondary index.
index_type         Type of index. Currently the only supported value is KEYS.

Setting a value for the name option is required. The validation_class is set to the default_validation_class of the table if you do not set the validation_class option explicitly. The value of index_type must be set to create a secondary index for a column. The value of index_name is not valid unless index_type is also set.

Setting and updating column metadata with the Cassandra CLI requires a slightly different command syntax than other attributes; note the brackets and curly braces in this example:

[default@demo] UPDATE COLUMN FAMILY users WITH comparator=UTF8Type
AND column_metadata=[{column_name: full_name, validation_class: UTF8Type, index_type: KEYS}];

column_type

(Default: Standard) The standard type of table contains regular columns.

comment

(Default: N/A) A human-readable comment describing the table.

compaction_strategy

(Default: SizeTieredCompactionStrategy) Sets the compaction strategy for the table. The available strategies are:

  • SizeTieredCompactionStrategy: The default compaction strategy and the only compaction strategy available in releases earlier than Cassandra 1.0. This strategy triggers a minor compaction whenever there are a number of similarly sized SSTables on disk (as configured by min_threshold). Using this strategy causes bursts in I/O activity while a compaction is in progress, followed by longer and longer lulls in compaction activity as SSTable files grow larger. These I/O bursts can negatively affect read-heavy workloads, but typically do not impact write performance. Watching disk capacity is also important when using this strategy, because a compaction can temporarily double the disk space used by a table's SSTables while it is in progress.
  • LeveledCompactionStrategy: The leveled compaction strategy creates SSTables of a fixed, relatively small size (5 MB by default) that are grouped into levels. Within each level, SSTables are guaranteed to be non-overlapping. Each level (L0, L1, L2, and so on) is 10 times as large as the previous one. Disk I/O is more uniform and predictable because SSTables are continuously compacted into progressively larger levels. At each level, row keys are merged into non-overlapping SSTables. This can improve read performance, because Cassandra can determine which SSTables in each level to check for the existence of row key data. This compaction strategy is modeled after Google's LevelDB implementation. For more information, see the articles When to Use Leveled Compaction and Leveled Compaction in Apache Cassandra.
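The tenfold growth between levels described for LeveledCompactionStrategy can be illustrated with simple arithmetic. This Python sketch is not Cassandra's implementation: the function name is hypothetical, and it assumes that L1 holds ten SSTables, which the description above does not state.

```python
def level_capacity_mb(level, sstable_size_in_mb=5, growth_factor=10):
    """Approximate capacity in MB of level L1, L2, and so on.

    Assumption (not stated in the text above): L1 holds growth_factor
    SSTables, and each later level is growth_factor times larger.
    """
    if level < 1:
        raise ValueError("this sketch only models L1 and higher")
    return sstable_size_in_mb * growth_factor ** level


if __name__ == "__main__":
    for level in range(1, 5):
        print(f"L{level}: about {level_capacity_mb(level)} MB")
```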

compaction_strategy_options

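(Default: N/A - container attribute) Sets attributes related to the chosen compaction_strategy. The compaction-related attributes described in this section are:

  • bucket_high
  • bucket_low
  • max_threshold
  • min_threshold
  • min_sstable_size
  • sstable_size_in_mb
  • tombstone_compaction_interval
  • tombstone_threshold

CQL examples show how to set and update compaction properties.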

comparator

(Default: BytesType) Defines the data types used to validate and sort column names. There are several built-in column comparators available. The comparator cannot be changed after you create a table.

compare_subcolumns_with

(Default: BytesType) Required when the column_type attribute is set to Super. Same as comparator but for the sub-columns of a super column. Ignored by Cassandra 1.2, but can be declared for backward compatibility.

compression_options

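(Default: N/A - container attribute) Sets the compression algorithm and subproperties for the table. The compression subproperties described in this section are:

  • sstable_compression
  • chunk_length_kb
  • crc_check_chance

Using CQL presents examples of setting and updating compression properties.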

crc_check_chance

(Default: 1.0) When compression is enabled, each compressed block includes a checksum of that block to detect disk bitrot and to avoid propagating corruption to other replicas. This option defines the probability with which those checksums are checked during reads. By default, they are always checked. Set the value to 0 to disable checksum checking, or to 0.5, for instance, to check checksums on about half of all reads.
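A probability setting of this kind is typically applied by drawing a random number on each read. The Python sketch below illustrates the idea only; the function name is hypothetical and this is not Cassandra's code.

```python
import random


def should_verify_checksum(crc_check_chance, rng=random):
    """Decide for one read whether to verify the block checksum.

    rng.random() returns a value in [0.0, 1.0), so a chance of 1.0
    always verifies and a chance of 0 never does.
    """
    return rng.random() < crc_check_chance


if __name__ == "__main__":
    rng = random.Random(7)  # seeded only to make the demonstration repeatable
    checked = sum(should_verify_checksum(0.5, rng) for _ in range(10_000))
    print(f"verified {checked} of 10000 reads")
```

With a value of 0.5, about half of the reads are verified, chosen at random rather than strictly alternating.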

default_validation_class

(Default: N/A) Defines the data type used to validate column values. There are several built-in column validators available.

dclocal_read_repair_chance

(Default: 0.0) Specifies the probability of read repairs being invoked over all replicas in the current data center. Contrast read_repair_chance.

gc_grace_seconds

(Default: 864000 [10 days]) Specifies the time to wait before garbage collecting tombstones (deletion markers). The default value allows a great deal of time for consistency to be achieved prior to deletion. In many deployments this interval can be reduced, and in a single-node cluster it can be safely set to zero. When using the CLI, use gc_grace instead of gc_grace_seconds.
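The waiting period amounts to a simple time comparison, sketched here in Python. The function name is hypothetical, and the exact boundary handling is an assumption rather than a statement about Cassandra's internals.

```python
GC_GRACE_SECONDS_DEFAULT = 10 * 24 * 60 * 60  # 10 days expressed in seconds


def tombstone_is_purgeable(deleted_at, now, gc_grace_seconds=GC_GRACE_SECONDS_DEFAULT):
    """True once at least gc_grace_seconds have passed since the deletion.

    deleted_at and now are timestamps in seconds.
    """
    return now - deleted_at >= gc_grace_seconds


if __name__ == "__main__":
    print(GC_GRACE_SECONDS_DEFAULT)
    print(tombstone_is_purgeable(deleted_at=0, now=5 * 24 * 60 * 60))
    print(tombstone_is_purgeable(deleted_at=0, now=11 * 24 * 60 * 60))
```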

key_validation_class

(Default: N/A) Defines the data type used to validate row key values. There are several built-in key validators available; however, CounterColumnType (distributed counters) cannot be used as a row key validator.

max_compaction_threshold

(Default: 32) Used by Thrift and CQL2. Ignored in CQL3; replaced by max_threshold. Sets the maximum number of SSTables processed by one minor compaction.

max_threshold

(Default: 32) Sets the maximum number of SSTables processed by one minor compaction when using SizeTieredCompactionStrategy.

min_compaction_threshold

(Default: 4) Used by Thrift and CQL2. Ignored in CQL3; replaced by min_threshold. Sets the minimum number of SSTables to trigger a minor compaction when compaction_strategy=SizeTieredCompactionStrategy.

min_threshold

(Default: 4) Sets the minimum number of SSTables to start a minor compaction when using SizeTieredCompactionStrategy. Raising this value causes minor compactions to start less frequently and be more I/O-intensive.
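How the two thresholds bound a minor compaction can be sketched in a few lines of Python. The function name is hypothetical, and which SSTables are kept when a bucket exceeds max_threshold is an assumption made for the example.

```python
def sstables_to_compact(bucket, min_threshold=4, max_threshold=32):
    """Pick the SSTables from one bucket for a minor compaction.

    Returns an empty list if the bucket has fewer than min_threshold
    SSTables; otherwise returns at most max_threshold of them.
    """
    if len(bucket) < min_threshold:
        return []
    # Assumption: simply keep the first max_threshold entries.
    return list(bucket)[:max_threshold]


if __name__ == "__main__":
    print(len(sstables_to_compact(["a", "b", "c"])))
    print(len(sstables_to_compact([f"sstable-{i}" for i in range(40)])))
```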

memtable_flush_after_mins

Deprecated as of Cassandra 1.0. Can still be declared (for backward compatibility) but is ignored. Use commitlog_total_space_in_mb.

memtable_operations_in_millions

Deprecated as of Cassandra 1.0. Can still be declared (for backward compatibility) but is ignored. Use commitlog_total_space_in_mb.

memtable_throughput_in_mb

Deprecated as of Cassandra 1.0. Can still be declared (for backward compatibility) but is ignored. Use commitlog_total_space_in_mb.

min_sstable_size

(Default: 50MB) The size-tiered compaction strategy groups SSTables for compaction into buckets. The bucketing process groups SSTables that differ in size by less than 50%. This results in a bucketing process that is too fine grained for small SSTables. If your SSTables are small, use min_sstable_size to define a size threshold (in bytes) below which all SSTables belong to one unique bucket.
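The following Python sketch shows how a size floor changes the grouping. It is a simplified illustration rather than Cassandra's implementation: the function name is hypothetical, and the bucket_low and bucket_high defaults of 0.5 and 1.5 are taken from the attribute descriptions in this section.

```python
MB = 1024 * 1024


def group_into_buckets(sizes, bucket_low=0.5, bucket_high=1.5,
                       min_sstable_size=50 * MB):
    """Group SSTable sizes (in bytes) into size-tiered buckets."""
    buckets = []  # each bucket is a list of SSTable sizes
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            similar = average * bucket_low <= size <= average * bucket_high
            both_small = size < min_sstable_size and average < min_sstable_size
            if similar or both_small:
                bucket.append(size)
                break
        else:
            # No existing bucket accepted this SSTable, so start a new one.
            buckets.append([size])
    return buckets


if __name__ == "__main__":
    sizes = [1 * MB, 5 * MB, 20 * MB, 200 * MB, 220 * MB]
    print(len(group_into_buckets(sizes)))
    print(len(group_into_buckets(sizes, min_sstable_size=0)))
```

With the 50 MB floor, the three small SSTables share one bucket; with the floor set to 0, each lands in a bucket of its own.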

name

(Default: N/A) Required. The user-defined name of the table.

read_repair_chance

(Default: 0.1 or 1) Specifies the probability with which read repairs should be invoked on non-quorum reads. The value must be between 0 and 1. For tables created in versions of Cassandra before 1.0, it defaults to 1. For tables created in Cassandra 1.0 and later, it defaults to 0.1. However, for Cassandra 1.0, the default is 1.0 if you use the CLI or any Thrift client, such as Hector or pycassa, and is 0.1 if you use CQL.

replicate_on_write

(Default: true) Applies only to counter tables. When set to true, replicates writes to all affected replicas regardless of the consistency level specified by the client for a write request. For counter tables, this should always be set to true.

sstable_size_in_mb

(Default: 5 MB) The target size for SSTables that use the leveled compaction strategy. Although SSTable sizes should be less than or equal to sstable_size_in_mb, it is possible to have a larger SSTable during compaction. This occurs when data for a given partition key is exceptionally large. The data is not split into two SSTables.

sstable_compression

(Default: SnappyCompressor) The compression algorithm to use. Valid values are LZ4Compressor (available in Cassandra 1.2.2 and later), SnappyCompressor, and DeflateCompressor. Use an empty string ('') to disable compression. Choosing the right compressor depends on your requirements for space savings over read performance. LZ4 is fastest to decompress, followed by Snappy, then by Deflate. Compression effectiveness is inversely correlated with decompression speed. The extra compression from Deflate or Snappy is not enough to make up for the decreased performance for general-purpose workloads, but for archival data they may be worth considering. Developers can also implement custom compression classes using the org.apache.cassandra.io.compress.ICompressor interface. Specify the full class name as a "string constant".

tombstone_compaction_interval

(Default: 1 day) The minimum time to wait after an SSTable's creation time before considering the SSTable for tombstone compaction. Tombstone compaction is the compaction triggered if the SSTable has more garbage-collectable tombstones than tombstone_threshold.

tombstone_threshold

(Default: 0.2) A ratio of garbage-collectable tombstones to all contained columns. If the SSTable exceeds this ratio, a compaction of that SSTable alone (with no other SSTables) is triggered to purge the tombstones.
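The tombstone_compaction_interval and tombstone_threshold attributes combine into a two-part check, sketched below in Python. The function and parameter names are hypothetical, and the sketch assumes the counts of garbage-collectable tombstones and columns are already known; how Cassandra estimates them is not covered here.

```python
ONE_DAY_SECONDS = 24 * 60 * 60


def eligible_for_tombstone_compaction(sstable_age_seconds,
                                      droppable_tombstones,
                                      total_columns,
                                      tombstone_compaction_interval=ONE_DAY_SECONDS,
                                      tombstone_threshold=0.2):
    """True if the SSTable is old enough and its tombstone ratio exceeds the threshold."""
    if total_columns == 0:
        return False
    if sstable_age_seconds < tombstone_compaction_interval:
        return False
    return droppable_tombstones / total_columns > tombstone_threshold


if __name__ == "__main__":
    two_days = 2 * ONE_DAY_SECONDS
    print(eligible_for_tombstone_compaction(two_days, 30, 100))
    print(eligible_for_tombstone_compaction(3600, 30, 100))
```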