Apache Cassandra 1.0 Documentation

Keyspace and Column Family Storage Configuration

This document corresponds to an earlier product version. Make sure you are using the documentation that corresponds to your version.


Many aspects of storage configuration are set on a per-keyspace or per-column family basis. These attributes can be manipulated programmatically, but in most cases the practical method for defining keyspace and column family attributes is to use the Cassandra CLI or CQL interfaces.

Prior to release 0.7.3, keyspace and column family attributes could be specified in cassandra.yaml, but that is no longer true in 0.7.4 and later. These attributes are now stored in the system keyspace within Cassandra.

Note

The attribute names documented in this section are the names as they are stored in the system keyspace within Cassandra. Most of these attributes can be set in the various client applications, such as Cassandra CLI or CQL. There may be slight differences in how these attributes are named depending on how they are implemented in the client.

Keyspace Attributes

A keyspace must have a user-defined name and a replica placement strategy. It also has replication strategy options, which is a container attribute for replication factor or the number of replicas per data center.

Option Default Value
ks_name n/a (A user-defined value is required)
placement_strategy org.apache.cassandra.locator.SimpleStrategy
strategy_options n/a (container attribute)

name

Required. The name for the keyspace.

placement_strategy

Required. Determines how replicas for a keyspace will be distributed among nodes in the ring.

Allowed values are:

  • org.apache.cassandra.locator.SimpleStrategy
  • org.apache.cassandra.locator.NetworkTopologyStrategy
  • org.apache.cassandra.locator.OldNetworkTopologyStrategy (deprecated)

These options are described in detail in the replication section.

Note

NetworkTopologyStrategy and OldNetworkTopologyStrategy require a properly configured snitch to be able to determine rack and data center locations of a node (see endpoint_snitch).

strategy_options

Specifies configuration options for the chosen replication strategy.

For SimpleStrategy, it specifies replication_factor in the format of replication_factor:number_of_replicas.

For NetworkTopologyStrategy, it specifies the number of replicas per data center in a comma-separated list of datacenter_name:number_of_replicas. What you specify for datacenter_name depends on the snitch configured for the cluster: the data center name defined in the keyspace strategy_options must match the data center name as recognized by the snitch you are using. If you are not sure what the names are, the nodetool ring command prints out the data center names and rack locations of your nodes.
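The datacenter_name:number_of_replicas format described above can be illustrated with a small Python sketch. This is not part of Cassandra or any client library; the function names parse_strategy_options and total_replicas are hypothetical helpers that only show how the list breaks down and how many replicas of each row the keyspace ends up with in total.

```python
def parse_strategy_options(text):
    """Parse 'name:count,name:count' into a dict of name -> replica count."""
    options = {}
    for item in text.split(","):
        # Split on the last colon so the name keeps any earlier characters.
        name, _, count = item.strip().rpartition(":")
        if not name or not count.isdigit():
            raise ValueError(f"expected name:number_of_replicas, got {item!r}")
        options[name] = int(count)
    return options


def total_replicas(options):
    """Total number of copies of each row across all listed data centers."""
    return sum(options.values())


# NetworkTopologyStrategy: one entry per data center.
nts_options = parse_strategy_options("us-east:6,us-west:3")

# SimpleStrategy: a single replication_factor entry.
simple_options = parse_strategy_options("replication_factor:3")
```

With the values used in this section, us-east:6 and us-west:3 add up to nine replicas of every row across the cluster.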

See Choosing Keyspace Replication Options for guidance on how to best configure replication strategy and strategy options for your cluster.

Setting and updating strategy options with the Cassandra CLI requires a slightly different command syntax than other attributes; note the curly braces in this example:

[default@unknown] CREATE KEYSPACE test
WITH placement_strategy = 'NetworkTopologyStrategy'
AND strategy_options={us-east:6,us-west:3};

Column Family Attributes

The following attributes can be declared per column family.

Option Default Value
column_metadata n/a (container attribute)
column_type Standard
comment n/a
compaction_strategy SizeTieredCompactionStrategy
compaction_strategy_options n/a (container attribute)
comparator BytesType
compare_subcolumns_with BytesType
compression_options n/a (container attribute)
default_validation_class n/a
dc_local_read_repair_chance 0.0
gc_grace_seconds 864000 (10 days)
key_validation_class n/a
keys_cached 200000
max_compaction_threshold 32
min_compaction_threshold 4
memtable_flush_after_mins ignored in 1.0 and later releases
memtable_operations_in_millions ignored in 1.0 and later releases
memtable_throughput_in_mb ignored in 1.0 and later releases
cf_name n/a (A user-defined value is required.)
read_repair_chance 0.1 or 1 (See description below.)
replicate_on_write true
rows_cached 0 (disabled by default)

column_metadata

Column metadata defines attributes of a column. Values for name and validation_class are required, though the default_validation_class for the column family is used if no validation_class is specified. Note that index_type must be set to create a secondary index for a column. index_name is not valid unless index_type is also set.

Name Description
name Binds a validation_class and (optionally) an index to a column.
validation_class Type used to check the column value.
index_name Name for the secondary index.
index_type Type of index. Currently the only supported value is KEYS.
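The rules above (name is required, validation_class falls back to the column family default, index_name is only valid together with index_type, and KEYS is the only index type) can be sketched as a small Python check. The function resolve_column_metadata and the dictionary layout are hypothetical and exist only for illustration; they are not a Cassandra API.

```python
def resolve_column_metadata(column, default_validation_class):
    """Return a normalized copy of one column_metadata entry.

    Raises ValueError if the entry breaks one of the documented rules.
    """
    if not column.get("name"):
        raise ValueError("name is required")

    resolved = dict(column)
    # If no validation_class is given, the column family's
    # default_validation_class is used.
    resolved.setdefault("validation_class", default_validation_class)

    index_type = resolved.get("index_type")
    if index_type is not None and index_type != "KEYS":
        raise ValueError("the only supported index_type is KEYS")
    if "index_name" in resolved and index_type is None:
        raise ValueError("index_name is not valid unless index_type is set")
    return resolved


indexed = resolve_column_metadata(
    {"name": "full_name", "validation_class": "UTF8Type", "index_type": "KEYS"},
    default_validation_class="BytesType",
)
defaulted = resolve_column_metadata({"name": "notes"}, "BytesType")
```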

Setting and updating column metadata with the Cassandra CLI requires a slightly different command syntax than other attributes; note the brackets and curly braces in this example:

[default@demo] UPDATE COLUMN FAMILY users WITH comparator=UTF8Type
AND column_metadata=[{column_name: full_name, validation_class: UTF8Type, index_type: KEYS}];

column_type

Defaults to Standard for regular column families. For super column families, use Super.

comment

A human readable comment describing the column family.

compaction_strategy

Sets the compaction strategy for the column family. The available strategies are:

  • SizeTieredCompactionStrategy - This is the default compaction strategy and the only compaction strategy available in pre-1.0 releases. This strategy triggers a minor compaction whenever there are a number of similar-sized SSTables on disk (as configured by min_compaction_threshold). This strategy causes bursts in I/O activity while a compaction is in progress, followed by longer and longer lulls in compaction activity as SSTable files grow larger in size. These I/O bursts can negatively affect read-heavy workloads, but typically do not impact write performance. Watching disk capacity is also important when using this strategy, as compactions can temporarily double the size of SSTables for a column family while a compaction is in progress.
  • LeveledCompactionStrategy - The leveled compaction strategy creates SSTables of a fixed, relatively small size (5 MB by default) that are grouped into levels. Within each level, SSTables are guaranteed to be non-overlapping. Each level (L0, L1, L2 and so on) is 10 times as large as the previous. Disk I/O is more uniform and predictable as SSTables are continuously being compacted into progressively larger levels. At each level, row keys are merged into non-overlapping SSTables. This can improve performance for reads, because Cassandra can determine which SSTables in each level to check for the existence of row key data. This compaction strategy is modeled after Google's LevelDB implementation.
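To make the "10 times as large as the previous" rule concrete, the following Python sketch computes an approximate capacity per level. The text above does not state how large L1 is, so the l1_sstables parameter (10 SSTables in L1) is an assumption made purely for illustration, as is the function name level_capacity_mb.

```python
def level_capacity_mb(level, sstable_size_in_mb=5, l1_sstables=10, fanout=10):
    """Approximate capacity of a level (L1 and above) in megabytes.

    l1_sstables is an assumed starting point, not a documented value;
    fanout reflects the documented factor of 10 between levels.
    """
    if level < 1:
        raise ValueError("L0 holds newly flushed SSTables and has no fixed size")
    return l1_sstables * sstable_size_in_mb * fanout ** (level - 1)


# Capacities for L1..L4 with the default 5 MB SSTable size.
capacities = [level_capacity_mb(level) for level in range(1, 5)]
```

Under these assumptions, L1 through L4 hold roughly 50 MB, 500 MB, 5 GB, and 50 GB, which is why only a handful of levels are needed even for large column families.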

compaction_strategy_options

Sets options related to the chosen compaction_strategy. Currently only LeveledCompactionStrategy has options.

Option Default Value Description
sstable_size_in_mb 5 Sets the file size for leveled SSTables. A compaction is triggered when unleveled SSTables (newly flushed SSTable files in Level 0) exceed 4 * sstable_size_in_mb.
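As a rough illustration of the trigger described for sstable_size_in_mb, the sketch below reads "exceed 4 * sstable_size_in_mb" as a limit on the combined size of the SSTables in Level 0. That reading, and the helper names, are assumptions; the actual scheduling logic inside Cassandra may differ.

```python
def l0_trigger_mb(sstable_size_in_mb=5):
    """Size threshold for Level 0, as 4 * sstable_size_in_mb."""
    return 4 * sstable_size_in_mb


def l0_needs_compaction(l0_sstable_sizes_mb, sstable_size_in_mb=5):
    """True when the combined Level 0 size exceeds the threshold (assumed reading)."""
    return sum(l0_sstable_sizes_mb) > l0_trigger_mb(sstable_size_in_mb)
```

With the default of 5, the threshold is 20 MB; raising sstable_size_in_mb to 10 raises it to 40 MB.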

Setting and updating compaction strategy options with the Cassandra CLI requires a slightly different command syntax than other attributes; note the curly braces in this example:

[default@demo] UPDATE COLUMN FAMILY users WITH compaction_strategy=LeveledCompactionStrategy
AND compaction_strategy_options={sstable_size_in_mb: 10};

comparator

Defines the data types used to validate and sort column names. There are several built-in column comparators available. Note that the comparator cannot be changed after a column family is created.

compare_subcolumns_with

Required when column_type is "Super". Same as comparator but for sub-columns of a SuperColumn.

For attributes of columns, see column_metadata.

compression_options

This is a container attribute for setting compression options on a column family. It contains the following options:

Option Description
sstable_compression Specifies the compression algorithm to use when compressing SSTable files. Cassandra supports two built-in compression classes: SnappyCompressor (Snappy compression library) and DeflateCompressor (Java zip implementation). Snappy compression offers faster compression/decompression while the Java zip compression offers better compression ratios. Choosing the right one depends on your requirements for space savings over read performance. For read-heavy workloads, Snappy compression is recommended. Developers can also implement custom compression classes using the org.apache.cassandra.io.compress.ICompressor interface.
chunk_length_kb Sets the compression chunk size in kilobytes. The default value (64) is a good middle ground for compressing column families with either wide rows or skinny rows. With wide rows, it allows reading a 64 KB slice of column data without decompressing the entire row. For skinny rows, although you may still end up decompressing more data than requested, it is a good trade-off between maximizing the compression ratio and minimizing the overhead of decompressing more data than is needed to access a requested row. The compression chunk size can be adjusted to account for read/write access patterns (how much data is typically requested at once) and the average size of rows in the column family.
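The trade-off behind chunk_length_kb can be worked through with a short Python sketch. It assumes that uncompressed data is divided into fixed-size chunks aligned to chunk boundaries and that a read must decompress every chunk it touches; the function names are hypothetical.

```python
def chunks_touched(offset_bytes, length_bytes, chunk_length_kb=64):
    """Number of fixed-size chunks overlapping [offset, offset + length)."""
    if length_bytes <= 0:
        return 0
    chunk = chunk_length_kb * 1024
    first = offset_bytes // chunk
    last = (offset_bytes + length_bytes - 1) // chunk
    return last - first + 1


def bytes_decompressed(offset_bytes, length_bytes, chunk_length_kb=64):
    """Uncompressed bytes that must be produced to serve the read."""
    return chunks_touched(offset_bytes, length_bytes, chunk_length_kb) * chunk_length_kb * 1024


# A 1 KB skinny row inside one chunk still costs a full 64 KB chunk.
skinny_row_cost = bytes_decompressed(offset_bytes=10_000, length_bytes=1024)

# The same read with a 16 KB chunk size decompresses far less.
smaller_chunk_cost = bytes_decompressed(10_000, 1024, chunk_length_kb=16)
```

Smaller chunks reduce the overhead per read but give the compressor less data to work with at a time, which is the compression-ratio side of the trade-off.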

Setting and updating compression options with the Cassandra CLI requires a slightly different command syntax than other attributes; note the curly braces in this example:

[default@demo] UPDATE COLUMN FAMILY users WITH compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};

dc_local_read_repair_chance

Specifies the probability with which read repairs should be invoked over all replicas in the current data center. Contrast read_repair_chance.

default_validation_class

Defines the data type used to validate column values. There are several built-in column validators available.

gc_grace_seconds

Specifies the time to wait before garbage collecting tombstones (deletion markers). Defaults to 864000, or 10 days, which allows a great deal of time for consistency to be achieved prior to deletion. In many deployments this interval can be reduced, and in a single-node cluster it can be safely set to zero.

Note

This property is called gc_grace in the cassandra-cli client.
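Because gc_grace_seconds is expressed in seconds, it can help to derive the value from a more readable interval. The following Python sketch does that conversion and models the grace period in a simplified way; the helper names are hypothetical, and the eligibility check is only an approximation of when a tombstone may be collected.

```python
from datetime import timedelta


def gc_grace_from(days=0, hours=0):
    """Convert a human-readable interval to a gc_grace_seconds value."""
    return int(timedelta(days=days, hours=hours).total_seconds())


def tombstone_is_collectable(deleted_at, now, gc_grace_seconds):
    """Simplified model: the tombstone is eligible once the grace period has elapsed.

    deleted_at and now are epoch timestamps in seconds.
    """
    return now - deleted_at >= gc_grace_seconds


default_gc_grace = gc_grace_from(days=10)
```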

keys_cached

Defines how many key locations will be kept in memory per SSTable (see rows_cached for details on caching actual row values). This can be a fixed number of keys or a fraction (for example 0.5 means 50 percent).

DataStax recommends a fixed-size cache over a relative-size cache. Only use relative cache sizing when you are confident that the data in the column family will not continue to grow over time. Otherwise, your cache will grow as your data set does, potentially causing unplanned memory pressure.
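The difference between fixed and relative cache sizing can be seen in a small Python sketch. It assumes that a setting below 1 is treated as a fraction of the keys present and anything else as an absolute count; that threshold and the function name effective_cache_entries are assumptions for illustration only.

```python
def effective_cache_entries(setting, total_keys):
    """Number of entries the cache may hold for a given setting.

    Assumes settings below 1 are fractions of total_keys and larger
    values are absolute counts (capped at the number of keys present).
    """
    if setting < 0:
        raise ValueError("cache setting must not be negative")
    if setting < 1:
        return int(setting * total_keys)
    return int(min(setting, total_keys))


# A fixed cache stays the same size as the data set quadruples...
fixed_before = effective_cache_entries(200000, 1_000_000)
fixed_after = effective_cache_entries(200000, 4_000_000)

# ...while a relative cache grows with it.
relative_before = effective_cache_entries(0.5, 1_000_000)
relative_after = effective_cache_entries(0.5, 4_000_000)
```

In this example the fixed cache holds 200,000 entries in both cases, while the relative cache grows from 500,000 to 2,000,000 entries as the data set grows.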

key_validation_class

Defines the data type used to validate row key values. There are several built-in key validators available; however, CounterColumnType (distributed counters) cannot be used as a row key validator.

name

Required. The user-defined name of the column family.

read_repair_chance

Specifies the probability with which read repairs should be invoked on non-quorum reads. The value must be between 0 and 1. For column families created in versions of Cassandra before 1.0, it defaults to 1.0. For column families created in versions of Cassandra 1.0 and higher, it defaults to 0.1. However, for Cassandra 1.0, the default is 1.0 if you use CLI or any Thrift client, such as Hector or pycassa, and is 0.1 if you use CQL.

A value of 0.1 means that a read repair is performed 10% of the time and a value of 1 means that a read repair is performed 100% of the time. Lower values improve read throughput, but increase the chances of stale values when not using a strong consistency level.
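The probabilistic meaning of read_repair_chance can be illustrated with a minimal Python simulation. This is a sketch of the general idea (draw a random number per read and compare it with the configured chance), not Cassandra's implementation; should_read_repair is a hypothetical name.

```python
import random


def should_read_repair(read_repair_chance, rng=random):
    """Decide for one read whether to trigger a read repair."""
    if not 0.0 <= read_repair_chance <= 1.0:
        raise ValueError("read_repair_chance must be between 0 and 1")
    # rng.random() returns a float in [0.0, 1.0), so a chance of 1
    # always triggers a repair and a chance of 0 never does.
    return rng.random() < read_repair_chance


# Simulate many reads with a seeded generator for repeatability.
rng = random.Random(42)
reads = 100_000
repairs = sum(should_read_repair(0.1, rng) for _ in range(reads))
observed_fraction = repairs / reads
```

With a chance of 0.1, the observed fraction of reads that trigger a repair comes out close to 10 percent over a large number of reads.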

replicate_on_write

Applies only to counter column families. When set to true, replicates writes to all affected replicas regardless of the consistency level specified by the client for a write request. For counter column families, this should always be set to true.

max_compaction_threshold

Sets the maximum number of SSTables to allow in a minor compaction when compaction_strategy=SizeTieredCompactionStrategy. Obsolete as of Cassandra 0.8 with the addition of compaction throttling (see cassandra.yaml parameter compaction_throughput_mb_per_sec).

Setting this to 0 disables minor compactions. Defaults to 32.

min_compaction_threshold

Sets the minimum number of SSTables to trigger a minor compaction when compaction_strategy=SizeTieredCompactionStrategy. Raising this value causes minor compactions to start less frequently and be more I/O-intensive. Setting this to 0 disables minor compactions. Defaults to 4.

memtable_flush_after_mins

Deprecated as of Cassandra 1.0. Can still be declared (for backwards compatibility) but settings will be ignored. Use the cassandra.yaml parameter commitlog_total_space_in_mb instead.

memtable_operations_in_millions

Deprecated as of Cassandra 1.0. Can still be declared (for backwards compatibility) but settings will be ignored. Use the cassandra.yaml parameter commitlog_total_space_in_mb instead.

memtable_throughput_in_mb

Deprecated as of Cassandra 1.0. Can still be declared (for backwards compatibility) but settings will be ignored. Use the cassandra.yaml parameter commitlog_total_space_in_mb instead.

rows_cached

Specifies how many rows to cache in memory. This can be a fixed number of rows or a fraction (for example 0.5 means 50 percent).

Using a row cache means that the entire row is cached in memory. This can be detrimental to performance in cases where rows are large, or where rows are frequently modified or removed.