Apache Cassandra 0.8 Documentation

Keyspace and Column Family Storage Configuration

Many aspects of storage configuration are set on a per-keyspace or per-column family basis. These attributes can be manipulated programmatically, but in most cases the practical method for defining keyspace and column family attributes is to use the Cassandra CLI or CQL interfaces.

Prior to release 0.7.3, keyspace and column family attributes could be specified in cassandra.yaml, but that is no longer true in 0.7.4 and later. These attributes are now stored in the system keyspace within Cassandra.

Note

The attribute names documented in this section are the names as they are stored in the system keyspace within Cassandra. Most of these attributes can be set in the various client applications, such as Cassandra CLI or CQL. There may be slight differences in how these attributes are named depending on how they are implemented in the client.

Keyspace Attributes

A keyspace must have a user-defined name and a replica placement strategy. It also has replication strategy options, a container attribute that holds either the replication factor (for SimpleStrategy) or the number of replicas per data center (for NetworkTopologyStrategy).

Option               Default Value
ks_name              n/a (A user-defined value is required)
placement_strategy   org.apache.cassandra.locator.SimpleStrategy
strategy_options     n/a (container attribute)
column_families      n/a (container attribute)

name

Required. The name for the keyspace.

placement_strategy

Required. Determines how replicas for a keyspace will be distributed among nodes in the ring.

Allowed values are:

  • org.apache.cassandra.locator.SimpleStrategy
  • org.apache.cassandra.locator.NetworkTopologyStrategy
  • org.apache.cassandra.locator.OldNetworkTopologyStrategy (deprecated)

These options are described in detail in the replication section.

Note

NetworkTopologyStrategy and OldNetworkTopologyStrategy require a properly configured snitch to be able to determine rack and data center locations of a node (see endpoint_snitch).

strategy_options

Specifies configuration options for the chosen replication strategy.

For SimpleStrategy, it specifies replication_factor in the format of replication_factor:number_of_replicas.

For NetworkTopologyStrategy, it specifies the number of replicas per data center in a comma-separated list of datacenter_name:number_of_replicas. The value you specify for datacenter_name depends on the snitch configured for the cluster: the data center names in the keyspace strategy_options must match the names recognized by that snitch. If you are not sure what they are, the nodetool ring command prints the data center names and rack locations of your nodes.

See Choosing Keyspace Replication Options for guidance on how to best configure replication strategy and strategy options for your cluster.

Setting and updating strategy options with the Cassandra CLI requires a slightly different command syntax than other attributes; note the brackets and curly braces in this example:

[default@unknown] CREATE KEYSPACE test
WITH placement_strategy = 'NetworkTopologyStrategy'
AND strategy_options=[{us-east:6,us-west:3}];
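
As a rough illustration of the datacenter_name:number_of_replicas format, the following Python sketch parses such a list and totals the replicas across data centers. The function name parse_strategy_options is hypothetical and is not part of Cassandra or any client library; the input string simply reuses the data center names from the CLI example.

```python
def parse_strategy_options(options):
    """Parse 'dc1:3,dc2:2' into {'dc1': 3, 'dc2': 2}.

    Hypothetical helper for illustration; not a Cassandra API.
    """
    replicas = {}
    for entry in options.split(","):
        # Split on the last colon so the name keeps any earlier colons.
        name, _, count = entry.strip().rpartition(":")
        if not name or not count.isdigit():
            raise ValueError("expected name:number_of_replicas, got %r" % entry)
        replicas[name] = int(count)
    return replicas


options = parse_strategy_options("us-east:6,us-west:3")
total_replicas = sum(options.values())
print(options, total_replicas)
```

With these values, each row is stored nine times in total: six copies in us-east and three in us-west.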

column_families

A keyspace does not strictly require column families in order to exist as a valid keyspace. In this sense, column families are optional. In a practical sense, most useful keyspaces will contain column families, whose elements are detailed below.

Column Family Attributes

The following attributes can be set for a column family, which is always defined within a keyspace.

Option                             Default Value
cf_name                            n/a (A user-defined value is required)
comparator                         BytesType
column_type                        Standard
compare_subcolumns_with            BytesType
keys_cached                        200000
rows_cached                        0 (disabled by default)
row_cache_provider                 ConcurrentLinkedHashCacheProvider
comment                            n/a
read_repair_chance                 1.0 (always on)
gc_grace_seconds                   864000 (10 days)
default_validation_class           n/a
key_validation_class               n/a
min_compaction_threshold           4
max_compaction_threshold           32
row_cache_save_period_in_seconds   n/a
key_cache_save_period_in_seconds   n/a
memtable_flush_after_mins          1440 (1 day)
memtable_throughput_in_mb          1/16 of the Java heap size
memtable_operations_in_millions    throughput / 64 * 0.3
column_metadata                    n/a (container attribute)

name

Required. The user-defined name of the column family.

comparator

Defines the data types used to validate and sort column names. There are several built-in column comparators available.

column_type

Defaults to Standard for regular column families. For super column families, use Super.

compare_subcolumns_with

Required when column_type is “Super”. Same as comparator but for sub-columns of a SuperColumn.

For attributes of columns, see column_metadata.

keys_cached

Defines how many key locations will be kept in memory per SSTable (see rows_cached for details on caching actual row values). This can be a fixed number of keys or a fraction (for example 0.5 means 50 percent).

DataStax recommends a fixed-size cache over a relative-size cache. Only use relative cache sizing when you are confident that the data in the column family will not continue to grow over time. Otherwise, your cache will grow as your data set does, potentially causing unplanned memory pressure.
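
The difference between fixed and relative sizing can be sketched in a few lines of Python. The helper effective_cache_size is hypothetical, and it assumes that values strictly between 0 and 1 are treated as fractions of the total key count while anything else is an absolute count; the exact rule Cassandra applies is not stated here.

```python
def effective_cache_size(setting, total_keys):
    """Resolve a keys_cached or rows_cached setting to an entry count.

    Assumption for this sketch: values strictly between 0 and 1 are
    fractions of total_keys; anything else is an absolute count.
    """
    if setting < 0:
        raise ValueError("cache size cannot be negative")
    if 0 < setting < 1:
        return int(setting * total_keys)
    return int(setting)


# A fixed cache stays put while a relative cache grows with the data.
for total_keys in (1000000, 10000000):
    fixed = effective_cache_size(200000, total_keys)
    relative = effective_cache_size(0.5, total_keys)
    print(total_keys, fixed, relative)
```

When the data set grows tenfold, the fixed cache still holds 200000 entries, while the 0.5 setting grows from 500000 to 5000000 entries.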

rows_cached

Specifies how many rows to cache in memory. This can be a fixed number of rows or a fraction (for example 0.5 means 50 percent).

Using a row cache means that the entire row is cached in memory. This can be detrimental to performance in cases where rows are large, or where rows are frequently modified or removed.

row_cache_provider

Specifies the row cache to use for the column family. Allowed values are:

  • ConcurrentLinkedHashCacheProvider (default) - Rows are cached using the JVM heap, providing the same row cache behavior as Cassandra versions prior to 0.8.
  • SerializingCacheProvider - Cached rows are serialized and stored in memory off of the JVM heap, which can reduce garbage collection (GC) pressure on the JVM and thereby improve system performance. Serialized rows are also 8-12 times smaller than unserialized rows. This is the recommended setting, as long as you have jna.jar in the CLASSPATH to enable native methods.

comment

A human readable comment describing the column family.

read_repair_chance

Specifies the probability with which read repairs should be invoked on non-quorum reads. Must be between 0 and 1. Defaults to 1.0 (always perform read repair). Lowering this value will improve throughput, but increase the number of operations that may see stale values if you are not using a strong consistency level.
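
One way to picture this setting is as a per-read coin flip. The sketch below is only a model of that idea, not Cassandra's implementation; should_read_repair is a hypothetical name, and the seed and read count are arbitrary.

```python
import random


def should_read_repair(read_repair_chance, rng=random):
    """Return True if this read should trigger a read repair.

    Hypothetical helper; models the setting as a per-read probability.
    """
    if not 0.0 <= read_repair_chance <= 1.0:
        raise ValueError("read_repair_chance must be between 0 and 1")
    # random() returns a value in [0.0, 1.0), so 1.0 always repairs
    # and 0.0 never does.
    return rng.random() < read_repair_chance


rng = random.Random(42)  # fixed seed so the run is repeatable
reads = 10000
repaired = sum(should_read_repair(0.1, rng) for _ in range(reads))
print("repaired %d of %d reads" % (repaired, reads))
```

At 0.1, roughly one read in ten pays the cost of a repair, which is where the throughput gain and the higher risk of stale reads both come from.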

gc_grace_seconds

Specifies the time to wait before garbage collecting tombstones (deletion markers). Defaults to 864000, or 10 days, which allows a great deal of time for consistency to be achieved prior to deletion. In many deployments this interval can be reduced, and in a single-node cluster it can be safely set to zero.

Note

This property is called gc_grace in the cassandra-cli client.
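
The default of 864000 seconds can be checked with simple arithmetic, and the waiting period itself can be modeled as a timestamp comparison. The helper tombstone_collectable below is hypothetical, and whether a tombstone becomes collectable exactly at the boundary is an assumption of this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tombstone_collectable(deleted_at, now, gc_grace_seconds=864000):
    """Return True once a tombstone is older than gc_grace_seconds.

    Hypothetical helper; timestamps are plain seconds for simplicity,
    and the strict comparison at the boundary is an assumption.
    """
    return now - deleted_at > gc_grace_seconds


default_days = 864000 / SECONDS_PER_DAY
print(default_days)
```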

default_validation_class

Defines the data type used to validate column values. There are several built-in column validators available.

key_validation_class

Defines the data type used to validate row key values. There are several built-in key validators available; however, CounterColumnType (distributed counters) cannot be used as a row key validator.

min_compaction_threshold

Sets the minimum number of SSTables to trigger a minor compaction. Raising this value causes minor compactions to start less frequently and be more I/O-intensive. Setting this to 0 disables minor compactions. Defaults to 4.

max_compaction_threshold

Sets the maximum number of SSTables to allow in a minor compaction. Obsolete now that the compaction_throughput setting has been added.

Setting this to 0 disables minor compactions. Defaults to 32.
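
How the two thresholds interact can be sketched as follows. This is a deliberately simplified model: minor_compaction_batch is a hypothetical function, and it ignores how Cassandra groups SSTables by size before deciding what to compact.

```python
def minor_compaction_batch(sstable_count, min_threshold=4, max_threshold=32):
    """Return how many SSTables one minor compaction would include.

    Simplified model: sstable_count is the number of similarly sized
    SSTables waiting; 0 means no compaction is triggered.
    """
    if min_threshold == 0 or max_threshold == 0:
        return 0  # minor compactions disabled
    if sstable_count < min_threshold:
        return 0  # not enough SSTables yet
    return min(sstable_count, max_threshold)


for count in (3, 4, 40):
    print(count, minor_compaction_batch(count))
```

With the defaults, three SSTables trigger nothing, four are compacted together, and a backlog of forty is compacted at most 32 at a time.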

row_cache_save_period_in_seconds

Sets the number of seconds between saves of the row cache. If a saved row cache exists at startup, it is loaded.

key_cache_save_period_in_seconds

Sets the number of seconds between saves of the key cache. If a saved key cache exists at startup, it is loaded.

Note

This property is called key_cache_save_period in the cassandra-cli client.

memtable_throughput_in_mb

Flush a memtable after this much data has been inserted or updated. Actual heap usage will be greater than this due to overhead from column indexing. This setting must be tuned carefully, as there is (at least) one memtable per column family. See Tuning Java Heap Size for more information on tuning Cassandra memory usage.

Note

This property is called memtable_throughput in the cassandra-cli client.

memtable_operations_in_millions

Like memtable_throughput_in_mb, this threshold applies per memtable, but it defines the maximum number of updates allowed to the column family before the memtable is flushed. Tune it in conjunction with memtable_throughput_in_mb, because whichever threshold is reached first triggers the flush. To make tuning easier, Cassandra logs the memtable thresholds at each flush.
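
As a worked example of the default formulas listed in the attribute table (1/16 of the Java heap, and throughput / 64 * 0.3), the sketch below computes both defaults for a given heap size. The function name and the 8192 MB heap are illustrative assumptions, and any rounding Cassandra applies is ignored.

```python
def default_memtable_thresholds(heap_mb):
    """Return (throughput_in_mb, operations_in_millions) defaults.

    Follows the formulas in the defaults table; heap_mb is the Java
    heap size in megabytes. Rounding is ignored in this sketch.
    """
    throughput_in_mb = heap_mb / 16
    operations_in_millions = throughput_in_mb / 64 * 0.3
    return throughput_in_mb, operations_in_millions


throughput, operations = default_memtable_thresholds(8192)
print(throughput, operations)
```

For an 8 GB heap this gives a 512 MB throughput threshold and an operations threshold of about 2.4 million updates, per column family.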

memtable_flush_after_mins

Flush a memtable after this many minutes, even if it is not full yet. Primarily used for making sure infrequently-updated column families do not hold commitlog segments open indefinitely. It should not be the primary threshold hit by a high-traffic column family; that would either cause memory pressure (if the setting is too high) or adversely affect I/O performance (if it is too low).

Note

This property is called memtable_flush_after in the cassandra-cli client.
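
The three memtable thresholds work together: whichever is reached first causes the flush. The sketch below models that decision for a single memtable. The function flush_reason and its sample inputs are hypothetical, and the order of the checks is arbitrary.

```python
def flush_reason(data_mb, operations_millions, age_mins,
                 throughput_in_mb, operations_in_millions, flush_after_mins):
    """Return the name of the first threshold reached, or None.

    Hypothetical helper; the check order here is arbitrary and only
    matters when several thresholds are reached at once.
    """
    if data_mb >= throughput_in_mb:
        return "memtable_throughput_in_mb"
    if operations_millions >= operations_in_millions:
        return "memtable_operations_in_millions"
    if age_mins >= flush_after_mins:
        return "memtable_flush_after_mins"
    return None


# A quiet column family: little data, but 1440 minutes (one day) have passed.
quiet = flush_reason(2, 0.01, 1440, 512, 2.4, 1440)
# A busy column family: the size threshold is reached within minutes.
busy = flush_reason(512, 1.0, 5, 512, 2.4, 1440)
print(quiet, busy)
```

In a well-tuned high-traffic column family, the time-based threshold should rarely be the one that fires.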

column_metadata

Column metadata defines attributes of a column. Values for name and validation_class are required, though the default_validation_class for the column family is used if no validation_class is specified. Note that index_type must be set to create a secondary index for a column. index_name is not valid unless index_type is also set.

Name               Description
name               Binds a validation_class and (optionally) an index to a column.
validation_class   Type used to check the column value.
index_name         Name for the secondary index.
index_type         Type of index. Currently the only supported value is KEYS.

Setting and updating column metadata with the Cassandra CLI requires a slightly different command syntax than other attributes; note the brackets and curly braces in this example:

[default@demo] UPDATE COLUMN FAMILY users WITH comparator=UTF8Type
AND column_metadata=[{column_name: full_name, validation_class: UTF8Type, index_type: KEYS}];
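
The rules for column metadata described in this section can be summarized as a small validation routine. This is a sketch under stated assumptions, not Cassandra's own validation: check_column_metadata is a hypothetical helper, and it uses the column_name key from the CLI example rather than the stored attribute name.

```python
def check_column_metadata(column, default_validation_class=None):
    """Return a normalized copy of one column_metadata entry.

    Hypothetical helper mirroring the rules described above; it is
    not part of Cassandra or any client library.
    """
    if "column_name" not in column:
        raise ValueError("column_name is required")
    result = dict(column)
    # Fall back to the column family default when no class is given.
    result.setdefault("validation_class", default_validation_class)
    if result["validation_class"] is None:
        raise ValueError("validation_class is required")
    if "index_name" in result and "index_type" not in result:
        raise ValueError("index_name requires index_type")
    if result.get("index_type", "KEYS") != "KEYS":
        raise ValueError("only KEYS indexes are supported")
    return result


full_name = check_column_metadata(
    {"column_name": "full_name", "validation_class": "UTF8Type",
     "index_type": "KEYS"})
print(full_name)
```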