Apache Cassandra 0.6 Documentation

Storage Configuration

This document corresponds to an earlier product version. Make sure you are reading the documentation that matches the version of Cassandra you are running.

The main configuration file for 0.6.x versions of Cassandra is storage-conf.xml, located in the conf directory of the distribution. The file itself contains enough documentation to get most users started, but additional details are listed below. Keyspace options and column family attributes are broken out into separate tables for readability.

Option                          Default Value
ClusterName                     Test Cluster
AutoBootstrap                   false
HintedHandoffEnabled            true
Authenticator                   org.apache.cassandra.auth.AllowAllAuthenticator
Partitioner                     org.apache.cassandra.dht.RandomPartitioner
InitialToken                    n/a
CommitLogDirectory              /var/lib/cassandra/commitlog
DataFileDirectory               /var/lib/cassandra/data
Seed                            127.0.0.1
RpcTimeoutInMillis              10000
PhiConvictThreshold             8
CommitLogRotationThresholdInMB  128
ListenAddress                   localhost
StoragePort                     7000
ThriftAddress                   localhost
ThriftPort                      9160
ThriftFramedTransport           false
DiskAccessMode                  auto
RowWarningThresholdInMB         512
SlicedBufferSizeInKB            64
FlushDataBufferSizeInMB         32
FlushIndexBufferSizeInMB        8
ColumnIndexSizeInKB             64
MemtableThroughputInMB          64
BinaryMemtableThroughputInMB    256
MemtableOperationsInMillions    0.3
MemtableFlushAfterMinutes       60
ConcurrentReads                 8
ConcurrentWrites                32
CommitLogSync                   periodic
CommitLogSyncPeriodInMS         10000
CommitLogSyncBatchWindowInMS    1
GCGraceSeconds                  864000

Keyspace Options

The following are options for configuring Keyspaces and are only valid within a Keyspace element.

Option                    Default Value
Name                      n/a (A user-defined value is required)
ReplicaPlacementStrategy  org.apache.cassandra.locator.RackUnawareStrategy
ReplicationFactor         1
EndPointSnitch            org.apache.cassandra.locator.EndPointSnitch

Column Family Options

The following options are attributes of the ColumnFamily element, which itself is only valid within a Keyspace element.

Option                 Default Value
Name                   n/a (A user-defined value is required)
CompareWith            BytesType
CompareSubcolumnsWith  BytesType
ColumnType             Standard
RowsCached             n/a (disabled by default)
KeysCached             200000
Comment                n/a
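
To make the nesting described above concrete, the following sketch builds a Keyspace fragment with Python's standard xml.etree.ElementTree module. It is an illustration only: the helper build_keyspace and the names Keyspace1, Standard1, and Super1 are hypothetical, and the sketch assumes that the keyspace Name is written as an XML attribute while the other keyspace options are child elements. Check the comments in your own storage-conf.xml before relying on this layout.

```python
import xml.etree.ElementTree as ET


def build_keyspace(name, column_families, replication_factor=1):
    """Build a <Keyspace> element (hypothetical helper, for illustration)."""
    # Assumption: the keyspace name is an attribute of the Keyspace element.
    keyspace = ET.Element("Keyspace", {"Name": name})

    # Column family options are attributes of each ColumnFamily element.
    for attributes in column_families:
        ET.SubElement(keyspace, "ColumnFamily", attributes)

    # Assumption: the remaining keyspace options are child elements.
    ET.SubElement(keyspace, "ReplicaPlacementStrategy").text = (
        "org.apache.cassandra.locator.RackUnawareStrategy"
    )
    ET.SubElement(keyspace, "ReplicationFactor").text = str(replication_factor)
    ET.SubElement(keyspace, "EndPointSnitch").text = (
        "org.apache.cassandra.locator.EndPointSnitch"
    )
    return keyspace


keyspace = build_keyspace(
    "Keyspace1",
    [
        {"Name": "Standard1", "CompareWith": "BytesType"},
        {
            "Name": "Super1",
            "ColumnType": "Super",
            "CompareWith": "BytesType",
            "CompareSubcolumnsWith": "BytesType",
        },
    ],
)
xml_text = ET.tostring(keyspace, encoding="unicode")
print(xml_text)
```

Attributes that are omitted from a ColumnFamily element, such as KeysCached or Comment, fall back to the defaults listed in the table above.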

ClusterName

A human readable name for the cluster. This value is returned from the describe_cluster_name API call.

AutoBootstrap

Enable this option on new nodes that are joining an existing cluster. It can be used in conjunction with InitialToken to specify which token range the new node takes over. If no InitialToken is set, a bootstrapping node acquires half the range of the most loaded node.

Note

There are several caveats with AutoBootstrap.

HintedHandoffEnabled

Set to false to disable HintedHandoff. Default is true.

Authenticator

The default value of org.apache.cassandra.auth.AllowAllAuthenticator effectively disables authentication. For simple authentication, users may choose to switch to org.apache.cassandra.auth.SimpleAuthenticator and provide access.properties and passwd.properties for configuration. Users can add custom IAuthenticator implementations by using the fully qualified class name (provided the resource is available on the class path).

Partitioner

Partitioners control how keys are distributed across the ring. At a high level, the default RandomPartitioner places data on the ring according to an MD5 hash of the key. Other partitioner types available are org.apache.cassandra.dht.OrderPreservingPartitioner and org.apache.cassandra.dht.CollatingOrderPreservingPartitioner. Users are free to implement their own IPartitioner for custom functionality.

See Tokens, Partitioners, and the Ring for more details on Tokens and Partitioners.
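
As a rough illustration of how an MD5-based partitioner maps keys to ring positions, the sketch below hashes a key and turns the digest into an integer token. The function name random_partitioner_token is hypothetical, and the conversion (reading the digest as a signed big-endian integer and taking its absolute value) is an assumption about the details rather than a statement of the exact implementation.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """Map a row key to a ring position via its MD5 digest (sketch)."""
    digest = hashlib.md5(key).digest()  # 16 bytes = 128 bits
    # Assumption: interpret the digest as a signed big-endian integer and
    # take the absolute value, so the token falls in the range 0 .. 2**127.
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


# Keys that look similar end up at unrelated positions on the ring.
for key in (b"user:1", b"user:2", b"user:3"):
    print(key, random_partitioner_token(key))
```

Because the hash scatters keys, load tends to spread evenly across nodes, at the cost of losing key order. The order-preserving partitioners make the opposite trade-off.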

InitialToken

Determines the placement of a node’s token in the ring. This setting is only checked on the first start up of a node. With RandomPartitioner configured, it can be used to force equal token spacing around the ring. With OrderPreservingPartitioner, users can specify the node’s token range if the key distribution is known.
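
For RandomPartitioner, equal token spacing can be computed with simple arithmetic. The sketch below assumes the token space runs from 0 to 2**127 and divides it into equal slices, one per node. The function name evenly_spaced_tokens is hypothetical.

```python
from typing import List


def evenly_spaced_tokens(node_count: int, ring_size: int = 2**127) -> List[int]:
    """Return one InitialToken per node, spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Assumption: RandomPartitioner tokens range from 0 to 2**127.
    return [i * ring_size // node_count for i in range(node_count)]


# Print the InitialToken value to configure on each node of a 4-node cluster.
for node, token in enumerate(evenly_spaced_tokens(4)):
    print(f"node {node}: {token}")
```

Each printed value would be set as the InitialToken on one node before that node starts for the first time, since the setting is only checked on first startup.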

CommitLogDirectory

Specifies the directory which will hold the commit log data.

Note

This should be on a separate partition from the data directory. See DataFileDirectory for more details.

DataFileDirectory

One or more DataFileDirectory elements can be defined as children of DataFileDirectories. These directories specify the location of SSTable files.

For performance reasons, the data file directories should be on separate partitions (ideally separate physical devices) from the CommitLogDirectory.

Seed

There must be one or more Seed elements for a working cluster. A Seed is a node used as a Gossip contact point for information regarding ring topology.

RpcTimeoutInMillis

The time, in milliseconds, that a node waits for a reply from other nodes before failing the command.

PhiConvictThreshold

The Phi Failure Accrual Detector value that must be reached before a node is marked as down.

Usually, the default value of 8 is fine. In environments with flaky networks (such as Amazon EC2, at times), this may need to be increased to 9 or 10 to help prevent a node being erroneously marked down.

CommitLogRotationThresholdInMB

The size to which the commit log will grow before creating a new commit log segment.

ListenAddress

The bind address for other nodes to communicate with this node.

This can be left blank if the hostname is set (using /etc/hostname, for example), DNS resolution is configured, and the address associated with the hostname is the correct one to use. In this case, the result of Java’s InetAddress.getLocalHost() is used. If your environment allows for this, it can help to make the configuration for all nodes the same, eliminating one potential source of configuration error. This also helps to ensure the correct interface is used.

Unlike ThriftAddress, ListenAddress may not be set to 0.0.0.0. See the FAQ entry on the topic for more details.

StoragePort

The port used for internal cluster communications.

ThriftAddress

The address to which the Thrift API calls will be bound. For users that want all interfaces to listen for Thrift, the value 0.0.0.0 may be used. Leaving this value blank has the same effect as for ListenAddress.

ThriftPort

The port to which the Thrift service will be bound.

ThriftFramedTransport

To enable framing for the server, set this to true. Note that either way, this value must match the client side configuration.

DiskAccessMode

Controls whether and how SSTable data and index files are mapped into memory via the mmap system call. The default mode, auto, enables mmap for both data and index files on 64-bit JVMs, as does explicitly setting the option to mmap. The next option, mmap_index_only, uses mmap for just the index files (this is also what auto selects on a 32-bit JVM). The remaining option, standard, disables mmap usage.
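
The mode selection described above can be summarized as a small lookup. This is a sketch of the documented behavior only; the function effective_disk_access_mode and the dictionary layout are hypothetical and are not part of Cassandra.

```python
def effective_disk_access_mode(configured: str, jvm_is_64bit: bool) -> dict:
    """Return which file types are memory-mapped for a DiskAccessMode value."""
    modes = {
        "mmap": {"data": True, "index": True},
        "mmap_index_only": {"data": False, "index": True},
        "standard": {"data": False, "index": False},
    }
    if configured == "auto":
        # auto resolves to mmap on a 64-bit JVM, mmap_index_only on 32-bit.
        configured = "mmap" if jvm_is_64bit else "mmap_index_only"
    if configured not in modes:
        raise ValueError(f"unknown DiskAccessMode: {configured}")
    return modes[configured]


print(effective_disk_access_mode("auto", jvm_is_64bit=True))
print(effective_disk_access_mode("auto", jvm_is_64bit=False))
```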

RowWarningThresholdInMB

Logs a warning when a row larger than this size is compacted.

SlicedBufferSizeInKB

The buffer size to use for reading contiguous columns. This should match the size of the columns typically retrieved using query operations involving a slice predicate.

FlushDataBufferSizeInMB

Denotes the size of the buffer used when flushing memtables to SSTables on disk. If you have few columns per key, you should increase this. This should be decreased if you have many columns for any given key.

FlushIndexBufferSizeInMB

Behaves similarly to FlushDataBufferSizeInMB except for index files.

ColumnIndexSizeInKB

Column indexes are added to a row after the data reaches this size. This usually happens if there are a large number of columns in a row or the column values themselves are large. If you consistently read only a few columns from each row, this should be kept small as it denotes how much of the row data must be deserialized to read the column.

MemtableThroughputInMB

Memtables are flushed after this much data has been inserted or updated. Actual heap usage will be greater than this value because of overhead from column indexing. This setting must be tuned carefully, as there is one memtable per column family.

BinaryMemtableThroughputInMB

The memory to be consumed for BinaryMemtables (used in bulk-loading).

MemtableOperationsInMillions

Like MemtableThroughputInMB, this setting applies per memtable, but it defines the total number of columns (in millions) that will be kept in memory regardless of data size. It should be tuned in conjunction with MemtableThroughputInMB, because whichever threshold is reached first triggers a memtable flush.

MemtableFlushAfterMinutes

Flush a memtable after this many minutes regardless of other memtable settings. This value should not be too large, because commit log segments cannot be deleted while they hold data for unflushed column families. Setting it too low could trigger too many flushes, which would greatly impact I/O performance.
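
The interaction between the three memtable thresholds can be sketched as a single check in which the first limit reached triggers the flush. The function flush_reason is hypothetical, the defaults mirror the table at the top of this page, and treating a megabyte as 1024 * 1024 bytes is an assumption.

```python
def flush_reason(bytes_written, column_count, age_minutes,
                 throughput_mb=64, operations_millions=0.3,
                 flush_after_minutes=60):
    """Return the name of the setting that would trigger a flush, or None."""
    # Assumption: one megabyte is 1024 * 1024 bytes.
    if bytes_written >= throughput_mb * 1024 * 1024:
        return "MemtableThroughputInMB"
    if column_count >= operations_millions * 1_000_000:
        return "MemtableOperationsInMillions"
    if age_minutes >= flush_after_minutes:
        return "MemtableFlushAfterMinutes"
    return None


# Many small columns: the operation count is reached long before 64 MB.
print(flush_reason(bytes_written=10 * 1024 * 1024,
                   column_count=400_000, age_minutes=5))
```

With small columns the operations limit tends to fire first, while with large values the throughput limit does, which is why the two are tuned together.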

ConcurrentReads

The number of reader threads available in the system. A general rule is to keep this twice the number of processor cores in the system. For many systems, increasing this from the default value of 8 to 16 will improve read performance.

ConcurrentWrites

Defines the number of writer threads available in the system. On systems with many cores (12 or more), increasing the default of 32 may yield performance improvements.

CommitLogSync

The method that Cassandra uses to acknowledge writes. The default, periodic, is used in conjunction with CommitLogSyncPeriodInMS to control how often the commit log is synced to disk. In periodic mode, writes are acknowledged immediately. A batch mode is also available, in which writes block until the data has been fsynced to disk. Batch mode is used in conjunction with CommitLogSyncBatchWindowInMS to control how often the syncs happen.

CommitLogSyncPeriodInMS

How often, in milliseconds, the commit log is synced to disk when CommitLogSync is set to periodic.

CommitLogSyncBatchWindowInMS

How often, in milliseconds, to fsync the data to disk (which is a blocking call) when CommitLogSync is set to batch.

GCGraceSeconds

The time, in seconds, to wait before garbage collecting deletion markers (known as tombstones). This should be a large enough value to allow deletions to propagate to all replicas regardless of hardware failures. See the section on compaction for more information on the effects of this setting.
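
To put the default in familiar units, 864000 seconds is 10 days. The sketch below converts the value and shows one way to reason about the setting, assuming a tombstone becomes eligible for removal once GCGraceSeconds have elapsed since the deletion. The function tombstone_is_collectable is hypothetical.

```python
from datetime import datetime, timedelta

GC_GRACE_SECONDS = 864000  # default value from the options table


def tombstone_is_collectable(deleted_at, now,
                             gc_grace_seconds=GC_GRACE_SECONDS):
    """Return True once the grace period since the deletion has passed."""
    # Assumption: eligibility is measured from the time of the deletion.
    return now - deleted_at >= timedelta(seconds=gc_grace_seconds)


grace_days = timedelta(seconds=GC_GRACE_SECONDS).days
print(grace_days)  # 10

deleted_at = datetime(2010, 6, 1)
print(tombstone_is_collectable(deleted_at, datetime(2010, 6, 5)))   # False
print(tombstone_is_collectable(deleted_at, datetime(2010, 6, 12)))  # True
```

A replica that is down for longer than the grace period can miss the deletion entirely, which is why the value should comfortably exceed the longest outage you expect to recover from.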

Keyspace Elements

ReplicaPlacementStrategy

Defines how replicas are placed on physical hardware.

The default org.apache.cassandra.locator.RackUnawareStrategy simply returns the nodes that lie next to each other on the ring according to the replication factor.
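
The "next to each other on the ring" behavior can be sketched as a walk around a sorted list of node tokens. This is a simplified model, not Cassandra's code: the function rack_unaware_replicas and the node names are hypothetical, and it assumes the first replica is the node with the first token greater than or equal to the key's token, wrapping around at the end of the ring.

```python
from bisect import bisect_left


def rack_unaware_replicas(ring, key_token, replication_factor):
    """Pick replicas by walking the ring from the key's position.

    ring maps each node's token to a node name.
    """
    tokens = sorted(ring)
    count = min(replication_factor, len(tokens))
    # Assumption: the first replica owns the first token >= key_token,
    # wrapping to the lowest token when the key is past the last node.
    start = bisect_left(tokens, key_token) % len(tokens)
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


ring = {0: "node-a", 100: "node-b", 200: "node-c", 300: "node-d"}
print(rack_unaware_replicas(ring, key_token=150, replication_factor=3))
# ['node-c', 'node-d', 'node-a']
```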

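RackAwareStrategy places one replica in a different data center while placing the others on different racks in the current data center. For this strategy to work correctly with the default EndPointSnitch, racks and data centers must be distinguishable by IP address: nodes whose addresses share the same third octet are treated as being in the same rack, and nodes that share the same second octet are treated as being in the same data center.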

See the section on Replication Strategies for more information.

ReplicationFactor

The number of copies of data to keep in the cluster. The default of one does not mean “make one additional copy”; it means that only one copy of the data exists. Thus, to have three-way redundancy on data, the ReplicationFactor should be three.

EndPointSnitch

The EndPointSnitch selection gives Cassandra an idea of your network topology. This option can be either EndPointSnitch or DynamicEndpointSnitch.

ColumnFamily Attributes

Name

Every ColumnFamily must have a name. This is the only required attribute.

ColumnType

Defaults to “standard” for regular columns. For super columns, use “super”.

CompareWith

This attribute defines the sort algorithm that will be used to compare columns. Users may customize this behavior by extending org.apache.cassandra.db.marshal.AbstractType. The different values available for CompareWith are detailed below:

Type             Description
BytesType        Simple non-validating byte comparison (Default)
AsciiType        Similar to BytesType, but validates that input is US-ASCII
UTF8Type         UTF-8 encoded string comparison
LongType         Compares values as 64 bit longs
LexicalUUIDType  128 bit UUID compared by byte value
TimeUUIDType     Timestamp compared 128 bit version 1 UUID
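
The choice of comparator changes the order in which columns are stored and returned. The sketch below sorts the same three column names two ways to show the difference between a byte comparison and a 64-bit long comparison. It assumes column names are encoded as 8-byte big-endian signed integers and models the byte comparison as an unsigned lexicographic one. The function names are hypothetical.

```python
import struct


def bytes_type_key(name: bytes) -> bytes:
    """Sort key modelling BytesType: compare the raw bytes."""
    return name


def long_type_key(name: bytes) -> int:
    """Sort key modelling LongType: decode an 8-byte big-endian long."""
    return struct.unpack(">q", name)[0]


# Column names 1, -1 and 256 encoded as 8-byte big-endian longs.
names = [struct.pack(">q", n) for n in (1, -1, 256)]

by_bytes = [long_type_key(n) for n in sorted(names, key=bytes_type_key)]
by_long = [long_type_key(n) for n in sorted(names, key=long_type_key)]

print(by_bytes)  # [1, 256, -1]
print(by_long)   # [-1, 1, 256]
```

Under the byte comparison, -1 sorts last because its encoding is all 0xff bytes, so numeric column names generally need LongType to come back in numeric order.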

CompareSubcolumnsWith

Same as CompareWith but for sub-columns of a SuperColumn.

KeysCached

Defines how many key locations will be kept in memory per SSTable (see RowsCached for details on caching actual row values). This can be a fixed number, a percentage, or a fraction. To specify a percentage or fraction, use “50%” or “0.5” respectively.

RowsCached

Specifies how many rows to cache in memory. Using RowsCached means that the whole row is cached in memory. This can actually be detrimental to performance in cases where rows are large or are frequently modified or removed. The same syntax rules for defining KeysCached apply here.

Comment

A human readable comment for a column family.