The main configuration file for 0.6.x versions of Cassandra is storage-conf.xml, located in the conf directory of the distribution. The file itself contains enough documentation to get most users started, but additional details are listed below. Keyspace options and column family attributes are broken out into separate tables for readability.
The following are options for configuring Keyspaces and are only valid within a Keyspace element.
|Name||n/a (A user-defined value is required)|
The following options are attributes of the ColumnFamily element, which itself is only valid within a Keyspace element.
|Name||n/a (A user-defined value is required)|
|RowsCached||n/a (disabled by default)|
A human readable name for the cluster. This value is returned from the describe_cluster_name API call.
Enable this option for new nodes which will join the cluster. Can be used in conjunction with InitialToken to specify which token range to take over. With no InitialToken, AutoBootstrap will acquire half the range of the most loaded node.
There are several caveats with AutoBootstrap.
The default value of org.apache.cassandra.auth.AllowAllAuthenticator effectively disables authentication. For simple authentication, users may choose to switch to org.apache.cassandra.auth.SimpleAuthenticator and provide access.properties and passwd.properties for configuration. Users can add custom IAuthenticator implementations by using the fully qualified class name (provided the resource is available on the class path).
Partitioners control how keys are distributed across the ring. At a high level, the default RandomPartitioner places data on the ring according to an MD5 hash of the key. Other partitioner types available are org.apache.cassandra.dht.OrderPreservingPartitioner and org.apache.cassandra.dht.CollatingOrderPreservingPartitioner. Users are free to implement their own IPartitioner for custom functionality.
See Tokens, Partitioners, and the Ring for more details on Tokens and Partitioners.
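To make the MD5-based placement concrete, here is a minimal Python sketch of how a key could be mapped to a position on the ring. It assumes the token is the absolute value of the 128-bit MD5 digest read as a signed big-endian integer. The function name `md5_token` is hypothetical, and the sketch illustrates the idea rather than reproducing Cassandra's own code.

```python
import hashlib


def md5_token(key: bytes) -> int:
    """Map a row key to a ring position using its MD5 digest.

    Assumption: the 16-byte digest is read as a signed big-endian
    integer and its absolute value is used as the token.
    """
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


if __name__ == "__main__":
    # Similar keys land at unrelated positions, which is what spreads
    # load evenly but also discards key ordering.
    for key in (b"user1", b"user2", b"user3"):
        print(key, md5_token(key))
```

Because adjacent keys hash to unrelated tokens, range scans over keys are not meaningful with this kind of partitioner, which is the trade-off the order-preserving partitioners address.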
Determines the placement of a node’s token in the ring. This setting is only checked on the first start up of a node. With RandomPartitioner configured, it can be used to force equal token spacing around the ring. With OrderPreservingPartitioner, users can specify the node’s token range if the key distribution is known.
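As an illustration of equal token spacing with RandomPartitioner, the following Python sketch computes one token per node. It assumes the token space runs from 0 to 2**127; the function name `evenly_spaced_tokens` is hypothetical and not part of Cassandra.

```python
from typing import List


def evenly_spaced_tokens(node_count: int, ring_size: int = 2**127) -> List[int]:
    """Return one token per node, spaced evenly around the ring.

    Assumption: RandomPartitioner tokens range from 0 to 2**127.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Each printed value would be a candidate InitialToken for one node.
    for node, token in enumerate(evenly_spaced_tokens(4)):
        print(f"node {node}: {token}")
```

Since the setting is only checked on a node's first start, values like these need to be chosen before the node joins the ring.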
Specifies the directory which will hold the commit log data.
This should be on a separate partition from the data directory. See DataFileDirectory for more details.
One or more DataFileDirectory elements can be defined as children of DataFileDirectories. These directories specify the location of SSTable files.
For performance reasons, the data file directories should be on separate partitions (ideally separate physical devices) from the CommitLogDirectory.
There must be one or more Seed elements for a working cluster. A Seed is a node used as a Gossip contact point for information regarding ring topology.
The time that a node will wait for a reply from other nodes before the command is considered failed.
The Phi Failure Accrual Detector value that must be reached before a node is marked as down.
Usually, the default value of 8 is fine. In environments with flaky networks (such as Amazon EC2, at times), this may need to be increased to 9 or 10 to help prevent a node being erroneously marked down.
The size to which the commit log will grow before creating a new commit log segment.
The bind address for other nodes to communicate with this node.
This can be left blank if the hostname is set (using /etc/hostname, for example), DNS resolution is configured, and the address associated with the hostname is the correct one to use. In this case, the result of Java’s InetAddress.getLocalHost() is used. If your environment allows for this, it can help to make the configuration for all nodes the same, eliminating one potential source of configuration error. This also helps to ensure the correct interface is used.
The port used for internal cluster communications.
The address to which the Thrift API calls will be bound. To have Thrift listen on all interfaces, use the value 0.0.0.0. Leaving this value blank has the same effect as for ListenAddress.
The port to which the Thrift service will be bound.
To enable framing for the server, set this to true. Note that either way, this value must match the client side configuration.
Controls if and how SSTable and index files are mapped into memory via the mmap system call. The default mode, auto, enables mmap for both file types on 64-bit JVMs, as does setting the option to mmap explicitly. The next option, mmap_index_only, uses mmap for just the index files (and is also what auto resolves to on a 32-bit JVM). The remaining option, standard, disables mmap usage.
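The four modes can be summarized as a small decision table. The Python sketch below encodes only the behavior described above; the function name `resolve_disk_access_mode` and the returned dictionary layout are hypothetical.

```python
from typing import Dict


def resolve_disk_access_mode(mode: str, jvm_is_64bit: bool) -> Dict[str, bool]:
    """Report which file types would be memory-mapped for a given setting."""
    if mode == "auto":
        # auto picks full mmap on 64-bit JVMs, index-only mmap otherwise.
        mode = "mmap" if jvm_is_64bit else "mmap_index_only"
    if mode == "mmap":
        return {"data": True, "index": True}
    if mode == "mmap_index_only":
        return {"data": False, "index": True}
    if mode == "standard":
        return {"data": False, "index": False}
    raise ValueError(f"unknown disk access mode: {mode!r}")


if __name__ == "__main__":
    print(resolve_disk_access_mode("auto", jvm_is_64bit=False))
```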
Logs a warning if a row is compacted and is above this size.
The buffer size to use for reading contiguous columns. This should match the size of the columns typically retrieved using query operations involving a slice predicate.
Denotes the size of the buffer used when flushing memtables to SSTables on disk. If you have few columns per key, you should increase this. This should be decreased if you have many columns for any given key.
Column indexes are added to a row after the data reaches this size. This usually happens if there are a large number of columns in a row or the column values themselves are large. If you consistently read only a few columns from each row, this should be kept small as it denotes how much of the row data must be deserialized to read the column.
Memtables are flushed after this much data (actual heap usage will be greater than this due to overhead from column indexing) has been inserted or updated. This setting must be tuned carefully, as there is one memtable per column family.
The memory to be consumed for BinaryMemtables (used in bulk-loading).
Like MemtableThroughputInMB this is per-memtable, but here we define the total number of columns in millions that will be kept in memory regardless of data size. This should be tuned in conjunction with MemtableThroughputInMB as the first one triggered will cause a memtable flush.
Flush a memtable after this many minutes regardless of other memtable settings. This setting cannot be too large as unflushed column families cannot have their commit log segments deleted. Setting this too low could trigger too many flushes that would greatly impact I/O performance.
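The three memtable limits interact on a first-one-wins basis. The following Python sketch shows that logic under simplifying assumptions: the class and function names are hypothetical, the limit values used in the example are illustrative only, and overhead such as column indexing is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemtableLimits:
    throughput_mb: float        # data size limit, in megabytes
    operations_millions: float  # column count limit, in millions
    flush_after_minutes: float  # maximum age, in minutes


def flush_reasons(limits: MemtableLimits, bytes_written: int,
                  operations: int, age_minutes: float) -> List[str]:
    """Return the limits a memtable has reached; any entry means flush."""
    reasons = []
    if bytes_written >= limits.throughput_mb * 1024 * 1024:
        reasons.append("throughput")
    if operations >= limits.operations_millions * 1_000_000:
        reasons.append("operations")
    if age_minutes >= limits.flush_after_minutes:
        reasons.append("age")
    return reasons


if __name__ == "__main__":
    # Illustrative values only, not a tuning recommendation.
    limits = MemtableLimits(throughput_mb=64, operations_millions=0.3,
                            flush_after_minutes=60)
    # Many small columns: the operation count trips long before the size.
    print(flush_reasons(limits, 10 * 1024 * 1024, 400_000, 5))
```

A workload with many tiny columns tends to hit the operation limit first, while a workload with large values tends to hit the size limit first, which is why the two are best tuned together.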
The number of reader threads available in the system. A general rule is to keep this twice the number of processor cores in the system. For many systems, increasing this from the default value of 8 to 16 will improve read performance.
Defines the number of writer threads available in the system. On systems with many cores (12 or more), increasing the default of 32 may yield performance improvements.
The method that Cassandra will use to acknowledge writes. The default of periodic is used in conjunction with CommitLogSyncPeriodInMS to control how often the commit log is synced to disk. In periodic mode, writes are acknowledged immediately. A batch mode is also available, in which writes block until the data has been fsynced to disk. This mode is used in conjunction with CommitLogSyncBatchWindowInMS to control how often the syncs happen.
How often to send the commit log to disk when in periodic mode of CommitLogSync.
How often to fsync the data to disk (which is a blocking call) when in batch mode of CommitLogSync.
How long to wait before garbage collection cleans up deletion markers (known as tombstones). This should be a large enough value to allow for the propagation of deletions to all replicas regardless of hardware failures. See the section on compaction below for more information on the effects of this setting.
Defines how replicas are placed on physical hardware.
The default org.apache.cassandra.locator.RackUnawareStrategy simply returns the nodes that lie next to each other on the ring according to the replication factor.
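A rough Python sketch of this "next nodes on the ring" placement follows. It assumes each node owns the range ending at its own token, so a key belongs first to the node with the smallest token greater than or equal to the key's token, wrapping around at the end of the ring. The function name `rack_unaware_replicas` and the toy ring are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List


def rack_unaware_replicas(ring: Dict[int, str], key_token: int,
                          replication_factor: int) -> List[str]:
    """Pick the primary node for a token, then its successors on the ring."""
    tokens = sorted(ring)
    count = min(replication_factor, len(tokens))
    # First token >= key_token; wraps to index 0 past the last token.
    start = bisect_left(tokens, key_token) % len(tokens)
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


if __name__ == "__main__":
    ring = {0: "A", 100: "B", 200: "C", 300: "D"}
    print(rack_unaware_replicas(ring, 150, 3))
```

Because the choice depends only on ring order, nothing prevents all replicas from landing on the same rack, which is the limitation the rack-aware strategy is meant to address.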
RackAwareStrategy places one replica in a different data center while placing the others on different racks in the current data center. Racks and data centers must be delineated with IP addresses that differ in the last and second to last octets (respectively) for this strategy to work correctly.
See the section on Replication Strategies for more information.
The number of copies of data to keep in the cluster. The default of one does not mean “make one additional copy”; it means that only one copy exists. Thus, to have three-way redundancy on data, the ReplicationFactor should be three.
Every ColumnFamily must have a name. This is the only required attribute.
This attribute defines the sort algorithm which will be used to compare columns. Users may customize this behavior by extending org.apache.cassandra.db.marshal.AbstractType. The different values available for CompareWith are detailed below:
|BytesType||Simple non-validating byte comparison (Default)|
|AsciiType||Similar to BytesType, but validates that input is US-ASCII|
|UTF8Type||UTF-8 encoded string comparison|
|LongType||Compares values as 64 bit longs|
|LexicalUUIDType||128 bit UUID compared by byte value|
|TimeUUIDType||128 bit version 1 UUID compared by timestamp|
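The choice of comparator changes the order in which columns come back. The Python sketch below contrasts a byte-wise comparison with a 64 bit long comparison on the same column names. It assumes the names are stored as 8-byte big-endian signed integers and that byte comparison treats each byte as unsigned; the helper names are hypothetical.

```python
import struct
from typing import List


def pack_long(value: int) -> bytes:
    """Encode an integer as an 8-byte big-endian signed value."""
    return struct.pack(">q", value)


def unpack_long(name: bytes) -> int:
    """Decode an 8-byte big-endian signed value."""
    return struct.unpack(">q", name)[0]


def sort_as_bytes(names: List[bytes]) -> List[bytes]:
    # Python compares bytes lexicographically, byte by byte, as unsigned.
    return sorted(names)


def sort_as_longs(names: List[bytes]) -> List[bytes]:
    return sorted(names, key=unpack_long)


if __name__ == "__main__":
    names = [pack_long(v) for v in (10, -1, 2)]
    print([unpack_long(n) for n in sort_as_bytes(names)])
    print([unpack_long(n) for n in sort_as_longs(names)])
```

In the byte-wise ordering the negative value sorts last because its leading byte is 0xFF, whereas the long comparison places it first.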
Defines how many key locations will be kept in memory per SSTable (see RowsCached for details on caching actual row values). This can be a fixed number, a percentage, or a fraction. To specify a percentage or fraction, use “50%” or “0.5” respectively.
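One way to read the three forms of this setting is sketched in Python below. It is an interpretation, not Cassandra's parser: it assumes values below 1 are fractions, values of 1 or more are absolute counts, and it accepts the percent sign on either side of the number. The function name `keys_to_cache` is hypothetical.

```python
def keys_to_cache(setting: str, keys_in_sstable: int) -> int:
    """Turn a cache-size setting into a number of keys for one SSTable.

    Assumptions: a value containing '%' is a percentage, a value below 1
    is a fraction, and anything else is an absolute count.
    """
    text = setting.strip()
    if "%" in text:
        fraction = float(text.replace("%", "")) / 100.0
    else:
        value = float(text)
        if value < 0:
            raise ValueError("cache size cannot be negative")
        if value >= 1:
            return int(value)  # absolute number of keys
        fraction = value
    if fraction < 0:
        raise ValueError("cache size cannot be negative")
    return int(fraction * keys_in_sstable)


if __name__ == "__main__":
    for setting in ("200000", "50%", "0.5"):
        print(setting, keys_to_cache(setting, 1000))
```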