December 5, 2012

Configuration changes in Cassandra 1.2

Jonathan Ellis

Cassandra 1.2 brings a number of new and improved configuration options that are worth knowing about.

Request timeouts

We've split the old rpc_timeout_in_ms setting into separate timeouts for single-row reads, range scans, writes, truncation, and miscellanea. This gives you finer-grained control over timeouts. In particular, range queries tend to take longer than other requests, and truncate requires a flush, so it will be slower as well.

We've left the defaults alone for all of these but truncate, which was extended to 60s. (Incidentally, in 1.2 truncate only needs to flush the table being emptied, not every table in the cluster.)
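As a rough illustration of how the split works, the sketch below models the timeouts as a lookup from request kind to setting. The setting names are the cassandra.yaml names as I understand them for 1.2, and the 10000 ms values assume the old rpc_timeout_in_ms default carried over unchanged, so verify both against your own cassandra.yaml. The request kinds and the timeout_ms helper are hypothetical names used only for this example.

```python
# Illustrative model of per-request-type timeouts; this is not Cassandra code.
# Setting names and the 10000 ms values are assumptions to check against
# your own cassandra.yaml. Only the 60s truncate value comes from the text.
TIMEOUTS_MS = {
    "read_request_timeout_in_ms": 10000,      # single-row reads
    "range_request_timeout_in_ms": 10000,     # range scans
    "write_request_timeout_in_ms": 10000,     # writes
    "truncate_request_timeout_in_ms": 60000,  # truncation, extended to 60s
    "request_timeout_in_ms": 10000,           # miscellanea
}

# Hypothetical mapping from a request kind to the setting that governs it.
SETTING_FOR_KIND = {
    "read": "read_request_timeout_in_ms",
    "range": "range_request_timeout_in_ms",
    "write": "write_request_timeout_in_ms",
    "truncate": "truncate_request_timeout_in_ms",
}


def timeout_ms(kind: str) -> int:
    """Return the timeout for a request kind, falling back to the catch-all."""
    setting = SETTING_FOR_KIND.get(kind, "request_timeout_in_ms")
    return TIMEOUTS_MS[setting]
```

With this layout, raising the range scan timeout leaves reads and writes untouched, which a single shared timeout could not do.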

Improved recovery from request overload

Cassandra deals with request overload by dropping requests that are so far behind that they've timed out before being processed. Prior to Cassandra 1.2, each replica tracked request timeouts locally -- that is, it assumed that setting up the request on the coordinator was instantaneous. But if the coordinator is also overloaded, which is often the case, then this is not a good assumption.

For 1.2 we've added the ability to count the time a request spent on the coordinator against its timeout, using the cross_node_timeout option. This is off by default, since it requires your Cassandra cluster's clocks to be synchronized. If you have ntp enabled or otherwise synchronize your clocks, go ahead and turn cross node timeouts on.
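The difference between the two modes can be sketched as a choice of which timestamp a replica measures from. This is a simplified model rather than Cassandra's implementation: the Request fields and the should_drop function are hypothetical names, and it assumes the coordinator's timestamp travels with the request.

```python
from dataclasses import dataclass


@dataclass
class Request:
    created_at_ms: int   # stamped by the coordinator's clock (assumed to travel with the request)
    received_at_ms: int  # stamped by the replica's clock on arrival


def should_drop(req: Request, now_ms: int, timeout_ms: int,
                cross_node_timeout: bool) -> bool:
    """Decide whether a replica should drop a request as already timed out.

    With cross_node_timeout off, the clock starts when the replica receives
    the request. With it on, the clock starts when the coordinator created
    it, which is only meaningful if the nodes' clocks are synchronized.
    """
    start_ms = req.created_at_ms if cross_node_timeout else req.received_at_ms
    return now_ms - start_ms > timeout_ms


# A request that waited 8s on an overloaded coordinator, then 3s on the replica.
slow = Request(created_at_ms=0, received_at_ms=8000)
dropped_locally = should_drop(slow, now_ms=11000, timeout_ms=10000,
                              cross_node_timeout=False)
dropped_cross_node = should_drop(slow, now_ms=11000, timeout_ms=10000,
                                 cross_node_timeout=True)
```

In this example the local check sees only 3s of elapsed time and keeps working on a request the client has already given up on, while the cross-node check sees the full 11s and drops it.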

End-to-end encryption

Cassandra has supported SSL between cluster nodes since 0.8. Now we're extending that to client connections as well. Look for client_encryption_options in cassandra.yaml.

Bloom filters

Cassandra uses bloom filters in its log-structured storage engine to avoid scanning data files that can't possibly include the partitions being queried.

Bloom filters are configured on a per-table basis, not globally like the above options. Compaction is also configured per-table.

Since leveled compaction does such a good job of minimizing the number of sstables that a given data partition can be spread across, we don't need to be quite so aggressive with the bloom filters we create. By default, Cassandra 1.2 will use a bloom filter false positive chance of 0.1 for tables using leveled compaction, and 0.01 for tables using size-tiered compaction. For tables using leveled compaction, this results in memory savings of about 50% for their bloom filters.
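The 50% figure can be checked against the textbook sizing formula for an optimally configured bloom filter, which needs -ln(p) / (ln 2)^2 bits per key for a false positive chance p. The sketch below uses that formula only. Cassandra's own sizing logic may round differently, so treat the result as an approximation; bits_per_key is a hypothetical helper name.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimal bloom filter with the given false positive chance."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


leveled = bits_per_key(0.1)       # default for leveled compaction
size_tiered = bits_per_key(0.01)  # default for size-tiered compaction

# Fraction of bloom filter memory saved by moving from 0.01 to 0.1.
savings = 1 - leveled / size_tiered
```

Because ln(0.01) is exactly twice ln(0.1), the filter at 0.1 needs half the bits per key, roughly 4.8 instead of 9.6.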

Others

We've blogged about some other configuration changes in longer articles:
