This page contains recommended fixes and workarounds for issues commonly encountered with Cassandra:
When setting up a Cassandra cluster, you may get a schema disagreement error if multiple schema updates are performed simultaneously, which leaves some nodes in your cluster with a different schema than others.
To correct this problem:
Using the Cassandra CLI, enter the DESCRIBE CLUSTER command. The output contains the IDs of the schema versions and their corresponding nodes. Remove the problematic schema and migration sstables in your system keyspace as described on the Cassandra wiki.
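As a rough aid for reading the DESCRIBE CLUSTER output, the sketch below counts how many distinct schema versions it lists. It assumes the output has been saved to a file and that each version appears on its own line as a 36-character UUID followed by a bracketed node list; the function name count_schema_versions and the file name are illustrative only.

```shell
#!/bin/sh
# Count the schema versions listed in saved DESCRIBE CLUSTER output (read from stdin).
# Assumption: each version line looks like
#   75eece10-bf48-11e0-0000-4d205df954a7: [192.168.1.9, 192.168.1.25]
count_schema_versions() {
    awk '
        # First field is "<uuid>:" (36 characters plus the colon),
        # second field starts the bracketed node list.
        length($1) == 37 && $1 ~ /^[0-9a-fA-F-]+:$/ && $2 ~ /^\[/ { n++ }
        END { print n + 0 }
    '
}
```

For example, count_schema_versions < describe_cluster.txt prints the number of versions found. A result greater than 1 suggests that the nodes disagree on the schema.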
To prevent this problem:
Perform schema changes one at a time, at a steady pace, and from the same node.
Check the SSTable counts in cfstats. If the count is continually growing, the cluster's IO capacity is not enough to handle the write load it is receiving. Reads have slowed down because the data is fragmented across many SSTables and compaction is continually running trying to reduce them. Adding more IO capacity, either via more machines in the cluster, or faster drives such as SSDs, will be necessary to solve this.
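To watch whether the SSTable count keeps growing, it can help to reduce the cfstats output to a single number and sample it over time. The following is a minimal sketch that assumes cfstats prints one "SSTable count: N" line per column family; the function name max_sstable_count is hypothetical.

```shell
#!/bin/sh
# Print the largest "SSTable count" value found in cfstats output read from stdin.
# Assumption: cfstats prints lines of the form "SSTable count: <number>".
max_sstable_count() {
    awk -F': ' '
        /SSTable count:/ { if ($2 + 0 > max) max = $2 + 0 }
        END { print max + 0 }
    '
}
```

A typical use would be nodetool -h localhost cfstats | max_sstable_count, run every few minutes. A value that rises steadily under a constant write load points to the IO capacity problem described above.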
If the SSTable count is relatively low (32 or less), consider the amount of file cache available per machine relative to the amount of data per machine, as well as the application's read pattern. The amount of file cache can be estimated as (TotalMemory – JVMHeapSize). If the amount of data is greater than this and the read pattern is approximately random, a share of reads determined by the cache:data ratio will need to seek the disk. With spinning media, this is a slow operation. You may be able to mitigate many of the seeks by using a key cache of 100% and a small row cache (10000-20000) if you have some 'hot' rows and they are not extremely large.
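The file cache estimate above is simple arithmetic, sketched here for convenience. The function name file_cache_estimate and the sample figures are illustrative only; all three arguments must use the same unit (for example MB), and the result is only a rough guide because the operating system also uses memory for other purposes.

```shell
#!/bin/sh
# Estimate the file cache and the cache:data ratio for one node.
# $1 = total memory, $2 = JVM heap size, $3 = data stored on the node (same unit).
file_cache_estimate() {
    awk -v total="$1" -v heap="$2" -v data="$3" 'BEGIN {
        cache = total - heap                 # TotalMemory - JVMHeapSize
        if (data > 0) ratio = cache / data   # cache:data ratio
        else ratio = 0
        printf "cache=%d cache_to_data=%.2f\n", cache, ratio
    }'
}
```

For a node with 16384 MB of memory, an 8192 MB heap, and 32768 MB of data, file_cache_estimate 16384 8192 32768 prints cache=8192 cache_to_data=0.25. A ratio well below 1 with a random read pattern suggests that many reads will have to go to disk.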
Check your system.log for messages from the GCInspector. If the GCInspector is indicating that either the ParNew or ConcurrentMarkSweep collectors took longer than 15 seconds, there is a very high probability that some portion of the JVM is being swapped out by the OS. One way this might happen is if the mmap DiskAccessMode is used without JNA support. The address space will be exhausted by mmap, and the OS will decide to swap out some portion of the JVM that isn't in use, but eventually the JVM will try to GC this space. Adding the JNA libraries will solve this (they cannot be shipped with Cassandra due to carrying a GPL license, but are freely available), or the DiskAccessMode can be switched to mmap_index_only, which as the name implies will only mmap the indices, using much less address space. DataStax recommends that Cassandra nodes disable swap entirely, since it is better to have the OS OutOfMemory (OOM) killer kill the Java process entirely than it is to have the JVM buried in swap and responding poorly.
If the GCInspector isn't reporting very long GC times, but is reporting moderate times frequently (ConcurrentMarkSweep taking a few seconds very often) then it is likely that the JVM is experiencing extreme GC pressure and will eventually OOM. See the section below on OOM errors.
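One way to separate the very long pauses from the frequent moderate ones is to filter the GCInspector messages by duration. The sketch below assumes the messages contain text of the form "GC for <collector>: <n> ms"; the exact wording can differ between Cassandra versions, and the function name slow_gc_lines is hypothetical.

```shell
#!/bin/sh
# Print GCInspector log lines (read from stdin) whose pause is at least $1 milliseconds.
# Assumption: each message contains "GC for <collector>: <n> ms".
slow_gc_lines() {
    awk -v limit="$1" '
        /GCInspector/ && match($0, /GC for [A-Za-z]+: [0-9]+ ms/) {
            s = substr($0, RSTART, RLENGTH)   # e.g. "GC for ParNew: 16023 ms"
            n = split(s, parts, " ")
            if (parts[n - 1] + 0 >= limit) print
        }
    '
}
```

For example, slow_gc_lines 15000 < system.log lists the collections that took 15 seconds or more, while a lower threshold such as 1000 shows how often moderate pauses occur.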
If nodes are dying with OutOfMemory exceptions, check for these typical causes:
If none of these seem to apply to your situation, try loading the heap dump in the Eclipse Memory Analyzer (MAT) and see which class is consuming the bulk of the heap for clues.
If you can run nodetool commands locally but not on other nodes in the ring, you may have a common JMX connection problem that is resolved by adding an entry like the following in <install_location>/conf/cassandra-env.sh on each node:
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=<public name>"
If you still cannot run nodetool commands remotely after making this configuration change, do a full evaluation of your firewall and network security. The nodetool utility communicates through JMX on port 7199.
This is an indication that the ring is in a bad state. This can happen when there are token conflicts (for instance, when bootstrapping two nodes simultaneously with automatic token selection). Unfortunately, the only way to resolve this is to do a full cluster restart; a rolling restart is insufficient because gossip from nodes with the bad state will repopulate it on newly booted nodes.
One possibility is that Java is not allowed to open enough file descriptors. Cassandra generally needs more than the default limit of 1024. This can be adjusted by increasing the security limits on your Cassandra nodes, for example with the following commands:
echo "* soft nofile 32768" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 32768" | sudo tee -a /etc/security/limits.conf
echo "root soft nofile 32768" | sudo tee -a /etc/security/limits.conf
echo "root hard nofile 32768" | sudo tee -a /etc/security/limits.conf
Another, much less likely, possibility is a file descriptor leak in Cassandra. Run lsof -n | grep java to see whether the number of file descriptors opened by java seems reasonable, and report the error if the number is greater than a few thousand.
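As an alternative to scanning the full lsof output, the descriptors of a single process can be counted through /proc. This is a Linux-only sketch; the function name count_open_fds is hypothetical, and reading another user's /proc entry usually requires running as that user or as root.

```shell
#!/bin/sh
# Count the open file descriptors of one process via /proc (Linux only).
# $1 = process ID. Prints 0 if the process does not exist or is not readable.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}
```

For example, count_open_fds "$(pgrep -f CassandraDaemon | head -n 1)" reports the count for the first matching Java process, assuming the Cassandra main class name contains CassandraDaemon on your installation.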
Insufficient resource limits may result in a number of errors in Cassandra, DataStax Enterprise, and OpsCenter, including the following:
Insufficient as (address space) or memlock setting:
ERROR [SSTableBatchOpen:1] 2012-07-25 15:46:02,913 AbstractCassandraDaemon.java (line 139)
Fatal exception in thread Thread[SSTableBatchOpen:1,5,main]
java.io.IOError: java.io.IOException: Map failed at ...
Insufficient memlock settings:
WARN [main] 2011-06-15 09:58:56,861 CLibrary.java (line 118) Unable to lock JVM memory (ENOMEM).
This can result in part of the JVM being swapped out, especially with mmapped I/O enabled.
Increase RLIMIT_MEMLOCK or run Cassandra as root.
Insufficient nofiles setting:
WARN 05:13:43,644 Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Too many open files ...
Insufficient nofiles setting:
ERROR [MutationStage:11] 2012-04-30 09:46:08,102 AbstractCassandraDaemon.java (line 139)
Fatal exception in thread Thread[MutationStage:11,5,main]
java.lang.OutOfMemoryError: unable to create new native thread
Insufficient nofiles setting:
2012-08-13 11:22:51-0400 [] INFO: Could not accept new connection (EMFILE)
You can view the current limits using the ulimit -a command. Although limits can also be temporarily set using this command, DataStax recommends permanently changing the settings by adding the following entries to your /etc/security/limits.conf file:
* soft nofile 32768
* hard nofile 32768
root soft nofile 32768
root hard nofile 32768
* soft memlock unlimited
* hard memlock unlimited
root soft memlock unlimited
root hard memlock unlimited
* soft as unlimited
* hard as unlimited
root soft as unlimited
root hard as unlimited
In addition, you may need to run the following command:
sysctl -w vm.max_map_count=131072
This command raises the maximum number of memory map areas a process may have. The setting is not part of the limits.conf file.
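To compare the limits in effect for the current shell with the limits.conf values recommended above, a small check like the following may be useful. It is a sketch only: the function names are hypothetical, it assumes a shell such as bash whose ulimit builtin accepts -n (nofile), -l (memlock), and -v (as), and limits.conf changes generally apply only to sessions started after the change.

```shell
#!/bin/sh
# Succeed when a ulimit value ($1) meets a required minimum ($2).
# Either value may be the word "unlimited".
limit_at_least() {
    [ "$1" = "unlimited" ] && return 0
    [ "$2" = "unlimited" ] && return 1
    [ "$1" -ge "$2" ]
}

# Report nofile (-n), memlock (-l), and as (-v) for the current shell.
report_limits() {
    for pair in n:32768 l:unlimited v:unlimited; do
        flag=${pair%%:*}
        want=${pair#*:}
        have=$(ulimit "-$flag")
        if limit_at_least "$have" "$want"; then status=ok; else status=LOW; fi
        echo "ulimit -$flag: $have (recommended: $want) $status"
    done
}
```

Run report_limits as the user that starts Cassandra, because limits are applied per user session.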
The following error may occur when Snappy compression/decompression is enabled, even though its library is available on the classpath:
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError:
Could not initialize class org.xerial.snappy.Snappy
...
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
at org.apache.cassandra.io.compress.SnappyCompressor.initialCompressedBufferLength
(SnappyCompressor.java:39)
The native library snappy-1.0.4.1-libsnappyjava.so for Snappy compression is included in the snappy-java-1.0.4.1.jar file. When the JVM initializes the JAR, the library is added to the default temp directory. If the default temp directory is mounted with a noexec option, it results in the above exception.
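To confirm that the temp directory is the cause, you can inspect the mount options of the file system that holds it. The sketch below assumes the findmnt utility from util-linux is installed and that the JVM's default temp directory is /tmp; both function names are hypothetical.

```shell
#!/bin/sh
# Succeed if a comma-separated mount option string ($1) contains "noexec".
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the mount options of the file system that holds directory $1.
# Assumption: findmnt (util-linux) is available.
mount_options_for() {
    findmnt -n -o OPTIONS --target "$1"
}
```

For example, has_noexec "$(mount_options_for /tmp)" && echo "/tmp is mounted noexec" prints a message when the native library cannot be executed from /tmp.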
One solution is to specify a different temp directory that has already been mounted without the noexec option, as follows:
If you use the DSE/Cassandra command $_BIN/dse cassandra or $_BIN/cassandra, append the option to the command line:
DSE: bin/dse cassandra -t -Dorg.xerial.snappy.tempdir=/path/to/newtmp
Cassandra: bin/cassandra -Dorg.xerial.snappy.tempdir=/path/to/newtmp
If starting from a package using service dse start or service cassandra start, add a system environment variable JVM_OPTS with the value:
JVM_OPTS=-Dorg.xerial.snappy.tempdir=/path/to/newtmp
The default cassandra-env.sh looks for the variable and appends to it when starting the JVM.