Apache Cassandra 1.1 Documentation

Troubleshooting Guide

This document corresponds to an earlier product version. Make sure you are using the documentation version that corresponds to your product version.

This page contains recommended fixes and workarounds for issues commonly encountered with Cassandra:

Reads are getting slower while writes are still fast

Check the SSTable counts in cfstats. If the count is continually growing, the cluster's IO capacity is not enough to handle the write load it is receiving. Reads have slowed down because the data is fragmented across many SSTables, and compaction is continually running, trying to reduce their number. Adding more IO capacity, either via more machines in the cluster or via faster drives such as SSDs, will be necessary to solve this.
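
For example, you can watch the count with nodetool (localhost is a placeholder for the node's address; run this against each node):

# Print per-column-family statistics and filter for the SSTable count;
# a count that only ever grows means compaction is falling behind:
nodetool -h localhost cfstats | grep "SSTable count"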

If the SSTable count is relatively low (32 or less), then the amount of file cache available per machine relative to the amount of data per machine needs to be considered, as well as the application's read pattern. The amount of file cache can be estimated as (TotalMemory – JVMHeapSize). If the amount of data is greater than that and the read pattern is approximately random, a proportion of reads roughly equal to one minus the cache:data ratio will need to seek the disk. With spinning media, this is a slow operation. You may be able to mitigate many of the seeks by using a key cache of 100%, and a small amount of row cache (10,000-20,000 rows) if you have some 'hot' rows and they are not extremely large.
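
For example (illustrative numbers only): a node with 16 GB of RAM and an 8 GB JVM heap has roughly 16 - 8 = 8 GB of file cache. If that node holds 80 GB of data, the cache:data ratio is 1:10, so with an approximately random read pattern about 90% of reads will miss the cache and seek the disk.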

Nodes seem to freeze after some period of time

Check your system.log for messages from the GCInspector. If the GCInspector is indicating that either the ParNew or ConcurrentMarkSweep collectors took longer than 15 seconds, there is a very high probability that some portion of the JVM is being swapped out by the OS. One way this can happen is if the mmap DiskAccessMode is used without JNA support: the address space will be exhausted by mmap, and the OS will decide to swap out some portion of the JVM that isn't in use, but eventually the JVM will try to GC this space. Adding the JNA libraries will solve this (they cannot be shipped with Cassandra due to their GPL license, but are freely available). Alternatively, the DiskAccessMode can be switched to mmap_index_only, which, as the name implies, will only mmap the indices, using much less address space.

DataStax recommends that Cassandra nodes disable swap entirely (sudo swapoff --all), since it is better to have the OS OutOfMemory (OOM) killer kill the Java process entirely than to have the JVM buried in swap and responding poorly.
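
For example, to look for these GCInspector messages and confirm whether the node is swapping (the log path shown is a common default and may differ on your system):

# Look for long ParNew / ConcurrentMarkSweep pauses reported in the log:
grep GCInspector /var/log/cassandra/system.log

# Check swap usage, then disable swap entirely:
free -m
sudo swapoff --all
# To make this permanent, also remove or comment out the swap entries in /etc/fstab.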

If the GCInspector isn't reporting very long GC times, but is reporting moderate times frequently (ConcurrentMarkSweep taking a few seconds very often), then it is likely that the JVM is experiencing extreme GC pressure and will eventually OOM. See the section below on OOM errors.

Nodes are dying with OOM errors

If nodes are dying with OutOfMemory exceptions, check for these typical causes:

  • Row cache is too large, or is caching large rows
    • Row cache is generally a high-end optimization. Try disabling it and see if the OOM problems continue.
  • The memtable sizes are too large for the amount of heap allocated to the JVM
    • You can expect N + 2 memtables resident in memory, where N is the number of column families. Adding another 1 GB on top of that for Cassandra itself is a good estimate of total heap usage (see the worked example after this list).
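
For example (illustrative numbers only): with 10 column families, expect up to 10 + 2 = 12 memtables resident at once. If each memtable is allowed to grow to 128 MB, that is roughly 1.5 GB for memtables alone; adding the 1 GB estimate for Cassandra itself suggests a heap of at least 2.5 GB.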

If none of these seem to apply to your situation, try loading the heap dump in the Eclipse Memory Analyzer (MAT) and see which class is consuming the bulk of the heap for clues.

Nodetool or JMX connections failing on remote nodes

If you can run nodetool commands locally but not on other nodes in the ring, you may have a common JMX connection problem that is resolved by adding an entry like the following in <install_location>/conf/cassandra-env.sh on each node:

JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=<public name>"

If you still cannot run nodetool commands remotely after making this configuration change, do a full evaluation of your firewall and network security. The nodetool utility communicates through JMX on port 7199.
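
For example, you can confirm basic reachability of the JMX port from another machine (nc is assumed to be installed; <host> is a placeholder for the node's address):

# Verify that TCP port 7199 is reachable from a remote host:
nc -zv <host> 7199

Note that JMX also negotiates a second, dynamically assigned RMI port, so opening port 7199 alone may not be sufficient through a restrictive firewall.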

View of ring differs between some nodes

This is an indication that the ring is in a bad state. This can happen when there are token conflicts (for instance, when bootstrapping two nodes simultaneously with automatic token selection). Unfortunately, the only way to resolve this is to do a full cluster restart; a rolling restart is insufficient, since gossip from nodes with the bad state will repopulate it on newly booted nodes.
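
Before scheduling the restart, you can confirm the disagreement by comparing the ring view reported by each node (host names are placeholders):

# Run against each node in turn and compare the output:
nodetool -h node1 ring
nodetool -h node2 ring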

Java reports an error saying there are too many open files

Java is not allowed to open enough file descriptors. Cassandra generally needs more than the default limit of 1024. To increase the number of file descriptors, change the security limits on your Cassandra nodes, for example with the following commands:

echo "* soft nofile 32768" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 32768" | sudo tee -a /etc/security/limits.conf
echo "root soft nofile 32768" | sudo tee -a /etc/security/limits.conf
echo "root hard nofile 32768" | sudo tee -a /etc/security/limits.conf

Another, much less likely, possibility is a file descriptor leak in Cassandra. Run lsof -n | grep java to check that the number of file descriptors opened by Java is reasonable, and report the error if the number is greater than a few thousand.
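
For example:

# Count the file descriptors held by Java processes:
lsof -n | grep java | wc -l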

Insufficient user resource limits errors

Insufficient resource limits may result in a number of errors in Cassandra and OpsCenter, including the following:

Cassandra errors

Insufficient as (address space) or memlock setting:

ERROR [SSTableBatchOpen:1] 2012-07-25 15:46:02,913 AbstractCassandraDaemon.java (line 139)
Fatal exception in thread Thread[SSTableBatchOpen:1,5,main]
java.io.IOError: java.io.IOException: Map failed  at ...

Insufficient memlock settings:

WARN [main] 2011-06-15 09:58:56,861 CLibrary.java (line 118) Unable to lock JVM memory (ENOMEM).
This can result in part of the JVM being swapped out, especially with mmapped I/O enabled.
Increase RLIMIT_MEMLOCK or run Cassandra as root.

Insufficient nofiles setting:

WARN 05:13:43,644 Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Too many open files ...

Insufficient nproc setting:

ERROR [MutationStage:11] 2012-04-30 09:46:08,102 AbstractCassandraDaemon.java (line 139)
Fatal exception in thread Thread[MutationStage:11,5,main]
java.lang.OutOfMemoryError: unable to create new native thread

OpsCenter errors

Insufficient nofiles setting:

2012-08-13 11:22:51-0400 [] INFO: Could not accept new connection (EMFILE)
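
The errors above correspond to the as, memlock, nofile, and nproc limits. A minimal sketch of matching /etc/security/limits.conf entries, assuming Cassandra runs as the user cassandra (the values are illustrative; tune them for your hardware):

# /etc/security/limits.conf entries for the cassandra user:
cassandra  soft  memlock  unlimited
cassandra  hard  memlock  unlimited
cassandra  soft  nofile   32768
cassandra  hard  nofile   32768
cassandra  soft  nproc    32768
cassandra  hard  nproc    32768
cassandra  soft  as       unlimited
cassandra  hard  as       unlimited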

Cannot initialize class org.xerial.snappy.Snappy

The following error may occur when Snappy compression/decompression is enabled, even though its library is available on the classpath:

java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError:
    Could not initialize class org.xerial.snappy.Snappy
...
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
   at org.apache.cassandra.io.compress.SnappyCompressor.initialCompressedBufferLength
       (SnappyCompressor.java:39)

The native library snappy-1.0.4.1-libsnappyjava.so for Snappy compression is included in the snappy-java-1.0.4.1.jar file. When the JVM initializes the JAR, the native library is extracted into the default temp directory. If the default temp directory is mounted with the noexec option, the library cannot be executed from there, which results in the above exception.
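
You can check whether the default temp directory is affected (findmnt ships with util-linux on most Linux distributions; mount works as a fallback):

# Show the mount options of the filesystem holding /tmp and look for noexec:
findmnt --target /tmp
# or:
mount | grep /tmp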

One solution is to specify a different temp directory that has already been mounted without the noexec option, as follows:

  • If you use the DSE/Cassandra command $_BIN/dse cassandra or $_BIN/cassandra, append the option to the command line:

    DSE: bin/dse cassandra -t -Dorg.xerial.snappy.tempdir=/path/to/newtmp

    Cassandra: bin/cassandra -Dorg.xerial.snappy.tempdir=/path/to/newtmp

  • If starting from a package using service dse start or service cassandra start, add a system environment variable JVM_OPTS with the value:

    JVM_OPTS=-Dorg.xerial.snappy.tempdir=/path/to/newtmp

    The default cassandra-env.sh looks for the variable and appends to it when starting the JVM.