I've been running Cassandra 0.8.4 in a 10-node cluster for several months without issue, but in the last week I've seen get_slice operations from my client app occasionally take upward of 1 minute to complete. Earlier this week I released another app that increased the write volume on the cluster (on a different CF). I have since rolled it back, but that release seems to have been the trigger. Other than that, the transaction load should not have changed. I'm at a loss for how to investigate the cause any further and was wondering if anyone else has ever seen behavior like this?
Each node has an SSTable count for this ColumnFamily < 10.
I'm using key caching but no row caching.
My heap sizes are: -Xms6G -Xmx6G -Xmn1536M
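In case the generation split matters, here is a quick sketch of what those heap flags work out to. The variable names are just mine for illustration, and it assumes the young generation set by -Xmn is carved out of the fixed 6G heap:

```python
# Sizes taken from the JVM heap flags in this post, in megabytes.
heap_mb = 6 * 1024   # -Xms6G / -Xmx6G (fixed-size heap)
young_mb = 1536      # -Xmn1536M (young generation)

# Whatever is not young generation is left for the old generation.
old_mb = heap_mb - young_mb

# Share of the heap given to the young generation.
young_fraction = young_mb / heap_mb

print(f"old generation: {old_mb} MB")
print(f"young generation share of heap: {young_fraction:.0%}")
```

So roughly a quarter of the heap is young generation and the rest is old generation.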
Each box has 64GB of RAM and swap is never touched.
I'm using the Sun JVM 6u24_x64 on RHEL 5.
Edit: I should have mentioned that all transactions are done with QUORUM consistency.
Any tips would be greatly appreciated.