Our setup is a 3-node Cassandra cluster (all nodes running 1.0.5).
To insert data, we use SSTableSimpleUnsortedWriter to convert CSV files into SSTables of about 64 MB each, and then bulk load them with sstableloader.
This happens every 5 minutes, and the total size is about 1 GB.
This all works perfectly: there are no performance problems on the write side, and everything gets compacted.
However, when we try to read data from our cluster, we notice some problems.
If we read random keys with cassandra-cli, some keys return with a very high delay (2000+ ms) while others return very quickly (<50 ms).
If we spam it a bit (manually), nodes just start dying with OOM errors.
If we try to view the data in OpsCenter Community, nodes also just die with OOM errors.
Here is a pastebin with our exact error and some more details: http://pastebin.com/Kykr3th5 (the key cache has since been set to 0, but the problem remained).
We changed the heap size to 8 GB (ParNew to 1 GB), flush_largest_memtables_at to 0.5, etc., but we keep getting these problems. Are there any other options we can adjust?
Or is there another problem that is causing the OOM?
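
For reference, the memory settings mentioned above correspond roughly to the following. This is a sketch rather than a verbatim copy of our files: it assumes the stock conf/cassandra-env.sh and conf/cassandra.yaml layout of a 1.0.x install, and the cassandra.yaml entry is shown as a comment only so that everything fits in one block.

```shell
# conf/cassandra-env.sh (assumed stock location)
# Total JVM heap; passed to the JVM as -Xms/-Xmx.
MAX_HEAP_SIZE="8G"
# Young generation size (-Xmn), i.e. the space collected by ParNew.
HEAP_NEWSIZE="1G"

# conf/cassandra.yaml (YAML, shown here as a comment):
#   flush_largest_memtables_at: 0.5   # default is 0.75
```

Everything else is assumed to be at its default value, apart from the key cache being set to 0 as noted above.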