I've got a problem with OOM :(
After running for about one or two hours, Cassandra goes down with errors like this:
java.lang.OutOfMemoryError: Java heap space
at org.apache.cassandra.io.util.FastByteArrayOutputStream.expand(FastByteArrayOutputStream.java:104)
at org.apache.cassandra.io.util.FastByteArrayOutputStream.write(FastByteArrayOutputStream.java:220)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:319)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:338)
.........................................................
I have 8 CFs, each with a pretty large number of records.
For example:
Column Family: play_lists_to_users
SSTable count: 2
Space used (live): 107481694
Space used (total): 107481694
Number of Keys (estimate): 694016
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 1
Read Count: 78
Read Latency: 0.497 ms.
Write Count: 78
Write Latency: 0.064 ms.
Pending Tasks: 0
Bloom Filter False Postives: 0
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 1309232
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: 0.0
Row cache: disabled
Compacted row minimum size: 125
Compacted row maximum size: 149
Compacted row mean size: 149
And this is only about 10% of the records I plan to load.
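To put that in perspective, here is a rough back-of-the-envelope projection from the cfstats numbers above. It is only a sketch: it assumes (this is an assumption, not something measured) that disk usage and Bloom filter size grow linearly with the number of keys, and the helper name is made up for illustration.

```python
# Figures copied from the cfstats output for play_lists_to_users.
SPACE_USED_LIVE = 107_481_694   # bytes, "Space used (live)"
KEY_ESTIMATE = 694_016          # "Number of Keys (estimate)"
BLOOM_FILTER_BYTES = 1_309_232  # bytes, "Bloom Filter Space Used"
LOADED_FRACTION = 0.10          # "about 10%" of the planned records


def project_full_load(current, loaded_fraction=LOADED_FRACTION):
    """Scale a current figure up to the full data set (assumes linear growth)."""
    return current / loaded_fraction


bytes_per_key = SPACE_USED_LIVE / KEY_ESTIMATE
print(f"on-disk bytes per key now: {bytes_per_key:.1f}")
print(f"projected disk usage:      {project_full_load(SPACE_USED_LIVE) / 1e9:.2f} GB")
print(f"projected Bloom filter:    {project_full_load(BLOOM_FILTER_BYTES) / 1e6:.1f} MB")
```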
I have read articles about this kind of problem:
http://www.datastax.com/docs/1.0/troubleshooting/index#nodes-are-dying-with-oom-errors
http://www.datastax.com/docs/1.0/operations/tuning#heap-sizing and so on.
They say Cassandra automatically sets the heap size to 1/4 of the RAM, capped at 8 GB.
But after reading all these articles I still don't understand which parameter I should change to make Cassandra work properly.
I have 32 GB of RAM and -Xms8039M -Xmx8039M -Xmn400M (the default settings), and I read that it is a bad idea to increase the heap size, to avoid performance problems.
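As a rough sketch of where those defaults come from: the cassandra-env.sh script shipped with 1.0 appears to compute the heap as max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)) and the young generation as min(100 MB per core, 1/4 of the heap). The 32156 MB of reported system memory and the 4 cores below are assumptions, chosen only because they reproduce the 8039M and 400M values; the function names are made up for illustration.

```python
def default_max_heap_mb(system_memory_mb):
    """Heap default: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))."""
    half = system_memory_mb // 2
    quarter = half // 2
    return max(min(half, 1024), min(quarter, 8192))


def default_young_gen_mb(max_heap_mb, cores, per_core_mb=100):
    """Young generation default: min(100 MB per core, 1/4 of the heap)."""
    return min(per_core_mb * cores, max_heap_mb // 4)


# Assumed inputs: 32156 MB reported by the OS, 4 CPU cores.
heap = default_max_heap_mb(32156)
young = default_young_gen_mb(heap, cores=4)
print(f"-Xms{heap}M -Xmx{heap}M -Xmn{young}M")  # -Xms8039M -Xmx8039M -Xmn400M
```

If this reading is right, the heap and young generation can be overridden by setting MAX_HEAP_SIZE and HEAP_NEWSIZE in cassandra-env.sh instead of relying on the computed values.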
I've reduced my batch_mutate calls to very small portions: about 25,000 records across all 8 CFs.
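The batching can be sketched roughly like this. It is only an illustration: `send_batch` is a hypothetical placeholder for whatever client call actually performs the batch_mutate, and the default batch size is just the figure mentioned above.

```python
from itertools import islice


def chunked(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load(records, send_batch, batch_size=25_000):
    """Send records in fixed-size batches; return how many were sent.

    send_batch is a hypothetical callable standing in for the real
    client call that issues one batch_mutate.
    """
    sent = 0
    for batch in chunked(records, batch_size):
        send_batch(batch)
        sent += len(batch)
    return sent
```

Only one batch is held in memory on the client at a time, so lowering `batch_size` further is a one-line change.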
I've attached the logs and the config file:
http://95.31.214.117/upl/cassi.zip
And here are graphs of CPU and memory usage:
http://95.31.214.117/upl/cpu-day.png
http://95.31.214.117/upl/memory-day.png
As you can see, it crashed 4 times, each time after running for about the same amount of time.
In addition, the following exception also occurs often in output.log:
java.io.IOError: java.io.IOException: Corrupt (negative) value length encountered
at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:114)
at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:97)
at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:137)
at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:97)
at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:82)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:118)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:101)
...................................................................................
It looks like something is going wrong there.
Andrey.
