Related question from reading the blog on ttl_rebuilder_fixed_rate_period.
If compaction will normally cause expired TTL columns to trigger reindexing, do I have a potential issue where compaction could cause mutation messages to drop when the backpressure feature kicks in?
To illustrate my concern:
Let's say I'm maintaining 30 days of data and using a TTL to expire rows older than 30 days.
My average row size is ~1600 bytes. Let's say I write 2 million rows/day, so over the course of 30 days I have 60 million rows, or ~96 GB. (I'm not sure that total data size actually matters; see below.)
If my compaction_throughput_mb_per_sec is set to 16 and 1/30th of my data is TTL'd on average, does this mean compaction reads roughly 16 MB/sec / (1600 bytes/row) = 10,000 rows/sec, of which 1/30th, or about 333 rows/sec, might be getting thrown at Solr to reindex?
If so, do I need to significantly alter my compaction throughput or perhaps schedule compaction for off-hours to avoid collision with normal high-frequency writes? Are there other alternatives that you can see?
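For anyone who wants to check my arithmetic, the back-of-the-envelope estimate above can be sketched in Python. The inputs are just the figures I quoted; treating 1 MB as 1,000,000 bytes and assuming expired rows are spread evenly through whatever compaction reads are my own simplifications, not anything I know about how compaction or the Solr reindexing actually behaves.

```python
# Rough estimate only; every input is a figure quoted in the post.
ROW_BYTES = 1600                 # average row size
ROWS_PER_DAY = 2_000_000         # write volume
RETENTION_DAYS = 30              # TTL window
COMPACTION_MB_PER_SEC = 16       # compaction_throughput_mb_per_sec

# Total data held over the retention window.
total_rows = ROWS_PER_DAY * RETENTION_DAYS
total_gb = total_rows * ROW_BYTES / 1e9

# Rows compaction can read per second at the throttle
# (assumption: 1 MB = 1,000,000 bytes).
rows_scanned_per_sec = COMPACTION_MB_PER_SEC * 1_000_000 / ROW_BYTES

# Assumption: 1/30th of the rows compaction reads are expired.
expired_rows_per_sec = rows_scanned_per_sec / RETENTION_DAYS

print(f"total rows:            {total_rows:,}")
print(f"total size:            {total_gb:.0f} GB")
print(f"rows scanned per sec:  {rows_scanned_per_sec:,.0f}")
print(f"expired rows per sec:  {expired_rows_per_sec:,.0f}")
```

Note that the total data size drops out of the per-second figure, which is why I suspect it doesn't matter here: only the throttle, the row size, and the expired fraction appear in the result.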
I'm trying to figure out the spike I'm seeing in tpstats. My data rate is currently throttled to 1000 writes/sec:
2013-08-26T04:07:45 192.168.131.193 MutationStage 0 22 27205533 0 0
2013-08-26T04:08:03 192.168.131.193 MutationStage 13 56 27223449 0 0
2013-08-26T04:08:30 192.168.131.193 MutationStage 32 66040 27237761 0 0
2013-08-26T04:08:56 192.168.131.193 MutationStage 0 0 27381008 0 0
In 27 seconds I go from 56 pending messages to 66k messages.
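To put numbers on the spike, here is a small Python sketch that computes the change between consecutive samples above. I'm assuming the five numeric columns are Active, Pending, Completed, Blocked, and All time blocked, in the usual tpstats order; the timestamp and host prefix come from my own collection script.

```python
from datetime import datetime

# The four samples from the post, verbatim.
SAMPLES = """\
2013-08-26T04:07:45 192.168.131.193 MutationStage 0 22 27205533 0 0
2013-08-26T04:08:03 192.168.131.193 MutationStage 13 56 27223449 0 0
2013-08-26T04:08:30 192.168.131.193 MutationStage 32 66040 27237761 0 0
2013-08-26T04:08:56 192.168.131.193 MutationStage 0 0 27381008 0 0
"""

def parse(line):
    # Assumed column order: timestamp, host, pool, Active, Pending,
    # Completed, Blocked, All time blocked.
    ts, _host, _pool, _active, pending, completed, _blocked, _atb = line.split()
    return datetime.fromisoformat(ts), int(pending), int(completed)

rows = [parse(line) for line in SAMPLES.splitlines()]

# For each pair of consecutive samples: (seconds elapsed,
# change in pending, completed mutations per second).
deltas = []
for (t0, p0, c0), (t1, p1, c1) in zip(rows, rows[1:]):
    dt = (t1 - t0).total_seconds()
    deltas.append((dt, p1 - p0, (c1 - c0) / dt))

for dt, d_pending, completed_rate in deltas:
    print(f"{dt:4.0f}s  pending {d_pending:+7d}  completed/s {completed_rate:8.1f}")
```

On these samples the completed rate is about 995/sec in the first interval, which matches my 1000 writes/sec throttle. It falls to about 530/sec during the 27 seconds in which pending grows by 65,984, then jumps to about 5,500/sec in the last interval as the queue drains.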