What is the best way to get a row count for a given column family? I am running Cassandra 1.0.8.
Thanks.
If you just need a close approximation, nodetool cfstats will show an estimate that's within 128 of the actual count. If you need an exact number, or need to do this programmatically and can't use JMX, you can use get_range() with a column count of zero to get all of the keys. However, note that this is subject to the "range ghosts" problem: http://wiki.apache.org/cassandra/FAQ#range_ghosts
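For illustration, here is a minimal sketch of the key-counting approach. It assumes your client's range scan yields (key, columns) pairs; count_rows and fake_scan are hypothetical names used only for this example, not part of any Cassandra client. Since the scan asks for zero columns, every row comes back empty, so ghost rows can't be told apart from live ones and the count may include them.

```python
def count_rows(range_scan):
    """Count the row keys yielded by a range scan of (key, columns) pairs."""
    count = 0
    for _key, _columns in range_scan:
        # Only the key matters here; the columns are empty because the
        # scan was issued with a column count of zero.
        count += 1
    return count


# Hypothetical stand-in for a real scan that requested zero columns per row.
fake_scan = [("row1", {}), ("row2", {}), ("row3", {})]
print(count_rows(fake_scan))
```

With a real client you would pass the iterator returned by its range call instead of fake_scan.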
Thanks for the tips. One follow-up question: does cfstats only return the values for the node you are running it against? If so, is there a way to get the results for the whole ring?
Right, cfstats only returns values for the node you're connecting to with nodetool. Assuming your ring is balanced, you can get an estimate of the total number of rows by multiplying that number by (number_of_nodes / RF). If the ring is not balanced, you'll need to sum the results for all nodes and then divide by RF.
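To make the arithmetic concrete, here is a small sketch of both cases. The function names and the sample numbers are made up for illustration; the per-node figures would be whatever cfstats reports as the key estimate on each node.

```python
def estimate_balanced(rows_on_one_node, number_of_nodes, rf):
    """Balanced ring: one node's estimate times (number_of_nodes / RF)."""
    return rows_on_one_node * number_of_nodes / float(rf)


def estimate_unbalanced(rows_per_node, rf):
    """Unbalanced ring: sum every node's estimate, then divide by RF."""
    return sum(rows_per_node) / float(rf)


# Hypothetical 6-node ring with RF = 3.
print(estimate_balanced(1000, 6, 3))  # 2000.0
print(estimate_unbalanced([900, 1100, 1000, 950, 1050, 1000], 3))  # 2000.0
```

Either way the result is only as good as the per-node estimates that go into it.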