One sticking point in my adopting Cassandra for our research is the question of how to set 'milestones' or 'markers' in the data and then do a 'retrieve everything since the last marker'. In MySQL I use an auto_increment column and simply select every row whose id is greater than a known point.
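For reference, this is roughly the pattern I use today. It is only a minimal sketch: it uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere, and the table name `readings`, its columns, and the helper `fetch_since` are made up for illustration.

```python
import sqlite3


def fetch_since(conn, last_seen_id):
    """Return (rows, new_marker) for all readings with id > last_seen_id."""
    rows = conn.execute(
        "SELECT id, sensor, value FROM readings WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    # The marker only advances when something new was actually read.
    new_marker = rows[-1][0] if rows else last_seen_id
    return rows, new_marker


# Hypothetical schema: an auto-incrementing id plus the sensor payload.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, sensor TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO readings (sensor, value) VALUES (?, ?)",
    [("a", 1.0), ("b", 2.0), ("a", 3.0)],
)

# First pass: everything since marker 0.
batch1, marker = fetch_since(conn, 0)

# More raw data arrives, and the next pass only sees the new row.
conn.execute("INSERT INTO readings (sensor, value) VALUES (?, ?)", ("b", 4.0))
batch2, marker2 = fetch_since(conn, marker)
```

The whole thing depends on the id being a single, monotonically increasing counter, which is exactly what I don't see how to get in Cassandra.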
Situation: we collect and store sensor data. It's hundreds of millions of records x hundreds of locations. The data comes in as 'raw' and has to be processed several times over. To that end I keep pointers that record what I have already looked at.
How could I implement this in Cassandra, given its 'eventual consistency' model? I've pondered keys built from date/time/millisecond/sensor, and even putting a 'seen'/'not seen' bit on each piece of sensor data.
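To make the key idea concrete, here is a toy model of what I have in mind, in plain Python with no Cassandra involved. The class `SensorLog`, the `(sensor_id, ts_ms)` key layout, and the sample values are all assumptions of mine, not anything Cassandra provides.

```python
import bisect


class SensorLog:
    """Toy in-memory model: readings kept sorted by (sensor_id, ts_ms)."""

    def __init__(self):
        self._keys = []    # sorted list of (sensor_id, ts_ms) tuples
        self._values = {}  # key -> reading

    def insert(self, sensor_id, ts_ms, value):
        key = (sensor_id, ts_ms)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def since(self, sensor_id, marker_ms):
        """Readings for one sensor with a timestamp strictly after marker_ms."""
        lo = bisect.bisect_right(self._keys, (sensor_id, marker_ms))
        hi = bisect.bisect_right(self._keys, (sensor_id, float("inf")))
        return [(k[1], self._values[k]) for k in self._keys[lo:hi]]


log = SensorLog()
log.insert("s1", 1000, 0.5)
log.insert("s1", 2000, 0.7)
log.insert("s2", 1500, 9.9)

# Everything for s1 after the marker at 1000 ms.
new_rows = log.since("s1", 1000)

# Suppose processing now advances the marker to 2000 ms, and afterwards a
# reading stamped 1500 ms shows up late. A timestamp marker never sees it.
log.insert("s1", 1500, 0.6)
missed = log.since("s1", 2000)
```

The last two lines are my worry in a nutshell: with timestamps as the marker, a write that becomes visible late but carries an older timestamp falls behind the marker and is never picked up.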
Any suggestions on how this could be implemented?
