DataStax Developer Blog

Cassandra 2.1: now over 50% faster

By Jonathan Ellis - July 16, 2014

Besides improvements to compaction and repair, 2.1 brings dramatic improvements to the core read and write paths. The two most important changes were:

  1. Adding response grouping to the CQL dispatcher, on a principle similar to Nagle’s algorithm.
  2. Introducing the SharedExecutorPool for worker threads on replicas.
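
The grouping idea in item 1 can be illustrated with a small sketch. This is not Cassandra's dispatcher code: the class name `ResponseBatcher`, the batch size, and the delay are all hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow. As with Nagle's algorithm, small responses are held briefly and sent together, trading a little latency for fewer flushes.

```python
class ResponseBatcher:
    """Toy Nagle-style batcher: hold responses, then flush them as one group."""

    def __init__(self, max_batch=4, max_delay=0.001):
        self.max_batch = max_batch    # flush once this many responses are pending
        self.max_delay = max_delay    # or once the oldest has waited this long (seconds)
        self.pending = []
        self.oldest = None            # arrival time of the oldest pending response
        self.flushes = []             # each entry stands in for one network flush

    def add(self, response, now):
        if not self.pending:
            self.oldest = now
        self.pending.append(response)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def tick(self, now):
        """Call periodically; flushes a partial batch that has waited too long."""
        if self.pending and now - self.oldest >= self.max_delay:
            self.flush()

    def flush(self):
        if self.pending:
            self.flushes.append(self.pending)
            self.pending = []
            self.oldest = None


batcher = ResponseBatcher(max_batch=4, max_delay=0.001)
for i in range(6):
    batcher.add(f"resp-{i}", now=0.0)   # six responses arrive back to back
batcher.tick(now=0.002)                 # timer fires after the delay has passed

print(len(batcher.flushes))             # number of flushes used for six responses
```

In this run the first four responses go out as one full batch and the remaining two go out together when the timer fires, so six responses cost two flushes instead of six.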

On reads, these combine for a 75% performance boost over 2.0 CQL, and 160% over Thrift:

[Chart: read performance comparison]

On writes, we see a similar improvement — 95% better than 2.0 CQL, and 150% better than Thrift:

[Chart: write performance comparison]
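
Because "N% better" means a throughput ratio of 1 + N/100, the figures quoted above also imply how far 2.0 CQL sits ahead of Thrift. A minimal sketch of that arithmetic, assuming both percentages in each pair are measured from the same 2.1 CQL result (the helper names are illustrative, and the post does not say which Thrift run is the baseline):

```python
def ratio(percent_better):
    """Convert 'N% better' into a throughput ratio."""
    return 1 + percent_better / 100


def implied_gain(vs_cql_20, vs_thrift):
    """Percent by which 2.0 CQL beats Thrift, given 2.1 CQL's gain over each."""
    return (ratio(vs_thrift) / ratio(vs_cql_20) - 1) * 100


reads = implied_gain(vs_cql_20=75, vs_thrift=160)
writes = implied_gain(vs_cql_20=95, vs_thrift=150)
print(f"reads:  2.0 CQL is about {reads:.0f}% ahead of Thrift")
print(f"writes: 2.0 CQL is about {writes:.0f}% ahead of Thrift")
```

Under that assumption, 2.0 CQL comes out roughly 49% ahead of Thrift on reads and roughly 28% ahead on writes.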

But wait! Why is write performance so inconsistent in 2.1? Writes mostly cruise along at over 190k ops/s but frequently dip as low as 120k, so the average works out to only about 180k.
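
As a rough sanity check on those figures, one can ask what share of the time writes would have to spend in a dip for the average to land near 180k. This is a deliberately crude two-level model (throughput is either at the cruising rate or at the lowest dip rate, with nothing in between), so treat the result as an order-of-magnitude estimate only.

```python
def dip_fraction(cruise, dip, average):
    """Fraction of time spent at `dip` so the time-weighted mean equals `average`."""
    # average = cruise * (1 - p) + dip * p   =>   p = (cruise - average) / (cruise - dip)
    return (cruise - average) / (cruise - dip)


p = dip_fraction(cruise=190_000, dip=120_000, average=180_000)
print(f"{p:.1%} of the time at the dip rate")
```

Since cruising throughput is described as "over" 190k and not every dip reaches the minimum, the real share of time spent below the cruising rate could differ noticeably from this estimate.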

It turns out that after writing a custom in-memory BTree to replace SnapTreeMap and removing the switchlock contention, writes on this 32-core VM are now bottlenecked on the (single) commitlog disk. We confirmed this by testing with durable writes disabled, but that’s not a very useful scenario for production. So we’re prioritizing commitlog compression and support for multiple commitlog volumes.
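
To see why those two features target the bottleneck, consider a back-of-the-envelope model in which every write must be appended to the commitlog, so sustained write throughput cannot exceed what the commitlog volumes can absorb. The disk bandwidth, mutation size, and compression ratio below are made-up illustrative numbers, not measurements from this test, and the model ignores sync latency and batching.

```python
def commitlog_ceiling(disk_mb_per_s, mutation_bytes, compression_ratio=1.0, volumes=1):
    """Upper bound on writes/s when every mutation is appended to the commitlog.

    compression_ratio is uncompressed size / compressed size (1.0 = no compression).
    Assumes the volumes are independent and load is spread evenly across them.
    """
    bytes_per_s = disk_mb_per_s * 1_000_000 * volumes
    bytes_on_disk_per_mutation = mutation_bytes / compression_ratio
    return bytes_per_s / bytes_on_disk_per_mutation


# Hypothetical figures, for illustration only.
base = commitlog_ceiling(disk_mb_per_s=150, mutation_bytes=1000)
compressed = commitlog_ceiling(disk_mb_per_s=150, mutation_bytes=1000, compression_ratio=2.0)
two_volumes = commitlog_ceiling(disk_mb_per_s=150, mutation_bytes=1000, volumes=2)

print(f"single volume:    {base:,.0f} writes/s")
print(f"with compression: {compressed:,.0f} writes/s")
print(f"two volumes:      {two_volumes:,.0f} writes/s")
```

In this simplified model, halving the bytes written per mutation and doubling the number of volumes each double the ceiling, which is why either change should relieve a commitlog-bound workload.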

Final thoughts:

  • CQL is delivering on its promise of a substantial performance boost over Thrift. Even if you only care about performance and not the productivity benefits of CQL, I strongly recommend against Thrift unless you are maintaining a legacy code base.
  • Some environments will benefit more than others from the improvements here. EC2 seems particularly happy with the new executor pool; other hardware may see different gains. Our two-year-old, 8-core test machines with 6 SATA disks saw “only” a 50% improvement on reads and a 60% improvement on writes.


Comments

  1. Alex White says:

    People are sticking with Thrift because CQL is a totally different paradigm. If CQL offered a way to access schema-less Column Families we’d use it in a heartbeat.

    1. Jonathan Ellis says:

      Sounds like a misunderstanding of what “schema-less column families” actually do. See http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows.

      1. Nishant says:

        Back in 2010, it was hard to bend a SQL mind to think the columnar Thrift way. Now that they understand that paradigm, it is hard to undo it and say, hey, if you create composite keys, you get your wide row beneath. But then they say it is too complex, plus they do not get that “feel”. Then you open cassandra-cli and show them what’s beneath. Then some of them go back, turn their monitor away, and write Thrift calls. JK. :)

        1. Brice says:

          That’s not always the case. I’m a beginner with Cassandra, and I agree CQL is simple, but it hides the daemon, wide rows, queues, etc., which you cannot really understand if you aren’t familiar with Cassandra’s internals.

          CQL may work for small data and apps without real latency issues. But the strong point of Cassandra, and one of its main attractions over other NoSQL data stores, is its ability to scale well, i.e. in environments where there are usually constraints on latency and massive amounts of data.

  2. Edward Capriolo says:

    thrift hsha?

  3. Rao says:

    Do we know when 2.1 is scheduled for release?

  4. Gowri Shankar H says:

    Just wondering, does Cassandra support replication from DB2 or Oracle DB to Cassandra? If yes, what would the steps or procedure be? If not, is there any workaround?

  5. flavio says:

    Am I missing something?
    The percentages presented don’t correspond to the chart representations.
    At least not vis-à-vis: Thrift-to-Thrift & CQL-to-CQL

    1. Richard Dawe says:

      The percentages seem to be comparing 2.0.1-rc3 cql vs. everything else, i.e.:

      “95% better than 2.0 CQL” is 2.1 CQL vs. 2.0 CQL

      “150% better than Thrift” is 2.1 CQL vs. 2.0 Thrift or 2.1 Thrift
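
The point raised in the first comment thread, that a CQL table with a composite primary key is stored as a wide row underneath, can be sketched with a toy model. This is a simplification of the storage layout of that era, not Cassandra code: the table, the column names, and the `insert_row` helper are all hypothetical. Each CQL row becomes a group of cells inside one partition, and each cell name combines the clustering value with the CQL column name.

```python
from collections import defaultdict

# Hypothetical CQL table being modelled:
#   CREATE TABLE events (
#       sensor_id text, ts int, temp int, status text,
#       PRIMARY KEY (sensor_id, ts)   -- sensor_id: partition key, ts: clustering column
#   );

storage = defaultdict(dict)   # partition key -> {(clustering value, column name): value}


def insert_row(sensor_id, ts, **columns):
    """Store one CQL row as cells inside the partition's single wide row."""
    for name, value in columns.items():
        storage[sensor_id][(ts, name)] = value


insert_row("s1", 100, temp=21, status="ok")
insert_row("s1", 200, temp=22, status="ok")
insert_row("s2", 100, temp=19, status="low")

# Two CQL rows for sensor s1 end up as four cells in one wide row,
# ordered by clustering value and then by column name.
for cell_name, value in sorted(storage["s1"].items()):
    print(cell_name, value)
```

Viewed this way, the "schema-less" wide row is still there; CQL presents it as several logical rows that share a partition key.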
