DataStax Blog

Why should I use Cassandra?

By Billy Bosworth - January 9, 2012

Whether I’m speaking with the press, analysts, potential customers, or partners, the “magic bullet” question is inevitably heard ricocheting off a rock in the background at some point during the conversation.  Big data is not something most people absorb quickly, so looking for that “one thing” is a natural inclination.

For example, to answer the question “Why should I use Cassandra?” I could confidently tell you something like: “For time-series data.”  That’s a huge use case in the big data world, and Cassandra handles it better than anyone out there, but it’s just one use case of many where Cassandra tops the list of options. Also, the inevitable next question is: “Why?”

Regardless of the specific use case, when choosing a database it’s almost always a combination of things that matter.  The good news is, it’s a rather small list of things when you really boil it down.

In a recent press release, I wrote the following paragraph touching on the key differentiators of Cassandra:

… The peer-to-peer design allows for high performance with linear scalability and no single points of failure, even across multiple data centers.  Combine this with native optimization for the cloud and an extremely robust data model and Cassandra clearly stands apart from the competition for enterprise, mission-critical systems.

Those who know me well know that I don’t do “marketing hype.”  I’ve been developing against, administering, or building tools for relational databases for 20 years.  The computer science guy in me will likely never die, which means I see these technologies with a realistic eye.  And with that realism in mind, I can tell you I thoroughly believe every word of what I wrote when it comes to the power of Cassandra for solving big data needs.  I still stand in awe of this technology, and it is incredibly exciting to watch the world see it in action for the first time.

Over the next several days I want to add some color to the points in that paragraph, as they represent critical gating factors when choosing which backend storage system is right for your big data projects.  By doing so, I will try to shed some light on why it isn’t just “one thing” that matters and why the things that do matter are often inextricably linked.  Once you understand that, it will be much easier to grasp how use cases sit atop these key, foundational aspects of Cassandra.

Stay tuned…



Comments

  1. Al says:

    I think that distributed databases solve a problem that a lot of companies with large datasets have had to solve independently in the past. Cassandra has an approach that hybridizes the Bigtable and Dynamo models, where a lot of its competitors chose to take one path or the other. Over the Bigtable clones, Cassandra has huge high-availability advantages, and no single point of failure (possible because of the eventually consistent approach). When compared to the Dynamo adherents, Cassandra has the advantage of a more advanced data model, allowing for a single “row” to contain billions of column/value pairs: enough to fill a machine. You also get efficient range queries for the top-level key, and even within your values.
    Source: Why does large Social Network projects switch to use Cassandra instead of Mysql?

  2. Prem Kumar says:

    The most important question troubling architects is when a NoSQL database like Cassandra is really a better fit than an RDBMS. Since atomicity in Cassandra is limited to a single row key, if your application requires atomicity spanning multiple records, Cassandra may not be a straightforward choice, and you may be better off with a traditional RDBMS. If you still want to use Cassandra for transaction-heavy systems, you have to redesign the data model so that all the data that must change atomically sits under a single row key (a wide-column design strategy). It will be a challenge.
