Planet Cassandra

Fast Analytics on Operational Data? Cassandra and Spark Have You Covered

By Planet Cassandra, May 8, 2014

In the Cassandra community over the last year, we have seen rapid adoption of Apache Spark. In many ways, we have Evan Chan (@evanfchan), previously of Ooyala, to thank for this.

Last June at Cassandra Summit 2013, Evan gave a presentation about his work with Cassandra, Spark, and Shark, which was all about real-time analytics on Cassandra data.

So, why is the Cassandra community adopting Spark for analytics? As Brian O’Neill (@boneill42) of Health Market Science puts it, “Sure, you could go grab Hadoop, and be locked into articulating analytics/transformations as MapReduce constructs. But that just makes people sad. Instead, I’d recommend Spark. It makes people happy.”

The Cassandra and Spark communities are going to be even happier with today’s news: DataStax and Databricks, the company driving Apache Spark, have announced a partnership to make it easier to integrate Cassandra and Spark. The resulting code will be contributed back to the open source community.

Chanan Braunstein of Pearson Education sums up the benefits of such a partnership nicely: “The new Spark/Shark functionality on Cassandra is giving our users a scalable and high-performance way to quickly analyze our constantly growing data set. By moving from a relational database, this new functionality will allow us to deliver real-time data analytics where before our users relied on time delayed reports.”

If you’re interested in learning more about Cassandra and Spark together, be sure to attend Spark Summit 2014 from June 30th to July 2nd and Cassandra Summit 2014 on September 10th and 11th, both hosted in San Francisco, CA.




