DataStax and Databricks Partner to Deliver Up to 100X Faster Analytics for the Fully Distributed, Highly Scalable Cassandra Database
Industry-first integration of leading open-source technologies enables companies like Ooyala, Health Market Science, and Pearson Education to deliver highly personalized online customer experiences
By integrating Apache Spark and Apache Cassandra, lightning-fast analytics are now embedded into the transaction processing of the Distributed DBMS
Partnership will deliver open source code back to the Apache Spark and Apache Cassandra communities to ensure that developers always have the most cutting-edge technologies
SANTA CLARA, Calif., May 8, 2014 — DataStax, the company that delivers Apache Cassandra to the enterprise, today announced a partnership with Databricks, the company founded by the creators of Apache Spark. As the database industry’s first partnership to integrate Spark and Cassandra, DataStax and Databricks will deliver significantly faster analytics to users of both open source technologies and enable today’s most progressive businesses to deliver highly personalized online customer experiences.
Transactional Analytics Enable Dynamic Customer Experiences
Apache Cassandra is a fully distributed, highly scalable database that allows users to create online applications that are always on and can process large amounts of data in real time. Originally developed at UC Berkeley’s AMPLab, Apache Spark is a processing engine that enables applications in Hadoop clusters to run up to 100X faster in memory, and even 10X faster when running on disk. It also provides SQL, streaming data, machine learning, and graph computation functionality out of the box as first-class citizens to simplify building end-to-end analytic workflows. Together, these technologies can significantly boost analytics performance in a transactional database and allow companies to act more quickly when serving customers’ needs.
Through this partnership, DataStax and Databricks are driving the operational database industry toward a better approach that allows companies to ingest user data at a very fast rate, and then analyze the results within the same distributed database. Responsiveness to customer needs is critical for successful online businesses, and by decreasing their “time to insights”, innovative companies such as video analytics provider Ooyala can create highly personalized experiences for their customers.
“The integration of Spark and Shark with Cassandra is enabling Ooyala to efficiently and effectively store, analyze and process every piece of data powering our industry leading video analytics platform,” said Kelvin Chu, compute and data team lead, Ooyala. “With Cassandra as the data store and Spark for data crunching, these new analytic capabilities are making the processing of large data volumes a breeze. Spark on Cassandra is giving us the power to act on things in real-time, which means faster decisions and faster results for our ever-growing business.”
Cassandra Community Helps Drive Spark Adoption
The Cassandra community is growing quickly, with global user meetups increasing 400 percent over the past year and Spark serving as a frequent topic of discussion. DataStax employees already account for more than 80 percent of all Apache Cassandra open source code contributions and, by working closely with Databricks engineers, will now contribute to the Spark community as well. The partnership will help spread adoption of both technologies while creating greater cohesiveness among users.
“The Cassandra community has rapidly adopted Spark over the past year because it provides significantly faster analytics than Hadoop,” said Martin Van Ryswyk, executive vice president, engineering, DataStax. “We look forward to working closely with Databricks to make the best Spark on Cassandra solution available to the Spark community.”
“Spark and Cassandra form a natural bond by combining blazing-fast analytics with a high-performance transactional database,” said Arsalan Tavakoli-Shiraji, head of business development, Databricks. “Additionally, all of Spark’s benefits, including a unified platform that seamlessly integrates SQL, streaming data and advanced analytics, will be natively available to Cassandra users. This is further validation of Spark’s emergence as a general Big Data processing engine with broader applications than just existing Hadoop clusters.”
Learn More At Spark Summit on June 30
To learn more about how Spark and Cassandra deliver faster analytics in a transactional database system, users can attend Van Ryswyk’s presentation at the Spark Summit, held June 30 through July 2 at The Westin St. Francis in San Francisco.
DataStax provides a massively scalable enterprise NoSQL platform to run mission-critical business applications for some of the world’s most innovative and data-intensive enterprises. Powered by the open source Apache Cassandra™ database, DataStax delivers a fully distributed, continuously available platform that is faster to deploy and less expensive to maintain than other database platforms.
DataStax has more than 500 customers in 45 countries including leaders such as Netflix, Rackspace, Pearson Education, and Constant Contact, and spans verticals including web, financial services, telecommunications, logistics, and government. Based in Santa Clara, Calif., DataStax is backed by industry-leading investors including Lightspeed Venture Partners, Meritech Capital, and Crosslink Capital. For more information, visit DataStax.com or follow us @DataStax and @DataStaxEUBuzz.
Databricks was founded by the creators of Apache Spark and is using cutting-edge technology based on years of research to build next-generation software for analyzing and extracting value from Big Data. The company believes Big Data is a tremendous opportunity that is still largely untapped, and is working to revolutionize what enterprises can do with it. Databricks is venture-backed by Andreessen Horowitz.