Company | April 17, 2018

DataStax Enterprise 6 – the Distributed Cloud Database Designed for Hybrid Cloud

Robin Schumacher

Each time we have a major release, I look back and think there’s no way our team can top it, and that future releases will somehow be less than what just went out the door. But every time, I’m proven wrong when our next release becomes GA, and there’s no better example of that than what we’re announcing today.

DataStax Enterprise (DSE) 6 represents a major win for our customers who require an always-on, distributed database to support their modern real-time (what we call ‘Right-Now’) applications, particularly in a hybrid cloud environment. Not only does it contain the best distribution of Apache Cassandra™, but it is also the only hybrid cloud database capable of maintaining and distributing your data in any format, anywhere (on-premises, in the cloud, multi-cloud, and hybrid cloud) in a truly data-autonomous fashion.

Let me take you on a quick tour of what’s inside the DSE 6 box, as well as OpsCenter 6.5, DataStax Studio 6, and DSE Drivers, and show you how our team has knocked yet another one out of the park.

Double the Performance

Enterprises with Right-Now applications know they can keep customers waiting for three seconds, just three seconds, before almost half of them click away to a competitor. Because these apps are constantly interacting with a database that holds the contextual info needed to produce a personalized customer experience, it’s vital that the database not play a part in exceeding those three seconds. Clearing the high bar of speed expectations set by today’s digital consumer is tough, but DSE has been doing it for some time now, and with version 6, things only get better.

DSE Advanced Performance is a new set of performance-related optimizations, technologies, and tools that dramatically increase DSE’s performance over its foundational open source components as well as its competitors. To start, new functionality designed to make Cassandra more efficient with high-compute instances has resulted in a 2x or more out-of-the-box gain in throughput for both reads and writes. Note that these speed and throughput increases apply to all areas of DSE, including analytics, search, and graph. A new diagnostic testing framework developed by DataStax helped pinpoint performance optimization opportunities in Cassandra, with more enhancements coming in future releases.

Next, DSE 6 includes our first ever advanced Apache Spark™ integration (beyond the open source work we’ve done for Spark in the past) that delivers a number of improvements, as well as a 3x query performance increase.

Lastly, loading and unloading large volumes of data is still a very pressing need for many enterprises. DSE 6 answers this call with our new DataStax Bulk Loader, which is built to rapidly move data in and out of the platform at impressive rates: up to 4x faster than current data loading utilities.

All of these performance improvements have been designed with our customers in mind so that their Right-Now applications deliver a better-than-expected customer experience by processing more orders, fielding more queries, performing faster searches, and moving more data faster than ever before. If an app’s response time exceeds three seconds, it won’t be because of DSE.

Self-Driving Operational Simplicity

In designing DSE 6, we listened to both DataStax customers and the Cassandra community. While the interests of these groups sometimes diverge, they do have a few things in common. It turns out that help with Cassandra repair operations is a top priority for both. For some, Cassandra repairs aren’t a big deal, but for others they are a PITA (pain in the AHEM). Get repair wrong in a busy and dynamic cluster, and it’s just a matter of time until you have production-threatening issues.

While we introduced an OpsCenter-based repair service some years ago, it was limited to the repair functionality available at the Cassandra level. Knowing that a server-based approach is what Cassandra users want, our talented engineering team has delivered DSE NodeSync, which essentially makes DSE ‘repair free’ by operating in a transparent and continuous fashion to keep data synchronized in DSE clusters. If you like your current repair setup, keep it. But if you want to eliminate scripting, manual intervention, and piloting repair operations, you can turn on NodeSync and be done. NodeSync works at the table level, so you have strong flexibility and granularity, and it can be enabled either with CQL or visually in OpsCenter.

Something else we’ve added to version 6 is DSE TrafficControl, which delivers advanced resiliency that ensures DSE nodes stay online under extreme workloads. Under severe concurrent request traffic, there have been cases of open source Cassandra nodes going offline due to the abnormal pressure. DSE TrafficControl has intelligent queueing, not found in open source, that prevents this from happening on DSE nodes.

Another area for improvement on which open source users and DataStax customers agree is upgrades. No technical pro I know looks forward to upgrading their database software, regardless of the vendor used. I’m happy to say we now provide automated help for upgrades with our new Upgrade service that’s a part of OpsCenter 6.5. It effortlessly handles patch upgrades by notifying you that an upgrade is available, downloading the software you need, and applying it to a cluster in a rolling restart fashion so you experience zero downtime, which frees you up to do other things.

These management improvements and others are directly aimed at increasing your team’s productivity and letting you focus on business needs rather than operational overhead. The operational simplicity allows even novice DBAs and DevOps professionals to run DSE 6 like seasoned professionals. Ultimately, that means much easier enterprise-wide adoption of data management at scale.
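For the curious, here’s a minimal sketch of what the table-level NodeSync switch mentioned above might look like when driven from CQL. It only builds the statements as strings; it doesn’t connect to a cluster. The `nodesync = {'enabled': 'true'}` table option reflects my understanding of the DSE 6 CQL syntax, so check it against the DSE documentation before relying on it. The helper names and the `shop` keyspace and its tables are hypothetical.

```python
import re

# Conservative pattern for unquoted CQL identifiers (an assumption of this
# sketch; quoted or case-sensitive names are deliberately not handled).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def nodesync_statement(keyspace: str, table: str, enabled: bool = True) -> str:
    """Build a CQL statement that toggles NodeSync for one table.

    The 'nodesync' table option is assumed from the DSE 6 documentation;
    the keyspace and table names passed in are purely illustrative.
    """
    for name in (keyspace, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    flag = "true" if enabled else "false"
    return f"ALTER TABLE {keyspace}.{table} WITH nodesync = {{'enabled': '{flag}'}};"


def nodesync_statements(keyspace, tables, enabled=True):
    """Build one statement per table, since NodeSync is set table by table."""
    return [nodesync_statement(keyspace, table, enabled) for table in tables]


if __name__ == "__main__":
    # Hypothetical keyspace and tables, printed for pasting into cqlsh.
    for statement in nodesync_statements("shop", ["orders", "customers"]):
        print(statement)
```

Because the option is per table, you can turn NodeSync on for the tables where repair hurts most and leave the rest on whatever repair setup you already have.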

Analyze (and Search) This!

Forrester ranked DataStax a leader in their Translytical Wave, and for good reason: DSE provides the translytical functionality needed by Right-Now apps that meld transactional and analytical data together. For years, DataStax has provided 100% of the development needed to freely integrate open source Spark and Cassandra, but with DSE 6, we’re kicking things up a notch (or two).

For the first time, we’re introducing our advanced Spark SQL connectivity layer, which provides a new AlwaysOn SQL Engine that automates uptime for applications connecting to DSE Analytics. This makes DSE Analytics even more capable of handling around-the-clock analytics requests and better able to support interactive end-user analytics, while leveraging your existing SQL investment in tools (e.g. BI, ETL) and expertise.

I’d also like to give a shout-out to DSE Analytics Solo, a recently introduced subscription option that gives you a more cost-effective way to isolate analytic workloads in order to achieve predictable application performance.

We also have great news for analytics developers and others who want to directly query and interact with data stored in DSE Analytics. DataStax Studio 6 provides notebook support for Spark SQL, which means you now have a visual and intelligent interface and query builder that helps you write Spark SQL queries and review the results. That’s a huge time saver! Plus, you can now export and import any notebook (graph, CQL, Spark SQL) for easy developer collaboration, as well as undo notebook changes with a new versioning feature.

Finally, let’s not forget the critical role search functionality plays in apps that rely on contextual and converged data. DSE Search has upped its game in this area by delivering CQL support for common search queries, such as those that use LIKE, IN, range searches, and more.
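To give a feel for the kinds of search queries mentioned above, here’s a small sketch that builds LIKE, IN, and range queries as plain CQL strings. It assumes a search index already exists on the table. The `store.products` table, its columns, and the helper functions are all hypothetical, and exactly which operators and column types are supported should be confirmed in the DSE Search documentation.

```python
def cql_literal(value):
    """Render a Python string or number as a CQL literal (sketch only)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # CQL escapes a single quote inside a string by doubling it.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


def like_query(table, column, pattern):
    """Pattern match, e.g. names starting with a prefix."""
    return f"SELECT * FROM {table} WHERE {column} LIKE {cql_literal(pattern)};"


def in_query(table, column, values):
    """Match any value from a list."""
    rendered = ", ".join(cql_literal(v) for v in values)
    return f"SELECT * FROM {table} WHERE {column} IN ({rendered});"


def range_query(table, column, low, high):
    """Half-open range: low <= column < high."""
    return (
        f"SELECT * FROM {table} WHERE {column} >= {cql_literal(low)} "
        f"AND {column} < {cql_literal(high)};"
    )


if __name__ == "__main__":
    # Hypothetical table and columns, printed for pasting into cqlsh.
    print(like_query("store.products", "name", "data%"))
    print(in_query("store.products", "category", ["db", "cloud"]))
    print(range_query("store.products", "price", 10, 50))
```

In a real application you’d use your driver’s prepared statements rather than string building; the strings here are just to show the shape of the queries.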

Supporting Distributed Hybrid Cloud

Over 60% of DataStax customers currently deploy DSE in the cloud, which isn’t surprising given that our technology has been built from the ground up with limitless data distribution and the cloud in mind. Customers run DSE today on AWS, Azure, GCP, Oracle Cloud, and others, as well as on private clouds, of course. DataStax Managed Cloud, which currently supports both AWS and Azure, will be updated to support DSE 6, so all the new functionality in our latest release is available in managed form. Whether fully managed or self-managed, our goal is to provide you with multi-cloud and hybrid cloud flexibility that supplies all the benefits of a distributed cloud database without public cloud lock-in.

Yes, There’s Actually More…

I’d be remiss if I didn’t also mention the additions to our DSE Advanced Security package, which now contains new separation-of-duties capabilities and unified authentication support for DSE Analytics, as well as the backup enhancements we’ve made for cloud operations and all the updates to our DSE drivers. As I mentioned at the beginning of this post, our team always delivers. With DSE 6, we want you to enjoy all the heavy-lifting advantages of Cassandra with none of the complexities, and also get double the power. Downloads, free online training, and other resources are now available, so give DSE 6 a try (it’s also now available for non-production development environments via Docker Hub) and let us know what you think.
