Company | June 10, 2020

Run Apache Cassandra on Kubernetes 15x Faster with Arrikto and DataStax

Christopher Bradford

This is a synopsis of the full joint report from DataStax and Arrikto. For all the performance results and details, please see the complete report here.

Application response times affect revenue and customer satisfaction, which makes them mission-critical. Whether your application is user-facing, performing computational analysis, or providing integration between services, no one wants to wait any longer than necessary. Thousands of applications rely on Apache Cassandra to store and retrieve their data, and DataStax Enterprise is the proven leader in delivering reliability and performance.

Recently, we introduced the DataStax Kubernetes Operator for Apache Cassandra to make it trivial to deploy and scale distributed clusters. Never happy to sit back and think the job is done, we’ve been looking at ways to further improve performance and reliability while reducing your costs.

We partnered with Arrikto and found an amazing 15x faster response time, along with a 22% saving in cost per transaction, when running on Amazon Web Services!

Who is Arrikto? They’re an innovative start-up disrupting the way we do storage in Kubernetes. Kubernetes has many great advantages, but it still lacks storage management capabilities, and many organizations are trying to solve this. However, the others all approach the problem in the same way: they add an abstraction layer between your application and the underlying disk in the form of a software-defined storage (SDS) behemoth.

Arrikto Rok delivers containerized storage data management while staying out of the critical IO data path.

How it works

Arrikto Rok is a revolutionary storage and data management solution for stateful applications on Kubernetes. The biggest difference between Arrikto’s approach and that of legacy software-defined storage solutions is that Arrikto enhances the data management capabilities of existing local storage devices rather than inserting an additional layer of abstraction. Abstraction layers always introduce latency and slow down IO. High latency and slow storage are not what you want for your data services.

Arrikto stays out of the critical data path and instead integrates data management capabilities through patented snapshot technology and new contributions to the mainline Linux kernel.

This means that when your application reads from and writes to storage, it does so on a local disk. In a cloud environment, this is even more impactful, because it means you can use the significantly faster and cheaper local NVMe options that already exist, instead of network-attached block storage like EBS.

Arrikto Rok provides managed data management, versioning, and transport, so you can ensure the highest performance for normal operations and very fast recovery. This architecture also keeps the cost of both disk storage and operations low.

Figure 1: Arrikto Rok architecture

Summary of Results

Using the same DataStax benchmark for both architectures, we found significant performance differences between the commonly used EBS model and Arrikto’s innovative approach.

Overall, using Arrikto Rok we found performance improvements across the board in Operations per Second, Read Latency, and Write Latency.

Operations per Second improved by anywhere from 10% to a peak of 55x over EBS* before the EBS test failed to complete.

Figure: Arrikto Rok vs AWS EBS

Latency improvements across Read Heavy, Write Heavy, and Balanced Read Write workloads were also significant. Read Heavy latency was 26x better with Arrikto Rok than with AWS EBS*, an improvement of over 96%!

Write-intensive latency also improved by 52%, which is roughly a 2x speed increase*.

Figure: Read latency

Figure: Write latency
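The latency figures above are quoted both as "Nx better" factors and as percentages. As a minimal sketch of how the two relate, assuming "Nx better" means the latency was divided by N (the function names here are illustrative and do not come from the report):

```python
def speedup_to_percent_reduction(speedup: float) -> float:
    """Percent latency reduction implied by an 'Nx better' factor.

    Assumes 'Nx better' means new_latency = old_latency / N.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) * 100.0


def percent_reduction_to_speedup(percent: float) -> float:
    """'Nx' factor implied by a percent reduction in latency."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


# Read Heavy: 26x better latency
print(round(speedup_to_percent_reduction(26), 1))  # 96.2

# Write intensive: 52% lower latency
print(round(percent_reduction_to_speedup(52), 2))  # 2.08
```

Under that assumption, a 26x factor corresponds to a reduction of just over 96%, and a 52% reduction corresponds to a little more than 2x, which matches the rounded figures quoted above.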

* In many of these scenarios, EBS was simply unable to complete the test. The results reported for both EBS and Arrikto Rok are the output of EBDSE (the DataStax benchmark tool) at the time the test ended.

Looking purely at the EC2 cost difference between NVMe instances and EBS-backed instances, you can save approximately 15% on your AWS bill while also seeing massive performance increases. When instances were consolidated into fewer, larger instances, we saw cost savings of up to 40%.

Drilling down even deeper, the cost per transaction ($ / OPS) fell by 22% for write-intensive workloads.
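To make the cost-per-transaction metric concrete, here is a minimal sketch of one way such a figure could be computed. The normalization to dollars per million operations, the function names, and every input number below are assumptions for illustration only; none of them are taken from the report.

```python
def cost_per_million_ops(hourly_cost_usd: float, ops_per_second: float) -> float:
    """Dollars spent per one million operations at a sustained throughput."""
    if ops_per_second <= 0:
        raise ValueError("ops_per_second must be positive")
    ops_per_hour = ops_per_second * 3600
    return hourly_cost_usd / ops_per_hour * 1_000_000


def percent_saving(baseline: float, candidate: float) -> float:
    """Percent by which candidate is cheaper than baseline."""
    return (baseline - candidate) / baseline * 100.0


# Hypothetical inputs, NOT measurements from the report:
baseline = cost_per_million_ops(hourly_cost_usd=10.0, ops_per_second=50_000)
candidate = cost_per_million_ops(hourly_cost_usd=9.0, ops_per_second=60_000)

print(round(percent_saving(baseline, candidate), 1))  # 25.0
```

The point of the sketch is that cost per transaction combines both effects: a cheaper cluster and higher throughput each lower the dollars spent per operation.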

What this means

So, what does this actually mean?

It means that when you use Arrikto Rok, you get the following business benefits:

  • Dramatically increased throughput of up to 55x for read-intensive workloads
  • Radically faster response times of up to 32x for balanced read / write workloads
  • Impactful cost saving of at least 15% on similarly configured environments
  • Massively reduced cost of data transactions by at least 22%

 

From a technical perspective, this enables you to:

  • Completely eliminate AWS EBS
  • Deliver high levels of availability across multiple Availability Zones
  • Reduce wasted staff time managing cloud disks
  • Deploy smaller clusters with the same performance and lower cost
  • Slash your software bills for Kubernetes

 

Putting it succinctly, Arrikto Rok allows you to take a true cloud approach to containerized storage and data management. Rather than simply putting a software-defined storage layer into a container, Arrikto has taken a new approach that is actually container-native.

To see the full report, including all of the performance and testing details as well as the configuration and architecture, click here.

 
