Company • May 10, 2022

Cassandra Myth Busters: How Hard Is It to Run Cassandra on Kubernetes?

Jeff DiNoto

It’s said to be difficult to run stateful workloads like Cassandra on Kubernetes. In this blog, we’ll take a fresh look at that claim.

In our first “myth buster” blog, we took an objective look at the difficulties said to be associated with Apache Cassandra® and weighed them against Cassandra’s capabilities and the tools that have since been developed to reduce complexity. Cassandra supports cloud-native workloads, and Stargate.io, a flexible API layer, makes it easier to work with.

Now, in this post, we’ll examine how difficult it is to run stateful workloads like Cassandra on Kubernetes.

Myth: Running stateful workloads like Cassandra on Kubernetes is difficult

Life is free and easy for stateless applications. For stateful applications, not so much, because the requirements are different. Storage needs to be persistent and follow a workload if it’s rescheduled onto another node. The workload’s identity needs to follow it as well.

The core of the challenge for stateful application developers is the nature of Kubernetes pods and the containers within them. They are ephemeral and therefore so is the data held within them. So, what are you to do?

Reality: That was then, this is now

Kubernetes and Cassandra work hand-in-hand to create a platform for a new generation of modern cloud-native applications, including those with stateful requirements. Kubernetes has played a pivotal role in the emergence of modern, cloud-native, microservice-oriented applications.

But there was a catch: Kubernetes was built for stateless applications and ephemeral workloads, and databases simply didn’t belong in that environment.

Today, it’s a different story: Kubernetes powers a new generation of stateful, data-aware applications with improved scalability, reliability, and ease of management. This is the culmination of several important innovations that arrived together:

Changes to the Kubernetes ecosystem
The creation of the Kubernetes operator for Cassandra
K8ssandra

Changes to the Kubernetes ecosystem

Various native Kubernetes resources provide the basic building blocks needed to host stateful applications within Kubernetes. These are APIs like StatefulSets, PersistentVolumes, PersistentVolumeClaims, and StorageClasses.
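To make these building blocks concrete, here is a minimal sketch in Python that assembles a StatefulSet manifest with a volume claim template and derives the stable pod and claim names Kubernetes would assign. The workload name, container image, mount path, and StorageClass name are hypothetical placeholders, not values from any real deployment.

```python
import json


def stateful_set_manifest(name: str, replicas: int, storage_class: str, size: str) -> dict:
    """Build a minimal apps/v1 StatefulSet manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Headless Service name; it gives each pod a stable DNS identity.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "db",
                        "image": "example.com/demo-db:1.0",  # hypothetical image
                        "volumeMounts": [{"name": "data", "mountPath": "/var/lib/data"}],
                    }],
                },
            },
            # One PersistentVolumeClaim is created per pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class,
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def pod_names(manifest: dict) -> list:
    """StatefulSet pods are named <statefulset-name>-<ordinal>."""
    name = manifest["metadata"]["name"]
    return [f"{name}-{i}" for i in range(manifest["spec"]["replicas"])]


def pvc_names(manifest: dict) -> list:
    """Claims are named <claim-template-name>-<pod-name>."""
    template = manifest["spec"]["volumeClaimTemplates"][0]["metadata"]["name"]
    return [f"{template}-{pod}" for pod in pod_names(manifest)]


if __name__ == "__main__":
    manifest = stateful_set_manifest("demo-db", 3, "standard", "10Gi")
    print(json.dumps(manifest, indent=2))
    print(pod_names(manifest))
    print(pvc_names(manifest))
```

Because each pod keeps its ordinal name and its own claim, a pod rescheduled onto another node can reattach to the same data, which is the property a stateless Deployment does not give you.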

The Container Storage Interface (CSI) has made it increasingly possible for third-party storage providers to bring new storage systems to the community. These offer flexibility given the varying storage requirements of many different types of applications.
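As a rough illustration of how a CSI-backed storage system is exposed to workloads, the sketch below builds a StorageClass manifest in Python. The class name and the provisioner string `csi.example.com` are hypothetical; a real cluster would use the driver name published by its storage provider, and the binding and reclaim settings shown are one reasonable choice rather than a recommendation from this post.

```python
import json


def storage_class_manifest(name: str, provisioner: str) -> dict:
    """Build a storage.k8s.io/v1 StorageClass manifest as a plain dict."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Name of the CSI driver that provisions volumes for this class.
        "provisioner": provisioner,
        # Delay volume creation until a pod is scheduled, so the volume
        # is provisioned where the pod will actually run.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
        # Keep the underlying volume when its claim is deleted.
        "reclaimPolicy": "Retain",
    }


if __name__ == "__main__":
    # "fast-ssd" and "csi.example.com" are placeholder values.
    print(json.dumps(storage_class_manifest("fast-ssd", "csi.example.com"), indent=2))
```

A PersistentVolumeClaim then selects this storage by setting `storageClassName` to the class name, so applications can request different kinds of storage without knowing which vendor sits behind them.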

Kubernetes operator for Cassandra

The Kubernetes operator for Cassandra (Cass Operator) provides a native Kubernetes experience for deploying and managing Cassandra datacenters within a Kubernetes cluster. Cass Operator builds on the Kubernetes resources described above to create flexible and robust Cassandra deployments, and it integrates and standardizes the capabilities required to deliver a cohesive Cassandra experience on Kubernetes.

Cass Operator provides a Kubernetes custom resource called the CassandraDatacenter. It acts as the abstraction layer that takes configuration provided via Kubernetes and translates it into the configuration of the Cassandra deployment it manages. It also exposes the state of the deployment, which can be inspected like other native Kubernetes resources.
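For a sense of what such a resource looks like, here is a minimal sketch in Python that builds a CassandraDatacenter manifest. The field names follow the cass-operator documentation as we understand it and should be checked against the CRD version installed in your cluster; the datacenter name, cluster name, Cassandra version, and StorageClass name are illustrative assumptions.

```python
import json


def cassandra_datacenter_manifest(
    name: str,
    cluster_name: str,
    size: int,
    server_version: str,
    storage_class: str,
    storage: str,
) -> dict:
    """Build a CassandraDatacenter custom resource as a plain dict."""
    return {
        "apiVersion": "cassandra.datastax.com/v1beta1",
        "kind": "CassandraDatacenter",
        "metadata": {"name": name},
        "spec": {
            "clusterName": cluster_name,
            "serverType": "cassandra",
            "serverVersion": server_version,
            # Number of Cassandra nodes in this datacenter.
            "size": size,
            "storageConfig": {
                # Claim spec used for each node's data volume.
                "cassandraDataVolumeClaimSpec": {
                    "storageClassName": storage_class,
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            },
        },
    }


if __name__ == "__main__":
    # All argument values below are placeholders for illustration.
    dc = cassandra_datacenter_manifest(
        name="dc1",
        cluster_name="demo-cluster",
        size=3,
        server_version="4.0.1",
        storage_class="standard",
        storage="10Gi",
    )
    print(json.dumps(dc, indent=2))
```

The point of the abstraction is that you declare intent (three nodes, a version, a storage request) and leave it to the operator to create and reconcile the underlying Kubernetes resources.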

K8ssandra eases use of Cassandra on Kubernetes

The latest iteration in this process is K8ssandra. It’s a complete data platform built on Kubernetes and Cassandra with the capabilities of Cass Operator.

K8ssandra abstracts away its component technologies and integrates essential supporting services, such as repair, backup and restore, monitoring, and data gateway APIs. K8ssandra is open source, works with the latest Cassandra releases, and continuously evolves to meet the production needs of the community.

Figure 1: K8ssandra is a DataStax-contributed open-source project that enables you to run Cassandra on Kubernetes with the tools you’ll need for production deployments.

Summary

As Kubernetes has evolved, so too has the technology surrounding Cassandra. With the development of the Cass Operator and K8ssandra, Kubernetes and Cassandra now provide a platform for modern cloud-native applications, including those that have stateful requirements.

These innovations solve some very challenging technology problems. They are proof of the commitment and talent of the open-source contributors (including DataStax engineers) who made them possible. As a result, we can now claim Cassandra as the default data tier for building and running powerful, resilient, truly cloud-native data apps on Kubernetes.

Follow the DataStax Tech Blog for more developer stories. Check out the DataStax YouTube channel for tutorials and DataStax Developers on Twitter for the latest news about our developer community.
