Guide · Feb 02, 2022

How to run Cassandra on AWS


Apache Cassandra® is a leading NoSQL database that enables developers to build massively scalable, geo-distributed data applications with zero downtime. Cassandra is the database of choice for the most demanding applications on the internet, used by Netflix, Uber, Pinterest, and thousands of the world’s leading engineering teams.

This guide will help you understand the best managed and self-managed ways to run Cassandra on Amazon Web Services (AWS).


Three ways to run Cassandra on AWS

Managed Service: Using Astra DB on AWS

The fastest way to use Cassandra on AWS is with Astra DB, a database-as-a-service built on Cassandra, Kubernetes, Prometheus, Envoy, and other cutting-edge open source technologies. Astra DB simplifies cloud-native application development and requires no operations or self-management. It reduces deployment time from weeks to minutes, delivering an unprecedented combination of serverless autoscaling, pay-as-you-go pricing, and an open source skillset you can take with you to any cloud provider. How does Astra DB make running on AWS easy?

Why Astra DB?

Global Scale
  • Scale up to petabytes of data without impacting performance
  • Colocate data and applications anywhere in the world - without compromising performance, availability or accessibility
  • Database can be replicated across multiple data centers, availability zones, even multi-region - no leader/follower troubleshooting headaches
  • Compute and storage are separated enabling apps to scale cost effectively or scale down to zero automatically
  • Tunable consistency can adjust the tradeoff between availability and consistency of data on Cassandra nodes
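The tunable-consistency tradeoff mentioned above can be illustrated with a little arithmetic. The sketch below is not Astra DB or driver code; it only encodes the general Cassandra rule of thumb that a read is guaranteed to overlap the latest write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (RF). The function names and the replication factor of 3 are assumptions chosen for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, for illustration only
    q = quorum(rf)
    # QUORUM reads with QUORUM writes: the replica sets always overlap.
    print("QUORUM/QUORUM overlaps:", reads_see_latest_write(q, q, rf))
    # ONE reads with ONE writes: faster and more available, no overlap guarantee.
    print("ONE/ONE overlaps:", reads_see_latest_write(1, 1, rf))
```

Lower consistency levels favor availability and latency, while quorum-style levels favor consistency; which side to lean toward depends on the application.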
No Operations
  • True serverless autoscaling eliminates manual configuration changes and guesswork on database sizing
  • Deploy in 5 minutes or less: no provisioning, install, or configuration
  • Fully managed database and OS updates and upgrades
  • Operate in any of Astra’s globally available AWS regions and availability zones
  • IaaS (Infrastructure-as-a-Service) failures handled gracefully by K8s operator to keep databases healthy
  • High availability from automatic self-healing at the database level
  • Fault tolerance: data is automatically replicated to multiple nodes and across multiple data centers to ensure zero data loss
  • Single-region deployments with a 99.9% SLA and multi-region deployments with a 99.99% SLA minimize both downtime and the need for site-reliability engineering
  • Automated anti-entropy repair procedures
  • Automated hourly backup, with snapshot storage for 20 days
  • Integrated Grafana monitoring system to provide accurate and up to date measurement information about health and performance
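To put the 99.9% and 99.99% availability figures above in perspective, the following sketch converts an availability percentage into a downtime budget. It assumes a 30-day measurement window purely for illustration; the actual SLA terms, measurement period, and exclusions are defined by the provider and are not reproduced here. The function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Downtime budget in minutes for a given availability percentage.

    The 30-day default window is an assumption, not a term of any SLA.
    """
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return (100 - sla_percent) / 100 * days * MINUTES_PER_DAY


if __name__ == "__main__":
    for sla in (99.9, 99.99):
        budget = allowed_downtime_minutes(sla)
        print(f"{sla}% over 30 days allows about {budget:.2f} minutes of downtime")
```

Under these assumptions, 99.9% corresponds to roughly 43 minutes per 30 days and 99.99% to roughly 4 minutes.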
DBaaS as APIs
  • Skip defining the schema upfront, use Astra DB like a JSON Document store using the Document API
  • Go schema-first with familiar REST, GraphQL, gRPC APIs and ramp up quickly
  • Drive adoption of cloud-native architectures using a microservices and API first approach
Developer Productivity
  • Absolutely no low-level AWS infrastructure knowledge required to deploy: name your database and keyspace, then select a region and you are done
  • Robust, cloud-enabled language drivers in all major programming languages
  • JDBC/ODBC drivers for BI and other tool integration
  • Popular framework integrations (Spring Boot, Spring Data, Quarkus, and more)
  • Spark Cassandra Connector
  • Built in CQLSH console
  • Postman Collection for Astra DB APIs
  • DevOps API, Terraform Provider, Ansible Playbook for CI/CD pipeline automation
  • JetBrains IDE Plugin: Astra DB Data Explorer
Enterprise Security
  • Achieve data sovereignty without replication headaches with multi-region deployments
  • SOC2 Compliance
  • Sophisticated authentication and authorization with role based access
  • Client connections use two-way certificate validation for VPN-level security from client to database (mTLS).
  • All data is encrypted at rest and in motion
  • AWS PrivateLink connectivity connects apps in your VPC to Astra DB
  • JSON Web Token (JWT) based authentication to ensure secure connectivity to your Astra DB database

Get Started with Astra DB on AWS

Simply register here with GitHub, Google ID, or email and get 80 GB storage and up to 20M read/write ops free every month. No credit card is required for the free plan.

Self Managed Service: Cassandra on AWS EC2

Some IT organizations require complete control over their systems, or are already set up for self-managed software. Self-managed virtual machines give you that control, but it comes with all the associated effort and expense, a tradeoff that should be considered carefully.

Get Started with Cassandra on AWS EC2

Self Managed Service: K8ssandra on AWS EKS

K8ssandra is a cloud native distribution of Apache Cassandra® that runs on Kubernetes and AWS EKS. K8ssandra provides an ecosystem of tools to provide richer data APIs and automated operations alongside Cassandra. This includes metrics monitoring to promote observability, data anti-entropy services to support reliability, and backup / restore tools to support high availability and disaster recovery. As part of K8ssandra’s installation process, all of these components are installed and wired together, freeing you from having to perform the tedious plumbing of components:

  • Apache Cassandra
  • Stargate, the open-source data gateway
  • Cass-operator, the Kubernetes Operator for Apache Cassandra
  • Reaper for Apache Cassandra, an anti-entropy repair feature (plus reaper-operator)
  • Medusa for Apache Cassandra for backup and restore (plus medusa-operator)
  • Metrics Collector for Apache Cassandra, with Prometheus integration, and visualization via pre-configured Grafana dashboards

Get Started with K8ssandra on AWS EKS

Learn more about setup for AWS EKS in the K8ssandra documentation.

Which is the most efficient way of running Cassandra on AWS?

The answer depends on a host of factors: your requirements, your existing investments, your staff, and their skills.

In general, we recommend Astra DB for the vast majority of Cassandra use cases. You can be ready to go in minutes, freed from operational, security, and scalability concerns.

All but the most demanding, security-conscious applications will be served by environments like Astra DB that are already compliant with common security standards, saving months or even years of effort, to say nothing of expense.

Startups and enterprises alike who do not want to, or cannot, dive deep into database administration and configuration, should opt for Astra DB.

Self-managing databases on Kubernetes is less efficient than DBaaS, but may be driven by preexisting organizational proficiency with Kubernetes. Kubernetes-based options like AWS EKS and K8ssandra not only make running system-of-engagement databases on Kubernetes possible, but can significantly ease the burden on SRE/Ops teams.

Self-managing on IaaS is the least efficient option relative to DBaaS, but may be driven by a need to self-manage for regulatory reasons or the need to interoperate with proprietary or custom systems. Alternatively, the choice may stem from the nature of an existing application that is being migrated to the cloud: it may simply not require, or be ready for, a cloud-native architecture.


Can I use Apache Cassandra with AWS?

Yes, you can use Apache Cassandra on AWS. Cassandra is available on AWS fully-managed through Astra DB, or self-managed via AWS Quick Start.

How do I run Cassandra on AWS?

To deploy Cassandra on AWS, you can either:

  1. Set up a new cluster on Astra DB or migrate an existing self-managed Cassandra deployment to AWS.
  2. Use the AWS Quick Start to build a new self-managed Cassandra cluster yourself.

How do I access Cassandra on AWS?

Once you have deployed your Cassandra cluster on AWS, either by using Astra DB or by creating a self-managed cluster, use the cluster’s connection string to access it, either from the command line or through a Cassandra driver in your language of choice.
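As a rough illustration of driver-based access, here is a minimal sketch using the open source DataStax Python driver (the `cassandra-driver` package). It assumes that connection details for a self-managed cluster take the form of contact points and a port, and that an Astra DB database is reached through a downloaded secure connect bundle plus a client ID and secret. The function names, and all hosts, paths, and credentials passed to them, are placeholders rather than values from this guide.

```python
def connect_self_managed(contact_points, port=9042):
    """Open a session against a self-managed cluster (for example on EC2 or EKS)."""
    # Imported lazily so this module still loads where the driver is not installed.
    from cassandra.cluster import Cluster

    # Assumption: no authentication is configured; a secured cluster would
    # also need an auth_provider argument.
    cluster = Cluster(contact_points=contact_points, port=port)
    return cluster.connect()


def connect_astra(bundle_path, client_id, client_secret):
    """Open a session against Astra DB using a secure connect bundle."""
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    cluster = Cluster(
        cloud={"secure_connect_bundle": bundle_path},
        auth_provider=PlainTextAuthProvider(client_id, client_secret),
    )
    return cluster.connect()


def server_version(session):
    """Run a trivial query to confirm that the connection works."""
    row = session.execute("SELECT release_version FROM system.local").one()
    return row.release_version
```

A typical call would look like `server_version(connect_self_managed(["10.0.0.10"]))`, where the IP address is a placeholder for one of your own nodes.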

Is Astra DB Free on AWS?

Astra DB has a free tier of $25 in free credits each month, giving developers up to 80 gigabytes of free storage or up to 20 million reads/writes each month. Astra DB is serverless, so you are only billed for what you use. If you’re managing your own cluster, your AWS pricing for the resources it uses will apply.

What is DataStax Astra DB?

Astra DB is a fully managed, serverless, multi-cloud database as a service powered by Apache Cassandra®.

Can I buy Astra DB on AWS Marketplace?

Yes, Astra DB is available on AWS Marketplace. There are no minimums and no upfront commitment required; your Astra DB cost will be billed to your AWS account.

Features of Astra DB managed Cassandra on AWS

Serverless Database Built on Apache Cassandra®

Scale database resources in and out on demand to match application requirements and traffic so that you pay only for what you use. Put the power of Cassandra in the hands of every developer without ever worrying about managing the infrastructure.

Global Scale

Data replication across multiple data centers, availability zones, and regions. Scale up to petabytes of data without impacting performance. The Astra service is resilient and highly available to minimize both downtime and the need for site-reliability engineering.

Enterprise Security

All data is encrypted at rest and in motion. Sophisticated authentication and authorization with role-based access. Client connections use two-way certificate validation for VPN-level security from client to database. Private connectivity options like VPC peering are available upon request. JSON Web Token (JWT) based authentication to ensure secure connectivity to your Astra DB database.

No Operations

Fully managed database and OS updates and upgrades. IaaS (Infrastructure-as-a-Service) failures handled gracefully by K8s operator to keep databases healthy. Eliminate anti-entropy repair procedures. Auto scaling eliminates manual configuration changes and guesswork on database sizing.
