How to run Cassandra on Google Cloud

Apache Cassandra™ is a leading NoSQL database, enabling developers to build massively scalable, geo-distributed data applications with zero downtime. Cassandra is the database of choice for the most demanding applications on the internet, used by Netflix, Uber, Pinterest, and thousands of the world’s leading engineering teams.

This guide will help you understand the best managed and self-managed ways to run Cassandra on Google Cloud.

Three ways to run Cassandra on Google Cloud

Managed Service

DataStax Astra DB is a cloud-native DBaaS (database as a service) powered by Apache Cassandra and managed by DataStax.

Self-Managed GKE

Deploy K8ssandra.io to Google Kubernetes Engine (GKE) via one of our convenient Helm charts.

Self-Managed Virtual Machines (VMs)

Download, install, configure, and operate your own open source Cassandra cluster on Google Compute Engine (GCE) virtual machines, or use a public image you trust and understand.

Managed Service: Using Astra DB on Google Cloud

The fastest way to use Cassandra on Google Cloud is with Astra DB, a database-as-a-service built on Cassandra, Kubernetes, Prometheus, Envoy, and other open source projects. Astra DB simplifies cloud-native application development and requires no operations or self-management. It reduces deployment time from weeks to minutes, combining serverless autoscaling, pay-as-you-go pricing, and an open source skillset you can take with you to any cloud provider. How does Astra DB make running on Google Cloud easy?

Google Cloud Marketplace

Consolidated billing, as well as assurance that Astra DB spending counts towards existing committed use discounts.

Google Cloud Functions (GCF)

Astra DB works smoothly with your serverless functions. Your Astra database automatically comes with data access APIs, making integration with your GCF straightforward. Get the full value of autoscaling by pairing autoscaling functions with an autoscaling database.

Google Compute Engine

Works with your applications deployed on GCE in any language, via traditional language drivers or the APIs mentioned above.

Google Compute Engine

IaaS failures are handled gracefully by a Kubernetes (K8s) operator to keep databases healthy.

Google Cloud VPC & Google Cloud Private Service Connect

Connect apps in your VPC to Astra DB via Private Service Connect, in the console or via API. No more databases exposed on public networks.

Google Cloud Regions

Deploy to any of our available Google Cloud regions in the United States, Europe, or Asia.

Why Astra DB?

Global Scale

  • Scale up to petabytes of data without impacting performance
  • Colocate data and applications anywhere in the world - without compromising performance, availability, or accessibility
  • Databases can be replicated across multiple data centers, availability zones, and even multiple regions - no leader/follower troubleshooting headaches
  • Compute and storage are separated, enabling apps to scale cost-effectively or scale down to zero automatically
  • Tunable consistency can adjust the tradeoff between availability and consistency of data on Cassandra nodes
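
To make the tunable-consistency tradeoff concrete, here is a minimal sketch of the quorum arithmetic behind it. The function names are illustrative and not part of any Cassandra driver. The rule it encodes is the standard one: with a replication factor RF, a read touching R replicas is guaranteed to see the latest write acknowledged by W replicas whenever R + W > RF.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at consistency level QUORUM."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(q)                                  # 2
    print(is_strongly_consistent(q, q, rf))   # True: QUORUM reads + QUORUM writes
    print(is_strongly_consistent(1, 1, rf))   # False: ONE + ONE favors availability
```

Lower consistency levels need fewer replicas to answer, so requests succeed even when some nodes are unreachable, at the cost of possibly reading stale data.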

No Operations

  • True serverless autoscaling eliminates manual configuration changes and guesswork on database sizing
  • Deploy in 5 minutes or less: no provisioning, install, or configuration
  • Fully managed database and OS updates and upgrades
  • Operate in any of Astra’s globally available Google Cloud regions and availability zones
  • IaaS (Infrastructure-as-a-Service) failures handled gracefully by K8s operator to keep databases healthy
  • High availability from automatic self-healing at the database level
  • Data is automatically replicated to multiple nodes and across multiple data centers to create high fault tolerance and ensure zero data loss
  • Single-region deployments come with a 99.9% SLA and multi-region deployments with a 99.99% SLA, minimizing both downtime and the need for site-reliability engineering
  • Automated anti-entropy repair procedures
  • Automated hourly backup, with snapshot storage for 20 days
  • Integrated Grafana monitoring system to provide accurate and up to date measurement information about health and performance
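
As a rough illustration of what the 99.9% and 99.99% SLA figures mean in practice, the sketch below converts an availability percentage into a downtime budget. The 30-day window is an assumption made for the arithmetic; the actual SLA terms define how availability is measured.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Maximum downtime, in minutes, that still meets the SLA over `days` days."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return round(total_minutes * (100 - sla_percent) / 100, 2)


if __name__ == "__main__":
    print(allowed_downtime_minutes(99.9))    # 43.2 minutes per 30 days
    print(allowed_downtime_minutes(99.99))   # 4.32 minutes per 30 days
```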

DBaaS as APIs

  • Skip defining the schema upfront, use Astra DB like a JSON Document store using the Document API
  • Go schema-first with familiar REST, GraphQL, gRPC APIs and ramp up quickly
  • Drive adoption of cloud-native architectures using a microservices and API first approach
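
As a sketch of the schemaless, document-style access described above, the snippet below builds (but does not send) an HTTP request that would store a JSON document. The URL shape and the `X-Cassandra-Token` header are assumptions based on Astra DB's Stargate-based Document API, and the database ID, region, keyspace, collection, and token are placeholders; check the current Astra DB documentation before relying on them.

```python
import json
import urllib.request


def build_document_request(db_id: str, region: str, keyspace: str,
                           collection: str, token: str,
                           document: dict) -> urllib.request.Request:
    """Build a POST request that would add `document` to a collection."""
    # Assumed endpoint layout; verify against the Astra DB docs.
    url = (f"https://{db_id}-{region}.apps.astra.datastax.com"
           f"/api/rest/v2/namespaces/{keyspace}/collections/{collection}")
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Cassandra-Token": token,  # assumed auth header name
        },
    )


if __name__ == "__main__":
    req = build_document_request(
        "my-db-id", "us-east1", "my_keyspace", "users",
        "my-app-token", {"name": "Ada", "tags": ["admin"]},
    )
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```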

Developer Productivity

  • Absolutely no low-level Google Cloud infrastructure knowledge required to deploy: name your database and keyspace, then select a region and you are done
  • Robust, cloud-enabled language drivers in all major programming languages
  • JDBC/ODBC drivers for BI and other tool integration
  • Popular framework integrations (Spring Boot, Spring Data, Quarkus, and more)
  • Spark Cassandra Connector
  • Built in CQLSH console
  • Postman Collection for Astra DB APIs
  • DevOps API, Terraform Provider, Ansible Playbook for CI/CD pipeline automation
  • Astra DB Data Explorer JetBrains IDE Plugin

Enterprise Security

  • Achieve data sovereignty without replication headaches with multi-region deployments
  • SOC2 Compliance
  • Sophisticated authentication and authorization with role based access
  • Client connections use two-way certificate validation for VPN-level security from client to database (mTLS).
  • All data is encrypted at rest and in motion
  • Google Cloud Private Service Connect connectivity connects apps in your VPC to Astra DB
  • JSON Web Token (JWT) based authentication to ensure secure connectivity to your Astra DB database

Get Started with Astra DB on Google Cloud

Register with GitHub, a Google ID, or email and get 80 GB of storage and up to 20M read/write operations free every month. No credit card is required for the free plan.

  1. Create an account, log in to Astra DB, create a database, choose Google Cloud as the cloud provider, pick a region, and you are done!
  2. Astra DB has a built-in CQL console where you can run CQL queries without having to install any extra software on your computer.
  3. Check out our videos and documentation if you're just getting started. The playlist has a wide range of short tutorials.
  4. DataStax has created a wide variety of sample apps to help you get things done faster and more efficiently.


Self-Managed Service: Cassandra on Google Compute Engine

Some IT organizations require complete control over their systems, or are already set up for self-managed software. With self-managed virtual machines you have that control. This control comes with all the associated effort and expense, and is a tradeoff that should be considered carefully.

Get Started with Cassandra on Google Compute Engine

  1. For development, you can use a public Compute Engine VM image with Cassandra already installed. Prebuilt images are available from a variety of providers, including Bitnami and others.
  2. For most test, staging, and production environments, consider an image you trust and understand. You may need to create an image from scratch for both security and runtime performance reasons.
  3. Consider the staff and skills you may need to acquire or build. In your planning you should also account for the time the team will require for ongoing configuration and maintenance of the database.
  4. Download the Apache Cassandra™ open source database, then install and configure it.
  5. Build a virtual machine that satisfies only the dependencies you need to run Cassandra and nothing more, running on a machine type that meets the hardware requirements.
  6. Configure Google Cloud VPCs, firewall rules, and other networking policies to ensure your application and the Cassandra cluster can communicate and are also secure.
  7. Ensure Google Cloud firewall rules are also configured to allow monitoring tool traffic.
  8. After installing Cassandra on Google Cloud, the VM, the operating system inside the VM, and the database itself all need to be managed and kept up to date with security patches and software updates.
  9. Database management includes, but is not limited to, scaling the database according to traffic, backup/restore, DR planning, capacity planning, and repair (anti-entropy).
  10. Continuously monitor for failed operations and work on optimizing the configuration.
  11. Keep pace with changes to Google Compute Engine and other related Google Cloud services, updating your configuration accordingly to maintain effective and efficient performance.
  12. Administer the security policy of your Google Cloud infrastructure.
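
One way to sanity-check the networking steps in the list above is to probe a node's ports from the machine that runs your application or monitoring tools. The sketch below is a minimal example, not an official tool: the port numbers are Cassandra's defaults (adjust them if your `cassandra.yaml` overrides them), and the host address in the usage example is a placeholder.

```python
import socket

# Cassandra default ports; change these if your configuration differs.
CASSANDRA_PORTS = {
    "cql_native_transport": 9042,  # client drivers and cqlsh
    "internode": 7000,             # node-to-node traffic
    "internode_tls": 7001,         # node-to-node traffic over TLS
    "jmx": 7199,                   # JMX, used by nodetool and some monitoring
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host: str, ports: dict = CASSANDRA_PORTS,
               timeout: float = 2.0) -> dict:
    """Map each named port to whether it is reachable on host."""
    return {name: port_is_open(host, port, timeout)
            for name, port in ports.items()}


if __name__ == "__main__":
    # 10.128.0.2 is a placeholder internal IP for one Cassandra VM.
    for name, reachable in check_node("10.128.0.2").items():
        print(f"{name}: {'open' if reachable else 'blocked or closed'}")
```

From an application host you would normally expect only the CQL port to be reachable; the inter-node and JMX ports are usually restricted to the cluster itself and to trusted operations hosts.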


Self-Managed Service: K8ssandra on GKE

K8ssandra is a cloud native distribution of Apache Cassandra® that runs on Kubernetes and GKE. K8ssandra provides an ecosystem of tools to provide richer data APIs and automated operations alongside Cassandra. This includes metrics monitoring to promote observability, data anti-entropy services to support reliability, and backup / restore tools to support high availability and disaster recovery. As part of K8ssandra’s installation process, all of these components are installed and wired together, freeing you from having to perform the tedious plumbing of components like:

  • Apache Cassandra
  • Stargate, the open-source data gateway
  • Cass-operator, the Kubernetes Operator for Apache Cassandra
  • Reaper for Apache Cassandra, an anti-entropy repair feature (plus reaper-operator)
  • Medusa for Apache Cassandra for backup and restore (plus medusa-operator)
  • Metrics Collector for Apache Cassandra, with Prometheus integration, and visualization via pre-configured Grafana dashboards

Get Started with K8ssandra on GKE

  1. Install Terraform Binary
  2. Install and Configure Google Cloud SDK
  3. Install and Configure kubectl
  4. Install and Configure Helm v3
  5. Clone the k8ssandra-terraform project
  6. Install and Configure gcloud CLI (command line interface)
  7. Configure environment variables
  8. Provision Infrastructure
  9. Retrieve kubeconfig
  10. Install K8ssandra
  11. Deploy K8ssandra with Helm
  12. Retrieve K8ssandra super user credentials

Learn more about setup for Google Cloud GKE in the K8ssandra documentation.
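
The last step in the list, retrieving the superuser credentials, comes down to reading a Kubernetes Secret, whose values are stored base64-encoded. The helper below is a minimal sketch that decodes the JSON printed by `kubectl get secret <name> -o json`. The secret name depends on your cluster name, so treat the one in the usage note as hypothetical.

```python
import base64
import json
import sys


def decode_secret(secret_json: str) -> dict:
    """Decode the base64 `data` map of a Kubernetes Secret given as JSON text."""
    secret = json.loads(secret_json)
    return {key: base64.b64decode(value).decode("utf-8")
            for key, value in secret.get("data", {}).items()}


if __name__ == "__main__":
    # Reads the Secret JSON from standard input and prints the decoded keys.
    for key, value in decode_secret(sys.stdin.read()).items():
        print(f"{key}={value}")
```

Saved as `decode_secret.py` (a hypothetical filename), it could be used as `kubectl get secret k8ssandra-superuser -o json | python decode_secret.py`, where `k8ssandra-superuser` stands in for whatever superuser secret your installation created.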

Which one is the most efficient way of running Cassandra on Google Cloud?

The answer depends on your requirements, your existing investments, your staff and their skills, and a host of other factors.

In general, we recommend Astra DB for the vast majority of Cassandra use cases. You can be ready to go in minutes, freed from operational, security and scalability concerns.

All but the most demanding, security-conscious applications will be served by environments like Astra DB that are already compliant with common security standards, saving months or even years of effort, to say nothing of expense.

Startups and enterprises alike who do not want to, or cannot, dive deep into database administration and configuration, should opt for Astra DB.

Self-managing databases on Kubernetes is less efficient than DBaaS, but may be driven by preexisting organizational proficiency with Kubernetes. K8s managed services like Google Cloud GKE and K8ssandra not only make running system-of-engagement databases on Kubernetes possible but can significantly ease the burden on SRE/Ops teams.

Self-managing on IaaS is the least efficient option relative to DBaaS, but may be driven by a need to self-manage for regulatory reasons or the need to interoperate with proprietary or custom systems. Alternatively, the choice may be driven by the nature of an existing application that is being migrated to the cloud. Your application may simply not require, or be ready for, a cloud-native architecture.

FAQ

Can I use Apache Cassandra with Google Cloud?

Yes, you can use Apache Cassandra on Google Cloud. Cassandra is available on Google Cloud fully-managed through Astra DB, or self-managed via Google Cloud Quick Start.

How do I run Cassandra on Google Cloud?

To deploy Cassandra on Google Cloud, you can either:

  1. Set up a new cluster on Astra DB or migrate an existing self-managed Cassandra deployment to Google Cloud.
  2. Use the Google Cloud Quick Start to build a new self-managed Cassandra cluster yourself.

How do I access Cassandra on Google Cloud?

Once you have deployed your Cassandra cluster on Google Cloud, either by using Astra DB or by creating a self-managed cluster, use the cluster’s connection details to access it from the command line or through a Cassandra driver in your language of choice.
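
As a sketch of driver-based access, the example below uses the DataStax Python driver (the `cassandra-driver` package, which must be installed separately). It assumes an Astra DB database is reached through a downloaded secure connect bundle plus a client ID and secret, and a self-managed cluster through node addresses. The helper names, file path, credentials, and IP addresses are all placeholders.

```python
def astra_cloud_config(bundle_path: str) -> dict:
    """Driver `cloud` settings pointing at an Astra DB secure connect bundle."""
    return {"secure_connect_bundle": bundle_path}


def connect_astra(bundle_path: str, client_id: str, client_secret: str):
    """Open a session to Astra DB (requires the cassandra-driver package)."""
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    cluster = Cluster(
        cloud=astra_cloud_config(bundle_path),
        auth_provider=PlainTextAuthProvider(client_id, client_secret),
    )
    return cluster.connect()


def connect_self_managed(contact_points: list, port: int = 9042):
    """Open a session to a self-managed cluster via its node addresses."""
    from cassandra.cluster import Cluster

    return Cluster(contact_points=contact_points, port=port).connect()


if __name__ == "__main__":
    # Placeholder values; replace with your own bundle path and credentials.
    session = connect_astra("/path/to/secure-connect-bundle.zip",
                            "my-client-id", "my-client-secret")
    row = session.execute("SELECT release_version FROM system.local").one()
    print(row.release_version)
```

For a self-managed cluster on Compute Engine, `connect_self_managed(["10.128.0.2", "10.128.0.3"])` would be the equivalent starting point, with authentication and TLS added to match how the cluster is configured.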

Is Astra DB Free on Google Cloud?

Astra DB has a free tier of $25 in credits each month, giving developers up to 80 gigabytes of free storage or up to 20 million reads/writes each month. Astra DB is serverless, so you are billed only for what you use. If you’re managing your own cluster, Google Cloud pricing applies to the resources it uses.

What is DataStax Astra DB?

Astra DB is a fully managed, serverless, multi-cloud database as a service powered by Apache Cassandra(™).

Can I buy Astra DB on Google Cloud Marketplace?

Yes, Astra DB is available on Google Cloud Marketplace. There are no minimums and no upfront commitment required; your Astra DB cost will be billed to your Google Cloud account.

Features of Astra DB managed Cassandra on Google Cloud

Serverless Database Built on Apache Cassandra™

Scale database resources in and out on demand to match application requirements and traffic so that you pay only for what you use. Put the power of Cassandra in the hands of every developer without ever worrying about managing the infrastructure.

Global Scale

Data replication across multiple data centers, availability zones, and regions. Scale up to petabytes of data without impacting performance. The Astra service is resilient and highly available to minimize both downtime and the need for site-reliability engineering.

Enterprise Security

All data is encrypted at rest and in motion. Sophisticated authentication and authorization with role based access. Client connections use two-way certificate validation for VPN-level security from client to database. Private connectivity options like VPC peering upon request. JSON Web Token (JWT) based authentication to ensure secure connectivity to your Astra DB database.

No Operations

Fully managed database and OS updates and upgrades. IaaS (Infrastructure-as-a-Service) failures handled gracefully by K8s operator to keep databases healthy. Eliminate anti-entropy repair procedures. Auto scaling eliminates manual configuration changes and guesswork on database sizing.
