Company • February 20, 2020

Accelerate Rewind: How to Understand Apache Cassandra Performance Through Metrics

Wei Deng

Apache Cassandra® is a distributed database built on a peer-to-peer architecture. To monitor the entire database, you need to understand the performance of every node.

If you’re new to Cassandra, this can all be tricky.

If you need a little help, you may want to check out this session from last year’s DataStax Accelerate. Wei Deng, a Vanguard Solutions Architect at DataStax, covered the fundamentals needed to understand performance metrics and gave an overview of Cassandra performance metrics tools, all aimed at newcomers to the database.

His talk covered how to begin to understand performance in a real-time database like Cassandra, the tools available to help you measure performance, and the most important metrics to keep track of, among other things.

Here’s a brief synopsis of Wei’s talk.

Cassandra: the nuts and bolts

First things first: a brief overview of Cassandra’s architecture.

Cassandra’s masterless architecture means that all nodes are the same. There aren’t any masters, which means every node’s performance metrics are important to collect and monitor.

At the same time, any client can connect to any node and read and write the data it needs. Further, any node can act as a coordinator, and it can also serve as a storage (replica) node. This means you need visibility into performance metrics at the client, coordinator, and storage node levels to get the full picture.

Performance = throughput + latency

When we talk about performance, we’re talking about throughput, which is the rate of operations, and latency, which is the time it takes for one operation to complete.
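To make the two definitions concrete, here is a minimal Python sketch that times a batch of operations and reports both numbers. The `measure` helper and the `operation` callable are hypothetical names used only for illustration; they are not part of Cassandra or any driver, and a real client would time actual read or write requests instead.

```python
import time


def measure(operation, count):
    """Run `operation` `count` times.

    Returns (throughput in operations per second,
             list of per-operation latencies in seconds).
    `operation` is a hypothetical zero-argument callable standing in
    for a single database request.
    """
    latencies = []
    start = time.perf_counter()
    for _ in range(count):
        t0 = time.perf_counter()
        operation()
        # Latency: how long this one operation took.
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    # Throughput: how many operations completed per unit of time.
    return count / elapsed, latencies


if __name__ == "__main__":
    throughput, latencies = measure(lambda: sum(range(1000)), 10_000)
    print(f"throughput: {throughput:.0f} ops/sec")
    print(f"slowest operation: {max(latencies) * 1e6:.1f} microseconds")
```

Note that the two numbers answer different questions: throughput describes the whole batch, while latency is a property of each individual operation, so it is a distribution rather than a single value.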

What happens, though, when you have millions of operations per hour—or even millions of operations per second? How can you measure and record performance in high-velocity environments?

It’s not as hard as it might sound.

In his Accelerate session, Wei explains how you can use data structures like histograms to track latency metrics across large volumes of operations, as well as some of the pitfalls you need to avoid. Check it out.
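As a rough illustration of the idea, the Python sketch below keeps a fixed set of bucket counters instead of storing every latency sample, so memory use stays constant no matter how many operations are recorded. The `LatencyHistogram` class, its bucket range, and the 1.2 growth factor are assumptions chosen for this example; this is not Cassandra's own implementation, and the specifics of Wei's talk are not reproduced here.

```python
import bisect
import math


class LatencyHistogram:
    """Fixed-size latency histogram with exponentially growing buckets.

    Hypothetical sketch: bucket upper bounds start at `min_us` and grow
    by `growth` until they reach `max_us` (all values in microseconds).
    """

    def __init__(self, min_us=1.0, max_us=60_000_000.0, growth=1.2):
        bounds = []
        bound = min_us
        while bound < max_us:
            bounds.append(bound)
            bound *= growth
        bounds.append(max_us)
        self.bounds = bounds                    # upper bound of each bucket
        self.counts = [0] * (len(bounds) + 1)   # last slot is the overflow bucket
        self.total = 0

    def record(self, latency_us):
        # Index of the first bucket whose upper bound is >= the sample.
        index = bisect.bisect_left(self.bounds, latency_us)
        self.counts[index] += 1
        self.total += 1

    def percentile(self, p):
        """Return the upper bound of the bucket holding the p-th percentile."""
        if self.total == 0:
            raise ValueError("no samples recorded")
        target = max(1, math.ceil(p * self.total / 100))
        running = 0
        for index, count in enumerate(self.counts):
            running += count
            if running >= target:
                if index < len(self.bounds):
                    return self.bounds[index]
                return float("inf")  # sample fell past the largest bucket


if __name__ == "__main__":
    hist = LatencyHistogram()
    for _ in range(990):
        hist.record(100)       # most requests are fast
    for _ in range(10):
        hist.record(50_000)    # a few are very slow
    print("p50  (us):", hist.percentile(50))
    print("p99.5 (us):", hist.percentile(99.5))
```

The trade-off in this design is precision: a percentile is only known to within the width of its bucket. One commonly cited pitfall with latency data (not necessarily one from the talk) is relying on the average alone, since a handful of slow requests like the ones above barely show up in a mean but are obvious in the high percentiles.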

Interested in learning more about Cassandra?

If you’re interested in learning more about Cassandra—whether you’re a newcomer to the space or an expert in it—we encourage you to head to DataStax Accelerate 2020, the world’s premier conference on Apache Cassandra.

This year, we’re hosting two events:

San Diego, Loews Coronado Bay, May 11–13, 2020
London, 133 Houndsditch, Liverpool Street, June 2–3, 2020

Accelerate is jam-packed with all sorts of sessions designed for developers, admins, architects, managers, CTOs, and more. It’s the place where Cassandra enthusiasts from around the world come together to share ideas and best practices and talk about the future.

We’d love to see you there! For more information on Accelerate, go here.

And if you’d like to hear more from Wei’s talk, check out the full session.
