CompanySeptember 26, 2018

5 Lessons in Distributed Databases

Jonathan Ellis
Jonathan EllisTechnology
5 Lessons in Distributed Databases

A few weeks ago I had the pleasure to speak at  Distributed Data Summit in San Francisco—a conference exploring the latest developments in Apache Cassandra and other distributed databases. It was a fun and informative event.  In case you couldn’t make it, here is a summary of my talk, titled “Five Lessons in Distributed Databases.” (Slides here.)

Lesson #1: If it’s not SQL, it’s not a database

NoSQL started getting a lot of attention around 2008 (not coincidentally when Cassandra was open sourced) as the industry started to realize that (1) virtually every cloud application was going to need a scalable, performant, highly available database, and (2) sharding relational databases was a really terrible way to try to solve that.

At first, NoSQL’s popularity was boosted by a second group: people who really just hated SQL.  But with the passage of time it’s clear that the first camp is the one that won. People realized that SQL actually gets the job done and started moving toward that.

In the Cassandra space, we had some tough lessons on the downsides of not having an SQL API, and one of the big ones was that not only is a driverless RPC API too low level, but it also results in fragmentation within your user community as people tried to solve this problem independently in different client language ecosystems, and even the same one. (There were at least three Java clients based on Thrift that had meaningful use in production.)

In 2011, we started thinking about what we can give to people that’s higher-level than Thrift and still allows us to standardize this world we’ve created, and we came out with Cassandra Query Language 1.0. It took another year and two more releases to get it right, but CQL has served as a model for the industry in building a query language for a partitioned, scalable database.

Today, virtually every major NoSQL database offers a SQL-inspired query language. (The major exceptions are MongoDB and DynamoDB.)

Lesson #2: It takes more than five years to build a database

I learned this one the hard way. In the words of independent tech analyst Curt Monash: “Developing a good database management system requires five to seven years and tens of millions of dollars—and that’s if things go extremely well.”

Why does it take so long?

Curt cites three general reasons:

  1. Concurrent workloads: It’s going to get a lot harrier in life than what you can put together in a lab.
  2. Mixed workload management is harder than you assume it is.
  3. Those minor edge cases in which your Version 1 product works poorly aren’t really minor at all.

In the Cassandra world, some things that took multiple designs to get right included:

  1. Hinted Handoff
  2. Repair
  3. Counters
  4. Paxos (for lightweight transactions)
  5. Realistic distributed testing

Lesson #3: The customer is always right

As engineers building a database, we tend to have the attitude of, if the database broke because you did something you weren’t supposed to do, then it’s your fault. And that’s a mindset we need to get away from because it stops us from learning about the problems people are trying to solve and the best ways to fix them.

Two specific areas where we may wish to rethink our approach in Cassandra are tombstones and join support.

Lesson #4: Too much magic is a bad thing

Just enough magic is good—that’s what customers want. But too much magic is bad, even if you can actually deliver it.

A good example of this would be a feature that researchers at Brown added to HStore—the horizontally partitioned database that became VoltDB—in 2012. The feature promised to automatically generate the optimal data partitioning for your workload and spread it across clusters according to that partitioning using queries implemented as Java-store procedures. It was a promising idea, and it worked as advertised, but it was confusing to users because it was too hard to predict which transactions would require two-phase commit across multiple partitions (and slow down by a couple orders of magnitude). So you see all the major vendors today requiring explicit partitioning instead.

Lesson #5: It’s the cloud, stupid

Cassandra and DataStax Enterprise have supported multi-datacenter and hybrid cloud deployments since their introduction. This was initially greeted with some skepticism (“only Facebook and Google would need that”), but today the market has caught up. DataStax sees over 70% of our customers with at least some infrastructure footprint in the cloud, and hybrid cloud is the new hot topic. Virtually every vendor at Strata this year was talking about hybrid, but we actually have the technology and experience to deliver that.

The danger now is that we get complacent, that we say, Hey—we’re the best database for the cloud, so we don’t need to innovate there. But I think there’s another step. We need to start thinking about, What happens if your database runs only in the cloud or mostly in the cloud? What optimizations can you start making when you’re designing for a cloud-first world, when you can assume you’re going to be deployed in an environment with a built-in DFS, with built-in object storage, with built-in service discovery to integrate with?

If we can learn from our history, including our mistakes, we will set ourselves up for success in the future. With the cloud age upon us, I’m very excited for the future of DataStax and distributed databases in general, and I can’t wait to see the innovation that’s coming.


One-stop Data API for Production GenAI

Astra DB gives JavaScript developers a complete data API and out-of-the-box integrations that make it easier to build production RAG apps with high relevancy and low latency.