Technology•February 6, 2024

Astra DB’s Third “Serverless-versary”

Joshua Norrid

In a few weeks, we’ll celebrate a big DataStax milestone: the three-year anniversary of DataStax Astra DB becoming a serverless database.

By separating compute and storage, we delivered a cloud-native, massively scalable database with a cost structure that’s tightly coupled with an application’s changing needs over time. Astra DB could then scale automatically in response to workload changes, leading to more efficient operations and a significant reduction in users’ total cost of ownership (TCO).

A lot has happened since then. Generative AI landed squarely on everyone’s doorstep last year, and, with it, the need to easily search the complex, unstructured data that powers large language models. Vector search became a foundational requirement for any database that supports organizations’ generative AI efforts; Astra DB added this capability nearly a year ago.

Others have subsequently added serverless capabilities to their vector database offerings, further validating the decisions we’ve made to make it easier, faster, and more cost-effective for our customers to deliver GenAI applications to production.

That said, not all serverless vector databases offer the same TCO benefits. In December, we commissioned analyst firm GigaOm to benchmark Astra DB vector performance and TCO with several commonly used vector benchmarking datasets that simulate production conditions. GigaOm then compared Astra DB’s performance to that of Pinecone, a widely used vector database that recently announced a serverless offering.

The GigaOm study found that Astra DB delivers better performance and cost efficiency than Pinecone's vector database.

Astra DB's TCO was found to be up to 80% lower than Pinecone's over a three-year period. GigaOm evaluated this cost advantage across three scenarios: updating production data monthly, weekly, or in real time.

Astra DB also demonstrated up to 9x higher throughput in data ingestion and indexing compared to Pinecone, and up to 74x faster P99 query response times during data ingestion and indexing. Relevancy, measured by F1 score, proved to be up to 20% higher in Astra DB. All of these metrics play a critical role in increasing accuracy and reducing hallucinations in GenAI applications.
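For readers less familiar with the two metrics named above, here is a minimal Python sketch of how an F1 score and a P99 latency are commonly computed. It uses the textbook definitions and made-up inputs; the function names are hypothetical, and this post does not describe how GigaOm computed its figures, so treat this as an illustration only.

```python
def f1_score(retrieved, relevant):
    """F1 for a single query: harmonic mean of precision and recall.

    `retrieved` is what the database returned, `relevant` is the
    ground-truth set of correct results (both hypothetical inputs).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)  # share of returned results that are correct
    recall = hits / len(relevant)      # share of correct results that were returned
    return 2 * precision * recall / (precision + recall)


def p99(latencies_ms):
    """99th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.99 * n), avoiding floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

A P99 of, say, 50 ms means that 99% of queries finished in 50 ms or less, which is why it is often used to describe worst-case user experience rather than the average.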

A major driver of Astra DB’s TCO advantage is that applications can query data while it is being ingested and indexed, without pausing queries to rebuild indexes. This lowers cost in each of the three TCO scenarios, which model data updates on a monthly, weekly, and real-time basis.

You can read the full GigaOm report, which digs into the details of Astra DB’s leading performance in indexing, throughput, latency, relevancy, and TCO, here.

It’s gratifying to see that pay-as-you-go databases are becoming a standard way to offer data management services. As always, a big thank you to the developers and enterprises that count on Astra DB as the high-performing, cost-effective foundation for their GenAI applications. Together, we are building the future and have much to look forward to!
