Whitepaper

NVIDIA NeMo Hosted in Astra DB Performance Study

This report compares two approaches to creating embeddings that meet the performance requirements of production-level generative AI and retrieval-augmented generation (RAG) applications: Astra DB with the NVIDIA NeMo Retriever embedding microservice, and an alternative embedding API. The comparison covers latency, throughput, predictability, and cost.