Technology · April 10, 2024

Making Astra DB easier for MongoDB developers

Great news for developers with MongoDB experience and apps! DataStax has updated the Astra DB Data API to be compatible with the MongoDB API.
Valentin Kulichenko, Database Product Lead

At DataStax, we’ve been focused on making it dramatically easier for any developer with JavaScript, Python, or full-stack application skills to build production-ready generative AI apps.  

We’ve assembled a complete GenAI stack with all the data types and AI ecosystem integrations you need to build RAG (retrieval-augmented generation) applications that can scale globally on any infrastructure. We’ve innovated to create native vector search algorithms that yield up to 20% higher relevance at ultra-low latency. We’ve added Langflow, an open-source RAG framework with drag-and-drop data flows, prebuilt components, and one-click deployment of LangChain-based RAG applications at scale.

Today we’re taking another step: bringing you a data API you already know, so you can apply the skills and code you already have to generative AI and RAG applications.

Earlier this year we introduced the Astra DB Data API, a modern document-based API that gives access to the rich functionality, scalability, and performance of Astra DB without the need for complex data modeling.

Since day one, our Data API has enabled many developers to build their GenAI applications quickly, while avoiding the steep learning curve associated with many other databases and frameworks. When talking to these developers, we noticed that they often come to Astra DB with prior experience in document databases, and specifically the most popular one: MongoDB. We consistently hear that the significant similarities between the Data API and MongoDB APIs are a strong driver for Astra DB adoption. Users choose Astra DB and its Data API because they can reuse their existing skills and code, and start coding right away.

Today, we’re rolling out a major upgrade to the Astra DB Data API and its clients that ensures a much higher level of compatibility with MongoDB, which makes it even easier to develop with Astra DB. If you are familiar with MongoDB and want to start developing GenAI applications today, this update is for you!

Smoother Transition with Upgraded Data API Clients

To let your MongoDB expertise light the way to Astra DB, we’ve made a multitude of improvements to all three Data API clients: Python, TypeScript, and Java.

We've meticulously walked through every single method of the MongoDB clients and reworked our Data API clients based on that review, to ensure a more intuitive and familiar experience for developers transitioning from MongoDB. This includes updating method signatures to better mirror those found in MongoDB clients, enhancing our documentation to provide more comprehensive and detailed guidance, and refining overall functionality to streamline the development process.

As an indication of the importance of the latest changes, today’s release elevates the version for all clients to 1.0.0. This major update marks a significant step in aligning more closely with MongoDB, further simplifying the transition for developers familiar with document databases. By refining our clients to mirror MongoDB's intuitive practices, we’re making it easier for developers to jump straight into Astra DB, using the skills they already have.

ObjectId Support

One of the most frequently used concepts in MongoDB is ObjectId, which is used for document identification. Recognizing its importance in the MongoDB ecosystem, we’ve integrated compatible support for ObjectId into the Data API and its clients. This means that existing MongoDB code that uses ObjectId (and there is a lot of it out there!) can now be reused with Astra DB with little to no changes required.

Compatibility with ObjectId further underscores our commitment to MongoDB compatibility, making it easier for developers to migrate their projects and leverage their existing MongoDB expertise within Astra DB's powerful and scalable environment.
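To make the ObjectId format concrete, here is a minimal sketch that checks whether a string looks like an ObjectId and reads the creation time stored inside it. It assumes MongoDB's standard 12-byte layout (a 4-byte timestamp in seconds, a 5-byte random value, and a 3-byte counter), written as 24 hexadecimal characters. The helper names isObjectIdHex and objectIdTimestamp are hypothetical and are not part of any Data API client; in real code you would use the ObjectId type that your client provides.

```javascript
// Sketch only: inspects a MongoDB-style ObjectId given as a hex string.
// Assumed layout: 4-byte timestamp | 5-byte random value | 3-byte counter.

// Returns true when the value is a 24-character hexadecimal string.
function isObjectIdHex(value) {
  return typeof value === "string" && /^[0-9a-fA-F]{24}$/.test(value);
}

// Returns the creation time embedded in the first 4 bytes as a Date.
function objectIdTimestamp(hex) {
  if (!isObjectIdHex(hex)) {
    throw new TypeError(`Not a valid ObjectId hex string: ${hex}`);
  }
  // The first 8 hex characters are seconds since the Unix epoch (big-endian).
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Illustrative value, not taken from the movies dataset.
const sampleId = "507f1f77bcf86cd799439011";
console.log(isObjectIdHex(sampleId));
console.log(objectIdTimestamp(sampleId).toISOString());

// A UUID-style identifier has a different shape and is not an ObjectId.
console.log(isObjectIdHex("51a5d5df-ee2d-4534-a5d5-dfee2de53445"));
```

A check like this can help when auditing existing MongoDB data before a migration, since collections often mix ObjectId values with other identifier styles.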

Leveraging Mongoose.JS

While the current release mainly focuses on updates to the clients, let's not forget the tools that have already been making life easier for MongoDB developers using Astra DB. One key feature of the Data API is its native integration with Mongoose.JS – a vastly popular object-document mapping library for MongoDB. This isn’t a new feature, but it’s a vital part of our commitment to smooth transitions from MongoDB to Astra DB. It lets you use your Mongoose models right in Astra DB, just like you would in MongoDB.

Porting Mongoose-based code to Astra DB is as simple as changing one line of code, making this integration ideal not just for migrating projects but also for launching new, enterprise-grade GenAI applications. To learn more, dive into our quickstart guide.

Quick Example

To demonstrate the ease of code portability from MongoDB to Astra DB, let’s use MongoDB's quickstart code for its Node.js driver. Here is what the original code looks like:

const { MongoClient } = require("mongodb");

// Replace the uri string with your connection string.
const uri = "<connection string uri>";

const client = new MongoClient(uri);

async function run() {
  try {
    const database = client.db('sample_mflix');
    const movies = database.collection('movies');

    // Query for a movie that has the title 'Back to the Future'
    const query = { title: 'Back to the Future' };
    const movie = await movies.findOne(query);

    console.log(movie);
  } finally {
    // Ensures that the client will close when you finish/error
    await client.close();
  }
}
run().catch(console.dir);

To port this basic piece of code to Astra DB, we will first create a database and upload sample data into a collection named “movies”. You can download the dataset that we are using from here: movies.json.

Each document in the new collection represents a movie and contains its title, release year, genre, and description. In addition, we store a vector representation of the description, generated using OpenAI’s embedding model.

Here is what the equivalent code looks like in Astra DB (look how similar it is!):

import { DataAPIClient } from '@datastax/astra-db-ts';

const token = "<Astra DB Token>";
const endpoint = "<Data API Endpoint URL>";

const client = new DataAPIClient(token);

async function run() {
  try {
    const database = client.db(endpoint);
    const movies = database.collection('movies');

    // Query for a movie that has the title 'Back to the Future'
    const query = { title: 'Back to the Future' };
    const movie = await movies.findOne(query);

    console.log(movie);

  } finally {
    // Ensures that the client will close when you finish/error
    await client.close();
  }
}
run().catch(console.dir);

The only major difference between the two snippets above is in the way we establish a connection. To connect to Astra DB, we need to use the DataAPIClient object, which requires an Astra DB application token and the database endpoint URL – you can acquire both from the Astra Portal by clicking on the “Connection Details” button.
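Because the connection settings are the one part that changes, it can help to validate them before creating the client. The following is a minimal sketch under a few assumptions: the environment variable names ASTRA_DB_APPLICATION_TOKEN and ASTRA_DB_API_ENDPOINT are a naming convention chosen here, the token check assumes Astra DB application tokens begin with "AstraCS:", and readAstraConfig is a hypothetical helper that is not part of @datastax/astra-db-ts.

```javascript
// Hypothetical helper (not part of any DataStax client library):
// reads and sanity-checks the two values the Astra DB snippet needs.
function readAstraConfig(env) {
  const token = env.ASTRA_DB_APPLICATION_TOKEN;
  const endpoint = env.ASTRA_DB_API_ENDPOINT;

  // Assumption: application tokens start with the "AstraCS:" prefix.
  if (typeof token !== "string" || !token.startsWith("AstraCS:")) {
    throw new Error("ASTRA_DB_APPLICATION_TOKEN is missing or malformed");
  }

  // The endpoint must be a well-formed HTTPS URL.
  let url;
  try {
    url = new URL(endpoint);
  } catch {
    throw new Error("ASTRA_DB_API_ENDPOINT is missing or is not a valid URL");
  }
  if (url.protocol !== "https:") {
    throw new Error("ASTRA_DB_API_ENDPOINT must use https");
  }

  // Normalize to scheme + host so a trailing slash does not matter.
  return { token, endpoint: url.origin };
}

// Placeholder values for illustration; in an application, pass process.env.
const config = readAstraConfig({
  ASTRA_DB_APPLICATION_TOKEN: "AstraCS:example-token",
  ASTRA_DB_API_ENDPOINT: "https://example-id-us-east1.apps.astra.datastax.com/",
});
console.log(config.endpoint);
```

The returned token and endpoint can then be passed to new DataAPIClient(token) and client.db(endpoint), as in the snippet above.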

All the code outside of the connection portion is exactly the same as with MongoDB. This includes actual queries, as well as inserts, updates, and other CRUD operations. Here is the output of the script:

{
  _id: '51a5d5df-ee2d-4534-a5d5-dfee2de53445',
  title: 'Back to the Future',
  year: 1985,
  genre: 'Comedy',
  description: 'Marty McFly, a 17-year-old high school student, is 
accidentally sent 30 years into the past in a time-traveling DeLorean 
invented by his close friend, the maverick scientist Doc Brown.',
  '$vector': [
      -0.011504995,  -0.017071927,  -0.014845154,  -0.030772243,
      -0.011385478,  -0.011259672,  -0.018380314,  -0.014467735,
      -0.023198698,   0.017864507,   0.015599992,   0.019600635,
      -0.004846691,   0.007542093,   0.034319982,  -0.019059667,
       0.018342571,  -0.011341446,  0.0115364455,  -0.013486445,
      -0.011523865,  -0.014706767,  0.0027378616,  -0.018770313,
       0.005375078,  0.0015191121,   0.008951125,    -0.0181287,
        0.03421934,  -0.021210957,   0.025576439, -0.0006844654,
      -0.012932897,   -0.02441902,  -0.030017404,  -0.019172894,
     -0.0018399184,  -0.021953216,  0.0131970905, -0.0049379007,
       0.017084507,   0.011794349, -0.0043812073,   0.012467413,
     -0.0024956842, 0.00024532247,  0.0118069295,  -0.024620311,
      0.0012879429,   0.010762736,   0.018770313,   0.031225147,
     -0.0123478975,   -0.02679676, -0.0026922568,  -0.014442573,
      0.0004992941,   0.018858377, -0.0139141865,   -0.00944806,
     -0.0076867705,   0.004488143,  -0.008309512, -0.0027189907,
      -0.007661609,   0.013687735, -0.0004222377, -0.0023777408,
       0.008542254,   0.022330634,    0.02991676,   0.006950803,
        0.03587998,   0.008762415,  0.0030665307,   -0.00473032,
    -0.00022605836,   0.016681926,  -0.023198698,   0.014090315,
       0.023148376,  -0.026746439,  -0.004708304,   0.016178701,
       0.030948373,  -0.005453707, -0.0061141904,   0.026343858,
      -0.006321771,  0.0025522972,   -0.02284644,   0.022204828,
      0.0067683836,   0.014241283,  -0.005198949,   0.023601279,
      0.0048309653,   0.009743704,   0.011184188,   -0.01814128,
    ... 1436 more items
  ]
}

The port of the existing code is now complete! Let’s take it further by using Astra DB’s GenAI features, such as vector search. For example, using the exact same collection, you can find movies whose descriptions are similar to an arbitrary text prompt. The query can look like this:

const similarMovies = await movies
  .find(
    {},
    {
      // Provide embedding vector based on the prompt.
      vector: await embedding("Criminals and detectives"),
      // Limit to three top results. 
      limit: 3,
      // Do not include vectors in the output.
      projection: { $vector: 0 },
    }
  )
  .toArray();

console.log(similarMovies);

// Function that generates embeddings, for reference.
// Assumes OPENAI_API_KEY is defined elsewhere and holds your OpenAI API key.
async function embedding(prompt) {
  const response = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      input: prompt,
      model: "text-embedding-ada-002",
    }),
  });

  const result = await response.json();

  return result.data[0].embedding;
}

By using the prompt "Criminals and detectives" to generate embeddings and perform a vector search, we aim to find movies that align closely with this theme. Here is the result:

[
  {
    _id: '4f80198d-5456-4ea6-8019-8d54567ea6c6',
    title: 'The Criminal Hypnotist',
    year: 1909,
    genre: 'Drama',
    description: 'The Criminal Hypnotist is a 1909 American silent short 
drama film directed by D. W. Griffith.'
  },
  {
    _id: 'f3ddb56a-2d02-4303-9db5-6a2d02b303bd',
    title: 'The Amateur Detective',
    year: 1914,
    genre: 'Comedy',
    description: 'The Amateur Detective is a 1914 American silent short 
comedy directed by Carroll Fleming for the Thanhouser Film Corporation. 
The film stars Carey L. Hastings, Ernest C. Warde and Muriel Ostriche.'
  },
  {
    _id: 'deb0e739-9d6f-489c-b0e7-399d6fd89c9f',
    title: 'Hemlock Hoax, the Detective',
    year: 1910,
    genre: 'Comedy',
    description: 'Hemlock Hoax, the Detective is an American short comedy
film produced and distributed in 1910 by the Lubin Manufacturing Company. 
The silent film features a detective named Hemlock Hoax who tries to solve 
a murder, which unbeknownst to him is a practical joke being played on him 
by two young boys. It was one of many shorts designed to derive its humor 
from a sleuth whose name was similar to Sherlock Holmes.'
  }
]
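For intuition about what the vector search above does, here is a small brute-force sketch that ranks in-memory documents by similarity to a query vector and then drops the vectors from the output, much like the limit and projection options in the query. It is an illustration only: Astra DB runs the search on the server, the cosine metric is an assumption about how the collection is configured, the three-dimensional vectors are made up (the embeddings in the sample data have 1536 dimensions), and findSimilar is a hypothetical helper.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical brute-force stand-in for a vector search with a limit
// and a projection that excludes $vector.
function findSimilar(docs, queryVector, limit) {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(doc.$vector, queryVector) }))
    .sort((x, y) => y.score - x.score) // most similar first
    .slice(0, limit)
    .map(({ doc }) => {
      const { $vector, ...rest } = doc; // omit the vector from the result
      return rest;
    });
}

// Made-up documents and vectors, for illustration only.
const toyDocs = [
  { title: "Detective story", $vector: [0.9, 0.1, 0.0] },
  { title: "Space opera", $vector: [0.0, 0.2, 0.9] },
  { title: "Heist caper", $vector: [0.8, 0.3, 0.1] },
];

console.log(findSimilar(toyDocs, [1, 0, 0], 2));
```

Comparing every document against the query does not scale beyond small datasets, which is why a database-side vector index is used in practice.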

This simple example demonstrates how easy it is to reuse your prior knowledge of MongoDB and port your existing code, while setting the stage for further development. If you’re a MongoDB developer, building GenAI applications has never been easier!

Conclusion

The goal behind the latest changes is to make the Data API and its clients as compatible with MongoDB as possible, while continuing to improve and innovate for the best GenAI development experience. It’s all about making developers’ lives easier and smoothing out the transition to Astra DB from MongoDB and other document databases.

We know learning new databases can be a headache. That's why we've doubled down on making Astra DB feel familiar for MongoDB professionals. Now, you can bring your MongoDB skills and code straight over to Astra DB with hardly any changes needed.


Get started with Astra DB by creating a free account. To dive into the Data API, refer to its latest documentation. Start building GenAI applications today!
