MongoDB as a Vector Database for AI Agents-MongoDB

Author: Aasawari Sahasrabuddhe

Original post on Foojay: Read More

Table of Contents

Why should you use MongoDB for building AI agents?
Understanding AI agents
Building a multi-agent application with MongoDB
Conclusion

Modern artificial intelligence systems are evolving quickly. Large language models (LLMs) have become the backbone of modern applications, powering everything from conversational interfaces to more deeply integrated content features. However, LLMs are stateless: on their own, they have no memory and cannot retain context across interactions. These limitations led to the rise of AI agents, which move beyond simple prompt-response interactions toward more autonomous, task-oriented workflows.

These agents are not just model invocations; rather, they are an orchestration layer that combines reasoning with capabilities like retrieval, memory, and tool execution. Building such agents requires a database that can store and retrieve semantically meaningful data, which is where vector databases come into the picture.

A vector database stores data as embeddings: dense numerical representations of text, images, or other unstructured data. These embeddings capture semantic meaning, enabling similarity search instead of exact matching. With MongoDB Atlas, developers can generate embeddings, store them alongside application data, and run vector search in the same platform. This lets AI agents combine operational data with semantic retrieval, which simplifies the architecture and removes the need to synchronize data between separate systems.

In this blog post, we’ll build an AI agent system in Java using MongoDB as our database, storing user queries, documents, agent memory, and embeddings in a single place. Along the way, we will see how MongoDB simplifies the implementation of retrieval-augmented generation and persistent memory systems.

Why should you use MongoDB for building AI agents?

Vector store and Voyage AI support – MongoDB Atlas offers a developer-friendly ecosystem in which you can generate vector embeddings, store them, and run vector search directly on the platform. This reduces the number of separate systems needed to build an enterprise application.
Hybrid Search – With MongoDB Atlas, you can add filters to a vector search query to apply additional conditions to the results. Unlike many specialized vector stores, MongoDB can combine semantic (vector) queries with classic structured (keyword) queries.
Developer Ecosystem – MongoDB has been a developer-first database from the start and continues to be one, which helps it integrate efficiently with your application.
Operational Efficiency – If you already use MongoDB, adding vector search avoids the need to introduce new infrastructure, which keeps schemas, transactions, and operations simpler.

Understanding AI agents

Before building AI agents, it is important to understand the core principles of embeddings, retrieval-augmented generation (RAG), and agentic architectures.

Vector embeddings, or simply embeddings, are dense numerical vector representations derived from text, audio, video, or any other form of unstructured data. These vectors reside in a high-dimensional space where semantic similarity is preserved, which means semantically similar inputs are located closer together according to measures such as cosine similarity or dot product.

Given a query vector, vector search retrieves the top-K most similar stored vectors, performing semantic retrieval rather than exact matching. This is critical for handling paraphrasing, ambiguity, and contextual queries.
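To make the idea of top-K retrieval concrete, here is a minimal sketch in plain Java that ranks a handful of in-memory vectors by cosine similarity. It is a brute-force illustration only, not how MongoDB Vector Search works internally; the class name `TopKSimilarity`, the `Item` record, and the tiny two-dimensional vectors are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: brute-force top-K search over in-memory vectors.
final class TopKSimilarity {

    // Hypothetical holder for a stored item and its embedding.
    record Item(String id, double[] embedding) {}

    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the k items most similar to the query, best match first.
    static List<String> topK(double[] query, List<Item> items, int k) {
        List<Item> sorted = new ArrayList<>(items);
        sorted.sort(Comparator
                .comparingDouble((Item it) -> cosine(query, it.embedding()))
                .reversed());
        return sorted.stream().limit(k).map(Item::id).toList();
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("delay", new double[] {1.0, 0.0}),
                new Item("cancel", new double[] {0.8, 0.6}),
                new Item("hotel", new double[] {0.0, 1.0}));
        System.out.println(topK(new double[] {1.0, 0.1}, items, 2));
    }
}
```

A real vector index avoids scanning every vector by using approximate nearest-neighbor search, but the ranking principle is the same: the closest vectors to the query come back first.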

Retrieval-augmented generation, or RAG, builds this retrieval step into the generation pipeline: relevant documents are fetched with semantic search and passed to the model, which uses them to generate its response. One of the most common challenges with standard LLMs is hallucination, the generation of incorrect or fabricated information when the model relies solely on the parametric knowledge stored in its weights. RAG addresses this by grounding responses in retrieved documents rather than depending only on internal weights. As a result, it improves factual consistency, traceability, and the freshness of responses.
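The grounding step can be sketched as simple prompt assembly: retrieved passages are numbered and placed ahead of the question so the model can cite them. This is a minimal sketch under stated assumptions; the class name `GroundedPrompt`, the instruction wording, and the sample passage are hypothetical, and no model is actually called.

```java
import java.util.List;

// Illustrative only: assembles a RAG-style prompt from retrieved passages.
final class GroundedPrompt {

    // Builds a prompt that asks the model to answer from the numbered context.
    static String build(String question, List<String> retrieved) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer using only the context below. ")
          .append("If the context is insufficient, say so.\n\n");
        sb.append("Context:\n");
        for (int i = 0; i < retrieved.size(); i++) {
            // Numbering each passage makes the answer traceable to its source.
            sb.append("[").append(i + 1).append("] ")
              .append(retrieved.get(i)).append("\n");
        }
        sb.append("\nQuestion: ").append(question).append("\n");
        return sb.toString();
    }
}
```

For example, `GroundedPrompt.build("How should we reroute?", List.of("A past JFK to SFO delay was resolved by rerouting via ORD."))` yields a prompt whose context section contains one numbered passage followed by the question.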

Agents build on these ideas. In agentic architectures, vector search becomes a core abstraction for implementing memory systems:

Short-term memory: recent interaction history embedded and retrieved for conversational continuity
Long-term memory: persisted embeddings of past interactions, documents, and tool outputs
Semantic recall: retrieving context dynamically based on similarity rather than rigid keys

In these architectures, vector databases serve as both the retrieval layer and the storage layer. Vector search is therefore no longer just a tool for semantic search but a foundational building block for agentic systems. It underpins how agents retrieve knowledge, maintain memory, and produce contextually relevant, low-hallucination outputs in real-world applications.

Building a multi-agent application with MongoDB

Before we get into the actual code for building the agents, let’s go over the prerequisites for building the application.

A free-tier MongoDB Atlas cluster.
A free Voyage AI API key to generate embeddings.
A Spring Boot project set up to work with MongoDB, created with Spring Initializr.
Recent versions of Java and Gradle or Maven installed.

To build the multi-agent system, we are using a travel replanning system as an example.

Here is a scenario to make this concrete. You are traveling from Toronto to San Francisco with a layover in New York. Then reality intervenes: the flight from New York to San Francisco is delayed by 9 hours, and you need a better plan because you have a client meeting where you will showcase your product.

At this point, you do not need a system that merely suggests another flight; you need one that helps replan the entire trip. This is where the multi-agent replanning system comes in. It consists of the following agents:

A Monitoring Agent that detects disruptions
A Planner Agent that orchestrates decisions
A Booking Agent that finds alternative routes
A Budget Agent that filters based on cost
A Preference Agent that aligns options with user choices
A Memory Agent that recalls similar past situations

Each agent is simple on its own. But together, they behave like a coordinated system.

What makes this system powerful is the use of MongoDB as the single database. MongoDB stores the live trip data and records every event, while Voyage AI generates embeddings of past travel incidents and MongoDB’s vector search retrieves similar cases during replanning.

To build this system, we use four collections: trip_state, event, agent_decision, and incident_memory. The trip_state collection stores the current state of the trip, and every disruption is recorded as an event. Each agent logs its reasoning in agent_decision, and incident_memory stores past incidents.
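One way to picture the four collections is as plain Java records. The sketch below is only an approximation: the `TripState`, `Segment`, and `TravelEvent` fields mirror the JSON responses shown later in this post, while the field names for `AgentDecision` and `IncidentMemory` (other than `embedding`, which matches the index path in Step 1) are assumptions and may differ from the actual project.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;

// Illustrative document shapes for the four collections.
final class TravelModel {

    enum TripStatus { ON_TRACK, DISRUPTED, REPLANNED }

    // trip_state: fields mirror the JSON responses shown in this post.
    record Segment(String segmentId, String type, String provider,
                   String fromLocation, String toLocation, double cost) {}

    record TripState(String id, String userId,
                     List<Segment> itinerary, TripStatus status) {}

    // event: one document per disruption.
    record TravelEvent(String id, String tripId, String type,
                       String severity, Map<String, Object> metadata) {}

    // agent_decision: field names here are assumptions for illustration.
    record AgentDecision(String tripId, String agent,
                         String reasoning, Instant decidedAt) {}

    // incident_memory: "embedding" matches the vector index path;
    // the other field names are assumptions for illustration.
    record IncidentMemory(String summary, String resolution,
                          List<Double> embedding) {}
}
```

Keeping the embedding on the same document as the incident summary is what allows a single query to return both the match and the context needed to act on it.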

Let’s do this step by step.

Step 1: Creating a vector search index

Before we build the system, we need a vector search index. The embeddings in this project are produced by Voyage AI’s voyage-3-large model.

Go to MongoDB Atlas, create a collection named incident_memory, and create a vector search index with the JSON below.

{
  "fields": [
    {
      "numDimensions": 1024,
      "path": "embedding",
      "similarity": "cosine",
      "type": "vector"
    }
  ]
}

Step 2: Creating the Trip

A trip is created through a POST request that lands in the controller method below. Because the request body is optional, the controller substitutes a default CreateTripRequest when none is supplied, so normalized is either the incoming request or that default placeholder. The normalized request is then passed to the service.

@PostMapping("/create")
public TripState createTrip(@RequestBody(required = false) CreateTripRequest request) {
    CreateTripRequest normalized = request == null
            ? new CreateTripRequest("demo-user", null, null)
            : request;
    return tripService.createTrip(normalized);
}

The service layer then creates the trip. For example, this request:

curl -X POST "http://localhost:8080/trip/create" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "traveler-001",
    "preferences": {
      "airlinePreference": "SkyJet",
      "avoidRedEye": true,
      "maxAdditionalBudget": 250
    }
  }'

returns:

{
  "id": "69dd6111674d2228e4db4b25",
  "userId": "traveler-001",
  "itinerary": [
    {
      "segmentId": "SEG-1",
      "type": "FLIGHT",
      "provider": "SkyJet",
      "fromLocation": "JFK",
      "toLocation": "SFO",
      "cost": 420.0
    }
  ],
  "status": "ON_TRACK"
}

This trip gets stored in trip_state. At this point, everything looks fine.

Step 3: Induce a disruption

In this step, we simulate a flight delay and record it in the database through another POST endpoint:

curl -X POST "http://localhost:8080/event/simulate-delay" \
  -H "Content-Type: application/json" \
  -d '{
    "tripId": "69dd6111674d2228e4db4b25",
    "delayMinutes": 180,
    "severity": "HIGH"
  }'

The request is handled by this controller method:

@PostMapping("/simulate-delay")
public TravelEvent simulateDelay(@RequestBody SimulateDelayRequest request)

And at the same time, something critical happens:

tripState.setStatus(TripStatus.DISRUPTED);
tripService.saveTrip(tripState);

This is the first agent at work: it detects a problem, updates the trip state, and logs the decision.

The following delay event is simulated:

{
  "id": "69dd6160674d2228e4db4b26",
  "tripId": "69dd6111674d2228e4db4b25",
  "type": "FLIGHT_DELAY",
  "severity": "HIGH",
  "metadata": {
    "from": "JFK",
    "to": "SFO",
    "delayMinutes": 180
  }
}
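The detect, update, and log sequence in this step can be sketched without Spring or MongoDB. The code below is a minimal in-memory illustration: the class `MonitoringSketch`, its nested types, and the wording of the logged reasoning are all hypothetical, and persistence is left out entirely.

```java
import java.util.Map;

// Illustrative only: the Monitoring Agent's three actions, kept in memory.
final class MonitoringSketch {

    enum TripStatus { ON_TRACK, DISRUPTED, REPLANNED }

    // Minimal mutable trip; the real TripState has more fields.
    static final class Trip {
        final String id;
        TripStatus status = TripStatus.ON_TRACK;

        Trip(String id) {
            this.id = id;
        }
    }

    record DelayEvent(String tripId, String type, String severity,
                      Map<String, Object> metadata) {}

    record Decision(String agent, String tripId, String reasoning) {}

    record Outcome(DelayEvent event, Decision decision) {}

    // 1) build the event, 2) flip the trip status, 3) record why.
    static Outcome onDelay(Trip trip, int delayMinutes, String severity) {
        DelayEvent event = new DelayEvent(trip.id, "FLIGHT_DELAY", severity,
                Map.of("delayMinutes", delayMinutes));
        trip.status = TripStatus.DISRUPTED;
        Decision decision = new Decision("MonitoringAgent", trip.id,
                "Delay of " + delayMinutes + " minutes with severity "
                        + severity + "; trip marked DISRUPTED");
        return new Outcome(event, decision);
    }
}
```

In the application itself, the event and the decision would each be saved to their own collection, which is what makes the later replanning steps traceable.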

Step 4: Replanning

To trigger replanning, the PlannerAgent orchestrates the other agents. It asks MemoryAgent for similar incidents using MongoDB Vector Search and asks BookingAgent for alternative routes; then BudgetAgent and PreferenceAgent refine those options before PlannerAgent commits the final itinerary.

The request enters the application through this controller method:

@PostMapping("/plan/replan")
public TripState replan(@RequestBody ReplanRequest request)

From there, the Planner Agent takes over. For example:

curl -X POST http://localhost:8080/plan/replan \
  -H "Content-Type: application/json" \
  -d '{
    "tripId": "69dd6111674d2228e4db4b25"
  }'

The response is:

{
  "id": "69dd6111674d2228e4db4b25",
  "status": "REPLANNED",
  "itinerary": [
    {
      "segmentId": "OPT-CHI-1",
      "fromLocation": "JFK",
      "toLocation": "ORD",
      "cost": 320.0
    },
    {
      "segmentId": "OPT-CHI-2",
      "fromLocation": "ORD",
      "toLocation": "SFO",
      "cost": 320.0
    }
  ]
}

The new itinerary reroutes the trip through Chicago (ORD) instead of waiting for the delayed JFK to SFO flight.
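The orchestration described in this step can be sketched as a small pipeline: recall memories, generate options, refine them, then pick one. The interfaces and the `Route` record below are hypothetical simplifications of the project's agents, and choosing the cheapest remaining route is an assumption made for the example, not necessarily the rule the real PlannerAgent applies.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative only: the planner as a pipeline over simplified agents.
final class PlannerSketch {

    record Route(String via, double totalCost) {}

    interface MemoryAgent {
        List<String> recall(String incident);
    }

    interface BookingAgent {
        List<Route> generateOptions(String incident, List<String> memories);
    }

    // Stands in for both the Budget Agent and the Preference Agent.
    interface OptionFilter {
        List<Route> apply(List<Route> options);
    }

    static Optional<Route> replan(String incident,
                                  MemoryAgent memory,
                                  BookingAgent booking,
                                  OptionFilter budget,
                                  OptionFilter preference) {
        // 1) Ask memory for similar past incidents.
        List<String> memories = memory.recall(incident);
        // 2) Ask booking for alternatives, informed by those memories.
        List<Route> options = booking.generateOptions(incident, memories);
        // 3) Refine: budget first, then user preferences.
        List<Route> refined = preference.apply(budget.apply(options));
        // 4) Commit one option (assumption: the cheapest that survives).
        return refined.stream().min(Comparator.comparingDouble(Route::totalCost));
    }
}
```

Because each collaborator is an interface, every agent can be tested or replaced on its own, and an empty `Optional` signals that no option survived the filters.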

Step 5: The Memory Agent uses vector search

First, the Planner Agent asks, “Have we seen something like this before?” If so, the Memory Agent retrieves the matching incidents from incident_memory so they can inform what to do next.

List<IncidentMemory> results = vectorSearchService.findSimilar(query);
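The body of findSimilar is not shown here, but a vector query against Atlas is expressed as a `$vectorSearch` aggregation stage. The sketch below builds that stage as a plain Java map so it runs without any dependencies; the class name `VectorSearchStage` and the index name used in the example are assumptions, and the actual service may construct the query differently. With the MongoDB Java driver, a map like this could be wrapped in an `org.bson.Document` and passed to `aggregate`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: builds a $vectorSearch stage as a nested map.
final class VectorSearchStage {

    // Must match numDimensions in the Step 1 index definition.
    static final int NUM_DIMENSIONS = 1024;

    static Map<String, Object> build(String indexName,
                                     List<Double> queryVector,
                                     int numCandidates,
                                     int limit) {
        if (queryVector.size() != NUM_DIMENSIONS) {
            throw new IllegalArgumentException(
                    "Expected " + NUM_DIMENSIONS + " dimensions but got "
                            + queryVector.size());
        }
        if (limit > numCandidates) {
            throw new IllegalArgumentException("limit cannot exceed numCandidates");
        }
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("index", indexName);          // name of the vector index
        params.put("path", "embedding");         // field indexed in Step 1
        params.put("queryVector", queryVector);  // embedding of the query text
        params.put("numCandidates", numCandidates);
        params.put("limit", limit);              // top-K results to return

        Map<String, Object> stage = new LinkedHashMap<>();
        stage.put("$vectorSearch", params);
        return stage;
    }
}
```

The query vector has to come from the same embedding model as the stored vectors, which is why the sketch rejects anything that is not 1024-dimensional.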

Step 6: Booking agent generates options

Next, the Booking Agent generates its own options, even when memory returns no match. It takes the trip state, the latest event, and whatever memories were recalled:

List<AlternativeRoute> options =
    bookingAgent.generateOptions(tripState, latestEvent, memories);

The Budget Agent then filters those options:

List<AlternativeRoute> budgeted =
    budgetAgent.filterOptions(tripState, options);
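The filtering rule itself is not shown above. One plausible rule, sketched here purely as an assumption, is to keep any option whose total cost stays within the original fare plus the traveler's maxAdditionalBudget. The class `BudgetFilterSketch` and its `Option` record are hypothetical, and the "via DEN" option is invented for contrast; the other numbers come from the earlier examples (a 420.0 original fare, a budget of 250, and two 320.0 legs via ORD).

```java
import java.util.List;

// Illustrative only: one possible budget rule, not the project's actual logic.
final class BudgetFilterSketch {

    record Option(String label, double totalCost) {}

    // Keeps options costing at most originalCost + maxAdditionalBudget.
    static List<Option> filter(double originalCost,
                               double maxAdditionalBudget,
                               List<Option> options) {
        double ceiling = originalCost + maxAdditionalBudget;
        return options.stream()
                .filter(o -> o.totalCost() <= ceiling)
                .toList();
    }

    public static void main(String[] args) {
        List<Option> kept = filter(420.0, 250.0, List.of(
                new Option("via ORD", 320.0 + 320.0),
                new Option("via DEN", 900.0)));
        System.out.println(kept);
    }
}
```

Under this rule the ceiling is 670.0, so the ORD route at 640.0 is kept and the 900.0 alternative is dropped.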

Step 7: The system finally makes the decision

Finally, the trip is updated, and the system records the reasoning behind the change. At this point, when you call:

curl http://localhost:8080/trip/69dd6111674d2228e4db4b25

you get the following response:

{
  "status": "REPLANNED",
  "itinerary": [
    {
      "fromLocation": "JFK",
      "toLocation": "ORD"
    },
    {
      "fromLocation": "ORD",
      "toLocation": "SFO"
    }
  ]
}

In the end, the system did more than detect a delay: it used memory, coordinated multiple agents, and produced a better plan with a fully traceable decision history stored in MongoDB.

The complete code for this multi-agent system is available on the GitHub repository.

Conclusion

In this post, we built a multi-agent system that is adaptive, stateful, and intelligent, all using MongoDB.

Starting from a simple travel itinerary, we saw how a disruption triggered a chain of coordinated actions across multiple agents. The Monitoring Agent detected the issue, the Memory Agent recalled similar past incidents using vector search, and the Planner Agent orchestrated Booking, Budget, and Preference Agents to arrive at a better alternative. Most importantly, every step of this process was persisted, making the system not just intelligent, but also explainable.

What makes this architecture powerful is the role of MongoDB as a unified data platform. Instead of splitting operational data and AI memory across separate systems, MongoDB brings them together. This allows agents to move beyond stateless execution and operate with context and experience.

The vector search capability of MongoDB enables the system to retrieve similar past situations and apply that knowledge to new problems, reducing guesswork and improving decision quality.
