
Why is Nirvana benchmarking LangChain agents?

April Wong


Here's another background piece before the report drops, so we're on the same page about what LangChain is, how it works, where it hits the cloud, and exactly which part we are benchmarking.

What is LangChain?


When you use ChatGPT or Claude, you type a question and get an answer. One API call. That's great for a conversation, but companies need more than that. They need the AI to search their documents, check a database, look up a customer record, take an action, and remember what happened.

That's why the industry has moved to agents. You're probably already using them. These don't just generate text. They reason, call tools, observe results, and keep going. The LLM is just the brain. The brain needs hands.

That's what LangChain does. It's the most popular open-source framework (138K GitHub stars) for connecting an LLM to real tools and data. Companies like Klarna, Uber, and Replit use it to build AI agents that keep working until the job is done.

How the loop works

Before anything runs, LangChain bundles three components: the LLM (the brain: GPT-4, Claude, or Llama), tools (functions the agent can call to search, query a database, send a message, or run code), and a prompt (instructions for how to reason and when to stop).
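Here is a minimal sketch of that bundle in plain Python. It is not LangChain's actual API: the names AgentBundle, fake_llm, and vector_search are hypothetical, and the LLM is a stub so the example runs without a model or a network connection.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AgentBundle:
    """Hypothetical container for the three pieces an agent needs."""
    llm: Callable[[str], str]               # the brain: text in, text out
    tools: Dict[str, Callable[[str], str]]  # the hands: named functions
    prompt: str                             # how to reason and when to stop


def fake_llm(text: str) -> str:
    # Stand-in for a real model call (GPT-4, Claude, Llama).
    return "FINAL: stub answer"


def vector_search(query: str) -> str:
    # Stand-in for a tool that would query a vector database.
    return f"results for {query!r}"


bundle = AgentBundle(
    llm=fake_llm,
    tools={"vector_search": vector_search},
    prompt="Think step by step. Reply with FINAL: <answer> when done.",
)
print(sorted(bundle.tools))
```

The point of the sketch is only the shape: one model, a dictionary of callable tools, and one set of instructions, all handed to the agent together.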

Once bundled, the ReAct loop starts. Reason, Act, Observe.

Reason. The agent reads the user's question and conversation history. The LLM thinks about what to do next. "I need to search for refund policy documents."

Act. This is where it gets interesting for us at Nirvana. The LLM picks a tool and formats the input, for example: call vector_search with "refund policy enterprise". That tool call doesn't stay inside the LLM. It leaves the model and hits a real application running on real infrastructure: a vector database that has to search millions of records, a cache layer that has to look up previous results, a checkpoint store that has to write the agent's state to disk. Every "act" that lands on one of these is a storage operation happening on your VM.

Observe. LangChain captures the tool's result and feeds it back to the agent. "Found 3 documents about enterprise refund policies."

Then it loops. The agent reads the observation, decides if it has enough, and either acts again or returns the final answer. One query might loop 3, 5, 10 times. Each loop generates more tool calls, more disk I/O.
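The loop above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not LangChain's implementation: scripted_llm is a stub that searches once and then answers, vector_search returns a canned result, and the "ACT"/"FINAL" text protocol is made up for the example.

```python
from typing import Callable, Dict, List, Tuple


def scripted_llm(history: List[str]) -> str:
    """Stub for the model: search once, then answer.

    A real LLM would decide this from the prompt and the history.
    """
    if not any(line.startswith("Observation:") for line in history):
        return "ACT vector_search refund policy enterprise"
    return "FINAL Enterprise refunds are covered in 3 documents."


def vector_search(query: str) -> str:
    # Stand-in for a real vector database lookup on the VM.
    return "Found 3 documents about enterprise refund policies."


def run_agent(
    question: str,
    llm: Callable[[List[str]], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> Tuple[str, int]:
    """Run Reason -> Act -> Observe until the LLM returns a final answer."""
    history = [f"Question: {question}"]
    tool_calls = 0
    for _ in range(max_steps):
        decision = llm(history)                      # Reason
        if decision.startswith("FINAL "):
            return decision[len("FINAL "):], tool_calls
        _, tool_name, tool_input = decision.split(" ", 2)
        result = tools[tool_name](tool_input)        # Act
        tool_calls += 1
        history.append(f"Observation: {result}")     # Observe
    return "Stopped: step limit reached.", tool_calls


answer, calls = run_agent(
    "What is the enterprise refund policy?",
    scripted_llm,
    {"vector_search": vector_search},
)
print(answer, calls)
```

With this stub the agent makes one tool call and then answers. A real model could take several passes through the loop, and every pass that reaches the "Act" line is another request to whatever backs the tool.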

Where those tool calls land

Every time the agent acts, it hits a real application somewhere. Some of those are external: web and search (Tavily, SerpAPI, Brave), workspace tools (Slack, Gmail, Google Drive, GitHub), API calls to third-party platforms, and developer tools (Docker, code sandboxes). These are mostly compute-bound.

Then there's the data and storage layer: Qdrant for vector search, Redis for caching and session state, Postgres for checkpoints and agent memory. These run on your VM. Every read and write hits the disk attached to your machine. And unlike the external API calls, the speed of these operations depends entirely on what storage is underneath them.

When the index lives on disk, every vector search walks an HNSW graph through hundreds of dependent disk reads, each one waiting on the last. Every cache miss reads from disk. LangGraph, when configured with a Postgres checkpointer, writes a checkpoint at every single step of the agent loop, by design, for fault tolerance. A 10-step agent generates 10 Postgres writes per query per user.

Multiply by a few hundred concurrent users and the disk gets busy fast.
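To make that arithmetic concrete, here is a rough sketch that counts checkpoint writes. It rests on assumptions: an in-memory SQLite table stands in for Postgres so the code runs anywhere, run_query is a hypothetical agent that saves its state after every step, and 300 is just one reading of "a few hundred" users.

```python
import json
import sqlite3

# Assumption: SQLite in memory stands in for Postgres so this runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkpoints (query_id TEXT, step INTEGER, state TEXT)"
)


def run_query(query_id: str, steps: int) -> None:
    """Simulate an agent that saves its state after every step."""
    state = {"observations": []}
    for step in range(1, steps + 1):
        state["observations"].append(f"result of step {step}")
        conn.execute(
            "INSERT INTO checkpoints VALUES (?, ?, ?)",
            (query_id, step, json.dumps(state)),
        )
        conn.commit()  # one commit per step


STEPS_PER_QUERY = 10    # the 10-step agent from the text
CONCURRENT_USERS = 300  # assumption: one reading of "a few hundred"

run_query("q1", STEPS_PER_QUERY)
writes_per_query = conn.execute(
    "SELECT COUNT(*) FROM checkpoints WHERE query_id = 'q1'"
).fetchone()[0]
total_writes = writes_per_query * CONCURRENT_USERS
print(writes_per_query, total_writes)
```

Under these assumed numbers, one query produces 10 checkpoint writes, and 300 users each running one query at the same time produce 3,000, before counting any vector searches or cache lookups.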

That's what we're benchmarking

We build a cloud and awesome block storage. When we broke down how LangChain agents actually work, we realized the data and storage layer is the one place where what we build directly affects how fast agents finish work.

That's why a cloud company like Nirvana is benchmarking an LLM framework. It's not about the LLM. It's about the infrastructure, specifically the storage, that the LLM's tools and their workloads run on.

Report dropping soon.


This is a companion piece to our background research post. The full benchmark report and open-source repo are coming soon.

