TL;DR: We created an MCP server that loads repos into Runloop’s isolated sandboxes and builds several indices to inform repo navigation.
Check out the repo and try it on a repo of your choice (or this one!):
https://github.com/runloopai/demo-code-understanding-mcp
AI has evolved from a basic coding assistant to a full development partner; developers are using the magic of AI to instantly generate new UI components, tests, and even full features!
When you’re working on a small side project, the magic feels like it never ends. But when you get to work on Monday and ask for help inside your company’s large monorepo, that magic is replaced by chat loops of pure frustration.
As the overall repo context gets larger and your code becomes filled with more and more company specific patterns, AI agents struggle to select the relevant context. This makes sense - AI agents are really ‘day 1’ employees and teaching an intern in English how to do a task can often be more work than simply doing it yourself. However, unlike our favorite interns, the AI agent will remain a day 1 employee no matter how much work it gets through.
So how can we help our favorite AI intern grow over time, or at least make our interactions with them better despite growing context sizes? We have some options: pick a more capable model, or write ever more detailed instructions for it to follow.
Both of these options address the problem, but neither captures the feeling of true magic. You’re still picking a model and giving it a set of instructions to follow, in essence problem-solving for the AI.
Humans have limited context too, yet we use various heuristics to learn how to look through large or even unfamiliar repos efficiently. Surely we can apply the same strategies humans use to our AI agents?
We can start by examining the “indexing strategies” human developers use to navigate and understand large repos that we could emulate:
- Reading file structure and contents directly
- Building a high-level mental map of the codebase
- Tracing runtime behavior by running the tests
- Mining version-control history for design decisions and files that change together
AI is great at the first (better than most people!) but often misses the mark on the last three, creating a significant gap in capability. We set out to test how much those strategies could supercharge our AI agents.
To test whether giving AI agents access to these “indexing strategies” would help, we created an MCP server that loads repos into Runloop’s isolated sandboxes and builds various indices to inform repo navigation. Drop it into Claude and give it a shot! The server provides (a minimal sketch of the tool surface follows the list):
- Terminal access, so models can explore file structure and content on their own
- Repository mapping for a high-level, folder-based overview using Aider’s excellent Repo Map
- AST and semantic search using the excellent cased/kit for intelligent code indexing and retrieval
- Test execution traces to reveal runtime code-flow patterns associated with specific tests
- PR history analysis to surface historical design decisions and which files and code symbols tend to get edited together
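To make that concrete, here is a minimal sketch of how tools like these can be registered with the official MCP Python SDK’s FastMCP helper. The tool names and bodies are illustrative stand-ins, not the actual implementation in runloop_mcp.py (which runs commands inside a Runloop devbox rather than locally).

```python
# Minimal sketch of an MCP tool surface, using the official MCP Python SDK.
# Tool names and bodies are illustrative; see runloop_mcp.py for the real ones.
import subprocess

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("code-understanding")

@mcp.tool()
def run_terminal_command(command: str) -> str:
    """Run a shell command against the repo checkout.

    The real server executes this inside an isolated Runloop devbox,
    not on the local machine.
    """
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=60
    )
    return result.stdout + result.stderr

@mcp.tool()
def recently_changed_files(n: int = 20) -> str:
    """List the files touched by the n most recent commits."""
    return subprocess.run(
        ["git", "log", f"-{n}", "--name-only", "--pretty=format:%h %s"],
        capture_output=True, text=True,
    ).stdout

if __name__ == "__main__":
    mcp.run()  # serves over stdio, the transport Claude and Cursor expect
```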
Want to understand how it works in detail? Follow the quick-start below and let us know how it goes!
Run runloop_mcp_config_setup.py to add this MCP server to Claude or Cursor.
You can also do it manually by editing ~/Library/Application Support/Claude/claude_desktop_config.json or ~/.cursor/mcp.json:
```json
{
  "mcpServers": {
    "code-understanding": {
      "command": "/PATH_TO_YOUR_VENV/bin/uv",
      "args": ["--directory", "/REPO_PATH", "run", "runloop_mcp.py"],
      "env": {
        "RUNLOOP_API_KEY": "YOUR_RUNLOOP_API_KEY",
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY",
        "GH_TOKEN": "YOUR_GH_TOKEN"
      }
    }
  }
}
```
We’ve bundled a runloop_setup.py which initializes all the relevant libraries and creates a snapshot, so you can skip repeated setup and test multiple repos at your own pace. The MCP server is also configured to dynamically create devboxes and download the needed libraries when there isn’t a snapshot present!
```bash
uv run runloop_setup.py
```
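Under the hood, the snapshot flow follows the standard Runloop devbox lifecycle: boot a devbox, install dependencies once, snapshot the disk, then boot later devboxes from the snapshot. The sketch below shows that pattern with the runloop_api_client Python SDK; the method names reflect our reading of the SDK and the install command is illustrative, so treat it as a sketch of what runloop_setup.py does rather than its contents.

```python
# Sketch of the devbox snapshot pattern with Runloop's Python SDK
# (runloop_api_client). Method names reflect our reading of the SDK;
# double-check them against the SDK docs before relying on this.
import os

from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

# 1. Boot a fresh devbox and wait until it is running.
devbox = client.devboxes.create_and_await_running()

# 2. Install the indexing dependencies once (package names are illustrative).
client.devboxes.execute_sync(devbox.id, command="pip install aider-chat cased-kit")

# 3. Snapshot the disk so future devboxes can skip the setup step.
snapshot = client.devboxes.snapshot_disk(devbox.id, name="code-understanding-base")

# Later: boot a pre-warmed devbox straight from the snapshot.
warm = client.devboxes.create_and_await_running(snapshot_id=snapshot.id)
```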
Once it’s connected, try prompts like:
- Tell me about https://github.com/runloopai/runloop-example-code-understanding?
- Which functions in https://github.com/runloopai/runloop-example-code-understanding were recently changed in the git history?
- What are the foundational functions in https://github.com/runloopai/runloop-example-code-understanding?
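To give a flavor of what the PR-history index computes behind that last kind of question, the snippet below derives “files that change together” straight from git log output. It illustrates the co-change heuristic; it is not the server’s actual implementation.

```python
# Illustration of the co-change heuristic behind the PR-history index:
# count how often pairs of files appear in the same commit.
import subprocess
from collections import Counter
from itertools import combinations

log = subprocess.run(
    ["git", "log", "--name-only", "--pretty=format:--commit--"],
    capture_output=True, text=True, check=True,
).stdout

co_changes = Counter()
for commit in log.split("--commit--"):
    files = sorted({line for line in commit.splitlines() if line.strip()})
    for pair in combinations(files, 2):
        co_changes[pair] += 1

# Files that most often change together are likely coupled.
for (a, b), count in co_changes.most_common(10):
    print(f"{count:4d}  {a} <-> {b}")
```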