Now you can run an agent, suspend the box, and wake it up again when you need it


Most agent demos don't fail because the model is weak or the prompts are wrong. They fail because the environment disappears.
An agent can spend thirty minutes installing dependencies, processing data, or even spinning up a database, only for everything to be wiped at the end of the run. The next time it starts, it repeats the same setup work from scratch. That loop might be tolerable in a one-off demo, but it breaks down as soon as you try to do real work.
The OpenAI Agents SDK fixes an important part of this by giving developers a clean way to define agent behavior. You can structure reasoning, attach tools, and build workflows that actually resemble programs. But it deliberately leaves one question open, and it is the one that matters most in practice:
Where does the agent actually run?
Runloop is the answer to that question. It is a cloud sandbox provider designed for agents, and it plugs directly into the Agents SDK to give those agents something they have been missing this whole time: a persistent machine.
Today, most agents are forced into a stateless model where every run starts from zero, does some work, and then loses everything. That model is simple, but it does not map to real workloads. The moment an agent needs to load large datasets, install system dependencies, or run multi-step pipelines, the cost of restarting becomes obvious.
You do not want an agent that repeats work. You want an agent that makes progress.
That is the core limitation Runloop addresses. Instead of treating each run as disposable, it treats the environment as something that can evolve over time.
One of the more subtle strengths of the Agents SDK is that it separates behavior from execution. You define what the agent should do, and a sandbox backend determines where that work actually happens. That separation makes it possible to swap in different execution environments without changing your agent logic.
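To make that separation concrete, here is a minimal sketch in Python. Every name in it (`SandboxBackend`, `LocalBackend`, `RecordingBackend`, `agent_step`) is a hypothetical illustration, not part of the Agents SDK or the Runloop API. It only shows the shape of the idea: agent logic depends on an interface, and the backend behind that interface can change.

```python
import subprocess
from typing import Protocol


class SandboxBackend(Protocol):
    """Hypothetical interface: anything that can execute a shell command."""

    def run(self, command: str) -> str: ...


class LocalBackend:
    """Runs commands in a local subprocess (ephemeral, for illustration only)."""

    def run(self, command: str) -> str:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, check=True
        )
        return result.stdout.strip()


class RecordingBackend:
    """Stand-in for a remote sandbox: records commands instead of executing them."""

    def __init__(self) -> None:
        self.commands: list[str] = []

    def run(self, command: str) -> str:
        self.commands.append(command)
        return f"ran: {command}"


def agent_step(backend: SandboxBackend, command: str) -> str:
    """Agent logic depends only on the interface, not on where the work runs."""
    return backend.run(command)
```

Passing `LocalBackend()` or `RecordingBackend()` to `agent_step` changes where the command executes without touching the agent logic itself.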
Runloop sits exactly in that layer. It is not a wrapper and not an add-on. It is the infrastructure that executes the agent.
Once you plug it in, the agent is no longer running in an ephemeral process. It is running inside a managed environment that can persist, pause, and resume.
The biggest shift Runloop introduces is persistence. Instead of destroying the sandbox after execution, you can suspend it and resume it later with everything intact. Files, installed packages, databases, and even running services are preserved and restored as part of the environment state.
This sounds like a small change, but it fundamentally alters how agents behave. An agent can now run for hours, pause overnight, and continue the next day without redoing any setup. It can checkpoint its progress, wait for input, and resume without losing context.
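A toy model can show why this matters. The sketch below is not the Runloop API; `PersistentSandbox` and its methods are hypothetical names, and the state is held in memory rather than on a real machine. It illustrates one point only: after a suspend and resume, setup work that was already done does not need to be repeated.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


@dataclass
class PersistentSandbox:
    """Hypothetical in-memory model of a sandbox that survives suspend/resume."""

    state: State = State.RUNNING
    packages: set[str] = field(default_factory=set)

    def install(self, package: str) -> bool:
        """Install a package; return True only if real work was done."""
        self._require_running()
        if package in self.packages:
            return False  # already present, nothing to redo
        self.packages.add(package)
        return True

    def suspend(self) -> None:
        """Pause the sandbox; installed packages are kept, not wiped."""
        self._require_running()
        self.state = State.SUSPENDED

    def resume(self) -> None:
        """Wake the sandbox with its previous state intact."""
        if self.state is not State.SUSPENDED:
            raise RuntimeError("sandbox is not suspended")
        self.state = State.RUNNING

    def _require_running(self) -> None:
        if self.state is not State.RUNNING:
            raise RuntimeError("sandbox is suspended; resume it first")


# Day one: do the expensive setup, then suspend.
box = PersistentSandbox()
did_work_first_time = box.install("pandas")
box.suspend()

# Day two: resume and ask for the same setup again.
box.resume()
did_work_second_time = box.install("pandas")
```

In this model the first `install` call does the work and the second is a no-op, which is the behavior a stateless run cannot offer.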
Agents stop behaving like scripts and start behaving like systems.
Once persistence exists, snapshots become more than just a recovery mechanism. They become a way to structure work over time. Runloop allows you to capture the full state of a sandbox and restore it later, which means a single agent task no longer has to be confined to a single execution.
An agent can pause at a natural boundary, store its state, and resume later, whether that is minutes or days afterward. This makes it possible to introduce human approval steps, external dependencies, or staged processing without forcing the agent to start over.
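Here is a rough sketch of that pause-and-resume pattern with a human approval step. It is an assumption-heavy simplification: the stage names, the `state` dictionary, and the `snapshot`/`restore` helpers are all hypothetical, and the snapshot here is just serialized task state, whereas a real sandbox snapshot would capture the whole environment.

```python
import json

# Hypothetical pipeline with one stage that needs outside approval.
STAGES = ["extract", "transform", "await_approval", "load"]


def run_until_pause(state: dict) -> dict:
    """Advance through stages until one needs outside input or the work is done."""
    while state["next_stage"] < len(STAGES):
        stage = STAGES[state["next_stage"]]
        if stage == "await_approval" and not state["approved"]:
            break  # natural boundary: stop here and wait for a human
        state["completed"].append(stage)
        state["next_stage"] += 1
    return state


def snapshot(state: dict) -> str:
    """Capture the task state so it can be restored later."""
    return json.dumps(state)


def restore(blob: str) -> dict:
    """Rebuild the task state from a snapshot."""
    return json.loads(blob)


# First session: run until approval is needed, then store the state.
state = {"next_stage": 0, "completed": [], "approved": False}
blob = snapshot(run_until_pause(state))

# Later session (minutes or days afterward): restore, approve, continue.
resumed = restore(blob)
resumed["approved"] = True
finished = run_until_pause(resumed)
```

The second session picks up at the approval stage instead of rerunning `extract` and `transform`.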
The result is a different mental model. Instead of thinking in terms of isolated runs, you start thinking in terms of continuous workflows that share a persistent environment.
Most sandboxed environments for agents are intentionally limited. They restrict system access, avoid running services, and simplify execution to keep things predictable. That approach works for narrow tasks but quickly becomes a bottleneck when agents need to interact with real infrastructure.
Runloop takes a more practical approach. Agents run in full Linux environments where they can install system packages, run containers, and manage services as needed. Docker-in-Docker support makes it possible to spin up databases or multi-service systems inside the sandbox itself, which means the agent can create and control its own dependencies.
Because the environment persists, all of that setup work persists as well. The agent does not need to rebuild its world every time it runs.
Another place where most agent systems struggle is environment setup. Even when the logic is correct, the overhead of installing dependencies and configuring the environment on every run adds friction and variability.
Runloop introduces Blueprints as a way to standardize this. A blueprint is a pre-configured environment that includes the operating system, runtimes, and any specialized capabilities you need.
Instead of rebuilding environments from scratch, you start from a known baseline and build on top of it. Over time, those environments can be snapshotted, shared, and reused, which makes agent execution more predictable and much easier to scale across teams.
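As a loose illustration of "start from a known baseline and build on top of it", consider the sketch below. The `Blueprint` class, its fields, and the image and package names are hypothetical and do not reflect how Runloop defines blueprints; the sketch only models the layering idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blueprint:
    """Hypothetical description of a pre-configured environment."""

    base_image: str
    packages: tuple[str, ...] = ()

    def extend(self, *extra: str) -> "Blueprint":
        """Return a new blueprint layered on this one; the baseline is unchanged."""
        # dict.fromkeys removes duplicates while keeping first-seen order.
        merged = tuple(dict.fromkeys(self.packages + extra))
        return Blueprint(self.base_image, merged)


# A shared baseline, and a team-specific environment built on top of it.
baseline = Blueprint("ubuntu:22.04", ("python3",))
data_team = baseline.extend("postgresql-client", "python3")
```

Because the baseline is immutable, several teams can derive their own environments from it without affecting one another.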
As soon as agents become long-running systems, visibility becomes a real concern. It is no longer enough to look at a final output or a short log. You need to understand what the agent did over time, how it made decisions, and where things might have gone wrong.
Runloop introduces Axons as a way to stream agent activity as structured events. This provides a real-time view into execution and makes it possible to trace behavior across long workflows, which is essential when agents are operating over hours or days instead of seconds.
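To give a feel for what working with structured events looks like, here is a small sketch that reads a stream of JSON lines and picks out the steps that failed. The event schema (`type`, `step`, `exit_code`) is an assumption made for illustration and is not the actual Axons format.

```python
import json
from typing import Iterable, Iterator


def parse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode one JSON event per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def failed_steps(events: Iterable[dict]) -> list[str]:
    """Return the names of steps whose command finished with a non-zero exit code."""
    return [
        event["step"]
        for event in events
        if event.get("type") == "command_finished" and event.get("exit_code", 0) != 0
    ]


# Hypothetical event stream from a long-running workflow.
stream = [
    '{"type": "command_started", "step": "install"}',
    '{"type": "command_finished", "step": "install", "exit_code": 0}',
    "",
    '{"type": "command_finished", "step": "migrate", "exit_code": 1}',
]
failures = failed_steps(parse_events(stream))
```

Because each event is structured data rather than free-form log text, the same stream can feed dashboards, alerts, or a step-by-step trace of a workflow that ran for hours.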
If agents are going to run like systems, they need to be observable like systems.
The combination of the Agents SDK and Runloop changes what an agent actually is. The SDK defines how the agent thinks and acts, while Runloop provides the environment where that behavior can persist and evolve.
That combination turns agents from stateless executions into stateful programs that maintain progress, interact with real infrastructure, and operate over time. It moves agents out of the world of demos and into the world of systems that can do real work.
Agents are no longer just API calls.
They are programs.