Wake on HTTP: Spin Up Agents On-Demand with Tunnels

You can now configure a devbox to sleep when idle and wake automatically when an HTTP request arrives. Runloop already provides stable tunnel URLs, which let you run long-lived agents or agents that wait for input from humans or other services. Now those same tunnels can wake the devbox and trigger your AI coding agent on demand, so you can suspend the devbox whenever you're not using it and stop paying for resources that sit idle.

The Problem: Paying for Idle Agent Infrastructure

Many AI agent workflows are sporadic. A webhook fires. A user clicks a button. A scheduled job runs once an hour. The actual work takes minutes, but the compute behind it often runs 24/7.

Running always-on devboxes for infrequent workloads wastes resources and inflates costs. If your coding agent handles a few pull requests a day, why pay for the hours it sits idle between them?

What you really want is infrastructure that spins up only when there's work to do, then sleeps when idle. That's exactly what wake-on-HTTP tunnels provide.

The Solution: Tunnels with Wake-on-HTTP

Runloop tunnels now support wake_on_http=True. This means an incoming HTTP request to a tunnel URL will automatically wake a suspended devbox — in under a second.

The tunnel URL remains stable even when the devbox is asleep. It acts as a persistent endpoint for webhooks, API calls, or browser requests. Think of it as a doorbell — the devbox sleeps until someone rings.

Here's the lifecycle:

1. HTTP request arrives → The request hits the stable tunnel URL.
2. Tunnel wakes the devbox → The infrastructure resumes the suspended devbox in under a second. The caller receives a 503 response with a Retry-After: 5 header while the devbox starts up.
3. Caller retries → The caller retries the request (most webhook providers like GitHub, Stripe, and Slack do this automatically on 503). The devbox is now running and the request is proxied through to your code.
4. Devbox suspends → After the work is done, the devbox suspends again, either manually or automatically via an idle timeout.
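
The lifecycle above can be pictured as a small state machine. The sketch below is a toy model for illustration only, not Runloop's implementation; the class name, the state names, and the finish_resume helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


class ToyWakeOnHttpTunnel:
    """Toy model of the four lifecycle steps (hypothetical, for illustration)."""

    def __init__(self):
        self.devbox_state = "suspended"

    def handle_request(self, path: str) -> Response:
        if self.devbox_state == "suspended":
            # Steps 1-2: the request triggers a resume and the caller is told to retry.
            self.devbox_state = "resuming"
            return Response(503, {"Retry-After": "5"})
        if self.devbox_state == "resuming":
            # Still starting up: keep asking the caller to retry.
            return Response(503, {"Retry-After": "5"})
        # Step 3: the devbox is running, so the request is proxied to the app.
        return Response(200, body=f"handled {path}")

    def finish_resume(self):
        # Stand-in for the infrastructure finishing the resume.
        self.devbox_state = "running"

    def suspend(self):
        # Step 4: manual suspend or idle timeout.
        self.devbox_state = "suspended"


tunnel = ToyWakeOnHttpTunnel()
first = tunnel.handle_request("/webhook")  # wakes the devbox
tunnel.finish_resume()
retry = tunnel.handle_request("/webhook")  # reaches the application
tunnel.suspend()
print(first.status, retry.status)
```

Note that the first request's payload never reaches the application in this model, which matches the behavior described in the next section.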

What Happens to the First Request?

The wake-on-HTTP tunnel returns a 503 Service Unavailable with a Retry-After: 5 header to the first request. The original request payload is not forwarded to your application. Instead, the caller must retry.

For webhook providers like GitHub, this works out of the box. GitHub automatically retries webhook deliveries that receive 5xx responses, with increasing intervals. The first attempt wakes the devbox, and GitHub's retry delivers the payload once the devbox and your server are running.

For custom API clients, implement retry logic that respects the Retry-After header. Wait the indicated number of seconds and resend the request.
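
A minimal sketch of such a client, using only the Python standard library. The function names, the attempt limit, and the 5-second fallback are assumptions chosen for illustration rather than anything Runloop prescribes.

```python
import time
import urllib.error
import urllib.request


def parse_retry_after(value, default=5.0):
    """Return the delay in seconds from a Retry-After header value."""
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Missing header, or the HTTP-date form: fall back to the default.
        return default


def request_with_wake_retry(url, data=None, max_attempts=5, sleep=time.sleep):
    """Send a request, waiting and resending while the tunnel answers 503."""
    for attempt in range(1, max_attempts + 1):
        req = urllib.request.Request(url, data=data)
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                return resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # Only 503 means "devbox is waking"; anything else is a real error.
            if err.code != 503 or attempt == max_attempts:
                raise
            sleep(parse_retry_after(err.headers.get("Retry-After")))
```

Because the original payload is not forwarded, the client resends the full request body on every attempt instead of assuming the first one was delivered.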

For browsers, the tunnel returns an HTML page that auto-refreshes after the delay. No user action is needed.

How Fast Is It?

We benchmarked wake-on-HTTP resume latency in production across multiple trials:

| Metric                                   | Time   |
|------------------------------------------|--------|
| First 503 response (wake trigger)        | ~0.33s |
| Devbox fully running (pure infra resume) | ~0.65s |
| End-to-end with typical app startup      | ~2–3s  |

The infrastructure overhead is sub-second and consistent. The total time your caller waits depends on how fast your application initializes after the devbox resumes. Using blueprints or snapshots keeps this to a minimum.

The Cost Case

The whole point of wake-on-HTTP is that you stop paying for idle compute. Here's a rough comparison for a small devbox (~$0.16/hr):

| Scenario                              | Monthly cost  |
|---------------------------------------|---------------|
| Always-on (24/7)                      | ~$115/month   |
| Wake-on-HTTP, 2 hours active/day      | ~$10/month    |
| Wake-on-HTTP, 20 minutes active/day   | ~$1.60/month  |

The savings scale with how bursty your workload is. If your agent handles a handful of pull requests a day and each takes a few minutes, you're paying for minutes instead of months.
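
The figures in the table follow from simple arithmetic, sketched below. The 30-day month is an assumption, and the calculation covers compute time only; any charges that apply while a devbox is suspended are not modeled.

```python
HOURLY_RATE_USD = 0.16  # small devbox rate quoted above
DAYS_PER_MONTH = 30     # assumption: a 30-day month


def monthly_cost(active_hours_per_day, hourly_rate=HOURLY_RATE_USD):
    """Monthly compute cost for a devbox that is active this many hours per day."""
    return round(active_hours_per_day * hourly_rate * DAYS_PER_MONTH, 2)


always_on = monthly_cost(24)             # 115.2
two_hours = monthly_cost(2)              # 9.6
twenty_minutes = monthly_cost(20 / 60)   # 1.6

print(always_on, two_hours, twenty_minutes)
```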

Step-by-Step: Setting Up a Wake-on-HTTP Tunnel

Prerequisites

- Python: pip install runloop-api-client

- TypeScript: npm install @runloop/api-client

1. Create a Devbox with a Wake-on-HTTP Tunnel

The simplest approach is to configure the tunnel and entrypoint at devbox creation time. The entrypoint is the command that runs when the devbox starts — and importantly, it restarts automatically when the devbox resumes from suspension.

The tunnel URL format is https://{port}-{tunnel_key}.tunnel.runloop.ai, where {port} matches whichever port your service listens on.
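
As a quick illustration of that format, here is a small helper that assembles the URL by hand. The function name and the example key abc123 are hypothetical; in practice the SDK call shown below returns the URL for you.

```python
def build_tunnel_url(port: int, tunnel_key: str) -> str:
    """Assemble a tunnel URL from a port and tunnel key, per the format above."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return f"https://{port}-{tunnel_key}.tunnel.runloop.ai"


# "abc123" is a made-up tunnel key used only for illustration.
print(build_tunnel_url(8080, "abc123"))
```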

Python:

from runloop_api_client import AsyncRunloop
from runloop_api_client.sdk.async_devbox import AsyncDevbox

client = AsyncRunloop()

# Create a devbox with wake-on-HTTP and an entrypoint
devbox = await client.devboxes.create(
    entrypoint="python3 server.py",
    launch_parameters={
        "lifecycle": {
            "after_idle": {
                "idle_time_seconds": 300,
                "on_idle": "suspend",
            },
            "resume_triggers": {
                "http": True,
            },
        },
    },
    tunnel={"auth_mode": "open"},
)

# Get the stable tunnel URL for port 8080
sdk_devbox = AsyncDevbox(client, devbox.id)
tunnel_url = await sdk_devbox.get_tunnel_url(8080)
print(f"Tunnel URL: {tunnel_url}")

TypeScript:

import Runloop from "@runloop/api-client";

const client = new Runloop();

// Create a devbox with wake-on-HTTP and an entrypoint
const devbox = await client.devboxes.create({
  entrypoint: "python3 server.py",
  launchParameters: {
    lifecycle: {
      afterIdle: {
        idleTimeSeconds: 300,
        onIdle: "suspend",
      },
      resumeTriggers: {
        http: true,
      },
    },
  },
  tunnel: { authMode: "open" },
});

// Get the stable tunnel URL for port 8080
const tunnelUrl = await devbox.getTunnelUrl(8080);
console.log(`Tunnel URL: ${tunnelUrl}`);

Note on auth_mode: The "open" mode makes the tunnel publicly accessible — anyone with the URL can reach it. For production webhook endpoints, use "authenticated" mode (which requires a bearer token) and verify request signatures from your webhook provider (e.g., GitHub's X-Hub-Signature-256 header). See the tunnel authentication docs for details.

Checkpoint: At this point, you should see a tunnel URL printed to your console. This URL is stable — it won't change even when the devbox suspends and wakes.

2. Suspend the Devbox

Once the tunnel is configured and your agent code is ready, suspend the devbox to stop accruing compute costs.

Python:

# Suspend devbox to save costs when idle
await devbox.suspend()

TypeScript:

// Suspend devbox to save costs when idle
await devbox.suspend();

Checkpoint: The devbox is now suspended. The tunnel URL is still active and listening. Any HTTP request to that URL will wake the devbox automatically.

3. Send a Request to Wake the Devbox

When an HTTP request hits the tunnel URL, the devbox wakes up. You can test this with curl:

curl -i https://8080-{your-tunnel-key}.tunnel.runloop.ai

The first response will be a 503 with a Retry-After: 5 header. Wait a few seconds and retry — you'll get a response from your running service:

# First request: triggers wake
HTTP/1.1 503 Service Unavailable
Retry-After: 5

# Retry after a few seconds: devbox is running
HTTP/1.1 200 OK

Example: An On-Demand Coding Agent Behind a Webhook

Let's walk through a practical use case. We'll set up a GitHub webhook that triggers an AI coding agent running inside a Runloop devbox.

The Setup

When a pull request event fires, GitHub sends a POST request to your tunnel URL. The devbox wakes, the entrypoint restarts your server, and the agent processes the event. GitHub's webhook retry logic handles the initial 503 — the retry arrives a few seconds later when the devbox is running.

Agent Code Inside the Devbox

Here's a Python server that receives GitHub webhook payloads, verifies their signature, and uses Claude to review pull request diffs:

from http.server import HTTPServer, BaseHTTPRequestHandler
import hashlib
import hmac
import json
import os
import subprocess
import anthropic

# Note: Protect your credentials using agent gateway
client = anthropic.Anthropic()
WEBHOOK_SECRET = os.environ.get("GITHUB_WEBHOOK_SECRET", "")


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        content_length = int(self.headers["Content-Length"])
        raw_body = self.rfile.read(content_length)

        # Verify GitHub webhook signature
        if WEBHOOK_SECRET:
            signature = self.headers.get("X-Hub-Signature-256", "")
            expected = "sha256=" + hmac.new(
                WEBHOOK_SECRET.encode(), raw_body, hashlib.sha256
            ).hexdigest()
            if not hmac.compare_digest(signature, expected):
                self.send_response(403)
                self.end_headers()
                return

        payload = json.loads(raw_body)
        event_type = self.headers.get("X-GitHub-Event", "unknown")

        if event_type == "pull_request" and payload.get("action") == "opened":
            review = review_pull_request(payload)
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(json.dumps({"review": review}).encode())
        else:
            self.send_response(200)
            self.end_headers()


def review_pull_request(payload):
    """Fetch the PR diff and ask Claude to review it."""
    repo = payload["repository"]["full_name"]
    pr_number = payload["pull_request"]["number"]

    # Fetch the diff using the GitHub CLI (pre-installed via blueprint)
    diff = subprocess.run(
        ["gh", "pr", "diff", str(pr_number), "--repo", repo],
        capture_output=True,
        text=True,
    ).stdout

    # Ask Claude to review the diff
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                f"Review this pull request diff. Flag any bugs, "
                f"security issues, or style problems:\n\n{diff}"
            ),
        }],
    )
    return message.content[0].text


server = HTTPServer(("0.0.0.0", 8080), WebhookHandler)
print("Webhook server listening on port 8080")
server.serve_forever()
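
To exercise the signature check locally before pointing GitHub at the tunnel, you can compute the same HMAC the handler expects. This is a sketch: the helper names and the test-secret value are placeholders, and the signing scheme simply mirrors the verification code above.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Build the value the handler expects in X-Hub-Signature-256."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Constant-time comparison, matching the check in WebhookHandler."""
    return hmac.compare_digest(sign_payload(secret, body), signature_header)


body = b'{"action": "opened"}'
header = sign_payload("test-secret", body)  # placeholder secret
print(is_valid_signature("test-secret", body, header))         # True
print(is_valid_signature("test-secret", body + b" ", header))  # False
```

Send the body with the computed header to your tunnel URL to confirm that a correctly signed request gets past the 403 branch.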

Registering the Webhook
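
You can register the webhook in the repository's settings page, or through GitHub's REST API (POST /repos/{owner}/{repo}/hooks). The sketch below builds that API request with the standard library. The function name and its arguments are placeholders, and the token you supply needs permission to manage webhooks on the repository.

```python
import json
import urllib.request


def build_create_webhook_request(owner, repo, tunnel_url, webhook_secret, github_token):
    """Build (but do not send) a request that creates a pull_request webhook."""
    payload = {
        "name": "web",
        "active": True,
        "events": ["pull_request"],
        "config": {
            "url": tunnel_url,          # the stable tunnel URL from step 1
            "content_type": "json",
            "secret": webhook_secret,   # same value as GITHUB_WEBHOOK_SECRET
        },
    }
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/hooks",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {github_token}",
            "Content-Type": "application/json",
        },
    )
```

Pass the returned request to urllib.request.urlopen to create the webhook.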

Once the webhook is registered with your tunnel URL as its payload URL, GitHub sends a POST request to that URL whenever a pull request is opened. The first delivery gets a 503 while the devbox wakes, and the retried delivery reaches your now-running agent.

Best Practices and Tips

Always protect credentials with Agent Gateway

Agent Gateway keeps your credentials from ever landing on the devbox, which protects your agent against secret exfiltration. To add Agent Gateway support to our example, add the following setup configuration:

async def setup():
    # ── 1. Create gateway configs ──────────────────────────────────────
    # Anthropic API — agent never sees sk-ant-...
    anthropic_gw = await runloop.gateway_configs.create(
        name="anthropic-gateway",
        endpoint="https://api.anthropic.com",
        auth_mechanism={"type": "bearer"},
        description="Anthropic Claude API proxy",
    )

    # GitHub API — agent never sees ghp_... token
    github_gw = await runloop.gateway_configs.create(
        name="github-gateway",
        endpoint="https://api.github.com",
        auth_mechanism={"type": "bearer"},
        description="GitHub API proxy",
    )

    # ── 2. Store secrets (one-time, or use rli secret create) ──────────
    await runloop.api.secrets.create(name="ANTHROPIC_KEY", value="sk-ant-api03-...")
    await runloop.api.secrets.create(name="GITHUB_TOKEN", value="ghp_...")
    await runloop.api.secrets.create(name="GH_WEBHOOK_SECRET", value="whsec_...")

    # ── 3. Lock down egress with a network policy ──────────────────────
    policy = await runloop.network_policies.create(
        name="pr-review-policy",
        allow_all=False,
        allowed_hostnames=[
            "api.anthropic.com",
            "api.github.com",
            "github.com",
        ],
        allow_agent_gateway=True,  # permits gateway.runloop.ai traffic
    )

    # ── 4. Launch devbox with gateways ─────────────────────────────────
    # The webhook secret is non-credential config, so it can go via
    # environment_variables. But if you want it hidden from the agent too,
    # you could wrap it in a gateway with a custom header auth_mechanism.
    devbox = await runloop.devbox.create(
        name="pr-review-webhook",
        gateways={
            # key → env var prefix: $ANTHROPIC_URL, $ANTHROPIC
            "ANTHROPIC": {
                "gateway": anthropic_gw.id,
                "secret": "ANTHROPIC_KEY",
            },
            # key → env var prefix: $GITHUB_URL, $GITHUB
            "GITHUB": {
                "gateway": github_gw.id,
                "secret": "GITHUB_TOKEN",
            },
        },
        environment_variables={
            # Webhook secret is only used server-side for HMAC verification,
            # not sent outbound — account secrets work fine here.
            "GITHUB_WEBHOOK_SECRET": "{{secret:GH_WEBHOOK_SECRET}}",
        },
        launch_parameters={
            "network_policy_id": policy.id,
            "lifecycle": {
                # Note we're configuring suspend on idle here too
                "after_idle": {
                    "idle_time_seconds": 300,
                    "on_idle": "suspend",
                },
                "resume_triggers": {
                    "http": True,
                },
            },
        },
    )

    print(f"Devbox ready: {devbox.id}")

Use blueprints for fast, reproducible boot times.

A blueprint pre-installs your dependencies (Python packages, CLI tools, language models) so the devbox is ready to serve requests within seconds of waking. Without a blueprint, you'd need to install dependencies on every resume.

Set your server as the devbox entrypoint.

The entrypoint command restarts automatically when a devbox resumes from suspension. This is critical for wake-on-HTTP — without it, your server won't be running when traffic arrives after a wake.

devbox = await sdk.devbox.create(
    entrypoint="python3 server.py",  # Auto-restarts on resume
    tunnel={"auth_mode": "open"},
    launch_parameters={
        "lifecycle": {
            "resume_triggers": {
                "http": True,
            },
        },
    },
)

Automate the sleep/wake cycle with idle timeouts.

Instead of manually suspending after each request, configure an idle timeout to suspend automatically:

devbox = await sdk.devbox.create(
    entrypoint="python3 server.py",
    tunnel={"auth_mode": "open"},
    launch_parameters={
      "lifecycle": {
        "after_idle": {
          "idle_time_seconds": 300,
          "on_idle": "suspend",
        },
        "resume_triggers": {
          "http": True,
        },
      },
    },
)

With this configuration, the devbox suspends after 5 minutes of inactivity. HTTP traffic through the tunnel counts as activity (via http_keep_alive, enabled by default), so the devbox stays awake while requests are flowing and sleeps when they stop. Combined with wake_on_http, this gives you a fully automatic sleep/wake cycle: no manual suspend calls needed.
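
The way activity resets the idle clock can be sketched with a toy timer. This is an illustration of the behavior described above, not Runloop's implementation; the class and method names are hypothetical.

```python
class IdleTimer:
    """Toy model of an idle timeout that is reset by HTTP activity."""

    def __init__(self, idle_time_seconds=300):
        self.idle_time_seconds = idle_time_seconds
        self.last_activity = 0.0

    def record_activity(self, now):
        # Each request through the tunnel counts as activity.
        self.last_activity = now

    def should_suspend(self, now):
        return now - self.last_activity >= self.idle_time_seconds


timer = IdleTimer(300)
timer.record_activity(now=0)     # request arrives at t=0s
timer.record_activity(now=200)   # another request resets the clock
print(timer.should_suspend(now=400))  # False: only 200s idle
print(timer.should_suspend(now=500))  # True: 300s idle
```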

Use "authenticated" mode for production endpoints.

Open tunnels are convenient for development, but production webhook endpoints should use "authenticated" mode and verify request signatures from your webhook provider. See the tunnel authentication docs for setup instructions.
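
For callers you control, a request to an authenticated tunnel might look like the sketch below. It assumes the bearer token travels in a standard Authorization header, which this article does not spell out, so check the tunnel authentication docs for the exact scheme. The function name is hypothetical.

```python
import urllib.request


def build_authenticated_request(tunnel_url, token, body=None):
    """Build a request carrying a bearer token (assumed header scheme)."""
    return urllib.request.Request(
        tunnel_url,
        data=body,
        headers={"Authorization": f"Bearer {token}"},
    )
```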

Conclusion

Wake-on-HTTP tunnels make it possible to run AI agents and bursty workloads without paying for idle compute. The infrastructure resumes a suspended devbox in under a second — fast enough that webhook retries and browser refreshes handle the transition seamlessly.

The setup takes a few lines of code: create a devbox with an entrypoint, enable a tunnel with wake_on_http=True, and configure an idle timeout. The tunnel URL stays stable. Requests wake the devbox automatically. Your agent does its work and goes back to sleep.

If your workload runs two hours a day instead of twenty-four, that's over 90% off your compute bill — with no changes to your application code.
