Agent Sandboxes Benchmark

TTI (Time to Interactive) is the wall-clock time from the API call to the first command execution. Lower is better.

Daily: Time to Interactive (TTI)

API Request → Provisioning → Boot → Ready → First Command
└───────────────────── TTI ─────────────────────┘

Each benchmark iteration creates a fresh sandbox, runs echo "benchmark", and records the wall-clock time. Every provider gets 10 iterations per day, fully automated.
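The measurement loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual harness: the `Sandbox` protocol, `measure_tti`, `run_benchmark`, and the `FakeSandbox` stand-in are all hypothetical names introduced here, and the summary statistics (median, min, max) are an assumption about how the samples might be reported.

```python
import statistics
import time
from typing import Callable, Protocol


class Sandbox(Protocol):
    """Hypothetical sandbox handle; not the interface of any real SDK."""

    def run(self, command: str) -> str: ...
    def destroy(self) -> None: ...


def measure_tti(create_sandbox: Callable[[], Sandbox]) -> float:
    """Return seconds from the create call until the first command returns."""
    start = time.perf_counter()            # clock starts at the API request
    sandbox = create_sandbox()             # provisioning, boot, ready
    try:
        sandbox.run('echo "benchmark"')    # first command execution
        return time.perf_counter() - start
    finally:
        sandbox.destroy()                  # cleanup is not part of the timing


def run_benchmark(create_sandbox: Callable[[], Sandbox], iterations: int = 10) -> dict:
    """Collect one TTI sample per fresh sandbox and summarize them."""
    samples = [measure_tti(create_sandbox) for _ in range(iterations)]
    return {
        "iterations": len(samples),
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


class FakeSandbox:
    """Local stand-in so the sketch runs without a provider account."""

    def __init__(self, boot_seconds: float) -> None:
        time.sleep(boot_seconds)           # simulates provisioning and boot

    def run(self, command: str) -> str:
        return "benchmark"

    def destroy(self) -> None:
        pass


if __name__ == "__main__":
    print(run_benchmark(lambda: FakeSandbox(0.01)))
```

Using a monotonic clock such as `time.perf_counter` keeps the samples unaffected by system clock adjustments during a run.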

Powered by ComputeSDK — We use ComputeSDK, a multi-provider SDK, to test all sandbox providers with the same code. One API, multiple providers, fair comparison. Interested in multi-provider failover, sandbox packing, and warm pooling? Check out ComputeSDK.
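To illustrate the "same code, multiple providers" idea, here is a small sketch in which every provider is driven through one identical function and the results are ranked by median time. It does not use or imitate the ComputeSDK API: `compare_providers`, `fake_provider`, and the provider names are hypothetical, and the sleeps merely stand in for real sandbox startup.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple


def time_call(workload: Callable[[], None]) -> float:
    """Wall-clock seconds taken by one call of the workload."""
    start = time.perf_counter()
    workload()
    return time.perf_counter() - start


def compare_providers(
    providers: Dict[str, Callable[[], None]], iterations: int = 10
) -> List[Tuple[str, float]]:
    """Run the identical loop against every provider; rank by median time."""
    results = []
    for name, create_and_run in providers.items():
        samples = [time_call(create_and_run) for _ in range(iterations)]
        results.append((name, statistics.median(samples)))
    return sorted(results, key=lambda item: item[1])  # fastest first


def fake_provider(delay_seconds: float) -> Callable[[], None]:
    """Hypothetical provider: sleeping stands in for create + first command."""
    def create_and_run() -> None:
        time.sleep(delay_seconds)
    return create_and_run


providers = {
    "provider-a": fake_provider(0.05),
    "provider-b": fake_provider(0.001),
}
ranking = compare_providers(providers, iterations=3)
for name, median_s in ranking:
    print(f"{name}: {median_s * 1000:.1f} ms")
```

Because the timing loop never branches on the provider name, any difference in the ranking comes from the providers themselves rather than from the harness.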

Sponsor-only tests coming soon: Stress tests, warm starts, multi-region, and more. See roadmap →

Full methodology →

📖 Open source — All benchmark code is public
📊 Raw data — Every result committed to repo
🔁 Reproducible — Anyone can run the same tests
⚙️ Automated — Daily at 00:00 UTC (5pm Pacific during daylight time, 4pm during standard time) via GitHub Actions on Namespace runners
🛡️ Independent — Sponsors cannot influence results
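The schedule in the list above is given in both Pacific time and UTC. GitHub Actions schedules are expressed in UTC, and the Pacific offset shifts with daylight saving, so the local hour of a 00:00 UTC run changes during the year. The sketch below checks this with fixed offsets; the `PST`/`PDT` constants and the `local_time_of_run` helper are illustrative names, and the example date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, used here instead of a tz database to keep the sketch self-contained.
PST = timezone(timedelta(hours=-8), "PST")  # Pacific Standard Time
PDT = timezone(timedelta(hours=-7), "PDT")  # Pacific Daylight Time


def local_time_of_run(utc_hour: int, tz: timezone) -> str:
    """Convert a daily UTC run hour to the given fixed-offset zone."""
    run = datetime(2026, 1, 1, utc_hour, 0, tzinfo=timezone.utc)
    return run.astimezone(tz).strftime("%H:%M %Z")


winter = local_time_of_run(0, PST)
summer = local_time_of_run(0, PDT)
print(winter)  # 16:00 PST
print(summer)  # 17:00 PDT
```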

Sponsors fund large-scale infrastructure tests but cannot influence methodology or results.

Become a sponsor →

MIT License

Hacker News

tin7in

Comments

By agentica_ai 2026-02-27 16:56
