Reliable sandboxes for AI benchmarks.
Harbor-native sandbox infrastructure that prevents agents from cheating.
$ harbor run \
    --dataset terminal-bench@2.0 \
    --agent claude-code \
    --model anthropic/claude-opus-4-6 \
    --env islo \
    --n-concurrent 500

Why Islo?
Gateway profiles
Per-sandbox network rules. Control what agents can reach by host, path, method, and rate limit.
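A rule of this shape might be evaluated roughly as follows. This is a minimal Python sketch with assumed field names and a naive sliding-window rate limiter; it is not Islo's actual schema or enforcement path.

```python
import fnmatch
import time
from collections import deque

# Hypothetical rule shape mirroring the profile syntax on this page.
RULE = {
    "host": "api.github.com",
    "path": "/v1/*",
    "action": "allow",
    "methods": {"GET", "POST"},
    "rate": 100,  # requests per minute
}

_window = deque()  # timestamps of recent requests matched by this rule


def rule_allows(rule, host, path, method, now=None):
    """Return True if the request matches an allow rule and is under the rate cap."""
    now = time.monotonic() if now is None else now
    if host != rule["host"] or method not in rule["methods"]:
        return False
    if not fnmatch.fnmatch(path, rule["path"]):
        return False
    # Sliding window: discard timestamps older than 60 seconds.
    while _window and now - _window[0] > 60:
        _window.popleft()
    if len(_window) >= rule["rate"]:
        return False
    _window.append(now)
    return rule["action"] == "allow"
```

A request to any host, path, or method outside the rule is rejected before the rate limiter is consulted; only matching requests consume rate-limit budget.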
host api.github.com
path /v1/*
action allow
methods GET, POST
rate 100 req/min

Content filters
Scan response bodies for leaked answers. If an agent fetches a solution from GitHub, the gateway blocks it.
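To illustrate the idea, here is a minimal Python sketch of response-body scanning; the pattern list and function names are assumptions for illustration, not Islo's filter engine.

```python
import re

# Hypothetical leak patterns; a real deployment would load task-specific ones.
LEAK_PATTERNS = [re.compile(r"def\s+solve\s*\(")]


def scan_body(body: str):
    """Return (blocked, reason) for a proxied response body."""
    for pattern in LEAK_PATTERNS:
        if pattern.search(body):
            return True, "content filter match: " + pattern.pattern
    return False, ""
```

A body containing a known solution signature is flagged, and the gateway returns a block instead of forwarding the response to the agent.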
→ GET github.com/org/repo/issues/418
← 200 body received "def solve(puzzle): …"
✗ BLOCKED — content filter match

Credential injection
Agents get access, never keys. Secrets are injected by the proxy and stay out of trajectories.
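The substitution could be sketched like this in Python; the rule fields mirror the config syntax shown on this page, but the secret store and rewrite logic here are assumptions, not Islo's implementation.

```python
# Secrets live only in the proxy process, never in the sandbox or trajectory.
SECRETS = {"OPENAI_KEY": "sk-example"}

RULE = {
    "match": "api.openai.com",
    "header": "Authorization",
    "value": "{{ secrets.OPENAI_KEY }}",
}


def render(template, secrets):
    # Tiny stand-in for the {{ secrets.NAME }} template syntax.
    for name, secret in secrets.items():
        template = template.replace("{{ secrets.%s }}" % name, secret)
    return template


def inject(host, headers):
    """Set the real header at the proxy; the agent's request never holds the key."""
    if host == RULE["match"]:
        headers = dict(headers)
        headers[RULE["header"]] = render(RULE["value"], SECRETS)
    return headers
```

Requests to non-matching hosts pass through untouched, so the secret is only ever attached where the rule says it belongs.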
match api.openai.com
header Authorization
value {{ secrets.OPENAI_KEY }}

| Feature | Islo | Others |
|---|---|---|
| Snapshot-based environments | ✓ | ✓ |
| Gateway profiles | ✓ | — |
| Content filters | ✓ | — |
| Credential injection proxy | ✓ | — |
| Cost limits per run | ✓ | — |
| Built-in trajectory storage | ✓ | — |
| Zero infra to maintain | ✓ | ✓ |
Pricing
Pay for actual CPU cycles, resident memory, and storage. Nothing else.
Who we are
We're the team behind Incredibuild — a decade of making parallel, compute-heavy workloads fast and reliable. Running thousands of sandboxes in parallel is core to what we do.
We've worked directly with the teams building evals for frontier labs. We understand exactly where existing tools fall short — and built Islo to close those gaps.
This is Incredibuild's strategic focus — not a side project. We're investing in the long-term infrastructure layer for AI evaluation, and we're here to support teams that depend on it.