
Sandbox0 Volumes: Turning S3 into Persistent Workspaces for AI Agents

Sandbox0 Team · May 14, 2026

AI agents do not just call tools. They create workspaces.

A coding agent checks out a repository, installs dependencies, edits files, runs tests, writes logs, and leaves artifacts behind. A data agent downloads inputs, produces intermediate files, exports charts, and hands results to another process. A long-running agent may need to pause, resume, fork into several attempts, or roll back after a bad tool call.

That is why persistent storage is part of the runtime boundary for AI agents.

The easy answer is to put the workspace on object storage. S3 and S3-compatible systems are already the default cloud storage substrate: durable, cheap, horizontally scalable, available across clouds, and easy to run in private deployments through compatible APIs.

Sandbox0 does use object storage as the durable layer for Volumes.

But the hard part is not choosing S3. The hard part is making S3 behave like an agent workspace.

S3 Is the Right Durable Layer

For Sandbox0, S3 fits the shape of the system.

A Sandbox is compute. It is scheduled, claimed, paused, deleted, and replaced. A Volume is durable state. It should survive the sandbox lifecycle and remain available when a new sandbox is claimed later.

Object storage is a strong fit for that storage plane:

it scales horizontally without binding data to one node
it is cheaper than keeping every workspace on block storage
it is durable enough to be the source of truth for long-lived workspace state
it works across AWS S3 and S3-compatible stores used in private deployments
it lets multiple data-plane clusters in one region share the same storage backend

That last point matters for Sandbox0's architecture. A region can have multiple data-plane clusters sharing the same PostgreSQL and S3-compatible storage. Compute capacity can grow by adding clusters, while Volume data remains region-scoped instead of node-scoped.

If the goal is durable, cloud-native storage, S3 is the right foundation.

But a foundation is not an interface.

Object Storage Is Not an Agent Filesystem

An AI agent workspace is full of filesystem behavior that object storage does not naturally provide.

Agents expect to work with paths, directories, file metadata, renames, chmod, symlinks, hard links, open file handles, and many small reads and writes. Tools inside the sandbox expect normal filesystem behavior because they were built for a local POSIX-like environment, not for bucket keys.

Object storage has a different model:

objects are addressed by keys, not inodes
directories are conventions over prefixes
metadata-heavy operations become remote control-plane work
small files can turn into many object operations
random updates and overwrite semantics are not the same as local files
exposing bucket credentials directly to a sandbox weakens the storage boundary
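
To make the "directories are conventions over prefixes" point concrete, here is a toy sketch in which a plain Python dict stands in for a bucket. The `rename_prefix` helper and the key names are hypothetical and exist only for illustration; real object stores differ in detail, but a directory rename generally still means one copy and one delete per object.

```python
# Toy model: a flat key/value "bucket" where directories exist only as key prefixes.
def rename_prefix(bucket: dict[str, bytes], old: str, new: str) -> int:
    """Rename a 'directory' by rewriting every key under the prefix.

    Returns the number of per-object operations (one copy + one delete per key).
    """
    ops = 0
    # Snapshot the matching keys first so the dict can be mutated safely.
    for key in [k for k in bucket if k.startswith(old)]:
        bucket[new + key[len(old):]] = bucket.pop(key)
        ops += 2
    return ops


bucket = {f"repo/src/file_{i}.py": b"x" for i in range(1000)}
bucket["repo/README.md"] = b"readme"

ops = rename_prefix(bucket, "repo/src/", "repo/lib/")
print(ops)  # 2000
```

On a local filesystem the same rename is a single metadata update, which is why tools written for POSIX-like environments issue it freely.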

That mismatch is especially visible in agent workloads.

The bottleneck is often not one large file. It is thousands of small files: source trees, package manager caches, test outputs, virtual environments, generated code, notebook checkpoints, and tool logs. Every extra round trip on that path can sit inside an agent step that a user is waiting on.

So the storage question is not "can we mount S3?"

The better question is: can we turn object storage into a persistent workspace primitive that behaves correctly and performs well enough for agent workloads?

The First Design: JuiceFS over S3

Sandbox0's first Volume implementation used JuiceFS.

That was a reasonable starting point. JuiceFS already combines object storage for data with a metadata engine. In the early Sandbox0 implementation, JuiceFS used PostgreSQL as the metadata store and S3-compatible object storage for file data. The initialization path formatted a shared filesystem, configured S3 bucket access, and mounted per-volume paths through JuiceFS.

This got Sandbox0 to a working persistent filesystem faster than building everything from scratch.

It also exposed the problems that matter for a sandbox product.

The Volume system needed to be deeply integrated with Sandbox0's own runtime boundaries:

Volumes had to be independent API objects, not just subdirectories in a mounted filesystem.
Mount ownership had to match Sandbox0 access modes such as RWO and ROX.
Sandbox containers should not receive object storage or database credentials.
Direct file APIs, mounted filesystem APIs, snapshots, restore, and forks had to stay consistent.
Agent workloads needed better small-file behavior than a remote path through several service boundaries.
Dynamic mount behavior had to be replaced with template-declared mount points that could be prepared before claim time.

The conclusion was not that JuiceFS is a bad filesystem. It was that Sandbox0 needed a storage layer shaped around agent runtime semantics.

That became S0FS.

S0FS: A Filesystem Layer Built for Sandbox0 Volumes

S0FS keeps S3-compatible object storage as the durable backing layer, but it does not expose object storage directly to the sandbox.

At a high level, S0FS stores a Volume as:

manifests that describe filesystem state
immutable segments that hold materialized file data
node-local write-ahead logs for hot mounted writes
PostgreSQL committed heads that point to the current manifest

The mounted sandbox path goes through the normal Linux filesystem path:

| Layer | Responsibility |
| --- | --- |
| Sandbox process | Reads and writes normal paths such as `/workspace/data` |
| Kernel VFS and FUSE | Provide the filesystem interface |
| `ctld` node-local portal | Owns mounted volume execution and local WAL |
| S0FS | Maintains filesystem metadata, file data, snapshots, and forks |
| PostgreSQL | Stores committed manifest heads with compare-and-swap semantics |
| S3-compatible object storage | Stores durable manifest and segment objects |

This split is important.

The agent sees a filesystem. The platform gets durable object storage. Sandbox containers do not need S3 or PostgreSQL credentials. The storage layer can decide when to use node-local state, when to materialize immutable objects, and when to resolve cold data from object storage.
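
The relationship between immutable objects and a compare-and-swap head can be sketched in a few lines. This is a minimal illustration, not the S0FS implementation: `HeadStore`, `put_object`, and `commit` are hypothetical names, a dict stands in for object storage, a lock-protected dict stands in for the PostgreSQL head row, and content-derived keys with one segment per file are simplifying assumptions.

```python
import hashlib
import json
import threading
from typing import Optional


class HeadStore:
    """Stand-in for the committed-head table (hypothetical; the post uses PostgreSQL)."""

    def __init__(self) -> None:
        self._heads: dict[str, str] = {}
        self._lock = threading.Lock()

    def compare_and_swap(self, volume_id: str, expected: Optional[str], new: str) -> bool:
        # Advance the head only if nobody else committed since `expected` was read.
        with self._lock:
            if self._heads.get(volume_id) != expected:
                return False
            self._heads[volume_id] = new
            return True


def put_object(objects: dict[str, bytes], body: bytes) -> str:
    """Store an immutable object under a content-derived key (an assumption)."""
    key = hashlib.sha256(body).hexdigest()
    objects.setdefault(key, body)
    return key


def commit(objects, heads, volume_id, expected_head, files) -> Optional[str]:
    # 1. Write file data as immutable segments (one per file in this toy).
    manifest = {path: put_object(objects, data) for path, data in sorted(files.items())}
    # 2. Write the manifest that describes the filesystem state.
    manifest_key = put_object(objects, json.dumps(manifest).encode())
    # 3. Publish it by swapping the head; a stale writer gets None.
    if heads.compare_and_swap(volume_id, expected_head, manifest_key):
        return manifest_key
    return None


objects: dict[str, bytes] = {}
heads = HeadStore()
first = commit(objects, heads, "vol-1", None, {"a.txt": b"hello"})
stale = commit(objects, heads, "vol-1", None, {"a.txt": b"conflict"})
second = commit(objects, heads, "vol-1", first, {"a.txt": b"hello again"})
print(first is not None, stale is None, second is not None)  # True True True
```

The useful property is that objects written by a losing commit are harmless: nothing references them until a head points at their manifest.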

Why Node-Local Mounts Matter

The earliest S0FS mount path was correct, but slow for tiny files.

It ran through too many components on every mounted file operation. For a benchmark with 500 files of 256 bytes each, the initial mounted path reached only 47.5 write ops/s and 233.8 read ops/s. That was enough to show the core design working, but not enough for agent workspaces.

Replacing gRPC with a custom TCP binary protocol improved write and read throughput only slightly. That result was useful because it showed the bottleneck was not protobuf serialization.

The real fix was architectural: move the mounted RWO filesystem execution path to node-local `ctld` portals.

In the current design, sandbox mount points are declared in the `SandboxTemplate`. When a sandbox is claimed, the manager binds a concrete `SandboxVolume` to that predeclared portal. `ctld` opens the local S0FS engine and WAL on the node, and the sandbox reads and writes through:

kernel VFS -> FUSE -> ctld -> S0FS

`storage-proxy` remains the owner of Volume metadata, HTTP file APIs, snapshots, restore, and object-storage coordination. It is no longer the hot path for every mounted file operation.

That distinction matters for AI agents because the mounted workspace is exactly where package managers, test runners, compilers, language servers, and shell tools operate.

What Changed in Practice

The most useful historical benchmark came from the S0FS migration PR sequence. It measured the same small-file workload inside a sandbox pod: 500 files, 256 bytes per file, reported as the median of three rounds.

| Stage | Write ops/s | Read ops/s | list+stat ops/s |
| --- | --- | --- | --- |
| Initial mounted S0FS path | 47.5 | 233.8 | 8,244.0 |
| After node-local portals | 1,269.7 | 1,981.7 | 13,889.5 |
| After node-local read-path optimization | 5,954.0 | 10,506.1 | 55,524.5 |

That is roughly 125x higher write throughput, 45x higher read throughput, and 6.7x higher list/stat throughput from the initial mounted S0FS path to the optimized node-local path.
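
The factors quoted above follow directly from the first and last rows of the table. A short check, using only the published figures:

```python
# Throughput figures (ops/s) copied from the first and last rows of the table.
initial = {"write": 47.5, "read": 233.8, "list+stat": 8244.0}
optimized = {"write": 5954.0, "read": 10506.1, "list+stat": 55524.5}

speedup = {op: optimized[op] / initial[op] for op in initial}
for op, factor in speedup.items():
    print(f"{op}: {factor:.1f}x")
# write: 125.3x
# read: 44.9x
# list+stat: 6.7x
```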

These are historical engineering measurements, not a final product benchmark. They are still useful because they show the direction of the design: the small-file problem was not solved by pretending S3 is local storage. It was solved by moving the agent-facing hot path closer to the sandbox while keeping S3 as the durable layer.

Current Benchmark Methodology

We also keep a reproducible benchmark script in the Sandbox0 repository:

scripts/volume_mount_bench.py

The script creates a temporary `SandboxTemplate` with a declared mount point, creates a temporary RWO `SandboxVolume`, claims a sandbox with that Volume mounted, and runs the benchmark inside the claimed sandbox pod.

Both targets are measured in the same sandbox pod and Python process:

pod-local `/tmp`
mounted S0FS Volume at `/workspace/bench-volume`

That keeps CPU, node, container image, Kubernetes runtime, and Python runtime consistent. The only intentional difference is the filesystem path being measured.
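
For readers who want a feel for this kind of measurement, the following is a minimal sketch of a small-file benchmark. It is not the contents of `scripts/volume_mount_bench.py`: the function names, directory layout, and p95 calculation are assumptions, and it measures a plain per-file `stat` rather than a combined list+stat phase.

```python
import os
import statistics
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def timed_ms(fn, arg) -> float:
    """Run fn(arg) and return its latency in milliseconds."""
    start = time.perf_counter()
    fn(arg)
    return (time.perf_counter() - start) * 1000.0


def run_phase(fn, items, parallelism):
    """Run fn over items in a thread pool; return (ops/s, p95 latency in ms)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        latencies = sorted(pool.map(lambda item: timed_ms(fn, item), items))
    elapsed = time.perf_counter() - start
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    return len(items) / elapsed, p95


def bench(root, files=500, size=256, rounds=3, parallelism=32):
    """Return {phase: (median ops/s, median p95 ms)} across rounds."""
    payload = os.urandom(size)
    samples = {"write": [], "read": [], "stat": []}
    for r in range(rounds):
        base = os.path.join(root, f"round-{r}")
        paths = [os.path.join(base, f"d{i % 10}", f"f{i}.bin") for i in range(files)]

        def write(path):
            # The write phase includes parent directory creation.
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as fh:
                fh.write(payload)

        def read(path):
            with open(path, "rb") as fh:
                fh.read()

        samples["write"].append(run_phase(write, paths, parallelism))
        samples["read"].append(run_phase(read, paths, parallelism))
        samples["stat"].append(run_phase(os.stat, paths, parallelism))
    return {
        phase: (statistics.median(s[0] for s in runs),
                statistics.median(s[1] for s in runs))
        for phase, runs in samples.items()
    }


with tempfile.TemporaryDirectory() as tmp:
    results = bench(tmp, files=50, rounds=3, parallelism=8)
for phase, (ops, p95) in results.items():
    print(f"{phase}: {ops:,.1f} ops/s, p95 {p95:.2f} ms")
```

Calling `bench` once on a pod-local directory and once on the mounted path, from the same process, gives the same kind of like-for-like comparison described above.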

Fresh run on May 14, 2026:

remote kind cluster
Linux 6.8.0-111-generic on x86_64
Python 3.12.3 inside the sandbox pod
parallelism=32
write phase includes parent directory creation
results are median values across rounds

| Workload | Target | Write ops/s | Write p95 ms | Read ops/s | Read p95 ms | list+stat ops/s | list+stat p95 ms |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1,000 files x 4 KiB, 5 rounds | pod-local `/tmp` | 3,856.5 | 26.30 | 8,325.5 | 2.56 | 93,708.6 | 0.01 |
| 1,000 files x 4 KiB, 5 rounds | mounted S0FS Volume | 271.1 | 208.84 | 4,214.2 | 17.74 | 43,803.7 | 0.01 |
| 500 files x 256 B, 3 rounds | pod-local `/tmp` | 2,767.4 | 18.68 | 6,403.0 | 1.14 | 94,409.8 | 0.005 |
| 500 files x 256 B, 3 rounds | mounted S0FS Volume | 278.5 | 201.96 | 3,498.8 | 18.05 | 46,022.4 | 0.01 |

These numbers are intentionally not presented as a vendor benchmark. They are a regression-friendly local-vs-mounted comparison under the same pod runtime. The useful signal is where S0FS is already close enough and where it is not: mounted reads and list+stat run at roughly half of pod-local throughput in both workloads, while small-file writes reach only about 7% to 10% of pod-local throughput with a p95 near 200 ms. Write latency remains the next optimization target for small-file-heavy workloads.

Why This Is Better for AI Agents Than a Mounted Bucket

Many systems can put an object store behind a directory-like interface.

That is useful, but it is not enough for agent infrastructure.

A mounted bucket is still useful when the bucket is the source of truth. For that case, Sandbox0 now supports mounting an existing S3-compatible bucket prefix into an agent sandbox with the `s3` Volume backend.

A serious agent workspace needs more than object reads and writes:

persistent state independent of sandbox lifetime
normal filesystem behavior for tools inside the sandbox
small-file performance that does not turn every file into a remote object operation
snapshots before risky tool runs
restore after a bad migration or failed experiment
copy-on-write forks for parallel agent attempts
direct file APIs for orchestration and artifact collection
storage credentials kept out of sandbox containers
region-scoped storage that works across multiple data-plane clusters

Sandbox0 Volumes are built around those runtime needs.

The durable data still lands in S3-compatible object storage. That gives the system cheap, horizontally scalable persistence. But the agent does not operate on bucket keys. It operates on a workspace filesystem that Sandbox0 can snapshot, restore, fork, mount, authorize, meter, and encrypt.

That is the product boundary S0FS exists to provide.

Snapshots and Forks Are Workspace Operations

The most important storage operations for agents are not only read and write.

Agents fail. They take speculative paths. They make partial edits. They need to branch.

Snapshots and forks turn persistent storage into a workflow primitive:

snapshot a repository before a migration
restore after a bad tool call corrupts the workspace
fork a prepared base environment into several independent agent attempts
compare outputs from different strategies
keep a reproducible state for evaluation

S0FS implements copy-on-write forks by allowing forked manifests to reference source Volume segments until data diverges. That makes fork creation a metadata operation instead of a full copy of every file.
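
A toy model shows why a fork can be a metadata operation. In this sketch, which assumes a simple path-to-segment manifest and is not the S0FS data structure, the hypothetical `Volume` class shares one immutable segment store between a base and its fork; only the small manifest is copied, and new segments appear only when one side writes.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    # Shared store of immutable segments (stand-in for object storage).
    segments: dict[str, bytes]
    # Per-volume manifest: path -> segment id.
    manifest: dict[str, str] = field(default_factory=dict)

    def write(self, path: str, data: bytes) -> None:
        # Never mutate an existing segment; always add a new one.
        seg_id = f"seg-{len(self.segments)}"
        self.segments[seg_id] = data
        self.manifest[path] = seg_id

    def read(self, path: str) -> bytes:
        return self.segments[self.manifest[path]]

    def fork(self) -> "Volume":
        # Copy only the manifest; file data stays shared until it diverges.
        return Volume(self.segments, dict(self.manifest))


base = Volume(segments={})
base.write("app/main.py", b"v1")
base.write("app/util.py", b"u1")

attempt = base.fork()
attempt.write("app/main.py", b"v2")  # diverges in the fork only

print(base.read("app/main.py"), attempt.read("app/main.py"))  # b'v1' b'v2'
print(attempt.manifest["app/util.py"] == base.manifest["app/util.py"])  # True
```

The cost of `fork` here depends on the number of manifest entries, not on the number of bytes stored, which is what makes forking a prepared environment into several attempts cheap.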

For AI agent systems, that is a major difference from "upload files to object storage." The storage layer becomes part of the agent control loop.

Security Boundary

Sandbox0 does not hand object storage credentials to the sandbox.

`storage-proxy` and `ctld` hold the storage credentials and serve authorized filesystem operations. The sandbox sees a mounted path. That keeps bucket layout, database metadata, and object store credentials outside the sandbox process.

Sandbox0 also encrypts persisted S0FS volume data at the application layer before it is written to object storage. Manifest objects are encrypted as full authenticated blobs. Segment objects are encrypted in independently authenticated chunks so cold range reads can still fetch only the needed ciphertext chunks. Node-local S0FS cache files, including WAL records and snapshot state, are encrypted before being written to disk.

This is service-side application-layer encryption, not end-to-end encryption. Sandbox0 storage services can decrypt data in order to serve authorized file reads and writes. The important point is that persisted S0FS objects and node-local cache files are not stored as plaintext by default.
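
The reason independently authenticated chunks help cold range reads is mostly arithmetic. The sketch below assumes a fixed plaintext chunk size and a fixed per-chunk overhead; the 64 KiB chunk size and the 28-byte overhead (a 12-byte nonce plus a 16-byte tag, as in a typical AEAD construction) are illustrative assumptions, not documented S0FS parameters, and `chunks_for_range` is a hypothetical helper.

```python
def chunks_for_range(offset: int, length: int, chunk_size: int, overhead: int):
    """Map a plaintext byte range to the ciphertext chunks that cover it.

    Returns (first_chunk, last_chunk, cipher_start, cipher_end), where the
    ciphertext range is half-open. cipher_end is an upper bound: the final
    chunk of a segment may be shorter than a full chunk.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    stored = chunk_size + overhead  # bytes each full chunk occupies at rest
    return first, last, first * stored, (last + 1) * stored


# Read 100 plaintext bytes at offset 130,000 from a segment with 64 KiB chunks.
plan = chunks_for_range(offset=130_000, length=100, chunk_size=64 * 1024, overhead=28)
print(plan)  # (1, 1, 65564, 131128)
```

Under these assumptions the reader fetches and verifies one chunk instead of the whole segment, which would not be possible if the segment were encrypted as a single authenticated blob.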

The Design Principle

S3 is the right bottom layer for cloud-native persistent storage.

It is not the right abstraction for an AI agent.

An agent should not need to understand bucket keys, object prefixes, range reads, manifest pointers, materialization, committed heads, or storage credentials. It should see a workspace.

Sandbox0 Volumes are built around that principle:

S3-compatible object storage for durable, cheap, horizontally scalable storage
PostgreSQL for committed metadata heads and coordination
node-local S0FS WAL for mounted write-heavy workspaces
direct file APIs for control-plane workflows
snapshots, restore, and copy-on-write forks for agent recovery and branching
application-layer encryption for persisted S0FS objects and local cache state

The short version: object storage is where the data lives; S0FS is what makes it usable as an AI agent workspace.