Daytona AI Agent Compute and Sandboxes
AI Agents Require Composable Computers, Not Just Code Execution
AI agents need stateful, composable computers (in effect, production-grade sandboxes) rather than disposable, short-lived code-execution boxes. A simple isolate can run a snippet of code and return its output, but sophisticated agents need environments that behave like a person's laptop: they maintain state, pause and resume work, and run different operating systems (Linux, Windows, and macOS) so that agents can interact with legacy software.
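The difference between a disposable execution box and a stateful, resumable computer can be shown with a toy in-process model. This is only a sketch: `run_ephemeral` and `StatefulSession` are hypothetical names invented for illustration, a Python dictionary stands in for a whole machine's state, and nothing here reflects Daytona's actual API or isolation mechanism.

```python
import copy
from typing import Any, Dict


def run_ephemeral(snippet: str) -> Dict[str, Any]:
    """Run a snippet in a fresh namespace; nothing survives between calls."""
    namespace: Dict[str, Any] = {}
    exec(snippet, {}, namespace)
    return namespace


class StatefulSession:
    """Toy stand-in for a stateful sandbox: state persists across runs."""

    def __init__(self) -> None:
        self.namespace: Dict[str, Any] = {}

    def run(self, snippet: str) -> None:
        # Each run sees whatever earlier runs left behind.
        exec(snippet, {}, self.namespace)

    def snapshot(self) -> Dict[str, Any]:
        # "Close the lid": capture an independent copy of the current state.
        return copy.deepcopy(self.namespace)

    @classmethod
    def resume(cls, snap: Dict[str, Any]) -> "StatefulSession":
        # "Open the lid": start a new session from a saved snapshot.
        session = cls()
        session.namespace = copy.deepcopy(snap)
        return session


session = StatefulSession()
session.run("counter = 1")
saved = session.snapshot()

resumed = StatefulSession.resume(saved)
resumed.run("counter += 1")  # works: 'counter' was restored from the snapshot
```

Calling `run_ephemeral("counter += 1")` after `run_ephemeral("counter = 1")` raises `NameError`, because each call starts from an empty namespace. The resumed session continues where the snapshot left off.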
The Pivot to AI Sandboxes
In early 2025, Daytona pivoted from automating development environments for human engineers to providing sandboxes for AI agents. The shift was driven by a key market insight: the infrastructure requirements of agents differ fundamentally from those of humans.
During the development of an MVP, Daytona discovered that agent developers were desperate for a runtime that could handle high-concurrency, stateful workloads. This led to a rapid growth trajectory, with the company reporting 74% month-over-month growth and some customers running nearly 850,000 sandboxes per day.
High-Performance Infrastructure Architecture
To achieve the speed and statefulness required by AI agents, Daytona utilizes a specific architectural stack:
- Bare Metal Deployment: Running on bare metal rather than virtual machines (VMs) removes the network latency between compute and storage (for example, by avoiding network-attached volumes such as EBS), yielding significantly higher IOPS.
- Custom Scheduler: Daytona runs its own scheduler to manage resources, avoiding the overhead and complexity of Kubernetes, which the company found unsuitable for this workload.
- Rapid Startup Times: A single sandbox spins up in approximately 60 ms. At scale, Daytona can launch 50,000 sandboxes concurrently in about 75 seconds.
- Stateful Snapshots: Templates and snapshots are preloaded on the bare metal machines, so an agent can "close the lid" on a session and later return to exactly the same state instantly.
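As a rough sanity check on the startup figures above, the arithmetic can be sketched in a few lines. The 60 ms, 50,000, and 75 s values come from the text. Treating the 60 ms startup time as constant under load and applying Little's law are simplifying assumptions made here, not a description of how Daytona's scheduler works.

```python
# Figures quoted in the text.
SINGLE_STARTUP_S = 0.060     # ~60 ms to start one sandbox
BATCH_SIZE = 50_000          # sandboxes launched concurrently
BATCH_WALL_CLOCK_S = 75.0    # ~75 s for the whole batch


def launch_rate(batch_size: int, wall_clock_s: float) -> float:
    """Average number of sandboxes started per second over the batch."""
    return batch_size / wall_clock_s


def average_in_flight(rate_per_s: float, startup_s: float) -> float:
    """Little's law: average startups in progress = arrival rate * latency."""
    return rate_per_s * startup_s


def serial_time_s(batch_size: int, startup_s: float) -> float:
    """Time the batch would take if sandboxes were started one after another."""
    return batch_size * startup_s


rate = launch_rate(BATCH_SIZE, BATCH_WALL_CLOCK_S)
in_flight = average_in_flight(rate, SINGLE_STARTUP_S)
serial = serial_time_s(BATCH_SIZE, SINGLE_STARTUP_S)
```

Under these assumptions the batch works out to roughly 667 sandbox starts per second, with about 40 startups in progress at any moment. Starting the same 50,000 sandboxes strictly one at a time would take about 3,000 seconds, so the quoted 75 seconds corresponds to roughly a 40x speedup from parallelism.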
Handling Spiky RL and Eval Workloads
A significant portion of Daytona's usage, roughly 50%, has shifted toward reinforcement learning (RL) and evaluation (eval) workloads. These workloads pose a distinct infrastructure challenge: extreme "spikiness."
Background agents show a "follow the sun" usage pattern, peaking at noon and dipping at midnight. RL and eval workloads, by contrast, are unpredictable and binary: a researcher may request 100,000 CPUs all at once, then drop back to zero. Mean utilization is therefore low (around 15%), yet the fleet must still have the capacity to absorb peaks of 90%.
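The gap between mean and peak utilization can be made concrete with two synthetic demand traces. Both traces below are invented for illustration (they are not Daytona data), and `utilization_stats` is a hypothetical helper. The bursty trace is constructed so that it reproduces the 15% mean and 90% peak quoted above.

```python
import math
from typing import List, Tuple

CAPACITY = 100_000  # CPUs in the fleet; illustrative value


def utilization_stats(demand: List[float], capacity: float) -> Tuple[float, float, float]:
    """Return (mean utilization, peak utilization, peak-to-mean ratio)."""
    utils = [min(d, capacity) / capacity for d in demand]
    mean = sum(utils) / len(utils)
    peak = max(utils)
    return mean, peak, peak / mean


# "Follow the sun": 24 hourly samples, trough at midnight, peak at noon.
follow_the_sun = [
    CAPACITY * (0.5 - 0.4 * math.cos(2 * math.pi * hour / 24))
    for hour in range(24)
]

# Bursty RL/eval: four hours at 90% of the fleet, zero demand otherwise.
bursty = [0.0] * 24
for hour in (3, 4, 15, 16):
    bursty[hour] = 0.9 * CAPACITY

sun_mean, sun_peak, sun_ratio = utilization_stats(follow_the_sun, CAPACITY)
burst_mean, burst_peak, burst_ratio = utilization_stats(bursty, CAPACITY)
```

Both traces peak at about 90% of capacity, but the smooth trace averages about 50% utilization (a peak-to-mean ratio of about 1.8) while the bursty trace averages 15% (a ratio of about 6). The same hardware is needed in both cases, and the bursty fleet sits idle far more of the time.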
Daytona's ability to resize sandboxes dynamically, on the fly, helps prevent out-of-memory (OOM) errors, a common failure point in the managed Kubernetes environments (EKS/GKE) used by competitors.
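The contrast between a fixed memory limit and on-the-fly resizing can be sketched as a toy allocation policy. Everything below is hypothetical: the `Sandbox` class, the two functions, and the 25% headroom factor are illustrative assumptions and do not describe Daytona's implementation or any Kubernetes API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Sandbox:
    memory_limit_mib: int
    memory_used_mib: int = 0


def fixed_limit_allocate(sb: Sandbox, request_mib: int) -> bool:
    """Fixed limit: a request past the limit fails (stands in for an OOM kill)."""
    if sb.memory_used_mib + request_mib > sb.memory_limit_mib:
        return False
    sb.memory_used_mib += request_mib
    return True


def resizing_allocate(
    sb: Sandbox, request_mib: int, host_free_mib: int, headroom: float = 1.25
) -> Tuple[bool, int]:
    """Grow the limit when needed, as long as the host has spare memory.

    Returns (success, remaining free memory on the host).
    """
    needed = sb.memory_used_mib + request_mib
    if needed > sb.memory_limit_mib:
        new_limit = int(needed * headroom)  # assumed headroom policy
        growth = new_limit - sb.memory_limit_mib
        if growth > host_free_mib:
            return False, host_free_mib  # host is exhausted; cannot resize
        sb.memory_limit_mib = new_limit
        host_free_mib -= growth
    sb.memory_used_mib = needed
    return True, host_free_mib


fixed = Sandbox(memory_limit_mib=1024, memory_used_mib=900)
fixed_ok = fixed_limit_allocate(fixed, 300)

elastic = Sandbox(memory_limit_mib=1024, memory_used_mib=900)
elastic_ok, host_free = resizing_allocate(elastic, 300, host_free_mib=4096)
```

In this sketch the fixed-limit sandbox rejects the 300 MiB request, while the resizable one raises its limit to 1,500 MiB and succeeds, leaving 3,620 MiB free on the host. Resizing only defers the failure to the point where the host itself runs out of memory.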
The Necessity of Windows and macOS for Knowledge Work
While Linux is the standard for most AI agents, a vast amount of global knowledge work is locked into legacy applications running on Windows and macOS.
- The RPA Opportunity: Much of the world's high-value work in healthcare, government, and finance occurs in apps that lack APIs. For agents to automate this work, they must be able to operate the computer as a human would (Computer Use).
- Windows Sandboxes: Daytona has developed Windows sandboxes that, while slower than Linux (taking seconds rather than milliseconds), provide the necessary environment for legacy app automation.
- macOS Challenges: Providing macOS sandboxes is complicated by Apple's licensing constraints, which limit the number of parallel VMs per machine and restrict memory snapshots to the same physical hardware. As a result, workloads cannot easily be moved across a cluster to balance load.
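The effect of the two macOS constraints on placement can be sketched with a toy scheduler. The `Host` class and `place_resume` function are hypothetical, and the per-host cap of two VMs is an illustrative parameter value rather than a figure taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Host:
    name: str
    max_vms: int      # cap on parallel VMs per machine
    running: int = 0


def place_resume(snapshot_host: str, hosts: Dict[str, Host], pinned: bool) -> Optional[str]:
    """Pick a host on which to resume a snapshot, or None if no host can take it.

    pinned=True models a snapshot that can only resume on the hardware that
    created it; pinned=False models a snapshot that can move across the cluster.
    """
    if pinned:
        candidates = [hosts[snapshot_host]]
    else:
        # Least-loaded host first.
        candidates = sorted(hosts.values(), key=lambda h: h.running)
    for host in candidates:
        if host.running < host.max_vms:
            host.running += 1
            return host.name
    return None


cluster = {
    "mac-a": Host("mac-a", max_vms=2, running=2),  # already at its cap
    "mac-b": Host("mac-b", max_vms=2, running=0),  # idle
}

pinned_result = place_resume("mac-a", cluster, pinned=True)
movable_result = place_resume("mac-a", cluster, pinned=False)
```

Here the pinned snapshot cannot resume at all, because its original host is full even though another machine is idle. The movable snapshot lands on `mac-b`. That stranded capacity is the load-management problem described above.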
The Future of the AI Cloud
Daytona co-founder and CEO Ivan Burazin posits that the future of AI compute will look less like a traditional cloud provider such as AWS and more like a consumption-based API such as Stripe.
Instead of managing complex infrastructure, developers will work with a set of primitives (sandboxes, web search, and agent-specific databases) through a seamless API. As agents become the primary users of compute, the bottleneck will shift from GPUs to CPUs and networking, because the sheer volume of concurrent agent tasks creates unprecedented demand for general-purpose compute.