Unlocking agentic AI: Secure Sandboxes are officially GA
Speakers: Mike Nomitch, Kate Reznykova
Originally streamed: Today, 12:00 PM - 12:30 PM [EDT]
Join Mike Nomitch, Product Manager on the Containers team, and Kate Reznykova, Engineering Manager on the Agents and Sandboxes team, as they explain why AI agents need full computer environments to run complex code.
Tune in to learn about these three major updates:
- State Preservation: Save full disk and memory states with R2 snapshots, allowing agents to instantly resume tasks without reinstalling packages.
- Secure Egress: Programmatic proxies and outbound workers dynamically inject credentials, ensuring secrets are never exposed to the agent.
- Limits & Pricing: Concurrently run up to 15,000 lite instances and only pay for active CPU cycles.
Visit the Agents Week Hub for every announcement and CFTV episode — check back all week for more!
Transcript (Beta)
Hey, Kate, what's up? Good to see you. Hey, Mike. All good. How are you? Hi. Really nice to chat to you.
Yeah. I'm excited to talk to everybody right now. So we are launching Sandboxes in GA this week.
We've got a blog post out about it, which you should go read right now, and we're going to talk about it.
And then also we want to talk about agentic auth and sandboxes, where we think we have some really cool patterns that solve a lot of problems for how to, you know, stick something like Claude Code or OpenCode in a box and do really great auth to keep everything secure.
So Kate and I wanted to talk to each other and talk to you all about that today and give you an overview of what we've been doing.
We should introduce ourselves, though.
Kate, you want to go first? Yeah. Totally. It's a good idea.
So hey, everyone. I'm an engineering manager on the agents and sandboxes team, and I lead the team that is implementing the couple of things that we're going to be discussing today.
Mike, what about you? Yeah. Product manager at Cloudflare for about two years.
So I work on the containers team, which our sandbox SDK is built on top of, as well as some other teams like the agents team here at Cloudflare.
So excited to be chatting. So let's start with GA really quick. So that's kind of the big announcement right now and the thing that's prompting a lot of this.
Containers and sandboxes are now going GA, which means that, you know, we're, we're fully supporting them.
If you're a big customer and you want to say, Hey, let's bet on this.
We're saying that, you know, Cloudflare is, is ready to back that.
And there have been a lot of changes; we got them out in beta last June, and there's been a ton of changes since then.
So we wanted to take this opportunity to go over some of the changes.
I'd say some of the biggest ones are just to call out first are our limits.
So right now you can have, if you're just a standard user swiping your credit card for the first time, up to 15,000 of our smallest container size, a small sandbox, running in prod.
Larger sandboxes, thousands of those.
And if you want even more, talk to us; we've got people running, you know, tens of thousands of concurrent sandboxes at scale.
So that's something we're really excited about is, is that's been a really big change.
And then just to call out another pricing change that came out recently: we now charge for active CPU.
So those have been two really big changes, but there have also been a ton of usability changes to sandboxes and containers that we'll go over soon.
So Kate, I mean, what should we talk about first? Well, I think we should start with some core premises.
Essentially a lot of people right now, including us, are saying that, Hey, AI agents need something more than the CLI.
They need a real computer to do many tasks and, you know, spawn sub-agents, write code, et cetera.
I'm really curious: what do you think were the biggest pain points you saw companies hitting when they tried to stitch together agent environments using traditional VMs or existing container solutions?
Yeah. So I feel like that comes down to like, what, what is a sandbox, right?
And that's something that there are a bunch of actual answers to.
The core idea is that there are trusted and untrusted areas.
And there's the untrusted area, where you usually stick some sort of agentic harness, an LLM-driven agent, within that untrusted area.
This could also be used for just like generic user code.
That's not necessarily even agent driven, but it's obviously, you know, becoming a big thing because of agents.
So you have that kind of untrusted area.
And then you have some like, you know, isolation software level or hardware level around that.
Right. And then you have this kind of like trusted level of control.
And so you as the platform running agents or whatnot can say, Hey, I'm going to spin up a sandbox.
I'm going to, you know, route traffic to it in this way.
I'm going to control the traffic that comes out of it in certain ways.
I'm going to control the file system in certain ways. So you have this like layer of trust that's controlled around the sandbox.
And then you have whatever's happening in the sandbox, which, you know, hopefully it's not nefarious or anything bad, but you kind of have to assume that something bad could happen within that sandbox.
And when we were talking to various companies, basically, like the first thing people started with were all these container solutions, right, that that were kind of like off the shelf.
Hey, I'm going to like use my Kubernetes cluster and I'm going to run a bunch of containers and run agents within them.
And this had a bunch of issues, like some of which were ergonomic issues of just like, how do I spin things up quickly?
Because a lot of these, like I expect there to be something on demand that has a file system in a certain state.
That was an issue. Another issue is routing: a lot of these solutions were built for these kind of horizontally scalable apps, all of which were interchangeable.
And that's just not the case anymore. Like the kind of new world is a bunch of individualized stateful instances and you have to route to a specific one.
So like those were two big ergonomics issues. And then there is this issue of just like, hey, how do I do the security layer?
So like if you have a bunch of containers that are sharing the same kernel and on the same machine, that opens you up to a bunch of potential attacks, right?
Like you could have, hey, I compromise this machine with my agent and it does a Spectre attack on, you know, a container that's running right next to me.
And like, that's really scary. And so there's this security layer around sandboxes.
That's really hard for people to solve unless they've spent a lot of time thinking about that.
So there's the kind of ergonomics layers.
And then there's the security layer of, like, hey, how do I actually isolate these things from one another in a way that doesn't, you know, involve a PhD and a team of 10 SREs figuring this out.
You know, Mike, we hear the word sandbox quite often.
And I was wondering if it would be a good idea to highlight and sort of provide people an answer of when to use sandbox as a like full-blown computer or if there's any other ways to use similar primitives for a bit more lightweight tasks.
What do you think about it? Yeah. So we actually have two things that are kind of in the sandbox realm at Cloudflare.
And one of which is a full kind of micro VM that's kind of like, hey, if you have an agent that is embodying a developer for a little bit, right?
Like, hey, you're doing code review, you're checking out multiple Git repos and different languages and having to do X, Y, and Z and run CLI tools and whatnot.
We say, hey, you should use something that looks like a full computer, right?
Linux box, right?
That has that layer of isolation and control that I talked about. But then there's, you know, so there's the kind of agent as developer primitive or pattern.
And we say, hey, go use a full Linux machine. Then there's things like, hey, imagine I'm running ChatGPT or Claude and it runs a little snippet of Python for me, right?
Or a little snippet of JavaScript here and there; you don't need a full computer for that.
Like that's still an agent that you want to sandbox and do something in a secure way.
You don't want to spin up an entire container, an entire VM, for something that's potentially that lightweight.
You don't want to pay for it. You don't want to wait for it just to like run a little snippet of JavaScript.
Or like taking it one step further, like you want to kind of host an application that has been vibe coded long-term.
You don't want to necessarily run like an entire sandbox container for each of these applications.
You still want this layer of control.
You still want this layer of trust, this trusted boundary. But what we have is something called dynamic isolates, which we're really excited about for these use cases, where if it's just, hey, I want to run a little bit of JavaScript or a little bit of Python, or something that compiles to WebAssembly, in a way that's faster, cheaper, and easier to distribute globally.
We say, hey, let's, let's use a dynamic isolate for that. That still has the same kind of pattern of there's this layer of trusted control around the sandbox, but then there's this kind of untrusted center that can execute code.
So we kind of have both of these patterns. There's kind of Sandboxes proper, which is, hey, if you want to embody a full developer, go use that.
But then if you want something that's like lightweight and just runs a little bit of code, we also have dynamic workers and dynamic isolates for that.
So I think we're going to talk mostly about the first one, right, which is like trying to use a full computer.
That's what's going GA today. But definitely check out some of the other Cloudflare blogs that have come out as part of Agents Week.
If you want to, if you want something like, again, super lightweight that you can run some code in.
So, Kate, I'm, I'm going to flip it back at you now.
So Kate, you've been running the kind of sandbox SDK team.
And a lot of what you do is like a kind of ergonomics around like making these things really easy to use at scale.
When you're trying to like embody a full developer, you're trying to give an agent a full computer.
Can you talk about some of the problems that people have faced and like how we're solving them, how we're making it like easy to do X, Y, and Z?
Yeah, absolutely. I think we sort of want to replicate how a real developer would work with their computer and how their development flow would look like.
So essentially, you want this secure, isolated environment, which is, by the way, technically powered by our Cloudflare containers.
This environment would be able to spin up on demand. It can sleep when it's idle.
It can preserve its full disk and state using the R2 snapshot that we recently released.
So whenever the developer wants to come back to a task that they, let's say, postponed while they were getting lunch, they immediately get their environment back and it's ready to go.
Obviously, there are a couple of very useful tools that developers rely on, like the terminal.
You always use a bunch of CLIs, so you do want your computer to be able to run a bunch of terminals, have a bunch of tabs, and be able to open them.
As a developer, we usually sort of rebuild projects, rebuild UIs, frontends, and we want to have feedback immediately and see the changes immediately.
For this, Sandbox comes with a file system watching, which lets you immediately reload and see the changes that you've just applied through your AI agent or however you want.
The biggest thing also comes with the secure egress networking, because everything you do, you want to do with security in mind.
You don't want to let your agent go and just go completely wild and blast things and disclose your data, disclose your keys.
You want this sort of out-of-the-box solution where you can inject your secrets and sleep calmly during the night and not be worried about losing them or someone stealing them.
So I think that's what we're trying to solve: all these basic things that every developer works with and battles with on a daily basis come out of the box with Cloudflare Sandboxes, and we try to solve most of the pain points that people experience.
Yeah. I was going to say, there's a pattern I've noticed for either CI jobs or vibe coding platforms or things that are adjacent to vibe coding platforms, where the user has some Git repo and they want to do development on it, agentic development in some way.
And then they may or may not start a session and then they may or may not come back to it later.
And can you just walk through what that looks like on sandboxes?
From user logs in and says, point to this Git repo to code is shipped, what would that look like in terms of how you code that?
Yeah, for sure. And there's a bunch of flows you can use. Obviously, a lot of people right now are using tools like Cloud or OpenCode or other coding agents.
You absolutely can run your coding agents inside the sandbox and call the coding agent and make your coding agent start the project for you.
You can also use just an old school, traditional shell and say, hey, Git clone and clone something inside the storage of a sandbox.
And I think from there, as you work with any project, anything that comes with, let's say, npm, the first thing would be to do npm install.
So you've got a bunch of node modules, a bunch of baggage that basically comes along the moment you start working on your project.
And you don't want to lose the progress.
Let's say I'm working on something and I want to freeze the state where I am right now and come back when I have a bit more time, or when I've come up with an idea of how to architect this better.
I want to immediately go back to the exact point where I stopped, without waiting for my machine to prepare everything, reinstall, and re-expose things.
Like this is a problem. And humans, they're not very patient creatures.
They want things like instantly, immediately. And I think we want to account for that.
So what we've done with the sandboxes is we tried to just minimize this time from you abandoned your coding agent, you abandoned your sandbox to it's immediately available to you within seconds.
That's where the backups from R2 come in.
What happens with our backups is that the whole state, the whole disk, the whole memory of your sandbox can be uploaded to our R2 storage.
That includes your installed node modules. So all the changes are preserved in R2 exactly how you left them.
And instead of waiting about 30 seconds when you come back to reinstantiate your sandbox, install everything again, and fully prep your dev environment, you can get it all back immediately, in the same state, within two seconds.
I think this is what people want.
They want this like, hey, we're switching context. We're going from this task to this task.
And you don't have to wait seconds or minutes in between; it depends, of course, on the complexity of the project.
If the project is a gigantic monolith application, you know, just the installation and setup of the basic work environment can take quite a few minutes.
We want to minimize that. We want to bring this like magical experience, start where you left off.
Highly recommend for everyone who is interested to just speed up your sandboxes and get to much faster cold starts.
Take a look at our backup API. We've really tried to keep a lot of DX wishes and feedback from users while we're designing this API.
And yeah, I think it works pretty magically to me.
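The flow Kate describes, clone and install once, snapshot to R2, then resume instantly, can be sketched roughly like this. Note that the `exec`/`snapshot`/`restore` shapes below are illustrative assumptions for this sketch, not the actual Sandbox SDK API:

```typescript
// Pure helper: the bootstrap commands for a fresh sandbox.
function bootstrapCommands(repoUrl: string, dir = "app"): string[] {
  return [
    `git clone ${repoUrl} ${dir}`, // fetch the project
    `cd ${dir} && npm install`,    // the expensive step we want to snapshot past
  ];
}

// Hypothetical driver: bootstrap once and snapshot, or restore a prior snapshot.
// The sandbox parameter's methods are assumed shapes, not the SDK's real ones.
async function startOrResume(
  sandbox: {
    exec(cmd: string): Promise<{ exitCode: number }>;
    snapshot(): Promise<string>;        // returns a snapshot id (assumed)
    restore(id: string): Promise<void>; // restores full disk + memory (assumed)
  },
  repoUrl: string,
  previousSnapshot?: string,
): Promise<string> {
  if (previousSnapshot) {
    await sandbox.restore(previousSnapshot); // back in seconds, node_modules intact
    return previousSnapshot;
  }
  for (const cmd of bootstrapCommands(repoUrl)) {
    const { exitCode } = await sandbox.exec(cmd);
    if (exitCode !== 0) throw new Error(`bootstrap failed: ${cmd}`);
  }
  return sandbox.snapshot(); // persist the prepared state so resumes skip npm install
}
```

The point of the split is that the slow path runs once; every later session pays only the restore cost.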
It's a hard problem. Like, especially if, again, if you're like taking an existing container solution, like if you're just like used to running web apps, like how often is it that you need to like very quickly sync state somewhere?
Like a lot of these sandboxing tools, including ours, right?
That's one of the core things you have to get good at: hey, we have some state that's not being used right now.
We need to take those files, that disk, that npm install, and shove it into a sandbox super fast, such that your end user doesn't care.
Right.
And you don't have to take that hit again of doing the npm install.
So that's been a super common pattern I've seen: spin up an individualized sandbox, git clone, npm install, take a snapshot, and then come back to it later if you want.
Cool. Yeah. What else should we talk about?
I think we should touch upon the, well, I said there are two different flows.
One is you get access to things through your coding agent, whatever you run in the sandbox.
And the other one is like, you can run the CLI yourself. And I think one thing we give users out of the box is the PTY.
It's something we've released, I think in February this year, and it's basically a pseudo terminal session in a sandbox that gets proxied over the WebSocket connection.
It's also fully compatible with xterm.js and a bunch of other add-ons.
It's not limited to just xterm.js.
You can extend it as you wish, basically. The idea is that the agent needs shell access, and it shouldn't just be run command, wait, run command, wait, then check the transcript.
It should be natural; it should be exactly how a developer would use it.
Humans run something, they check the output, then they may be interrupted, then they may completely forget about it.
They go away, but they want to come back and then continue using their CLI, right?
So we kind of wanted to give the same feedback loop. And essentially in the PTY, each terminal session gets its own like fully isolated shell.
So it has its own working directory, its own environment, et cetera. You can open as many as you need, just like you would do on your personal computer during development.
And yeah, you can reconnect and see the replay of everything you missed.
So essentially this is a full blown working environment as you would have on your laptop.
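The PTY wiring Kate describes can be sketched as a small bridge between a WebSocket and any xterm.js-style terminal. The URL scheme and message format here are assumptions for illustration, not the SDK's actual protocol:

```typescript
// Build the WebSocket URL for a sandbox PTY session (hypothetical path scheme).
function ptyUrl(base: string, sandboxId: string, session: string): string {
  return `${base}/sandboxes/${encodeURIComponent(sandboxId)}/pty/${encodeURIComponent(session)}`;
}

// Minimal interfaces matching what a WebSocket and an xterm.js-style terminal expose.
interface PtySocket {
  send(data: string): void;
  onmessage: ((ev: { data: string }) => void) | null;
}
interface TerminalLike {
  write(data: string): void;
  onData(cb: (data: string) => void): void;
}

// Bridge PTY output to the renderer and keystrokes back to the PTY,
// mirroring the interactive loop a developer gets in a local terminal.
function bridgePty(ws: PtySocket, term: TerminalLike): void {
  ws.onmessage = (ev) => term.write(ev.data); // PTY output -> screen
  term.onData((data) => ws.send(data));       // keystrokes -> PTY
}
```

In a browser this would be roughly `bridgePty(new WebSocket(ptyUrl(...)), new Terminal())`, with reconnects replaying the output you missed, as Kate mentions.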
Yeah. I love that, especially for debugging. We're running agents internally, and the idea that I can open up a webpage and, with just a tiny bit of code sprinkled in, basically have shell access behind the auth we've set up, right in the UI we provide all of our developers.
It's super powerful to have that.
And then of course you could give that to an agent too, but it's useful for both, you know, agents and humans to be able to just like shell into something and you don't have to do some like, you know, horrible setup to, to make it all work auth-wise.
It just like kind of built into your flow. I kind of want to talk about the egress topic a little bit, cause I think there's some exciting stuff there.
Absolutely.
Mike, you know, many times when I was young and junior, I exposed secrets, as all of us did.
You know, I don't think there is a perfect developer who's never done that.
And I think we, we want to make sure and help people to avoid these cases going forward and, and, you know, build a solution that helps to prevent those things.
I know you guys worked on egress for sandboxes that makes this possible and makes it a really nice experience.
Do you want to walk us through how this feature is designed and how it works and what kind of experience the end user would get?
Yeah. Yeah. So, so like, let me frame the problem, I guess. Like say the problem is you have some internal data that you want to talk to.
So like, say this is, I don't know, it could be a CRM, it could be your internal GitHub or your GitLab instance or whatnot that you're running.
So internal service you want to talk to, and you want your agents to be able to talk to that.
And there are a couple ways that people kind of do this off.
Like the naive way is: mint a long-lived token, stick it in the box, and then every agent has access to it and you don't know how they're going to exfiltrate it.
So like, that's really bad for like hopefully obvious reasons, right? They, you know, the token's out there forever.
Other times you'll do short-lived tokens, and this helps the problem a little bit, but the token can still be leaked, and someone could still get your data.
Like you're still giving a token to like a potentially untrusted source at that point, which is also bad.
Often people are doing like OIDC tokens for this.
And this is like, hey, we've encoded the identity of the agent in a token, and we can do some exchange out of band to say, hey, I present my OIDC token, it checks against some policy, and then that is exchanged for a short-lived token that has access to X, Y, and Z.
And like, that's pretty good.
A lot of people are doing OIDC exchange, but it's kind of hard.
Like you need to set up this like whole token exchange system and you sometimes need to rely on your third parties to have this set up and have good RBAC and whatnot.
So we thought like, hey, we want a system where it's really easily programmable.
Like I can set any policy I want around my sandbox. So I can say, hey, even though my third-party provider doesn't even have those token capabilities, I want to lock down to these specific API endpoints.
I never want to allow a body that has this content within it, and that can be checked.
And it doesn't matter if the third-party provider supports X, Y, or Z; I should be able to program, as the sandbox owner, what it can or cannot do.
So we want it to be super programmable, but we want it to be fast where you're not having to like go to some centralized source for auth details and like check, like it should be something that's just like ideally right next to the sandbox.
And we want it to be kind of identity aware.
So like different sandboxes could have different policy.
So I could say like, this sandbox has access to my, my internal service foo.
This service doesn't have access. And like, maybe you could even turn the access on and off as it goes.
So there are kind of all these like things we wanted to do.
And of course, like no secret ever exposed to the sandbox. Basically it should only be injected on the way out as the traffic leaves the sandbox.
And so what we've done is we've used something called outbound workers to achieve this, where essentially you can opt in on your sandbox or your container to say, hey, at this hostname, like if I am hitting, you know, my-version-control-system.com, let's say, we are going to intercept that traffic.
We're going to route it wherever you want.
And then we are going to inject credentials if you want.
And we could even like modify anything about the request. So the kind of the most common thing is, Hey, request goes to this destination.
It checks your identity against some source.
It also checks some policy and then it injects that credential on the way out.
And this is, I think, way more powerful than a lot of what other sandbox providers are doing.
Like the kind of basics is like, a lot of people are like allow list, deny list, right?
Where it's less like you're allowed to hit NPM and GitHub and nothing else.
And like, that's okay. But like, you still have this auth creds problem.
And so we're really excited about, you have this allow list, deny list ability, but you also have the ability to inject creds on the way out.
Again, the sandbox never sees it. It's just part of the request.
And then also it's totally programmable. So you could turn these policies on and off on the fly and you can do things like, Hey, only allow these endpoints or only allow, you know, post request or only allow get requests, don't allow post requests.
Like, it's just like super trivial to, to program these things, which is, is really nice.
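As a rough illustration of that programmability, here's what an egress check might look like as a pure policy function in the trusted layer. The types and names below are made up for this sketch; the real outbound worker API differs:

```typescript
// Illustrative policy shape: allow-list of hosts and methods, plus per-host
// secrets that are injected outside the sandbox (the agent never sees them).
interface EgressPolicy {
  allowedHosts: string[];
  allowedMethods: string[];
  secrets: Record<string, string>; // host -> bearer token
}

interface OutboundRequest {
  host: string;
  method: string;
  headers: Record<string, string>;
}

// Decide whether a request may leave the sandbox and, if allowed, attach
// the credential on the way out, in the trusted layer.
function applyEgressPolicy(
  req: OutboundRequest,
  policy: EgressPolicy,
): { allowed: boolean; request?: OutboundRequest } {
  if (!policy.allowedHosts.includes(req.host)) return { allowed: false };
  if (!policy.allowedMethods.includes(req.method)) return { allowed: false };
  const token = policy.secrets[req.host];
  const headers = token
    ? { ...req.headers, Authorization: `Bearer ${token}` }
    : { ...req.headers };
  return { allowed: true, request: { ...req, headers } };
}
```

Because the policy is just code, things like "only GET requests" or "turn this host off mid-session" are one-line changes, which is the programmability Mike is describing.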
It sounds super interesting. And it seems like users can do quite a lot of stuff with it and customize it and sort of, you know, do a bunch of things.
Where do you think people can learn more about how to do sandbox auth?
Can you share more? Yeah. So there's a blog post that just went out. So if you check our blog, I think it's called something like Dynamic, Identity-Aware, Secure Sandbox Auth.
So check that out. Again, it's super powerful. It can be as simple as allow list, deny list if you want, or you just pass a function into your sandbox that says, Hey, route all traffic through this.
And then the world is your oyster as far as what you control.
Yeah. So, so we're really excited about that.
And we think that that's kind of like where sandboxes are going in the future.
Like everybody's going to have this programmatic proxy that sits right there.
That's trusted, and it makes sure that nothing's ever exposed to the agent. Anything else?
I think there's maybe one more thing we should mention and highlight.
We are talking again about the natural development cycle. Manually or with a coding agent, you clone a project, you make modifications, you make some changes.
We already said that you can like hot reload in real time and see the changes.
But what if like, I've done some cool stuff and I want to share it with Mike, like how I'm going to do it.
You know, I'm not going to send him a link saying localhost.
That would be wildly inappropriate. But what we can do, and how the sandbox SDK solves this out of the box, is we have something called preview URLs, which is something deployed on the Cloudflare infrastructure.
And you can immediately publish the output of your coding agent and show it to other people by like letting them use that preview URL.
I think it's super important because now, with agents, sub-agents, and a bunch of coding systems in our work cycle, we want to share those results with our team.
You know, the Cloudflare sandbox SDK makes it very easy to do that.
You can start a process, you can expose ports, and you can share live directly from your sandbox.
You don't have to even exit it. We have additional really cool utilities where you can just make sure everything works as expected.
You can catch specific logs, you can expose specific ports.
You can do all the real work that you would usually do, instead of just, you know, screenshotting your results and sending them to people through Slack or Google Chat, et cetera.
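A hedged sketch of that share flow: start a long-running process inside the sandbox, expose its port, and hand out the resulting URL. The hostname scheme and the `exec`/`exposePort` signatures are assumptions for illustration, not the SDK's actual interface:

```typescript
// Pure helper: derive a shareable preview hostname for a sandbox + port
// (illustrative scheme only; real preview URLs are issued by the platform).
function previewHostname(sandboxId: string, port: number, zone = "example-previews.dev"): string {
  return `${port}-${sandboxId}.${zone}`;
}

// Hypothetical flow: run a dev server inside the sandbox, expose its port,
// and return a live URL a teammate (or the agent itself) can open to verify.
async function shareDevServer(sandbox: {
  exec(cmd: string): Promise<void>;
  exposePort(port: number): Promise<{ url: string }>;
}): Promise<string> {
  await sandbox.exec("npm run dev &"); // long-running process stays in the box
  const { url } = await sandbox.exposePort(5173);
  return url; // no localhost links, no screenshots over Slack
}
```

The same URL doubles as a verification target for the agent, which is the point Mike picks up next.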
Yeah, I love that for verification too, because that, that feels like such a big part of making like a useful agentic flow, right?
Is actually verifying your results. Sometimes, I don't know if you've had this experience, you set up your agent and it doesn't have a good way to verify when it's done.
And so it just one-shots code, and then you have to go spin the server up yourself and be like, oh, agent, you did this wrong.
Like do blah, blah, blah. And like, ideally you set up your agent where it's just like, Hey, check your results against this live running server.
And it makes it really easy to get that kind of iteration flow, whether it's you and an agent or you and another person, right?
Like sharing. Yeah. And you don't want to separate those two things.
You want to have all in one place and let your agent have all the context they can get and, you know, edit things in the real time.
So now it's absolutely amazing. I think, you know, I know I'm guilty, and I'm sure you're guilty too.
I love to have a million different instances of coding agents that are all separate sandboxes.
And I'm pretty sure I myself hit our internal limits quite a lot, I would say every day.
And I think a lot of our users are very creative and they build projects where, you know, usage of sandboxes is high and most of the times even critical.
I would love to ask you more.
And if you can share, what are the limits we're exposing to users now that we're GA and public, and what kinds of instances can users get from our solution?
Yes. So let me pull up the official limits right now. One second. So we have our smallest size, which is pretty small.
But let me get the right numbers so I don't accidentally say something incorrect.
So essentially you're able to run, I think, up to 15,000 of our lite instances, but that's a really small instance.
Right. So that's, you know, a 16th of a vCPU, two gigabytes of disk, and, you know, under a gig of memory.
Right. And so that's, you know, 15,000 of those.
That's really nice for, again, super lightweight code execution.
You know, you don't want to run the JVM in that or anything.
Right. Once you get up to kind of the standard size, we mostly see people vibe coding with about a full vCPU, somewhere between four to six gigs of memory, and then around 12 gigabytes of disk or something.
That you could run over a thousand concurrently of that size.
And then you could also go up at larger sizes as well.
So, you know, if you get up to four vCPU, 20 gigabytes of disk and 12 gigabytes of memory, we can also go higher than that.
This is something that as, you know, we're GA now, but that doesn't mean that we're stopping pushing these limits higher and higher.
So we do have, you know, people who, in terms of total concurrency, we're working with; they have run 40,000 of these kind of mid-size instances in prod, and have run 80,000 concurrently in tests.
So we could get really, really high.
If you're like supporting a massive amount of users running agents on your platform, talk to us if that's the case.
Published limits right now are either tens of thousands of the smaller instances or a couple thousand of the kind of mid-size instances.
Yeah. I mean, as a team and as a product, we want to help you scale.
And I think we kind of have everything waiting on our side to make you successful and let you run it with as many users as the product requires.
I think it's exciting to know that there have been some pricing changes to the containers product.
Do you want to tell us more about that? Yeah. The big change was the active CPU pricing.
So when you run one of these, just kind of the baseline of how these things are charged is right, you call start on a container, you're charged from when that container starts booting to when it ends.
So if you have a three minute coding session, you're charged for those three minutes.
You can also set a sleep-after, which is really nice: essentially, if the container hasn't gotten any commands to execute code or any HTTP requests, it will automatically sleep.
And so you're generally charged from, from that time, either, you know, you start to the time you tell it to stop, Hey, stop this container.
Or to the time where it automatically kind of times out.
Now, during that you're charged for disk and memory.
The big change recently was you're only charged for active CPU cycles during that time.
And so, usually, going back to the npm analogy, it might be: hey, I boot my process.
There's a little spike in CPU to, you know, boot up my web server.
And then there's this like big spike as an NPM installs.
And then as I'm like waiting for the LLM to respond to me, or I'm waiting for user input, it's like sitting at 10% CPU utilization.
And essentially, whereas before you used to be charged for kind of like your high watermark of like all the CPU you use during that three minute session.
Now you're only charged for those active cycles.
So if you're sitting at 10% utilization of CPU, you're only charged that 10%.
And this is a massive change compared to a lot of, you know, existing container platforms and something that's, that's worth calling out where you're like, Hey, I'm, I'm on EC2 and I get this instance type and it looks like it costs this much.
On Cloudflare containers, right? Like there's, there's a big difference versus kind of traditional container solutions in terms of how that charging works specifically in regards to CPU.
So that's been a big change and, and has made this a lot more affordable for, for users, which is great.
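To make the billing change concrete, here's the arithmetic Mike walks through: a session billed on wall-clock vCPU time versus one billed only on active CPU. The rate below is a made-up number purely for illustration, not Cloudflare's actual pricing:

```typescript
// Cost if the full vCPU allocation is billed for the whole session (old model).
function wallClockCost(vcpus: number, seconds: number, ratePerVcpuSecond: number): number {
  return vcpus * seconds * ratePerVcpuSecond;
}

// Cost when only active CPU cycles are billed, given average utilization (new model).
function activeCpuCost(
  vcpus: number,
  seconds: number,
  utilization: number, // e.g. 0.1 while idling waiting on an LLM response
  ratePerVcpuSecond: number,
): number {
  return vcpus * seconds * utilization * ratePerVcpuSecond;
}
```

So a three-minute session on one vCPU that idles at 10% utilization while waiting on the LLM costs roughly a tenth of what the same session would under wall-clock CPU billing; disk and memory are still charged for the session duration.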
We want to get it into more hands.
Yeah, I think that's it from our end. Like we're, we're really excited about this product.
There's been just a huge amount of interest here from all over in terms of how people are using this.
The other thing I'll say, Kate, and kudos to your team here, is making it really easy to stick OpenCode or Claude Code in a box. If that's how your end users are doing development, we've got some really nice patterns for just grabbing that and running it as a background agent.
And for anyone who is curious to have some, you know, out of the box examples, please do check out our GitHub.
We have a bunch of ready-to-go code pieces that you can just run on your local machine and see the product in action immediately, including the OpenCode demo that Mike mentioned, but there are a bunch of other things, like a collaborative terminal using our PTY features, a state machine using our R2 backups, and a bunch of other things that directly demonstrate what you can do with sandboxes today.
I think it's really cool that we're able to give the agent this full development environment, get it as close as possible to what a human has, and replicate the same loop with a bunch of tools: a terminal that you can connect a browser to, a code interpreter with persistent state, background processes, live preview URLs, file systems that emit changes immediately and give you feedback in real time, egress proxies for secure credential injection, and of course a snapshot mechanism to make sure the container wakes up instantly and you don't have to wait too long.
This is really, really exciting project.
Please do check out our npm and GitHub for more examples and code sources, and give it a try; go and build cool things with it.
For Developers
Give your AI agents persistent, secure sandboxes to write and execute code at scale.

Agents Week
Join us for Agents Week 2026, where we celebrate the power of AI agents and explore how they're transforming the way we build, secure, and scale the Internet. Be sure to head to the Cloudflare Agents Week Hub for every announcement, blog post, and...
Watch more episodes