Code Mode: Giving AI Agents an Entire API in 1,000 Tokens (With Demos)
Presented by: João Tomé, Matt Carey
Originally aired on February 27 @ 6:00 AM - 6:30 AM EST
In this episode of This Week in NET, host João Tomé is joined by Matt Carey to break down Code Mode: a way to give AI agents access to the entire Cloudflare API (2,500+ endpoints) using two tools and roughly 1,000 tokens of context.
Instead of exposing thousands of individual tools (which quickly becomes expensive and brittle), Code Mode lets the model write JavaScript to search and execute against a compact API context. The result is massive compression, lower cost, and better performance.
We also include demos showing how agents can query real infrastructure, navigate multiple accounts, and build things like multiplayer experiences using Durable Objects and WebSockets.
Transcript (Beta)
So the raw OpenAPI spec, if we wanted to traditionally give the model access to the whole API, we would dump the whole API spec in, or we'd dump the whole tool spec in, which is even worse potentially, and that would be about like 2 million tokens.
If you just get down to required parameters, so you actually lose loads of options, but just required parameters for every API, you're at like sort of 240,000 tokens.
And with code mode, just the search and execute, executing code, we're down to just over 1,000 tokens.
So this is, I have this loaded in every session that I run of Claude Code, of opencode, of any coding agent.
It doesn't add much context, and it works pretty damn well.
Hello everyone, and welcome to This Week in NET.
It's the February 27, 2026 edition, and this week we're going to talk about agents, code mode, and developers, and, of course, cost savings for those interested in building with AI.
For that, we start with Matt Carey, Senior Systems Engineer.
I'm your host, João Tomé, based in Lisbon, Portugal.
And this week, we actually have a double episode with two guests about two different blog posts that are about building, but those have something in common.
They went viral and drew a lot of interest. AI is moving fast, building with AI is getting real, and there was just too much to fit in one show.
So we have two episodes about these topics. First up, agents and code mode. Matt Carey talks us through giving agents access to the entire Cloudflare API, more than 2,500 endpoints, using just two tools and around 1,000 tokens of context, so not as expensive.
Quick key terms, for example, agents are AI systems that don't just answer questions, they take action.
MCP is the Model Context Protocol that we've discussed before on the show, the standard for letting models call external tools safely.
The context window is the amount of information a model can see at once.
Tokens are the units that measure model input and cost. And of course, code mode: instead of loading thousands of API tools, the model writes JavaScript to search and execute against a compact API context.
So the result is massive compression, lower cost, better performance.
In our second episode, that will be published in a few hours, we talk about vNext, how one engineer rebuilt Next.js with AI in one week.
We'll also answer a few questions asked on social media about how it came to be with AI.
It's an experimental project, so stay tuned for that.
Without further ado, here's my conversation with Matt Carey. Hello, Matt. How are you?
Hey, how are you doing? I'm good. For those who don't know, where are you?
Oh, I'm actually just outside Lisbon now, in a beautiful beach town called Caparica.
Yeah, I just moved here, it's very exciting. You're just a few kilometers away, actually, on the other side of the river, so I almost have a view of where you are.
Where are you based usually? So, I actually just moved to Portugal, but I was previously in London for a while.
So it's a recent move. Very recent.
Are you enjoying Portugal? Yeah, super nice, super nice. We've got some really good weather at the moment.
There's surfers out my window. Yeah, it's really cool.
After a few weeks of storms, now the better weather is coming. It's not the same, actually, in the New York and Boston area, with the snowstorms around.
No, they had like, what, five storms back to back my first two weeks here, or four or five.
It was crazy here, yeah. All the flooding and all of that sort of stuff. We didn't get much of it in Lisbon, but I saw it on the news.
It was wild. It was, it was.
Oh, well, you wrote a very interesting blog post about code mode. That is something that's been going on for a while now.
Give agents an entire API in 1,000 tokens.
For those who don't know, really, what is code mode, what is this about, and why should they care?
Yeah, so I guess it's really good to have a little bit of a backstory around the problem that we're trying to solve.
So I work on agents and MCP at Cloudflare, part of the agents SDK team.
So I think you've had Sunil on here before, potentially. Yep. Sunil's on my team.
And we work predominantly on open source libraries that help people be successful with Cloudflare, predominantly building agents.
So we support the agents SDK, sandboxes, and also all of our MCP infrastructure.
MCP, the Model Context Protocol, was released by Anthropic in November 2024, so just over a year ago now.
And the idea is, how do you give agents hands? How do you let AI reach outside of potentially the process it's running on, the computer it's running on, the place it's running on to access external tools?
So if you think about making agents useful in the workplace, that's a big thing that you're going to have to work out at some point.
How do you let it access your data? How do you let it access the places that you work in?
And I've been really lucky to get some demos from some of the creators of MCP, and it's really fun to hear them speak about it: yeah, we just wanted to get Claude outside the box, just let it reach outside of the glass box that it was acting in.
So that's a little bit of background about MCP, Model Context Protocol. It has things called tools, which are the things you interact with, maybe some external function.
And it also has some other stuff, but we're going to mostly focus on tools.
These tools, traditionally, what people would do is they'd do like, fetch all the tools from a particular MCP server, and then they'd load them into the context of their agent.
And that means that you're kind of limited by how many tools you can have.
There are these things called context windows, and maybe they'll go up to 200,000 tokens, maybe all the way up to a million, but there is a limit, a hard limit.
And the closer you reach that limit of context, the more, like, firstly, the more expensive the inference becomes because you're loading many more tokens, but also, like, the less performance the model has.
And when you go above that, the model just, like, the API just completely won't work.
So you want to try and reduce that context as much as possible.
So how people have been doing this, they've been, like, turning on tools when they need them, turning them off when they don't need them, but, like, that really stops the autonomy of agents.
So we've just been trying to work out, like, how can we load many, many more tools into a context window, or how can we compress them in a way that is useful to get many more in the context window?
And so the first version of Code Mode came out last summer, and that was Sunil and Kenton.
They worked out that if you could generate a TypeScript SDK from the tool specification, then you could just give the model one tool, and that tool would be to write code, and it could write code over the TypeScript SDK, and then that would call the underlying tools.
And that was really, really cool. They had, like, massive compression over the amount of tools they could call, and we kind of expected that would be how people would do stuff.
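That original idea can be sketched as a tiny generator: take each tool's JSON schema and emit a TypeScript-style function signature, so the model sees one compact SDK surface instead of many full tool definitions. This is an illustrative sketch, not Cloudflare's actual implementation; the schema shape and the `listZones` tool are simplified examples.

```javascript
// Sketch: turn MCP-style tool specs into a compact TypeScript SDK surface.
// The tool and its schema are simplified examples, not Cloudflare's real tools.
function toTsSignature(tool) {
  const required = tool.inputSchema.required || [];
  const params = Object.entries(tool.inputSchema.properties || {})
    .map(([name, p]) => `${name}${required.includes(name) ? "" : "?"}: ${p.type}`)
    .join(", ");
  return `/** ${tool.description} */\ndeclare function ${tool.name}(${params}): Promise<unknown>;`;
}

const tools = [
  {
    name: "listZones",
    description: "List zones on an account",
    inputSchema: {
      type: "object",
      properties: { accountId: { type: "string" }, page: { type: "number" } },
      required: ["accountId"],
    },
  },
];

const sdk = tools.map(toTsSignature).join("\n\n");
console.log(sdk);
```

The model then gets a single "write code" tool, and the generated signatures are all the context it needs to call any of the underlying tools.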
But, like, as all of these things, like, that's not exactly how it worked out, and other people came up with other ideas of how to compress their tools.
So, like, Anthropic came out with programmatic tool calling just afterwards, which is very, very similar.
And then they also released, like, tool search inside the core code, where you can search over different tools.
Like, no one had really solved that problem with MCP. Providers were still exposing MCP servers; Cloudflare was still exposing 13 or 14 MCP servers, and all of these servers had a very small number of tools each. Each product team that built one would say: we're going to cherry-pick the seven most important things you can do on our API, put those in our MCP server, and make sure they work really well.
And when you do that, you lose, like, the granularity, you lose, like, the edge cases, which make the API really useful, right?
You lose all of the endpoints, and you'll distill it down into, like, six, seven, ten unique operations.
And I was like, well, why don't we combine code mode with MCP servers?
So instead of executing code on the MCP client, we actually move all of the execution to the server, and try and expose the whole Cloudflare API in one MCP server.
And yeah, well, we managed it. And there's some technical things we can get into about how we managed it.
It's based on a primitive called dynamic worker loaders, which I think is really, really cool.
We can talk about it a bit more. But that was it. That was the five-minute rundown of what we did.
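To make the "two tools" idea concrete: a minimal, hypothetical version of such a server's tool list might look like the snippet below. The names match the ones used in this conversation, but the exact descriptions and schemas on Cloudflare's server may differ.

```javascript
// Hypothetical shape of a two-tool code mode MCP server's tool list.
// Both tools accept JavaScript source written by the model.
const tools = [
  {
    name: "search",
    description: "Run JS that searches the OpenAPI spec; returns matching endpoint types only",
    inputSchema: { type: "object", properties: { code: { type: "string" } }, required: ["code"] },
  },
  {
    name: "execute",
    description: "Run JS that calls the Cloudflare API; executes in a sandboxed dynamic worker",
    inputSchema: { type: "object", properties: { code: { type: "string" } }, required: ["code"] },
  },
];

// Rough context cost, using the common ~4 characters per token approximation.
const approxTokens = Math.round(JSON.stringify(tools).length / 4);
console.log(tools.map((t) => t.name), approxTokens);
```

The whole tool list weighs in at a few hundred tokens at most, which is where the "roughly 1,000 tokens of context" figure comes from.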
It makes sense. And it uses a bunch of different Cloudflare products, actually, to make this work, right?
Which is also interesting, in a way.
The ecosystem that is being built 15 years now, over 15 years, it actually, it's great for this type of purpose, in terms of making these type of things work.
One of the things that I want to, for those that are not developers, to understand is how helpful, how relevant this is.
Of course, efficiency, less tokens, less costs, always important.
Anyone can understand that.
But also, in terms of the output versus efficiency, what are the main gains there?
Why don't we do a tiny little demo, and maybe it might be helpful to talk through it.
Let's go for it. So I'll just share a screen quickly. So this is Claude Code.
We've probably seen this screen before. And we can do... We're connected to an MCP server, and I just authenticated it.
But we're connected to an MCP server, the Cloudflare API MCP server. Forget about the staging bit.
It just means it's got some goodies. And this is public for anyone to have a play with, so you can try and replicate some of this stuff.
But say we want to use some part of the Cloudflare API.
So what workers do I have deployed on Matt's account?
And so I could go into the dashboard, I could use Wrangler, our CLI, or I could ask this MCP server.
And this MCP server is going to write some code.
You can see I have a bunch of different accounts. The MCP server is going to write some code that searches over the API spec, finds the API that it wants to call, calls the API.
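The code the model writes for a request like this is ordinary JavaScript against the REST API. Here is a hedged reconstruction: the Workers scripts listing route is the real Cloudflare endpoint, but the helper name, token handling, and the injected `fetchImpl` are illustrative, so the sketch runs without a live account.

```javascript
// Sketch of model-generated code: list Worker scripts on an account.
// GET /client/v4/accounts/{account_id}/workers/scripts is the real route;
// auth and error handling are simplified for illustration.
async function listWorkers(accountId, apiToken, fetchImpl = fetch) {
  const res = await fetchImpl(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/workers/scripts`,
    { headers: { Authorization: `Bearer ${apiToken}` } }
  );
  const body = await res.json();
  return body.result.map((script) => script.id);
}

// Demo with a stubbed fetch so no real credentials are needed.
const fakeFetch = async () => ({
  json: async () => ({ result: [{ id: "visitor-book" }, { id: "hello-listeners" }] }),
});
listWorkers("acct_123", "token", fakeFetch).then((ids) => console.log(ids));
```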
Here it had to narrow it down to the particular account.
And hopefully we've got a good answer. So these are all my demo workers, basically.
I have 24-plus workers deployed on the account. These are the key ones.
We can even go a little bit crazier like this. So maybe like... Which of my zones?
So this is entering a part of the Cloudflare ecosystem that I don't touch very much.
So which zones have the most traffic? Maybe we could use GraphQL API.
And I just know that the API that we should use here is the GraphQL API, so I'm going to give it a little bit of feedback.
It probably doesn't need it, but it's just for the purposes of the demo, just make this nice and easy.
You can ask for a graph and possibly it goes there directly, right?
Yeah, so it's good to give it as much feedback as possible.
And I've spent a bit of time with this API, so I know which bits exist.
And if I can express that to the model, then it'll just have to do less search.
And so it'll just speed up this whole process for us. But we're going to...
It looks like it's returned some data, so we're just waiting for Claude to write that out.
But these are all read-only things. Oh, something's failing. Which happens in all the demos.
Yeah, yeah, unknown field zone name. Now too many zones requested.
Okay, maybe let's just try mattzcarey. Oh no, it got around it.
So I'm just asking it to try my one specific one. And let's see if it can do that.
Okay, it could do that quite successfully. Maybe I asked it something that was actually impossible, potentially.
Or it maybe had to write some more stuff. Okay, so it's got some data about my zone.
But I'm just going to start writing a prompt so we get that.
But we get it with... Okay, let's... Maybe let's try and deploy a worker next, because I think it's more fun.
Let's deploy a hello world worker saying hi to all the listeners.
So I've pressed enter on that while we're waiting.
So my personal website had 20,000 requests over the last week. And we can split it out by day, by 24 hours, different visitors, all of that cool stuff.
See my cache here. Like, we're just using the API in a way that I wouldn't know how to use it otherwise, which is quite cool.
Also, in a way that it's not always exposed on the dashboard, which is also kind of interesting.
Ah, this is fun. Okay, so I said deploy a hello world worker saying hi to all the listeners.
It immediately said the API token doesn't have write permissions.
Interesting. So let's see how we get to that.
MCP, enter. When I first authenticated, I just gave it read-only permissions.
So let's share my whole screen, because I don't think you'll be able to see what happens when we do this.
So you're probably going to see yourself in the corner there, I'm so sorry.
But if I re-authenticate, browser window opens, and I get this window here, right?
And this window allows me to, just get rid of you for a second, allows me to pick exactly the scopes that I want access to.
So for the purposes of this demo, I'm going to give myself workers full access, so I can read and write workers.
I'm also going to give myself access, read and write access.
And we'll see why I do that in a second, because that is a fun part of this demo.
But let's press continue. And this is like all parts of MCP that I can basically give my agents specific OAuth scopes.
And we've never done an API or an MCP server where you can do this.
This is kind of new, it's kind of fresh. But it's really necessary, because if you're doing a Cloudflare, like a full MCP server for the whole API, you kind of want to be able to narrow it down.
Okay, cool. So we've re -authenticated.
Okay, now I'm going to say, you should be able to do this. Now I gave you permissions, something like.
So it's giving me some fun insight into how I would do it myself, with an index.js, making a Wrangler file, and deploying with Wrangler.
But we're just going to sidestep that.
We're going to see what happens if we just carry on and try and...
Okay, so this is cool. So it's written some code. It's called the API endpoint to upload the worker script.
And now it's enabled the workers.dev subdomain.
So we have a way to access the worker, which I think is very, very cool.
And then it also had to do a get request to get the subdomain. Apparently that's the thing.
Okay, hello listeners. Oh, well, I think this is an emoji it's trying to do, but that is pretty cool.
We have a deployed endpoint. Hello listeners.
But let's make it better. I want this to be more of a visitor book. This is my favorite demo.
I want to see more of a visitor book. I want to be able to write on the wall when I visit.
Let's use KV to store everything. Also, yeah, we'll do that.
Even if you don't say the KV to store everything, possibly it will go there.
It will take only a bit more time, right? Yeah, it just becomes a little bit more non-deterministic.
When I prompt an LLM, if I know how it should do something, I normally tell it, because you just get better performance.
And since we don't have much time, it's nice to try and rattle through this.
But exactly, exactly the case.
So what's it done? It's very smart, actually. Really very smart. What have we done?
We've looked up the endpoints to do with KV, KV namespaces. And then we've created a new KV namespace for the visitor book, which is cool.
It's returned an ID, and now it's written a whole load of JavaScript, which it's going to upload as the worker, visitor book worker.
And now my visitor book is live. How cool is that?
Oh, it doesn't look that bad. Let's go. That's great. So let's write on the wall.
Hey friends. So hello -listeners.max.edcary.workers.dev. It's public.
Okay, we're going to make it less public in a second, but anyone can sign it.
Right. You see, I had to reload the page there, and it got me a bit confused. Let's make it not have to reload the page.
Why don't we... Can we make this multiplayer, like a chat room?
I want this all local-first and syncy, and in brackets: use Durable Objects, because I love Durable Objects, if anyone hasn't played with them.
They allow you to make these really cool multiplayer things, because they're like a little piece of compute that lives somewhere, and they have WebSockets that you can connect to. Someone can send a message, and that message can be sent to every client that's connected. Just the beauty of WebSockets, and Durable Objects make this really easy.
So hopefully they make it easy enough for an LLM to make a full sync engine, and a full multiplayer chat room with one prompt.
Well, we have iterated a little bit, but let's see if it's possible.
And it is also worth bearing in mind here that we haven't saved any of this code on our computers.
It's kind of fun, this. So we could run this on anything.
I could run this... There's no development environment. So I could run this on my phone, I could run this on my Raspberry Pi, I could run this on anything.
All right. And... Cool. All right, let's go. I'm joining the wall. Nice.
Right. If you would like to... Could you connect to this for me? And just see if this works?
Sure. And I'm not going to reload the page. And I'm also going to... You're there.
Wait. I am. I'm going to go crazy with this. Let's see how fast it is. So this is not local.
This is running over the public Internet. How quick is that? Really quick.
It's quite amazing to see how real-time things can be achieved in so little time.
Yeah. And real-time is, you see, one of those super hard problems. With Durable Objects it's made pretty easy, and with the Cloudflare MCP server writing code over our API, deployment of these little mini apps is made very easy too.
But say we have a real application. Like a real application. Say my personal website is mattzcarey.com.
Very basic website. But it is running. And I kind of like to keep it running.
It says it's not secure. That's interesting. Might fix that after this.
But I like to keep it running. But how about just for one moment, I want it to not be accessible to the general public.
I just want it to be private. Just for me, maybe.
While I'm uploading some stuff, changing some stuff, maybe I want to create a staging version of this website.
Potentially. Potentially helpful. So what could we do?
What could we do with that? So we have this thing called Access. Which I will admit, I'm not very good at using.
And it allows all of this enterprise auth stuff.
You can just basically block off connections to certain routes, certain URLs, certain zones.
And just completely block off access, depending on some person's login credentials or email address or a bunch of other things.
They can sign in with different IDPs. But I don't really know how to use that in a dashboard, I'm not gonna lie.
I've played with it. It's quite hard.
I guarantee you, this is the easiest way to set it up. How does it work? So I want to make my website only accessible to me.
I'm gonna just clarify which zone that is.
Only accessible to me. Can you put it behind access and make a policy of only me?
And my email address is mkeri at Cloudflare.com. And we'll just see what happens here.
I might end up doxing all of my emails in a second when I go into my emails.
But we'll work out how we get there. But I think the demo is just like, how easy is it to play with the Cloudflare API?
We have two and a half thousand endpoints.
And we're able to search over them and execute code over them kind of autonomously.
Yes, sure, this is a relatively decent prompt. But it's not that decent, let's be honest.
When we were going back and making the visitor book, my prompt was make it local first and syncy.
It's not hectic. Right, okay. So what would have taken me a while in the dashboard, reading some docs, is now done: my website, mattzcarey.com, is hosted behind Cloudflare Access.
Shall we check? Oh, baby, let's go.
Right, so I'm actually gonna stop sharing for a bit while I wander to my emails because I've done this way too many times.
And it's not healthy for anyone's relationship when your emails get published.
That's true. Security first.
Yeah, cool, cool. Right, I did get an email. Let's share again. My whole screen.
Cool, we're back. And we're in. So no one else can access my website. Obviously, I don't really want that.
So maybe let's update this access. We actually don't need access anymore.
So can we just... Actually, I don't want that. I just like make it accessible to the public Internet again.
And it'll just like remove the policy, remove the access application.
Like there's loads of bits that you have to learn how to use.
And this makes it all quite straightforward. It's like having your personal LLM inside the dashboard, inside all the capabilities that Cloudflare has.
You even have a DDoS attack example on the blog as well. Yeah. To protect an origin from DDoS attacks, right?
Yeah, that one's actually really, really cool.
When I played through that scenario, it was like, this actually works.
It's amazing. Yeah, really awesome. I guess like there was something else that I was gonna play with.
So this is non -protected now again. The wall though, if we go back to the visitors thing, if I leave this up, it's an unprotected URL.
People could play with it.
We could put workers AI in it and we could make it like really...
We could make our own social media or whatever. But it might start costing us money at some point if someone pinged me a lot or if people started using it.
So I don't want to keep it running. But how would I get rid of it? It's alive in the world.
I could click around the dashboard or I can just be like, can we delete the visitor book and the KV and anything else we made in this session?
And then the idea being now that we can clean up and have a completely clean slate.
And we've just done a huge amount of experimentation with Cloudflare.
We've played with durable objects.
We've played with KV. We've played with workers. We've played with access.
We've played with DNS analytics. And it didn't take us that long.
I think it was pretty good. It was. It was great. And we entirely scoped the permissions that we needed at the beginning, which is very important.
Of course.
And in a way, it was also related to the fact that you interacted with an LLM, in this case, Claude Code.
And you were really writing simple language, without writing code yourself.
Right. Yeah, that's pretty much it. So the blog post goes over this in a lot more words.
But basically it is, we've compressed the whole of the Cloudflare API into two tools, search and execute.
And search can write some code to search over the API and execute writes some code to act on the API.
But importantly, this massive open API spec never gets loaded into the LLM's context window.
So when we look at search here, it does some search over methods and paths in this spec.
But it never reads the spec. It just reads the types of the spec. And so we've enabled that massive compression from all of these endpoints to just a few TypeScript types.
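The key trick in search is that it returns only a compact summary of matching endpoints, never the full spec. A hypothetical version over a toy spec (the real one has around 2,500 endpoints, and the real upload route includes a script name):

```javascript
// Sketch: search an OpenAPI-style spec by keyword, returning only
// method + path + summary, so the full spec never enters the context window.
function searchSpec(spec, keyword) {
  const hits = [];
  for (const [path, methods] of Object.entries(spec.paths)) {
    for (const [method, op] of Object.entries(methods)) {
      const haystack = `${path} ${op.summary}`.toLowerCase();
      if (haystack.includes(keyword.toLowerCase())) {
        hits.push({ method: method.toUpperCase(), path, summary: op.summary });
      }
    }
  }
  return hits;
}

// Toy spec standing in for the ~2,500-endpoint Cloudflare OpenAPI spec.
const spec = {
  paths: {
    "/accounts/{account_id}/workers/scripts": {
      get: { summary: "List Worker scripts" },
      put: { summary: "Upload a Worker script" },
    },
    "/zones": { get: { summary: "List zones" } },
  },
};

console.log(searchSpec(spec, "worker"));
```

The model only ever sees the handful of matching summaries, then writes execute-code against the one endpoint it actually needs.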
So it's much more efficient in terms of cost. Yeah, cost and just destroying a context window.
So the raw OpenAPI spec, if we wanted to traditionally give the model access to the whole API, we would dump the whole API spec in, or we'd dump the whole tool spec in, which is even worse potentially.
And that would be about like 2 million tokens.
If you create tools from all of the schemas, you get with like 1.1 million tokens, but you lose a little bit of granularity compared to the full spec.
If you just get down to required parameters, so you actually lose loads of options, but just required parameters for every API, you're at like sort of 240,000 tokens.
And with code mode, just the search and execute, executing code, we're down to just over 1,000 tokens.
So this is, I have this loaded in every session that I run of Claude Code, of opencode, of any coding agent.
It doesn't add much context and it works pretty damn well.
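The compression quoted here is worth sanity-checking: from the raw spec down to code mode is roughly a 2,000x reduction in context. A quick calculation using the figures from the conversation:

```javascript
// Context cost at each stage, in tokens (figures quoted in the conversation).
const stages = {
  rawOpenApiSpec: 2_000_000,
  toolsFromAllSchemas: 1_100_000,
  requiredParamsOnly: 240_000,
  codeMode: 1_000, // just the search + execute tools
};

for (const [stage, tokens] of Object.entries(stages)) {
  const ratio = Math.round(stages.rawOpenApiSpec / tokens);
  console.log(`${stage}: ${tokens} tokens (${ratio}x smaller than the raw spec)`);
}
```

Put another way: the raw spec is 10x a typical 200,000-token context window on its own, while the code mode tools use about half a percent of one.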
It does. And the demo is really cool, just to see what you can do in a few seconds in a very natural language type of ways.
Kind of amazing to see specifically. One of the things I'm curious is also the feedback.
We got a lot of feedback online. What surprised you the most in terms of feedback, in terms of how people are actually using it already?
Yeah, so first of all, huge amount of people jumped on it and used it.
I think there was sort of a million odd views on the Twitter post that I did about this blog, which is crazy for me.
I don't normally get that much exposure. To get a million views was wild.
And yeah, people loved it. There was just like, it opens the door to using Cloudflare a bit more naturally.
And people spend a lot of their day, a lot of programmers spend a lot of their day, or software engineers or product managers, a lot of their day inside these coding agents.
And being able to natively talk to your infrastructure in that way is kind of illuminating.
But what's also very, very cool is people are wanting to know how can they do this with their own APIs?
So other providers, people with very large APIs are wanting to work out how they can do this.
And more importantly, potentially, the customers of those large providers are also wanting to know how they can get access to something that works not for Cloudflare, but for some other large platform that has 2000 odd endpoints that they want to use from their coding agents.
So we actually released version two of the Code Mode SDK at the same time.
And this allows anyone to build their own MCP server that does this stuff.
You can wrap another MCP server, you can wrap like 1000 AI tools, you can wrap an open API spec, you can wrap whatever you want.
The idea of executing code rather than calling tools directly is the idea of CodeMode.
And that's what we want to try and enable with this SDK.
Also, from the blog, the MCP server here, the Cloudflare MCP server is also open source.
And so a fun way of understanding how it works, and also recreating the functionality for yourself, is to get your coding agent to clone the repository and point it towards an API that you like, or that you use a lot. A fun one is the GitHub MCP server: you can point it towards that and make your own Code Mode MCP server on top of the GitHub MCP server that does the same things we are doing.
And you don't have to own the underlying MCP server, you can just make your own because APIs are public.
We build Cloudflare with Cloudflare, which is a very, very typical thing we usually do.
But it's actually also a proof of concept in terms of, as you say, others can use for their own APIs, their own MCPs specifically.
That opens the door for MCP maybe to have a, let's say, second life in a way.
Because it was MCP, it was getting less traction at some point, right?
Yeah, there was a lot of chat about whether CLIs kill MCP servers, with OpenClaw, or Moltbot, or Moltworker, all these things.
OpenClaw itself has had a huge amount of popularity recently.
And did you have the Moltworker guys on? Yeah, I had Celso talking about Moltworker, and also Markdown for Agents, not MCP.
Okay, yeah. So all of these people are working on different stuff in parallel.
And there's a lot of discourse online about how OpenClaw specifically uses MCPorter.
Which is an MCP to CLI tool.
And how the CLIs kill MCP because they enable things like progressive disclosure of tools.
You don't dump more in the context window. They're very native to use for LLMs and all of this stuff.
And I would just argue that structuring your MCP server like this enables progressive disclosure.
We can search, we can execute.
The same as calling a CLI, you can call help on all of the commands.
And it is also very LLM native because we're just writing a small amount of code.
And this code is code that LLMs have seen so many times. Like how many times has Claude generated a fetch request?
Like it must be up there with one of the most popular things it's ever done.
And so like I think this is hugely LLM native.
The reason why we get, one of the reasons why we get such good performance here is we're writing stuff that's in distribution of the models training set.
You always want to keep the models in distribution.
And code, specifically JavaScript code is very much in distribution.
The web is built on JavaScript and the models see a huge amount of JavaScript during pre-training and post-training.
And we are just letting the model do what it wants to do, which is write JavaScript.
And a lot of this design was like influenced by just like what does the, what do the models do better?
I changed something, tested it, changed something else, tested it.
Which does the model prefer? I think like if you can lean in towards some of that design decision, like it helps a lot.
And this whole argument of like does CLIs kill MCP?
Is MCP dead? No, MCP is not dead. MCP is the way that models are going to interact with external tools.
Like MCP has won and it's getting better and better and better.
The fact that some people dump a huge amount of tools across the MCP wire, like that is user error, I'm afraid.
There are better ways of doing it.
And this is one of those better ways. OpenClaw itself runs on MCP. It just wraps the servers in a CLI for progressive disclosure.
Like there are many ways of doing that.
There are some situations where you can't wrap things in a CLI.
So I was talking about like running this on my phone, for instance. Like I could run this on a very simple app on my phone.
I would not need shell access on my phone.
That would work. I maybe don't want to get, I maybe want to build an agent that doesn't have shell access.
This will work. We run all of this code in a dynamic worker isolate or a dynamic worker loader, which is a super sandbox.
It's like super eval.
You can just run code as a Worker in a very sandboxed environment. We can restrict what fetch calls, what outgoing network requests, can be made.
There's no access to environment variables.
There's no access to, like, a file system. There's no access to anything on the underlying host.
Like this code itself here does not get executed on my machine.
It gets executed in a primitive that was designed for this, a sandbox primitive that was designed for this in the cloud, on the Cloudflare edge.
And I don't have to worry about the LLM being prompt injected to write some very insecure thing, dumping the contents of my env variables into an email somewhere and sending them off to God knows who, because none of this happens on my machine.
And I think that is, there's a big security thing that people don't really think about when they talk about CLIs are better.
And even if they do think about it, there are just some situations where you can't use a CLI.
You don't want to give shell access.
And then it's a non-starter. So different things for different situations.
Of course. It's part of the backbone that's being built here.
Sandboxes keep things protected, and the Agents SDK is exactly the kind of tool that can bring these new possibilities into production, securely and efficiently.
Yeah, yeah. 100%. And there are blog posts that me and Sonil keep talking about writing.
I think he's going to get there first about the different types of sandboxes.
And Kenton has an amazing talk, I don't know where it was from, maybe Connect last year, about the different types of sandboxing.
I mean, he's like the sandboxing GOAT.
But running untrusted user code is a very new thing for everyone.
People built DSLs, people built markup languages, people built query languages just so they could stay in control.
They could give their users the semblance of code without actually giving them code, right?
You can think about it like this: basically every SaaS tool you know probably has some way of configuring something in a code-like manner.
But it's not code. And it's not code because when the engineers sat around deciding how they were going to build the thing, they said: we can't give our users the ability to execute untrusted code on our machines.
That's unsafe. Well, now you can.
And I think this is going to change how people do a lot of stuff. It's going to go well beyond me playing around with the Cloudflare API.
Makes sense. Before we go, I need to ask you about the use cases you've seen agents perform.
We spoke about a few demos and possibilities. But in general, in the agent space, what excites you the most?
What are the use cases, other than the Cloudflare API, that you think could really make a big difference?
Yeah, I don't have any particular ones for you. It's just that LLMs are really good at taking structured data and making it unstructured, and taking unstructured data and making it structured.
So it's this fuzzy layer: where previously you'd have had to write some very brittle logic, that logic can now just be generated on demand, and we can kind of see that here.
Like, the REST API from Cloudflare is a structured thing.
My funny prompts are not structured and the LLM goes from one to the other.
So, you see, most businesses have something that is structured and something that isn't, and they spend a lot of time turning one into the other.
Well, LLMs are going to be pretty useful at all of those things.
Translating like human mind to actual execution and the opposite as well, which is interesting.
Exactly. One of the things I need to ask you is we've been teasing all week a new announcement.
This will be public.
We're recording on Tuesday. This will be published on Friday. What can we say?
What excites you about that specific announcement? Oh, like the one that's coming this afternoon?
Yep. Yep. Okay. So where do I start? I still feel weird talking about it, but I think what's coming this afternoon, and you'll know all about it when it comes out, is going to really demonstrate the power of the Cloudflare platform.
Cloudflare can take anyone's code and run it at the edge.
And as long as it's JavaScript in some form or another, it'll work.
And the aim is to make that as seamless as possible. And we're getting so close to it.
Like the dev experience is getting so, so much better. Everything is getting better.
And we have some incredibly smart people coming up with the most wacky ideas to make everything work even better.
And that sounds super woolly, but once you've seen the announcement, you will know exactly what I mean.
So yeah, I hope all the listeners are as excited as I am.
And if you haven't seen the announcement, just please like go on the Cloudflare blog or something and find it.
It will be there.
Last but not least, I saw something this week related to getting a custom email address for free using Cloudflare plus Gmail.
Gmail is actually promoting the idea that people can have their own domain as an email address.
And this was actually trending on Twitter:
someone laying out the steps of using Cloudflare's Email Routing to get a custom email address for free. It's a very specific use case, not business-scale with many emails per day, but it's a very cool thing that's available for free with Email Routing, which is interesting.
Yeah, and we're going to have email sending soon, I think. I'm pretty sure I can talk about that.
Yeah, it's going to happen. All of these primitives that you need to build very high-performance web applications, or generally things that act on the Internet, Cloudflare will have.
And I think we already have some of the best ones, like Workers.
I joined Cloudflare because I was super excited about the direction of Workers and Durable Objects.
I saw the beta for dynamic worker loaders and I was like, oh my God, we can execute other people's code.
That's wild, especially in an era of LLMs. And that's like why I joined Cloudflare.
And I guess if anyone's listening, who is excited about that sort of stuff as well, and the intersection of agents with all of that, we are hiring on our team.
So come and yeah, pop me an email or something. Last but not least, anything we can say of what's coming this year for agents?
If we can say something at all.
Yeah, in the blog post about code mode, we tease that like search and execute and this like code mode stuff is coming for MCP portals.
So I think it's no harm in saying that we will definitely have that in the very near future.
You will be able to put your MCP server behind an MCP portal, even if it's just one server, and compress all your tools into about a thousand tokens of context.
Like that will come very, very soon.
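The search-and-execute pair Matt keeps returning to can be sketched in plain JavaScript: one tool does keyword search over a compact index of endpoint descriptions, the other runs model-written code with that search helper in scope. This is an illustrative sketch of the pattern only; the endpoint entries and the `search`/`execute` function shapes are assumptions for the example, not the real Code Mode API.

```javascript
// Sketch of the two-tool pattern: "search" over a compact endpoint
// index, plus "execute" for model-written code. The index entries
// below are illustrative stand-ins for a real OpenAPI-derived index.
const endpointIndex = [
  { path: "GET /zones", doc: "List zones on the account" },
  { path: "GET /zones/{id}/dns_records", doc: "List DNS records for a zone" },
  { path: "POST /workers/scripts", doc: "Upload a Worker script" },
];

// Tool 1: return only the endpoints matching the query terms, so the
// model never needs the full multi-million-token spec in its context.
function search(query) {
  const terms = query.toLowerCase().split(/\s+/);
  return endpointIndex.filter((e) =>
    terms.some((t) => (e.path + " " + e.doc).toLowerCase().includes(t))
  );
}

// Tool 2: run model-written code with the search helper in scope.
// A real implementation would run this in an isolated sandbox
// (a dynamic worker), not with the Function constructor.
function execute(code) {
  return new Function("search", `"use strict"; ${code}`)(search);
}
```

The compression comes from the fact that only these two tool definitions live in the model's context; everything else is fetched on demand through `search`.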
In terms of like more agent stuff, I think for me, there's this big focus on how do we make building agents on Cloudflare amazing?
We have the raw ingredients, durable objects, sandboxes.
They are like the raw ingredients for agents.
You have Durable Objects that sit there as this little execution environment. You can give one to each user.
You can give one to each session and it will persist.
The data will stay there forever. It has a SQLite database. It has WebSockets.
You can do all that funky streaming. Like that sort of stuff's made very easy with durable objects.
Sandboxes. You want to run some heavy process with a file system.
You want to give your agents access to a CLI? Sandbox. Then, how do we wire all these things up with something like Workflows, where you get deterministic workflow execution?
Amazing product as well. How do we wire all of this up together?
Also some other products like AI Search, our search index tool.
Super good product as well. And Browser Rendering: you can run browsers in the cloud.
That's going to be sick. Imagine your agent going out doing something autonomously, trying to use an API.
An API it's struggling with? Oh, we'll just pop open a browser window in the cloud.
We'll run it in a browser.
Absolutely fine. We need to log in? We can log in. No stress. These primitives are still early, but they're going to be game-changing for agents.
And yeah, I'm super excited when we can bring the whole thing into a cohesive piece and we can have all of them working with the agents SDK.
It makes sense to say 2026, the year of the agents.
Oh yeah. Or at least, 2025 was already a bit of an early version of that, but now in a more make-it-count kind of way.
Yeah. And get like agents to start doing real work.
I think there was definitely a shift with Opus 4.5 and the equivalent Codex model, Codex 5.2, where these frontier models are much better at writing code than me.
Like, 100%. They are much better at inferring facts over a huge amount of context than me.
They're much better at all of these things individually.
And it's like: how can we meet people where they are and make these agents useful?
How can we give like the tools to developers to enable them to make that like killer agent app that just wasn't possible before because the models weren't good enough?
Makes sense. You're not scared for your job? Yeah, not just yet.
I think there will definitely be a point, and I don't even want to talk about this that much, where there's a toss-up, where we become less useful than the agents.
And then we have to decide a few things, but I don't know. Human in the loop is quite important these days, for sure.
Yeah, and I just think you should go into this with your eyes open.
Sure, you can run an agent in dangerously-skip-permissions mode on your Mac mini.
Yeah, sure.
But if you give it access to your Gmail, as someone on Twitter found out, it might delete all your emails, and delete them at the speed of light, because that's how it does things.
One API call, or one API call in a loop.
And then you're just limited by Gmail's throttling. I mean, all of this stuff is going to happen.
People are going to have some very weird experiences. And our aim is to make sure that agents can only do what you let them do.
But they can do everything that you want them to do.
It's super important that you design these applications.
I think intentionally is the right word. I'm going to say that. Makes perfect sense.
Let's see where things play out. Yeah, super exciting, isn't it? It is, it is.
Thank you for this, Matt. It was great. Yeah, lovely to chat. And that's a wrap.
It's done.
