How building Cloudflare with Cloudflare helps developers
Presented by: Celso Martinho, João Tomé
Originally aired on November 8, 2024 @ 12:00 PM - 12:30 PM EST
Celso Martinho is a Senior Director of Engineering at Cloudflare. He works across multiple teams from Radar to Workers AI in our new Cloudflare Lisbon offices.
In this conversation, host João Tomé and Celso discuss the rapid growth of Cloudflare Workers, simplifying life for developers, and the importance of observability in AI. Celso also dives into how the challenges for developers have changed since the 1990s tech era and why he believes the Internet is still in its infancy.
AI Gateway
API
Cloudflare Workers
Durable Objects
Internet
News
Wrangler
Transcript (Beta)
Hello everyone and welcome to This Week in NET. Today we're going to talk about our developer platform, and for that we have Celso Martinho.
Hello. Hello Celso. Celso is our Senior Director of Engineering based here in Lisbon where we are today in our new Lisbon office.
Welcome Celso to our show. Thank you. Thank you for inviting me.
You've been here before, both on the show and of course in the new Lisbon office.
Before we start, how exciting is it to work with the team in a new office, with people coming to the office more?
The Lisbon office is a strategic location for Cloudflare for many reasons.
So I think we're all excited about the new office.
We have a lot to talk about. You work with a lot of teams. Which teams do you work with here?
Well, that's the kind of question you have to ask me every three months.
The kind of things I do keep changing. I've been lucky enough to be involved in many projects at Cloudflare since I joined the company about four and a half years ago.
But I can name a few projects that, you know, me and my teams have been working on.
So Radar is obviously the first project I would mention.
It was just the first thing we started building at Cloudflare when I joined.
It's still an ongoing project with lots of ideas and a huge roadmap ahead of us.
So I feel like I should mention Radar. But then we've done a bunch of things like email routing.
We helped with D1. More recently we've been involved with Workers AI and AI Gateway, which I think we're going to talk about.
Workflows.
So there's a number of things that you know me and my teams have been helping Cloudflare with.
Radar is a kind of Internet trends website, a free site.
And many of the projects you have now are related to Workers, the developer platform specifically.
Those are very different tasks and workflows, if you want.
In terms of the platform you're working on, specifically Workers for developers right now, what potential do you see there?
Well, I would say you're right and wrong.
At the same time, Cloudflare's building blocks are quite different now than they were four or five years ago.
Today we're in a place where you can take all of our products and APIs, specifically those on top of workers, and just build completely new products just using that.
In fact, today Radar is completely different from what it was four years ago.
Radar is now completely built on top of Radar.
And that is... On top of Cloudflare. On top of Cloudflare Workers.
And that is also true for the new products we're launching. So workflows is a good example of that.
It was completely built just on top of our APIs and on top of our existing products, workers products.
And this is very exciting because not only we can show our customers that they can also build complex applications if they want to, just on top of workers, but it also gives us the speed and the time to market to build new products very quickly without having to go through the typical complexities of building things from scratch.
So this is a very exciting time for not only us, but Cloudflare as a whole, because we are now in a comfortable place where we can think about new products and we can launch them quickly and they're all completely built on top of what we have right now.
For those who don't know, a while ago, in the 90s, you created one of the main web portals in Portugal, SAPO.
It became quite huge in Portugal. Those were times when things were just starting in the Internet era.
How do you see the ease of doing all of these things now with AI, with Workers AI, with Workers in general?
Well, obviously completely different times.
Comparing the late 90s, early 2000s with today doesn't make a lot of sense.
The world in terms of technology has changed completely, but still I can do some parallel between the excitement of building stuff back then and today.
I still think the Internet is still in its early days.
I'm not the kind of person that will tell you that we're done, there's nothing else to do on the Internet.
I think the opposite. I think we're still in the beginning, the beginning of the exponential curve.
But the way of doing things is completely different.
Back in the day, we didn't have the tools, we didn't have the frameworks, we didn't have a lot of building blocks.
We would have to do everything from scratch, basically.
Also, there was no abundance of knowledge, and there were no people to discuss ideas with.
We were building the Internet. Today it's completely different. You have a lot of options.
It's easier, in a sense. It's easier, you've got a lot of tooling, you have a lot of available knowledge and huge communities where you can discuss your ideas, your approaches to solving problems.
And that changes everything.
Things are more complex, technologies are more complex, for sure. I mean, if you were an engineer from the late 90s and you stopped learning, you stopped updating yourself, you would be completely lost today.
So things are much more complex.
But at the same time, the tooling is so much better. Abundance of knowledge is here and you have so many people to discuss your ideas with that it makes things easier, it makes things possible.
I want to dig further, but let's start by going to some of the announcements, the blog posts we published recently.
One, of course, is about our Workers platform, our developer platform, and specifically about Workflows.
Build durable applications on Cloudflare Workers. You write the Workflows, we take care of the rest.
That's the title of the blog post we launched.
For those who don't know, what is Workflows, really? Yeah. Workflows is a new product that basically implements what the industry is now calling durable execution.
Typically, workers would be short-lived requests or short-lived applications.
So you get an HTTP request, you would do something about it, you would implement some logic on top of that, and you would respond to the client in a matter of milliseconds or seconds.
And this is the typical workers' application.
So workers was not designed for specific applications where you have to do something for minutes or hours or even days.
And Workflows solves that, but it also solves that in a very useful way.
What Workflows allows you to do is for you to build an application that needs to do complex tasks and it can take hours or days or even months to complete.
And we will deal with all of the trouble, all of the problems that can happen when those tasks are being executed.
So typically, long-lived applications have issues like you're trying to connect to a third-party service, the network is down, or the database has a problem, or you're trying to charge a credit card because it's an e-commerce application and the credit card processor fails, and you have to deal with all of those problems, all of that logic.
And Workflows solves that for you.
It gives you primitives. It gives you a higher-level API that's very easy to use, that will deal with all the retries.
If the application fails completely, it will restart the application, get you to the state where it was before it failed.
And all of that is transparent to the user. You will basically be allowed to run long-lived applications.
We will take care of scaling, we will take care of handling problems and failures, and we will not rest until your application completes successfully.
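As a rough illustration of the durable execution idea described here, the following is a minimal sketch in plain Python. It is not the Workflows API: `StepRunner`, `order_workflow`, `flaky_charge`, and the in-memory `store` are hypothetical names invented for this example, and a plain dictionary stands in for durable storage. It shows two behaviors mentioned above: failed steps are retried, and after a restart, completed steps are not executed again.

```python
import time
from typing import Any, Callable, Dict


class StepRunner:
    """Toy durable-execution runner (illustrative only, not a Cloudflare API)."""

    def __init__(self, checkpoints: Dict[str, Any], max_retries: int = 3, delay: float = 0.0):
        # `checkpoints` stands in for durable storage that survives restarts.
        self.checkpoints = checkpoints
        self.max_retries = max_retries
        self.delay = delay

    def do(self, name: str, fn: Callable[[], Any]) -> Any:
        # A step that already completed is not re-executed; its saved result is reused.
        if name in self.checkpoints:
            return self.checkpoints[name]
        last_error = None
        for attempt in range(self.max_retries):
            try:
                result = fn()
            except Exception as err:
                last_error = err
                time.sleep(self.delay * (2 ** attempt))  # exponential backoff
                continue
            self.checkpoints[name] = result  # persist before moving on
            return result
        raise RuntimeError(f"step {name!r} failed after {self.max_retries} attempts") from last_error


calls = {"charge": 0}


def flaky_charge() -> str:
    # Hypothetical payment call that fails twice before succeeding.
    calls["charge"] += 1
    if calls["charge"] < 3:
        raise ConnectionError("payment processor unavailable")
    return "charged"


def order_workflow(step: StepRunner) -> str:
    order = step.do("load order", lambda: {"id": 42, "total": 10})
    receipt = step.do("charge card", flaky_charge)
    return step.do("send email", lambda: f"order {order['id']}: {receipt}")


store: Dict[str, Any] = {}
first = order_workflow(StepRunner(store))
# Simulate a restart: a new runner over the same store replays from saved results.
second = order_workflow(StepRunner(store))
```

In this sketch the card is charged exactly once even though the workflow function runs twice, because the second run finds every step already checkpointed.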
How automated is Workflows? Does it deal with everything it needs to automatically, without you going in and checking every step?
Yeah, so it's fully automatic and managed. You just write a worker script that will give you a very simple-to-use API based on steps.
Every step is kind of a task of your application. You just deploy that to workers.
Then what you can do is trigger instances of that application or of that workflow.
The thing that's also very important for people that work, that build long-lived, asynchronous applications like Workflows, is that we give you observability over what's happening.
Because if you don't have that observability, it's really hard for you to know what's happening, where your application is at.
So we also work really hard to provide things like logs, snapshots of your instance, where you can see the state.
We've got all the tooling around Cloudflare working with Workflows, including Wrangler, the dashboard.
Through the dashboard, those, or other methods?
Yeah, there's several ways you can use Workflows.
You can use them or trigger them and also see the logs and statistics using the dashboard.
You can use Wrangler, which is our command-line Swiss Army knife that allows you to interact with Workers.
You can use the REST APIs as well. So there's a number of ways you can interact with Workflows.
I didn't know a lot about Workflows specifically.
And I was surprised by the feedback, especially from the tech community, people that are building, and how excited they were to see something like that appear.
Where do you think that excitement comes from?
Did you get some feedback? I think there's two things to that.
One is we announced that we were building Workflows during Birthday Week. So that created a kind of buzz that we were building durable execution functionality on top of Workers.
And also because it's a hot topic in the industry as a whole. There's a couple of alternative projects that try to solve durable execution.
Ours is one of them.
We think ours is pretty compelling and competitive just because of the fact that it's built on top of workers.
And because of that, you're basically taking advantage of our global network.
It scales automatically, but it's still a hot topic that a lot of companies are trying to solve.
So when we finally launched last week, obviously the reception has been amazing.
And I think what we're going to do is use a lot of the feedback that we're now getting from real customers to also shape the roadmap of Workflows for the future.
What shape could that take in terms of improvements and new additions?
There's a couple of things we know that we want to do.
One of them, I think we mentioned that in the blog, is support for events.
So you can basically use events from other applications to trigger Workflows.
That's something we know that we have to build. But then we want to listen to customers and we want to know what they also want from Workflows.
And we're doing a lot of that now, actively now discussing what to do for at least the next quarter.
Before going to AI Gateway, I'm curious on the ecosystem of building Cloudflare with Cloudflare, which is a great motto, actually.
It always reminds me of that song from Tom Jones, Fighting Fire with Fire.
Maybe I'll do a video about that. But the process of building Cloudflare with workers specifically, you mentioned already some advantages specifically.
But in what way does that improve Workers?
Because we're actually customer zero, and we are potentially giving suggestions to the team to make it better and improve the Cloudflare ecosystem and the product itself.
If you look at the number of APIs we have now, we've got the full stack.
So workers now is things like storage, databases.
We do email. We do queuing. We have durable objects. We now have durable objects with SQLite.
There's just so many options we can use now. And the thing that a lot of people don't realize is that, you know, when we're trying to build something like Workflows, if you were to do that from scratch, it's a very complex problem to solve.
I mean, you know, building a database or building a distributed database is not an easy task.
It's very complex from an engineering perspective. And so what's amazing about workers is that we already solved a lot of those very complex problems.
I mean, if you look at Durable Objects alone. For those who don't know, what are Durable Objects and why are they important?
A durable object is like a singleton worker.
When you instantiate a durable object, we guarantee that there's only one running in a very specific location in our network.
And it persists data.
It has like storage. It now has SQLite as an option. That was actually one of the big enablers of Workflows.
So it's kind of a different kind of worker. A typical worker runs everywhere at the same time with the request.
A durable object is unique and it runs somewhere and you can use that to orchestrate things between other workers and also persist state.
So that's a very important building block. It solves a lot of problems that otherwise, if you wouldn't have durable objects, you would spend a lot of time just implementing that.
And durable objects alone has been a big enabler for a lot of new projects that we've been launching.
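The two properties described here, a single instance per identity and state that lives with that instance, can be sketched in plain Python. This is only an analogy under stated assumptions, not the Durable Objects API: the `Counter` class, its `get` method, and the per-object in-memory SQLite database are all invented for illustration, and a class-level dictionary stands in for the routing that the network performs.

```python
import sqlite3
from typing import Dict


class Counter:
    """Toy stand-in for a singleton object with its own SQLite state."""

    # Hypothetical registry: maps a name to its one and only instance.
    _instances: Dict[str, "Counter"] = {}

    def __init__(self, name: str):
        self.name = name
        # Each object owns a private SQLite database (in-memory for this sketch).
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value INTEGER)")
        self.db.execute("INSERT INTO kv VALUES ('count', 0)")
        self.db.commit()

    @classmethod
    def get(cls, name: str) -> "Counter":
        # Every caller asking for the same name reaches the same single instance.
        if name not in cls._instances:
            cls._instances[name] = cls(name)
        return cls._instances[name]

    def increment(self) -> int:
        self.db.execute("UPDATE kv SET value = value + 1 WHERE key = 'count'")
        self.db.commit()
        (value,) = self.db.execute("SELECT value FROM kv WHERE key = 'count'").fetchone()
        return value


a = Counter.get("room-1")
b = Counter.get("room-1")  # same object as `a`
a.increment()
total = b.increment()      # sees the state written through `a`
other = Counter.get("room-2").increment()  # a different object with separate state
```

Because both callers reach the same instance, they can coordinate through it without any extra locking in this sketch, which is the orchestration role described above.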
We were with the team in Lisbon two weeks ago and we actually asked a lot of our directors and engineering managers what they thought were the big wins for Cloudflare over the last few years.
And durable objects was in the top three of the most exciting things that we've built.
That's now enabling us to do other things on top of that.
Enables the ecosystem to be more complete, the full stack that you're mentioning specifically.
And there's advantages of having in the same ecosystem all of those potential areas to leverage the network, but also to make it easier to implement because it's in the same ecosystem.
Yeah, I mean, you don't need to solve the problems that we've solved for you already.
You just need to, you have an idea and you just take the building blocks we have, which have solved a lot of complex problems already, and you just put them together.
That isn't to say that you also will not have new problems to solve.
You'll probably have, but what we have now speeds you up immensely compared to what was the ecosystem of workers like five years ago, four years ago.
Another blog post, another area you have under you, AI Gateway, billions and billions of logs scaling AI Gateway with Cloudflare developer platform.
This one in particular also shows how logs are important specifically here.
So first, what is AI Gateway?
AI Gateway is, I guess the best way I can put it is like our CDN for AI requests.
It's basically a proxy where through our gateway, you can access multiple AI providers.
So you can use Workers AI, OpenAI, many others. Models, datasets, things like that.
But your application will request AI Gateway and then we'll proxy to multiple providers.
And while we are proxying the request, we're also giving you a couple of things.
We're giving you caching, we're giving you observability.
And now that was the launch of AI Gateway last year. It already gives you a lot of benefits, including cost savings.
But now we're building on top of that and doing more things.
So what we announced a few weeks ago was logs and specifically the blog about last week was how we built that.
But we're also doing things like evaluations, which is really important.
AI Gateway has become one of our most popular features.
It's completely free. Anyone can use it. And it simplifies life for developers that are doing AI powered applications because you don't have to deal with multiple providers, multiple APIs.
If you want to change from one provider to another, you just go to the dashboard and reconfigure what you want to use, and that speeds up life for developers.
In terms of the billions of logs it is already handling daily, those are helpful for observability, monitoring, and analysis, for making things more efficient, and for avoiding problems and errors.
In what way is this analysis and monitoring really making a difference, especially in the AI space?
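The proxy pattern described in this answer can be sketched in a few lines of plain Python. This is a toy model, not AI Gateway itself: the `Gateway` class, the `provider_a` and `provider_b` functions, and the log record fields are hypothetical. It illustrates the three ideas mentioned: one entry point in front of several providers, a cache that avoids repeated upstream calls, and a log entry for every request.

```python
from typing import Callable, Dict, List, Tuple

Provider = Callable[[str], str]


class Gateway:
    """Toy proxy in front of several providers (illustrative only)."""

    def __init__(self, providers: Dict[str, Provider], active: str):
        self.providers = providers
        self.active = active  # switching providers is a configuration change
        self.cache: Dict[Tuple[str, str], str] = {}
        self.logs: List[dict] = []

    def ask(self, prompt: str) -> str:
        key = (self.active, prompt)
        hit = key in self.cache
        if not hit:
            # Only a cache miss reaches the upstream provider.
            self.cache[key] = self.providers[self.active](prompt)
        # Every request is logged, whether or not it was served from cache.
        self.logs.append({"provider": self.active, "prompt": prompt, "cached": hit})
        return self.cache[key]


upstream_calls: List[str] = []


def provider_a(prompt: str) -> str:
    upstream_calls.append("a")
    return f"A says: {prompt}"


def provider_b(prompt: str) -> str:
    upstream_calls.append("b")
    return f"B says: {prompt}"


gw = Gateway({"a": provider_a, "b": provider_b}, active="a")
r1 = gw.ask("hello")
r2 = gw.ask("hello")  # identical request, served from the cache
gw.active = "b"       # the application code calling `ask` does not change
r3 = gw.ask("hello")
```

The application only ever talks to `gw.ask`, so changing provider touches configuration rather than calling code, and the cache hit on the second request is where the cost savings would come from.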
Because as you said, AI needs a special care, if you want.
Observability in AI is really important, specifically right now, because it's an emerging space and people are still largely learning, optimizing, improving their AI powered applications.
So if you don't give them observability, it's really hard for them to improve.
And navigate, right? To navigate that space.
So we knew this from the beginning. Customers have been asking us for better observability and we're providing that now with AI Gateway.
So you can have full logs of what's happening between the traffic of your application and the AI providers you're using.
You can also push the logs to your own infrastructure or to other applications if you want to.
And we've done that securely. You can actually encrypt the traffic because the last thing you want is to store data with sensitive information that might be traveling between the AI requests in the prompts or other things.
So we're really happy that we finally shipped that. For example, we spoke about the ecosystem and R2 is helping here specifically.
Hundreds of terabytes of data can be stored here.
Also important in terms of ecosystem, building Cloudflare with Cloudflare in a sense.
Yeah, so there's two things that if we didn't have them, we couldn't build AI logs as we did and launched last week.
One of them is, again, Durable Objects with SQLite. We're using that to create the indexes for your logs and it scales automatically.
So we can have thousands or hundreds of thousands of customers and we're just spawning Durable Objects for them and storing the indexes of their logs there.
That was an amazing enabler.
Same as workflows. And then R2. So I think we launched R2, I don't know, two years ago, something like that.
So now we have storage. We have storage buckets and we're using that to store the content, like Durable Object as the index, R2 as the content.
Again, an enabler for us. If we didn't have a storage solution, it would be much harder for us to launch AI gateway logs.
We would probably have to do something, use something not at the edge and it would make our life much harder than it was just using workers.
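The split described here, a small queryable index in one place and the bulky log content in object storage, can be sketched in plain Python. This is an assumption-laden illustration, not how AI Gateway is actually implemented: `LogStore`, the table schema, and the `logs/<id>.json` key format are invented, an in-memory SQLite database plays the role of the index, and a dictionary plays the role of a storage bucket.

```python
import json
import sqlite3
from typing import Dict, List


class LogStore:
    """Toy index-plus-content store (illustrative only)."""

    def __init__(self) -> None:
        # Index: small rows that are cheap to filter and sort.
        self.index = sqlite3.connect(":memory:")
        self.index.execute(
            "CREATE TABLE logs (id INTEGER PRIMARY KEY, model TEXT, status INTEGER, object_key TEXT)"
        )
        # Content: large bodies stored by key, as in an object storage bucket.
        self.bucket: Dict[str, bytes] = {}

    def write(self, model: str, status: int, body: dict) -> int:
        cur = self.index.execute(
            "INSERT INTO logs (model, status, object_key) VALUES (?, ?, '')", (model, status)
        )
        log_id = cur.lastrowid
        key = f"logs/{log_id}.json"  # hypothetical key layout
        self.bucket[key] = json.dumps(body).encode()
        self.index.execute("UPDATE logs SET object_key = ? WHERE id = ?", (key, log_id))
        self.index.commit()
        return log_id

    def find(self, model: str) -> List[dict]:
        # Query the index first, then fetch only the matching bodies.
        rows = self.index.execute(
            "SELECT object_key FROM logs WHERE model = ? ORDER BY id", (model,)
        ).fetchall()
        return [json.loads(self.bucket[key]) for (key,) in rows]


logs = LogStore()
logs.write("model-a", 200, {"prompt": "hi"})
logs.write("model-b", 500, {"prompt": "oops"})
logs.write("model-a", 200, {"prompt": "bye"})
found = logs.find("model-a")
```

Filtering happens entirely against the index, so the large bodies are only read for rows that match.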
So many building blocks and going a bit to the beginning of our conversation about the differences between now and before.
If you were starting today, you already mentioned that it's more complex, more options, much different, but also bigger communities, more discussion.
How do you see this from the perspective of AI models and AI datasets, where one person can build a very good project, or even a team of two or three can build a real company?
In what way do you see that as more enabled, or easier, because there are more tools than in the late 90s?
One of the things I wouldn't do for sure is manage hardware, servers, data centers, network cables.
It's not that I didn't enjoy doing that in the late 90s, but it's just so much overhead and costs and then you have to have teams to deal with that.
So that's a solved problem for most companies, for most projects, but especially startups.
So that's one. The other thing is like we were discussing in the beginning.
If I was to build a portal now, I would think about what kind of problems do I need to solve from a software perspective.
And before I solve those problems, I would go looking for APIs or services or frameworks that I could use so that I didn't have to reinvent the wheel.
And again, today there's so many options there.
I think, honestly speaking, that I could relaunch SAPO today just using Cloudflare Workers.
I mean, there's not one single thing that I can think of that we don't have.
In a very short amount of time compared with the years that it took.
A very short amount of time. There are things I couldn't do alone, like, you know, designing the graphics or those kinds of things.
But, you know, from a software perspective, engineering perspective, it would be much easier.
Regarding AI models and datasets specifically, in what way are the applications you see today really making a difference?
I've been learning about AI a lot over the last few months, just by being involved with teams that are building AI products.
So for me personally, it's been also a learning experience.
I do think that there are levels of maturity when we talk about different AI models or different AI services.
There are things that we know today that AI does pretty well.
There are things that AI does very well today, like recaps or summaries of texts, or image analysis.
I mean, there are models today that, you know, there's no risk there.
It just does the task very well.
Even things like, you know, text to speech or speech recognition. It's so advanced and it's so easy to do that now.
And you can just use AI for that. But then there are fields of AI that I think, you know, we're still experimenting.
It's still evolving.
And there are things about AI that we need to be careful with, you know, the whole AI security thing.
We take for granted that what a large language model tells us is a fact or is the truth.
We need to be really careful with that because that can have implications for people, for software, for applications.
So I think in short, I think AI is a space where at the same time, we're still experimenting.
There are things that you need to be careful with. But also at the same time, there are things that I also think are solved problems.
And AI does very well a number of tasks today.
So again, it's another building block that we have at our disposal that we can use to build new applications.
Makes sense. There's a few other blog posts that if you want to, I will mention them.
And if you want to give an input there, feel free.
One is migrating billions of records. So moving our active DNS database while it's in use.
So this is all about DNS records that have moved to a new database, bringing improved performance and reliability to all of our customers.
Why do you think potentially this is important?
Yeah, so that blog post is one of the many blogs we did last week on the Data Week.
And Cloudflare is known for trying to share the technical details of how we build things, but not only building things, but also improving things or migrating data, which is the case.
And I think that blog post in particular showcases how now we can do things at a very large scale.
I mean, the DNS platform is one of our most important products, and it has a lot of data.
And dealing with that in real time is not an easy task.
So I think it showcases how we were able to do that, again, on top of our platform.
Also, on the developer platform perspective, we had this blog post building Vectorize, a distributed vector database on Cloudflare's developer platform, specifically now supporting indexes of up to 5 million vectors, delivering faster responses and lower pricing.
Why is Vectorize important here specifically? Yeah, so Vectorize, I think we can say it's one of our AI products.
I think you can take Workers AI, AI Gateway, and Vectorize as our AI suite of products.
Vectorize is important because it allows you to build things like RAGs, like AI search engines.
So a vector is kind of a mathematical representation of a content.
And with content vectors, you can basically make search very easy because you can say, you can ask Vectorize for things.
You know, give me the content that is close, that you think it's close to this specific question or this specific content.
And basically, it uses mathematical vectors to calculate how close objects are to one another.
And that's really important when you're building AI applications. We launched Vectorize a few months ago, last year, I think.
And we're now improving the product with more performance, more scalability.
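The "give me the content that is close to this question" idea can be made concrete with a small sketch in plain Python. This is not the Vectorize API: the `cosine` and `nearest` functions, the three-dimensional vectors, and the document names are made up for illustration, and cosine similarity is used here as one common closeness measure. Real embeddings have hundreds of dimensions or more and come from a model.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(index: Dict[str, List[float]], query: List[float], top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the `top_k` entries whose vectors are closest to the query vector."""
    scored = [(name, cosine(vec, query)) for name, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical content vectors; in practice an embedding model would produce these.
index = {
    "dns guide": [1.0, 0.0, 0.1],
    "workers tutorial": [0.0, 1.0, 0.2],
    "durable objects intro": [0.1, 0.9, 0.3],
}
query = [0.0, 1.0, 0.0]  # stands in for the embedding of a question
results = nearest(index, query)
names = [name for name, _ in results]
```

This brute-force scan compares the query against every stored vector, which is fine for three entries; a vector database exists to make the same kind of lookup practical across millions of vectors.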
Let's end on this note, which I think is an interesting one.
You started your career building things.
For a while, you actually were not building things. You were helping others build things.
And now you're actually building things again at a different scale, a Cloudflare scale.
In what way does building things still matter to you?
You mentioned in the beginning that the Internet is still in its infancy.
In what way building is exciting? Well, for me, it's life, honestly.
It took me a while to realize that. I mean, when I started my career and we built SAPO, all I was doing was building stuff.
We had no idea what we were doing back then. Then we had some idea of what we were building when we finally had a business.
I was with SAPO for 20-something years, did a bunch of stuff in the Internet and also the telco space.
Then I left SAPO and I ran an investment fund in startups for a while.
We were supposed to also kind of keep building startups.
But the fact is that running an investment fund pushed me to another direction where I was mostly running the funds and not really building products.
And I realized that I was missing that. I was missing the intersection of technology with shipping products.
So long story short, I was not looking to join Cloudflare, but I did.
It was a number of coincidences that happened at the same time.
And I'm very glad that I did join the team. Four and a half years later, I feel like I've been building stuff again.
Many things. Many things. I've been working with Dane over the last few years.
Dane runs the Emerging Technologies and Innovation organization at Cloudflare, which is the place to be if you want to be yourself inside the cycle of having ideas, thinking about new products, launching, experimenting, learning from customers, improving.
That cycle is really exciting for me.
And I try to surround myself with people that think the same.
That's why I told you that you need to ask me what I'm doing every three months, because that keeps changing.
And that change is, I think, what drives me.
It's not for everyone. Some people stay in the same product, stay in the same lane for years.
I also like that. But I also like being in the cycle of thinking about new products.
Makes sense. Thank you, Celso. This was great. Hope you liked it.
And that's a wrap.