This Week in Net: Developer opportunities, new AI-driven programmers (and a bit of TCP)
Presented by: John Graham-Cumming, João Tomé
Originally aired on July 26, 2023 @ 2:30 AM - 3:00 AM EDT
Welcome to our weekly review of stories from our blog and other sources, covering a range of topics from product announcements, tools and features to disruptions on the Internet. João Tomé is joined by our CTO, John Graham-Cumming.
In this week's program, we delve into the fruitful aftermath of our Developer Week. We received valuable feedback and announced the second cohort of the Workers Launchpad, featuring incredible companies building on Cloudflare's Developer Platform. Additionally, we launched our new Open Source Software Sponsorships Program.
We also explore a piece of Internet history related to TCP, followed by a deep dive blog where we discuss how we optimized the performance of our systems to address the memory usage situation in TCP sessions. And last but not least, we explore the transformative power of AI in programming, examining how it will revolutionize the "who" and "how" of the programming world.
For more about our Developer Week 2023, visit the Hub for every blog post and announcement, and don't miss CFTV's Developer Week programming.
Transcript (Beta)
Hello, everyone, and welcome to This Week in Net.
It's the May 26th, 2023 edition, and it's also the fruitful aftermath of one of our Innovation Weeks, Developer Week.
I'm João Tomé, based in Lisbon, Portugal, and with me, I have, as usual, our CTO, John Graham-Cumming.
Hello, John. Hello, good morning, good afternoon, good whatever time of day it is.
Nice to see you again. Nice to see you again, too. We just had, last week, our Developer Week 2023, but there was a one more thing type of day this past Monday, so a few more announcements related to Developer Week.
Before going into those blog posts we published, for example, on Monday, how would you sum up that full week, which ran a little bit over a week?
I mean, I think a lot of it was AI. That was a big part of the week.
And another big part of the week was developer productivity, like getting stuff done on our platform.
I think those were two pretty important themes. And then perhaps rounding out the platform.
The different technologies that are in there. Database integrations, D1 becoming available, a bunch of stuff that makes the platform a place where you can really build whatever application you want to build, and a bunch of cool features.
So yeah, I guess AI, developer experience, and an even rounder, outer, however you say that, platform.
But about, for example, feedback.
I've seen a bunch of feedback from developers through social media and all that.
Were you surprised with some of that feedback in specific? Well, I don't know what feedback you've been seeing, but I think, you know, whenever we roll out stuff on our platform, I think developers are really happy because it gives them more tools.
And because we do all the auto scaling and the security and all that stuff for people, they can just, you know, write code.
And if they can write, if they've got an application they want to write and it becomes easy to write because of all our new tools, then it's fantastic.
And also, I mean, cause there's obviously a bit of a reaction to the AI stuff because, you know, it's pretty clear that going forward applications of all types are going to include some amount of AI to make them, you know, more productive.
I mean, just think about the photos app on your phone where you can now search for cat and see pictures of cats.
I mean, that's an AI doing that work.
We're going to see that sort of thing get added to all manner of applications.
So being able to do it on our platform is really important. Exactly.
And in a sense, a lot of the AI part of the week, I think, is not only how Cloudflare is powered more and more by AI and the developer platform is powered by AI, but Cloudflare is also powering AI in a sense with a lot of companies.
Companies in the AI sector are powered by Cloudflare, in a sense.
Yeah. I mean, Cloudflare has always powered a lot of startups, right?
A lot of companies have come to Cloudflare for the protection, for the speed, for the pricing model.
And so perhaps it's no surprise that a lot of AI companies have come to us because a lot of them are smaller new startups.
But it's perhaps an area which is not so obvious, which is that the AI companies have to use a lot of data, right?
Terabytes, maybe even petabytes of data, and they have to move it around. And because of the way in which Cloudflare has always traditionally charged, which is we're not going out there and charging for bandwidth and we're not locking people in with egress fees, it becomes really quite a good idea to store your models or your training snapshots on Cloudflare R2, because then you can move them to whatever training platform you're using without incurring any charge from us.
So I think some of the sort of fundamentals of our platform have made it very attractive to AI companies.
And as you say, there's sort of more and more of them seem to be using us.
Makes sense. Let's go over the blog posts. We have a bunch of those still this week related to Developer Week.
In a sense, we can possibly go over our Workers Launchpad 2, right?
Yeah. So let's click on that one. Let's dig into that. Mia wrote a great post about the second cohort of Workers Launchpad.
So this is a program where people building on Workers on our platform get to talk about the products they built through a demo day.
This is linked to our Workers Launchpad funding program, which is a $2 billion funding, which we announced late last year.
If you're building something on Cloudflare, then you can talk to us about that funding, getting access to that funding.
And we run these demo days where companies come together and demo what they're building.
You can go watch demo day on Cloudflare TV.
There's Matthew. And this blog post is about a bunch of companies that built on the Cloudflare platform and are now launching.
And so really interesting to see the variety of things.
I mean, for example, this one, Drively, which is an API for buying and selling cars online.
I mean, this is a very traditional business, but here it is with an API built on Cloudflare, built on Workers, Queues, Durable Objects, all of the stuff.
And so it's worth going through this blog post if you want to get a sort of a sense for the variety of things that are getting built on Cloudflare.
Because now I think the platform has reached a scale and reached a depth of functionality that people are building pretty big functions and pretty big companies.
And hopefully, there's the next unicorn in this group or the previous group.
Yeah. It's interesting to see different areas of expertise and industries using different products and different tools.
Absolutely.
Absolutely. I mean, it's all sorts of stuff here, but this is, you know, what we hope is we encourage people to use our platform.
There is funding available.
And I think it's a really easy way to scale something up, which, you know, I think startups should not have to worry about what scaling is going to look like and they don't have to worry about it with our platform.
So, yeah, if you want to learn about the sorts of things, there are pictures from all of these companies, videos here, you know, feel free to come in here and read all about it.
Exactly. Actually, I have a stat that Matthew gave the other day regarding AI companies.
There is a 270% year-over-year increase in AI companies onboarding with their services to Cloudflare.
That doesn't surprise me. That doesn't surprise me, given the rate at which things are going and the fact that, you know, a lot of these companies seem to think that our pricing and the way in which our platform operates really fits with the businesses they're running.
Exactly. There's a bunch of things to read and to explore in this blog post, specifically with different examples of startups in the program.
And there's also a cohort number three. There is.
Those who are interested can apply here in this case. Exactly. Come in here and apply and we will connect with you and figure out, you know, if you're the right company to be on our platform.
And please, you know, don't hesitate to tell us about what you're building.
Exactly. We also enjoy learning about those for sure. Where should we go next?
Well, I mean, another program is right next to that. Veronica wrote a great post about our open source software sponsorship program.
And I think what's interesting about this is that, you know, we've had this program for a while.
If you're running, you know, a non-profit open source piece of software and, you know, you want to have a website for that and, you know, you want it to be protected and accelerated, then we have a program for you.
And you're like, here's a list of some of the, you know, some of the things that use us.
Node and, you know, CDNJS and D3 and Yarn and React.
I mean, you know, these are open source projects on Cloudflare.
And we actually kind of simplified our program because previously it was kind of like, it was very specific.
We kind of said it was for tools of a certain type.
And then we just said, look, it's very simple. It has to be operated on a non-profit basis.
And you have to link back to us. That's the deal. If you do that, then, you know, you're eligible for the program.
You get Cloudflare Pro included.
So that includes things like the WAF, which gives you security tools. And, you know, dedicated forums to deal with us.
And we can also give you access to products, right?
So products that maybe haven't come out yet or specific products that you particularly need for your particular use case.
So this is a new program.
It's available now. You know, if you're running an open source project, I mean, I've done that many times.
It's helpful to have, you know, a website that's protected.
It's helpful to have access to perhaps workers and things like that.
Come chat with us. We're happy to have you. Exactly. And there's a lot of benefits, like you were saying, in a sense.
Absolutely. There's lots. So it's a good idea to try it out.
Yeah. Also related to Developer Week, we had a recap blog post, written by Ricky and Dawn, covering all 34 blog posts we published last week, with different types of products, features, and tutorials.
So it was a lot. Yeah, I'm not sure how many you and I talked about in This Week in Net last week.
It wasn't all 34. So if you watched us yakking last week, you would have missed out on some of these blog posts.
So it's worth reading this particular one.
You can sort of get a quick summary of all the stuff. It's been separated into sections around AI, around data, about developer experience.
Pick the thing you're most interested in, click on the link, and you can learn more.
But this is a good summary by Dawn and Ricky. It is. And in a sense, it also has the latest ones that were published on Monday, and the Cloudflare TV segments related to some of those announcements.
At the beginning of this blog post, actually, there's some of that feedback I was mentioning in the beginning, some tweets, some things that people were talking about, giving feedback in terms of some of the announcements.
And one of the things that surprised me in a sense is you could see that some of these announcements are really relevant to those who are working in this area.
It can give you a really great boost or just save you work.
So it's kind of amazing to see the feedback that some people are having.
I'm glad because it means people are using our developer platform and they want to do more on it.
And so this is really saying, hey, wow, this functionality lets me do something.
And so, yeah, this is great to see these reactions. Exactly. Where should we go next?
Out of developer week in a sense? Yeah, back into this week. Now it's been quiet this week, right?
We had the recap on Monday and then we've had a couple of blog posts this week.
Mike Freemon wrote this really long blog post that you can scroll down and it scrolls on forever and there's graphs and stuff like that.
But this is a nerdy deep dive. So if you want to get down and nerdy with something inside of how Cloudflare operates and the kind of problems that we face, Mike's post is a really good one to cover this stuff.
Before we go there, we want to help those who don't understand a lot about how the Internet works.
This blog post is about TCP.
It is. But those who don't know, give us a sense of what is TCP and why is this transmission control protocol important?
On the basis of the Internet, really.
Well, I mean, TCP is pretty much fundamental to most things you're doing on the Internet.
There's another thing called UDP, which is probably equally important.
But for all of the normal stuff you do on the Internet, like going to a website, there is a need for your computer or your phone or whatever to connect to the website in order to be able to have a communication with it and say, hey, give me the web page or hey, I'm logging in.
And the underlying protocol, the underlying agreement between your device and web server, for example, is done through a protocol called TCP.
And it is a way of establishing what we call a session, which is an agreement between two computers that they're going to talk and they're going to pass data back and forth.
So it is a way of reliably passing data between two computers over the Internet and dealing with all the mess of the Internet.
So dealing with congestion where some point along the path between you and me or me and some web server gets congested because it's too much Internet traffic.
And it's a way of ensuring that packets get to their destination correctly so the data gets to the right destination.
And so TCP is really fundamental to a lot of things that we do on the Internet.
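To make the idea of a session concrete, here is a minimal sketch in Python of two endpoints agreeing to talk over TCP on one machine. The function names `serve_one_session` and `tcp_round_trip` are made up for this illustration, and it uses the loopback address rather than a real web server; it only shows the connect, send, and reply pattern described above.

```python
import socket
import threading


def serve_one_session(listener: socket.socket) -> None:
    """Accept one TCP session, read until the peer stops sending, echo it back."""
    conn, _peer = listener.accept()
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # empty read: the peer has finished sending
                break
            chunks.append(chunk)
        conn.sendall(b"".join(chunks))


def tcp_round_trip(message: bytes) -> bytes:
    """Open a TCP session over loopback, send a message, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        worker = threading.Thread(target=serve_one_session, args=(listener,))
        worker.start()

        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal "no more data from me"
            reply = b""
            while True:
                chunk = client.recv(4096)
                if not chunk:
                    break
                reply += chunk
        worker.join()
    return reply


if __name__ == "__main__":
    print(tcp_round_trip(b"hey, give me the web page"))
```

The reads are done in a loop because TCP delivers a stream of bytes, not whole messages, so a single `recv` call is not guaranteed to return everything that was sent.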
We use it all the time without knowing it. And we use Linux on our servers.
And one of the problems we noticed was that sometimes the amount of memory being used by TCP.
So you imagine I send some data to you.
Let's suppose you've got a server at home, João, and I send some data to your server over TCP and you receive that data.
The program you're running might not be ready to actually receive it into your program.
So Linux will actually do what we call buffering: it'll keep it around, like waiting for your program to be ready.
And there's a limit set because you can't handle an infinite amount of data being thrown at you.
You get the per session limit, which is set. And sometimes it didn't work.
Sometimes what we saw was that the amount of memory being used by TCP was just growing in an unbounded way, just getting bigger and bigger and bigger and bigger and bigger.
And we needed to figure out why. And because we can't have the memory run out on our machines because it's needed for other things.
And certainly not from something that's happening in the kernel.
Because if one of our own programs goes haywire and starts consuming too much memory, then we can kill it, right?
We can terminate it and say, obviously our code has gone wrong in some way, but the Linux kernel underlies everything we're doing.
So it better do the right thing.
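The receive-side buffering and per-session limit described above can be observed from user space. The sketch below, in Python, is an illustration under assumptions and not the scenario from Mike's post: the function name `bytes_accepted_before_stall` and the sizes are invented here. It asks for a small receive buffer with the standard `SO_RCVBUF` socket option, never reads on the receiving side, and counts how many bytes the sender can push before the kernel refuses more.

```python
import socket


def bytes_accepted_before_stall(rcvbuf: int = 8192,
                                cap: int = 64 * 1024 * 1024) -> int:
    """Send to a peer that never reads; return the bytes accepted before a stall.

    `cap` is only a safety stop so the loop cannot run forever.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request a small receive buffer before listening so the accepted
    # socket starts with it. The kernel may round or adjust the value.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    sender = socket.create_connection(listener.getsockname())
    receiver, _peer = listener.accept()  # accepted, but we never call recv()
    sender.setblocking(False)

    payload = b"x" * 4096
    sent = 0
    try:
        while sent < cap:
            try:
                sent += sender.send(payload)
            except BlockingIOError:
                # Buffers are full: the unread data is sitting in kernel
                # memory, and the kernel will not take any more for now.
                break
    finally:
        sender.close()
        receiver.close()
        listener.close()
    return sent


if __name__ == "__main__":
    print("bytes accepted before the sender stalled:",
          bytes_accepted_before_stall())
```

The exact number printed depends on the operating system and its TCP tuning, so none is shown here. The point is only that, when the limits work as intended, the amount of unread data the kernel holds for one session is bounded.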
And so Mike's post is an investigation of why this was happening and then how we fixed it.
And so he went through and figured out at some point, there are some situations in which those limits, how much memory you should consume, are ignored.
And then exactly what the circumstances are and what the fix looks like.
So if you go through this, you're going to find out why this happened.
You're going to go look at kernel code. You're going to have a patch. You're going to have testing post the patch.
The big deal for us is that we fixed it and we don't have this runaway.
It saved us a lot of memory on our systems, which is important because we need that memory to run our service and give you all good performance.
And it also saved, there was actually time taken because if you store a bunch of stuff in memory, at some point you need to put it all together to hand to the program, which is called collapsing it.
And that was also taking a long time because of the way in which this stuff was really happening.
So this is a serious deep dive for network nerds about this.
And hopefully the patch gets upstream and becomes part of Linux and stops this kind of thing happening.
So this is one of those, you know, when you operate something at Cloudflare's scale with the variety of traffic you have with all these things happening all over the world, you come across real oddities.
And this is a great oddity. Why are we using more memory than we absolutely needed?
And well, it turns out, you know, we know the answer is to do with TCP coalescing where things get put back together again, because things come in sections and they have to get squashed together.
And sometimes it goes horribly wrong.
And so, you know, as you can see, keep scrolling. You're only halfway there, dude.
Exactly. It's big. You're not even close. If you know about this stuff, you'll get a lot of answers here, right?
A lot of procedures, details.
And also if you're the sort of person who likes this kind of thing, we have a job for you because we're no doubt going to find other weirdness in Linux or other bits.
Years ago, we had a problem with memory leaks and then machines that were crashing randomly.
And that turned out to be a microcode problem in the actual microprocessors.
So we go hunting for weirdness at scale. Exactly. In a sense, for the general audience, what is the bigger advantage in doing some of these things?
Saving time, saving money in an indirect way, in a sense. Well, as an end user, you don't care about this at all.
This is Cloudflare's problem to solve, right?
I mean, it's like Cloudflare's got to make its service work. From our perspective, it makes our business more efficient, right?
Because we're saving memory.
This memory was wasted. And so by fixing this, we fix this, we save memory.
That means we don't have to buy as many machines. That makes our business more efficient.
That means we can pass on, we can run the business that we're running very, very efficiently to our customers.
And if you think about our free plan, our pro plan, our business plan, I mean, these are a great deal of value for a small amount of money.
Part of how we do that is by making sure we don't waste memory or waste CPU, etc.
And this is a great blog post. It gives you an example of how those kinds of investigations go.
So in a sense, there's no direct relation with general audience, but there is because if a company is more efficient, does things better and easier, things could be not as pricey, more efficient.
So efficiency benefits all. Yeah, absolutely. Yeah. It's a bit also of the history of the Internet, right?
Because there was a lot of technical detail that no one knew about that was taken care of, problem solved.
And then people can have a good experience online.
Well, I think you sort of bring up a point which is interesting, which is like TCP itself is standardized.
There are standards documents in the world called RFCs.
And there are different implementations of TCP, like there'll be this one particular one in the Linux kernel.
I actually wrote one about 20 years ago myself. And what you learn by writing one yourself is there is a lot of detail in TCP and a lot of stuff that, a lot of situations that can arise.
And in here, Mike actually tests a bunch of different scenarios.
I'm sending data, but the thing I'm sending to doesn't bother to read it or reads it very slowly.
And they're like, these situations actually happen on the real Internet.
And actually getting all that stuff right when you have these different situations occurring is not easy.
And so, yeah, as you see, we've done a lot of reading about TCP in the past because we're trying to really make it work for all of the weird situations we're in, right?
Some of our situations are a little quick connection because you just want a website, or a long-running connection because you're doing something with WebSockets and it needs to be up for hours maybe at a time.
There's all sorts of things with TCP, which are the difficult details when it comes down to running something at our size.
True. And like you were saying, the tag for TCP has a lot of blog posts, a lot from Marek actually.
Because it's so fundamental to what we do. Exactly. Really interesting. For those who want to learn more about these types of topics, a lot to discover and understand here.
Exactly. This one you actually wrote 11 years ago, "Why mobile performance is difficult".
Yes. So it has a history element to it, more than 10 years. TCP is a lot older than 11 years ago, a lot older than me.
From the 70s, right? Yeah.
Which actually makes it younger than me. But let's leave that. Let's cut that part.
Sure. But there's a lot to discover back here in terms of TCP and IP, which have been a little bit put together since the 70s with Vint Cerf and Bob Kahn, right?
That's right.
There's a lot of history there if you want to dig into it. Just this Friday, we also have another Oxy blog post.
Another Oxy blog post. So if you don't know what Oxy is, Oxy is our proxy.
It's "proxy" without the "pr", that's the meaning of the name, and it has become fundamental to a bunch of stuff we do, particularly what are called forward proxies.
So things like the work we do on iCloud Private Relay, the work we do with WARP.
And it's a big framework for building proxies written in Rust.
And if you want to read about it, you can go to the intro blog post, and there's a whole load of blog posts.
This particular blog post is about how the team made it very, very extensible using hooks.
And so it just talks to you about the hooking mechanisms that are available.
And what's interesting about Oxy is that it works at every layer of the OSI model.
So you can be hooking in at the sort of packet level.
You can be hooking in at an encrypted stream level, an application protocol level.
And there are hooks for everything, which means that the application itself does not have to be updated.
People can build their plugins, build things that sit within Oxy.
And by connecting to the appropriate hooks, they can build the functionality they want.
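As a rough illustration of the hooking idea, here is a small sketch in Python. It is not Oxy's API: Oxy is written in Rust, and the class name `HookRegistry`, the layer names, and the hook signature below are all invented for this example. It only shows the general pattern of a core that stays unchanged while plugins attach behavior at named points.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A hook takes the data passing through a layer and returns the
# (possibly modified) data. This signature is an assumption.
Hook = Callable[[bytes], bytes]


class HookRegistry:
    """Hypothetical registry of hooks keyed by layer name."""

    def __init__(self) -> None:
        self._hooks: DefaultDict[str, List[Hook]] = defaultdict(list)

    def register(self, layer: str, hook: Hook) -> None:
        """Attach a hook to a layer; hooks run in registration order."""
        self._hooks[layer].append(hook)

    def run(self, layer: str, data: bytes) -> bytes:
        """Pass data through every hook registered for the given layer."""
        for hook in self._hooks.get(layer, []):
            data = hook(data)
        return data


if __name__ == "__main__":
    registry = HookRegistry()
    # Two plugins attach at the (made-up) "application" layer.
    registry.register("application", lambda d: d.upper())
    registry.register("application", lambda d: d + b"!")
    print(registry.run("application", b"hello"))
    # A layer with no hooks passes data through untouched.
    print(registry.run("packet", b"raw bytes"))
```

The design choice being sketched is that new functionality is added by registering another hook, so the code that moves data through the layers does not have to be edited.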
So another deep dive into Oxy. Oxy is now, I don't know, five blog posts maybe at this point.
Very big and very important piece of software for Cloudflare.
I think it's more than five, the tag now. Yeah, here it is.
Or five. No, it's five. You said it correctly. I read them all. I read them all.
So true. And more coming, right? I'm sure there's more coming because it's a very large and important piece of software for us.
And so the team is really digging into how it operates.
Before we go, I want to go over, if you want, with a blog post you wrote about AI in a sense.
And also, it's related to Developer Week for sure, but also in general is related to AI and this new, it's not a new era, but at least is more trendy era in terms of AI that we're entering.
And you make a very interesting point in my perspective here in terms of who is programming and how someone is programming.
Developers, programmers could be different people for the next few years, right?
Well, I made this point here, which is like, I remember this 1947 paper, which is about the thing called an EDVAC, which was a machine that was built in the UK.
And in 1947, you're talking about there being zero to a handful of machines, like in that era, computers in total.
And in that paper, they invented what we call a subroutine, which is something which is so common, it's extraordinary that it had to get invented.
And there's actually a paper about, we could do this subroutine thing.
And the reason I thought that was a significant starting point was that when you think about programming, at the time, the people who were programming were almost certainly the people who built the machine or were very close to the building of the machine.
And naturally, the group of people who knew how to program was tiny.
And over time, we have invented all this stuff to make programming easier.
Syntax highlighting, high-level languages. I mean, BASIC came along and said, wow, people could write in BASIC and didn't have to understand other things, right?
And then APIs and Visual Basic, and then all of these things and all of these tools that make programming easier.
And what they also do is they increase the group, the size of the group of people who can be programmers, right?
You don't have to be the low-level expert on, you don't have to have soldered the computer together yourself, like you might've had to do in 1947.
And we grow this out. You think about Excel, the people who write complex things in Excel using formulae are a type of programmer.
And some of those things are really quite large.
If you think about the children's programming language, Scratch, which is very popular, it allows children to program without learning all of the arcane programming, which can be very, very difficult to work with because you get tripped up by things which are not obvious.
And I think that the AI we have now, the assistants, they are making experienced programmers like me more efficient.
And I've used the AI assistants to do things that I knew I could do myself, but it could just do it faster.
It's like, oh, just do me a thing like this.
Okay. And then I can look at it and go, yes, that's what it is. But I think it also increases the who, because I think it's going to enable people to do things that they would have had difficulty doing before.
And similar in a way that Scratch does for kids, it eliminates some of the complexity.
It doesn't eliminate programmers.
I think it actually makes more people programmers. So that was kind of the point of this blog post.
I said in it that I was like, this was the most excited I've been about technology ever, because I think it feels like a really, really fundamental change in how we work and how we interact with computers.
And so as programmers, it's how we program and who is a programmer.
And I think that over time, we will end up with more people programming.
And also the programs we use will have AI built into them.
That will just become part of how these things operate.
And I left on what I hope is a positive point, which is when AlphaGo, which is an AI that plays the Go game, beat Lee Sedol, who was one of the best players in the world.
There was this incredible sense that somehow, because Go was so much harder than chess, and chess had already been done, that somehow we'd taken a big step forward in terms of AI.
And that can be a little bit scary. But then an interesting thing happened, which is the Go players, who are experienced, started playing against AIs.
And the Go players, the human players, started getting stronger.
And I think that's absolutely fascinating. I think the same thing can happen in programming.
I think the same thing can happen in lots of domains, where we have these incredible, not necessarily adversaries, but teachers who are showing us, look, there's a different way to think about this kind of stuff.
We have new strategies and improvements. So I think AI has the potential to do that for all of us.
And I left it with saying, imagine 100 years on from that 1947 paper, there's a 2047 paper saying, what will preparation of programs, i.e. what will programming be on these new AI, I call them neural type machines, what's it going to look like?
What will there be to discover? What are we going to discover about our interactions?
And I'm very hopeful for what it means for our use of computers.
It makes sense. As a non-programmer person, to be honest, I've been playing with ChatGPT and all that.
And I was surprised, using CSVs, just how much of things I can build.
Just asking for code, specific code. I have this CSV, it has this column, this column, this column.
Give me some code to do this or that.
And I've been talking with some of our developers or programmers in the company, showing the code that ChatGPT has produced.
And they found new, interesting ideas of doing sometimes even the same thing with different code.
So I see creativity there also at play.
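For a sense of what such a request might produce, here is a small Python sketch of the "I have this CSV with these columns, give me code to do this" kind of task. The column names, the sample data, and the function name `total_by_column` are all made up for illustration; this is not the code from the conversation.

```python
import csv
import io
from collections import defaultdict
from typing import Dict

# Made-up sample data standing in for a real CSV file.
SAMPLE_CSV = "country,requests\nPT,10\nUK,5\nPT,2.5\n"


def total_by_column(csv_text: str, group_col: str, value_col: str) -> Dict[str, float]:
    """Sum the numeric `value_col` for each distinct value of `group_col`."""
    totals: Dict[str, float] = defaultdict(float)
    # DictReader uses the first row as column names.
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_col]] += float(row[value_col])
    return dict(totals)


if __name__ == "__main__":
    print(total_by_column(SAMPLE_CSV, "country", "requests"))
```

To use it on a file instead of the inline sample, the text could be read with `open(path, newline="")` and passed in the same way.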
Actually, it's funny you say that about the new ways of doing things.
I use ChatGPT to write a small amount of JavaScript code for something.
And I am not, I mean, I can program in JavaScript, but I definitely don't consider myself to be an expert.
And it came up with a really nice solution, which essentially used what was, I think, called a list comprehension.
And I was like, oh, that's really nice.
I like this. Now I learned from it. So thanks, ChatGPT. Exactly.
Sometimes we even learn from other examples that, in this case, a system gives us.
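John's JavaScript snippet is not shown, so the following is only an illustration of the construct he names, written in Python, where "list comprehension" is the standard term. The function name `even_squares` is invented for this example.

```python
from typing import List


def even_squares(numbers: List[int]) -> List[int]:
    """Return the squares of the even numbers, in their original order."""
    # One expression replaces a loop with an if and an append.
    return [n * n for n in numbers if n % 2 == 0]


if __name__ == "__main__":
    print(even_squares([1, 2, 3, 4]))
```

The equivalent loop would create an empty list, test each number, and append the square; the comprehension states the same filter-and-transform in a single line.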
Well, it was great, John. See you next week. Thank you very much. Yeah, see you next week, João.
Bye. That's a wrap. Bye-bye.