Originally aired on September 7, 2023 @ 5:30 PM - 6:00 PM EDT
Welcome to our weekly review of stories from our blog and elsewhere, from products, tools and announcements to disruptions on the Internet.
João Tomé is joined by our CTO, John Graham-Cumming. In this week’s program we highlight how Cloudflare was named one of the Top 100 Most Loved Workplaces in 2022; we also talk about the introduction of our new system designed to detect route leaks and its integration into Cloudflare Radar and its public API; we go over routing on the Internet, and how prepending can do more harm than good; and how we manage the Cloudflare IP addresses used to retrieve data from the Internet, how our egress network design has evolved, how we optimized it for best use of available IP space, and introduce our soft-unicast technology. There are also some Ukraine Internet outage, Thanksgiving, World Cup and Black Friday trends.
Hello, and welcome to This Week in Net, our weekly review of stories we've been writing in our Cloudflare blog and things affecting the Internet.
I'm João Tomé coming to you from Lisbon, Portugal, and with me I have, as usual, our CTO John Graham-Cumming, who is, you're already like dressed for Thanksgiving, right?
Actually, I don't know if anybody in the US would actually dress with a turkey hat on.
I got this in England for Christmas at some point and I just thought, you know, I would honor our American colleagues who are off today, yesterday and today, by having a turkey on my head.
So I hope that's not inappropriate. I don't think so.
I don't think so. It's fun for sure. And we're recording this on Black Friday day, like November the 25th, so in a sense it's the Black Friday.
We will also be dealing with some Black Friday trends here.
Do you have a Thanksgiving story that you want to share?
Okay, so yeah, I lived in the US. Exactly, you lived in the US for a number of years, yeah.
I did, yeah. I lived in the US for a number of years and I think the very first Thanksgiving I had in the US, I had heard that people eat something called pumpkin pie and so I decided to make pumpkin pie, which is something very, very delicious, by buying a pumpkin and doing it from scratch and essentially nobody does this.
They buy the pumpkin mix in a can and then make the pumpkin pie, but I decided that was the best way and it's a lot of work to get a pumpkin and hollow it out and cook it and all that stuff, so don't be me and do it from scratch.
I can see that. It's a lot of work. Actually, this year for Halloween, I did for the first time the slash of pumpkin with the face and all that and it was more difficult than I thought, but YouTube came to the rescue, so that was also good.
We have a lot to cover this week. It was not an Innovation Week like we had last week, where we talked about the Supercloud, what we're doing, with so many days, so many blogs, so many things to talk about. But this week we have more in-depth, deep dive type of blog posts that I think are really relevant to explain a little bit how the Internet works, so people can learn from them.
If they're more technical, they will have really technical things.
If they're less technical, they can try to learn a bit. I'll share my screen and we'll take it from there and here we are.
So, the first of the week, first we had actually a post on why Cloudflare is one of the top 100 most loved workplaces in 2022.
It's always important, and we're on a list from Newsweek and the Best Practice Institute.
So, there's that and for those who want to see what we announced in our Developer Week 2022, there's also a blog post just summing up all of the announcements.
That was right. That was last Friday. There was a summary thing, so in case you missed it, you can go back and look in there and see all the things we announced, 30 different announcements.
And as you say, earlier this week we had Newsweek and the Best Practice Institute saying that we're one of the top 100 most loved workplaces in 2022 and I'm not surprised because I think people like working at Cloudflare and I think it's a great place to have a career and learn a lot and build a lot.
And this was a summary by Janet who runs people at Cloudflare and Scott who runs recruiting at Cloudflare.
So, this blog post gives you a good idea about why.
Here's a nice picture of our team when we got together, which now we're I think getting a little bit large to get all of us together, but there we all are.
Find me in there somewhere. I don't know where I am quite in that image.
I know I was there somewhere. And so, you can get some quotes from people here at the company about why they enjoy working here and the sorts of initiatives we have.
So, yeah, that was a nice surprise from Newsweek, although I guess I wasn't surprised that people like working at Cloudflare enough to, you know, to vote us one of the top 100.
And we have a short video for those who want to see. We do, we do.
It's about three other people talking about, you know, why they like working at Cloudflare.
My name is Satyen Desai and I love working at Cloudflare because of our all-inclusive culture, our speed of innovation and the value we bring to our customers.
And here are the other reasons why we all love working at Cloudflare.
The number one reason I love Cloudflare is because I have the best team.
I feel and I believe that I'm valued and respected. I love Cloudflare because of our people, the focus on our customers.
We like to have some fun along the way.
I love our mission to build a better Internet, not just for the big guys or the little guys, but for everybody.
And then we had actually, it's an explanation, but also an announcement because it's something that we have now on Cloudflare Radar available for anyone to try and see.
First, let's start, what are route leaks and why are they important in terms of the Internet?
Well, okay. So the Internet, the name actually means that it is a network of networks.
You know, your ISP, your company, certain businesses like Google on the Internet are themselves networks that are self-contained.
And in order for the Internet to be formed, those networks need to come together and connect to each other.
And the way in which they connect to each other is through a thing called BGP.
And BGP is a way of a network saying, hey, I'm over here.
I'm the Portuguese operator Vodafone and I'm Google, et cetera.
And that, those things, what they do is they announce, I'm responsible for this bit of the Internet.
I'm responsible for this bit of the Internet.
And there's an agreement. So this protocol comes together. And so the, if like you're sitting at home using Vodafone and you go to Google, the Internet knows, oh, I can get to Google by going through this sequence of hops across these different networks to get there.
And that's great. That's how it should normally work.
Now there's a problem, which is if somebody claims to be somebody else.
And one version of this is what's called a route leak, which is that there's some announcement where some network says, oh, by the way, I'm actually Google.
There was a very famous example of this that happened in Pakistan, where in an attempt to block YouTube, an ISP in Pakistan said they were Google and all of Google's traffic or a large portion of it then went over to that, that location in Pakistan, which obviously didn't work.
And there've been others, other route leaks. So this is a problem that can be caused when information that's meant to be announcing, yes, I'm this network, I'm responsible for this bit of the Internet comes from the wrong place and is incorrect.
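To make the "announcement comes from the wrong place" idea concrete, here is a minimal Python sketch. It assumes a hypothetical table of which network is expected to announce each block of addresses, and flags any announcement for those addresses that comes from a different network. The prefixes and AS numbers are reserved documentation values, not real assignments, and this is only an illustration of the idea, not how Cloudflare Radar detects route leaks.

```python
import ipaddress

# Hypothetical table: which network (AS number) is expected to announce
# each prefix. All values are reserved documentation ranges.
EXPECTED_ORIGINS = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,
    ipaddress.ip_network("198.51.100.0/24"): 64501,
}


def check_announcement(prefix: str, origin_asn: int) -> str:
    """Classify an announcement as 'ok', 'suspicious', or 'unknown'."""
    announced = ipaddress.ip_network(prefix)
    for expected_prefix, expected_asn in EXPECTED_ORIGINS.items():
        # An announcement equal to, or inside, a known prefix should come
        # from the network responsible for that prefix.
        if announced.subnet_of(expected_prefix):
            return "ok" if origin_asn == expected_asn else "suspicious"
    return "unknown"


print(check_announcement("192.0.2.0/24", 64500))    # expected network
print(check_announcement("192.0.2.0/25", 64510))    # someone else claims part of it
print(check_announcement("203.0.113.0/24", 64500))  # not in our table
```

Real detection has to work from many vantage points and from the relationships between networks, so a single lookup table like this is only a starting point.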
And these things happen relatively frequently. And this blog post explains all of that, explains how they occur, the different types of them.
And then it explains Cloudflare Radar's new route leak section, which is per network, you can actually go in and see leaks that are occurring.
Some of these things are really tiny and not very important, but some of them actually have a really big impact on our use of the Internet because our traffic goes where it shouldn't.
And sometimes route leaks are malicious. There have been instances of people deliberately creating these route leaks in order to try to attract traffic to a network they control in order to do something malicious with it.
A bit like redirecting someone's mail to your home or something so you can read their mail.
It's an attempt to do that.
So we're showing that information on Cloudflare Radar and we're going to be expanding Cloudflare Radar so that you can get alerts when there's a route leak.
So you can get very, very detailed information about what's happening and why it has happened.
So that was the announcement two days ago. And I think this is a huge thing for the Internet because you're able to get this information directly from us.
We have an incredible view of what's happening across the Internet.
And this blog, I hope, explains how these things happen and why. And it then shows you, if you scroll down a little further, you can actually see the interface that's in the Cloudflare Radar route leak system.
So that's live today. You can go to any network and see leaks that have actually occurred in chronological order and how large an effect they had.
So why don't you click on that one?
Yeah. Specific examples are always better. Big US provider Cogent. And you can see, in fact, there's a route leak today, 8 o'clock this morning, but it only affected a very small number of prefixes.
Prefixes are groups of IP addresses. So this one probably didn't actually affect the Internet in the sense that nobody really noticed this impact.
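As a small illustration of "prefixes are groups of IP addresses", Python's standard `ipaddress` module can show what a single prefix covers. The prefix below is a reserved documentation range chosen for the example, not one taken from the route leak shown on screen.

```python
import ipaddress

# One prefix written in CIDR notation: a /24 covers 256 addresses.
prefix = ipaddress.ip_network("192.0.2.0/24")

print(prefix.num_addresses)                           # 256
print(prefix[0], prefix[-1])                          # 192.0.2.0 192.0.2.255
print(ipaddress.ip_address("192.0.2.77") in prefix)   # True
print(ipaddress.ip_address("192.0.3.1") in prefix)    # False
```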
But other ones can be quite large. And so now we provide all this information on Cloudflare Radar.
Yeah, exactly. And you can see through time, before what happened in a few days.
And it's helpful in many ways. And like you said, sometimes people aren't affected by that.
But if it's big, if it's by accident or on purpose, it can really affect your Internet experience.
And so even the end user will also be impacted by some of these things, for sure.
Yeah, that's right.
So let's move on. And now we have, yesterday actually, Tom Strickx had this blog post about why BGP communities are better than AS path prepends.
Actually, a small note on BGP, for example, it's really important in terms of the address book of the Internet.
And we've discussed this before, the Facebook outage last year, so many hours, it was like news all over the world.
And it was BGP related.
And even in Canada, one of the biggest ISPs, Rogers, this summer had a clear outage, more than one day of complete outage, mostly complete outage, and millions of customers being impacted without Internet, ATMs being impacted, all sorts of services being impacted, because there was a BGP problem.
So it was related to this protocol.
Yeah. I mean, if you really want to screw things up on the Internet, there are three protocols you can mess up and really mess things up.
There's BGP, because that's what binds the Internet together into a network of networks.
There's DNS, because DNS gives us the name to address translation, which we all use all the time.
And lastly, there's something to do with SSL certificates. If you mess up an SSL certificate in the right way, you can have a big impact on things.
And we've seen that happen to O2, the UK telco: they messed up an SSL certificate, and they had a very big impact on the network.
So yeah, so those are the three ways to really mess with the Internet.
This blog post by Tom, who works on the network team, is about a pretty esoteric topic, actually, for most people, which is how do you set up your BGP in such a way that the traffic you're dealing with gets sent and received from in the ways in which you want.
So the beginning of this is a primer on how the Internet works.
BGP is actually really interesting if you want to learn about this stuff, how the Internet is actually bound together with these autonomous systems, which are actually just what we call networks, how they are associated with IP addresses, and how BGP sends out what are called announcements, which is the network is saying, hey, I'm this network, and I control these IP addresses.
And then later on, you've seen here that there is each network that receives all these BGP messages, then makes a decision about how to interpret those messages.
So they go through this filtering process. And that filtering process, what they're trying to do is they're getting all these announcements, and they're kind of resolving into, okay, this is what the map of the Internet looks like.
So they have to make that map out of it. And some of that map, some of the directions, at least the local directions, like, you know, here I am at home, I can turn left or I can turn right out of my driveway, right?
That's up to me.
And that's what we call a local preference, right? I can decide that.
As I get further down the street, there's going to be a bunch of road signs that I don't control, which will tell me I can only go in certain directions.
And so if you scroll down, there's a nice thing here where there's a message, and then there's like, the best path, we're choosing the best path is like this sequence of events.
And what BGP wants to do is choose the shortest path. And the shortest path will be the least number of hops between networks.
And so that's a fairly simple algorithm.
We've had Dijkstra's algorithm forever for doing that kind of stuff, or we could do other things.
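The two ideas just described, a local preference that the network decides for itself and then the shortest path measured in network hops, can be sketched roughly as follows. This is a simplified Python illustration: the `Route` class, the neighbor names, and the AS numbers are all hypothetical, and real BGP implementations apply several more tie-breakers after these two.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    next_hop: str          # hypothetical name of the neighbor we learned this from
    as_path: List[int]     # networks the announcement passed through
    local_pref: int = 100  # locally chosen preference; higher wins


def best_route(routes: List[Route]) -> Route:
    # Highest local preference first, then the fewest network hops.
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


routes = [
    Route("transit", [64500, 64501, 64502]),
    Route("exchange", [64503, 64502]),
]
print(best_route(routes).next_hop)  # exchange: same preference, shorter path

routes[0].local_pref = 200
print(best_route(routes).next_hop)  # transit: local preference beats path length
```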
But there might be other reasons that you don't necessarily want to use that particular direction, that particular path.
And the reasons are varied. If you're a network like Cloudflare, where you are connected to the Internet in lots of different ways, so you might be directly connected to some service providers, you might be in Internet exchanges, where lots of networks come together to exchange data.
And you might be connected to some of the major transit providers, those are really big sort of telcos that provide Internet connectivity at a global scale.
How you send traffic is something quite sensitive, because some of those routes might be really expensive.
So transit might be a paid service and very expensive compared to going to an Internet exchange.
But there may also be other reasons beyond cost around how much capacity you have in a certain direction.
And so what folks like Tom Strickx do is they do what's called traffic engineering, which is they try and figure out, we'll send certain sorts of traffic this way and certain sorts of traffic that way.
And how you do that, and in particular, how traffic gets to you, because that's actually an even more interesting question, because it's like, I can actually, for example, if I drive out of my house, I can decide to turn left or right.
But if I tell you to come around, you're likely to make your own decision about the route, because there's no way to get to me.
And how you influence that, I might say to you, you should really come this way and turn right into our driveway, because it's easier than turning left or whatever.
That is actually rather complicated.
And so what BGP operators do is they do this thing called prepends, where they actually try to make the path look longer for the one they don't desire, so that the network sends in the direction, they're trying to influence how traffic gets into them.
And that doesn't always work. And so there is this other thing called communities, which is a way for a network to kind of announce to the outside world, hey, here's some information, and here's what I'd like to do.
And they give this example for Cogent here, where Cogent is kind of saying, I prefer traffic coming this way, and these are the weights on this traffic.
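The prepending trick, and one reason it does not always work, can be sketched in a few lines of Python. This is a simplified model with hypothetical AS numbers and link names: our network repeats its own number to make one path look longer, and a remote network then picks a path by its own local preference first and path length second. If the remote network has set a higher local preference on the path we tried to discourage, the extra length is never even compared.

```python
OUR_ASN = 64500  # hypothetical AS number from the documentation range


def announce(prepends=0):
    """Our announcement: our AS number, repeated once per extra prepend."""
    return [OUR_ASN] * (1 + prepends)


def via(neighbor_asn, path):
    """The path as seen after a neighbor passes the announcement on."""
    return [neighbor_asn] + path


def pick(routes):
    """Remote network's choice: highest local preference, then shortest path."""
    return max(routes, key=lambda r: (r[1], -len(r[2])))[0]


# We would like traffic to arrive over link B, so we prepend twice on link A.
path_a = via(64510, announce(prepends=2))  # [64510, 64500, 64500, 64500]
path_b = via(64511, announce())            # [64511, 64500]

# Remote network with equal preferences: prepending works, B is chosen.
print(pick([("A", 100, path_a), ("B", 100, path_b)]))

# Remote network that prefers A for its own reasons: prepending is ignored.
print(pick([("A", 200, path_a), ("B", 100, path_b)]))
```

Communities take a different approach: instead of distorting the path, the announcement carries an explicit tag that the receiving network has agreed to interpret.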
So if you want to know about the sort of stuff that Tom Strickx and others do in the company, this is a good introductory blog post to BGP, but also to some of the sorts of traffic engineering.
And if you are a network person, then feel free to come and argue with Tom about whether he's right about prepends not being as useful as communities, or why communities are not good, et cetera, et cetera.
But a pretty deep dive in some of the aspects of what it takes to actually run the Internet, because it does actually turn out it needs people like Tom, often figuring out somewhat manually, okay, this is what the path should look like for all sorts of different reasons, sometimes cost, sometimes performance.
And so, you know, read all about it here.
And those decisions, this is really interesting, have a real world impact, because if you're taking better lessons from what you do, you can, in essence, make the Internet a little bit faster, work better.
So, and we've discussed this in the past, sometimes a small difference can make a big, huge difference, especially if it's one small difference today, and in six months or a year, there's a bigger difference.
And then like five years, the Internet was much different than it is today for all these types of decisions, of rationales, of lessons learned, of sharing information, right?
Yeah, absolutely. Absolutely.
And then we have another blog post coming out today. We have. Right there.
And I suspect you actually have it, yeah. Yes. No, it's not this one. This one.
Yeah. Let me put in the beginning. Here it is. Cloudflare servers don't own IPs anymore.
So how do they connect to the Internet? Yes. So this is a really fun, this is a really fun long blog post about, you know, we've often in the past talked about the fact that we use a thing called Anycast, which means that if you visit something that uses Cloudflare, wherever you are in the world, it will have the same IP address.
And by the magic of this thing called Anycast, you will end up connecting to the data center closest to you.
So here we are in Lisbon. If I go to Cloudflare.com, we actually connect to Lisbon.
And if you're in London, you connect to London, but it'd be the same IP address.
So that's what we call the ingress side.
That's great. That's easy in some ways. The other side of it is Cloudflare sometimes needs to communicate with things on the Internet, which we call egress.
And you can't use Anycast for that because you need to come from a specific IP address.
And how we do that is somewhat complicated and the decisions around, well, okay, that means that a Cloudflare server needs a specific IP address so that traffic can get back to it from the thing it's communicating with.
Because if it came back to an Anycast address, well, unfortunately it might connect to a different data center and it would be a big old mess.
So this blog post covers how we deal with these egress connections.
And in particular, a whole bunch of complexity around the fact that things come to us in different locations around the world, and they then need to go from us to servers.
And those servers sometimes want to know where we are physically around the world.
And the reason is they might be doing geo-blocking; for example, if you try to watch the BBC outside of the UK, then that doesn't work.
And so this, and sometimes also from a performance perspective, they want it to be that they know the location so they can give a faster response.
If you use fixed IP addresses, this is relatively easy because you can publish those fixed IP addresses and say, this IP is in Frankfurt and this IP is in Paris, et cetera.
And people can rely on that. But it becomes very complicated when you're operating at our scale, when you're moving traffic around for performance reasons, when sometimes you shut down a data center.
So for example, maybe I'm doing something in the UK, normally I'm using London and London is shut down for maintenance and my traffic goes to Paris while I'm still in the UK.
And you really want it to be the case that when Cloudflare contacts the thing I needed to talk to, it looks like I'm in the UK.
Exactly. The experience is the same, right?
Yes. And this is super complicated as we move traffic around. And so Marek is going to talk about a technique we invented called Soft Unicast, which allows us to move traffic around all over the place.
And it's quite complicated. And he runs through all of the decisions about how we do it and what were the trade-offs.
And well, here it is.
I mean, this is the third way, which is this Soft Unicast invention.
So in a couple of hours, and by the time you're watching this video, this will be on our blog and you'll be able to read about our solution.
Exactly. And there's all sorts of charts here that explain and graphs that explain how things move around in the Internet.
A lot to learn, actually, a lot to learn in this blog post.
Usually Marek does these deep dives that you can learn amazing things if you really want to get a sense of how the Internet works, how the decisions are made.
These blog posts are really great to go about that. And I think one of the sort of secret weapons that Cloudflare has is, I think, Unimog and I think Plurimog, which allow us to do essentially arbitrary moving of traffic anywhere around the world.
And that actually binds Cloudflare's networking to be one giant network.
And this is part of that whole world. Exactly. Last week, actually. Sorry.
I was going to say, you've been looking at trends on the Internet, right? So this is a super busy week because we've got the World Cup, we've got Black Friday and the war in Ukraine is continuing.
And I think there was an earthquake as well, wasn't there?
The Solomon Islands got cut off. There was. And yesterday it was Kenya also that was impacted.
Kenya. And that was a power outage. Power outage. Yeah, exactly.
Show us some Kenya, John. I will. So this is Kenya, actually, yesterday. So Thanksgiving Day, a big drop in traffic in Kenya for a power outage.
This is something that happens on the Internet.
But it was more serious in terms of impact, for sure, in Ukraine because of attacks.
Let me show you some trends here. So this is Friday, November 25th information, where the widespread partial outage came back after 34 hours of a real impact in terms of the Internet.
So you can see here the impact.
It went as low. This was power, right? This was caused by a big power loss across Ukraine.
Yeah. Since October the 10th, there's been clear attacks, airstrikes into energy infrastructure.
So there's been impact since then.
But this week was huge, especially since Tuesday. So there was this big drop in traffic in all of Ukraine, in all the country, because airstrikes really made big damages in terms of energy infrastructure.
So I think a big part of the country was without energy, electricity.
So because of that, of course...
Didn't you have a map somewhere that was really cool? I have a map from when the outage started.
So let me just search. We have a lot in our Twitter handle these days.
I see Moldova going by as well, because Moldova was affected by the power outages in Ukraine as well, right?
Exactly. So let me... Here it is.
So this is the heat map of... So this is the comparison of Internet traffic, the drop after Wednesday, compared with the previous 24 hours.
So in red, it means how much Internet traffic dropped.
And in green is when it rose. So usually it's more spread out, a bit of green, a bit of light red.
And this is all red because Internet traffic dropped as much as 49 percent in Ukraine.
So half of the Internet traffic went down because of the energy problems that they had there.
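The comparison behind a figure like "dropped as much as 49 percent" is a percentage change against a baseline period. Here is a minimal Python sketch, using made-up request counts rather than real Radar data; the function name is an assumption for illustration.

```python
def percent_change(current: float, baseline: float) -> float:
    """Change of `current` relative to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) * 100.0 / baseline


# Hypothetical counts: 100 units in the previous 24 hours, 51 units now.
print(percent_change(51, 100))   # -49.0, shown as red on a heat map like this
print(percent_change(110, 100))  # 10.0, shown as green
```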
And Moldova was also impacted, but it was shorter.
It was like a few hours apart, a partial outage, because Moldova is really close to Ukraine.
So they are connected in terms of Internet traffic to...
Internet traffic? No, energy to Ukraine. So they lost some energy there.
So the Internet was also affected, but like really a few hours, just a few hours.
So that was Ukraine. But on a lighter note, the World Cup also started this week, and we've been tracking a few trends about the World Cup.
And one of those, for example, was... So the World Cup is in Qatar, but all sorts of countries are participating, including the US, England, Portugal.
And I took some time to look at different countries yesterday.
And one of the amazing things is how the Internet usually goes up during the game times.
You could see a bump just when the game starts.
Mostly, I think, because of course people will be using more their phones.
Some will be streaming. So you could see different types of impact in different countries.
There was a country here, let me show you, where there was a drop.
It was South Korea. That's usual. For example, when the Super Bowl is happening in the US, usually Internet traffic goes down because people are watching their TVs.
Only countries where people are watching via streaming, for example, it goes really up.
But in South Korea, and this is not the case in most countries we've seen, Internet traffic went down.
And it's clearly seen. So this also shows that the World Cup is really impacting the Internet in all of these countries in terms of people are really making something different because they're watching the World Cup, which is interesting.
And you could see the halftime.
Yeah, that's amazing. The halftime, everyone got their phones out. Everyone was like, I want to check my mail or Twitter or Instagram or whatever.
Yeah, exactly.
And these are all trends that you could see using Radar. So all of these screenshots are from Radar.
For example, Ghana, Portugal played Ghana yesterday, so the Thanksgiving day.
And in Ghana, you could see actually Internet traffic going up for all of the games that happen in 24 hours, which is interesting.
But of course, the Ghana one was the big one.
Not the same case, actually, in Portugal. Portugal was a bit different.
So you could see also the same thing. People were more using the Internet while those games were happening.
But the Portugal-Ghana game, it was not as high.
So it was a bit higher, but not as high. So they were probably seeing more TV.
They're watching TV or maybe they were driving to work at that time, maybe.
Yeah, watching TV. Watching TV makes sense. If they will watch more TV, the Internet traffic won't be as high, for example.
So it adds up. Black Friday, right?
What's happening in terms of shopping? So, big increase. The Internet shopping has been increasing since late October, but early December.
I saw a few countries.
So this is worldwide. For example, you could see the trend of growing, for sure.
A few days from the past week, that has the biggest increases. And also, let me show the trend is more easy.
This is the seven-day rolling average.
So this gives us a perspective of growth that is more clear to see. Since late October, you could see clearly DNS traffic to e-commerce websites worldwide going up.
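A seven-day rolling average smooths out the weekly rhythm, so a weekend dip or a single busy day does not hide the overall direction. Here is a minimal Python sketch with made-up daily values; the function name and the numbers are assumptions for illustration, not Radar data.

```python
def rolling_average(values, window=7):
    """Trailing average: one output per day once `window` days are available."""
    averages = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1 : i + 1]
        averages.append(sum(chunk) / window)
    return averages


# Hypothetical daily traffic: a flat week, then one much busier day.
daily = [1, 1, 1, 1, 1, 1, 1, 8]
print(rolling_average(daily))  # [1.0, 2.0]
```

The single busy day moves the average from 1.0 to 2.0 rather than to 8, which is why the smoothed line is easier to read as a trend.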
And this is also clear in the West, for example, which is interesting. And in Great Britain, we've seen a few countries there.
And hopefully next week, we'll have a blog post that will highlight some of these trends.
This is the US, for example.
I thought it was interesting in the US that there's kind of a dip there. It's almost like people were like, I'm not going to buy anything until the deals start coming.
That's true. And actually, I have something here to show. Actually, this is the Internet traffic coming up this morning, Friday, in Ukraine.
That's great.
But this is the e-commerce domains traffic growth in the US. You could see a dip, a clear dip, is like the lower...
The turkey dip? Is that the turkey?
That's the turkey dip, because that's exactly like 3pm on Thanksgiving Day in the US, which is completely aligned with people not using the Internet to check e-commerce websites during Thanksgiving.
So that plays out and makes sense.
So this is a trend also we've seen. So a lot of trends. So hopefully we can discuss them in a blog post actually next week to see if Cyber Monday was the day with higher traffic.
That was the case actually last year. Cyber Monday had more traffic than Black Friday, which was interesting.
Yep. All right. Excellent. So that's a wrap.
We had covered a lot of topics this week. And let's wish a good Thanksgiving, Black Friday, weekend days for everyone, right?
Exactly. It was nice chatting with you.
Talk to you next week. Talk to you next week. That's a wrap.