Cloudflare TV

This Week In Net

Presented by: John Graham-Cumming
Originally aired on July 21, 2020 @ 4:30 PM - 5:00 PM EDT

A weekly review of stories affecting the Internet, brought to you by Cloudflare's CTO. We'll look at outages, trends, and new technologies — with special guests to help us explore these topics in greater depth.

Originally broadcast July 10, 2020

Transcript (Beta)

Okay, welcome to This Week in Net. I'm John Graham-Cumming, Cloudflare's CTO, and I do this show approximately weekly to look at things that have been significant in Internet news over the preceding week or so.

And today is Friday, July the 10th, and I'm sitting here in sunny Lisbon, where Cloudflare has one of its major development offices.

And I'm going to start with a story that I covered earlier this week, on Monday.

So on Monday, I do a show called Threat Watch, which is about what's happening in threats.

And one of the threats to the Internet is a rather simple one, which is the Internet getting turned off.

And over many years, we've seen the Internet get shut down in countries around the world.

This happens often during political situations, elections, unrest, contested elections very often, but also, in some countries, when there are national exams, the government will decide to switch off the Internet to prevent cheating.

That's the idea, at least. And also sometimes in countries where there is some sort of civil rights problem happening or human rights abuses, which brings me to a current Internet shutdown, which is in Ethiopia.

So the Internet has been off now for 11 days, pretty much, in Ethiopia, following the killing of a popular singer in Addis Ababa.

And the singer was from one of the ethnic groups in Ethiopia, and was very much part of previous demonstrations, and in fact, is of the same group as the current prime minister of Ethiopia.

And as part of the unrest and part of messages being sent around the country, there's obviously been a decision made to switch off the Internet.

And when I looked at this on Monday, this is what it looked like. So this graph shows you Internet traffic as seen by Cloudflare from IPs inside Ethiopia.

So we do country level geolocation and say where does this IP come from.

And this graph starts at the beginning of May, just to give you a sense of what it looks like.

And so on a daily basis, obviously, Internet goes up and down a little bit as people use it more or less depending on the day.

But what's striking is that there was a fairly consistent level of Internet use over the last couple of months.

And then right there, at the end of June and beginning of July, immediately after the killing of the singer, the Internet was shut off, though not actually completely.

If you look, it dropped down, and the figures here show it running at about 1 to 2% of what it was running before.
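
A rough way to arrive at a "percent of normal" figure like this is to compare typical traffic after the event with typical traffic before it. The sketch below is only an illustration: the daily values are invented, the function name is hypothetical, and it is not how Cloudflare computes the graph.

```python
from statistics import median


def fraction_of_baseline(baseline_samples, current_samples):
    """Return typical current traffic as a fraction of typical earlier traffic."""
    baseline = median(baseline_samples)
    if baseline <= 0:
        raise ValueError("baseline traffic must be positive")
    return median(current_samples) / baseline


# Invented daily traffic levels in arbitrary units (NOT real Cloudflare data).
before_shutdown = [980, 1010, 1000, 1040, 990, 1000, 1020]
during_shutdown = [15, 12, 18, 20]

ratio = fraction_of_baseline(before_shutdown, during_shutdown)
print(f"traffic is at {ratio:.1%} of the earlier level")
```

Using the median rather than the mean keeps a single unusual day from skewing either side of the comparison.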

So there are people inside Ethiopia with access to the Internet.

Mobile connections are not working.

Home broadband isn't working. Home dial up things aren't working.

But at this stage, on Monday, it seemed like some critical government agencies still had access to the Internet.

And if you look at the last tick on there, on the sixth, it seems to be ticking up very slightly.

And that appeared to be because the government decided to reconnect some things.

And I think that was in particular, things that were critical to the running of the country.

And so you see a slight uptick.

So I think some things, like banks, for example, might need international communications.

And by the way, it's not just the Internet that's been turned off, it's almost impossible to get a phone call into Ethiopia directly.

So other parts of the telecommunications network have been shut down.

Well, this situation has continued.

So initially, around this time on Monday, it looked like there were indications that the Internet might get turned on.

There seemed to be indications the government was saying that the unrest was dying down, and we're going to be able to switch the Internet back on again.

Here's the situation today. It really hasn't.

In fact, that little tick up that we saw on the sixth has kind of stayed there.

And so it's now running at 2% of what we were seeing earlier on. So we're 11 days into a shutdown in Ethiopia.

I think this is the longest time Ethiopia has had the Internet shut off for the entire country.

And anecdotally, there are messages we're hearing from people within Ethiopia saying that in order to get access to the Internet, if you know someone who works at a business or in the government who has access, you might be begging them to check your email or something like that.

And of course, this shuts down everything. It's not just communications like WhatsApp and email, but it's access to any kind of web or Internet at all.

So significant outage in Ethiopia. And obviously, I'll report on this later as it will no doubt come back at some point.

This kind of outage is really striking because it's the entire country.

Sometimes outages don't really show up in our data because they tend to be regional.

So in the past, for example, in Ethiopia, there have been shutdowns in regions, which might only look like a little drop in the traffic that we see.

And we see that in general. We see that in parts of India right now, in Myanmar as well, where there are regional shutdowns.

So we'll keep reporting on what's happening in Ethiopia, but the Internet is shut off for pretty much everybody except for people who it appears have been deemed to be working in something critical around the government or perhaps banking.

Now I'll switch to something we've talked about a few times on This Week in Net, which is DDoS attacks.

So if you've been following this show over the last few weeks, we've talked about some DDoS activity.

And DDoS activity is something that happens continuously.

It's a little bit like email spam. It's just out there and constantly things are getting hit with DDoSes.

And really the question is, what's the amplitude of the DDoS?

And for quite a long time, we had not had any news of large DDoS attacks.

And I think this is for a few reasons. One was that DDoS mitigation companies like Cloudflare had got very good at blocking those large attacks.

If you go back a few years, there were some big news stories about the size of attacks.

And what our data shows is that most attackers have decided to use smaller attacks.

They'll sort of do these drive-by attacks where they'll try and knock a website offline for five minutes to show their power.

But they seem to have moved on to other things.

They've moved on to the extortion type things with the BitLocker, the CryptoLocker type stuff.

But then having said all that, some big DDoSes came along.

And so if you were following along, Akamai said earlier on that in June, they had mitigated an attack which was 1.44 terabits per second.

So obviously the large DDoSes are now measured in terabits per second rather than gigabits per second.

It wasn't long ago that we talked about tens, hundreds of gigabits per second.

Now really anything that's going to be in the news, at least, is going to be in terabits per second.

This isn't the largest DDoS that had ever been acknowledged, but it's pretty close.

The largest that had previously been acknowledged was 1.6 terabits per second, against an unknown target, or at least one unknown to the general public.

But anyway, Akamai said they'd seen one that was really, really big.

What's interesting is Amazon followed up fairly quickly by saying in a report that actually they had mitigated one which was 2.3 terabits per second.

And this is now the largest DDoS in terms of bits per second that has been publicly acknowledged.

It's a pretty large attack and that will knock pretty much anything offline.

But as I said about the spam problem and the analogy here with DDoS is that spam was a huge problem and then what happened was companies came along and solved it.

And today you probably get spam and you have a spam folder in your email system, but it's all being filtered successfully.

And the same kind of stuff is happening with DDoS.

So you only really get damaged by a DDoS attack if you don't have some kind of DDoS mitigation service.

Same thing for spam. If you don't have some sort of spam filtering, yeah, you're going to get a lot of spam.

So that was kind of what was happening with these volumetric attacks. And then Akamai the other day came along and said they had mitigated an attack which was 809 million packets per second.

So it's slightly different here. We're not talking about bits per second, we're talking about packets per second.

Of course, that's how the Internet works, right?

Information is broken up into packets and sent around and that's what's made it so successful.

And so this was a different style of attack rather than it being just pure volume in terms of bits per second, it was packets per second.

And I'm going to talk about why that's significant in a moment.

But we then published this week that the day before Akamai saw this 809 million packets per second attack, we saw one which was 754 million packets per second.

So very, very close to the same kind of scale. And I suspect that just as in the past we talked about DDoS attacks that were in gigabits per second and moved to terabit, I suspect we're going to start talking about billions of packets per second because we're now getting close to a billion packets per second here at 754 and 800 or so.

So those are two kind of things that are happening.

And if you look at what we saw, this was happening over about a four day period.

So it was sort of June the 18th to June the 21st. The real peak when it went up to the 754 was on June the 20th.

But if you look actually, and the colors here are two different mitigation systems dropping packets, we're talking about hundreds, many hundreds of millions of packets per second sustained over a very long period against the target.

And this was actually against an IP address that Cloudflare uses primarily for our free customers.

Some years ago we announced that we were no longer ever going to charge anybody for a DDoS attack.

It's called unmetered mitigation.

It doesn't matter whether you're a free customer or a paying customer.

We will take the DDoS attack onslaught and deal with it.

And this is a great example. This was against an IP that is shared by a few free customers and some very low paying customers.

And it was automatically mitigated.

In fact, it was completely automatically mitigated. We only know about this attack because it shows up in metrics.

It didn't require anyone to get alerted or any human to get involved and do anything.

And we've built the systems that can handle these attacks, much like people have built systems that can handle email spam, no matter what the volume of it.

So this is sort of the new reality, which is we're seeing above one terabit and now above two terabit per second attacks.

And we're seeing attacks that are getting close to a billion packets per second.

A billion packets per second, put it another way, means you're getting a new packet arrive at your network hardware once every nanosecond.
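
The arithmetic behind that statement is just the reciprocal of the packet rate. As a small sketch (the function name is made up for illustration), here is the average gap between packets for a round billion and for the two attack rates mentioned above:

```python
def nanoseconds_between_packets(packets_per_second):
    """Average time between packet arrivals, in nanoseconds."""
    return 1e9 / packets_per_second


rates = [
    ("one billion pps", 1_000_000_000),
    ("809 million pps", 809_000_000),
    ("754 million pps", 754_000_000),
]

for label, pps in rates:
    gap = nanoseconds_between_packets(pps)
    print(f"{label}: one packet every {gap:.2f} ns on average")
```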

So let's just talk about the three types of DDoS attack that are out there, because I think it's worth thinking about this in terms of how you protect an application.

So we have things that are in terabits per second, things that are in millions of packets per second, and then things that are in millions of requests per second.

And these are trying to knock targets offline in different ways.

But they're ultimately all trying to make something be unavailable, be it a website, a mail server, the back-end API for an application.

They're all trying to overwhelm it in some way. The one I think we've thought about for years is the terabits per second one, or gigabits per second, or once upon a time megabits per second.

What that's trying to do is overwhelm the size of the pipe going to the website.

So you know if a website or an API has say a 10 gigabit per second link, or maybe they're lucky 100 gigabit per second link, then if you can send more than that, it just will be offline because the link capacity will be filled up.

And so that's what people try to do. They just try to use different techniques to send as much traffic to that IP address that would cause it then to go offline because the link would be full.

And that's the bits per second style.

And now we've gone up and up and up in terms of the sizes because mitigations have got better and better, and networks bigger.

I think Cloudflare has 37 terabits per second of capacity, so it's very hard, if not impossible, to knock us offline.

And similarly other DDoS mitigation services have very large amounts of capacity.
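
To put the "filling the pipe" idea into numbers, the sketch below compares the 2.3 terabits per second attack mentioned earlier with the link sizes and the network capacity quoted here. It is a simplification under stated assumptions: it treats each capacity as a single pipe and ignores legitimate traffic and how capacity is spread across locations. The function name is hypothetical.

```python
GBPS = 1e9   # bits per second in one gigabit per second
TBPS = 1e12  # bits per second in one terabit per second


def attack_to_capacity_ratio(attack_bps, capacity_bps):
    """How many times over the attack would fill the given capacity."""
    return attack_bps / capacity_bps


attack = 2.3 * TBPS  # largest publicly acknowledged attack mentioned above

targets = [
    ("10 Gbps link", 10 * GBPS),
    ("100 Gbps link", 100 * GBPS),
    ("37 Tbps network", 37 * TBPS),
]

for name, capacity in targets:
    ratio = attack_to_capacity_ratio(attack, capacity)
    status = "link is full" if ratio >= 1 else "capacity to spare"
    print(f"{name}: attack is {ratio:.2f}x capacity ({status})")
```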

So that's filling the pipe. But another thing you can do is cause the network hardware that's handling the packets to have trouble even if you can't fill the pipe.

And that's where we get into the packets per second style. So when a packet arrives at a router or a switch or at a machine to be processed, there is some fixed small amount of processing that happens no matter what the packet does.

You have to load the packet. You have to recognize what it is and make a decision about what to do with it.

And so if you can overwhelm that packet processing, then you have the same effect, which is that the website or the API or whatever has gone offline.

And the more packets you can send per second, the higher the chance you'll overwhelm the CPU.

Typically these attacks don't end up filling the pipe.

But what will happen is in order to get these packet rates, the attacker will send very small packets.

So very many small packets, millions of packets per second that are small, might only end up being a few hundred gigabits per second.

It won't be a headline terabits per second number. But when it gets to the router or the switch or into the machine it's hitting, that will overwhelm the CPU or the processing element that has to deal with it.

So that's why packets per second attacks become interesting because we've got very good at the floods of data filling pipes.
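
To see why a huge packet rate need not be a headline bits per second number, multiply the packet rate by the packet size. The 64-byte packet size below is an assumption chosen for illustration (the sizes used in the real attacks are not given here), and the calculation ignores any per-packet overhead on the wire.

```python
def bits_per_second(packets_per_second, packet_size_bytes):
    """Bandwidth implied by a packet rate at a fixed packet size."""
    return packets_per_second * packet_size_bytes * 8


ASSUMED_PACKET_BYTES = 64  # assumption: very small packets, for illustration only

for pps in (754_000_000, 809_000_000):
    gbps = bits_per_second(pps, ASSUMED_PACKET_BYTES) / 1e9
    print(f"{pps / 1e6:.0f} million pps at {ASSUMED_PACKET_BYTES} bytes "
          f"is about {gbps:.0f} Gbps")
```

Under that assumption both attacks land in the hundreds of gigabits per second, well short of a terabit, even though the packet counts are near record levels.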

If neither of those work, then what people are doing is they're going for requests per second.

And this is taking the problem up to the server level.

So what an attacker will do is rather than try and knock off the actual networking hardware, they'll go after the server infrastructure.

So they'll say, okay I'm going to load the home page over and over again and that will be enough to cause the CPU on the web server to have a problem and therefore be offline.

And depending on the size of the web server or the API server, this can be really effective because you may not be able to scale to sometimes even thousands or tens of thousands of requests per second.

And this is the same thing that happens when a website goes offline because it suddenly gets popular.

Right now because many places are unlocking and people are being allowed to travel, we're seeing travel websites fall over because they just can't cope with the amount of genuine traffic coming to them.

Well, this is the DDoS version of that.

I'll just reload the home page many many times or if it's more sophisticated, then what the attacker will do is they will go and take a look at a particular page that is slow on a website and load that page over and over again because they'll know it's consuming resources.

Some years ago we saw an attack on a Bitcoin website where what the attacker did was they opened a whole load of accounts on that Bitcoin website.

They didn't put any bitcoins in, but then they requested the balance, which was going to be zero.

But this request for the balance meant going back to the website's database to calculate the balance even though it was zero.

And by doing that many many times over many different accounts that overwhelmed the site and the site was offline.

So requests per second attacks can be very effective: even if an attack would never register in terms of gigabits per second or packets per second, it can knock things offline.
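
A back-of-the-envelope model shows why picking a slow page matters so much. Assume a server can work on a fixed number of requests at once and each request holds a worker for a fixed time. The worker count and timings below are invented for illustration and do not describe any real site.

```python
def max_requests_per_second(workers, milliseconds_per_request):
    """Upper bound on throughput when each request occupies one worker."""
    return workers * 1000 / milliseconds_per_request


WORKERS = 32            # assumption: concurrent requests the server can handle
CHEAP_PAGE_MS = 10      # assumption: a cached home page
EXPENSIVE_PAGE_MS = 500 # assumption: a page that goes back to the database

cheap_limit = max_requests_per_second(WORKERS, CHEAP_PAGE_MS)
expensive_limit = max_requests_per_second(WORKERS, EXPENSIVE_PAGE_MS)

print(f"cheap page: saturated at about {cheap_limit:.0f} requests per second")
print(f"expensive page: saturated at about {expensive_limit:.0f} requests per second")
```

In this toy model the attacker needs fifty times fewer requests to exhaust the server by targeting the expensive page, which is the same effect as the zero-balance lookups in the Bitcoin example.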

And one way to think about this is to think about the terabits per second is a bit like a flood.

It's a thing that fills the river until it overflows.

But the packets per second thing is like an onslaught of mosquitoes.

You're going to spend time slapping down each one of those individual mosquitoes to make it go away.

So these are very very different techniques and so as you see reports about DDoS attacks, look for the detail of whether it's bits per second overwhelming a link, packets per second overwhelming network hardware, or requests per second overwhelming the actual server infrastructure.

Because all of these are threats and anyone who's looking for a DDoS mitigation service needs to look at something that can handle all the different threats.

All right that is it for a rather short edition of This Week in Net.

I will be back next Friday with another edition looking at what's happened in that week.

I'm also back on Monday with Threat Watch looking at information about interesting new threats that have occurred and I think I'm going to cover what's happened with some networking hardware.

There was a nasty vulnerability with F5 and we can talk about how that got mitigated.

There's patches out for that. On Tuesday I have story time when I'm going to talk to a Cloudflare staff member about something that's happened.

If you missed the one with Richard Bolton from this week, which was about one of our core teams and the challenge of running it, I recommend you go look for that on our replays, because it's really very interesting about the challenge of running something at Cloudflare's scale and dealing with errors.

I've been teaching a course on GNU Make and I'm on episode four now so if you're interested in GNU Make and getting to know its syntax, how to use it, there's a little GNU Make course as well.

But as I said that's it for This Week in Net.

Thanks very much for listening. I hope this was useful and I'll let you know next week what's happening in Ethiopia.

We have seen malicious foreign actors attempt to subvert democracy.

What we saw was a sophisticated attack on our electoral system.

The Athenian project is our little contribution as a company to say how can we help ensure that the political process has integrity, that people can trust it and that people can rely on it.

It's like a small family or community here, and I think elections around the nation are the same way.

We're not a big agency. We don't have thousands of employees. We have tens of employees; we have fewer than a hundred here in North Carolina.

So what's on my mind when I get up and go to work every morning is what's next?

What did we not think of and what are the bad actors thinking of?

The Athenian project, we use that to protect our voter information center site and allow it to be securely accessed by the citizens of Rhode Island.

It's extremely important to protect that and to be able to keep it available.

There are many bad actors out there that are trying to bring that down, and others trying to penetrate our perimeter defenses from the Internet to access our voter registration and/or tabulation data.

So it's very important to have an elections website that is safe, secure and foremost accurate.

The Athenian project for anyone who is trying to run an election anywhere in the United States is provided by us for free.

We think of it as a community service.

I stay optimistic by reminding myself there's a light at the end of the tunnel.

It's not a train. Having this protection gives us some peace of mind that we know if for some reason we were to come under attack we wouldn't have to scramble or worry about trying to keep our site up that Cloudflare has our back.

So the release of Workers Sites makes it super easy to deploy static applications to Cloudflare Workers.

In this example I'll use create-react-app to quickly deploy a React application to Cloudflare Workers.

To start I'll run npx create-react-app, passing in the name of my project.

Here I'll call it my-react-app. Once create-react-app has finished setting up my project, we can go in the folder and run wrangler init --site.

This will set up some sane defaults that we can use to get started deploying our react app.

wrangler.toml, which we'll get to in a second, represents the configuration for my project, and workers-site is the default code needed to run it on the Workers platform.

If you're interested you can look in the workers-site folder to understand how it works, but for now we'll just use the default configuration.

For now I'll open up wrangler.toml and paste in a couple configuration keys.

I'll need my Cloudflare account ID to indicate to wrangler where I actually want to deploy my application.

So in the Cloudflare UI I'll go to my account, go to Workers, and on the sidebar I'll scroll down and find my account ID here and copy it to my clipboard.

Back in my wrangler.toml I'll paste in my account ID, and bucket is the location that my project will be built out to.

With create-react-app this is the build folder. Once I've set those up I'll save the file and run npm run build.

create-react-app will build my project in just a couple seconds, and once it's done I'm ready to deploy my project to Cloudflare Workers.

I'll run wrangler publish, which will take my project, build it, and upload all of the static assets to Workers KV as well as the necessary script to serve those assets from KV to my users.

Opening up my new project in the browser, you can see that my React app is available at my workers.dev domain, and with a couple minutes and just a brief amount of config we've deployed an application that's automatically cached on Cloudflare servers so it stays super fast.

If you're interested in learning more about Workers Sites, make sure to check out our docs, where we've added a new tutorial to go along with this video as well as an entire new Workers Sites section to help you learn how to deploy other applications to Cloudflare Workers.

The real privilege of working at Mozilla is that we're a mission-driven organization.

What that means is that before we do things we ask what's good for the users as opposed to what's going to make the most money.

Mozilla's values are similar to Cloudflare's.

They care about enabling the web for everybody in a way that is secure, in a way that is private, and in a way that is trustworthy.

We've been collaborating on improving the protocols that help secure connections between browsers and websites.

Mozilla and Cloudflare collaborate on a wide range of technologies.

The first place we really collaborated was the new TLS 1.3 protocol, and then we followed it up with QUIC and DNS over HTTPS and, most recently, the new Firefox Private Network.

DNS is core to the way that everything on the Internet works.

It's a very old protocol and it's also in plain text meaning that it's not encrypted and this is something that a lot of people don't realize.

You can be using SSL and connecting securely to websites but your DNS traffic may still be unencrypted.

When Mozilla was looking for a partner for providing encrypted DNS Cloudflare was a natural fit.

The idea was that Cloudflare would run the server piece of it and Mozilla would run the client piece of it and the consequence would be that we protect DNS traffic for anybody who used Firefox.

Cloudflare was a great partner with this because they were really willing early on to implement the protocol, stand up a trusted recursive resolver and create this experience for users.

They were strong supporters of it. One of the great things about working with Cloudflare is their engineers are crazy fast so the time between we decide to do something and we write down the barest protocol sketch and they have it running in their infrastructure is a matter of days to weeks not a matter of months to years.

There's a difference between standing up a service that one person can use or 10 people can use and a service that everybody on the Internet can use.

When we talk about bringing new protocols to the web we're talking about bringing it not to millions not to tens of millions we're talking about hundreds of millions to billions of people.

Cloudflare has been an amazing partner on the privacy front. They've been willing to be extremely transparent about the data that they are collecting and why they're using it, and they've also been willing to throw those logs away.

Really users are getting two classes of benefits out of our partnership with Cloudflare.

The first is direct benefits that is we're offering services to the user that make them more secure and we're offering them via Cloudflare so that's like an immediate benefit these users are getting.

The indirect benefit these users are getting is that we're developing the next generation of security and privacy technology and Cloudflare is helping us do it and that will ultimately benefit every user both Firefox users and every user of the Internet.

We're really excited to work with an organization like Mozilla that is aligned with the user's interests and in taking the Internet and moving it in a direction that is more private more secure and is aligned with what we think the Internet should be.

Hi, we're Cloudflare.

We're building one of the world's largest global cloud networks to help make the Internet faster, more secure, and more reliable.

Meet our customer Falabella.

They're South America's largest department store chain with over a hundred locations and operations in over six countries.

My name is Karan Tiwari.

I work as a lead architect in Odessa E-Commerce at Falabella.

Like many other retailers in the industry, Falabella is in the midst of a digital transformation to evolve their business culture to maintain their competitive advantage and to better serve their customers.

Cloudflare was an important step towards not only accelerating their website properties, but also increasing their organization's operational efficiencies and agility.

So I think we were looking at better agility, better response time in terms of support, better operational capabilities.

Earlier, for a cache purge, it used to take around two hours.

Today, it takes around 20 milliseconds, 30 milliseconds to do a cache purge.

The homepage loads faster. Your first view is much faster. It's fast.

Cloudflare plays an important role in safeguarding customer information and improving the efficiencies of all of their web properties.

With customers like Falabella and over 10 million other domains that trust Cloudflare with their security and performance, we're making the Internet fast, secure, and reliable for everyone.

Cloudflare. Helping build a better Internet.