This Week in Net: Privacy Gateway, partial outage, Email Routing, and a treat
Presented by: John Graham-Cumming, João Tomé
Originally aired on September 8, 2024 @ 6:30 PM - 7:00 PM EDT
Welcome to our weekly review of stories from our blog and elsewhere, from products, tools and announcements to disruptions on the Internet.
João Tomé is joined by our CTO, John Graham-Cumming. In this week’s program we explain why Privacy Gateway is important, we go over what Email Routing does now that it has left Beta, and we talk about our partial outage.
We explain how to create trust with Oblivious HTTP privacy properties and why John moved his YouTube channel "Behind The Screens", dedicated to digging into source code seen in films and on TV, to the Cloudflare ecosystem (from Pages to Stream). There’s also a small Halloween treat.
Read the blog posts:
- Privacy Gateway: a privacy preserving proxy built on Internet standards
- Stronger than a promise: proving Oblivious HTTP privacy properties
- Partial Cloudflare outage on October 25, 2022
- Email Routing leaves Beta
- And here's another one: the Next.js Edge Runtime becomes the fourth full-stack framework supported by Cloudflare Pages
- Page Shield can now watch for malicious outbound connections made by third-party JavaScript code
- Cloudflare Workers and micro-frontends: made for one another
Transcript (Beta)
Hello, and welcome to This Week in Net, our weekly review of stories we've been writing in our Cloudflare blog, but also things affecting the Internet.
With me, I have, as usual, our CTO, John Graham-Cumming, who is in Lisbon, like me.
Hello, John.
Good afternoon, morning, night, wherever you're watching this. Exactly.
That's a recurrent catchphrase you're using. We had a few things this week.
It was not an Innovation Week, per se. We had a few things this week, from Privacy Gateway to Oblivious HTTP privacy properties, to even a partial Cloudflare outage.
Should we start with Privacy Gateway? Let's start there, yeah. So, in a sense, Privacy Gateway enables applications to use Cloudflare as a trusted relay, limiting which identifying information, including IP addresses, is visible to their infrastructure.
But what does this mean for the general type of user?
Imagine that you are building a weather app, where people look at the weather forecast.
When you go to that weather app and you type in, I want the weather in Lisbon, there's a connection from the app that goes to the weather service, whatever your backend is, your service, and says, give me the weather in Lisbon, right?
Which goes back to the device and gets displayed to the end user.
But inherently, your phone or whatever had to connect to that backend service over the Internet, and one of the things it provided was the IP address.
So, there is a link between the IP address, which might be my home IP address, if I'm on my Wi-Fi at home, my office, if I'm on the office one, or it could be Wi-Fi, it could be my mobile phone provider.
So, there's information, right, which is given, which is like, not only am I interested in Lisbon, but I'm also from this IP address.
So, it makes a connection between something about me, the IP address, and something I'm thinking about, in this case, the weather in Lisbon, which reveals some information.
And so, what Privacy Gateway does is it allows you to break that link.
And the way it works is that the application sends a request to Cloudflare, this user wants to get Lisbon.
And then we send the request to the actual application backend, saying, I need the weather for Lisbon, and then pass the reply through us.
And this hop in the middle has a really interesting property, which is that by using cryptography, it is possible to ensure that the application backend only knows somebody wanted to know about Lisbon.
And the Cloudflare bit only knows that something with a particular IP address wanted to do something on this backend.
So, we, in this instance, would know the IP address of the end user.
The application would know Lisbon, we wouldn't know Lisbon, and the application wouldn't know the IP address.
So, it breaks the linkage between the thing I'm looking for and the IP address associated with me.
And that's really powerful.
You could use it for weather apps, you could use it for all sorts of things.
It's being used by an app called Flo, which is for period tracking for women, and it can break that connection.
And so, this is now available as a product that anyone can use if they want to have this privacy guarantee when we provide it.
And so, we can't see the content that's being requested, and the end service can't see the IP address.
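The split of knowledge described above can be illustrated with a toy model. This is only a sketch under loud assumptions: real Oblivious HTTP encrypts requests with HPKE to the gateway's public key, whereas the XOR-based `seal` below is an insecure stand-in using a shared key, and every name here (`seal`, `Relay`, `Backend`) is hypothetical rather than part of any Cloudflare API.

```python
import hashlib
from dataclasses import dataclass, field


def _keystream(key: bytes, length: int) -> bytes:
    """Derive a repeatable byte stream from the key (toy construction, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy stand-in for HPKE encryption: XOR with a key-derived stream."""
    stream = _keystream(key, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))


# XOR is its own inverse, so opening is the same operation as sealing.
open_sealed = seal


@dataclass
class Backend:
    key: bytes
    seen: list = field(default_factory=list)

    def handle(self, sealed_request: bytes) -> bytes:
        query = open_sealed(self.key, sealed_request).decode()
        self.seen.append(query)  # the backend sees the query, never the client IP
        return seal(self.key, f"forecast for {query}".encode())


@dataclass
class Relay:
    backend: Backend
    seen: list = field(default_factory=list)

    def forward(self, client_ip: str, sealed_request: bytes) -> bytes:
        # The relay sees the client IP and opaque bytes only.
        self.seen.append((client_ip, sealed_request))
        return self.backend.handle(sealed_request)


# Assumption: the app and backend share a key; real OHTTP uses public-key crypto.
key = b"shared-between-app-and-backend"
backend = Backend(key)
relay = Relay(backend)
reply = relay.forward("203.0.113.7", seal(key, b"Lisbon"))
print(open_sealed(key, reply).decode())  # forecast for Lisbon
```

After the call, `relay.seen` holds the IP address next to unreadable bytes, while `backend.seen` holds only the word "Lisbon", which is the broken link the discussion describes.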
And the interesting thing about this is, so this is a service, but it is based on an Internet standard.
And if you flip to the other blog post, which is about provability, it's actually called "Stronger than a promise".
So, oblivious HTTP is the underlying protocol. And this word oblivious is kind of important because it means that parts of the system, which consists of the application, the relay bit, and the actual backend, are oblivious to different bits of information.
And we've seen this in other technologies. So, oblivious DNS over HTTPS does exactly this, but for DNS, so that you know that, okay, somebody wants to know this IP address associated with this name, for example, but you can't know who it is.
And the middle bit doesn't know what's being asked for, but knows who's asking.
This private relay product uses the same thing. And so, this blog post is about the fact that there's a standard, oblivious HTTP, and a proof, mathematical proof, using a tool called Tamarin to prove that it does what it says on the box.
That is, we can formally say that we know that the middle bit, the relay, can't get information about what the HTTP request is, doesn't know what content is being asked for.
And we know that the end application server doesn't know the IP address, but does know, obviously, the request because they need to know.
So, this is a nice thing. We have the service, which you can use, and which is based on a standard, Oblivious HTTP, which is going through the IETF, and we have a mathematical proof.
It's on GitHub, so if you want to go off and run the tool yourself, you can, and see that it proves that it has the privacy properties that you want.
That's the prover. And if you scroll down in that original blog post, you'll see there's a link to the GitHub repository, right at the bottom, I think.
It is. Yeah, there you go. So, that's actually the proofs.
And we have proofs of other things. So, one of the things we're really interested in doing is mathematical proofs that something does what it says, because proofs and building something into the protocol are such a strong way of saying that you are preserving privacy, or you do have certain security properties, much stronger than a promise, like, well, we promise not to look at this information.
Here, it's like, we literally can't, and here's a mathematical proof of it.
So, I think this is really fascinating, because in a sense, not only are we stating that we're doing something that is relevant and important in terms of privacy and also security, but we're showing it in a mathematical format.
Absolutely. It's a very, very strong guarantee of what can and cannot be done, and what is provided by the service.
The title says it all. It's stronger than a promise, for sure.
Just giving those highlights there in terms of how can you check that what we are doing is what we say that we are doing.
So, it's also good in terms of trust, using a service in terms of trust, right?
Absolutely, absolutely.
You don't just have to trust the service, because it's built into the mathematics.
Exactly. But in terms of the use cases for these types of products, who are the customers, for example, the companies that could make better use of this type of product?
Well, I mean, if you don't need to know the end user's IP address, right?
You don't need to know who they are, then why not do this by default? I think one of the things that's happening with the Internet is people are going to start having privacy by default.
So, you could build this into any application where you say, well, look, I'm just going to break the link between who this person is and what they're asking, or at least who their IP address is, which to a certain extent identifies somebody or a connection, at least to the Internet, which could give you information.
So, I think this could be very widely used. I think the reason I gave the example of the weather app is that this is not an application that obviously needs to be private.
Yet, if I'm searching for the weather in Lisbon and I search for that regularly, it gives some information.
I probably live in Lisbon or want to move to Lisbon or something like that, right?
So, I think privacy concerns all aspects of our lives and it's important.
And this is a mathematically proven service that can help with that.
Makes sense. And again, there's new regulation, new policy, and who knows, new regulation in specific countries may highlight that or privilege that, right?
Well, I mean, so it's certainly the case that countries around the world are legislating privacy for their citizens.
And we're going to see this. This is going to be a big trend or the next...
I mean, it's been a trend since GDPR in Europe, but this is by no means a European thing.
This is something that's happening worldwide. And so, getting ahead of these things, preserving people's privacy, I think is really important for us as individuals and frankly, for business, because you're going to have to comply with those laws anyway.
And if it's a use case that you can explain to your customers, you're not using their data, you're focused on privacy and here's the proof of that.
Yeah. And it's more than "we're not"; it's "we can't".
We can't, yeah. Even if we wanted to. Actually, that's a better proof than just saying that we're not.
We simply can't, yeah. We built this in order for us not to be able to do that, even if someone asks.
Yeah, exactly.
So, let's move to Email Routing, which has left beta. So, it's already available.
Let me share my screen. So, email routing is a thing that we announced quite a long time ago now, 2021, seems like forever ago.
And that is that we can take email in for your domain name.
So, suppose you own a particular domain name and we can route it to the right mailboxes for you.
So, it's a very common use case that somebody has, say, an account on Gmail, but actually the email address they give out to the world is something which uses their own domain name.
So, they've bought a domain name and if you don't have one, you can buy one through Cloudflare, it's dead easy.
And then you can give out an address at, say, joao.pt. I don't know if that exists, but you could have that.
And then it could be routed to your actual Gmail account.
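The forwarding idea described above can be sketched as a simple lookup table. This is a minimal illustration, not how Email Routing is implemented: the addresses, the `ROUTES` table, the `CATCH_ALL` fallback, and the `route` function are all hypothetical, and the sketch assumes addresses are matched case-insensitively.

```python
from typing import Optional

# Hypothetical routing rules: custom-domain address -> real mailbox.
ROUTES = {
    "hello@example.com": "someone@mail.example",
}
CATCH_ALL: Optional[str] = None  # optional fallback destination; None means drop


def route(recipient: str) -> Optional[str]:
    """Return the destination mailbox for a recipient, or None to drop the message."""
    # Assumption: addresses are normalized to lowercase before matching.
    return ROUTES.get(recipient.strip().lower(), CATCH_ALL)


print(route("Hello@Example.com"))    # someone@mail.example
print(route("unknown@example.com"))  # None
```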
And this is something we launched in Birthday Week of 2021, so a little bit over a year ago, and it just took off like wildfire.
So, we quickly became actually one of the leading email providers in a way, because of the amount of email we're routing through it.
And it has been in beta for a year and is now out of beta.
So, about half a million inboxes and 2 million messages a day, and it's continued to grow very rapidly.
We also integrated stuff from Area 1. So, Area 1 is a company that we acquired earlier this year, which did email security.
And so now we've taken some of the know-how, data feeds from that to strengthen spam filtering and things like that within email routing.
And a bunch of other features.
So, as it says, Cloudflare has always been very API first, you should be able to integrate through the API and do it not through the UI.
The API for email routing hadn't been public until now, now it is.
It's part of our standard API.
You can go in there and manage email accounts through the API, and it's all fully documented.
And that means you can also use it through Terraform. So, if you're one of those people who likes to use Terraform to manage everything, you can manage it like that through the API.
I was trying to open it, but I don't know why it's quite slow.
Well, one of those days. If only we knew something about how the Internet works, we might be able to figure out.
Exactly. Let's go back to the blog while we're chatting, while that's loading up.
So, if you scroll down in here then.
So, IPv6, super important. We're getting above 30% globally, above 50% in many countries for IPv6.
All Cloudflare customers get IPv6, and we are now egressing.
That is to say, when we forward an email from Cloudflare to your real email provider, the backend email provider, we can use IPv6.
And that's very important there.
For example, Google Mail uses IPv6 for its mail delivery. So, once again, we like to use IPv6.
Very important thing. Let's scroll down a little bit further, and you're going to see that we've added a bunch of observability.
So, we have analytics.
So, you see how much mail we're forwarding, how much we're dropping, because we think it's spam.
And you can also dig in to any part of this. So, this is really nice.
So, you can see, here's the emails that are going through, here are a few that we dropped, because presumably they were spam or harmful, malware, for example.
But you can also go into a very detailed log, which will show each email and what decision we made about it.
Did we forward it or not? So, there's an activity log that shows that, and it also shows bounces, if an email bounced.
So, here you can see digging into a particular email and it was bounced, or it failed one of the particular standards that protects email integrity.
So, the tool now, out of beta, anyone can use it.
It's free to sign up for. If you don't have a domain, you can go get one from Cloudflare.
If you're a larger user, you want audit logs, so you can see exactly what happened, why things got set up in the way in which they did.
And you can see that here. And anti-spam. We have built-in anti-spam as well.
So, this will augment whatever you're using on your real email provider.
We will block spam before it even gets there. And also IDN support. So, internationalized domain names.
So, the web was made by people who use essentially the 26 letters of the so-called Latin alphabet, but without even any of the Latin characters with accents.
So, if you go down into the web, it's really ASCII.
So, now we support all of the internationalized domain names. So, you can have Greek and Japanese and Chinese, et cetera, in the domain names.
You give the example here in Greek, which is a bit of a joke because the Greek word says test or example.
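To make the IDN point concrete, here is a small sketch of how a non-ASCII name becomes the ASCII form that DNS actually carries. It uses Python's built-in `idna` codec (which implements the older IDNA 2003 rules); the Greek name below, roughly "example.test", is chosen for illustration and is not necessarily the one shown in the blog post.

```python
# A Greek name meaning roughly "example.test" (illustrative choice).
name = "παράδειγμα.δοκιμή"

# Python's built-in "idna" codec converts each label to its ASCII "xn--"
# (Punycode) form, which is what travels in DNS and SMTP.
ascii_name = name.encode("idna")
print(ascii_name)

# Decoding reverses the transformation and recovers the Unicode name.
print(ascii_name.decode("idna"))
```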
There we go. And 8-bit MIME, again, if we go back to the beginning of the Internet, we used 7-bit characters rather than 8-bit characters.
And that was the standard way, way back when this stuff got going. And these things have resisted a lot of change, but there is 8-bit, which is better for sending binary files.
And guess what? We send a ton of binary files via email, right?
We send photographs to each other, PDFs and stuff like that. And way back at the beginning of email, it was all text and it was all English text using English characters.
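The 7-bit versus 8-bit distinction can be shown with a short check. This is only a sketch: the helper `is_seven_bit` and the sample strings are made up for illustration, and base64 is used here as one common way to squeeze 8-bit data through a 7-bit-only path.

```python
import base64


def is_seven_bit(data: bytes) -> bool:
    """True if every byte fits in 7 bits (values 0-127)."""
    return all(b < 0x80 for b in data)


ascii_body = "Hello".encode("utf-8")
accented_body = "Olá, João".encode("utf-8")

print(is_seven_bit(ascii_body))     # True
print(is_seven_bit(accented_body))  # False

# On a 7-bit-only path the accented text must be re-encoded, for example
# as base64, which for larger payloads costs about a third more bytes.
encoded = base64.b64encode(accented_body)
print(is_seven_bit(encoded))        # True
print(len(accented_body), len(encoded))
```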
So, there we go. So, lots of other stuff comes with Email Routing being out of beta.
And coming soon, there will be Route to Workers, which means that you can process email using Workers.
You can write code to do whatever you want with email.
And that's going to be a very exciting release when that becomes fully available.
And again, it's all about integration and integrating products, making life easier for a company, for a developer, in a sense of using the whole capability of things, right?
Well, I mean, I think the thing is, I think one of the lessons of the world is that if you add programmability to something, then you open up new use cases, you give people power to do things, right?
So, we did that with Workers, where we gave people the power to customize the Cloudflare platform.
And then of course, they build their own applications on top. And we see that with large applications.
I think what will happen with Route to Workers is people will start out building little customizations to their email and then, well, wait a minute, I can build an application here.
So, I really believe that adding a Turing machine to things gives you power that you previously didn't have.
And because we have a very large API catalog, there's lots you can do. For sure, for sure.
We also had a partial Cloudflare outage this week. And it was related to something that is not a common type of problem for us, the tiered cache system, right?
Well, okay. So, the outage was exactly, it was partial, which means that not every customer was affected.
In fact, it was a small percentage of customers.
However, on the other hand, if you were affected, then your service wasn't working for the most part.
Although the nature of the bug meant that it might have worked some of the time.
So, it would have been very frustrating if you were a customer who had this happen to you, and you would have seen errors appearing, a 530 error appearing rather than your website or API.
Yes, this affected a thing called tiered cache. So, we have layers of cache.
And if you have a large website, you may use tiered cache, which means that you reduce the number of Cloudflare locations that connect to your backend.
And we keep larger caches in some locations and then spread that cache data out across the rest of the CDN.
There's a nice diagram of this here where you have requests going to a few locations.
One of the things is Cloudflare has a very large number of locations.
And in some cases, it'll be much more efficient if only a small number of them talk to your origin server, and then our machines organize the cache amongst themselves.
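The tiered cache idea described above can be sketched in a few lines. This is a toy model with hypothetical class names (`Origin`, `CacheNode`), not Cloudflare's implementation: each cache asks its parent on a miss, so with one upper tier in front of the origin, many edge locations produce a single origin request.

```python
class Origin:
    """The customer's origin server; counts how often it is contacted."""

    def __init__(self):
        self.requests = 0

    def fetch(self, path: str) -> str:
        self.requests += 1
        return f"content of {path}"


class CacheNode:
    """A cache that asks its parent (an upper tier or the origin) on a miss."""

    def __init__(self, parent):
        self.parent = parent
        self.store = {}

    def fetch(self, path: str) -> str:
        if path not in self.store:
            self.store[path] = self.parent.fetch(path)
        return self.store[path]


origin = Origin()
upper_tier = CacheNode(origin)                      # one of a few upper-tier locations
edges = [CacheNode(upper_tier) for _ in range(50)]  # many edge locations

for edge in edges:
    edge.fetch("/index.html")

print(origin.requests)  # 1: only the upper tier contacted the origin
```

If each of the 50 edge caches were instead pointed directly at `origin`, the same loop would contact the origin 50 times, which is the load the upper tier removes.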
That's what tiered cache does. A software bug was introduced, a very, very bizarre software bug.
If you take a look at it, if you scroll down to the little bit of text down here, you're going to see some code.
Oh, here it is. Here's the interesting thing. If you look at this here, what changed was that a single line was added.
And that line was a tracing function.
Now, what's a tracing function? One of the things with a system of the complexity of Cloudflare is there are many systems that are working together to process something that goes through Cloudflare.
So, an HTTP request or an email or a DNS lookup or whatever.
And if something isn't working, one of the things you want to do is trace it.
That is to say, you want to follow the thread.
Oh, first it reached this piece of software, then it went to this piece of software, then it went to this piece of software, and so on.
And you want to see a view of that.
And you want to see a view of it from a performance perspective, so you know where it gets slow, and also if there's a fault somewhere along that path.
And so, an engineer added this one line, which is to say, oh, by the way, if we hit tiered cache here, keep a record of that.
Here's the tiered cache rewrite. And actually, while the incident was happening, I was talking to the engineer and he said, I can't see how this could cause a problem.
This is like the- A small part of code.
Tiny thing. There's no if then here, there's no decision being made. Well, it turned out that a very bad thing happened, which is the trace function had a side effect.
And the side effect means that it did more than it said. So, it said it would keep a trace.
It just would keep a record somewhere. We came here, we did this.
It didn't just do that. It also cleared something, the control headers. And the control headers are information that will get passed between bits of Cloudflare's internal structure, telling the system, okay, here's where this came from, here's what you should do next with it.
And there was a side effect which cleared it, which meant that the tiered cache thing couldn't end up going to another location.
In particular, it couldn't get to the origin server of that customer.
And because the DNS data wasn't there, there was a failed DNS lookup.
And that appeared as a 530 error. And so, for a long time, we thought something was wrong with our DNS system, because it looked like a DNS problem.
So, the story here is of a one-line change that should have been improving our visibility into the system.
But because the function had a side effect, which the engineer wasn't aware of, it caused a big problem.
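The failure mode described above, a tracing call with a hidden side effect, can be illustrated with a small sketch. This is not Cloudflare's code: `Request`, `record_trace`, `route_via_tiered_cache`, and the `next_hop` header are hypothetical names invented to show how a line that looks like pure bookkeeping can break routing.

```python
class Request:
    def __init__(self, control_headers: dict):
        self.control_headers = control_headers  # internal routing instructions
        self.trace = []


def record_trace(request: Request, event: str) -> None:
    """Looks like pure bookkeeping, but also clears the control headers."""
    request.trace.append(event)
    request.control_headers.clear()  # the hidden, undocumented side effect


def route_via_tiered_cache(request: Request, with_tracing: bool) -> int:
    """Return an HTTP-style status code for the routing attempt."""
    if with_tracing:
        # The "harmless" one-line addition: no branch, no decision.
        record_trace(request, "tiered_cache_rewrite")
    if "next_hop" not in request.control_headers:
        return 530  # routing data is gone, so the next hop cannot be resolved
    return 200


print(route_via_tiered_cache(Request({"next_hop": "upper-tier"}), with_tracing=False))  # 200
print(route_via_tiered_cache(Request({"next_hop": "upper-tier"}), with_tracing=True))   # 530
```

Nothing at the call site hints at the problem, which matches the engineer's reaction in the story: the added line contains no conditional logic, and the damage comes entirely from what the called function does beyond its name.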
And normally, when we use the trace functions, that side effect doesn't have any dangerous effect.
But there you go.
And so, this was about an hour and a half where the customers who were affected saw errors intermittently, quite frequently, on their services.
And that was this week.
Again, I was a journalist, and this was not in the news, because it affected a small percentage of our customers.
Again, it was a partial outage, not that big in terms of customers.
But again, to your point, the customers that were affected, this is really important for them.
And I think it's really important for us also to explain what happened, but also what we did to resolve it even for the future, right?
In a sense. Of course. Yes, I mean- Learning is important. It's always worth talking about these mistakes.
I mean, people, first of all, are interested to understand what's happening.
We want to make sure we got down to understand why.
And then, of course, there's an internal effort now to understand how did that side effect get introduced?
Why wasn't it documented? All of the follow-on aspects to it.
But yes, we've always had this habit of talking about problems we've had, as much as we talk about the successes and our architecture and our software and our products.
This is part of the all-rounded world of being a company.
And it's part of the Cloudflare culture, actually. Since I've been here, I've seen this over and over again.
And since the early stages of the blog, you could see that something happened.
We explain it in detail, in technical detail.
It's also a form of collaboration, because we're spreading the word so that other companies don't have the same problems.
Other companies, other developers or people in this area, right?
I think that's right. But I also think the most important thing is that if you've ever built something, you will know that errors get created.
There are problems that occur. There are things that happen.
And it's just better to be honest about this stuff. Just say, look, this is what happened with our system today.
And other people will recognize that, oh yeah, me too.
I've had things like that. It's interesting what happened. And it creates trust.
And so that's why we do it. Of course. Makes sense. We still have a few minutes.
We can run through a few things from the previous week. For example, last week we announced "Cloudflare Workers and micro-frontends: made for one another".
Do you want to...
Yeah, there were a couple of announcements recently about this. One of them was a sort of architectural discussion about using micro-frontends, which is to say that if you're building an application, a web application in particular, one architectural way of doing it is to break up the page, the application, into small areas, which are referred to as micro-frontends, and pair each of them with a Cloudflare Worker on the backend.
And then your application comes together as the amalgam of all those things.
And some folks were showing how to do that and how they were really made for each other in terms of the architecture.
And so there's an example application here of a gallery of clouds, showing how the different parts of it work.
And that was one thing. And then there was another one, which was around Cloudflare Pages supporting the Next.js Edge runtime.
So it's the fourth full stack framework. And so I think what's important to understand about Workers and Pages is that this is a platform where you can build any type of application on it.
And people are building any type of application.
And this is now the fourth stack we've got on there. So you know, SvelteKit, Remix, and Qwik, and all this stuff is running in.
So along with Pages Functions, you can put a static website up, you can build a richly interactive website with Workers.
You can go out there and use maybe an architecture like micro-frontends.
And we'll hear more about this kind of stuff. So I think these two announcements are kind of interesting because they're really showing the maturity of the platform.
I mean, Workers has been around more than five years at this point.
And so I think that people are starting to build really quite complex applications on us, especially because of our storage products like the KV Store, D1 Database, Durable Objects.
And so, you know, come build whatever you want to build on Cloudflare with whatever tools you want to use.
Exactly. And actually, you did an example of that, right?
Of using Pages? Well, I've used Pages for a few static websites.
One of the things that I did recently was I took all of the videos I made on a YouTube channel called Behind the Screens, which is about, you know, sometimes when you look at a film or a TV show, if you're a programmer, you end up seeing some code on screen.
And it's kind of interesting to see what that code is, right?
And so if you're an engineer, you're instantly like, what is that code? And I moved all of my YouTube videos off of YouTube onto a dedicated site, Behind the Screens, which is on Cloudflare Pages.
And the video streaming is using Cloudflare Stream, which is Cloudflare's own streaming product.
So if you are super nerdy and wonder about some code from Airwolf in the 80s, or from what was in the Terminator or Jurassic Park, I have spent a ridiculous amount of time digging into, you know, what that code really was.
And quite often actually copying it, running it in an emulator and showing you how it actually operates.
So that's Behind the Screens, but that is all on Cloudflare.
And, you know, I moved it from YouTube because we have such a great streaming service.
We have such a great static hosting thing.
And I'm going to be moving more stuff that's super interactive onto our platform because it's now, you know, it's full featured for whatever you want to build.
And you did this like in not that much time, right? Just moving things up.
It's easy, right? Well, it was pretty easy actually. Yeah. Yeah. The hardest thing was actually finding the original copies of these videos.
Once I found the disk with all the copies on it, then I just uploaded them all to Cloudflare Stream.
Cloudflare Stream did all of the work to, you know, figure out the different formats and render everything correctly.
And then, you know, Cloudflare Pages, I happen to be using GitHub.
It's just a GitHub project. You just, you know, push your changes up to GitHub and it builds it and puts it out, you know, pushes it onto there.
And I even bought, I believe, the domain name through Cloudflare too.
So it was a full package in terms of... It was a full package. Yes. Yeah.
Yeah. You're trying out as one should. A lot of amazing stuff here that I advise anyone to see if they want.
One of the things that I found interesting is that with YouTube, or other services where you have ads everywhere, you have a different experience.
This way, you can have a cleaner experience, the way you like it.
The other thing is, you know, Cloudflare Stream has a very generous, you know, amount of minutes and storage built into the Pro and Business plans, which we recently announced.
And so, since I pay for my own account on Cloudflare with my own credit card, it just made sense.
Like, yeah, this is inexpensive to do. And to be fair, the audience for Behind the Screens isn't quite as big as that of Korean supergroup BTS, right?
The other BTS. I think so, but there's an audience and they will love this for sure.
So, because the details are kind of amazing. If you love movies and you love programming and code and all that and computers, you have amazing stuff here for sure.
Good. Well, there you go. That's Behind the Screens. Exactly. So, this session is done.
And next week we'll be at Web Summit in Lisbon. So, maybe someone could reach out to us there if someone is coming.
That's right. We'll be at Web Summit.
See you at Web Summit. Exactly. A big event, 70,000 people. It's coming again to Lisbon.
So, we'll be there. See you there. So, I think that's a wrap. Thank you, John.
Cheers, Rob. Good to see you again. Bye-bye. And that's a wrap. But before we go, and because, to be honest, it's Halloween, and actually this is a very interesting color given that it is very similar to the Cloudflare color, we have a small surprise for you.
Check it out. So,