🎂 Closing the last privacy holes on the Internet
Presented by: Mari Galicer, Achiel van der Mandele
Originally aired on January 14 @ 11:30 PM - 12:00 AM EST
Welcome to Cloudflare Birthday Week 2023!
2023 marks Cloudflare’s 13th birthday! Each day this week we will announce new products and host fascinating discussions with guests including product experts, customers, and industry peers.
Tune in all week for more news, announcements, and thought-provoking discussions!
Read the blog posts:
- Privacy-preserving measurement and machine learning
- Encrypted Client Hello - the last puzzle piece to privacy
- See what threats are lurking in your Office 365 with Cloudflare Email Retro Scan
Visit the Birthday Week Hub for every announcement and CFTV episode — check back all week for more!
Transcript (Beta)
All right. Hi, everyone. Welcome to our segment, Closing the Last Privacy Holes on the Internet.
My name is Mari. I'm a product manager for privacy here, and I am joined by Achiel.
Do you want to introduce yourself?
Yeah. Hi, my name is Achiel. I'm also a product manager at Cloudflare overseeing edge connectivity.
I'm very excited to talk about privacy today.
Great. So over the past two days, obviously, as part of Birthday Week, we've released a bunch of exciting blogs related to privacy.
Among them is my blog post on Microsoft Edge Secure Network, which is a VPN that's built right into the Microsoft Edge browser and powered by our privacy proxy platform.
I also released a blog post with a colleague of mine, Christopher Patton, on privacy preserving measurement and some of the different ways that it can be applied in machine learning and other federated learning applications.
And it uses a really cool new cryptography technique called secret sharing and also utilizes multi-party computation.
And then we also have a really awesome post that Achiel and some of his colleagues wrote on Encrypted Client Hello.
So super excited to talk about all of those and how they're changing the nature of privacy on the Internet.
Yeah, let's start with Microsoft Edge.
I'm kind of curious. Can you tell us a little bit about the history and how did we get to this announcement?
And like, most importantly, maybe what is the announcement?
Yeah, so the announcement, in short, is that now all Edge users on the normal Edge builds can enable a feature called Edge Secure Network VPN.
And what that means is that when they have this feature enabled, they toggle it on and Edge will automatically detect that they're in a situation, say like an open Wi-Fi network, in which the privacy of their network connection could be better protected.
And how that works is that the Edge browser connects automatically to Cloudflare's network and we have a privacy proxy platform that proxies those connections and sends them on their way to origin sites.
How did this kind of come together? How did we work together with Microsoft to bring this to life?
Yeah, so like any partnership, I think a lot of times these things take a while to come to fruition.
So Microsoft approached us a few years ago, and they were interested in building this feature.
I think that, in general, network security being built into browsers and operating systems is somewhat of a trend and a trend that I'm really glad to see is happening in the privacy space.
A lot of times when we're browsing the Internet by default, you know, we don't realize that there is a lot of tracking or that, say, we might be connecting over an insecure HTTP connection.
So I think Microsoft really took note of the different trends that were happening in the space.
And they were excited that Cloudflare offers this type of service and this type of platform.
And so we started prototyping things out. We kind of got a proof of concept together.
And that seemed to be working really well. It helps that a lot of the technologies underneath it were already proven at scale.
For example, with our CDN and with, you know, our WireGuard-based VPN, WARP, and also with our proxy implementation with Private Relay.
So the POC went well, and we're super excited to work with them on bringing it to GA.
Cool. I always love learning about like how do these big companies like work with each other and how does the history kind of play out.
So I'd love to learn more about the technology.
But before we get into that, you and I are kind of like in this privacy space and we're used to talking to like super tech savvy users and people that talk about privacy all day long and tracking.
But how would I explain this to like my grandmother?
Like she can barely use a smartphone. How do I explain to her like what this means and how this kind of improves her life?
Yeah, that's a really great question.
And I think it's something that I'm super passionate about in the privacy space, because sometimes we talk about different threats in very technical terms.
But at the end of the day, I always like to take a step back and explain a little bit that when the Internet was first designed, it was not designed with privacy in mind.
And we didn't have private and secure ways of communicating with each other on the Internet or visiting websites or anything like that.
So you can imagine that once people started wanting to do things like banking online, making credit card purchases, and communicating privately, the Internet standards that we had needed to evolve.
So what I would say is that usually when you're using a browser, you know, you can use any type of browser from Chrome to Firefox to Safari.
A lot of times there are different threats that can arise from just regular browsing on the Internet.
One of these threats is persistent tracking of an IP address. Your IP address is basically an identifier that advertisers or third-party services often attach to a person or an identity, even though that's not necessarily correct.
And the way that they use it is that they sit on the network, look at the sites that you're browsing, and then associate that activity across the different websites and applications that you're using.
And so when you're using something like Edge Secure Network, you're actually mitigating that threat by using Cloudflare as kind of like a secure tunnel to visit every one of those websites or applications.
And so the only thing that those third-party trackers can actually see is the IP address of the Cloudflare network, not the IP address of the device or router that you're using in your home or on your phone or anything like that.
Got it. Yeah, that makes sense. Okay, so shifting gears back to the techno nerds.
Let's talk about technology. Can you talk a little bit more about like what type of technology are we using here and like how does it actually work?
Yeah, so I'm really excited to talk about this because I think it's a really cool and innovative design.
We use a combination of two different technologies and both of them are actually based on open Internet standards.
So for folks who don't know about open Internet standards, they're something that Cloudflare is really passionate about, and we are very involved in a standards-making body called the IETF.
And for the control plane of the privacy proxy platform, we actually use something that's called tokens.
The standard that it's based on is the Privacy Pass standard, and that is the control plane that lets us control which clients are allowed to connect to our privacy proxy itself.
And the way that that works, and this is obviously glossing over a lot of details, is that there is an issuer that Cloudflare operates, and there is the Microsoft Edge client.
When a valid user wants to connect to the service, meaning they're using a valid Edge client and are logged into their account, the client asks the issuer for a token, and the issuer gives them a token based on the validity of that information.
The client then uses that token to connect to the proxy. So the proxy itself is the one validating that those tokens are indeed the correct type of token and that they were minted by a Cloudflare issuer.
And the reason why the Privacy Pass protocol is a cool design is that it allows us to test for the validity of a user or a Microsoft Edge client without actually knowing a lot of information about the Microsoft Edge user themselves.
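The issuer, client, and proxy roles described here can be sketched with a toy blind signature. This is only an illustration of the idea that an issuer can sign a token without seeing it, so the proxy can later check the token without linking it to the issuance request. The segment does not say which token type the platform uses, so the blind RSA construction, the tiny key, and every function name below are assumptions made for the sketch.

```python
import hashlib
import secrets
from math import gcd
from typing import Tuple

# Toy RSA key for the issuer. Far too small for real use; illustration only.
P, Q = 61, 53
N = P * Q   # public modulus (3233)
E = 17      # public exponent, known to the client and the proxy
D = 2753    # private exponent, known only to the issuer


def token_digest(nonce: bytes) -> int:
    """Map a client-chosen nonce to a number below N."""
    return int.from_bytes(hashlib.sha256(nonce).digest(), "big") % N


def client_blind(m: int) -> Tuple[int, int]:
    """Client hides m behind a random factor r before contacting the issuer."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r


def issuer_sign(blinded: int) -> int:
    """Issuer signs the blinded value; it never sees m itself."""
    return pow(blinded, D, N)


def client_unblind(blinded_sig: int, r: int) -> int:
    """Client removes the blinding factor, leaving a signature on m."""
    return (blinded_sig * pow(r, -1, N)) % N


def proxy_verify(nonce: bytes, sig: int) -> bool:
    """Proxy checks the token using only the issuer's public key."""
    return pow(sig, E, N) == token_digest(nonce)


nonce = secrets.token_bytes(16)
m = token_digest(nonce)
blinded, r = client_blind(m)
signature = client_unblind(issuer_sign(blinded), r)
print(proxy_verify(nonce, signature))
```

In this sketch the issuer only ever handles `blinded`, while the proxy only ever handles `nonce` and `signature`, which is the property that lets validity be checked without learning much about the user.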
The second part of the privacy proxy platform is the data plane, and that is the kind of tunnel that I was talking about before that is, you know, forwarding packets and creating this secure tunnel for all of your traffic.
And that's also based on an open Internet standard called HTTP CONNECT.
So most folks are probably familiar with HTTP; it's what we use to browse the Internet every day.
And CONNECT is a method within the HTTP standard that defines how a client asks a forward proxy to set up a tunnel.
And it's an oversimplification, but at the end of the day this data plane is basically a giant HTTP CONNECT proxy.
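For readers who have not seen the CONNECT method, the sketch below builds the request a client would send to a forward proxy to ask for a tunnel. It uses the HTTP/1.1 text form because it is easy to read; the segment does not say which HTTP version or authentication header the platform uses, so the `proxy_authorization` parameter and its placeholder value are assumptions.

```python
from typing import Optional


def build_connect_request(host: str, port: int = 443,
                          proxy_authorization: Optional[str] = None) -> bytes:
    """Return the bytes a client sends to ask a forward proxy for a tunnel."""
    target = f"{host}:{port}"
    lines = [f"CONNECT {target} HTTP/1.1", f"Host: {target}"]
    if proxy_authorization is not None:
        # Hypothetical: how a token is actually presented is not covered here.
        lines.append(f"Proxy-Authorization: {proxy_authorization}")
    # A blank line ends the request headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_connect_request("example.com")
print(request.decode("ascii"))
```

Once the proxy answers with a 2xx status, the client sends its own TLS handshake for the destination through the tunnel, so the destination sees the proxy's IP address rather than the client's.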
Cool, that sounds super exciting.
So I'm psyched. Let's say I want to actually use this.
What do I do? Can I sign up today or is there somewhere where I can follow updates?
Yeah, so it is available in the Microsoft Edge browser today. If you go ahead and check out my blog post, there is a link to the Edge Secure Network VPN kind of splash page for the feature.
And basically it'll guide you to download, if you haven't already, download the Microsoft Edge browser.
And once you have that downloaded, it's pretty simple in the settings to toggle on that feature.
Cool. So what's next?
Do you have any other sneak peeks that you can share today?
Yeah, as far as this feature with Microsoft Edge goes, we're kind of wrapping up the development side of that and we've proven it at scale and it's in production.
But we do have some other privacy proxy partnerships in the works. I can't share exactly which partners they're with right now, but there are really interesting threat models and really interesting privacy benefits to be gained.
So I'm really excited to push things forward there.
Cool. Always excited to learn more.
Yeah. So should we turn to ECH? I would love to hear a little bit more about what ECH is, why it evolved, and a little bit about its history.
Yeah, I'd like to talk more about ECH.
So ECH stands for Encrypted Client Hello, and it's a new standard.
You talked a little bit about standards and the IETF, the Internet Engineering Task Force, where we're active participants.
Encrypted Client Hello is a new proposed standard, and it kind of operates on the opposite side of the technology that you were just announcing.
So the Microsoft Edge side of things really allows consumers or any user to improve privacy themselves.
But we want to tackle it from all the sides all the time.
So that's where ECH comes in. A little bit of a history before we kind of talk about how ECH works.
So way, way, way, way back in the day when HTTP was introduced, everything was basically plain text, right?
You visit a website, let's say example.com. Nothing is encrypted. Anyone can view what you're doing.
It's just bytes across the line. So your ISP can see what you're doing, any other transit provider, anyone can basically see the plain text website.
And that includes stuff like your credit card or your banking details or all sorts of scary stuff.
So there was an immediate need for like, well, let's figure out a way to encrypt this.
And that's where SSL and later TLS came from.
So this encrypts the payload, like what you're doing on a website.
That still leaves kind of like two things up in the air that allow people to kind of follow along on what you're doing, assuming they're in like the network path.
So let's start with DNS. DNS is a protocol. It's one of the first things your computer does when you're visiting a website.
So let's stick with example.com.
The first thing you need to know is like which IP to connect to, like what is the IP of the actual web server that serves that website.
So your computer performs the DNS request saying like, hey, I have example.com, please give me an IP address.
And that again is normally done over plain text, like DNS doesn't have any encryption built into the protocol out of the box.
So we launched DoH, which is DNS over HTTPS, a couple of years ago, which essentially masks that. Browsers such as Firefox and others have built in support for it.
So that masks like which website you're requesting.
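To make the plain-text point concrete, here is a minimal sketch that builds a classic DNS query in wire format and prints the raw bytes. The function name and the fixed query ID are assumptions for the example; the layout (12-byte header, length-prefixed labels, type and class) is the standard DNS question format.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record in wire format."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


packet = build_dns_query("example.com")
print(packet)
```

The printed bytes contain `example` and `com` as readable text, which is what anyone on the network path sees when the query is sent unencrypted. With DoH the same query travels inside an HTTPS connection instead.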
But the only problem then is once you're actually connecting to the website over TLS. So you know which IP address you're connecting to, but you're doing the TLS handshake saying, hello, I would like to connect to something like example.com.
That handshake is still plain text.
And basically the first packets that your computer sends when requesting websites still contain the name of the server.
So, assuming it's an encrypted connection, no one can see what the user is doing on the website, but they can still see which website someone is visiting.
And that's the last puzzle piece that we tried to solve with ECH.
So, in short, the RFC, or the standard by the IETF, proposes that you split up the client hello, which is the first part of the TLS handshake, into two parts: an inner and an outer part.
And it essentially negotiates the outer part with like a common name.
So for Cloudflare, we implemented this using a domain called cloudflare-ech.com.
And anytime anyone visits any website that has ECH enabled on Cloudflare, that's the only SNI like server name indication that any intermediary will be able to see.
Then, once that part is decrypted, the inner client hello, which isn't visible to intermediaries, contains the actual website.
So you can kind of see it as like a two-part TLS handshake.
One saying, hey, I want a secure connection to Cloudflare ECH. That's what everyone in the middle sees.
And then, by the way, secretly, I'm actually looking for example.com.
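The inner and outer split can be modeled with a small toy. Real ECH encrypts the inner client hello to a public key that the server publishes, using HPKE; the sketch below swaps in a shared key and a throwaway XOR keystream purely so it runs with the standard library, so it shows the shape of the idea and nothing about the real cryptography. The class and function names are assumptions, and the shared public name is written here as `cloudflare-ech.com`.

```python
import hashlib
import json
import secrets
from dataclasses import dataclass


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher (SHA-256 keystream XOR). NOT HPKE and not secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class ClientHelloOuter:
    sni: str                # visible to anyone on the network path
    encrypted_inner: bytes  # opaque without the server's key


def build_client_hello(real_site: str, public_name: str, key: bytes) -> ClientHelloOuter:
    """Put the real site name in the encrypted inner part only."""
    inner = json.dumps({"sni": real_site}).encode("utf-8")
    return ClientHelloOuter(sni=public_name,
                            encrypted_inner=_keystream_xor(key, inner))


def observer_view(hello: ClientHelloOuter) -> str:
    """What an intermediary can read: only the outer server name."""
    return hello.sni


def server_decrypt(hello: ClientHelloOuter, key: bytes) -> str:
    """The server recovers the real site name from the inner part."""
    inner = _keystream_xor(key, hello.encrypted_inner)
    return json.loads(inner.decode("utf-8"))["sni"]


key = secrets.token_bytes(32)
hello = build_client_hello("example.com", "cloudflare-ech.com", key)
print(observer_view(hello))
print(server_decrypt(hello, key))
```

Every site behind the same public name produces the same observer view, which is the "sea of websites" effect described next.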
And this is really great because Cloudflare operates a lot of websites.
If you want to, or if you enable this on Cloudflare, all of a sudden you're kind of like masked in the sea of websites that are on Cloudflare.
We're very much trying to help hide which websites people are visiting.
And this is really interesting to parties. Let's say you are a provider of medical services.
And you are apprehensive about an intermediary seeing people visit your website.
You maybe don't want intermediary networks or ISPs to be able to see that random users are visiting your website to perform a certain medical procedure.
The name of your website might include something that's really, really sensitive already.
And having ISPs track that you're visiting that website is already kind of an invasion of privacy in our opinion.
So for those types of websites, we definitely advocate:
go onto the Cloudflare dashboard and enable it today.
That's a mouthful, but go ahead.
I was going to ask: obviously, within the Internet standards space, it seems like this is getting a little bit of traction.
Is this something that the onus is on the customers to turn on and adopt?
Or do you see this becoming the de facto way that all requests on the Internet are served?
Yeah, so there's a bunch of moving pieces here, which are exciting.
So one, we very much had to build this out in tandem with other browsers.
So we work closely with the major browsers.
And you can see in our blog, both Chrome and Mozilla have at least indicated their intent to ship and ramp up support for ECH.
So it's incredibly humbling to work at a company like Cloudflare where we get to work together with these standards.
You're sitting between the IETF, Cloudflare websites, and these browsers to bring this together.
And it really allows us to move technology like this forward really, really quickly.
In terms of rollout on our end, we've enabled this by default for all free customers.
You can opt out on the dashboard.
If you're a paying customer, we don't like making changes on your behalf. So it's very much an opt-in.
But you can rest assured that you're already going to join a very large amount of websites.
So you'll immediately be masked in that sea of random websites that all advertise ECH.
If you do want to sign up, we have an opt-in waiting list right on the dash.
The blog contains a link. Or you can also just browse to the Cloudflare dashboard and go to the edge certificates section.
And you can join there.
So we're really excited. It's kind of like a perfect storm on our end of browsers supporting it and then we're supporting it.
And being able to bring this to life really, really quickly.
Whereas with older standards, like if you look at something like HTTP/2, it took years for everyone to adopt it.
And right now we're moving in weeks or months.
So that's very exciting to me.
Yeah, that is really exciting.
Is there a call-out to any other developers or folks that might be interested in this space for what they can do to support the adoption of ECH?
Sure. So if you go to our dashboard, you can enable it right now.
That being said, it's a completely open standard. We are huge proponents of anyone supporting ECH.
There's nothing stopping you from running this on your own web server
if, for whatever reason, you don't want to join the Cloudflare herd.
If you're looking to support this or follow along, the RFC by the IETF is still in proposal state.
So you can follow along there. And we'll continue to put up blogs to recap how progression is going from our point of view.
Awesome. Yeah, that's really great.
It's kind of like the perfect mirror of privacy, right? Like Cloudflare has a bunch of proxying and VPN-style products that I talked about and that I manage.
Like I said, there's the Microsoft Edge partnership that we announced this week.
But we also still have our WireGuard-based WARP, both our consumer version and our Zero Trust version.
But at the end of the day, a lot of people aren't going to use VPN or proxying style solutions for all of their traffic.
But it's still super important that their network metadata stays private and stays safe.
And so I think ECH is actually a huge way that we're fundamentally changing that and making it more private.
Yeah, absolutely. I'm just excited that we're really trying to attack this from both ends.
We're putting technology in the hands of like end users.
But with ECH, we're also putting technology in the hands of people who operate websites.
You can't control whether your users or your, like in the medical example I was using, whether your patients use a VPN.
But you can turn on ECH and help them that way.
And I'm super excited that we're attacking it on all fronts.
So that's about like privacy preserving technologies. But lastly, you mentioned the privacy preserving measurement blog.
Can you tell us a little bit more about that?
Yeah, so this is a relatively new initiative, at least within the realm of Cloudflare, that we're launching.
It is also based, unsurprisingly, in the standards world.
The working group at the IETF is called PPM, or the Privacy Preserving Measurement Working Group.
And the standard that is emerging from this group is called the Distributed Aggregation Protocol, or DAP.
And a lot of the concepts around DAP are very complicated. But I think that, you know, if we speak to the use case and the problem that it's solving, it's a little bit more clear.
Basically, a lot of times we want to collect measurements and to collect analytics about the products that we're building.
As developers, right, or as people trying to measure things, we want to collect those aggregate measurements.
But we actually don't really need to know what each individual data point is for that measurement.
Examples of this, a very simple example of this is like a poll or a vote, right?
You want to know the final output or the outcome of that poll or vote.
But you don't really need to know, or store long term, which candidate, item, or choice each individual person voted for.
The same goes, for example, if you're conducting a census: you might want to collect the aggregate result of how many people live in each different area, but you don't actually need each individual response attached to the person's name.
Similarly, this also applies to browser telemetry or developer tools. If you're a browser developer, you want to collect aggregate analytics about, say, which website crashes most often in your browser,
or the network error logs, for example, but you don't actually want to store information about each client and which website they're visiting, right?
This is the problem that the Privacy Preserving Measurement Group is tackling and that the DAP protocol is working on.
So basically, at a super high level, the approach here is based on a concept called MPC, or multi-party computation.
You take one of these measurements, the individual client measurements, and you do something called secret sharing.
Basically, it's splitting that measurement into two different pieces and encrypting it to two separate aggregator servers.
So you could take my voting example, and you could say that Achiel voted for candidate A and Mari voted for candidate B, and you have two different aggregation servers.
I would split my measurement into two different pieces, maybe a numeric value or something that indicated the candidate, and I would send half of my measurement to aggregator server one and half of my measurement to aggregator server two, the same for you.
And those aggregator servers would collect each secret shared input over a population of people.
So say up until, I don't know, a thousand people, right?
Until they have a minimum threshold for the secret shared inputs, then they compute a statistic.
So in our case, let's just say it's a simple sum.
In reality, this can apply to machine learning. You can do linear regressions. You can do means, things like that.
But let's say it's a simple sum. Those two aggregator servers calculate that simple sum and then put the pieces together at the end using something called a collector service.
So that's kind of at a high level how this multiparty computation solution is working.
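The voting example can be sketched with additive secret sharing, which is one simple way to split a number into two pieces that are each meaningless on their own. This is a minimal sketch under stated assumptions: the modulus, the class and function names, and the batch size are all invented for the example, and it leaves out the encryption to each aggregator and the input validity checks that a real deployment would need.

```python
import secrets
from typing import List, Tuple

MODULUS = 2**61 - 1  # all arithmetic is done modulo this value (assumed for the sketch)


def split(measurement: int) -> Tuple[int, int]:
    """Split one measurement into two shares that sum to it modulo MODULUS."""
    share_one = secrets.randbelow(MODULUS)
    share_two = (measurement - share_one) % MODULUS
    return share_one, share_two


class Aggregator:
    """Collects one share per client and only ever outputs a sum of shares."""

    def __init__(self, min_batch_size: int) -> None:
        self.min_batch_size = min_batch_size
        self.shares: List[int] = []

    def receive(self, share: int) -> None:
        self.shares.append(share)

    def aggregate_share(self) -> int:
        # Refuse to aggregate until enough clients have reported.
        if len(self.shares) < self.min_batch_size:
            raise ValueError("not enough reports to aggregate yet")
        return sum(self.shares) % MODULUS


def collect(agg_share_one: int, agg_share_two: int) -> int:
    """The collector combines the two aggregate shares into the final sum."""
    return (agg_share_one + agg_share_two) % MODULUS


votes = [1, 0, 1, 1, 0]  # 1 = voted for candidate A, 0 = voted for candidate B
agg_one = Aggregator(min_batch_size=5)
agg_two = Aggregator(min_batch_size=5)
for vote in votes:
    s1, s2 = split(vote)
    agg_one.receive(s1)
    agg_two.receive(s2)

total_for_a = collect(agg_one.aggregate_share(), agg_two.aggregate_share())
print(total_for_a)  # 3
```

Each aggregator sees only random-looking numbers, so neither can tell how any one person voted unless the two collude, yet the collector still recovers the correct total.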
And Cloudflare is also working on a workers-based implementation of this aggregator server service.
So we're working with some other folks in the space on how we can be operating one of these independent aggregator servers to do this calculation, to do those aggregation jobs, and enforce the kind of privacy that multiparty computation is looking at and advocating for.
Great. Very, very cool. I think that kind of wraps up our segment for today.
But super excited to learn about Microsoft Edge and our partnership there as well as privacy preserving measurement.
Shout out, everyone. If you're a Microsoft Edge user, go out, try the new service today.
If you're a paying customer on Cloudflare, please just go on and turn on ECH right now.
It's available right on the dashboard.
And stay tuned on our blog for more updates. Anyway, thanks everyone for joining.
Thank you so much. Bye.