The Future of Privacy-Preserving Technology
Presented by: Wesley Evans, Marwan Fayed, Nick Sullivan, Tara Whalen, Chris Wood
Originally aired on April 9, 2021 @ 7:30 PM - 8:00 PM EDT
Cloudflare Research is dedicated to advancing the state of the art in Privacy Preserving Technologies with computer science research. Today, the team leads will discuss what they see as the future of this space.
Featuring:
- Nick Sullivan, Head of Research, Cloudflare
- Wesley Evans, Product Manager, Cloudflare
- Tara Whalen, Research Lead, Cloudflare
- Chris Wood, Research Lead, Cloudflare
- Marwan Fayed, Research Lead, Cloudflare & Adjunct Professor, University of St Andrews, UK
Privacy Week
Transcript (Beta)
Hi and welcome to the Future of Privacy Preserving Internet Technology segment on Cloudflare TV.
My name is Wesley and I'm the PM for the research team. With me today, I've got Nick Sullivan, Tara Whalen, Marwan Fayed and Chris Wood.
I'd love to first welcome everyone to our segment and say thank you so much for joining us during Privacy Week.
Cloudflare is committed to helping build the next generation of Internet privacy preserving technologies and rolling those out broadly, not just for our own customers, but for the good of the Internet.
Before we jump in and really start talking about how the research team sees the future of Internet and privacy preserving technologies, I want to go around the horn and introduce all of our panelists.
So starting with Nick. Nick, a little bit about yourself and what you do at Cloudflare.
Hi, so I'm Nick Sullivan. I'm the head of research here at Cloudflare, and I'm very interested in how we can apply technologies to make the Internet better.
And by better, I mean faster, more secure, more efficient, more private.
And my personal favorite way to do this is to apply cryptography, but I also love all the realms of computer science.
And so I'm excited to share what we've been doing for this entire year and what we're launching this week and share our directions, what we see as the future of research here at Cloudflare.
Awesome. Thanks, Nick. Tara, why don't you take us next?
Thanks, Wesley.
I am Tara Whalen. I am the research lead for privacy here at Cloudflare.
So I've had a long privacy career with a lot of different roles. Here at Cloudflare, I'm enjoying applying my skill sets to helping us make more privacy protective technologies for customers, for our end users, and of course, for the Internet in general.
And so my work has been on very specific projects here at the company, but I also work with larger standards bodies on making a more privacy protective Internet as well.
Awesome. Thanks, Tara. Marwan.
Thanks, Wesley. I'm Marwan Fayed.
I was until recently a full academic professor with no plans to join an industry lab, and yet here I am.
I think that says something about the environment.
My own domain tends to focus on networking and systems. In the past, I have worked on streaming video support at the network layer, on routing and transport, and co-founded an ISP for rural and remote broadband backhaul services. And, like everybody else in the room, I'm broadly interested in computer science and the engineering domain.
Great. Thanks, Marwan. And Chris.
Yeah. Thanks, Wesley. My name's Chris.
I am a research lead here on the team, focused on networking, security, privacy, and applied cryptography, and generally how they all intersect with various protocols and projects here at Cloudflare and beyond.
I participate in standards bodies in various capacities to help advance and improve and ship these things at large scales, and I'm looking forward to talking about this stuff.
Awesome.
Thanks, Chris. And Chris may be the luckiest person on the call. If you hear the rumbling in the background, Chris is surrounded by a slew of dogs.
I didn't know if you could hear that.
And so he has the most pets out of any of us. And I'm Wesley Evans.
I'm the PM for the research team. Over my career, I've worked in a lot of different environments, research labs at Snapchat, doing engineering and product work at legal technology companies.
And here at Cloudflare, I work with the research team to help us think through and productize a lot of the work that's been on the lab bench and help bring it out into the world.
So things like Oblivious DNS, for instance, and future work on our blockchain space and the distributed web.
I'm really interested in how we can build a world that's more connected and secure and private so that we can help bring more people together in physical ways and in the virtual spaces as well.
So that's the research leads team. And what I'd like to do now, though, is ask Nick to tell us a little bit more about just what is Cloudflare research?
Why does Cloudflare have a research team and what do we work on?
Yeah. So Cloudflare research is Cloudflare's research lab. Surprise, surprise.
It's a small group of dedicated experts who try to tackle difficult problems in computer science that have a timeline of three to five years and beyond.
And our remit is working with experts in academia, experts across other companies and standards bodies, anybody who shares our vision to help build a better Internet and take these technologies and incubate them inside the company, advertise them outside the company, and make sure that these nuggets of ideas that help make the Internet better in various ways can grow and become something that affects everybody online.
And so Cloudflare is a fantastic place to do this work because of the global reach of Cloudflare's network and the commitment that the company has to being able to take technologies that are broadly good for the world and apply them and ship them quickly.
We have an amazing engineering team that we work closely with, and the technologies that we build aren't toys that are put up on a shelf. They're real production systems that work with Cloudflare's real production systems, and we work hand in hand with that team to ship things like all the announcements that we're making this week, and that we've made over previous years, that really make a difference in making the Internet better.
One of the areas that we're really concerned with and we think there's a lot of opportunity to grow, and this is one way in which we help identify problems to solve, is where is there a gap?
What is there in the Internet, the way that it's built right now, that has something that was maybe not done right the first time around?
Or where is there a gap? Where is there something that we can actually improve?
And privacy-enhancing technologies are one of those areas, one that applies to what's happening on the Internet.
So the Internet's connected together, there's now trillions of devices that are online with IoT and with all the connected people around the world and there are standards for which these different participants and these actors and these components all fit together.
And a lot of these standards were created decades ago and they had different constraints.
The Internet was smaller at the time, it had fewer participants, and there was more personal trust with respect to who your ISP is or what website you're going to.
And as the Internet has grown, there's been more of a need for security and the ability to understand and trust who it is that you're talking with and who you're sharing information with online.
And as the Internet has kind of exploded this year in particular, we've moved most of our lives online for those in the information realm.
There's more and more concern about where this data goes, the data that you're using to connect to various websites. And so your relationship with different organizations and different websites is really the core of what data security is.
So what is it that you can share with the website?
These websites have privacy policies that go along with it.
That's kind of the top layer of privacy on the Internet is you have all these different organizations, all these different tools, these apps, and you communicate with them.
But under the hood, the part of the Internet that people don't usually see are protocols like DNS, protocols like TLS, protocols like HTTP.
These are the fundamental building blocks of how data is shared from different places around the Internet.
And over the last few years, as we've helped improve the security of these protocols, we've also noticed that there are some holes with respect to privacy.
And so this week we're focusing on three specific technologies that help take something that's out there, that's existing, that's used by millions and billions of people, DNS, HTTPS, and even passwords, and taking these three technologies and finding a way to make it so that the data that is personal to you, that transits these different protocols, is something that you have more control over.
And that's something that we're really excited to do, and that's a big part of what Cloudflare Research does, is solving these problems.
And so the idea of building privacy into the Internet by default is what we're focused on right now, and what we're happy to talk about today.
Awesome. Thanks, Nick. And I just want to drill down on that a little bit too, because so much of what Cloudflare Research does, as you said, is we don't build toys that sit on the shelf.
We build real production services that run out in the real world.
And what we're currently talking about today is privacy-enhancing technologies, but the research team has a long history of building security-enhancing technologies, and that led us to find some of these gaps in what we're talking about this week.
Can you talk a little bit about some of the work that we've done in the past that's now running at global scale, maybe TLS, other areas that we've been working on?
Yeah. So there's so many fundamental protocols on the Internet that go under the hood.
You don't really notice what's happening. And one of the major ones is HTTPS.
And so you may know this as the lock icon in your address bar.
If you see a site that has the lock icon, it's using HTTPS, which is an encrypted and authenticated protocol.
Back before the research team was really Cloudflare Research, we helped out with this launch called Universal SSL.
And this was really where we helped solidify Cloudflare's role in helping make the Internet better.
So what we did at that time was make it free for all Cloudflare customers to have a certificate automated and controlled for their website.
And this was something that was actually very controversial at the time.
There was pushback about... It was assumed that HTTPS was a luxury feature, something that people would pay for, something that we would want people to upgrade to be able to do this.
And this was actually a big debate within the company. And we really sat down and put our foot down and said, no, we want to make the Internet better.
And we want it to be secure for as many people as possible. So Universal SSL was something that we launched just over six years ago to help make the Internet more encrypted and help move the world from a mostly unencrypted web to a mostly encrypted web.
And we played our part in that. And over the years, we've got involved in other aspects of this, one of which was TLS 1.3, which is the latest version of the encryption protocol that makes HTTPS so secure.
We got involved with the standardization of this, and we were the first to deploy it on a wide scale.
And in deploying it early, we helped identify different problems that came about through deploying it.
So when TLS 1.3 finally became an RFC, it was already running on so much of the Internet.
And this is something we consider to be a really great success, is using Cloudflare's scale to help bring new technologies to the widest audience as early as possible so that security issues can be found early and deployment issues can be found early, and other people can kind of join along on the ride with us.
Well, I love that you mentioned HTTPS and our early involvement in that, and also the fact that we were so early in that process and helped push it forward.
I want to talk a little bit, Marwan, about ODoH. Oblivious DNS over HTTPS is something that we launched this week.
It really is the next evolution of DNS itself. For those of you who don't know what Oblivious DNS is, it uses DNS over HTTPS, so it encrypts your DNS query, but it also obfuscates the client IP address.
So in a sense, you get this really secure and also really private methodology of utilizing the phone book of the Internet.
Marwan, I'd love to hear from you a little bit about how you see ODoH as one of the next evolutions of DNS, both from the perspective of someone who ran an ISP back in the day, and also talk a little bit about our partners that are involved with this.
Cloudflare isn't just launching this and then also running our own proxy, we're involved with a number of partners that are actually providing the proxy service, right?
Yeah, so with ODoH, Oblivious DNS over HTTPS, it's important to start with a couple of things.
First, we're lucky actually to have one of the co-authors of the draft in the room with us, that's Chris.
The second thing is, one of the things that makes this story particularly nice to follow is the DNS history.
So DNS has long been clear text. It's, as you say, the phone book of the Internet.
We use it to find out where we need to go to get information from web servers and the like.
But anybody could see your messages. So this is going to be familiar to many in the audience: from DNS over port 53 in clear text, we migrated to DNS over TLS and DNS over HTTPS.
And I don't think I would be alone saying that we were surprised, I think, everywhere in the community by the amount of attention that that got.
For a couple of reasons, one is people are concerned that somehow putting DNS in HTTPS violates the layering that we've built over the Internet, the separation of functions.
That might be true in the HTTP case, but it's important then to understand why DoH, the HTTPS version, dominates over DoT, the TLS version.
And DoT is preferable, I think, from many people's perspective.
The problem is it's still easy to manipulate or prevent DoT connections from getting through.
And so HTTP tends to be the one thing that will reliably get through on the Internet.
And so DNS over HTTPS tends to be the version that has dominated.
One of the criticisms on the back end, however, was that the world isn't ready for this, meaning at the time, there were only three or four providers of a public DoH service.
ISPs still have to catch up. One of the reasons for this, by the way, is that there's no discovery service.
This is going to be relevant in just a minute.
So the consequence is that there's a huge number of DNS queries that are going to a very small number of places.
And this raises eyebrows.
One of the benefits of Oblivious DNS over HTTPS, or ODoH, is that it addresses one of those concerns.
So it puts a proxy in the middle in such a way that the proxy can see the IP addresses on both ends, the client and the resolver, but cannot see the query.
And the resolver gets to see the query, but cannot see the originating IP address.
This is a lovely property to have. And there are a couple of things to say about this.
One is there's an evolution of oblivious and privacy preserving types of protocols on which ODoH builds.
The second thing is that ODoH is built in such a way that the queries are submitted on a per message or per query basis, rather than over one established pass-through connection.
And that's important to preserve privacy. And the third nice feature that was built into ODoH is that you can actually pick your proxy and pick your target.
And picking any one at any point in time does not restrict you from changing that in the future.
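The split of knowledge described above can be sketched in a few lines of Python. This is a toy model only, not the ODoH wire protocol: real ODoH encrypts each query to the target's public key using HPKE, while here a shared symmetric key and a hash-based XOR stream (`toy_seal` / `toy_open`) stand in for that step. The class names, the IP addresses, and the `resolve` helper are all hypothetical and exist just for illustration.

```python
import hashlib
import os


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy stand-in for HPKE encryption. NOT secure; illustration only."""
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_open(key: bytes, sealed: bytes) -> bytes:
    """Reverse of toy_seal for a holder of the same key."""
    nonce, body = sealed[:16], sealed[16:]
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class Target:
    """Resolver: reads the query, but only ever sees the proxy's address."""

    def __init__(self, key: bytes):
        self.key = key
        self.seen = []  # (source address, query) pairs

    def handle(self, source_ip: str, sealed_query: bytes) -> bytes:
        query = toy_open(self.key, sealed_query).decode()
        self.seen.append((source_ip, query))
        return toy_seal(self.key, f"answer for {query}".encode())


class Proxy:
    """Proxy: sees the client's address, but has no key to read the query."""

    def __init__(self, ip: str, target: Target):
        self.ip = ip
        self.target = target
        self.seen = []  # (client address, sealed message length) pairs

    def forward(self, client_ip: str, sealed_query: bytes) -> bytes:
        self.seen.append((client_ip, len(sealed_query)))
        return self.target.handle(self.ip, sealed_query)


def resolve(client_ip: str, name: str, proxy: Proxy, key: bytes) -> str:
    """Each query is sealed on its own, so any proxy can be chosen per query."""
    sealed = toy_seal(key, name.encode())
    return toy_open(key, proxy.forward(client_ip, sealed)).decode()
```

In this sketch a client could call `resolve` with one proxy for one query and a different proxy for the next; the target's `seen` list would then hold only proxy addresses next to the queries, and each proxy's `seen` list would hold the client address next to an opaque message length.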
And, you know, who are some of our proxy partners that we've, you know, early stage gotten on board with us?
Right. So one of the things that Cloudflare has gotten very good at doing is building these types of partnerships with people.
So we have a few different partners. First, the draft itself is coauthored between us and Apple and Fastly.
And at launch, we are partnering with PCCW, with Equinix, who provide data center services, and with an educational research network in the Netherlands called SURF.
And they are working with us to provide proxy services that are available to the public, and crucially to what has turned out to be a growing number of interested clients.
So if you want to run this right now, there are open source implementations of everything on GitHub.
You can download them, run a proxy, run a target resolver, or even run a client.
But right now, the only way to use ODoH as a client is to download our client.
So we do have browsers, among others, that we're talking to, and they've started testing using the proxies and our target.
And we really hope that this alliance of people grows.
Thanks, Marwan. I'm super excited about this. As someone who cares a lot about privacy, I think that the ability to have something like ODoH baked into 1.1.1.1, which is Cloudflare's own DNS resolver, but then also be able to work with all these other partners, is a really amazing advantage for the Internet.
I think about how I can get these technologies not just into the hands of people who understand them at a really deep level, but into the hands of my mother, my father.
To have them be able to start seeing some of the benefits of ODoH long-term, I think, is just really exciting.
And so there's another protocol that we've also been spending a lot of time working on this week.
And while we haven't launched it yet, I am really quite happy because it fills in, as Nick was saying, another gap in the Internet.
So Tara, can you tell us a little bit about what ECH is and some of the work that Cloudflare is doing on it right now and why it's important for the Internet as a whole?
Sure.
I can talk a bit about Encrypted Client Hello. And again, as Marwan said, we are fortunate to have two of the authors of the spec here on the call.
So Chris and Nick are here.
So thank you for your work. And I also welcome you weighing in with corrections and additions as needed, because I know you've been pretty deep in this work for some time.
So with the evolution of TLS, sort of as a protocol that's designed for security and designed for privacy, it has evolved over time to become better, become more robust.
There've been a lot of issues, weaknesses identified over time.
And one of the issues that has happened is in the point where before the sort of full communication has started, where the handshake begins, where everyone is getting set up, they're negotiating their connection, information can be leaked at that point where you can then learn about, for example, which parties are communicating.
And the purpose of Encrypted Client Hello is to add an additional layer to ensure that less information can be leaked about, for example, what server someone is connecting to.
That information was previously exposed. There's now been more work to try to keep that sort of under wraps that a connection can still be effectively made over this complex infrastructure, but a lot of that metadata in a sense about who you might be connecting to won't be leaked to third parties.
So this has required a lot of work, a lot of work with preexisting protocols, for instance, in order to get this deployed.
This requires a lot of cooperation, a lot of work across a lot of systems.
So it definitely speaks to the importance of the work and the amount of heavy lifting that has to be done to get something like this rolled out.
But the payoff then you get is that you have far less data floating around about, for example, what is the specific server that someone is connecting to when they're trying to have a private communication with a party on the Internet.
There has been a lot of work towards getting this matured. There is still some work to be done, I believe.
There are some places where work on traffic analysis remains, to ensure, for example, that you don't leak whether someone is going to a very long domain name versus a very short one.
Those things won't actually be as evident if you try to do traffic analysis.
That's going to be worked on in the future.
But for right now, it closes off a large place where data about people's private communication could be leaked.
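The long-versus-short domain name concern mentioned above comes down to ciphertext length: encryption hides the bytes of a name but not how many bytes there are. One common mitigation is to pad names up to a fixed bucket size before encrypting. The sketch below shows that idea only; it is not the ECH wire format, and the function names and the 32-byte bucket are assumptions chosen for illustration.

```python
def pad_name(name: str, bucket: int = 32) -> bytes:
    """Pad a server name with zero bytes up to the next multiple of `bucket`.

    Hypothetical helper: after padding, names that fall in the same bucket
    have the same length, so length alone no longer tells them apart.
    """
    raw = name.encode("ascii")
    # Round the length up to a multiple of `bucket`, with at least one bucket.
    padded_len = max(bucket, -(-len(raw) // bucket) * bucket)
    return raw + b"\x00" * (padded_len - len(raw))


def unpad_name(padded: bytes) -> str:
    """Strip the zero padding added by pad_name."""
    return padded.rstrip(b"\x00").decode("ascii")
```

With these helpers, a short name and a moderately long one both pad to the same 32 bytes, while a name longer than 32 bytes moves up to the next bucket. Larger buckets hide more at the cost of more bytes on the wire.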
That's great. And I think it's such an important thing to be talking about because Cloudflare was one of the leaders in launching ESNI a few years ago.
And ECH is the next evolution of ESNI. Chris or Nick, you have both been deeply involved in writing the spec.
And as Tara mentioned, this isn't just Cloudflare coming up with something.
We work with a broad community of collaborators to refine and test ideas and co-develop these standards.
What's the process been like of moving from ESNI to ECH?
Do you want me to take this one, Nick?
Yeah, take it, please.
Yeah. The evolution of ESNI to ECH has been rather turbulent.
After we initially launched ESNI, we started to really carefully look at it under consultation with various academics and cryptographers in the community to see whether or not we actually got it right, whether or not we protected against all the relevant attacks.
And our threat model was consistent with what's relevant today.
We started finding small kinks in the armor, so to speak.
And eventually, we wound up just completely going back to the drawing board and asking ourselves, like, OK, it's clearly really hard to hide this one very sensitive field, the SNI.
And what can we do to hide everything and also prevent all these other additional attacks at the same time?
And in collaboration with all of the other people in the IETF, Google, Apple, Mozilla, Akamai, Fastly, you name it, everyone who's basically in the TLS working group participating on a regular basis has had some input and contributed to the specification in some way.
We converged on this sort of complete encapsulation mechanism that Tara was describing earlier, wherein all of the important metadata is effectively encrypted subject to traffic analysis and so on that we'll address in the future.
But it's really been this massive effort alongside people in the industry and academia.
And it's funny, because the initial effort, the motivation was so simple, like, just protect this one small field, like, no more than 255 bytes in a client hello.
And the number of things that have had to change and move to do that effectively and correctly is astounding.
And we're still not done. I mean, we're actively working on publishing the next version, which we hope will be an interoperable target for browsers and servers alike.
And then hopefully, we'll continue from there, assuming that goes well, to sort of the final state of ECH.
But yeah, this has been a long time coming.
We've learned a lot along the way. And hopefully, the end is near, but I don't want to speak too soon.
Yeah, it's so interesting.
I mean, a phrase that stands out to me to sort of pivot the subject a little bit is this idea of kinks in the armor, right?
The way that we've conceptualized the Internet from back in the day, there are sort of flaws in the underlying way that the world works that we might want to go back and rethink.
And you all clearly did that, moving from ESNI to ECH.
Something else that we launched this week is a new sort of passwordless protocol called OPAQUE.
And OPAQUE takes an entirely different viewpoint of understanding how passwords can and should be used by servers.
Can you talk a little bit about what OPAQUE is and why it's such a bleeding edge idea?
Is that a question for me?
Yes, it's a question for you.
Yeah, sure.
So OPAQUE is really interesting because it's fundamentally trying to address the problem of authentication on the Internet and authentication in other ecosystems and application scenarios as well.
And you would think, if you go back through the cryptographic literature, that authentication is a relatively simple problem to solve.
We know how to do public key cryptography. We know how to sign things.
Why is this still a problem today? And it's the usability aspect of it that really causes problems for us.
Passwords were born because we needed short, easy-to-remember strings that we could authenticate ourselves with, things that we know.
But unfortunately, the way they've been deployed on the Internet, as you've kind of pointed out, there's a lot of kinks in this particular authentication mechanism.
The first of which is that in the overwhelmingly common deployment model for passwords, which is to send your password over an encrypted connection between the client and server, the server still has to do something with this password.
And one of those things might be mishandling it, or accidentally writing it to a log file, or otherwise leaking it somehow.
And we've seen various instances in which passwords are leaked to detrimental effect.
Now, it would be great if we could move away from passwords altogether towards something like WebAuthn, which is this emerging web standard based on public key cryptography, and private keys, and YubiKeys, and Face ID, and all the things that modern devices seem to support.
But there are going to be legacy clients around for a while, and there are going to be users who are trained and accustomed to passwords.
And OPAQUE tries to fill this void, tries to address this problem of password mishandling in the presence of legacy devices, in the presence of the ubiquity of passwords.
And at a very, very high level, what it effectively does is use cryptography in such a way that, rather than revealing to the server during the authentication flow the actual value of the password, like "I love Cloudflare!", the client proves to the server in zero knowledge that it knows the correct password without actually revealing what that password is.
So when you say passwordless, it's not truly passwordless, it just hides the password from one party in the protocol.
And if you hide that password from the server, in this particular scenario, you remove the ability to mishandle it.
And this has been the theme for ODoH, like taking away IP addresses such that we can't mishandle them.
This has been the theme for ECH, taking away the SNI so the network can't mishandle it.
And this is just trying to, you know, put the right boundaries in place for sensitive information.
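One building block used by OPAQUE is an oblivious pseudorandom function (OPRF), in which the client blinds a value derived from the password before sending it, so the server can apply its own secret key without ever seeing the password. The sketch below shows only that blinding idea with toy parameters; it is not OPAQUE itself and is not secure. The modulus (the Mersenne prime 2^127 - 1), the hash-to-group step, and all function names are assumptions made for illustration, whereas real deployments use standardized elliptic curve groups.

```python
import hashlib
import math
import secrets

# Toy modulus: 2**127 - 1 is prime. Far too weak for real use.
P = 2**127 - 1


def hash_to_group(password: str) -> int:
    """Map a password to a nonzero integer modulo P (toy construction)."""
    digest = int.from_bytes(hashlib.sha256(password.encode()).digest(), "big")
    return 1 + digest % (P - 1)


def client_blind(password: str) -> tuple[int, int]:
    """Client: pick a random blind r and send H(password)^r, not the password."""
    while True:
        r = secrets.randbelow(P - 3) + 2
        if math.gcd(r, P - 1) == 1:  # r must be invertible modulo P - 1
            break
    return r, pow(hash_to_group(password), r, P)


def server_evaluate(server_key: int, blinded: int) -> int:
    """Server: apply its secret key to a value that looks random to it."""
    return pow(blinded, server_key, P)


def client_unblind(r: int, evaluated: int) -> int:
    """Client: remove the blind, leaving H(password)^server_key."""
    r_inv = pow(r, -1, P - 1)
    return pow(evaluated, r_inv, P)
```

After unblinding, the client holds the same value it would have obtained had the server computed on the password hash directly, yet the server only ever handled the blinded element. In a full protocol that output would feed into key derivation rather than being used as is.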
I couldn't have said that better myself.
You know, you really hit the nail on the head there with the theme for the week: the whole modus here has been blinding and obfuscating data, with OPAQUE, with ECH, with ODoH.
And I know we only have 58 seconds here, so I'm sorry to cut you off.
And I just want to throw it back to Nick: as you look out over the next year, or even this week,
what are you excited about? What are you thinking that Cloudflare Research is going to be showing off?
You know, can you give us a little sneak preview of stuff?
Yeah, so I'm very excited about ODoH. This is the only real product launch that we're doing this week.
OPAQUE is a demo. We hope people who try it hop on and join the OPAQUE train and help solve this problem with us.
And with ECH, this is going to be a world mover when it happens.
And I expect it to happen sometime soon.
Some of the themes that Chris hinted at here are with respect to zero knowledge.
Unfortunately, we have four seconds left before the clicker gets us.
Thank you, everyone, for joining us today.