🔒 From Idea to Internet: Deploying and Developing Privacy Enhancing Protocols
Presented by: Watson Ladd, Tanya Verma, Chris Patton
Originally aired on November 30, 2021 @ 8:30 PM - 9:30 PM EST
Developing new protocols and deploying them is a process that often seems mysterious and intimidating. It's actually an open process that anyone can participate in. Come learn how Cloudflare Research participates in standards development, implements new protocols, and works with our partners to help make a better Internet.
Privacy Week
Transcript (Beta)
Good afternoon to all our viewers around the world from sunny Berkeley, California. Today we'll be talking about the journey of a new protocol from an idea to running on servers around the world with documents describing it so anyone can implement it.
Both Chris and Tanya have worked hard to turn some ideas into reality.
Chris is a software engineer on the research team who has been working on a new feature for TLS called Encrypted Client Hello that aims to improve the protocol's privacy.
Tanya is also a software engineer on the research team.
She's been working on adding support for Oblivious DNS over HTTPS, or ODoH, a more anonymized version of DNS, to Cloudflare's resolver, 1.1.1.1.
So Chris, what does ECH do and how does it improve privacy?
Yeah, so ECH, I think we should first start off by saying, well, what is TLS?
In case people don't know. So, you know, when you're going to a website and you see that little lock, that green lock or gray lock in your browser, it's telling you that the connection to the website is secure.
And what that means is that it's encrypted.
So in order to encrypt traffic, the client and server first have to exchange a key.
And that's the job of the TLS handshake. So the TLS handshake is what we call a key exchange protocol, which is used by the client and server basically to agree on a key over an untrusted and potentially insecure network.
So that's the TLS handshake. So ECH is a new feature for the TLS handshake.
And what it does is essentially this. So when the client and server run the TLS handshake, they're exchanging a bunch of cryptographic parameters over the communication channel that are potentially privacy sensitive to the client.
So this includes things like, the big one is called the server name indication, the SNI extension.
The SNI is basically used by the client to tell the server which website it wants to go to.
So if you're going to a website behind Cloudflare, Cloudflare hosts a bunch of different websites.
And when the client's making a connection, Cloudflare has to be able to tell which origin server it wants to go to.
And so the SNI is how the client says, hey, I want to go to example.com.
And the server will provide what's called a TLS certificate for example.com.
So it's essentially this very clear signal of where you're going to.
Behind Cloudflare.
And it's also used all over the Internet. It's not just Cloudflare, it's other CDNs.
And it's also, you pretty much use the SNI anywhere on the Internet, no matter where you're going.
So, and then there's other things like there's an extension called ALPN, which is kind of, it's used by the client and server to figure out what they're going to do next after they've done the handshake.
So whether they're going to use HTTPS or some kind of email server, email protocol, something like that.
So yeah, there's all of these sensitive parameters that are exchanged during the handshake.
And the goal of ECH is to protect those parameters.
And so what it does is, the client learns a public key to use to encrypt its sensitive handshake parameters before beginning the handshake.
And in order to learn this key, it has to go over something called DNS.
Which brings us to ODoH, Oblivious DNS over HTTPS. So Tanya, what is ODoH and how does it improve privacy?
All right. So DNS, if you're not familiar, is the way that computers look up what the IP address of a certain domain name is.
So it's kind of like a phone book for the Internet, where you have the address for every single host name.
So like google.com has like a certain IP and so on and so forth. So when DNS started out, like it didn't have any authentication features.
So that means that if I asked a certain DNS server what IP address, you know, google.com was, it would tell me something, but I had no way of knowing whether or not like someone hadn't intercepted that connection and just served like a wrong IP address to me.
And this caused like lots of attacks, like DNS hijacking attacks, where like some other, some other like entity would say that google.com was like some other IP address and would take me to that page instead.
That can cause like all sorts of issues.
So one of the first things that happened in like DNS security was DNSSEC. And DNSSEC was adding authentication to DNS responses.
So if I asked a resolver what the IP address of google was, it would send me the response for like what the IP address was, but it would also send me some additional data that allowed me to verify whether or not it was the resolver that had sent me the answer.
So that was the first thing.
Then came in DNS over HTTPS. That was another addition. First off, let's back up a little bit and like try to figure out like why we even need privacy or like, you know, within DNS.
So a lot of ISPs around the world and like just like a lot of other entities, look at what DNS requests you're trying to get and like they block IP addresses based on DNS.
So that makes DNS a very attractive vector to secure because just like a lot of like privacy leaks happen through it.
So DNS over HTTPS was a protocol that was designed with this particular use case in mind.
So instead of sending an unencrypted DNS query to a resolver, we started sending it over HTTPS, which is an encrypted channel.
So it's just like an HTTP request that's being sent.
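The point that a DoH query is an ordinary HTTP request can be sketched in a few lines. This is a minimal illustration, not Cloudflare's implementation: it assumes the GET form of DNS over HTTPS, where the binary DNS message is base64url-encoded without padding into a `dns` query parameter. The resolver URL `https://resolver.example/dns-query` and both function names are hypothetical.

```python
import base64
import struct


def build_dns_query(hostname: str, qtype: int = 1, query_id: int = 0) -> bytes:
    """Encode a minimal DNS query with a single question."""
    # Header: ID, flags (0x0100 = recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is prefixed by a one-byte length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = Internet).
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(resolver_url: str, hostname: str) -> str:
    """Wrap the DNS query in a URL that a plain HTTPS GET could fetch."""
    query = build_dns_query(hostname)
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


if __name__ == "__main__":
    # Hypothetical resolver endpoint, used only for illustration.
    print(doh_get_url("https://resolver.example/dns-query", "example.com"))
```

Anyone watching the network sees only a TLS connection to the resolver; the hostname being looked up travels inside the encrypted HTTP request.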
And so this prevented people outside of that TLS connection, that is, anyone other than the client and the resolver, from being able to see what DNS hostname was being queried.
And so this was pretty good, but it still allows the resolver who is getting the client's query to see what the client's IP address is.
So like they can have a map of, you know, like what every single client is querying.
So that's also kind of bad.
So then there was this other idea called DNS over Tor, which basically sends DNS queries over the Tor network.
And so that's like over multiple hops with like, you know, onion routing and everything.
And that's very slow, like everything in the Tor network.
So another solution that was proposed recently was Oblivious DNS over HTTPS, which is the protocol that we were trying to talk about in the first place.
So ODoH, what it does is that it separates the client IP from the query by encrypting the client's DNS query with the resolver's key.
But instead of directly sending the query to the resolver, it sends it through a proxy.
So this ends up decoupling the client IP from the query, since the proxy cannot see the query as it's encrypted by the key from the target, the target resolver, and the target cannot see the client IP since the message came via a proxy.
So in a sense, it's kind of like a one hop onion routing setup.
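The split of knowledge between proxy and target described above can be sketched with a toy model. Everything here is an assumption for illustration: `toy_seal` is a deliberately insecure stand-in for the real public-key encryption (it is a shared-key XOR stream, not what the protocol uses), and the class names and IP addresses are hypothetical. The only thing the sketch aims to show is who gets to see what.

```python
import hashlib
from dataclasses import dataclass, field


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy stand-in for encryption to the target. NOT secure, illustration only."""
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


toy_open = toy_seal  # XOR with the same keystream undoes itself


@dataclass
class Target:
    key: bytes
    seen: list = field(default_factory=list)  # (source address, plaintext query)

    def handle(self, source_ip: str, sealed_query: bytes) -> bytes:
        query = toy_open(self.key, sealed_query)
        self.seen.append((source_ip, query))
        return toy_seal(self.key, b"answer for " + query)


@dataclass
class Proxy:
    ip: str
    target: Target
    seen: list = field(default_factory=list)  # (client address, opaque bytes)

    def forward(self, client_ip: str, sealed_query: bytes) -> bytes:
        # The proxy knows who is asking, but only ever handles ciphertext.
        self.seen.append((client_ip, sealed_query))
        return self.target.handle(self.ip, sealed_query)


def client_lookup(client_ip: str, proxy: Proxy, target_key: bytes, hostname: bytes) -> bytes:
    sealed = toy_seal(target_key, hostname)
    return toy_open(target_key, proxy.forward(client_ip, sealed))


if __name__ == "__main__":
    target = Target(key=b"toy-key")
    proxy = Proxy(ip="198.51.100.7", target=target)
    print(client_lookup("203.0.113.5", proxy, b"toy-key", b"example.com"))
    print("proxy saw: ", proxy.seen)   # client IP plus unreadable bytes
    print("target saw:", target.seen)  # proxy IP plus the plaintext query
```

The decoupling only holds as long as the proxy and the target do not share what they each saw.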
Um, so that's essentially what ODoH is. And ODoH got started at Apple as a protocol, because Apple runs a DNS resolver.
If you own a Mac, then you probably are using Apple's DNS resolver.
And Apple tries to be somewhat privacy focused. So they came up with this idea that, hey, we want people to use our DNS resolver, but we don't want to know what their client IPs are.
So like this, this could be like a way to go.
And so this protocol is still in the draft stage, and we'll get to what that means later.
But essentially, it's being worked on. It's like still work in progress.
So right now, if I wanted, if I walked to well, ignoring the current situation, if I was sitting in a coffee shop, say a year ago, and I was logged into their Wi Fi, they could see the names of every website that I visited.
And they'll see that because they'll see the DNS queries I was making, and they could read them off the wire, even if I was using another resolver.
And they can look at the SNI and look at all the domain names there.
And now, you know, DoH closes a little bit of that.
With Oblivious DoH, not only can they not see the queries, nobody can see the queries and who I am.
And with ECH, encrypted client hello, nobody looking at the wire can see the domain name at all.
That's basically it. Yeah, I mean, with these two standards together, I think that we're pushing the ball forward and making the Internet a little bit more private.
For sure. Yeah. So let's talk a bit about we sort of touched on that these are early early stages of standardization.
So where is a protocol described right now? When the people who run an ODoH proxy, or our server, and the people who make browsers like Firefox and Chrome write their code, what do they look at to make sure that what they write will work with what we write?
So there's one standard document that everyone looks at, which is an IETF draft or RFC.
So the IETF is an organization that basically specifies every single protocol very explicitly.
So everyone can look at the spec and implement their own version of it.
And it will conform, it will be interoperable with someone else's implemented version because they both match to the spec.
So the spec is just kind of like an interface. If you program in Go or, you know, any object-oriented language, you have certain interfaces that you can implement certain classes from.
And so like, that's kind of what a spec is like, you can have different implementations of it, but you all can interoperate.
Except that it's in English, which makes things harder.
I mean, there's, you know, I, so I, I don't know if you guys know this, I think maybe you do, but the Internet doesn't know this.
I got my PhD in cryptography.
And what my thesis was about is sort of the gap around what we call provable security.
So the goal of a cryptographic protocol is that it's secure in the sense that it meets some mathematical definition of security, and I can prove it does based on some well-established assumptions.
And there's a significant gap between provable security and what a spec is, because, you know, they work really hard to make the spec as precise as possible.
But it's essentially English.
I mean, we should talk, I think, about how they're developed, because it's a lot of people's perspectives being dumped into a single English language document, or, you know, there's standards in other languages as well.
And there's a lot of chances for a given implementation to either not match the spec or just interpret something that was ambiguous in the spec in some way that differs from another implementation.
So interop only happens when we actually like test that our different implementations of the same protocol actually talk to each other.
And I, I don't know, I, I think that's fun.
It's one of the, it's one of the fun things about protocol standards.
So when you talk about the standards of the IETF, there's, it's not like the whole IETF, which is this huge organization, is, is working on every single one.
There's like different groups that are in different areas. How does that work?
I think, I think you should answer that question, Watson. I think you know better than Tanya or I.
I mean, I've been like following the TLS working group pretty closely.
I don't know if ODoH is in its own working group. I don't know, Watson, why don't you first say what a working group is?
I don't think so, yeah.
You know, it's never really nice as a host when your guests do this to you.
The whole point of having guests is so they talk.
Yeah, so, so part of Internet standardization is that the Internet is extremely complicated.
And these working groups take on certain areas and protocols and maintain them.
And specifications can come in, and there's various ways in which they can become what's called an RFC, which is part of the Request for Comments series.
And they describe the technical infrastructure of the Internet.
And there'll be drafts, which anyone can submit.
And a working group might form, or there might already be one that adopts the draft and says, yes, I want to work on this.
And hopefully produces a document that describes the consensus of the working group.
And then it goes on through a long and mysterious process that ultimately culminates in it becoming an RFC or not.
But the process isn't that mysterious because it's public.
I mean, that's one of the cool things about the IETF is that anybody gets to participate in the standardization process.
Oh, yeah. There are gigantic mailing lists where you can see every step of the process.
And there's lots of tracking of where things are and what they go through.
When I was an intern, I signed up for like- At Cloudflare?
Intern at Cloudflare? At Cloudflare last summer.
I signed up for a bunch of these mailing lists. And now they just crowd out my inbox.
But it's really fun to follow the conversation. If you're interested in a certain area, just try to see what protocols are being developed in that area.
And just follow the conversation. I was following one about Messaging Layer Security, which is just yet another protocol for group messaging and scalable group messaging.
I know there's- It's supposed to be the end-all protocol. It's supposed to be the one that encompasses- Everyone says that they're the end-all protocol.
But yeah, I mean, this is yet another stab at being the end-all protocol.
So it's kind of fun to see how tiny the details are that have to be resolved.
Even when I was working on ODoH, there's things that you feel are so nitpicky about certain things.
When you're talking about protocol design, just extremely tiny details, like this bit should go over here.
And no, the field and the struct should be above this other field.
All of those things matter so much when all communication is happening over the wire.
And what that means is that you have your code, but at the end of the day, when you transport something from a client to a server, between two hosts, you have to be extremely sensitive about how you're doing it and what format your data is in.
So big-endian, little-endian, it's something you learn when you're learning C or something like, oh, this is the byte order of the network.
This is the byte order of my machine. That becomes extremely important because the way that you're sending out data is being consumed by another computer, and they might not be running the same system as you.
They might be in a totally different part of the world, which has a totally different convention than you.
So you have to be extremely explicit about what byte order your protocol is supposed to be serialized in or deserialized in, and just details like that, like how many bytes should the length of a vector be?
Do you use 64 bytes?
Do you use eight bytes to specify the length? All of those things begin to matter so much.
So that's what takes a long time to finalize. A lot of the big parts of a protocol aren't that difficult.
It's just everything around it that just ends up taking a lot of time.
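The byte-order and length-prefix choices described above can be made concrete with a short sketch. The two-byte big-endian length prefix is an assumption chosen for illustration, not the rule of any particular spec, and the helper names are hypothetical.

```python
import struct
from typing import Tuple

# The same 32-bit integer serialized in both byte orders.
value = 0x01020304
big = struct.pack(">I", value)     # network byte order: 01 02 03 04
little = struct.pack("<I", value)  # little-endian:      04 03 02 01


def encode_vector(payload: bytes, length_bytes: int = 2) -> bytes:
    """Prefix the payload with its length as a big-endian integer."""
    if len(payload) >= 1 << (8 * length_bytes):
        raise ValueError("payload too long for the chosen length prefix")
    return len(payload).to_bytes(length_bytes, "big") + payload


def decode_vector(data: bytes, length_bytes: int = 2) -> Tuple[bytes, bytes]:
    """Return (payload, remaining bytes), rejecting truncated input."""
    if len(data) < length_bytes:
        raise ValueError("truncated length prefix")
    length = int.from_bytes(data[:length_bytes], "big")
    end = length_bytes + length
    if len(data) < end:
        raise ValueError("truncated payload")
    return data[length_bytes:end], data[end:]


if __name__ == "__main__":
    wire = encode_vector(b"abc")
    print(wire.hex())           # length prefix followed by the payload
    print(decode_vector(wire))
    # A peer that wrongly assumes little-endian reads a different length.
    print(int.from_bytes(wire[:2], "little"))
```

If the two sides disagree on either the width or the byte order of the length field, the receiver misreads every message, which is why a spec has to pin these details down.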
Yeah, I mean, I would add to that just the wire format is really important, and getting that right is hard.
I totally agree. I also think it's tricky when the, so as we said, you know, so I guess we should just take a step back and kind of clarify.
So every IETF protocol ends up as an RFC. That's the goal.
So RFC 8446 specifies TLS 1.3, which is the latest version of TLS. And it's kind of, we call it the gold standard of real world cryptography today.
And where was I going with that?
Well, so yeah, but each RFC begins as an Internet draft, and the details of the Internet draft are, you know, discussed over months, more often years, among the various stakeholders who are interested in influencing its design.
And these stakeholders are people like representatives from companies like Cloudflare, but also just people on the Internet who care about how TLS works, or how ODoH works.
So the thing that's been tricky for me personally to navigate is when the requirements, people have different understanding of what the requirements are for the protocol.
So from a security perspective, what properties should we have of this protocol?
I think ODoH is a rare case where it's really clear.
We want to be able to, you know, provide this obliviousness by just having a proxy that is essentially just forwarding a ciphertext that it can't decrypt to the target, who then decrypts it.
And the target doesn't know what IP address the query originated from.
And that's really simple and clear. ECH is a lot harder in this respect, because, you know, ODoH doesn't change TLS.
And this, as an extension, is by nature just changing, I would call it, the shape of the protocol.
And everyone kind of, we all need to come to a consensus on what is the correct shape.
And that starts with like requirements, which thankfully, I think are pretty well specified for ECH.
I think, you know, not everything, but yeah. I mean, I've worked on a few Internet drafts at this point.
And I've always joined the conversation somewhere in the middle.
So there's like the very beginning when it's like, you know, a handful of people's idea, they get together at an IETF meeting, and they build it into this bigger idea, which, you know, then the community decides, yes, pursuing this idea is a good idea.
And let's work on a standard. So I've always been kind of in the middle to the end.
And I haven't, I mean, it's, I don't know, both of you have gotten to work on the beginning of a standard.
What is it like?
What is it like to kind of influence the core design? I wouldn't say I came in the middle, for sure, for ODoH.
But I wasn't like there at the very beginning.
So at the very beginning, like, it's just, you know, you have like, somewhat of a clear picture of like, what you want to accomplish, but like, you haven't gotten, you know, like the implementation details worked out.
So like, an example would be, you've got oblivious DNS over HTTPS.
Let's say that you have an error, like the client, client is using a public key that is not valid.
Or like, you know, like it's using a public key that has the wrong, like, suite of parameters.
How would the server respond to that? So that's like something you would have to think about.
So this is not like very late stage, but it's not like super early stage either.
Because like, you've gotten to the point where you're now thinking about, hey, how do we handle errors?
At the very beginning, it's more like, my idea would be that, you know, like, hey, we have this need, we need to like, like, figure out like, what to do about this need?
Like, do we go like, do we make our own protocol that we just like use?
Like, you know, internally, do we create a standard for the Internet?
Like, you know, Apple started ODoH. You could ask the question, hey, why didn't they just use that for their own resolver?
Like, after all, Apple is such a vertically integrated system that, you know, they could control it.
I mean, they control their chips now, they're creating their own chips.
So, you know, you could say, hey, why didn't they just use their resolver to do Oblivious DoH?
Like, why did they have to push it to the IETF?
So the question is, why don't they just do it unilaterally?
Like, why do they need buy in from the rest of the Internet?
Yeah, yeah. It's a good question. I don't know the answer to that.
Yeah. Like, it's, I mean, it's probably, you know, like, like, it depends, right?
Like, like, some people might want to just do it quickly and do it like internally.
Some people, they might not because especially with privacy related things, I think that it matters a lot to have things in the public domain, like, everything where you claim needs to be private, like with with the chips, for instance, that Apple is using in its new laptop, like in that case, it matters less, in my opinion, because you know, like, they're, they're competing on speed, they're competing on like a whole host of other factors, perhaps you could say on security, but not privacy necessarily.
But when it comes to privacy, like, you want to show the world like, hey, this is what we've done.
And that's why, you know, like a lot of privacy oriented software is open source.
Like, I think, yeah, I think that's like one of the reasons that they might have thought of, like, you know, doing it in the public domain.
So you know, other people can verify and like, check their work again.
I think that's right. But also with ODoH, someone has to be a proxy.
Is Apple going to be a proxy? Cloudflare is a proxy, right? No, we're the target.
Cloudflare is not a proxy. Not right now. We might have one in the future.
I'm not sure. But for us, it makes more sense to be a resolver endpoint, because we already have all the DNS resolution network set up.
So we already support DNS over HTTPS.
Yes. So like, we have a third party who is a proxy who we don't collude with.
Exactly. Yeah. Right. Yeah. And, and the good thing about having like a separate proxy, is that anyone like, it's very, like, simple to set one up.
It's not as simple to set up like a whole like DNS resolution server. And so like, people can just use like whoever they trust most.
It doesn't have to be, you know, like, like a big organization.
Right? Yeah. Yeah. And this is another benefit of open standards.
Anyone can go ahead and implement the proxy and run it.
And if they have requirements for their own environment, they can add that to the code and know it's going to keep working.
Yeah. So you mentioned, so I've been active in a number of working groups.
I was around while a grad student for the early development of TLS 1.3.
And one of the big things is evolution, where you have a protocol, you've made certain mistakes, and you want to fix those mistakes.
And you frequently run into a problem where... What's an example of a mistake, Watson, just to make it concrete?
Like a bug in the spec, or like a bug in the code?
Well, there's any number of mistakes, or is it any number of protocols?
Sure, yeah. And you also have, so in NTP, there is the dominant implementation by number of devices.
And, you know, playing nicely with the working group, it's somewhat political, it's also somewhat technical, because they had a DDoS amplification bug.
And because of that bug, ISPs now filter very heavily NTP packets.
And this created problems when we try to release NTS.
Because all of a sudden, packets don't get through if they're between certain lengths.
So for the sake of people listening, what does NTS stand for?
Network Time Security. So Cloudflare operates time.cloudflare.com. And you can use it securely if you have an NTP daemon that supports NTS.
Just tell it to use NTS.
There's a number of guides on our website, and in the documentation of your NTP daemon.
And it will communicate with us securely. It will go set up some keys and then use those keys in the future to secure the packets going back and forth.
And if you're on an ISP that's blocking NTP packets of a certain length, it will not work the first time, not work the second time, not work the third time, but the packet gets bigger and bigger.
Because every time the client retries, it says, hey, I'm going to need more keys back.
Is this a protocol bug or an implementation?
This is a protocol success. Oh. The packet grows and eventually it's longer than the range of ISP filters.
And it goes through. It restarts.
I see. Okay. And every so often we get, you know, there's an email to the chrony users list or some email list and it's, why is only one in four of my packets getting through?
And you ask for the PCAP, you see the PCAP and you see this very nice rhythm where this, you know, doesn't get through, doesn't get through, doesn't get through, gets through, doesn't get through, doesn't get through.
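The rhythm described here, three losses and then a success, falls out of a very small model. The sketch below is a toy simulation and not an NTS implementation: the filtered length range, the base request size, and the growth per retry are all hypothetical numbers picked so that the pattern is visible.

```python
# All three constants are hypothetical values chosen for illustration.
BLOCKED_LENGTHS = range(200, 500)  # packet lengths an ISP filter drops
BASE_SIZE = 228                    # size of a request after a successful exchange
GROWTH_PER_RETRY = 100             # extra bytes added after each lost request


def simulate(attempts: int) -> list:
    """Return (packet size, delivered?) for each request the client sends."""
    results = []
    lost_in_a_row = 0
    for _ in range(attempts):
        # Each loss makes the next request ask for more material back,
        # which makes that request bigger.
        size = BASE_SIZE + lost_in_a_row * GROWTH_PER_RETRY
        delivered = size not in BLOCKED_LENGTHS
        results.append((size, delivered))
        # After a success the client goes back to its normal request size.
        lost_in_a_row = 0 if delivered else lost_in_a_row + 1
    return results


if __name__ == "__main__":
    for size, delivered in simulate(8):
        print(size, "gets through" if delivered else "dropped")
```

With these made-up numbers the request grows past the filtered range on the fourth try, succeeds, and then shrinks back into the range, so the cycle repeats.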
The early bug is in the implementation, it's not even a protocol issue.
I don't think mode six was ever specified other than some servers do it.
But a widespread bug in an implementation forces a workaround in the spec and has accidentally influenced the way future things work.
So we're thinking about NTPv5 now, and we realize that we have to be careful of the packet lengths.
Yep. A certain size or it won't traverse the Internet.
So let me give you a, yeah, go ahead. Um, there's, you know, there's like, I think, I think the really interesting things are when there are like security issues with some version of a, of a, of a specification, not necessarily, you know, if the specification is insecure and so is the implementation.
This was a case where like, you know, the implementation was wrong, but the specification was right.
Yeah, this stuff happens. But yeah, so like, so ECH, so encrypted client hello, you know, on paper, pretty simple idea, just, um, I'm going to, I have a key, a public key that I got from a DNS server, and I'm going to use it to encrypt my client hello message, which contains all these sensitive parameters.
And the server is going to be able to decrypt it because it, you know, it gave me, it gave me the public key to use.
Um, so on paper, really simple. There's a lot of, there's a lot of engineering challenges there, particularly with DNS.
Um, but there's, but this has been a, this is, it's been kind of evolving for a long time.
So this hope to encrypt as much of the TLS handshake as possible,
it's actually something people have been wanting to do for a long time.
So if you look at the TLS 1.3 spec, um, encryption starts as soon as the client and server have some shared secrets that they can derive a key from and begin, uh, begin encryption.
But that doesn't start until the server, um, sends the client, uh, its key share, which they, you know, they, they exchange key shares and they, and then they can derive a key from those.
Um, and we've been wanting to do this kind of for a long time.
Um, and, uh, it was hard, technically difficult when TLS 1.3 was being worked out.
So it was ultimately dropped as a design requirement.
Um, and it was brought back with a standard called ESNI, uh, which stands for encrypted SNI or encrypted server name indication.
And the goal of this is basically just to encrypt the most sensitive parameter in the handshake.
And that's, as we've said, the SNI, which is the name of the server you're going to, example.com or google.com or whatever.
Um, so that early standard, it was very simple. Cloudflare deployed it in 2018 and it's still running.
In Firefox, you can turn it on. You can opt into it by changing the about:config settings.
Um, so ESNI is basically just, you know, encrypting the, the server name and, um, that's really useful.
This standard though has some subtle, very subtle, like, uh, security vulnerabilities.
They're kind of hard to pull off, these attacks, but there's an active attack where you could send a message to the client and try to get it to divulge a little bit of information.
Not very much information, not much more than a bit of information about the server name, but it works.
So, yeah, ECH, like I said, is kind of the next version of ESNI, and it's had to address all these security vulnerabilities in the standard.
Um, it's a very different protocol. So ESNI is kind of going away.
We're going to keep it around for now, since, you know, ECH hasn't been deployed yet.
Um, so we're going to keep ESNI as long as we don't have ECH, but, um, we're not going to try to support, you know, it's hard to support both at the same time.
In general, it's hard to, like, if you're making changes to a protocol, like if you're evolving a protocol over time, you have to maintain some amount of backwards compatibility because a client and a server, a given client and server might not agree on what version of the specification is the right one.
So, um, uh, so, you know, so an ESNI client shouldn't be able to talk to an ECH server.
That wouldn't make much sense. But even now, like we're, you know, the next version of ECH, uh, the next draft is going to be actually targeted for interop testing.
So browsers are going to try to, um, browsers and, and, um, and servers like Cloudflare are going to try to make sure their implementations match each other.
Um, um, but then, you know, we're going to make, we're going to make hopefully small changes from here on out.
So it's, it's been completely redesigned.
And from here on out, hopefully it's just iterative.
It's, it's possible to maintain backwards compatibility when you're making small iterative changes.
It gets a lot harder when you have to make big changes to the protocol.
Um, and maintaining backwards compatibility, it's kind of the central challenge once you widely deploy a protocol.
Um, so we're going to see a little bit of that for ECH, but hopefully we'll be, we'll have some chances to, to kind of minimize it.
Yeah. I wanted to add something to that.
Um, so like you might've heard about how we had TLS 1.1 first and then, you know, we got TLS 1.2 and, finally, we're like... Don't forget about 1.0.
Oh wait, it's still running? No, well, over our network, not very much, but 1.0 still exists in the world.
Oh my God. Yeah.
So that's one of the most interesting things that I found about protocols: there are so many routers and tiny computers and, you know, Raspberry Pis and really old enterprise supercomputers, just so many different types of computers that exist in the world, basically everywhere, that if you want to change a protocol, for the bigger changes, you basically have to keep supporting older ones for a really long time.
Cause like a lot of these are not even actively being managed. Like there's some router, like say, you know, like on the space station, like maybe no one, like they don't want to send like an IT person up for like, you know, a long time.
Like maybe everyone who's been like, you know, going up there doesn't have like experience doing this stuff.
So like you, you basically like can't change a lot of things.
Cause like, you know, maybe the hardware doesn't support it. Maybe it's running such an old system that having to support, you know, say TLS 1.3 with its, you know, better cipher suites, it's just not possible for it.
And like all of these like make a big difference because even if you release like a much better implementation, a lot of times like your version does get downgraded to like, you know, like a worse version.
And like, that's definitely a problem when it comes to especially security and privacy, because like if you're using TLS 1.1, you're using like really old crypto algorithms that have like long been broken.
So, but you can't do anything about it because you still have to support it.
Just cause you know, like your client like needs that.
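The downgrade effect described here can be sketched in a few lines of Python. This is a simplified model of version negotiation, not the actual TLS wire protocol: the `negotiate` function and the `TLS_VERSIONS` list are hypothetical names introduced for illustration, and the assumption is simply that both sides settle on the newest version they have in common.

```python
# Simplified model: versions ordered oldest to newest (illustrative list).
TLS_VERSIONS = ["1.0", "1.1", "1.2", "1.3"]

def negotiate(client_supported, server_supported):
    """Return the newest version both sides support, or None if there is none."""
    common = set(client_supported) & set(server_supported)
    if not common:
        return None  # no shared version: the handshake would fail
    # Pick the shared version that appears latest in TLS_VERSIONS.
    return max(common, key=TLS_VERSIONS.index)

# A server that keeps old versions enabled for legacy devices.
server = ["1.0", "1.1", "1.2", "1.3"]
modern_client = ["1.2", "1.3"]
legacy_client = ["1.0"]  # e.g. an unmanaged router that never gets updated
```

In this model the modern client ends up on 1.3, while the legacy client pulls the same server down to 1.0. The server can only refuse 1.0 by cutting that client off entirely, which is the trade-off being described.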
So that's a pretty big issue. And that's one of the reasons why it takes such a long time to clear out all the little bits of a protocol before people make it an RFC.
It's one of the reasons, but it's definitely a big one, because changing it in the wild is a lot harder.
It's not just, oh, I push a software update.
It's not like that. Yeah. If you're watching this and want to ask our guests a question, you can email livestudio@cloudflare.tv or call 1-380-33-FLARE, and don't ask about cars.
I don't know what that means. Oh, because of Car Talk.
Because you're a big fan of NPR's Car Talk. I don't know anything about cars.
I know a little bit about bicycles. So one of the things that's become more apparent is that privacy issues that have been longstanding in the Internet, and people can go on and tell you about the origins of this, are getting more and more important to solve.
And where have you seen this outside of, immediately, ODoH and ECH? Where else are we seeing privacy become more important for Internet protocol design?
That's a good question.
For Internet protocols specifically. I mean, there's also the myriad things that you use Internet protocols for.
NTS is an example. I don't know if we discussed this, but we had NTP before, which is your regular time protocol.
And then NTS is authenticating.
It's kind of like DNS in a certain sense where, with DNSSEC, you authenticate the resolver that your response for your DNS query is coming from. With NTS,
you are authenticating the time server where your response is coming from.
So that's one example off the top of my head. I guess you have to think about all the ways you're connecting to the Internet and which of those are not using TLS, because if you're using TLS and you're going over an HTTPS connection, that's basically secure.
Obviously there are a lot of other factors, but what are the other ways that you're using a connection? There's DNS, and you have to get time from somewhere.
Yeah. And to me it's interesting, because there's always a big trade-off, right?
When I was looking into Oblivious DNS, it seemed like DNS is such a lightweight protocol that bolting on this entire TLS connection that you have to create and tear down, and the encryption and decryption, sometimes almost feels like overkill.
But I guess that's the cost of privacy. You can't get everything. There are people who care a lot about privacy.
They're using Tor. So that's a decision that people have to make: what do they value more?
And at this point I feel like the needle had swung way too far in the other direction, where there was no privacy for a while, so we kind of have to correct for it.
And then I'm sure there's going to be pushback, and eventually we'll meet in the middle somewhere.
I don't know. I think that people are starting to recognize it. I mean, I think that people think about privacy in very different ways.
So if you poll people about whether they care that company X collects information Y about them, you'll get very mixed responses, because some people really like the convenience. But I think we're seeing a shift.
Because there's privacy for privacy's sake, but there are also all of these other things that happen in your life as a consequence of privacy being violated.
So I don't know. I'm really interested to see where things go.
I don't really know right now. I mean, I think we're in a good position as a company to help in a lot of ways.
I'm personally really excited about, and I was thinking about this today, the non-collusion assumption that underlies ODoH.
There are a lot of things that you can do with that.
So one of the things that I was really interested in as a graduate student, and I never really did anything with it, is a predecessor of Prio, which I'm not prepared to talk about in any technical depth. Prio is the system designed by Dan Boneh and Henry Corrigan-Gibbs, two cryptographers at Stanford.
Although I think Henry's gone somewhere else now.
And it's a system for collecting metrics from users in a privacy-preserving way.
So you're able to compute certain statistical properties about your population of people without actually having the data point for each person.
So for example, if you want to get the top 10 most visited websites,
theoretically, Firefox or Chrome could use Prio to collect everyone's browser history and then aggregate it into a histogram to see which sites are visited most frequently.
But the idea is to do that without actually having any visibility into any one person's browser history.
It's kind of amazing that this works. There's some really cool math in there.
But the fundamental assumption you have to make is that there are these two entities that are non-colluding.
As the user, you're basically trusting that these two entities engage in the Prio protocol, but otherwise don't do any communication beyond that.
So you have, say, Firefox collecting telemetry, and they're using some other organization.
I don't even know what another organization would be off the top of my head, but as long as they're not cheating, they execute this protocol and they can collectively compute the statistic without actually revealing personal information about any one person.
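The core idea of two non-colluding servers computing a total can be sketched with additive secret sharing. This is a minimal illustration and not the Prio protocol itself: it leaves out the proofs Prio uses to check that each client's input is well formed, and the modulus `PRIME` and the function names are assumptions made for this example.

```python
import secrets

# Field modulus for the shares (2**61 - 1 is prime); the choice is illustrative.
PRIME = 2**61 - 1

def split(value):
    """Split a value into two random-looking shares that sum to it mod PRIME."""
    share_a = secrets.randbelow(PRIME)
    share_b = (value - share_a) % PRIME
    return share_a, share_b

def aggregate(shares):
    """What one server does: add up only the shares it received."""
    return sum(shares) % PRIME

def private_sum(values):
    """Simulate both servers and combine their two aggregates."""
    shares_a, shares_b = [], []
    for v in values:
        a, b = split(v)       # the client sends a to server A, b to server B
        shares_a.append(a)
        shares_b.append(b)
    total_a = aggregate(shares_a)   # server A never sees any b
    total_b = aggregate(shares_b)   # server B never sees any a
    return (total_a + total_b) % PRIME
```

Each server on its own sees only uniformly random numbers. Only the two aggregates combined reveal the total, which is why the scheme depends on the two servers not colluding. For example, `private_sum([1, 0, 1, 1, 0])` counts how many of five users visited a given site.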
So yeah. Sorry, is this related to homomorphic encryption?
It's not homomorphic encryption.
Okay. Okay. Homomorphic encryption solution. So actually, yeah, there are alternatives based on homomorphic encryption.
For one-server cases, yeah, you can do that, right.
You end up having to... well, we're drifting pretty far from our topic.
Really enjoying this. Are there any protocols further out on the horizon you're excited about, or older protocols that you think are ripe for privacy and security enhancements?
I'm excited about Prio.
I think it has a lot of potential. I think that QUIC... yeah.
So, I mean, there's a lot in transport security.
There's a lot to do in transport security as far as privacy goes, but that's kind of just plugging existing leaks. I don't know if anyone has any ideas for what the future applications are that we're interested in.
I would personally really like to see something with decentralization take more effect.
We have, for instance, Messaging Layer Security. That's not decentralized, but it allows you to have scalable group chats that are somewhat encrypted.
I think being able to have peer-to-peer group chats that scale would be interesting.
And then I'm also very excited about WebAuthn, which Cloudflare is also working on, because I hate CAPTCHAs.
So WebAuthn is, you know, Touch ID, Face ID, all of these things.
Or even a YubiKey, which also conforms to the WebAuthn standards as far as I'm aware.
And so being able to use those things to bypass CAPTCHAs would be really nice.
And I run Linux, so I have to do CAPTCHAs all the time.
And I'm using Chrome. I'm not even using Mozilla, but I still run into CAPTCHAs.
Every time I log into Zoom, I have to do a CAPTCHA, because Zoom doesn't work very well as an app on my computer.
So I have to use it in the browser. So yeah, WebAuthn would be a very interesting standard, and I think we're working on that at Cloudflare as well.
So stay tuned for updates on that one.
Tanya, I like that you're another Linux person, because I feel like most of the team have MacBooks these days.
Okay. Well, when I started at Cloudflare, developing on Linux was a real pain.
And after three days I said, okay, give me a MacBook.
So that's a problem I have. It's fine now. Yeah. I just use Ubuntu, which is, you know, your grandma's Linux and it works really well.
Exactly. Yeah. I don't use anything special because I'm afraid I'm going to have to spend way too much time debugging random things.
So yeah, I use Ubuntu too. And I think it works pretty well besides, you know, minor inconveniences.
In my experience, even Ubuntu is not necessarily immune from the, gee, I wanted to update because I needed to go install this thing.
So, protocols I'm really excited about. I think one of the... Nice segue, Watson.
One of the interesting emerging applications is zero-knowledge proofs, where we have a whole lot of ways to prove things and only reveal that you know a witness to the validity of the statement.
And there's a lot that's being done.
There's stuff like Zcash, WebAuthn. Not WebAuthn. I meant, um... yeah, never mind.
Never mind. Yeah. So there's Privacy Pass, right? You can solve a CAPTCHA once on a Cloudflare site and, even if you're running Tor, we can keep track of the fact that you solved the CAPTCHA, but we can't necessarily connect you solving the CAPTCHA to you later.
That would be an example of these sorts of zero-knowledge proof techniques.
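The unlinkability property mentioned here, where a server can vouch for a token without being able to recognize it later, can be illustrated with a toy blind signature. This sketch uses textbook RSA blinding as a stand-in and is not the production Privacy Pass protocol. The tiny key (`N`, `E`, `D`) and all function names are assumptions for illustration, and real systems use large keys and proper message encoding.

```python
import math
import secrets

# Toy RSA key built from the primes 61 and 53 (insecure, illustration only).
N, E, D = 3233, 17, 2753

def blind(token):
    """Client: hide the token behind a random factor r before sending it."""
    while True:
        r = secrets.randbelow(N - 2) + 2      # random r in 2..N-1
        if math.gcd(r, N) == 1:               # r must be invertible mod N
            break
    return (token * pow(r, E, N)) % N, r

def sign_blinded(blinded):
    """Server: sign after the CAPTCHA is solved; it only sees the blinded value."""
    return pow(blinded, D, N)

def unblind(blind_sig, r):
    """Client: strip r to obtain an ordinary signature on the original token."""
    return (blind_sig * pow(r, -1, N)) % N

def verify(token, sig):
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(sig, E, N) == token % N
```

The server signs a value that is masked by `r`, so when the client later presents `(token, sig)`, the server can confirm that it issued the signature but has no record tying that token to the earlier CAPTCHA session.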
Another old protocol where I think it would be interesting to see what we can do with its privacy and security
is something like IRC. It's been around forever. It has legions of devotees.
It's an open messaging system and it's horrifically insecure.
You know, people end up with Internet fame and fortune on bash.org very frequently because they're on public IRC channels chatting to their friends, and some stranger comes along and copies and pastes it onto bash.org for the amusement of everybody. There are other older protocols like that, but, you know, security, it's good.
With these older protocols, you start talking about how everything's over HTTPS now, and that really raises questions about centralization, et cetera, et cetera.
And so there are probably interesting things to think about in terms of decentralization and the way cryptography can help there.
Yeah. Yeah.
Another thing I wanted to bring up with respect to privacy-sensitive protocols: one reason that we can't use things like zero-knowledge proofs and homomorphic encryption everywhere is because it's really slow.
At least homomorphic encryption is. And homomorphic encryption is kind of like Prio, which Chris described, where you have some data of a person. A better example would be identifying certain stuff like child pornography, for instance: you want to block that, but you don't want someone to sit and look at it, so how do you figure out a way to do that? Or medical data, that's an even better example.
You don't want to look at individual medical data.
You just want aggregated statistics on it.
Yeah. Yeah. I think we should first say what homomorphic encryption is. Basically, it allows you to do computation over a ciphertext without knowing the data that you're operating on and without actually seeing the result until something happens in the future.
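As a rough illustration of computing on ciphertexts, here is a toy version of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts produces an encryption of the sum of the plaintexts. It is not fully homomorphic encryption, the primes are far too small to be secure, and the function names are chosen just for this sketch.

```python
import math
import secrets

# Toy Paillier key from tiny primes (insecure, illustration only).
P, Q = 61, 53
N = P * Q
N_SQ = N * N
G = N + 1                                            # standard generator choice
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)    # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                 # inverse of LAM mod N

def encrypt(m):
    """Encrypt an integer 0 <= m < N under the public key (N, G)."""
    while True:
        r = secrets.randbelow(N - 1) + 1             # random r in 1..N-1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(c):
    """Decrypt with the private values LAM and MU."""
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N) * MU % N

def add_encrypted(c1, c2):
    """Add the underlying plaintexts without ever decrypting them."""
    return (c1 * c2) % N_SQ
```

A party that holds only ciphertexts can call `add_encrypted` and never learns the inputs or the sum. The result only becomes visible when the key holder runs `decrypt`, which is the "until something happens in the future" step.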
Yeah. So I think if that gets fast, it will open a lot of doors in terms of not having trusted parties everywhere.
Because right now, even with TLS, you have this entire certificate chain and then there's someone at the top, which is, you know, Google, Let's Encrypt, one of these organizations, saying, hey, you have to trust me.
And they're pretty trustworthy for now, but in the future, you never know.
And it's not just their fault: the CEO might be replaced by someone who isn't as trustworthy, or the government or a hacker breaks into their system.
There have always been hacks.
So I want to see the Internet move more towards a system where trust is not as required.
And I guess that's why I think that decentralization is pretty cool.
And I think that a lot of cryptocurrency-related things are kind of iffy in a lot of regards, but things like Ethereum and what they're trying to do with smart contracts, just having programmable money, and the network being Turing complete, being able to run whatever computation, I think stuff like that could be very interesting in the future.
And we just never know. It might be.
Yeah. I mean, in general, things like fully homomorphic encryption, zero knowledge, multiparty computation, all of these advanced crypto techniques are really cool.
And what's fascinating about cryptography is that you can do these very general things, and that's what people aim for.
If you read some of these papers, it's a zero-knowledge proof for all NP-complete languages or something like that.
That kind of stuff is interesting, but I think you make more progress when you pick a specific problem and tailor a system to that problem.
I'm sort of an incrementalist. I'm not necessarily interested in the more general solutions.
I like the idea of biting off simpler problems. It's still the case that not everything has an immediate solution.
But you have to make things fast.
You have to make things space efficient. People don't like to pay for cryptography.
In terms of money or time or space or network load, no one wants to pay for it.
People just want the benefits without having to incur the cost.
And it's not perfect. You have to do some amount of computation. But yeah, that's kind of the world that I'm looking towards right now.
Yeah, we'll see.
I think AI and machine learning kind of did what you talked about, right?
Yeah, they look at small problem spaces and then they tailor things for them.
And obviously you have things like GPT-3, which is still a lot more general.
But they use that approach, and people always criticize them, like, hey, you're just trying to make it play chess.
It can't do anything else, so what's the point of just playing chess?
But I think all of those advancements really helped grow the field.
Yeah.
We have about three minutes left. So why don't both of you take a minute to conclude the segment with any last thoughts you might have?
Just on that, I think there's a lot of work to do to make TLS more secure.
One of the lessons I've learned over the last few years of thinking about TLS is that it's going to be around for a long time.
And it has privacy issues, and they need to be addressed.
I think we're doing that in a big way with these two protocols.
And we didn't talk about OPAQUE, but OPAQUE's cool.
Tanya?
I don't know. I'm excited. I'm interested in seeing how trustless communication ends up working.
Working for Cloudflare is interesting because it lets you see how much the Internet could potentially be centralized between the major players.
So I think advances in trustless communication are what I'm most interested in, just decentralized everything.
It's slow for now, but a lot of things were very slow 50 years ago.
Machine learning was a farce. There was this craze 60 or 70 years ago where they came up with algorithms, but they were slow and didn't really take off.
And then they took off later. And I think in software we're always looking for speed: okay, why doesn't this go faster?
And so, yeah, I think we just wait and watch and keep doing stuff.
Yeah. Well, thanks to all of you for tuning in as we've discussed some of the recent developments and the hard work that's been announced this week.
We hope that you read the blog posts and try some of these protocols out for yourself.
That's it for now.
And stay tuned for the next segment. Bye.