Cloudflare TV

Master of (Computer) Science

Presented by Nick Sullivan
Originally aired on 

Nick Sullivan, Head of Research, interviews heavyweights in computer science research in areas such as Cryptography, Artificial Intelligence, Databases, and more.

English
Interviews

Transcript (Beta)

Welcome everybody to Cloudflare TV. This is another session called Master of Computer Science.

I'm here with Dr. Karthik Bhargavan, who's working in INRIA in Paris, on, well, lots of different things.

So Karthik, tell us a little bit about yourself and what you're what you're interested in, what you're working on.

Hi, Nick, and thanks for the opportunity to do this slightly weird thing.

So at INRIA, so INRIA is like a research lab, right?

But it's like a huge research lab, it's run by the government.

So we're all officially like civil servants, and all academics in France are civil servants.

But within INRIA, you have teams. So I've got this team called Prosecco, which stands for programming securely with cryptography, which kind of encompasses.

So we sort of, we are not a programming language team, and we're not a security team, we're somewhat in between programming languages, security and applied cryptography.

And we take all topics in that, in that zone, we design tools for verification, like, we've been working on these tools called ProVerif and CryptoVerif, and also FSTAR, in collaboration with Microsoft Research, to do various kinds of security analysis, and then we apply them.

So there are a lot of our work is on applying them to things like TLS, MLS, which is messaging their security, for those who don't know, to signal, to all kinds of things.

So anything new that comes up.

And occasionally, we also contribute to standards development, or fixing standards, or improving them or designing new ones, like HPK, or TLS 1.3, or MLS and this kind of stuff.

So yeah, that's the sort of stuff.

So we should definitely get into that pretty soon. But before we kind of get into what you're currently working on, and standards bodies, and all the fun parts about taking cryptography and bringing it to the real world, it might be interesting for the audience to get to get a sense of what your history is.

How did you get into computer science research?

How did you get into security and programming languages?

What's, what, what inspired you to follow this career path? I was actually snookered into it and fooled into it because I was, I was in India, doing undergrad, and I was going to follow the traditional Indian path of at the time, which was, you first go to your engineering degree, then you do your management degree, and you earn lots of money.

So I'd pretty much settled into doing that path.

And I think it was almost the last year of my undergrad that suddenly one teacher decided to teach us the lambda calculus.

And until then, you know, just like here's a very applied course, we've learned things like network programming and C and stuff, but you wouldn't learn things like the lambda calculus or scheme and stuff that wasn't really part of our curriculum.

At one point, we suddenly started learning something about the lambda calculus.

And somehow, I got totally addicted.

So if you would like I was writing entire like page full of programs in the lambda calculus, you know, sort of by hand.

And then we started, that professor said, okay, why don't you try doing some proofs on, on computer, you have this thing called the higher order logic theorem prover, you can you can do a proof of this calculus.

And it's like, oh, that sounds interesting. Is it like the lambda calculus?

Yeah, it's like the lambda calculus. Okay, I'll do that.

So six months went past, and I had basically not gone out of gone out of the lab was completely addicted to doing this kind of thing.

And then, you know, all the management courses had gone away.

Companies that could hire me had gone away.

So I was I was spoiled for everything but research. And, and actually, a lot of people I know have have this kind of story.

I don't know about you, Nick, who kind of got hooked into it.

They didn't exactly think they're going to do it or for any higher reason, but they just got it just seemed like the coolest thing at some point, and we just are still doing it 20 years later or whatever.

Yeah, I think that's, that's a common story.

I mean, cryptography is something that I hooked on to, as a young person as looking for ways to get into discrete mathematics and, and find ways to, to fit it into the world.

It might be interesting for folks to who aren't aren't super aware, what is the lambda calculus?

And why is it such a interesting and fascinating tool for, for computer science?

Yeah, so I mean, there are many different ways in which the foundations of computing can be set up, right.

So there is the Turing machine, which is perhaps the most famous one, which is a way of representing calculations that can be done by computers.

And it's also a way of finding out the limits of what can be actually computed by what's going to be done by a computer, right, which is something we all know.

But there are many other equivalently powerful, equivalently expressive ways of defining a foundational calculus for for computing.

And the lambda calculus, which is, you know, just, it's just originally by Church, and it was evolved in many different ways.

And there are typed lambda calculus, system F, and there's a whole tradition of this, of this way of writing.

So lambda calculus is something you can write in, I don't know, three lines, you can describe the whole language, but it's as expressive as any computer program out there.

So it's very attractive for people who are thinking about computation and computability, but also about proofs, because there's a there's a very nice link between, between computations and the lambda calculus and the kind of proofs you can do in first order logic, and so on.

So, so there is a whole tradition, which actually got especially popular because of the ML group of languages, which is OCaml, F sharp.

There's also this thing, which the reason that Facebook has, so it's a whole bunch of functional programming languages, which are all designed originally to do proofs.

So they were supposed to support proofs.

So and, but, but they are really full -fledged programming languages.

And the one that we use most often is something called F star, which is again, a very advanced programming language, but at the heart of it is just this thing called the lambda calculus, which is a four line programming language, which can do everything.

So with the lambda calculus, yeah, you can you can do a lot with just a small, tiny foundation.

And different languages have evolved in different popularities for different use cases.

And, yeah, you mentioned the ML category of languages. How often are these used in business versus in academias and proof, academia and proofs?

Because there's a whole lot of different options you have to choosing your language.

What is what are some of the benefits of something that is more strongly typed or something that was oriented around building proofs?

Yeah. So that's an interesting question.

So I would say that languages like, like OCaml and F sharp and so on, which are to an extent F star, but it's, I would say more, more industrial languages like OCaml and F sharp, is that since they are functional and they are strongly typed, some of the fact that they're functional and strongly typed is not so different from Rust, you know, or because Rust has a very nice functional, is actually a very nice functional programming language if you ignore the stateful borrowing stuff.

And it is strongly typed too. And so is Java, actually.

Java has actually a lot of these features. So, but if you take a language which is strongly typed, which is functional and has type inference, as in, has a type system where you can automatically infer all the types, that basically gives you this sort of this magic thing where you can write code very, very succinctly.

So like, I would say, a 10,000 line program in Java can probably be written in 2000 lines of OCaml, I would say that would be about what you would expect.

Because you can really like scrunch down the code and write just the part, write it in a very clean and nice way.

So the reason that this is attractive for proof is, of course, you want to write proofs about programs which are, you know, which are very small, which are very nicely structured, and so on.

And functional programming actually gives you this thing.

So they're not really used so much or as much in, in industrial settings, except as fragments of languages.

So Java, like I said, has generics and has a lot of the features from programming, functional programming have been incorporated into Java, and a lot of them have been incorporated into Rust, into Go.

I mean, I think a lot of these industrial languages sort of pull in the features that they like, right?

And then those ones become just part of those languages, right?

So, but the original languages like OCaml, I think is used by Jane Street and a couple of other Wall Street firms.

It's used certainly in France, for doing a lot of programming. But mostly, I would say, it's not, I mean, I don't, I know F sharp is used in a bunch of places inside Microsoft, but I'm not entirely sure.

And I know Reason is used by Facebook quite a bit.

So now they're getting more and more, I would say, mainstreamed, but traditionally, not so much.

So they were more like a hobbyist or academic thing.

Yeah, you mentioned Rust, and we've been using here at Cloudflare Rust more and more.

And you mentioned Go as well, which is also a very big language that we use here.

What do you think is contributing to the success and popularity of these two languages?

You mentioned importing some functional characteristics, but from a general popularity for building systems, Go and Rust are useful in different areas, but they're both becoming extremely popular languages for building new systems.

Why do you think that is? Actually, I don't know about industrial popularity.

Do you have, do you have some ideas, Nick? What do you think? What makes it popular inside Cloudflare, for example?

Well, not to go into too much inside baseball, but Go is an extremely useful prototyping language for writing a lot of code really quickly, or making changes to features very quickly.

It has a library, a standard library that has a guarantee so that it won't be changed going forward.

Go has this guarantee where all the interfaces you use are, you know, once they're introduced to the standard library, they're set.

And so it has a nice capability of you can compile everything into one binary, and then have that binary run in various different places.

You don't have to deal with so much dependency management.

For Rust, the learning curve is a little bit steeper than Go, but it has some features that are very attractive for running high speed computing.

In fact, as you mentioned, it has a strongly typed language system.

So there are certain issues like memory corruption or memory overreads and overwrites that just can't happen with Rust because they're checked at compile time.

And so that also gives you the extra benefit of not having to deal with some sort of garbage collector.

So we've been using it for various, various applications.

Rust is great for high speed parsing and anything that requires something to be very fast and follow a very strict pattern.

And Go is kind of useful everywhere for services talking to each other.

But we're also at Cloudflare interested in correctness, right?

So some of these more advanced powerful languages like F star, how do you how do you think how would you compare them to Rust in terms of functionality and with respect to being able to prove things about your code?

So traditionally, when people have been trying to prove things about real world software, this has been like the holy grail from the from, I don't know, for 20-25 years, people have wanted to prove operating systems and things that always go bad.

I mean, the old days, it was like, you know, Microsoft Windows keeps crashing, can we make sure that we can provably prevent all kinds of crashes in these device drivers or in other places, and so on.

Since then, the focus has moved to many other places than the operating system, right?

But always the problem has been legacy code. So verification tools, there have been many verification projects, which have successfully analyzed legacy code, but they're few and far between and the amount of effort required is unreplicable.

So you just cannot do it twice, you know. So I think a lot of teams, including the SEL4 team in Australia, the CommCert team, our own work on crypto, we've all come to the conclusion that, in fact, you just have to write code from scratch.

If you want to verify a large, important project to you, you have to write it from the very beginning in a style that is amenable to verification.

You can't post-factor do this stuff, right? And so I think one thing that the FSTAR style comes to, and a lot of other projects in the formal verification world, which are focusing on real world software like this, is that, okay, we'll give you a style in which to write software, but you have to rewrite your software, it's not going to work on existing ones, unless they were already magically beautifully written in a perfect style for verification.

So what's beautiful code from the programming point of view, and even from the implementation point of view, from the audit point of view, may not actually be very good code for verification.

Some patterns are just idiosyncratically better for proofs than for, they may look better, but they're actually not very good.

So in FSTAR, for example, you can, so we've written like a TLS library, we've written an entire crypto library called HackleStar.

And so we developed a set of patterns in which you can program this and other people can program this too.

And then from that, we compile out to C, to WebAssembly, to DoCaml, and so on.

So the target code that most people might want to use, like the HackleStar C library, is just a C library, just take the C library, but it's actually been generated from FSTAR code.

So we wrote the FSTAR code very carefully to make the proofs easier.

And then we developed this compiler to spit out code, which is fast, it's relatively readable, it won't be exactly the kind of code that you would write by hand, but it still will pass audits, if you wanted to actually audit the code, you can read it, it actually makes sense and so on.

So that's, that's the kind of place where you're at. Again, I'm not entirely sure that these languages, some people use Coq, some people use a star, the Galois folks have their own languages, and so on.

I'm not entirely convinced that these source languages are usable outside the teams that designed them.

Okay, so you've tried a lot of experiments, you had a lot of interns and stuff.

It's, it's a steep, it's, I mean, if you thought Rust was a steep learning curve, this is like, you know, two times or three times a steep learning curve.

It's a, it's a master's thesis level of steep learning curve, right?

So you need to spend six months just learning this stuff. So I don't think that's, that's the right level at which people should be engaging in this, in these verification tools, except for very well defined components like a crypto library or something like clean and small.

And you say, Okay, I want that verified, because I'm going to install it.

So that you can do. But if you want to do like general verification of your infrastructure, for example, and incrementally, then it may not be the case that these tools are just these languages are appropriate.

So in my team, we are experimenting, I know other teams experimenting with this as well, of saying, Well, how about if people could write Rust programs, maybe not using the full power of Rust, but maybe it'll give you a very nice subset of Rust, pretty large subset, which will give you linters to check that you're in that subset.

And if you write your programs in that subset, and you write your specs also in a in a subset of Rust, which kind of makes sense to you, then maybe we can use these tools like a star and cock and so on as back end provers that can automatically prove properties about these things.

It won't be so clean that you will just come and give it and give you a yes or no, it'll probably say I couldn't prove it because ABC.

And then, in some cases, you may actually have to go and look inside the extra proof or the cock proof to understand why it didn't work.

But in many cases, maybe once you have done the proof, the incremental thing that you're just changing a few things here and there, you just want to run make and see if the proofs break or not.

That might be pretty easy to debug, for example.

And the code that you write, the original code is still readable, it's in Rust, everybody understands what the what it's trying to do, what it's trying to prove.

So that kind of thing, I think I can imagine. Rust is just an example.

Of course, you could use some other language, whichever is your favorite one.

But yeah, so I think a lot of these verification tools are doing very well at very focused examples.

And for code that has been rewritten by hand, in a way that can really be verified.

We are in the on the cusp of trying to understand I think, both as academics and as practitioners, I mean, you guys and Amazon guys, a lot of people are trying to really incorporate, you know, verification into into their workflow, on how to do this in a way that developers can kind of really engage with it and do this incrementally and bit by bit rather than having to like invest two years into doing one thing, which is not really worth it.

So what sort of things are easy to prove?

And what sort of things are hard to prove when we're talking about computer programs?

Help, help me get some intuition around this. Let's see.

So I'd say things about memory usage, like if I'm using my buffers correctly, am I accidentally going outside my range?

This kind of stuff is actually quite easy. It's a pain, it has turned out to be a big pain in life because of languages that have chosen like C, which have chosen to use memory structures where everything gets flattened out.

So you lose all the high level structure that the programmer has in his mind when you're actually trying to analyze it.

But any language like Rust or Java or whatever, that keeps those intuitions of how the data is supposed to be structured in the program, it's very easy to prove those things.

So those things, once you get the program in the right form, let's say, it's just automatic.

It's very easy to prove that you never go outside bounds of an array and stuff like that.

Arithmetic of various kinds is quite easy to prove. It starts getting complex when you are doing kind of field arithmetic over some complex field and so on.

But if you're just doing normal machine integer arithmetic and stuff, and you want to prove some property about this arithmetic, it's relatively easy, almost can be done automatically.

In general, the place where proofs become complex is when there is non trivial recursion or loops of some kind.

But even the loops, which are very simple loops, like you're going to copy one array into another kind of thing, and you want to make sure that you did it correctly.

This kind of stuff is trivial. This is nothing to prove there.

Most tools can do that very easily. But if you had a loop where you were trying to actually do some kind of complex copying between one thing into another with some manipulations, which are not obvious, and there is some loop invariant, which actually involves two or three variables that you have to keep track of, that stuff you definitely won't be able to do automatically, and you need to do invariance and so on.

And it's generally hard to do these proofs by hand as well.

So it's mentally kind of requires some thinking.

So there is the general complexity of the proof, which I think a lot of software actually does not require complex proofs.

There are very small aspects of it that do.

When you're putting together modules, there is some interesting complexities there, and so on.

And then there is what the tools can do, which is a different question.

There might be some super easy, almost obvious things that you might think, which a tool may just not be able to prove automatically.

You say, that's obvious.

I should not have to prove anything here. But actually, it doesn't, just because of the idiosyncrasy of that particular tool.

Whereas something else, which you think of as very subtle, the tool may be able to kill by automatically.

So there are lots of these sort of and those ones depend on the technology being used by each tool.

So each tool has some strengths and some weaknesses and so on. So if you look at the SMT SAT solving competitive world, they have a very good handle on the problems that are easy and hard for automatic proofs.

So anything nonlinear is usually hard.

Anything that has quantifiers is usually hard. Anything that has loops is usually hard.

But fragments of programs, like one block of a program that has no loop, but it has a complex kind of, you're moving memory from place to another, you're doing some arithmetic computations, should be easy.

I don't know if that gives you a sense.

Yeah, that definitely helps that categorization.

Okay, so we've talked a bit about formal verification. We've also heard the term formal analysis.

What is the difference between these two terms? And how does it relate to different types of work you do in the area?

Yeah, so, I mean, academics can be pretty strict about what they call a proof, right?

So we have come up with lots of different terms to make sure that we don't annoy people.

So if I run a tool that is going to analyze your program for all possible reachable traces up to a certain length, let's say up to 100 steps, and report any bugs that it finds.

Now, this is a very useful tool. You can find attacks, you can find bugs. Most of the time, probably, it's going to find everything that you probably need.

But it's not a proof because there is some limit.

It's like 100, maybe you say that this is a protocol, I'm going to limit it to five people, or I'm going to limit it to 100 steps or something like that.

So that would be, but it could still be formal in the sense that there is a very clean semantics as to what is the set of attacks you're trying to find here?

What is the set of bugs that you can find? So you can formalize that, you can show that my tool will find every single bug that appears within 100 execution steps.

So this is a formal analysis. Okay, so it's formal, because it's formalized, you're actually stating a guarantee that you get.

The guarantee is not perfect, but it's a useful guarantee, right?

Then there is a formal proof, which would be for all lengths, for all sizes, for all everything, for all inputs, you're proving that your program is correct.

It is not buggy.

And that I would say is when you can say that your protocol is secure, or your program is correct.

And that is the level of guarantee that something like the CompCert compiler is trying to give you.

It's really giving you for all, for all, for all.

There's no testing going on there or any limitation. It's like a full proof. And formal verification is usually used as the terminology for tools or mechanized tools that will help you achieve formal proofs.

So if someone says I'm formally verifying something usually means that he wants to get to a full theorem, which is for all inputs, for all lengths, for all attackers, for all sizes of the system, I'm going to give you a proof.

And I'm going to do this using tools, like a Coq theorem prover, or F star programming language, or something else.

And so the act of achieving this proof using automated tools is sometimes called computer-aided verification in the software verification world.

In the crypto world, it's called computer-aided cryptography.

But it all means that eventually, you're going to get a proof and using formal verification tools.

So I mean, again, people use these terminologies kind of very, very loosely.

So it's not, it's not a strong distinction.

Okay. So your team is very well known, the Prosecco team in RIA for finding vulnerabilities in existing protocols, in particular, TLS, which is the fundamental protocol used to encrypt web connections.

If you see an HTTPS there, that S means it's TLS, used to mean SSL.

But terminology aside, tell me a little bit about how your team got into studying TLS.

And tell me a little bit about some of the results that you found over the last, I guess, it's been a decade or so that you've been almost looking at TLS.

Yeah, we actually started looking at it in 2008, I think. More than a decade ago.

So at that time, I think TLS 1.2 had just come out, and everybody was excited about it.

And so we started looking at that protocol as from the viewpoint of verification, saying that, okay, this is clearly an important protocol.

The web was just, I mean, a lot of, it was still like 30% of the web or less, maybe using HTTPS, but it was clearly on the up.

And HTTPS was the way to secure websites. So we were like, okay, let's look at this protocol.

And we weren't really trying to analyze it, we assumed TLS was secure, because he said, it's a very well known protocol, it must be completely secure, everybody has looked at it.

And so we wanted to do a proof of security, assuming that it was secure, that we just wanted to do a formal proof.

And we wanted to do it at an excruciating level of detail, let's say, we wanted to make sure that we caught every single corner of this protocol.

So we wrote an implementation.

So we wrote an implementation, it was called ETLS. And we analyzed this, we first used a tool called ProVerif to analyze it.

And we, a lot of that effort was not necessarily trying to see if the protocol was bad, was trying to see if our tools were good enough, you know, so just mostly an exercise of the tool.

So, so we would try to use the tool, the tool wouldn't be able to prove anything, it would just run out of memory.

And then you would tune the tool and tune the model and tune how we're doing things.

And then the whole exercise was really, well, look, we can take this big protocol, which is obviously secure.

And we can prove that it's secure, you know, sort of, that was the, that was the idea.

And so it was more of a verification effort, not really a protocol analysis effort.

But after like a few years of staring at this protocol, and you're like, okay, let's try this other corner.

And then there were these, there was this famous attack called the renegotiation attack that came up in 2009, I think, which showed off that in fact, there was a problem in the way TLS was, you know, composing connections.

And we were kind of embarrassed by it, because we had published a proof already in 2008, saying TLS was secure.

And now, you know, there was this new attack, which was not covered by our proof.

And that was because we had forgotten about this one, not really model this one, one thing called stationary negotiation.

And so you're like, oh, that's not that's not so great.

Maybe we should look at it properly and do a proper implementation, test it against other implementations, make sure all the features are covered.

So again, we were trying out a new verification method, which is F star with the predecessor of F star, which we call F7.

And so we said, okay, let's try to analyze TLS with this, it'll be an exercise for our toolchain.

But also, it'll allow us to cover up that gap that we left the last time around when you forgot to model negotiation, renegotiation.

We said, okay, now we're going to do real handshakes, we're going to do resumption handshakes, we're going to do renegotiation handshakes, everything together.

And when you are doing this, one day, I came to work and there was this whiteboard, which was filled with a TLS sort of handshake.

And my student Antoine Deligne-Labeau, who's at Microsoft Research now, and I post up Alfredo Peronti, was sitting in front of it.

And Antoine was like, Alfredo was asking me about how this works, but I don't think it's exactly working like you think it should.

And we've been looking at this thing called channel ID, this protocol that Google had come up with.

So you're looking at all kinds of extensions of TLS.

And somehow you're trying to make it work, and it wasn't working.

You're like, we can't prove that this property holds. And then I sat down with them, and all of us standing at this board, and at some point, you're like, I think this property is actually false.

And then we tried to work it out.

And it turned out that if you compose three different connections, like a regular handshake, and resumption handshake, and renegotiation handshake, then in fact, you could do the same kind of attack that the renegotiation attack in 2009 was doing.

So TLS was still broken. And this is the paper that we call the triple handshake attack.

And at the time, we were not, I mean, we're not used to finding attacks and things, you're like, we didn't know what to do.

So we wrote to the TLS guys.

And then we learned this idea that sometimes you have to embargo your result while the browsers all fix things.

So they all fix the thing. And when we gave talks about this at the TLS ITF working group, and we've never done this before, we've never gone to ITF.

So we said, Okay, do you think this is a problem here?

And there is a fundamental problem in TLS and stuff. And surprisingly, people took us seriously.

So there was, there was Kenny, and there was Kenny Patterson, and there was Eric Tascola, and a bunch of them, and they got it, they got the problem.

And they thought that this was important enough that this needed a solution.

So we also helped kind of fix the fix the protocol. And since once you kind of get into that phase, where you really, really understand the protocol, enough that you can find problems in it, from that point on things just start falling, falling through, right.

So since then, after that, we were looking at other corner cases, like, okay, this one corner case doesn't work, maybe other corner cases doesn't don't work either.

So then we found this problem with export cipher suites that that was still enabled in many parts of TLS.

And again, I mean, many of these cases, what would happen is that somebody would say, Oh, but what about negotiation?

And or what about export cipher suites? I mean, like, nobody uses those.

So there's no point modeling those things. We're not going to model those things.

But then somebody would say, you know what, you should check if anybody's using those.

And you know, you could check. And then there'd be like 25% of the web is using or supporting export ciphers.

And so that led to the freak attack, and then the logjam attack.

And then we were looking at other corner cases like MD5, or triple bears and stuff.

So the general pattern would be we would say, okay, the core part of TLS we did approve of, but then there are all these other things, renegotiation and other kinds of things.

Let's go look at them, see if adding them into the core model still preserves the nice security theorem we have.

Usually, it wouldn't. But and then we say, Okay, if it doesn't, then is that actually an attack?

And it can you is the attack actually executable in real life. So and often, there will be some very funky, interesting sort of protocol hacking to do in order to make the attack work.

So just kind of an interesting phase, let's say, although that wasn't our goal.

So our point was, you were trying to do proofs.

And on the pro in the process, we found attacks. So, you know, I don't think this is something that we would necessarily be able to do again, because you will never understand the protocol at that depth or be interested enough in it to find it.

But you know, that's the sequence that led to this four or five attacks out there on TLS.

But after that, we also helped fix up TLS 1.3. And then the rest of our work has been much more constructive, so to speak.

Yeah, I was gonna ask because a lot of those attacks helped inspire the IETF to get going with building a new version of TLS that is structured in a way such that some of these attacks are just not possible.

So from your perspective, what were the important design choices for TLS 1.3 to help make sure that it didn't accidentally have all of these corner cases that were enabled in the real world and actually exploitable by realistic attackers?

Yeah, so I guess the biggest change in TLS 1.3 was the careful treatment of transcripts, let's say.

So, I mean, the way people, cryptographers, write protocols, right, they try to do it minimally.

So that's a certain kind of mathematical beauty in saying, if I don't need a MAC, I won't use a MAC.

And if my MAC only needs to include these two values, but nothing else, then I won't use the other things because I want to have a minimal protocol that shows off that every single element I can justify that if you take this off, something breaks, right?

And so that's how SIGMA's protocol is designed.

A lot of the other protocols are designed with like this really beautiful, like, if you want to avoid the confusion between message two and message three, then you just flip the order of the values so that they can't be confused.

But you don't add a tag because that would be ugly.

Why would I add a tag to this, right? So I think a lot of the old protocols like IQE2, IQE1, TLS, to an extent SSH, but SSH actually has a somewhat nicer design, were designed at this minimal core protocol idea where you would not add something unless you could justify that this was actually needed by some property.

Whereas TLS decides, TLS 1 .3 decides to go the other way, where you put everything that has happened into the transcript.

And every single crypto operation you do is effectively including the transcript in the crypto operation.

So at any point in the history of your connection, if I and you don't agree on some value, that's going to basically show up in a crypto failure cycle.

So you turn a agreement failure into a crypto failure, into a functional failure.

And I think that's a design choice that's taken also in the noise protocol framework and in many other places now.

I think it's become more of a modern choice that you force failures if there is any kind of attack anywhere and any kind of tampering or even a network level failure happens, the crypto will fail.

So that I think was like the biggest thing that happened because that was needed because of the triple handshake attacks, but also because of all the downgrade problems.

All of them go away if you make sure that both sides will refuse to do crypto unless their transcripts completely agree.

So that would say is the biggest one. Other than that, there's a natural cleanup, right?

So TLS 1.3 and like in many other modern protocols, you throw away everything which is old.

And so that's a natural cleanup. It's not so easy with TLS as it is with more modern protocols because of the legacy.

You still have to support TLS 1.2 and maybe even 1.0. So you can't throw away MD5 from a TLS implementation because you need it as an essential element of TLS 1.2, the finish messages.

So you cannot throw away certain things. You just have to keep them.

But you can still try to protect TLS 1.3 from downgrading back to those ones.

So there are two or three different downgrade mechanisms built into TLS 1.3.

So there's really a strong notion of defense in depth that kind of maybe we went overboard, I would say in some cases, because they're like four different ways of authenticating something and they're all in there.

So there's nothing like the signature includes the transcript, the finish message includes the transcript, the keys include the transcript, like everything authenticates the transcript.

So in some ways, it's completely antithetical to the minimal principle.

But the idea is that there were so many corner cases where we were falling through in earlier versions that we just kind of made this, tried to make this into much more, I mean, there might be weaknesses, but they might be, they'll become, they'll be very hard to exploit is the idea.

So how have you taken the lessons from attacking TLS 1 .2 or trying to prove it and discovering attacks and helping build TLS 1.3 to new protocols?

You mentioned MLS. What is MLS and how is MLS going to be secure, given that it's a brand new protocol?

So MLS is this IETF, well, it's a draft protocol designed by the IETF working group called messaging layer security.

And the idea there was that, well, we seem to have many secure messaging protocols, usually some flavor of signal or something else, which do very well with person to messaging, so end-to-end two-person messaging.

But the idea of doing messaging in groups, especially large groups, is largely unsolved.

And it's also something that perhaps would benefit from a consensual design where academics can come and analyze it while it's being standardized, much like TLS 1.3 was.

So somewhat inspired by TLS 1.3, somewhat inspired by the real-world need of needing such a protocol, which doesn't exist.

And part of this IETF philosophy that if something is worth doing, it's worth doing properly and once and everybody can share and whatever.

So we've been, we and many other research groups, two or three other research groups have been involved in the very beginning, trying to guide the design of the protocol in a way that would be amenable to formal proof, so that when we come up with the, like we did for TLS 1.3, that when the final standard comes out, it comes out with two or three or four different styled proofs of different aspects of the protocol, so we can get good coverage and be confident about it.

This is particularly interesting for MLS because messaging doesn't have as much of a history as TLS.

So we don't have a very good idea of what it's supposed to do, even.

So especially when you think about group messaging, even Facebook, WhatsApp, all of these people, I mean, nobody really has a very strong idea of what the goals of these things should be.

I mean, we know some things, obviously, that nobody from outside the group should be able to read messages that are sent in the group.

And once you're removed, you shouldn't be able to go back into the group once you're added into the group and so on.

So I mean, there's a notion of membership, and it's clear, but there's lots of corner cases which are unclear.

So for example, when somebody is added to a group, do they get to see previous messages or not?

That's a very important design choice, and it affects a lot of the security goals.

And similarly about removal, about what happens when you update your keys.

If you suspect that your keys have been compromised or your phone has been compromised for a few minutes at the airport or something, then how can you recover from this?

Signal already provides not nice abstractions and ways of doing this kind of thing, and people perhaps have begun to understand what these kinds of security guarantees mean in the two -person case.

But in the group case, everything gets way more complex and interesting, let's say, from an academic point of view.

And we don't have, we did not have, at least to begin with, very good security definitions for what we should design, even when we started designing.

So it has been a very interesting, I think, process, and you're part of this as well, Nick.

So I'm wondering if you find it intriguing as well, the way the need for specification, the need for standardization kind of are going hand in hand as we are discovering what we need to do while we're doing it.

It certainly has made it a very interesting target for proofs.

And I think now the first proofs are coming out, and they're all super interesting because they're really stretching the limits of what we can do with our formal verification tools.

Yeah, it's an interesting challenge, right? Because with two people, it's barely able to grasp what it means for security to roll over, or you'd have forward security, or what happens if your phone gets compromised?

What sort of messages you have to worry about have been exposed to the attacker?

But when you get, you move to a three -person situation, what happens when people are compromised to, you know, a hundred person situation to where as MLS is supposed to scale to tens of thousands?

What does that even mean? Can you even have a strong idea of who 10,000 people even are as a person, let alone being in a group chat with them?

So I think there's a lot of fascinating challenges. So is MLS taking the maximalist approach of TLS 1.3, or the sort of minimalist approach of something like Sigma, where are we hashing everything, or are we hashing just a little bit?

And what is the intuition behind that? So at the protocol level, we are hashing everything.

So it's maximal in the same sense. But the guarantees that you get, you have to be very careful to not make this unscalable.

So here's an example.

So if I want to be sure that if there are, if I have a 10,000 person group, it's unrealistic to assume that everybody in that group is honest.

So somebody in there is going to be malicious, even just a prankster, maybe.

But maybe somebody is actually going to be malicious.

So what kind of guarantees do you get when there is somebody in that group who's malicious?

One employee who's going to leave next week, but wants to do something.

So what are the limits on what you can do in that situation is perhaps the most interesting part of MLS.

And there, you could go with a full hog and really enhance the protocol with almost like a secure multi-party computation.

You could add zero-knowledge proofs to make sure that nobody can cheat.

You can really make this protocol resilient against n minus 1 bad guys in a group of n people, or n by 2 bad guys in a group of n people.

So you could actually add, it could be really maximal and provide maximal security almost at the level of multi -party computation.

But that's not going to scale to these kinds of groups, at least with the SMC technology we have today.

So it's trying to use mainstream crypto, so far at least, to solve this problem.

So it has its limitations. So it provides weaker guarantees in some places where you could get stronger guarantees if you're willing to deploy much more powerful crypto primitives at the cost that it would in general.

So it's not maximal in that sense, where TLS 1.3 was really going for the latest, greatest security.

You can add a post-quantum chem to TLS 1.3 without too many issues and so on.

So it's really not limiting itself for security, whereas MLS has to find this balance between performance and security.

So we have not yet perhaps found the right point at which to say, okay, this is the perfect point.

But we're getting there. We kind of like go too far and go back and so on.

So I'm not sure whether it's sort of comparable.

But indeed, we're trying to do the design in a way, of course, that sort of gets maximum security, right?

So we'll see how it goes. So you mentioned using existing sort of boring historical crypto primitives rather than the fun, fancy zero -knowledge proofs and pairings and things like that.

But with MLS, we discovered that there was something that was not standardized that could be a basic tool chain that you helped push through to make a standard about.

Tell us about HPKE. Yes. So HPKE is actually, I mean, so people who know ECIES or the box construction from SALT will say, oh, it's dumb, right?

There's nothing special about HPKE. It just does, if you want to encrypt something to be using somebody's public key, some data, this is like a classic public key encryption thing.

HPKE is a construct that does that.

The reason that we designed a new standard for looking at is that it turned out that most of the existing sort of libraries or standards for doing this were kind of using obsolete crypto or non-standard crypto.

So the SALT box construction is pretty effective, but it uses SALSA20, not CHARCHAR20.

And it does not do a certain amount of derivations and so on.

So it's kind of a bit hard to prove secure because it just skips a couple of steps and so on.

Whereas ECIES, the standards that are there for it do not really support things like up to 5 .519 in a clean way.

And they do not use HKDF, which is much more modern and so on. So just got a little bit obsolete.

So I would say that HPKE is a mechanism that if in 2020, you want to do public key encryption for an arbitrary length plaintext, then this is, we think, the best way you can do it.

So it's almost like a, I would say, rather than a new construction, it's almost like a documentation of what uncontroversially you should agree that this is a mechanism by which you should be encrypting large plaintext using somebody's public key.

In addition, of course, it also has modes which are very commonly needed, where you don't just encrypt to somebody, but you also authenticate yourself.

And this is notoriously hard to get right.

And even if you give it to your, this is the exercise I give to my grad students a lot to say, okay, you have an encryption mechanism and a signature mechanism.

I want you to send a message to me, encrypt it and sign it in a way that I can be sure that it has authenticity and confidentiality.

And it's like a billion ways of getting this wrong.

You can do encrypt and sign, that could be wrong.

You could do sign then encrypt, that could be wrong and so on. So in order to make this sort of idiot friendly in some sense, we have these modes for authentication using pre-shared keys or using private keys, which kind of just work.

And so the whole thing, we have a proof of it already in using cryptograph and we're hoping that other people are going to do more proofs of this mechanism.

It certainly is very amenable to proof because it was designed to be so.

And the basic constructions in there and the algorithms you use there should be hopefully completely non-controversial, at least at the CFRG, it seems to be going through pretty smoothly.

And one of the nice things about it is also that we designed this with this ChemDem structure, which is very popular for these kinds of mechanisms, that allows you to just plug in a post-quantum chem, if you like, and it just works.

So it's post-quantum friend ready in some sense. So the HPK spec itself is used inside MLS, like you pointed out.

It's also going to be used in the encrypted client hello extension to TLS, which tries to provide better privacy for TLS 1.3 connections.

And we are already looking at using it in a few other places.

I think it's going to be a pretty useful construct, hopefully, that people will use and implement.

Okay. So what's next for you? What's interesting to you? How do you decide what your next research topic is and what do you have lined up?

Well, some things you do because you want to, and some things get dropped in your lap, right?

So the last few months, we along, like every other researcher out there, has to be had to look at these proximity tracing solutions and analyze them.

So we've been looking at a few of those. I don't see that as something that necessarily we might go on to do for long term.

But certainly in this day and age, it's important to be able to look at and analyze some of these new systems that are coming out.

So we've been looking at a few of them and trying to analyze them. Other than that, the big new direction, I think, that we are looking at and have been exploring with the students is privacy preserving machine learning.

So like many others, like I expect you, we are also concerned about the enormous amount of AI in our world and its effects on confidentiality, privacy, and everything else that we hold dear.

So we've been studying algorithms that people have been doing and seeing if they actually provide the kinds of guarantees that you would like.

And how do you prove that a machine learning system is actually preserving privacy?

So which is a delicate property to have, because it's already a probabilistic system which learns.

So to stage a property unambiguously about what it might learn, especially when we don't have a very good speck of what it will learn, is not terribly easy.

But I think that's an important new direction that I'm keen to look at.

And then, of course, there's a billion other protocols that are coming out.

The IoT people have a new working group for doing authenticated key exchange called Lightweight Authenticated Key Exchange, LIG.

That looks very interesting.

There are many extensions of TLS. And we're also looking at the noise protocol framework and so on.

So in the protocol world, it never stops. People keep designing new stuff, and we keep analyzing that.

How do you have fewer protocols going forward?

Less work for you, but I think safer for everyone else. Safer for everybody, yeah.

What about you guys, Nick? What do you think is the next big thing for your team?

So here at Cloudflare, we're exploring what of the next generation of protocols are ones that we can introduce and provide for our customers.

And some of the characteristics that we're looking to improve upon are related to privacy and privacy of browsing.

So the one you mentioned earlier, Encrypted Client Hello, it used to be called Encrypted SNI.

But recently, the design has been expanded to provide more privacy.

That's a really interesting one.

I think there's certain gaps in the design of protocols that have been deployed on the Internet for decades and decades, in that the cryptographic level of security is not something that was baked into a lot of different things.

If you think about DNS, HTTP, we're still along the road of encouraging both incorporating identity and cryptographic assurances into these things.

So I'm excited for Encrypted DNS.

We're excited to continue to do work on Encrypted Client Hello.

We're additionally interested in finding other ways to help make sure that these connections between browsers and devices and IoT and Cloudflare's backend are secure and use the most modern cryptography possible.

So there's multiple different fronts in which these are evolving.

So see how they move along. The amount of traffic passed through TLS 1 .3 to Cloudflare recently crossed past the number of connections over TLS 1.2 in the last couple of months.

So it's been a pretty rapid evolution compared to the original deployment of TLS 1.2.

We're really excited that the world is coming along this ride of upgrading the fundamental protocols of the Internet.

Yeah, I think so. In the research world, there seems to be a huge spurt of people working on multiparty computations, zero -knowledge proofs, homomorphic encryption, and so on.

It seems like we've gone past now to a point where we have solved a lot of the basic protocol issues, and now we're ready to apply these fancier things.

So it's going to be a very exciting moment in the coming few years.

Yeah, I'm excited to figure out where in the stack these actually can be at value, because it's a little difficult to explain what value it brings, but it's useful in certain contexts.

It's kind of amazing that in 2020, cryptographic applications are actually bound by some types of resources that you wouldn't think would still be a problem, like CPU.

So we moved a lot of our HTTPS infrastructure from RSA to ECDSA in order to cut down on the amount of CPU that's being used.

And even with the fast computers that we have today with so many different cores, you actually do have to be concerned about that.

So when it comes to more advanced protocols involving multiparty or zero-knowledge proofs, yeah, it's surprising that something like CPU cost or something like memory cost is still in the way.

And just down the road is, as you mentioned several times, post -quantum cryptography.

And I still haven't seen a quantum computer that can do more than counting 10, but you're either coming along.

Yeah. And you guys seem to already have some nice prototype things going for that.

Yeah, we have. And we're excited about things like HPK, which are designed to be post-quantum ready.

Because as cryptography gets older and creakier, it has to be replaced.

And hopefully, we can replace it with something of an appropriate shape and have some fun with it.

So, all right, Karthik, thank you so much for spending this last hour with us here at Cloudflare TV.

And yeah, any last words, anything else you want to share with the audience before we let you go?

Not particularly.

I've been talking too much in this last hour. So it was good fun to chat with you.

I mean, it's always fun to hear what you guys have been doing. Because every year when Crypto Week rolls around, I kind of see tons of very interesting stuff coming out from you guys.

Excellent. So, all right. Thanks, everybody.

And take care, Karthik. Bye, Nick. Bye-bye. Bye.

So,