Post-Quantum Internships at Cloudflare

Presented by: Sofía Celi, Bas Westerbaan, Jonathan Hoyland, Thom Wiggers, Goutam Tamvada

Originally aired on March 4, 2022 @ 8:00 AM - 8:30 AM EST

In this segment, we will speak with Thom and Goutam about their experience of being an intern and helping Cloudflare become quantum-secure.

For more, don't miss:

Post-quantumify internal services: Logfwrdr, Tunnel, and gokeyless
The Cloudflare Research Hub , where you'll find all the latest news and blog posts!

English

Research

Transcript (Beta)

Hi everyone and welcome again to another Cloudflare TV segment of the Post Quantum Series. Today we are joined with a huge array of really amazing people. We have Bas Westerbaan from Cloudflare Research Team. We have Jonathan Hoyland also from Cloudflare Research Team. Goutam Tamvada now as a Research Engineer from the Max Planck Institute. Finally, Thom Wiggers, PhD candidate of Radboud University. Today we are going to be talking about the amazing way that they did an internship on Cloudflare. So welcome and happy you all are here. Thanks for having us. Okay, so let's jump right in. So let's start first with Thom. Thom, what was your internship like and what did you do on that period? Yeah, so yeah, my day job, if you want to call it that, is to do all sorts of research on Post Quantum Cryptography and writing papers and stuff about that. And yeah, I joined Cloudflare because, of course, it's a big Internet thing and they do all sorts of big impact things on the Internet as well. I had worked with them before a little bit also on the research side. And I felt like, yeah, I know this academic world a bit. Let's have a look a little bit at what research is like in the industry as well. And yeah, the internship at Cloudflare was really fun in that sense, because I could do both a bit of the engineering work and also a bit of the still the research side of things. So in the engineering work, during my internship, I worked on bolting on Post Quantum Cryptography onto some Cloudflare internal services, namely a log forwarder, which is a service that pushes all sorts of logs from Cloudflare servers that are spread around the world over the Internet to Cloudflare central data centers for processing. And yeah, in an experiment, we put Post Quantum in there to basically see, does this break very hard? It didn't, which is good. Is it faster or slower? And it does basically the same and they're just as reliable as all the other things. So that was a very successful experiment in that sense. So it's on the roadmap now to eventually get rolled out a bit more broadly, I think. And on the research side of things, which I also still really enjoy doing, I mean, playing with Kubernetes is fun, but for around half the week, the research side of things, I looked at proving a security protocol that was part of my PhD research to come up with a protocol. And we already played around with that at Cloudflare a little bit in the past. And we had a security proof by pen and paper. But this time around, we were basically, you know how computers are very, they always do what you tell them to do and not what you want them to do. So convincing a computer that your protocol is correct is actually quite a good exercise in, is this protocol actually secure? And that was basically my other project at Cloudflare. So what made you pick a sort of formal analysis mechanism? Had you done any research in that before or you just decided you want to do something completely new? It was something that was not completely new in the sense that I had some exposure to it a little bit in my master's courses. But, yeah, we have this pen and paper thing. And we had already run into some things where, because you actually writing this, you read between the lines, etc. of what you're writing. And then it turned out like, oh, yeah, this is actually not true. What we're writing down here, what we meant was true, but the text doesn't convey that correctly. And we basically were thinking like, yeah, this is a very good follow up work and a good complement to what we've been doing. And also slightly with like the outline of my thesis in mind, this would be a very good chapter to sort of fill out that picture. We chose to work with Tavrin, which is a tool that allows you to model these protocols. And, yeah, that was mainly because there was an existing model of TLS in there already. And we basically adapted and extended that model to work with KenTLS instead of with normal TLS and redid the proof. It was quite important that that model was already there because then I had something to build on, which sort of compensated with the lack of experience that I had. So we didn't have to come up with all of the security properties, etc. from scratch. And it was quite helpful. How did you find the adaptation? Did you find it easy to sort of figure out what Tavrin syntax was and what everything was supposed to be done? Yeah, I think Tavrin syntax is not so bad, actually. Controversial opinion. But the. I mean, in the end, it resembles like something happens on the left hand side of your protocol diagram and then something happens on the right hand side of the protocol diagram and then there's an arrow and stuff happens on the arrow. The main difficulty was in that you just start up Tavrin. In this case, we had a lot of lemmas already from the previous work, but those lemmas weren't directly applicable always. And sometimes because certain protocol messages have been reversed, so like this happens before this was no longer correct, etc. And then Tavrin is not so good at providing feedback why certain things happen the way that they do happen when it's trying to automatically prove things, for example. So then, yeah, it just takes a lot of time and a lot of memory. And then it would sometimes be nice in the user interface as well that it kind of says, I'm trying to do this now rather than. And. Yeah, also the pictures that Tavrin can create these pictures with what it's sort of trying to do, but those are not the most legible things in the world. They can get pretty big. Yeah. Trying to do something complicated, so perhaps necessary. Okay, Gautam, so on what services, going back to PostQuantum, on what services did you work? My internship at Cloudflare, I worked on these two services, the connections between Keynado and GoQLS internal. And essentially what they are is when Cloudflare has to terminate a lot of TLS connections from a lot of places and it's better not to have the private key in the same place as the other TLS operations are happening. And so the private key is kept separately and any signing operations that require the private key calls have to be made to these services in order to handle those operations. And Keynado is one such service that handles a lot of the private key signing operations. And GoQLS internal is another that it deals with more complicated, that also does signing operations, but also deals with more complicated scenarios. And I spent some time, I spent a bunch of my time PostQuantum, making PostQuantum the connections between Keynado, which is written in Rust, and GoQLS internal. And how did it go? Smoothly. Keynado, I spent a lot of the time with Keynado, which is written in Rust TLS, which is pleasant to work with. It's certainly, yeah, not bad at all, I'll say, in comparison to some of the other TLS libraries that I've worked with. You've worked a lot on Boring SSL, right? Yeah, so in my time leading up to the internship at Cloudflare, I spent a decent chunk of my time trying to integrate PostQuantum cryptography into various TLS and SSH libraries and trying to figure out or trying to understand what goes wrong or what happens and the like. And yeah, as part of it also arising from that, I got to spend a lot of time with Boring SSL. So, a question that maybe gets, that's, I think, important to highlight. So why is for everyone, why is it important that we migrate our internal connections to PostQuantum cryptography? I don't know who wants to answer, whoever wants to answer. I can take that. Basically, if all sorts of adversaries, intelligence agencies, attackers are recording all this data that is like flowing on the Internet right now is being recorded. And at some point when quantum computers do come out, they will be able to break the confidentiality of this recorded information. Now, if, whether that takes 10 years or whether that could take 10 years, it could take 20 years, it could take 30 years. But those timescales might not be long enough. There are many institutions, many organizations that would want to keep a secret for like longer. So even if there isn't a quantum computer right now, we need to make the transition to cryptography that is resistant by quantum computers in order to make sure that the secrecy does not have, the data does not have like an expiry date. Tom, you did some work on the authentication in TLS, sorry, PostQuantum authentication. Do you want to talk a bit about how that's different from the confidentiality stuff that Gautam was just talking about? Yeah, so perhaps to reemphasize what the difference is, there's key exchange, which makes sure that you and who you are talking to agree on a shared key and use that to encrypt data with. And you use that key for a while to keep sending data. And the attack that Gautam describes that records that key exchange, and then sometime in the future, they break that and then they can decrypt whatever happened then. Authentication tries to achieve something that's a bit different. It tries to make sure that the party who you are talking to is the party that they claim to be. So in principle, that is something that is a more here and now type property. Because I want to know right now that I'm talking to Sophia and Jonathan and Bas and Gautam. And they want to be sure that they're talking to me. Of course, in Zoom, you can see each other's faces, but in TLS connections, you only have like host names and certificates that you can base this on. So that is the authentication. TLS uses signatures for that right now. The project that I was working on investigated using key exchange also for that method. So that's the fact that I have someone that says for me, yeah, someone has that key exchange algorithm key that is in a certificate, for example. People can check that I should have that public key. And then by the virtue of being able to execute that key exchange algorithm with you, and then proving that I can drive the shared key, I must have had the secret key that belongs to the public key that was in that certificate. These are a bit more complicated changes in the post-quantum setting. Looking at this versus doing signatures is interesting and still quite open research, I think. Because it used to be the case that signatures or key exchange were equally expensive in Diffie-Hellman or in elliptic curves or with RSA. But with post -quantum algorithms, we find that the available options for key exchange seem to be a lot more efficient than the available options for signature algorithms. Whether that is in size, so the number of bytes that you send, or the algorithmic complexity of the algorithms. There's lots of weird trade-offs that you can make there, but it's not so straightforward anymore. So that's why we were also looking at this alternate approach. Why is that important to be looking at now? It's mainly because migrating the whole alt -stack signatures from certificates, the public key infrastructure on the Internet, you need to get your certificate vendor to get you a key with that key. There's a lot of standardization effort that goes over that, and it's important to get that work done now, so that we are ready when the quantum computer arrives in the future. Because if someone does have a quantum computer and you are still using RSA, they could impersonate you and then get in the middle anyway, even if you are using a post-quantum key exchange algorithm. We need to have the certificate authorities change all the way to the roots, so that everything's quantum safe. Gautam, you only mentioned one part of the projects you worked on. You also did some work in formal methods, right? Yeah, I spent some time looking at formally verifying the correctness and security of a post-quantum key exchange algorithm called ProtoChem. Yeah, this was on the more researchy side of things. But this is also, again, as Thon was mentioning, it's better to have your work checked over by a computer when it comes to this stuff. It can be very easy to make subtle mistakes that can cascade and snowball into something big. And so, yeah, just like for protocols, I spent some time looking at formally verifying algorithms as well. So, you mentioned ProtoChem. What is ProtoChem? So, the CHEM stands for Key Encapsulation Mechanism, but it's basically a key exchange algorithm. It's a post-quantum key exchange algorithm that's based on the learning with errors problem. It uses unstructured lattices and its implementation and its design is relatively simple to understand and to analyze. So, as of the moment, it is an alternate candidate in the NIST post-quantum cryptography standardization process. And you're implementing it in Jasmine? Yeah, so part of the work that I did, some of the work that I did at Cloudflare was actually to come up with a implemented in Go in order to be able to integrate it into the Circle library. But in addition to that, the plan is to implement it, an optimized version in Jasmine. So, Jasmine is this, just to provide a bit of a background, Jasmine is this programming language that sort of has an assembly-like syntax. But it's designed for writing cryptographic code and is designed to be more easily verifiable. And so what that means is it comes with this formally verified compiler that predictably transforms Jasmine source code into x86-64 assembly code. And you can also extract Jasmine, you can also convert Jasmine code into this equivalent representation in what is called EasyCrypt, which is a proof assistant. And then you can prove things about the Jasmine code in EasyCrypt. And so you can prove something in EasyCrypt, and then you can produce assembly output based on that. And so, if you did everything right, what happens is you get assembly code that you can reason about, that you have formal guarantees about, x64 assembly code that has some formal guarantees along with it. And the plan is to, what we're aiming to do is to try and integrate some formally verified assembly code into Circle using Jasmine and EasyCrypt. Yeah, so one interesting thing about Jasmine is it's not just formally verified that the code is correct, it also has some sort of side channel protections. Is that right? Yeah, you can also, so in EasyCrypt actually you can reason about the constant timeliness of the code as well. There are like facilities for that. And Jasmine also comes built in with a memory safety checker, an automated memory safety checker that you can use to verify for like some programs, perhaps like not all of them, that like it is indeed memory safe, or like it is memory safe for like certain ranges of the memory. And why is it generally important to have this formal analysis and verification efforts? What is it preventing? To anyone else, not only to you, but for everyone who wants to say. Sure, I can take a stab at it. So these formal methods are not a silver bullet. They won't prevent us from like, you can still, it is still possible to discover like new attacks that perhaps like nobody has ever thought of before. And then try and mount them. But I think part of the appeal of these formal methods is that they force you to think more carefully about what you're doing. And they also serve as sort of a, as like a database for all the mistakes that we've made before, for all the, to keep like. So, you know, now we now know, for example, that you have to, you shouldn't branch based on secret data. We now know that you have to write code in a certain way in order to, in order for it to be secure. We, all this like knowledge that we've built up over the decades of like writing cryptographic code and like designing protocols. I think having formal methods is a way to like sort of encode all of that. And like have a computer check it for you, as opposed to you manually doing it yourself every single time. Which can be, which is error prone, which is, you know, it's, it's, yeah. Error prone to have a human check it over and over again. It turns out computers are very, very good at doing what we tell them to do. Quite often, we think we've implemented something wonderful and amazing. And it turns out that it's horribly, horribly broken. And especially with big protocols, I mean, Tom was mentioning TLS. You can end up with, you know, 18 modes, repeated handshakes, all kinds of weird and wonderful combinations. And finding attacks by hand is just incredibly, incredibly excruciating. It's really, really slow. And computers, on the other hand, are really, really good at checking lots of things really, really fast. And so formal methods, and I mean, this is more formal analysis, is you take a protocol design, you say, we're going to try every single combination of messages, every single combination of modes, and every single combination of attacker actions that could possibly happen, and check that nothing bad ever happens. And like, it's a little bit more sophisticated than just brute forcing, but not that much more sophisticated. You know, you just try everything. And sometimes, not that often, sometimes you find, oh, if you do this really complicated dance of things, actually the protocol is broken. And, you know, we saw this in the TLS 1.3 standardization process, there were two or three attacks found using formal methods. And so this is sort of something cloud players continue to work on on our future protocols, which are not yet standardized. We're checking them before they're sort of mass deployed and everyone's using them, we're checking to make sure that they're actually secure, for some definition of secure. So, taking a step back, Tom, how did you generally enjoy your internship experience as a cloud player? Yeah, it was, yeah, I think the first thing that comes to mind is it's quite different from academia. I've been, I've had some side jobs, did a bit of web development alongside my studies, but I've basically only really seen the inside of the university. So it was very nice to also see like a medium-sized company. I think cloud players will have to learn that it's medium-sized now, no longer a small company. But for a medium-sized company, they run a pretty big chunk of the Internet. And what I appreciated a lot was also the open nature in which that a lot of stuff was shared around the company. And you could really see that, like, this is something that the streaming team is struggling with, or this thing is going on with the HTTP3 people that are experimenting with that kind of stuff in the protocols team. And that kind of thing was really cool. And also, it was nice to do, I like doing research, I really do. But I also, it is sometimes also to spend part of your time just building things and seeing those things move on to production and thinking about, I need to make a deployment out of this. And you don't get to play a lot with Kubernetes inside of the university. Let's put it that way. Which is, I mean, it was fun at times, it was horrible at other times. But that's the kind of stuff that you don't get exposure to in the ivory tower. And I really enjoyed doing that as well. And also just the research team is a whole bunch of like very, very smart people. And yeah, it was always great to get also lots of feedback from people that have also a very wide range of expertise is actually. It's definitely no longer the crypto team, it's the research team. And definitely also a lot of networking and knowledge in there that I benefited from a bit in some of the research questions as well. Thank you very much. And Gautam, how did you feel? Yeah, I also really enjoyed my time. Like Thom was saying, I also spent a bunch of time in sort of the academia ivory tower. And it was nice to have to think about some of the things that you might not think about when you don't do deployments in the real world. Or when you're not like, there's all sorts of, I guess, considerations that pop up when you actually have to deploy something to production. You have to be careful about it and you have to like, you really have to like look over what you're doing and the like. And yeah, I also, there's, you know, Cloudflare like runs a blog that like there's all sorts of posts about like all aspects of engineering and what is going on at the company. And I think there's a lot of great content over there. And it's nice that we also, the research team gets to present its work as such. And I also really enjoyed that like to add on to the openness part that Thom was talking about. Thank you very much. I think we were very happy to have you both as interns. And you really helped us a lot with advancing our goal to make the Internet post-quantum secure. So that's really fantastic. Would you like to add something, Sofia? Yes. So in 30 seconds, I guess. What are you, Thom, excited about the future in post-quantum cryptography? I hope that NIST will come out with the selections for the next step in the process. So drafting standards soon. And I'm excited to see, they said they were going to choose between two quite opposite options. And I think that those things will start shaping protocols in a much more direct way in the near future. And say the IETF, et cetera. You, Thom, in 30 seconds, are you excited? I'm also excited about the choices that NIST is going to make. But even more so, I think this is, deploying post-quantum cryptography is also a great chance to like deploy formal methods. Use of formal methods in production systems. And I'm really looking forward to tools like Jasmine, Tamarin, and EasyCrypt actually being out there in the real world. Yes. So I guess with that, thank you very much for joining us. And if you are a person that is interested on applying cryptography into the real world or about networks and how they actually work in the real connections. Please apply to our research internship program. It's open right now, so you can apply and we'll be happy to have you on the team. And with that, thank you very much, Thom and Gautam, for everything that you did. And also for being in this segment. And thank you Bas and Donald and also for being here.

Cloudflare Research

Don't miss these great sessions from the Cloudflare Research team!

Watch more episodes