Zero-Knowledge Proofs for Private Web Attestation
Come hear the behind-the-scenes story of our prototype Private Web Attestation: why we made it, how it works, and what it can do. See how modern cryptography further protects your privacy on the web.
Good morning, I'm Watson Ladd. Today we'll be talking about zero-knowledge proofs and the Cryptographic Attestation of Personhood, making for easier, faster CAPTCHAs to tell humans apart from bots.
Joining me are two of my colleagues on this project, Thibault Meunier and Armando Faz-Hernandez.
So Thibault, what are the problems with CAPTCHAs and how does what we do address some of them?
Hi Watson, thanks for the introduction. I'm excited to be here.
So yeah, as you mentioned, we're going to talk about the Cryptographic Attestation of Personhood, which we released over the last three months in multiple steps: first the introduction of the Cryptographic Attestation of Personhood and what it is, and then an enhanced version.
The problem we have been trying to solve for quite some time at Cloudflare is verifying that the visitors we see are not bots. We do have a bot management system that, ideally, is able to detect whether someone is a bot or not.
But sometimes we don't really know, and so we need to ask something extra from the user.
This extra test used to be a CAPTCHA that we displayed to users, and we have some great content on our blog about why we have CAPTCHAs and how we decided on this CAPTCHA challenge.
As part of the research team, last summer, summer 2020, a team was formed to research new kinds of challenges that could reduce the number of CAPTCHAs users see and be more practical while still keeping a good level of security.
And so that's really how the cryptographic attestation of personhood came to be.
So for some of our viewers, they might not be familiar with what a CAPTCHA is.
So what's a CAPTCHA and why are they so terrible?
Yeah, so that's definitely something I think is best shown with an example.
So I'm going to show you the CAPTCHA we display to users when we don't know whether they are bots or not.
And that's something Cloudflare serves on behalf of a customer when they have certain security features on.
So you should all be able to see my screen.
And so as you see, this is a regular CAPTCHA page that Cloudflare would issue.
As mentioned before, we don't know if you're a bot or not, so we ask you for one more step.
So I will click on "I'm a human", which is the usual checkbox pattern.
When I click on the checkbox, I get a challenge to complete, which already doesn't look right on my screen, but that's something else.
I'm asked to click on each image containing an airplane.
So we have to look for these airplanes and click on them. Definitely something tricky, and I think that, with most of us not having been on an airplane for a while, it's even trickier.
So I click on those, I hit verify, and then ta-da, I'm allowed through to the website.
And in this case, it was Praviz Fast website.
And now I would like to showcase what the user experience of the new challenge looks like, and how it handles some of the security aspects that CAPTCHAs were meant for.
So I will just click on one button, which is "Verify with CAP".
And as you see, I'm prompted to verify with my security key. That's something I have on my computer.
And I will allow the website to access my security key, because that's what Cloudflare is using as of now to determine whether your security key is legitimate and whether we are seeing it too much or not.
And so as I do that... sorry, go ahead.
You might not be aware of what we're talking about.
Oh, yeah, we're talking about this kind of security key. I think there are beautiful pictures in the blog posts showing what a security key is.
It's also available with some of the authenticators you have on iOS, for instance, like Touch ID or Face ID, or on a Mac.
So what was happening is that we created a WebAuthn credential.
We verified that credential by sending it to Cloudflare, which can take some time; this time it was rather fast.
And then once we know that the manufacturer of your key is actually legitimate, we just allow you through, with some additional checks on that.
And that's it, then we can proceed through. So that's what CAP is about.
Yes. So what actions did you take in order to interact with the security key?
So this really depends on the model of security key you have and the type of key you are using.
In my case, I just pressed the metal part, just to say it was okay.
But on iOS, for instance, you may use Face ID; on Android, you may use your PIN.
So it really depends on what kind of authenticator you have.
And that's something that is specific to your device and you should be familiar with already.
So that means that you don't need to look for some strange image, which can be kind of annoying, right?
Trying to pick one image and trying to differentiate it from the others.
Yeah.
I mean, definitely that's something which is different from the CAPTCHA challenge, where you would have to look for specific images that are randomized.
So in our preliminary study, that seems to give a great accessibility benefit, making this challenge more accessible and less annoying than a CAPTCHA.
Of course, we don't know yet the effect in the long term. So that's why we're rolling this out slowly, to better understand the accessibility differences.
Yeah. You raise a very good point about accessibility, because not everyone can see the images well, for example, and there are some other CAPTCHAs that use sounds rather than images.
And then, you are not always in a good environment like your office or school.
Sometimes you are on the street or on the subway trying to get access to these resources, and you don't have sufficient time and resources to solve the CAPTCHA.
So yeah, I really like that as an alternative for bypassing challenges.
What were the challenges associated with doing that?
Yeah, well, we already saw the user interface of this proposal, but behind the scenes there are some cryptographic protocols running.
And one of the most important pieces is the attestation that is contained in the device.
Attestation means that your device or your security key contains a secret that cannot be revealed; the device can only give a proof that it actually holds that secret.
With this challenge, what happens is basically that the web page asks for one of these attestation proofs, and once it is retrieved, the proof can be publicly verified using public key cryptography.
For this, there are some concepts involved, like digital signatures, the PKI (the public key infrastructure), and certificates.
So we need to verify these proofs in order to have certainty that the device is actually not compromised.
And yeah, going a little bit deeper on the implementation side and on the security aspects of this.
If we take this general concept of verifying some kind of proof, the proof is provided by the device as a digital signature.
So something we need to implement is software that can verify these proofs and that can generate other proofs in order to hide some bits of the key.
So, okay, let me start by detailing some aspects of the implementation.
Basically, as you can see in the demo, everything is running in the browser.
And basically, we start with big number arithmetic.
This is the base thing that we need in order to work with public key cryptography,
in this case using elliptic curves.
And then with that, we can build these mathematical operations up into the verification and the generation of signatures.
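To make that layering concrete, here is a minimal sketch in Python (the project itself is described as TypeScript) that goes from modular big-number arithmetic, to elliptic curve point operations, to ECDSA signing and verification. Everything in it is an assumption for illustration: the curve is a tiny textbook curve, y^2 = x^3 + 2x + 2 over the integers modulo 17 with a group of order 19, not the curve a real security key uses, and the function names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters (insecure, illustration only):
# curve y^2 = x^3 + 2x + 2 over F_17, base point G = (5, 1) of prime order 19.
P_FIELD, A, B = 17, 2, 2
G, N = (5, 1), 19

def inv(v, m):
    """Modular inverse for a prime modulus m (Fermat's little theorem)."""
    return pow(v % m, m - 2, m)

def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None  # p2 is the negation of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * inv(2 * y1, P_FIELD)  # tangent slope
    else:
        lam = (y2 - y1) * inv(x2 - x1, P_FIELD)          # chord slope
    lam %= P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    return x3, (lam * (x1 - x3) - y1) % P_FIELD

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = None
    while k > 0:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def digest(msg):
    """Hash the message and reduce it modulo the group order."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def sign(d, msg):
    """ECDSA signature (r, s) on msg under private key d."""
    z = digest(msg)
    while True:
        k = 1 + secrets.randbelow(N - 1)       # fresh nonce in [1, N-1]
        r = mul(k, G)[0] % N
        s = inv(k, N) * (z + r * d) % N
        if r != 0 and s != 0:
            return r, s

def verify(q_pub, msg, sig):
    """Check an ECDSA signature against public key q_pub = d*G."""
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = inv(s, N)
    pt = add(mul(digest(msg) * w % N, G), mul(r * w % N, q_pub))
    return pt is not None and pt[0] % N == r
```

With a private key `d = 7` and public key `mul(7, G)`, `verify(mul(7, G), b"hello", sign(7, b"hello"))` returns `True`. On a group this small a forged signature can also be accepted by chance, which is one reason real deployments use curves with roughly 256-bit orders.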
So yeah, one alternative for building this proof of concept is the use of WebAssembly.
We could, for example, try to compile some high-level language, such as Go or Rust, to WebAssembly in order to construct this.
So basically, we rely on TypeScript.
We use big numbers. We construct elliptic curve implementations.
And then we ensure that these operations get computed faster.
Thibault, would you like to introduce us to our special guest?
Yes. So my cat is not a human, so he doesn't pass a CAPTCHA.
But he still likes to appear on TV. Definitely.
I do think, once again, Armando, maybe a question: how does all this code interact with the WebAuthn API in order to get the credential?
Yeah, basically, the WebAuthn API is a standard that is already backed by the W3C.
It provides an API and some functions in order to get an attestation from the device.
So basically, it's just calling a function that is accessed through the web browser.
I mean, if the web server is compliant... sorry, if the browser is compliant with this standard, and most browsers currently are.
So basically, the only thing you need to do is ask for this kind of attestation proof.
And then the device will work internally in order to produce this proof.
And of course, there are options for verification and for configuring settings, if you want to get more security or a different parametrization.
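As a rough illustration of "asking for an attestation", the sketch below builds, on the server side and in Python, the kind of options object that a page would hand to the browser's `navigator.credentials.create({ publicKey: ... })` call. The field names follow the WebAuthn credential creation options; the relying-party values, the placeholder user, and the function names are assumptions made up for this example and are not taken from Cloudflare's implementation. In the browser, `challenge` and `user.id` have to be converted from base64url back to byte buffers before the call.

```python
import base64
import json
import secrets

def b64url(data: bytes) -> str:
    """Unpadded base64url, a common way to carry bytes in JSON."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def attestation_request_options(rp_id: str, rp_name: str) -> dict:
    """Build credential creation options that request an attestation statement."""
    return {
        # Fresh random challenge so the signed response cannot be replayed.
        "challenge": b64url(secrets.token_bytes(32)),
        "rp": {"id": rp_id, "name": rp_name},
        # Hypothetical placeholder user; a challenge page has no account to bind to.
        "user": {
            "id": b64url(secrets.token_bytes(16)),
            "name": "anonymous",
            "displayName": "anonymous",
        },
        # -7 is the COSE identifier for ES256 (ECDSA with P-256 and SHA-256).
        "pubKeyCredParams": [{"type": "public-key", "alg": -7}],
        # "direct" asks the authenticator to include its attestation statement.
        "attestation": "direct",
        "timeout": 60000,
    }

if __name__ == "__main__":
    print(json.dumps(attestation_request_options("example.com", "Example"), indent=2))
```

The `attestation` and `timeout` entries are examples of the configurable settings mentioned above; other values, such as `"none"` for `attestation`, trade the attestation statement away for more privacy.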
All of this is based on that standard.
So this is fairly cutting-edge stuff.
And we're happy to announce that we'll be publishing some of the underlying cryptography that lets us make a more privacy-preserving form of WebAuthn attestations at SAC 2021.
And there are a lot of reasons why Cloudflare Research pursues publication and doesn't just make great products.
That's in part so that the rest of the cryptographic community looks at them and can build on them.
And we hope that browsers will adopt these sorts of things more deeply, so that we no longer have these annoying pop-ups warning you that you're clicking your key, because we'll have stronger privacy-preserving properties, not just for us, but for the whole web.
Yeah, speaking of which, this protocol uses zero-knowledge cryptography, so that you can prove that you hold certain information without revealing it.
So this is basically the very general concept of this.
And then, yeah, it would be great if you shared with us a little bit about the ZKP from the blog, Watson.
Sure. So when we started creating this project, we realized that we had a bit of an issue with the way WebAuthn was originally used.
WebAuthn imagines that you can register for a website and say, okay, I want to log in with this security token.
And it does so in a way that's anonymous. The two different websites don't know that you've used the same token.
However, there were companies, particularly in the financial sector, that were concerned about the security of the tokens.
They wanted a way to be able to say that this token was made by a manufacturer that meets certain security standards and that the token has been tested.
And so there's this thing called attestation. Attestation is what we use in the vanilla version that is live and is actually handling some challenges right now.
But attestation comes with a cost. The manufacturers essentially say this token is in a batch, and the batch has to be a certain size, tens of thousands or I think a hundred thousand devices.
And so this was a compromise between manufacturability, the impact of keys leaking, and the security of the browser.
We thought we could do better. And so what we did is we used something called a zero-knowledge proof.
A zero-knowledge proof is a protocol between two parties, a prover and a verifier.
The prover claims that they know something: there's a statement and a witness.
The statement could be something like "15 is not a prime", and the witness would be three times five.
Now, if I'm the prover, I might want to convince the verifier 15 is not prime, but I don't want to tell them the factors.
So how can I convince them? Well, in this case, one thing I can do is I can take square roots mod 15.
And it turns out that being able to take square roots mod 15 is equivalent to being able to find the factors of 15.
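The link between square roots and factors can be checked directly for 15. The Python sketch below (function names are made up for this example) lists all square roots of a number modulo 15 and shows that two roots of the same square that are not negatives of each other give away a factor through a greatest common divisor. This only illustrates the "square roots give factors" direction on a toy modulus; it is not the zero-knowledge protocol itself.

```python
from math import gcd

N = 15

def square_roots(a, n):
    """All y in [0, n) with y*y congruent to a modulo n (brute force, toy sizes only)."""
    return [y for y in range(n) if (y * y - a) % n == 0]

def factor_from_roots(x, y, n):
    """Given x^2 = y^2 (mod n), return a nontrivial factor of n, or None if y = +/-x."""
    assert (x * x - y * y) % n == 0, "x and y must square to the same value"
    d = gcd(x - y, n)
    return d if 1 < d < n else None

# 4 has four square roots modulo 15 because 15 has two prime factors.
roots = square_roots(4, N)
print(roots)                        # [2, 7, 8, 13]
print(factor_from_roots(2, 7, N))   # 5
print(factor_from_roots(2, 8, N))   # 3
print(factor_from_roots(2, 13, N))  # None, since 13 is just -2 modulo 15
```

So anyone who can produce a "different" square root on demand can factor the modulus: pick x, ask for a root of x squared, and with good probability the answer differs from both x and its negative.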
So if there's some way for me to convince the verifier I know how to take square roots, then I'm done.
And there is in fact such a way.
It's a little complicated to explain, but it has the property that after the verifier and the prover do this a couple of times, the verifier can't take the transcript and show it to someone else.
Because the verifier could also sit down and figure out the right... it's sort of a three-move thing.
The prover goes first and says something, the verifier says something back, and then the prover says the final thing.
But the verifier could also work backwards and figure out what the prover had to say in the first move to make the final thing work out.
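The "work backwards" idea can be sketched with a standard three-move protocol. The example below uses Schnorr's identification protocol for knowledge of a discrete logarithm, swapped in here because it is short; it is not the square-root protocol from the talk, and the group is a toy one chosen for readability (2 has order 11 modulo 23). An honest run needs the secret `x`; the simulator picks the challenge and the final answer first and then solves for the first message, without ever touching `x`. All names are assumptions for illustration.

```python
import secrets

# Toy group (insecure): G = 2 generates a subgroup of prime order Q = 11 modulo P = 23.
P, Q, G = 23, 11, 2

def inv(v):
    """Inverse modulo the prime P."""
    return pow(v, P - 2, P)

def honest_transcript(x):
    """Prover knows x with y = G^x. Moves: commitment t, challenge c, response s."""
    r = secrets.randbelow(Q)
    t = pow(G, r, P)              # move 1: prover commits
    c = secrets.randbelow(Q)      # move 2: verifier's random challenge
    s = (r + c * x) % Q           # move 3: prover responds using the secret
    return t, c, s

def check(y, t, c, s):
    """Verifier's test: G^s must equal t * y^c."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

def simulate(y):
    """Build an accepting transcript from the public key alone.

    The challenge and response are chosen first; the first message is then
    computed backwards as t = G^s * y^(-c), so the check holds by construction.
    """
    c = secrets.randbelow(Q)
    s = secrets.randbelow(Q)
    t = (pow(G, s, P) * inv(pow(y, c, P))) % P
    return t, c, s

x = 7                 # the prover's secret
y = pow(G, x, P)      # the public key
print(check(y, *honest_transcript(x)))  # True
print(check(y, *simulate(y)))           # True, produced without x
```

Because anyone can produce accepting transcripts this way, a recorded transcript convinces nobody other than the verifier who chose the challenge live. That is the sense in which the verifier cannot show it to someone else.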
And that's the kind of protocol we use. And there are other sorts of variations, like SNARKs, which are used in Zcash, which some people may have heard of.
When you're spending Zcash, what you're doing is you're creating a zero knowledge proof that you are the owner of this Zcash token and you're sending it to somebody.
And all the rest of the network sees is that you've done that and they don't see who you're sending it to or how much you're sending.
All of that is hidden behind the zero knowledge proof, but the zero knowledge proof means they can be assured that you're not making money out of nowhere.
So these are very useful primitives that have all kinds of applications.
So, okay. Sorry, just to map this to what we saw in the demo.
So basically what we are running is this zero-knowledge protocol between the browser and the server, right?
So who is who in this protocol?
Who is the prover, and who is the verifier in this case?
Yeah. So I think we saw the non-zero-knowledge version, but it's very similar.
The browser is the prover: it sends that proof to the server, which verifies it and then says, yeah, that checks out.
You have a token and it's a token we allow, but we don't know which one it is.
You can go on through.
Yeah. And so, if I understand correctly, we're hiding some kind of information from the server in this case,
and that is protected by the zero-knowledge proof.
So basically, what is this information and why do we need to protect it?
So the information that we're sending is protected.
It's a signature of a message. So the WebAuthn protocol says that when you do attestation, you do it by signing a certain message.
And the message has various components, some from the server, some from other places.
I don't really know the details, but our protocol works for any sort of message.
And what we need to hide is the key that was used to sign.
The reason we need to hide the key is that key would let us know what batch you're in.
We don't want to know more about who you are than is strictly necessary.
We'd much rather learn you have a token that's secure than you have a token that's secure and it was made by this giant factory.
And there's hundreds, there's thousands, like, we don't need to know that.
We don't want to know that.
And so that's what we're hiding. And we're also...
Yeah. So basically this is the spirit of privacy per se.
We don't want to get too much information, and we can use zero-knowledge proofs, zero-knowledge cryptography, to remove the part of the information that is not strictly necessary to run this kind of protocol, for example for solving CAPTCHAs.
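The idea of proving "my key is one of the allowed keys" without saying which one can be sketched with a classic one-out-of-many proof. The Python example below is a deliberately simplified stand-in: the prover shows knowledge of the secret behind one public key in a list, using simulated transcripts for all the other keys and a hash in place of the verifier's challenge. The protocol discussed in the talk is different and more involved, since it proves possession of an ECDSA signature made under a hidden key from the list. The group is a toy one and every name is an assumption for illustration.

```python
import hashlib
import secrets

# Toy group (insecure): G = 2 generates a subgroup of prime order Q = 11 modulo P = 23.
P, Q, G = 23, 11, 2

def inv(v):
    """Inverse modulo the prime P."""
    return pow(v, P - 2, P)

def challenge(keys, commitments):
    """Hash the key list and all commitments into one challenge modulo Q."""
    data = ",".join(str(v) for v in list(keys) + list(commitments)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(keys, j, x):
    """Prove knowledge of x with keys[j] = G^x, without revealing j."""
    n = len(keys)
    cs = [secrets.randbelow(Q) for _ in range(n)]
    ss = [secrets.randbelow(Q) for _ in range(n)]
    # Simulated first messages for every key (position j is overwritten below).
    ts = [(pow(G, ss[i], P) * inv(pow(keys[i], cs[i], P))) % P for i in range(n)]
    r = secrets.randbelow(Q)
    ts[j] = pow(G, r, P)                      # real commitment for our own key
    c = challenge(keys, ts)
    # The per-key challenges must add up to c, which fixes the one for position j.
    cs[j] = (c - sum(cs[i] for i in range(n) if i != j)) % Q
    ss[j] = (r + cs[j] * x) % Q               # real response, needs the secret x
    return cs, ss

def verify(keys, proof):
    """Recompute every commitment and check that the challenges sum correctly."""
    cs, ss = proof
    if not (len(cs) == len(ss) == len(keys)):
        return False
    ts = [(pow(G, s, P) * inv(pow(y, c, P))) % P for y, c, s in zip(keys, cs, ss)]
    return sum(cs) % Q == challenge(keys, ts)

allowed_secrets = [3, 7, 5]                      # one secret per allowed key (demo only)
allowed_keys = [pow(G, x, P) for x in allowed_secrets]
proof = prove(allowed_keys, 1, allowed_secrets[1])
print(verify(allowed_keys, proof))               # True, and the proof does not name index 1
```

The verifier learns that the prover holds one of the listed keys and nothing about which, mirroring the goal of learning "you have a token that's secure" without learning the batch.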
Exactly. And these techniques have many other applications that we're excited to explore.
There's been a lot of work over the past years expanding the repertoire of techniques, new tooling, all kinds of very exciting things.
Yes. So you mentioned that there are more applications, but if I want to know more about this, where can I go to get more resources, for example if I want to implement the protocol for my website? What resources are available?
Sure. So if you look at our GitHub page, github.com/cloudflare/zkp-ecdsa, you will see the source code of our ZKP demonstration.
Our ZKP demonstration is at zkp.cloudflarechallenge.com. It's a demo that anyone can do with a WebAuthn device.
Just a demo. It's showing off that we can do this.
And we have a paper that's going through the publication process that will be publicly available quite soon on ePrint and other places, too.
And there's also our blog, which goes into a considerable amount of detail about the underlying mathematics and what we had to do to make this protocol run in the time and space allotted, as well as a companion blog explaining what we're doing in terms of CAPTCHAs.
The blog also had an experiment where we asked people to try both the old way and the new way and see what their opinion was.
And we'll have more about that, hopefully coming to a human-computer interaction conference near you.
Yeah. So all of these are really great resources. So basically summarizing.
With this project, we're not only providing a mechanism for removing CAPTCHAs and an alternative way of passing challenges.
You also mentioned there is code, which is publicly available.
There is an academic publication that is also available.
There is a blog post that gives users a friendly introduction to the project.
So yeah, that sounds like a very big project.
And then you also mentioned that there are other applications. So yeah, it seems very simple, you know, like clicking two or three times on the security key.
It should be as easy as two or three clicks, but actually there's a lot of work behind that.
Right. So yeah, that's amazing.
Yeah. There's always a lot of work behind the scenes whenever you connect to a website or use Cloudflare, and we try to hide as much of that as possible. We work behind the scenes so that you don't have to.
So speaking about, you know, this web API to access hardware devices:
do these devices support this ZKP proof generation?
No, but that's not a problem, because our ZKP just works with the signatures that the hardware devices already know how to generate.
Adoption can happen entirely in the browser.
And we've been in communication with makers of hardware devices that would benefit from a slight variation of this. We've had some preliminary conversations about what this could look like.
And Thibault might know more about what the future is in standardization, but we're hoping that we get to a point where all uses of the WebAuthn API benefit from the additional privacy that these techniques can offer.
So Thibault, want to share something?
Yep. So as we built the challenge, and with the extension for the ZKP, we built it as an extension to WebAuthn.
And so that's something that the WebAuthn protocol supports.
And we definitely will be working towards adoption in the standard.
That's not something we've engaged with yet, even though the protocol has been designed in a way that we could integrate in the future.
There's definitely more work than "it just works on our side".
There would be more considerations about how it could be used in the ecosystem, et cetera.
And that's a decision we haven't made yet.
So you keep surprising me. So basically, if I want to implement this for my own server, I don't really need to wait for the process and I can use it right now, because it's just an extension of the current protocol.
But if in the future I want, let's say, to move this to a standard way of accessing all devices without revealing too much information,
then this is the way to go: pushing this as a standard proposal, a new attestation method, in order to get this natively in the devices and make it more privacy-preserving for the user.
Is that correct?
Yes. I mean, yes, though it does not necessarily need to be directly in the devices.
It could be only in the browsers.
Especially with the new considerations regarding the privacy that you get: you would no longer reveal the exact batch and manufacturer you're part of.
We would just know that your key is secure, and you would provide a proof that your key is part of a set, for instance the set of manufacturers authorized at a certain level by FIDO, or something like that.
So that means that's the same. That could be directly in browsers to improve, once again, accessibility,
whether it be for CAPTCHA or for other services that would require an attestation.
Yeah, I think I have a final question for you. So is this already deployed on Cloudflare?
The challenge is deployed and we have some users seeing it.
For the ZKPs, we could say it's deployed on Cloudflare, because that's how the demo is run, but it's not something we've integrated as part of our challenge platform.
Yeah, and if I want to provide some feedback to you, what are some links where I can go?
I think definitely, if you want to provide some feedback, the easiest entry point, as you mentioned before, would be the blog, because that's where you would find most of the information, with links to the GitHub repository and to future blog posts that may appear.
Also, the demo contains a feedback form where you can give more feedback, and where we also get some information about how it performed, like whether there were some errors or whether it was impressively fast or took a long time.
So definitely the blog post would be the main entry point on that.
And yeah, I think that wraps up our session.
So thanks everyone for attending and joining us. I want to, of course, thank Watson and Armando for their expertise and participation, but also all the people at Cloudflare who helped bring this challenge forward.
Thank you so much. Thank you so much for appearing. That's it for our segment.
I will be back soon with more things from Cloudflare research.