Changing Password Policies at Scale
Presented by: Junade Ali
Originally aired on June 9, 2020 @ 9:00 AM - 10:00 AM EDT
This talk discusses the state-of-the-art with password security, credential stuffing attacks and the development of Pwned Passwords (including how Cloudflare products like Workers helped scale the project and the anonymity approach used).
English
Security
Authentication
Transcript (Beta)
Hello. Good afternoon or good morning or good evening depending on where you are in the world at the moment.
Welcome to Cloudflare TV. My name is Junade. Today I'm going to be taking a bit of a step back and talking through a story about something I've worked on over the past few years, on and off, to do with Pwned Passwords and password security, and changing Internet practices around that.
A bit of background on me is at Cloudflare I work on the support operations engineering team.
It's a team I look after. It basically does everything from diagnostics in the dashboard before people submit support tickets, natural language processing style things for classifying support tickets, processing data which is useful for diagnostics, and things like allowing our support agents to have the tooling to run various commands at our network edge.
So this is a project I kind of worked on over the past few years.
Never anything too official but it has quite a large impact so it's often nice to talk about this one and it was definitely a fun project to work on.
So let's start off by talking a little bit about passwords.
Passwords are quite a dated mechanism, as you can imagine. They have a long history behind them.
The first thing a lot of people ask is why haven't they been replaced?
Surely there's something better we can do. What I've got on screen here is a paper, quite a few years old now, which compared different mechanisms for web authentication, and in the end it found that effectively no replacement matched the full set of benefits of passwords.
So for example even if we think about things like fingerprint technology, in those instances there is a proportion of the population who don't have fingerprints that can be used in authentication.
There are challenges around access to technology and things like that.
With that in the background, we have humans who aren't always the most security-conscious beings, and we have password dictionary attacks.
A few years ago we would see large databases of common passwords, the slide censors some of them out here, and effectively people would be able to run brute force attacks offline with them and things like that.
The industry tried to add password requirements to get around this. Let's make sure people have the right amount of entropy in their password.
Let's require them to have all these things in their passwords, alphanumeric characters etc.
Unfortunately, what the analysis has shown is that this actually hasn't helped reduce password reuse, especially around people putting personally identifiable information into their passwords and things like that.
So at this point you may be thinking okay so we've got a good strategy on the whole.
These brute force attacks can take a lot of time, especially online examples.
So what's the big deal here?
We've managed to have something which is potentially secure enough in most instances.
Someone's not going to know this PII, someone's not going to know these large brute force lists, or won't be able to run them all through a website before they get locked out.
Then enter credential stuffing. Credential stuffing is an attack whereby someone basically takes entire lists of breached usernames and passwords from large breach disclosures.
So for example you go to order a pizza and that website maybe you know it's the local shop, the website hasn't got the best security practices.
Either through something like the database being compromised or even where they haven't limited the brute force attacks that can be done with rate limiting or so on.
Someone is able to ascertain what those usernames and passwords are.
Once someone has ascertained them, they're able to inject them into more secure websites. In effect, this means that where passwords are reused, those accounts get compromised, and the attacker is able to work up the supply chain, or work up towards whatever the victim values in their security ecosystem.
So from a low-risk site like a pizza website to potentially a banking website or something which could have far more disastrous implications for that individual.
That's where we reach the problem of credential stuffing, and credential stuffing has become a more and more prominent problem across the Internet over the past few years.
So in effect there are three core things which are vital in order for us to be able to combat this problem.
The first is really around developer education around security practices.
We need to have developers aware of how to securely store passwords, how to use rate limiting to lock out accounts, how to do those types of practices.
There's also a user education problem which ties into this.
We've had the growth of things like two-factor authentication, where people can use apps on their phones to authenticate when they log in, which can reduce their personal risk.
And the third thing is really around how we change the practice of how password composition rules are done.
Historically, password composition rules, you know, we've seen the examples with complexity, were designed to mitigate these problems, but the fundamental issue was really around password reuse, and that was the thing we wanted to eliminate.
And this is really the problem which Pwned Passwords sought to address, both from a developer standpoint and a user education standpoint: making sure developers had the tools to be able to lock out these accounts, but also that users could educate themselves. This provides a fairly meaningful route for us to do this.
Blocking breached passwords, you know, this comes from the position, again, that composition rules are ineffective; instead we should block users from signing up with breached passwords.
The interesting thing is there's been research around this which shows with fear appeals users are actually most likely to change their behaviors.
Interestingly, before Pwned Passwords was generally available, the way researchers would evaluate this was by using typing patterns to try and study these types of things, but obviously with password managers there's an overall low accuracy rate on things like that, so they weren't perfect.
And then recent guidance from NIST over the past two years or so, the main thing it's done is put explicit recommendations in place for website owners to block passwords which are known to be breached, and there's also been increasing guidance from various organizations that puts greater responsibility, if you like, on website owners.
But this is easier said than done. Data dumps of millions to billions of passwords are available online, and the password lists are huge. At the start of this problem there were API solutions available for developers to check if a password had been breached; the very first version of Troy Hunt's Pwned Passwords service offered this. But developers faced a choice: either they would send the raw password, or an unsalted hash of it, which wasn't really the most secure solution. And that led to a situation where developers who wanted to do this would have to download large amounts of breach data whenever they spun up these services.
This obviously isn't practical, you know, downloading 20 gigabytes of data when you want to run an online service, and depending on how you shard that data around, you may need multiple copies and things like that. It really wasn't an ideal solution.
So, on February 21st, 2018, I worked with Troy Hunt to put out version two of Pwned Passwords, and this contained the functionality which mitigated this problem. The launch was quite successful: Troy put out a blog post, and I wrote a blog post as well which had some of the more technical information on how the protocol worked. There have been videos made, if any of you watch Computerphile, a YouTube show which explains computer science concepts, there are videos explaining how this protocol works, and the protocol got a lot of press attention over the various stages of it being out. So it was something that attracted quite a large amount of attention. Fundamentally, the problem was: how do we let someone find out if a password has been breached without them needing to share the password, or the entire unsalted hash of it?
In order to do this, the approach that's used is effectively based on hashes. There was an existing solution which people had originally thought about: this is a private set intersection problem. Private set intersection allows two parties to anonymously check if there's an overlap in their sets of information, but it has very, very high computational and communication overhead, more so than downloading the entire data set, so it wasn't really a practical solution in that sense.

The approach we landed on was to minimize the data loss. Someone shares a partial hash, which corresponds to multiple suffixes of full hashes: they just take the password, hash it, truncate the hash to a given size, look up that bucket of passwords, get a response back, and check whether the user's password has been breached.

There's an approach here which defines how that bucket size was decided; it was designed to ensure there was at least one password in the bucket. Some of you, if you're familiar with crypto, will have heard of the birthday paradox. The birthday paradox is really about hashing algorithms and whether there are collisions or not. On the left is the traditional interpretation, which is: what's the probability of there being at least one collision? You can take however many messages you want to put into a hash, pick a probability of collision, and get out how many bits you need to return. On the right-hand side is a slightly modified example, which I've written about elsewhere, which is instead the probability of there being collisions across the overall data set. I won't dwell on this too much, but if you want me to go into a bit more detail, there is an email link under this video where you can ask questions about this entire video; if I have time I should be able to go over them, and if not I'll see if I can get your details to provide more information.

In effect, this is how the password search works. You take an input, the input in this example is just the phrase "test", take a SHA-1 hash of it, truncate that hash, and then you make a request. That request will return a bunch of different hash suffixes, and if it contains the password you're looking for, it will indicate that it's breached. Here's an example of a curl request I'm making to that endpoint, and this is the structure: the suffix of the hash, then a colon, and then a count field. This structure basically allows someone to put in a given input and check if the hash is breached.

So you can see this is a different kind of example. Normally when we talk about password hashing algorithms, it's all about adding more entropy: we'll have a salt, we'll concatenate that with the password, we'll hash it, we'll keep hashing it, add more computational difficulty. This example, for a situation where we can't do that, is instead about adding obscurity. That's basically how the protocol works: it allows a user, with a degree of anonymity, to check if their password is in a data breach without disclosing the complete hash. Then, later on, after version two of Pwned Passwords, I worked with individuals inside Cloudflare and also some academics at Cornell to devise improved algorithms so we could further minimize the data loss.
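Before moving on to the later protocol work, here's a minimal client-side sketch of that basic range query. This is illustrative only, not code from the talk: it assumes the public api.pwnedpasswords.com/range endpoint and a runtime that provides fetch and the Web Crypto API (a browser, Cloudflare Workers, or recent Node.js).

```ts
// Minimal sketch of a k-anonymity range query against Pwned Passwords.
async function sha1Hex(input: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-1", new TextEncoder().encode(input));
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("")
    .toUpperCase();
}

// Returns the breach count for the password, or 0 if it isn't in the corpus.
async function pwnedCount(password: string): Promise<number> {
  const hash = await sha1Hex(password);
  const prefix = hash.slice(0, 5); // the only thing sent over the wire
  const suffix = hash.slice(5);    // the remaining 35 characters, checked locally

  const res = await fetch(`https://api.pwnedpasswords.com/range/${prefix}`);
  const body = await res.text();

  // Each line of the response is "SUFFIX:COUNT".
  for (const line of body.split("\n")) {
    const [candidate, count] = line.trim().split(":");
    if (candidate === suffix) return parseInt(count, 10);
  }
  return 0;
}

// Example usage: pwnedCount("test").then((n) => console.log(`seen ${n} times`));
```

The key property is that only the five-character prefix ever leaves the client; the comparison against the full suffix happens locally.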
This paper gives an empirical analysis of a bunch of different protocols. There were two novel protocols added in it: frequency-smoothing bucketization and identifier-based bucketization. The other protocol mentioned here is Google Password Checkup. Google Password Checkup adopted our work on k-anonymity and modified it a bit with some improvements, and what I'll just say on this is that it provided a really good framework to develop the knowledge in this area. Google, for example, added a few different things: they added a degree of private set intersection, and some other things which made it quite nice. There are two kinds of protocols here: ones which take username and password, and ones which are password only. So this really advanced the field of knowledge here.
The other thing I want to talk about here is performance. This data is quite old, usage has ramped up fairly dramatically since this point, it's now probably 50 times that much, but in effect we had very, very good performance on this. Troy put up this origin service to serve requests, but there was a very, very high cache hit ratio, which meant there's very minimal cost to running the service. In this example we were originally at 94%, and I'll talk a bit more about how we ramped that up even higher; today it is around 98% with additional optimizations. This screenshot again is a bit older, the date was a bit off, but the cache hit ratio is nevertheless pretty much the same.
The other advantage, because we were able to figure out this caching model, is that we're also able to deal with traffic spikes no problem: huge amounts of requests surging into the service, which would ordinarily mean HTTP 429 Too Many Requests, that isn't really a problem here.
And there are other examples where you can get a very high cache hit ratio; Troy himself wrote about how a simple website, Why No HTTPS, was able to get a 99% cache hit ratio. The critical thing to remember in the case of Pwned Passwords is that we're talking about 16 to the power of 5 possible queries which someone can make, and making sure the caching for those is optimal.
So the first thing in optimizing this: when a user makes a JavaScript request from one site to another, something in the front end for example, there is a header sent called the Origin header.
This type of header is used for CORS, basically, so a site can identify which sites are allowed to make cross-origin JavaScript requests.
This is in the cache key by default, so the first thing we initially did is alter the cache key: regardless of whether someone was making cross-origin JavaScript requests, we would basically remove that header from the key.
We tried this initially on a CDN offering which is on Cloudflare; this CDN service went from a 91% cache hit ratio up to 94%, and it continued to ramp up after this, after a planned cache purge and things like that.
And it eventually ramped this particular service up to 99-plus percent for JavaScript assets and things.
The other thing we started to do is use Cloudflare Workers.
Cloudflare Workers allows you to write bits of JavaScript that run at our network edge, and we originally started using it to offload various bits of logic.
And in this example, just a simple example, it's adding some browser headers for CORS, but we were able to build up on this.
We were able to build up to do some validation checking but also to unify the casing.
So if someone put in lowercase hexadecimal digits, you know, five f's in lower case, because that's the bucket they want to query, we would make sure they're in a unified case, and that would halve the amount of assets we needed to cache.
So that kind of optimization process drove us to a point where we were able to deal with very, very significant amounts of load on this service very easily.
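As a rough sketch of the kind of Worker logic described above, illustrative only and not the production Pwned Passwords Worker, the case unification and CORS handling might look something like this (types assume the Cloudflare Workers runtime):

```ts
// Illustrative Worker: unify the casing of the hash prefix so mixed-case
// requests collapse onto a single cache entry, and attach a CORS header.
addEventListener("fetch", (event: FetchEvent) => {
  event.respondWith(handle(event.request));
});

async function handle(request: Request): Promise<Response> {
  const url = new URL(request.url);

  // Uppercase the /range/<prefix> path segment so "fffff" and "FFFFF"
  // resolve to the same cached asset.
  url.pathname = url.pathname.replace(
    /^\/range\/([0-9a-fA-F]+)$/,
    (_match, prefix: string) => `/range/${prefix.toUpperCase()}`
  );

  const upstream = await fetch(new Request(url.toString(), request));

  // Allow cross-origin JavaScript callers.
  const headers = new Headers(upstream.headers);
  headers.set("Access-Control-Allow-Origin", "*");
  return new Response(upstream.body, { status: upstream.status, headers });
}
```

The Origin-header change to the cache key described earlier sits in Cloudflare's caching configuration rather than in this snippet.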
Next example I just want to draw towards is a service which has actually implemented this.
So this is a service called EVE Online, which is effectively an online gaming service.
So naturally they have to deal with lots of these types of attacks, lots of different credential stuffing attacks across the network and they need to protect people's accounts.
So this is an interesting example about how they protected this.
So they introduced this actually very, very quickly, 2nd of May 2019.
They actually use the Have I Been Pwned Pwned Passwords API, and they shared the data; they've blogged about this themselves.
They also shared with Troy and me a lot of the information on this. In this example what they were doing is when someone would sign up to their site on the back end, they would make a request to pwned passwords and they would check to see if this data was breached or not.
And effectively they measured a few different things: password changes, median two-factor authentication changes, and the data's been bucketized here into three different areas.
The time frame leading up to that change, the time frame immediately after and later on.
So you can kind of see there's a very dramatic increase in password changes, users enabling two-factor authentication that trails off a little bit over time.
And there's some metrics here, 30 days post rollout, 184% increase in password changes, 45% increase in two-factor enablement.
So this is after rollout and then 30 to 50 days we get some other metrics here.
So you can see in the user actions, where it says "hit security", that's when they've been given a fear appeal which tells them to change their credentials and things like that.
So this is an example of a more educational use case.
And in this use case, really what's happening is EVE Online wanted to educate users and tell them, hey, this is something you're at risk of with your account.
You should really pick a more secure password.
You should really potentially enable two-factor authentication.
The other form of example is where developers actually just outright block users from reusing breached passwords.
They can either do it based on a fixed usage threshold or something like that.
But what this shows is that even where we are just using general fear appeals, we have a fairly high rate of compliance on the whole: users understand the security risks and they want to improve things.
And the anecdotal evidence from the EVE Online blog continues to show that this has been effective.
When they first implemented about 90% of logins generated the message.
Today, this has dropped to about 11 or 12%, even with additional signups.
So there's been a continuous positive running benefit from them being able to do this.
The other thing is as well, this has been implemented in many, many sites in various different fashions.
What is often a great thing is if you have implemented Pwned Passwords at your company or for your side project, and you're measuring this, please do share the data if you can do so within your company's privacy policies and so on.
Because it really does help contribute to this area.
There are various other examples of sites where they've either used our API to do this, or in some cases just used the data sets outright.
Various different sites have simply downloaded the entire data sets and used these to offer this level of security.
Finally, I want to talk a bit about one of the newer innovations in Pwned Passwords, which just came out a few months ago.
If you are using Pwned Passwords and haven't already implemented this, it is often a useful tool.
So I was speaking last year, I believe it was in Sweden, PasswordsCon 2019.
I got in touch with Matt Weir.
Matt Weir had been doing a lot of security research into Pwned Passwords and the protocols, and had a ton of suggestions as to how we could improve things here.
He was basically curious as to what the impact of padding would be. What he'd actually been studying was: given the responses you get back from Pwned Passwords, even though they're encrypted under TLS, how possible is it to fingerprint which bucket someone is looking at?
There's blog posts about this, which goes into more detail with various different people.
But effectively, we wanted to have a solution of padding to mitigate any of these issues.
So we added this particular header option, where someone can send Add-Padding with a value of true, and the service will randomly generate a bunch of extra hash suffixes.
So if you make a curl request and just add that Add-Padding: true header to the request, you'll get back a bunch of random information mixed in.
So here's an example of what this data now looks like with padding.
You can see this random suffix of a hash with the count field set to zero.
So the first three there are legitimate entries from the bucket that's being searched, and the last three are effectively just padding.
And this is designed to make sure we have this kind of buffer on the wire as to what's going on.
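On the client side, a padded response needs no special handling beyond ignoring the padding entries, which carry a count of zero. A minimal sketch (illustrative, not code from the service):

```ts
// Scan a (possibly padded) range response for a suffix; entries with a count
// of 0 are padding and can simply be skipped.
function findSuffix(body: string, suffix: string): number {
  for (const line of body.split("\n")) {
    const [candidate, countField] = line.trim().split(":");
    const count = parseInt(countField, 10);
    if (!count) continue;              // count of 0 (or a blank line): padding
    if (candidate === suffix) return count;
  }
  return 0; // not found: the password is not in the breach corpus
}
```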
So the way this is kind of devised is, I'll just actually go through a few examples that demonstrate how this works.
So here there's a for loop. The for loop is counting to 10, adding padding, and it's counting the length of what we get back.
The length is devised in an interesting way.
I think we ensure there's a minimum baseline for the number of entries in a response, something like 800 I think, and then there are about 400 additional entries which are randomized, so about 800 to 1,200 in total.
So that makes sure from request to request, regardless of how something is cached, you get a different size of response back.
You kind of see this working here.
The same buckets, when we look at the download size, yields different results.
And the implementation for this is quite straightforward. Sorry, actually I meant 800 to 1,000 earlier, but we basically modified the Worker here.
So what the worker would do is the worker would determine how much to pad the response.
So it says, you know, how much do I need to get up to that base value of 800?
And then it would generate a number up to 200, in effect, which would determine how many extra responses to add.
This uses a cryptographic random number generator, which is handled in quite a cool way at Cloudflare.
So the way it ends up working is that the Cloudflare Worker, in effect, goes to the V8 engine's cryptographic random number generator, which pulls in random data from something like /dev/urandom, and it uses that to get the value.
Cloudflare, we've used lava lamps in the past to kind of seed that information.
So there's lava-lamp-seeded randomness there for the amount of padding.
On top of that, the hexadecimal characters which we generate for the padding entries are determined on a slightly different basis.
As that doesn't need to be cryptographically secure, it just uses a more basic pseudo-random algorithm, and it generates the additional padding needed for those 35 hexadecimal characters.
And yeah, that's basically how that works: the Worker adds the padding information.
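Here's a rough sketch of what that Worker-side padding step could look like, based on the figures above (a baseline of 800, up to roughly 200 extra entries, 35-character suffixes, and a count of 0 marking padding). The names and structure are illustrative, not the production implementation:

```ts
// Illustrative padding logic: top the response up to a randomized total of
// roughly 800-1,000 entries so the response size no longer identifies the bucket.
const HEX = "0123456789ABCDEF";

function padResponse(realLines: string[]): string[] {
  // Cryptographically random choice of how many entries to aim for in total:
  // the 800 baseline plus up to ~200 extra, via the runtime CSPRNG.
  const rand = new Uint32Array(1);
  crypto.getRandomValues(rand);
  const target = 800 + (rand[0] % 200);

  const padded = [...realLines];
  while (padded.length < target) {
    // The fake suffix itself doesn't need to be cryptographically secure,
    // so a plain pseudo-random generator is fine for these 35 hex characters.
    let suffix = "";
    for (let i = 0; i < 35; i++) {
      suffix += HEX[Math.floor(Math.random() * 16)];
    }
    padded.push(`${suffix}:0`); // a count of 0 marks the entry as padding
  }
  return padded;
}
```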
Right, so that's kind of the overview of what's going on here.
What I did want to do, as we've got a bit of exercise, I wanted to go back, and I wanted to highlight some of the additional bits which I spoke about in the initial round, but didn't have time to kind of drill it, or I wanted to save to drill into a bit later.
Again, if you have any questions to me, or anything as we go through this, please do give me a ping on the email address, so I can answer any questions which you may have.
So one of the common questions that comes up is really around the security protocol devised for Pwned Passwords itself, and how this query algorithm works.
So when someone is making a request to this service, they are making a request on the basis of a SHA1 hash, which is five hexadecimal characters long.
To understand that: each hexadecimal digit can represent zero to nine or A to F, so there are 16 possible values per character.
Each digit is represented by a nibble, which is like four bits, and each bucket is queried on the basis of five hexadecimal characters, which gives in total 16 to the power of 5 different buckets which you can search. You get a response back which contains the suffixes; the total length of a SHA-1 hash is 40 hexadecimal digits, and things like that.
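As a quick back-of-the-envelope check of those numbers:

$$16^5 = (2^4)^5 = 2^{20} = 1{,}048{,}576 \ \text{possible buckets}, \qquad 40 - 5 = 35 \ \text{hex characters} \ (140 \ \text{bits}) \ \text{returned per suffix}.$$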
So the interesting thing is, one of the things people often query is managing the data leakage, and how that's done.
So there is a small amount of data leakage in this instance, the first five characters, but hopefully that's not sufficient to reveal anything if someone's password hasn't been breached.
In the event someone's password has been breached, it really depends upon usage; something which is often discussed is whether usage patterns could be used to infer which password someone is using, or so on.
Hopefully in many cases we're able to mitigate this, given that in this instance someone has been warned not to use a breached password, or told not to, and their password is already at risk.
But in the new protocols we defined, we were able to eliminate a lot of these risks through a few of the new things in the new protocols.
Firstly, frequency-smoothing bucketization. This works on the basis that, in effect, the client minimizes the data leakage.
They will try and only query as much as they need, and keep drilling in, effectively, as they need more information.
So this is quite an interesting example as to where, you know, a little bit more control is needed, or given as to how the user discloses information.
The other algorithms, used by things like Google Password Checkup, use identifier-based bucketization; these work on the basis that the search is actually keyed on something which isn't really private at all.
So it's based on something like a username, and it returns the password data which is bucketed under that identifier.
So that's kind of how it's searching. That minimizes any security loss.
It means that someone does not need to leak anything about their password at all. This is an interesting area which has been looked into, and which we're considering as a potential direction for future versions.
But there's been so much great research going on in this area that this is a very productive area of discussion that's going on.
Caching, I'll drill a little bit more as well into some of the tricks used.
So what I've discussed already has been around how we minimize anything else which ends up in the cache key.
So things like the origin HTTP header from the browser.
The other trick that was used is that we actually use a feature called Cloudflare Argo.
One of the features Argo offers is tiered caching.
Tiered caching means that if something isn't found in a particular local data center, there is basically a hierarchy of data centers which can be queried to get that bit of information.
And that improves cache hit ratios: if someone is in Iceland and searching for a relatively unused bucket of passwords, it helps there, and it offers a performance enhancement as well.
The other thing I'll note, in terms of caching in general and also usage on the whole, is that hashes are fairly uniform in their distribution. You know, if you were to hash some random information, you should get a fairly uniform distribution across buckets.
There's something in cryptographic hash functions called the avalanche effect.
The avalanche effect hopefully means that a small change leads to a very different output.
So the interesting thing is unlike other caching examples on websites, there isn't one bucket which has significantly higher usage than others.
It kind of ends up being equalized out, which is one of the reasons why this is quite useful.
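A tiny illustration of that avalanche behaviour, not from the talk, using any Web Crypto runtime:

```ts
// Two inputs differing by a single character land in unrelated buckets.
async function sha1Hex(input: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-1", new TextEncoder().encode(input));
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("")
    .toUpperCase();
}

(async () => {
  for (const pw of ["password1", "password2"]) {
    const hash = await sha1Hex(pw);
    console.log(pw, "-> bucket", hash.slice(0, 5));
  }
})();
```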
And I kind of skipped over this bit a bit earlier, but I'll drill in a little bit more about the birthday paradox and interesting issues there.
So, on the initial interpretation: in order to figure out how much information someone would share, we initially worked on the basis that someone will query a bucket and that bucket will return back at least more than one record.
It turns out that when someone queries with five hexadecimal digits, given the data size, the response would be about 300 hashes each, and that would guarantee at least one collision.
The birthday paradox is something, if you kind of studied cryptography before, is something in theoretical terms you may have heard of.
What this effectively means is: if you had a bunch of pigeonholes, if you like, on a wall, or a bunch of buckets, and you started throwing balls at them, what is the probability that a bucket would eventually have more than one ball in it?
So that's the birthday paradox there.
And that's a useful kind of scheme for estimating the probability of there being a hash collision.
For data anonymization, the reason this approach was slightly tweaked is that what actually matters more is the overall rate of collision.
You know, you want to know: if I have a thousand messages hashed into however many buckets, what is the overall rate of collision? That's often more useful for anonymization than the probability of at least one collision.
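For reference, the standard textbook forms of the two quantities being contrasted here, with $k$ hashed messages falling into $N$ possible buckets (a sketch, not the exact slide content):

$$\Pr[\text{at least one collision}] \approx 1 - e^{-k(k-1)/(2N)}, \qquad \mathbb{E}[\text{number of collisions}] \approx \binom{k}{2}\frac{1}{N} = \frac{k(k-1)}{2N}.$$

The left-hand quantity is the classic birthday bound; the right-hand one, the expected overall number of collisions, is the version that matters more for the anonymity analysis.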
So that's kind of the example that's drilled into there.
And question, do you think physical security keys like Yubikeys could replace passwords?
So, physical security keys like YubiKeys: this is where we drive back into some of the original work I pointed out around why passwords haven't been replaced.
There are barriers to entry for the different things we might use instead.
So there are barriers to entry; I spoke earlier about fingerprint technology.
Fingerprint technology has a requirement that, for instance, a user has a workable fingerprint; they also need to have the technology, the physical device, able to do that.
In many instances those physical devices may not be accessible for a user either, which also means there are increased barriers to entry.
This paper here does provide quite a good useful analysis of quite a few different schemes.
On the whole, Yubikeys are generally very good security measures.
We also have things like two-factor authentication.
The challenge with replacing passwords is really about what usage is like across the board.
And that's where the interesting challenge comes about.
Alongside passwords, we have seen passwordless examples where, you know, someone will be emailed a link to sign into something.
But again, there are similar problems around that in terms of usability.
So unfortunately, my own view is that passwords are here to stay for the most part across the board.
And I think in general, that's the situation we'll be in certainly for the foreseeable future until something better comes out.
The other interesting thing I should mention with physical tokens or phones and two-factor authentication devices is that a big problem that hasn't actually been solved throughout the industry is not reuse, but loss of devices.
So if someone loses their phone, or loses their YubiKey or whatever else, what the recovery process is for that.
And in general, some people have devised things like backup token solutions or solutions like that.
But there are cases where someone is facing a near enough total account loss situation: they've lost the tokens, things like that.
That's a problem as well, which hasn't been universally solved across the industry.
There's been overall little work into that, which in general, hopefully over time, we'll be able to fix.
Hopefully that answers that question there.
The other thing I was going to dive into, so a bit of a deep dive into some of the caching things as well, and some of the API scheme things.
The other thing I would note is that if you build a system like this, be quite defensive about how the system is built.
So for example, one of the things which is considered in the Pwned Passwords Workers is: what if someone enters too many hexadecimal digits? That's a case where you want to reject the request instead of passing back the response, because in that case someone could unintentionally be leaking more data without realizing it.
So making the protocol kind of as rigid as possible in that direction as well is quite interesting.
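A minimal sketch of that kind of defensive check (illustrative; the real Worker's behaviour may differ in detail):

```ts
// Accept exactly five hexadecimal characters; anything longer or shorter is
// rejected rather than answered, so a client that accidentally sends more of
// the hash than intended gets an error instead of silently leaking extra data.
function isValidPrefix(prefix: string): boolean {
  return /^[0-9A-Fa-f]{5}$/.test(prefix);
}

// isValidPrefix("21BD1")  -> true
// isValidPrefix("21BD12") -> false (too much of the hash disclosed)
```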
Going back over the usage information, where we spoke about EVE Online.
There is some interesting information I've shared here about use cases.
I would definitely recommend anyone check out the EVE Online blog, where Stefan has graciously gone into a lot more detail about things on this front, how these attacks can be mitigated, and the effects of this solution.
Again, please do let me know if anyone has any specific use cases of this with any interesting data.
But overall, all the results we've seen in general suggest this has been a really positive measure for reducing credential stuffing attacks.
And on the whole, it's been great. And the padding thing as well, definitely let me know if there's anything anyone has of interest in terms of that.
Feel free to drop a question on that front. But getting closer to wrap up, I will just say thank you to a fair few people who've helped on this.
Troy Hunt, obviously, who runs the Have I Been Pwned service, he has done an enormous amount of work to basically, you know, build the solution out to help it get used and widely adopted.
Tim at Cornell, who developed a lot of the protocols, has been absolutely brilliant, working through a lot of the formal analysis challenges and things like that and developing the way this goes forward.
My colleagues at Cloudflare as well, who've helped in that process and with some of the analysis, have been great.
Stefan, who we spoke about with EVE Online, has done a great job of capturing some of the initial results, which have been great.
Matt Weir, who helped with some of the padding work, has been amazing as well.
And there's been so many other academic and product contributions to this, you know, Google Password Checkup is one example I spoke about a lot, the work they did with Stanford has been amazing.
There have been other examples: 1Password have integrated this into their Watchtower solution, and Okta have created a Pwned Passwords Chrome extension, which has been amazing as well.
So there's been so many different great implementations of this project.
And yeah, everyone who's contributed to it has done so amazingly.
And I will keep an eye out for any more questions whilst we're getting ready for the next session, which will be presented by Elisa and Peter: a Project Galileo spotlight on The Water Project.
So yeah, someone just messaged me asking if I could talk a little bit more about how we got attention around this project and how it came to fruition.
The idea around this project originally came a long while before it was implemented.
It was something I had in mind when Troy originally put out Pwned Passwords; I'd already known Troy and worked with him on various things.
So I got in contact with him, we worked out this approach, and we were able to get it out.
Definitely it being part of a big project like Have I Been Pwned has absolutely helped it develop a lot more attention.
Something else that came up was around the cost of running the service, and how it's worked out for people like Troy as they've been building out this solution.
Troy originally set the service up to use kind of a serverless solution on Azure with low cache hit ratios.
As we've ramped that up and got more and more caching, the amount of requests that go through to the origin has become quite negligible.
It's about 1 to 2%. Troy has written a good set of blog posts on that approach as well.
And I see the other message was really around where things are in terms of how it's being used nowadays; I'll check with Troy and see if he's got any more updated metrics on this.
But daily usage of this is now within the hundreds of millions of queries.
So it's definitely gone through a lot of different growth.
I think that's everything which has been raised with me for discussion.
So unless anyone has... All right.
Well, in that case, thanks to the folks who've either messaged me or emailed in various different questions on this front.
What I'll do is I will let you guys hear from Project Galileo next, an amazing project by Elisa and Peter, which we're really proud of at Cloudflare, where there's been some really phenomenal work on things like our public interest and trust work.
So I'll let you guys hear from that. And thanks so much for tuning in this early in the morning, wherever you are, whether it's morning, evening or afternoon.
And thanks so much for taking the time to listen. And I will give you guys a breathing room of about 10 minutes until the next talk.
Thanks. So you run a successful business through your e-commerce platform.
Sales are at an all-time high, costs are going down, and all your projection charts are moving up and to the right.
One morning, you wake up and log into your site analytics platform to check on current sales and see that nothing has sold recently.
You type in your URL only to find that it is unable to load.
Unfortunately, your popularity may have made you a target of a DDoS or distributed denial of service attack, a malicious attempt to disrupt the normal functioning of your service.
There are people out there with extensive computer knowledge whose intentions are to bypass Internet security.
They want nothing more than to disrupt the normal transactions of businesses like yours.
They do this by infecting computers and other electronic hardware with malicious software or malware.
Each infected device is called a bot.
Each one of these infected bots works together with other bots in order to create a disruptive network called a botnet.
Botnets are created for a lot of different reasons, but they all have the same objective, taking web resources like your website offline in order to deny your customers access.
Luckily, with Cloudflare, DDoS attacks can be mitigated and your site can stay online no matter the size, duration, and complexity of the attack.
When DDoS attacks are aimed at your Internet property, instead of your server becoming deluged with malicious traffic, Cloudflare stands in between you and any attack traffic like a buffer.
Instead of allowing the attack to overwhelm your website, we filter and distribute the attack traffic across our global network of data centers using our Anycast network.
No matter the size of the attack, Cloudflare Advanced DDoS Protection can guarantee that you stay up and run smoothly.
Want to learn about DDoS attacks in more detail?
Explore the Cloudflare Learning Center to learn more. Cloudflare Stream makes streaming high-quality video at scale easy and affordable.
A simple drag-and-drop interface allows you to easily upload your videos for streaming.
Cloudflare Stream will automatically decide on the best video encoding format for your video files to be streamed on any device or browser.
When you're ready to share your videos, click the link button and select copy.
A unique URL can now be shared or published in any web browser.
Your videos are delivered across Cloudflare's expansive global network and streamed to your viewers using the Stream player. Stream provides embed code for every video.
You can also customize the desired default playback behavior before embedding code to your page.
Once you've copied the embed code, simply add it to your page. The Stream player is now embedded in your page and your video is ready to be streamed.
That's it.
Cloudflare Stream makes video streaming easy and affordable. Check out the pricing section to get started.
and that people can rely on it.
It's like a small family or community here and I think elections around the nation is the same way.
We're not a big agency. We don't have thousands of employees. We have tens of employees; we have less than a hundred here in North Carolina.
So what's on my mind when I get up and go to work every morning is what's next?
What did we not think of and what are the bad actors thinking of?
The Athenian project, we use that to protect our voter information center site and allow it to be securely accessed by the citizens of Rhode Island.
It's extremely important to protect that and to be able to keep it available.
There are many bad actors out there that are trying to bring that down and others trying to penetrate our perimeter defenses from the Internet to access our voter registration and or tabulation data.
So it's very important to have an elections website that is safe, secure, and foremost accurate.
The Athenian Project, for anyone who is trying to run an election anywhere in the United States, is provided by us for free.
We think of it as a community service.
I stay optimistic by reminding myself there's a light at the end of the tunnel.
It's not a train. Having this protection gives us some peace of mind: we know that if for some reason we were to come under attack, we wouldn't have to scramble or worry about trying to keep our site up, because Cloudflare has our back.
A botnet is a network of devices that are infected by malicious software programs called bots.
A botnet is controlled by an attacker known as a bot herder. Botnets are made up of thousands or millions of infected devices.
These bots send spam, steal data, fraudulently click on ads, and engineer ransomware and DDoS attacks.
There are three primary ways to take down a botnet: by disabling its control centers, running antivirus software, or flashing firmware on individual devices.
Users can protect devices from becoming part of a botnet by creating secure passwords, periodically wiping and restoring systems, and establishing good ingress and egress filtering practices.
The real privilege of working at Mozilla is that we're a mission-driven organization.
What that means is that before we do things we ask what's good for the users as opposed to what's going to make the most money.
Mozilla's values are similar to Cloudflare's.
They care about enabling the web for everybody in a way that is secure, in a way that is private, and in a way that is trustworthy.
We've been collaborating on improving the protocols that help secure connections between browsers and websites.
Mozilla and Cloudflare collaborate on a wide range of technologies.
The first place we really collaborated was the new TLS 1.3 protocol, and then we followed it up with QUIC, DNS over HTTPS, and most recently the new Firefox Private Network.
DNS is core to the way that everything on the Internet works.
It's a very old protocol and it's also in plain text, meaning that it's not encrypted.
And this is something that a lot of people don't realize. You can be using SSL and connecting securely to websites, but your DNS traffic may still be unencrypted.
When Mozilla was looking for a partner for providing encrypted DNS, Cloudflare was a natural fit.
The idea was that Cloudflare would run the server piece of it, and Mozilla would run the client piece of it, and the consequence would be that we'd protect DNS traffic for anybody who used Firefox.
Cloudflare was a great partner with this because they were really willing early on to implement the protocol, stand up a trusted recursive resolver, and create this experience for users.
They were strong supporters of it. One of the great things about working with Cloudflare is their engineers are crazy fast.
So the time between we decide to do something and we write down the barest protocol sketch and they have it running in their infrastructure is a matter of days to weeks, not a matter of months to years.
There's a difference between standing up a service that one person can use or 10 people can use and a service that everybody on the Internet can use.
When we talk about bringing new protocols to the web, we're talking about bringing it not to millions, not to tens of millions, we're talking about hundreds of millions to billions of people.
Cloudflare has been an amazing partner in the privacy front.
They've been willing to be extremely transparent about the data that they are collecting and why they're using it, and they've also been willing to throw those logs away.
Really, users are getting two classes of benefits out of our partnership with Cloudflare.
The first is direct benefits. That is, we're offering services to the user that make them more secure and we're offering them via Cloudflare.
So that's like an immediate benefit these users are getting. The indirect benefit these users are getting is that we're developing the next generation of security and privacy technology and Cloudflare is helping us do it, and that will ultimately benefit every user, both Firefox users and every user of the Internet.
We're really excited to work with an organization like Mozilla that is aligned with the user's interests and in taking the Internet and moving it in a direction that is more private, more secure, and is aligned with what we think the Internet should be.