In Conversation with Troy Hunt
Presented by: Junade Ali, Troy Hunt
Originally aired on March 21, 2022 @ 12:30 PM - 1:30 PM EDT
Junade Ali is joined by Troy Hunt, a world-renowned internet security specialist, to discuss his career, Have I Been Pwned? and trends in cyber security.
English
Cybersecurity
Transcript (Beta)
Hello everyone, so while the sun is setting I guess here in London, I'm joined by Troy Hunt literally on the opposite side of the world, or close enough I guess.
I guess you're an hour or so away from sunrise. I don't think you really need any introduction to be honest, but I'll do my best.
Troy is probably one of the best known Internet security researchers around.
He's a Pluralsight author, Microsoft Regional Director and MVP, amongst many other things.
He runs Have I Been Pwned, weekly podcasts, etc.
So many things to list off. And yeah, today he's joining me to ask him some questions.
I'm incredibly grateful to you for being here.
So thanks so much for joining. Cheers, mate. It has actually just passed sunrise now.
So we're at 7am here on the coast of Australia. Excellent. Perfect.
I guess. Yeah, it's about lunchtime in the US and midnight in our east.
Right. So let's kick things off. The first thing I'd like to talk about is the start of your career, in terms of going independent.
And the story from being in a big tech company and really going from that to being in a more independent focused role.
So I was wondering if you could just walk us through that journey.
Yeah, so you know, the easiest way to go out on your own is when they get rid of you.
And actually, just for context, it wasn't a tech company. I was with Pfizer pharmaceuticals.
If you don't know what Pfizer is, you probably do know what Viagra is.
So Pfizer made Viagra as well as Lipitor, which was the world's largest selling drug at the time for cholesterol, Zoloft for depression, and just a gazillion other things.
And when I joined them in 2001, they were 150 years old at the time.
So like a really, really traditional incumbent company.
I just remember when I started there, this was an era where, even though I was a developer, it's like, you have to wear a tie and a shirt and pants, and Friday is casual day: no tie, but you've still got to have a shirt and pants and things.
So that was kind of the world of Pfizer. And over the years, it relaxed a little bit, but very sort of traditional incumbent corporate environment.
And I've written about all this before and done talks on it as well.
So it's all pretty public.
But I got to the point where I was just hating the job there. Because for me to progress in that career, I had to stop doing what I loved.
And I know many people face this, right?
Where it's like, you're, let's say you're a developer, and you're a good developer, and you like developing.
And then they say, for you to progress, you've got to stop that.
And you've got to be like a people manager. I don't like people.
Okay, it's not that I don't like people. But I don't like managing people. It's not my strength.
But you get stuck in this situation. And I just remember them saying, and the term they used was red line, they're like, we can't pay you any more money, we can't get you doing any more things unless you stop doing this and go into another role, which is just a terrible position to be in.
So a lot of the work that I started doing around infosec and blogging was largely scratching the itch, because I couldn't do it anymore at work, because I did go up through the career ladder.
So I'd do it independently. And then just fortuitously, things like the online courses with Pluralsight, I was able to do whilst I was still in the Pfizer job, and generate a revenue stream.
And then when they got to the point where they got rid of a bunch of people via redundancies in this part of the world, which was a very expensive part of the world, still is an expensive part of the world.
But it was looking after a very cheap part of the world being Asia Pacific, they went, all right, we're gonna get rid of a bunch of people.
Here is a wheelbarrow full of a couple of years worth of money that we would have paid you, which is what we have to do for redundancy.
You know, go free. I was like, Yep, sweet. Happy with that. Never looked back.
Excellent. Yeah. By the way, sorry if I'm continuously brushing my hair out of the way. I've been suffering from the lack of open hairdressers, but there we are.
We've got COVID hair, mate, look at that.
So in terms of lessons for other people in that position, going from being in a big tech company to kind of doing your own thing, whether it's going to Pluralsight, whether it's kind of setting up your blog and sharing things through that avenue and developing side projects.
What kind of lessons do you have for people who are kind of following in your footsteps in that sense?
It's a good question. I was just looking up the title of this blog post.
So I wrote a blog post, New Year's Eve, 31st of December, 2018, called 10 personal financial lessons for technology professionals.
And a lot of the thoughts are sort of detailed more there.
But there are sort of a number of things. If I go back to the start of my sort of blogging experience, the first blog post I ever wrote was why online identities are smart career moves.
And my hypothesis then was that by having some sort of an online presence where you can independently demonstrate your capabilities and your experience and the things that you've done, that better positions you for future opportunities, whether they be future opportunities working for another employer, and they're like, you know, what's your experience?
You know, you've sort of got this, this corpus of things, or whether it be that you want to be independent.
And the thing that really struck me when I wrote that is I'd be interviewing people for roles, because now I'm a people manager, because that's what my career had to do to progress.
I'd interview them for roles. And they'd send their CVs and, you wouldn't believe it, the CVs always say they're awesome.
It's like, here's all the awesome things that I've done.
And I'd be kind of like, well, of course you're going to say you're awesome.
It's your CV. And then they'd be like, I've got referees, go and speak to my references.
Well, you chose them, they're going to say you're awesome as well.
Like, how do I know you're awesome? And things like code tests in interviews are problematic in all sorts of ways.
Frankly, things like degrees and pieces of paper only go so far, like, I want to see what you've done.
And I just found it fascinating that I just couldn't find any information about a lot of these people.
So my theory was to start creating an online presence. Now, for me, that was a blog.
But I was also answering questions on Stack Overflow. And I started going to some user groups and later on putting some code on GitHub and things like that.
And I really love the fact that in this industry, anyone at any age and any competency and any position in life has the ability to go and start creating this online presence.
So I think that was a super, super valuable thing.
And then over time, and again, a lot of it is sort of in that blog post, being able to do things like have side gigs, as they're often called.
So Pluralsight for me was a side gig. And that was a side gig that, as I've said publicly before, by the time I left Pfizer, paid twice as much money as the Pfizer thing.
So you know, what am I going to do? Probably not go back to an office.
And again, that's one of the absolute joys of this industry, where the nature of it is that anyone can go out and do this stuff.
And certainly, for me, a lot of what I've written, and even a lot of the courses I've created before, it's not that I knew the thing really well, but it was an opportunity to learn it, and then relay that learning via those mediums.
So yeah, even if you don't understand the thing, you can still jump in and actually write content on it and learn as you go.
Yeah, definitely. So yeah, the ability to kind of learn being very important there.
So I guess on that sense, I think you mentioned to me a while ago that you had studied, I think, computer science at university.
And I was wondering where you thought, you know, it seems for the most part, a lot of computer science education is either building people to work at, you know, Java development factories, or to be academics.
So I was wondering where you thought potentially the education of developers could be improved going forward?
Or is it best for us to focus on more of the self learning direction?
Well, I guess for me, university was something to do while I was going surfing, you know, it was a way of filling in the time.
So my plan A was that I wanted to be a pilot because my dad was a pilot.
And then he talked me out of it, because this is sort of mid 90s era.
And you know, the airline industry wasn't quite what it used to be.
He's like, maybe you don't want to do this. And I was getting into a bunch of IT-related stuff.
So when the time came to make a decision, I was like, well, I'll just go to university and do computer science.
So I started uni in 95.
And that was also the year I started using the Internet. And I saw the Internet.
I was like, this is amazing. Like I want to build web pages. I'll just do a course on how to build web pages.
There's no courses. I couldn't do any courses.
You know what I did? It's right here. I've still got the book: I went and bought HTML for Dummies.
Like, this is literally a 25-year-old book, and it's still mostly relevant too, which is kind of interesting.
So I literally went and got the book on how to learn HTML. And I started self learning.
And look, I did do some programming-related courses at university. Okay, we did COBOL.
Well, you learn some programming constructs. Did database design, okay, some of that's useful.
But a lot of it wasn't particularly useful.
But it gave me, I guess, something to start focusing my direction on.
So whilst I was there, it became clearer what I wanted to do. And eventually, I dropped out and never finished the degree.
Now, that said, I think today I'd probably still do the same thing. I think things like university give you focus and direction.
But it's, to me, that's never the goal.
The goal is never to get a degree, the degree is there to get something else, which is gainful employment and opportunities in life and things like that.
And again, what I love about this industry is that anyone can go out there and self learn.
This is not like being a doctor. Like, mate, if I go to a doctor, I want them to have a wall full of degrees.
That's a different thing. But if it's someone writing software, I want competency.
And I want to see that they've been able to actually deliver things over a period of time.
And these days, you can go online and learn from Pluralsight or learn from blog posts and all sorts of other things out there.
So I think it's a very different time now. And people are judged much more on their capability.
And in many ways, this industry is like the ultimate meritocracy.
So if you are good at what you do, whatever your path was to get there, then you'll have opportunities.
Definitely, yeah, that makes sense.
Yeah, it's interesting, because I guess, not to be rude, but I probably look far older than I am.
But the path I followed later in the UK is quite interesting, because where you wanted to be a pilot, I originally wanted to be a lawyer, but I wouldn't have been in a good enough financial position to be able to do that.
So I went and did kind of an engineering apprenticeship, learning on the job, which was really, really good for that.
But then, I guess, when you're working in high reliability systems, and you need that piece of paper, that's when I at least got to shortcut the system and do a master's without having to go through the process of, you know, finishing school or a bachelor's.
So I guess, in a nice way, there is always a shortcut, even for the people who want to do things which are, you know, engineering, where, you know, safety and things matter.
But definitely, I think, on the development front, it's great to see how much of a meritocracy that is.
So I guess, moving forward a bit now to when you became independent and were doing your various projects, how did Have I Been Pwned come about as a service and a tool?
Well, Have I Been Pwned predates independence.
So I started Have I Been Pwned in December 2013. And I didn't leave Pfizer until April 2015.
So, yeah, Have I Been Pwned really came about for two reasons.
And as I've told the story in the past, I've always said these two reasons are sort of equally important.
And one of them was, obviously, creating a data breach service, because I'd been writing blog posts on data breaches, and doing analysis of things like, yeah, here's two different data breaches.
Hey, look at this, people who are in both use the same passwords, isn't that interesting? You know, actually getting empirical, quantifiable evidence of things like password reuse.
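As a rough illustration of the kind of cross-breach analysis described here, the following Python sketch counts how many accounts present in two breaches used the same password in both. The function name `password_reuse`, the email-to-password dictionary shape, and the sample data are all assumptions made for illustration; the conversation does not say how the original analysis was implemented.

```python
def password_reuse(breach_a, breach_b):
    """Compare two breaches given as {email: password} dicts (hypothetical shape)."""
    # Normalise email case so the same account matches across both data sets.
    a = {email.lower(): pw for email, pw in breach_a.items()}
    b = {email.lower(): pw for email, pw in breach_b.items()}
    # Accounts that appear in both breaches.
    shared = a.keys() & b.keys()
    # Of those, the accounts whose password is identical in both.
    reused = sorted(email for email in shared if a[email] == b[email])
    rate = len(reused) / len(shared) if shared else 0.0
    return {"shared_accounts": len(shared), "reused": reused, "reuse_rate": rate}


# Made-up sample data, not taken from any real breach.
site_one = {"alice@example.com": "hunter2", "bob@example.com": "letmein"}
site_two = {
    "Alice@example.com": "hunter2",
    "bob@example.com": "correct horse",
    "carol@example.com": "qwerty",
}
result = password_reuse(site_one, site_two)
print(result)
```

Reporting the rate over shared accounts only, rather than over all accounts, keeps the figure focused on people who could actually have reused a password.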
Siri's always listening. So I was doing these blog posts.
And then the Adobe data breach came along, and it was 150-something million records.
And both my Pfizer email address and my personal email address were in there.
And I was like, how does Adobe have my data? Oh, yeah, Macromedia, because I was a big Dreamweaver user back in the day.
And of course, Macromedia was bought by Adobe.
So my data flowed into somewhere else. And I was like, well, this is really interesting the way information flows.
I wonder if people know about this, you know, I can probably stand up a little website to do this.
So that was kind of like one part of it.
And then the other part of it was, I was really trying to drive Pfizer very much towards not just cloud, but platform as a service cloud.
So, because my career had to progress, I was no longer a developer. I'd been an architect for many years.
So I was making decisions about how we're going to do things like hosting of applications.
And I really wanted to push the Azure front and sort of move things to Microsoft's cloud.
But I didn't want to just like, pick up virtual machines and move virtual machines into the cloud.
I wanted to use new cloud paradigms, which was mostly PaaS at the time.
But I wasn't building stuff anymore. And I didn't want to be like one of these architecture astronauts where it's like, let's just draw UML diagrams all day.
So on my own time, I went, okay, well, I'm just going to go out and I'm going to build this idea of a data breach search service on Azure.
There's 150-something million records, like that's a pretty sizable thing.
So this is not just going to be hello world, it's going to be something sizable.
And that was honestly, like 50% of the goal, let's play with cloud.
And I did use that to then take things back into the organization and build a whole bunch of cloud related things within Pfizer.
And then of course, Have I Been Pwned just shot up as well.
Definitely, yeah. I guess, given the size of things as they stand, Have I Been Pwned is a pretty epic example of big data.
So the original Have I Been Pwned service, I guess, at that sort of moment, was literally email based: someone would go ahead and enter their email, and it would tell them what data breaches they were in.
And you were able to sign up to be notified if you were in certain breaches.
And that kind of concept has really taken off. Right?
I think you've got a lot of different kinds of organizational members, and even people in law enforcement and things, who are using that.
So I guess it must be interesting to see how you make those judgment calls when you get a data breach.
And also the ethical side to this, which, given you've developed your reputation on a very high level of trust, must be a very critical component.
Could you talk us through how you manage that aspect of the project?
Yeah, it's an interesting question.
And I think one of the most significant parts of the way it's run as it is today is that it started out as a pet project, and even today, I believe it remains community centric.
And because it wasn't a let's-figure-out-how-to-monetize-this thing, I got to make decisions that were always in the best interest of the data and the people and, frankly, the organizations impacted by these breaches as well, as opposed to if it had been, I'm going to set out and I'm going to build a service which is going to become really valuable.
And that's going to be my goal. So yeah, I've always been able to kind of prioritize what I thought was the right thing to do in general, as opposed to what was the highest and best use of the technology or the best way to monetize it or something like that.
So when we look at things like ethical decisions, a lot of these I haven't sort of sat down and planned it all out in advance, it's literally been just adapting to the environment over time.
So a good example of that is when the Ashley Madison data breach happened, when was that about 2015?
Before Ashley Madison, there was no concept of a sensitive data breach.
It's like, look, my ethical viewpoint was that if there is data, it's out there, and it's floating around the web, being able to search it really doesn't pose any more risk than what it did before, because it's out there anyway.
And of course, I was conscious that having a handy web interface that anyone could use does make it more readily accessible, but for the most part, it's still public data.
And then the Ashley Madison thing happened.
And when that happened in July 2015, whoever broke into the system sort of put out a message.
And they said, you know, here's the proof.
Okay, that checked out. If Ashley Madison doesn't shut down, we're going to dump all the data.
And there's like 35 million records or something, really, really sensitive stuff.
So I had lead time to think about what I'd do. And I ended up writing a blog post and said, here's how I'm going to handle the data.
And that was when I said, look, I don't want Have I Been Pwned to be the vector by which someone discovers that their partner is having an affair, which was always the concern.
And my worst nightmare at that time was, you know, someone kills himself because they've just discovered that their partner is, you know, on with someone else, or something crazy like that.
So that was when I went, All right, well, there's going to be a class of data, which people who are in there can search it via an email verification process, but the public can't.
And, and that turned out to be a good decision, because there's some pretty, pretty shady sites out there that allowed anyone to do this.
And I don't know if you remember, but at the time, there were things I remember one of the stories was someone said, Look, someone has posted up a list in our church of all the church members who are in Ashley Madison.
Oh, man, I'm glad they didn't get that from Have I Been Pwned.
That just feels so, so dirty. But you know, that was not something that I had the luxury of planning out from day one. It was, as the environment changed, what should I do?
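The sensitive-breach rule described here, where public searches never reveal sensitive breaches and only someone who proves control of the address can see them, might be sketched in Python as follows. The class `BreachIndex`, its method names, and the one-time token standing in for an emailed verification link are all hypothetical; this is an illustration of the idea, not the service's actual implementation.

```python
import secrets


class BreachIndex:
    """Toy in-memory index of breaches; every name here is illustrative."""

    def __init__(self):
        self._breaches = {}  # breach name -> (is_sensitive, set of emails)
        self._pending = {}   # one-time token -> email awaiting verification

    def add_breach(self, name, emails, sensitive=False):
        self._breaches[name] = (sensitive, {e.lower() for e in emails})

    def public_search(self, email):
        # Anyone may call this, so sensitive breaches are never included.
        email = email.lower()
        return sorted(
            name
            for name, (sensitive, emails) in self._breaches.items()
            if not sensitive and email in emails
        )

    def request_verification(self, email):
        # A real service would email this token to the address; returning it
        # here simply stands in for the owner clicking the emailed link.
        token = secrets.token_urlsafe(16)
        self._pending[token] = email.lower()
        return token

    def verified_search(self, token):
        # Tokens are single use; unknown or reused tokens are rejected.
        email = self._pending.pop(token, None)
        if email is None:
            raise PermissionError("unknown or already used token")
        return sorted(
            name
            for name, (_, emails) in self._breaches.items()
            if email in emails
        )


index = BreachIndex()
index.add_breach("ForumX", ["a@example.com"])
index.add_breach("SensitiveSite", ["a@example.com"], sensitive=True)
print(index.public_search("a@example.com"))
```

The design choice is that sensitivity is a property of the breach rather than of the individual record, so a single flag decides which search path may return it.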
The way the API has been implemented in terms of limiting the ability to abuse it and rate limit it and then having keys, like all of that has just been reacting over time.
And I don't know what else will change in the future.
But I do know that it will adapt in other ways, either to provide new features or to focus more on privacy, or there'll be other things that happen in the industry that cause it to change.
And I'll just deal with it when I get to it.
That's the view I have. And, you know, for the most part, I just enjoy running it as it is.
Definitely, yeah. I think it's interesting with the analogy earlier of, you know, a medic versus someone doing this. I guess the importance is far more on risk management.
For us, more than anything, it's preventing the eventualities and communicating the decision-making processes which lead us to those decisions. One of the things you've developed quite a solid reputation for as well is when vulnerabilities come out, or when there are certain controversial decisions in the industry.
And you're able to take very well-judged stances on these things and not lean to absolutist principles.
So I was wondering, how do you make decisions on those types of cases?
How do you, you know, weigh up the practical utility of, you know, of something versus, you know, security risk, whether it's theoretical or practical?
How do you go about thinking about that? It's funny, as you know, I went through a process last year of looking to see if there was another home for Have I Been Pwned, for someone to buy it.
And we went through this big M&A process.
And one of the things that came up several times, because these are big companies that go through this process, and they acquire other big companies, and they somehow expect me to run like a big company.
And there'd be questions along the lines of, you know, how do you and your board of advisors make decisions on which data breaches to publish or how to communicate?
I'm like, board?
Sometimes I get on my paddleboard, and I think about it, but that's the only board involved in this whole thing.
Like there is no board. It's just me.
But both the joy and the curse of that is, I get to be the sole decision maker for how I communicate these things.
So I don't have to defer to a panel and go around and around in circles like that.
But the curse is that it all sort of falls on me as well.
And if I screw this up, it's like, well, there's one person it falls back onto. If I misattribute a data breach, for example, or if I'm too direct in holding a company to account and they stick the lawyers on me, you know, that's all on me.
But I guess having done this over time, and we're at what 440 data breaches or something, how many data breaches are we at?
I don't know, like, I just load stuff, and then the number ticks over. 455 data breaches.
Doing this over and over and over again, I think I've sort of, I guess, developed a little bit of a sense of how most organisations will respond, how far I can push them in terms of, I won't say what can I get away with, but like, what is reasonable?
What is in the best interest of the community? And, you know, one of the things that I've actually found time and time again is I try and talk to organisations as much as possible.
And, you know, there was one yesterday where the penny has just dropped that they've had a breach.
And I was on Skype with the CTO there for an hour, just sort of talking through it.
And we're, you know, literally videoing like this.
And they see that I don't have a hoodie, which is a good start.
I take that off when I do this.
But you know, you can communicate and talk like a reasonable human being. And you can sort of say, look, this is the situation.
This is how I've arrived at the conclusion that you've had the breach, this is what has to happen next.
You know, can I give you some time to formulate your communication, you go and get some legal advice, all the rest of it.
But you're going to need to disclose this. You know, and if you don't do it of your own free volition, it will be done anyway.
So it's like, the end state is people will know about this, we can do it the easy way or the hard way.
But this is the position that we're going to get to. Yeah, I guess one of the lessons that develops over time is certainly, you know, the individuals or the companies who are more open, I guess, with their client base as to what went wrong,
as opposed to trying to conceal it, are frankly the ones who, you know, find it easier to rebuild trust after such an event.
It's totally right.
And one of the pennies that dropped for me, probably about three years ago now, is that I believe we judge companies much less harshly these days for having a data breach, but more harshly for how they handle it.
So in a case like this one yesterday, I think people will be very sympathetic
if the organisation communicates it well and transparently, and I think it will have very little impact on them.
But it's those organisations that either try and suppress the information or miscommunicate it or downplay it.
And particularly, man, PR companies, they're the worst as well, because PR companies try and spin it.
And I've been involved in a few cases of PR companies trying to put a spin on it, which is, yeah, everything from like blaming the messenger.
There was one early last year where they threw a lot of blame on myself and another researcher for reporting it.
So I actually wrote a blog post.
And this is another one for folks watching, if you want to look at it: Google Troy Hunt, stages of data breach grief.
And I wrote this after the AA over in your corner of the world was denying that there were credit cards exposed in a breach.
And it's like, well, I've got the credit cards, they look like credit cards.
I've asked Have I Been Pwned subscribers, and they went, yeah, that's my credit card.
So you're basically in this denial stage. So it's like the Kübler-Ross five stages of grief, where it's like, you're going to go through denial, and then you're going to go through anger and bargaining, and you're going to get to this end point. How hard do you want to make it?
And frankly, the harder you make it, the more interesting the story is going to be as well.
So that's up to you, I'll work either way. Definitely.
And I guess the internal organisational culture is important to that: not having, you know, a blame culture internally, people being more willing to be open about their mistakes, particularly developers knowing that if they do something wrong, they aren't going to be hounded for it. I guess, ultimately, that all rolls up into the way the organisation presents itself in the end.
So I guess, what's your feeling on that side? I guess it's about whether it's more about keeping your own house in order versus your outward appearance.
I think that there's definitely a cultural thing in many organisations that predisposes them to handling a data breach poorly.
And there's one that comes to mind. I won't say which company it is, because a friend of mine works there.
But I disclosed it through a friend of mine.
And it is a massive company. And after my disclosure, I think it took about 11 days or something for them to communicate it publicly.
And I kept chatting to him.
I was like, mate, what is going on? Like, why are you taking so long?
And he said, Look, basically, it's a room full of lawyers sitting around arguing about how to spin this.
And that is just such a selfish thing to do.
But it's also a very litigious society as well. I mean, particularly in the US, the number of class actions that are mounted against organisations that have a data breach, you know, like I see this all the time: there'll be a website, it's had a data breach, leaked some info.
And I get emails from lawyers wanting to mount class actions, because they want information about the breach, etc.
And I sort of look at it and go, what are you mounting a class action for?
Well, their data was exposed. And okay, that's, that's not good. I've been in many data breaches.
What harm actually came to the people? Oh, we don't know yet. It could be identity theft, class action.
I don't even know how you demonstrate damages.
But it just seems to be like, let's try and mount a class action. And maybe we'll try and settle out of court early and we'll get some money.
And it's just like shadiness on top of shadiness, isn't it?
And there's people like you and I sitting there going, look, we just want everyone to be friends.
We just want to like, you know, make sure that if bad things happen, there's disclosure and it gets cleaned up and things are secured.
But yeah, throw in lawyers and PR people and the thing just becomes a circus.
Yeah, for sure. Definitely.
Yeah, one of the things I'm genuinely quite proud of at Cloudflare is how we generally handle, you know, situations in a transparent way, both internally and externally.
I think that type of organizational culture definitely works to the business's advantage in the long run.
Yeah. And look, I mean, Cloudflare is a modern tech company as well.
And modern tech companies tend to be very good at handling this stuff.
So if I think about things like Disqus and Imgur, which are two of the ones that come to mind, I disclosed to both of them and within 24 hours, they had good communications out publicly, which is far in excess of what their sort of minimum requirements are.
They had the right messages, they'd drilled this stuff before, they understood what data breaches were, they weren't upset with me, they were grateful.
But then it's more sort of the larger incumbent organizations that work in a very traditional fashion that are more problematic.
And I guess that is just a cultural thing, right? Definitely.
Awesome. I think we should move on to Pwned Passwords, seeing as we're both here.
But obviously, you're far more well known than me. But I got to play a small part in the journey of Pwned Passwords, which is, I guess, the other side to Have I Been Pwned, where people can either download breached passwords, or they can, you know, anonymously search if a password has been breached.
And that's something which, right off the bat, I think, when Pwned Passwords version two went live, got quite a lot of traction. So many people got involved, you know, Stefan from EVE Online, of course, who provided some really good actual real-world data as to how this thing looks.
And, you know, it was adopted by Okta, you know, Google, Mozilla, 1Password.
And now, I think even with WWDC, you know, Apple are announcing they'll notify users if their password has been breached in Safari.
So it's definitely been an interesting journey.
So I was wondering if you could tell our audience a bit more about that project and what Pwned Passwords looks like, how it came about, and how it's iterated now to version six, is it, as of a few days ago?
Yeah, so firstly, you're massively underselling yourself, because you're a huge part of why it is as successful as it is now.
And we'll definitely get to that k-anonymity bit.
But look, I mean, this was sort of like we were talking about before. I didn't have master plans years in advance for Have I Been Pwned, it was just literally reacting to things as they happened, and stuff seemed like a good idea.
So NIST came out with a bunch of their authentication guidance a few years ago.
And there were many interesting things in it, because they were talking about things like: arbitrary composition rules are counterproductive, stop that; stop this sort of mandated rotation without evidence of compromise of credentials; this sort of stuff.
And one of the things they said is that you should block known bad passwords, because passwords that have appeared in data breaches before are far more likely to lead to account compromises.
And I thought, Oh, that's, that's really good advice, but they didn't give you the passwords.
And then I'm looking at Have I Been Pwned, and I went, Oh, I've got a lot of passwords, you know, maybe we can turn this into something useful.
And for the longest time I had had requests, and I still have many today to provide credential pairs.
Now, the legitimate usage of credential pairs is that organizations would like to better identify when either their staff or their customers are using the same passwords on their systems as they have in a data breach, because they want to force a reset of those passwords due to the risk of account takeover.
So very, very reasonable, good use case, but very high risk to have the pairs.
So I thought, I'll just extract out the passwords, disassociate them from email addresses, and I'll make them downloadable.
And then people will be able to take this list.
And they'll be able to say, hey, look, has this password appeared in a data breach before?
If yes, then do whatever they want, block the password altogether, say that the password is risky, maybe you should make a different decision, etc.
So that was version one, a couple of years ago. And after that, you came along and went, hey, I've got another idea about how to do this. Actually, there's one other thing first.
So I was happy just to give everyone the passwords.
But I was worried about the fact that a bunch of passwords have got personally identifiable information in them.
Because people literally use their email address as their password a bunch of times.
And with a bunch of the source data, hackers don't always format their data well, let's just say. So the source data would be like, all right, I've sort of extracted out the password column, but some of the delimiters aren't right.
And instead of just having the password, we've got the entire row.
And it's like someone's phone number and address and things.
And I didn't want to provide that in plain text. So I thought, I'll just SHA-1 everything. It's not that we want to store passwords as SHA-1 in a normal operational system.
But if I've SHA-1'd a whole row's worth of crap that's there because it hasn't been delimited properly, then that's not the sort of thing that people are going to come and crack, because you'd effectively have to have some massive, long, crazy string in your word list.
So SHA-1 was going to provide sufficient protection to hide the PII from the malformed data, or from the crazy ones where people are using their email address and phone number as their password.
But it's also super fast. So if an organization says, when someone provides their password at login, we'll just SHA-1 it and compare it, then we can meet the objective of seeing if the password has been used before.
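As an illustration of the "SHA-1 it and compare it" check described here, the following is a minimal Python sketch, not the actual Have I Been Pwned tooling. It assumes the downloaded list is a text file with one `HASH:COUNT` entry per line; the function names and the sample count are hypothetical.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Return the uppercase hex SHA-1 digest of a password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def load_breached_hashes(lines) -> set:
    """Build a set of hashes from lines assumed to look like 'HASH:COUNT'."""
    hashes = set()
    for line in lines:
        line = line.strip()
        if line:
            # Keep only the hash; the count after the colon is ignored here.
            hashes.add(line.split(":", 1)[0].upper())
    return hashes


def is_breached(password: str, breached: set) -> bool:
    """Hash the candidate password and look it up in the breached set."""
    return sha1_hex(password) in breached


# Tiny in-memory stand-in for the downloaded list (the count 123 is made up).
sample_lines = [sha1_hex("password") + ":123"]
breached = load_breached_hashes(sample_lines)
print(is_breached("password", breached))
print(is_breached("a long unique passphrase", breached))
```

A real deployment would more likely stream or index the file than hold it all in a set, since the full list is very large. What the site does on a match (block, warn, or force a reset) is the policy decision discussed above.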
So then you came along and you went, hey, I've got this k-anonymity idea.
So rather than people actually sending you a full hash of the password, which was the version one model, and which could easily be cracked because they are just SHA-1 hashes,
why don't we do this thing where you just send a small part of the hash, and then we come back with a whole bunch of suffixes that match the prefix.
And then whoever's calling the API can just see if their suffix is within there.
And that was a beautiful model because it's simple and everyone understands it.
And I remember at the time that we were having discussions about like private set intersections and things like this.
And there are some really clever cryptographic implementations out there.
But we wanted, proverbially speaking, every man and his dog to be able to use this thing.
You don't want someone to necessarily have to be a cryptographer or someone who's very, very advanced in their knowledge to be able to consume this.
And everyone can figure out how to make a SHA-1 hash, take that 40 byte string, take the first five bytes and send an HTTP request. Like, that's simple.
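The prefix and suffix flow described above can be sketched in a few lines of Python. This is an illustrative sketch rather than an official client. In practice the "40 byte string" is the 40-character hexadecimal form of the SHA-1 digest, and the first five characters are sent as the prefix. The range endpoint URL and the `SUFFIX:COUNT` response format follow the public Pwned Passwords API as I understand it. The function names, the `fetch` parameter, and the User-Agent value are assumptions made for this example.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str):
    """Return (prefix, suffix): the first 5 and remaining 35 hex characters."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_for_suffix(suffix: str, body: str) -> int:
    """Scan a range response ('SUFFIX:COUNT' per line) for our suffix."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def fetch_range(prefix: str) -> str:
    """Ask the service for all suffixes sharing this 5-character prefix."""
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-passwords-sketch"},  # hypothetical value
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def pwned_count(password: str, fetch=fetch_range) -> int:
    """Only the prefix leaves the machine; the suffix is matched locally."""
    prefix, suffix = split_hash(password)
    return count_for_suffix(suffix, fetch(prefix))
```

Calling `pwned_count("password")` would make a single HTTP request containing only the five-character prefix. Passing a different `fetch` function makes the logic easy to exercise without touching the network.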
So what I love about it is that the simplicity of the implementation got adoption.
And as you know, we just kept seeing it get hit more and more and more and more.
And then Cloudflare can cache so much content, and it can be cached on all of those edge nodes around the world.
I was just looking again now at the stats from yesterday.
Our cache hit ratio is 99 point something percent for the last 24 hours. Out of 24.6 million requests, like 24.5 million have been served from cache.
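To make the "99 point something percent" concrete, here is a small Python sketch of the arithmetic using the rounded figures quoted above. The inputs are the speaker's approximations, so the result is only indicative, and the function name is hypothetical.

```python
def cache_hit_ratio(total_requests: int, served_from_cache: int) -> float:
    """Fraction of requests answered at the edge without reaching the origin."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return served_from_cache / total_requests


# Rounded figures as quoted in the conversation.
total = 24_600_000
cached = 24_500_000

ratio = cache_hit_ratio(total, cached)
origin_requests = total - cached

print(f"cache hit ratio: {ratio:.2%}")
print(f"requests reaching the origin: {origin_requests:,}")
```

With these rounded inputs, roughly one request in a few hundred reaches the origin, which is why the origin cost stays small.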
So that meant that suddenly there's very, very little burden on the underlying origin servers on Azure.
So that's great. That saves me some money. And then it's just super fast for everyone consuming it because we've got all of this content cached on the edge.
So in implementations like the one from Stefan, who works for CCP Games, which does EVE Online, you log into EVE Online or register on EVE Online, and they're hitting this and checking to see if your password is known bad or not.
And it's just such a nice thing because it's, by design, not just easy to integrate but easy to access: no API keys, certainly no money or anything like that.
No identification. I've got no idea who uses this. I just know it's a lot of people. But yeah, you have done a fantastic job on it and I'm just so stoked to see it grow and grow.
Thank you. Yeah, I think it's been an interesting concept, I guess, with password attacks and the history of passwords: how they were made for something completely different, and then we had brute force attacks and wordlists, and then composition rules to try and prevent this problem, where you have to type in several numbers and whatever else. And we've now got to the point where we're warning users before they reuse a breached password, which is really close to the threat model we now see with credential stuffing attacks, where people have those pairs and are just able to inject them into sites over and over again.
So I think it's been a really fun thing to be able to be involved in that evolution, I guess, and it's something which has certainly got widespread adoption.
It's something which has really moved the industry forward, and hopefully people will be able to build on it. And I guess what's interesting with the k-anonymity model is that the way we often think about securing passwords is that we add entropy, we add a salt to a hash, and we keep on doing that to get computational difficulty. And this is more about doing the reverse.
It's more about making the search ambiguous. And I think on the whole it's been really exciting as well to see all the development off the back of this, like the team led by Thomas Ristenpart at Cornell, who I worked with.
They've been able to develop more sophisticated protocols for doing this.
We got to work with the Google Password Checkup team as well.
They were able to add some of the private set intersection parts on it. And the idea of hash-based anonymity has kind of exploded; it's been very relevant in the world we live in right now.
So it's been remarkable to see that development and to see that work get traction.
So I guess it was an interesting sweet spot between something people needed, you having the reputation to be able to deliver that, the data and all the rest, and, of course, the fundamental concept.
I guess that all works together in a really nice way. And throughout different projects you're very good at doing that: you're very good at delivering these high-impact projects, which are often based on ideas that people really need.
So it's really exciting to be part of the development of that, and certainly your judgment has helped in that process. And, of course, I'm eternally grateful for the thanks you give me for the role I played in that.
And I guess the performance aspect of this whole protocol is huge as well. It's very easy to cache and very efficient to deliver to people, and I guess that's really helped reduce the cost for you as well, right?
Yeah, I don't even know what the cost is, if I'm completely honest, because I've got, you know, sub one percent of requests actually hitting the origin.
So apparently the number of uncached requests in the last 24 hours was only a hundred and fifty-seven thousand.
So I've got a hundred and fifty-seven thousand requests hitting an Azure Function, which is their serverless model, and they charge based on the number of requests and then the amount of memory used over time. It's a small number of requests, and the amount of memory used over time is going to be minuscule, because it literally just picks up a file from blob storage and it's like, yep, here you go, and returns the thing.
Yeah, one thing I'd actually like to do: someone asked me about this the other day. They said, why are you even using Azure Functions?
I know we're turning this webinar into a design session now, but you could literally just take that Cloudflare Worker and go straight to blob storage, pick up that item from storage, and then all you're paying for is blob storage and egress data, and there's no more app tier on Functions, which I thought was a kind of cool idea.
We could probably do that in an afternoon, if we had an afternoon free, I think we could probably do that.
Yeah, some cool ideas there as well, I think. There's also frequency-size bucketization, all that fun stuff.
I guess Matt Weir, who I met at PasswordsCon, helped us get the padding functionality into responses, which was really cool.
I've got the lava lamp wall as my Zoom virtual background at the moment. The cool aspect there is that a user, or someone who's integrating it, can request that their responses be padded to add a further degree of anonymity over the wire. And it's kind of cool in the sense that, from the lava lamps in the Cloudflare office, we're able to feed entropy into the servers, and the lava lamps help power the randomness for that, which is another really cool area.
We probably should explain that actually, because there might be people listening here going, what is padding and why are we doing that?
The thing that Matt came up with is, he said, look, you've got HTTPS between the client and Cloudflare, and of course 99% of stuff is served from Cloudflare. For that last 1%, we've got HTTPS from Cloudflare to Azure.
So the hash being searched for, and again, it's just a hash prefix, it's just 5 out of 40 bytes of a SHA-1 hash,
which doesn't give you any degree of confidence as to what the underlying password is, is sent over a TLS connection.
So with any interception along the way, you can't actually see what's being searched for. But Matt sort of made the point that, even though you've got TLS, if you have a look at the traffic size and you profile every one of the 16 to the power of 5 different possible combinations, you can start to work out which prefix might've been searched for.
Therefore, someone who can intercept traffic might actually be able to figure out the password.
Now, when I first read it, I was like, man, there's a lot of dots to join there. The risk of that is super, super low. But the fix was really, really easy, because then you guys said, well, why don't we just start adding random bytes to every single response, such that the smallest response before padding and the largest response before padding both get so many random bytes added that you just can't tell what's what.
And this was really, really easy.
And I think you pretty much did it in an afternoon, just in the Cloudflare worker.
So, I really liked the solution because I didn't have to do anything.
I had to write a blog post. That's what I had to do. And literally on the edge, it was like, let's just start throwing randomness into the response. It doesn't actually impact anyone consuming it; they just toggle on a header and problem solved.
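The padding idea can be sketched from both sides in Python. This is a simulation under stated assumptions, not the Cloudflare Worker code. The conversation describes padding loosely as adding random bytes; the sketch models it as extra fake `SUFFIX:0` entries, which is how I understand the public API to document padded responses (opt-in via an `Add-Padding: true` request header, with padded entries carrying a count of 0). The 800 to 1,000 line target, the function names, and the User-Agent value are assumptions for illustration.

```python
import secrets
import urllib.request


def padded_request(prefix: str) -> urllib.request.Request:
    """Build a range request that opts in to padding via a request header."""
    return urllib.request.Request(
        "https://api.pwnedpasswords.com/range/" + prefix,
        headers={"Add-Padding": "true", "User-Agent": "pwned-passwords-sketch"},
    )


def pad_body(body: str, low: int = 800, high: int = 1000) -> str:
    """Simulate edge-side padding: append fake zero-count entries so every
    response lands at a random line count between low and high."""
    lines = [line for line in body.splitlines() if line.strip()]
    target = low + secrets.randbelow(high - low + 1)
    while len(lines) < target:
        fake_suffix = secrets.token_hex(18)[:35].upper()  # 35 random hex chars
        lines.append(fake_suffix + ":0")
    return "\r\n".join(lines)


def real_entries(body: str) -> dict:
    """Client side: keep real results and drop padding (entries with count 0)."""
    entries = {}
    for line in body.splitlines():
        suffix, _, count = line.strip().partition(":")
        if suffix and count.isdigit() and int(count) > 0:
            entries[suffix.upper()] = int(count)
    return entries
```

Because every padded response falls in the same size band, an observer measuring encrypted traffic learns far less about which prefix was requested, while a consumer that ignores zero-count entries sees the same results as before.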
Yeah, definitely. I guess Have I Been Pwned, from your dream of it being an experiment in cloud, has kind of become the pinnacle of cloud computing at this point.
I think that, yeah, it was definitely really cool functionality to add into the mix.
And yeah, I think we've both written blog posts on that.
There's one in the Cloudflare blog. I think you've got a blog post out about it as well.
So, yeah, it's really cool technology for people to take a look at for sure.
So, I guess, shifting gears slightly. Funnily enough, this is the first time I've ever actually interviewed someone.
So before I came on today, I was asking around: I'm talking to Troy Hunt, what should I ask him?
So Andrew on our support team was lovely enough to put together some questions he had for you, which I guess we could run through a few of.
So, he asked, in terms of making an app secure, what are the biggest challenges today's developers face?
So, I guess we've spoken about the credential stuffing component, but so many other security risks, I guess.
There's a couple of ways of answering that depending on what he means.
But in terms of what are the attacks, obviously credential stuffing is a big one at the moment.
And I'm really sympathetic actually to organizations that are the corporate victims of credential stuffing.
And we've seen so many big headlines lately of really, really big names.
I mean, Disney Plus was a big one. Netflix keeps getting hammered.
I see Spotify lists the whole time. Google Nest has been in the news.
Nintendo has been in the news. Like everyone's getting hammered by this.
And the bit that I'm sympathetic about is that we've got a situation here where a request comes into their service.
It's a legitimate email address, a legitimate password, but it's not a legitimate user.
And you're meant to block that somehow.
And even the FTC has said that they're going to bring cases against what they phrased as corporate victims of credential stuffing.
So, that's an enormously difficult situation.
And look, something like Pwned Passwords goes a long way to helping that.
It's a pretty simple, blunt instrument as well. And of course, there are new data breaches with new credentials that then get fed in.
And there's a real sort of imbalance here, where it's very easy for attackers to mount credential stuffing attacks, because they're just replaying HTTPS requests with a bucket load of credentials.
But it's very hard for even big organizations to block them without impacting usability.
So, for example, without throwing CAPTCHAs in front of people or without forcing people into 2FA or things like that.
So, that's a massive one. In terms of other attacks that I still consistently see that feed data into Have I Been Pwned, SQL injection is still really big.
It's not going away. Every now and then someone's like, is that still a problem?
It's like, yeah, it's still a massive problem. And I think probably the biggest one in recent years has been misconfigured databases.
So, publicly facing, it was a lot of S3, then it was a lot of Mongo, now it's a lot of Elasticsearch instances without passwords.
And I kind of joke about it. And I say, look, you know, cloud is amazing.
Like you've never been able to screw stuff up faster and more cheaply than with cloud.
And that is part of the problem because anyone can spin up cloud.
And anyone does. And then they miss these like fundamentally simple things.
But then I think the other way of answering the question, biggest challenge for developers is everyone has got a lot of pressure to deliver working software.
And the priority often tends to be on features and getting a minimum viable product out there.
A lot of these products come from small startups.
And there's not anywhere near as much pressure on the security side of things.
And I think part of that is that if you build a feature that works well, people see that and you get value from it immediately.
If you build a security control that works well, nothing happens.
You know, like, you know, it's worked because nothing happened.
So it's still that age old problem of being able to actually get time and focus and particularly education on security.
And it's normally something that organizations are reticent to spend on until something goes wrong, and then they'll spend anything to make it go away.
But you know, to some extent, it's too late then as well, right?
Yeah, and I guess it gets even more crazy in the world of IoT, which I think you've spent some time exploring with, I guess, the car, the one with the overflow.
Oh, yeah. Oh, man, I'm going so far down the rabbit hole with IoT now.
So one foot is in the how broken is all the IoT stuff?
And then the other foot is in how cool is all this? Oh, my god, it's hard to make all this stuff work well together.
I've got so much Home Assistant and sensors and Zigbee things and MQTT and all sorts of acronyms I didn't even know two months ago that are now filling my house.
And I'm sure a lot of them are just terrible as well.
So I've got to start poking away at that too. Did you hear a little while ago about a company called TrackingPoint?
So they made, basically, real-life aimbot sniper rifles.
And they were Internet connected.
I think there were some security researchers who basically managed to, when someone was lining up to take a shot, log into the gun and literally change where it would target.
There's so much to unpack here about all the things that are wrong with that.
But I feel that if I started doing that, I might alienate part of the audience watching.
Yeah, I think it's gone lunchtime in America.
I wasn't going to name them, but all right.
And that's as political as this will get.
So I guess on the flip side of that problem, the other question I was asked was really about the average Internet user and how they can protect themselves.
And I guess, with the partnership you've done between 1Password and Have I Been Pwned, it's a really nice user interface.
Like someone will be in a data breach and it will tell them, you know, here are the three steps: change your passwords, enable two-factor authentication, get a password manager.
It's made really straightforward for people.
And on the flip side, I think you've mentioned in the past that in some cases it's better to have a physical book for people to write passwords in: if you have relatives who can't handle a password manager, that's often the best option.
So I was wondering what the top real-life cybersecurity tips are that you've come across for people who aren't in the tech community, or how that should be articulated to other people?
Yeah. Okay. I think you sort of nailed it there in terms of password managers, because for most people, their greatest point of risk is weak and reused passwords.
There are other bits we'll get to in a moment as well.
But that's sort of the one security control that everyone interacts with in one way or another.
I mean, even my kids have passwords in one way or another.
It's like a pin for their luggage or, you know, my son's 10, he needs to use a laptop at school.
So he's, you know, constantly dealing with passwords. And we've just been talking about credential stuffing, which of course is predicated on the fact that people reuse credentials.
We know empirically that 90% plus of people do have the same one, two, three passwords they use everywhere.
So the biggest thing that they can do for themselves is having strong, unique passwords.
Now the problem is, and I always just laugh when I see this guidance, and I'll give you a story.
I went and did some training in a bank here last year. And I was talking to the CTO of this, you know, massive multi-billion dollar bank, and they've got all the posters up on the wall, like many organizations: always use strong passwords, make sure they're unique, so on and so forth.
And I sort of said to him, it's like, well, do you guys give everyone a password manager?
Like, does your standard operating environment have a 1Password for business kind of model or something?
And he's like, no. I said, well, how do you expect them to do this?
You know, honestly, how do you expect your staff to have all the passwords they use just in the day-to-day business?
Because it's not just Active Directory, there's all the cloud services they need to use.
And then there's all the legacy apps internally that have different credentials and things.
How do you expect them to do it?
And, you know, the simple answer is that they're using terrible passwords.
And I'm sure that if they took Pwned Passwords, took the NTLM hashes version and ran it against their AD instance, they'd find a whole bunch of really terrible ones.
So we've got to have some form of password manager. And then, yeah, obviously the ideal situation is a digital cryptographic solution, such as 1Password, but you mentioned the book as well.
And I think the book just leads to a really, really interesting discussion about the practicality of security.
So I've written a blog post. There's a blog post on everything.
There's a blog post called "Password managers don't have to be perfect, they just have to be better than not having one".
And in there, I talk about the book. And I continually get people sending me tweets, because they'll be at a newsagency or something, they'll see one of these books, and it's literally called "password book".
And they'll take a photo and they'll tweet it at me.
And they say, hey, Troy Hunt, I just saw this password manager.
It's not even encrypted. Ha ha ha, how stupid. And it's like, let's take a step back.
So we've got 90% plus of people using the same terrible passwords everywhere.
Now, the risk there is that any script kiddie in their bedroom anywhere in the world gets access to one of those passwords, and they get immediate, anonymous, remote access to a whole bunch of your accounts.
And it's game over, because this is literally the skeleton key to your life.
So the risk is extremely high.
If you had a book, and you put it in your drawer, and in that book you had a unique password, let's say it's a passphrase, and it's different for each website: write down the name of the website, write down the passphrase.
Okay, what's your risk now? Credential stuffing goes away, because there's no reuse. No one's going to be brute forcing it over an HTTP connection.
You know, you're not going to be sending billions of requests.
It's not like cracking hashes offline. So that problem goes away.
Your risk now is that someone gets your book. And then people sort of go, what if your partner at home takes your book and logs in?
It's like, honestly, if you're worried about your partner stealing your digital life, and you're sleeping with them, you probably have another discussion that you need to have first. You've got bigger problems.
Someone might break into your house. But if someone breaks in here, they're going to take that monitor and that computer and that iPad, not the book that's in my drawer.
And if they do take the book, I know that the book is gone.
You know, here's the problem at the moment. I don't know how many of my online credentials are gone.
But I do know if someone breaks into my house and steals something important.
So it's a much better proposition than credential reuse. It's just not as good as the digital solution.
And then of course, the other bits, which are sort of the normal bread and butter consumer infosec stuff, are like: keep your devices up to date and patched, take security updates when they're available.
Don't use Internet Explorer on Windows XP. This sort of thing. Awesome. All right.
I think we're into our last two minutes. So hopefully I haven't been such a disastrously bad interviewer that, you know, there'll be rage hitting on Twitter and your reputation will be destroyed.
Ask on Twitter, you'll get honest feedback.
So to finish things, I guess, on a more optimistic note: there are lots of people right now who aren't in the best frame of mind, things like that, across the world.
What are the things in our industry that make you optimistic for the future?
I'm optimistic that if you work in the infosec industry, you're going to have a very long, lucrative career ahead of you, because it's just getting worse.
And that's a bit of a double-edged sword answer.
But I guess on a slightly more serious note, I'm optimistic that there is nowhere near enough supply to meet the demand in this industry.
It is a fascinating industry. I mean, we spoke about stuff like IoT and cloud services and organizational responses and things.
I think it's just a really, really interesting area.
Thank you so much for taking the time.
Really appreciate it. And thank you to everyone who's watching on Cloudflare TV.
Thank you.