🔒 Security Week Product Discussion: Data Loss Prevention
Presented by: Patrick Donahue, Sam Rhea, Daniele Molteni, Misha Yalavarthy
Originally aired on November 2, 2023 @ 2:30 AM - 3:30 AM EDT
Join Cloudflare's Product Management team to learn more about the products announced today during Security Week.
Read the blog posts:
- Announcing Cloudflare’s Data Loss Prevention platform
- Using Cloudflare for Data Loss Prevention
- Protecting your APIs from abuse and data exfiltration
Tune in daily for more Security Week at Cloudflare!
Transcript (Beta)
Hello and welcome back to Cloudflare TV. This is security week. I'm your host today, Patrick Donahue.
I'm joined by some product managers and some folks from the security engineering team.
So I'd like to let everyone introduce themselves and we'll go from there.
So why don't we start with Misha, then Sam and Daniele. Hey, so I'm a security engineer on the detection and response team and I'm based out of San Francisco.
And if I had to summarize my main responsibility, it's to make sure that we're monitoring our infrastructure and making sure that we're protected against all types of different threats.
My name is Sam. I'm a director of product management here at Cloudflare, and for the products that my team works on, we think of Misha's team and the whole security group as our first and foremost customer.
We want to use Cloudflare's network to help organizations of all sizes stay safer.
Hi everyone.
I'm Daniele Molteni. I am based in our London office and I'm the product manager for Firewall Rules and API Shield.
So I'm looking after all the security tools to protect APIs and also contributing to building the firewall.
Terrific. Well, thanks everyone for joining me. Sam, big announcement this morning, data loss prevention or DLP as some folks abbreviate it.
We hear a lot about it.
Can you tell us what is the challenge that we're solving? How do we go about solving it?
The term DLP is pretty loaded, because when we talk to a lot of organizations, whether they're partners, customers, or others in the space, and ask folks to define what DLP is, if you ask 10 teams, you'll get 11 different answers in many cases.
But at its core, it's this concern that an organization has data, sensitive data, whether that's trade secret data, PII about customers, financial information.
And with the ubiquity of SaaS applications, and with everyone like us working remotely or in a hybrid model, that data, instead of all being in literally a physical place where you had eyes and ears on it, now lives everywhere, including environments that you don't control.
And so at its core, it's this problem of all this data that's critical to my business.
And if it were ever lost, it would be a huge problem, a tier zero incident for my organization.
How do I make sure it stays safe and only lives in the places where it's supposed to live?
And only the people who are supposed to reach it can reach it.
So we're just scanning data or what are we doing? No. And that's what's fun about what we've built within Cloudflare.
And that's something that our network and this approach we've taken powers.
Because our opinion is that DLP is not scanning for data alone.
And for those in the audience who aren't familiar with that, a lot of more traditional DLP tools will look at data in transit.
So I'm making a connection out to Google Drive, and the tool will scan, for example, the file, maybe it's a CSV file that I'm uploading, and check: hey, is there a specific customer's Social Security number, or is there data that looks like a credit card number?
Is that in the file? If so, block it. And that's an important part of DLP, but it's not all of DLP.
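The in-transit scan described here can be sketched roughly as follows. This is a minimal illustration, not how any Cloudflare product is implemented: the two regular expressions, the Luhn checksum filter, and the function names `scan_csv` and `should_block` are all assumptions made for the example.

```python
import csv
import io
import re

# Assumed patterns for illustration only; real DLP profiles are more nuanced.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def scan_csv(text: str) -> list[tuple[int, str]]:
    """Return (row_number, finding_type) pairs for sensitive-looking cells."""
    findings = []
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        for cell in row:
            if SSN_PATTERN.search(cell):
                findings.append((row_number, "ssn"))
            for match in CARD_PATTERN.finditer(cell):
                # The checksum filters out most digit runs that are not cards.
                if luhn_valid(match.group()):
                    findings.append((row_number, "card"))
    return findings


def should_block(text: str) -> bool:
    """Block the upload if anything sensitive-looking was found."""
    return bool(scan_csv(text))
```

A pattern match alone produces many false positives, which is why the sketch adds a checksum before flagging a card number.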
We think DLP really begins with permission control, because at the end of the day, a lot of data is going to live in a lot of places for any organization.
If the four of us on this call started a company tomorrow, we'd probably use 20 to 30 different SaaS tools by the end of the week just to manage our business.
And so trying to control on a very granular level by scanning just assumes that you've already lost control of role-based permissions and who's able to do what in these different applications.
Some applications have really powerful role-based access controls, or RBAC, to use the initialism, but a lot of applications don't, whether it's a new SaaS application or something where maybe you have to pay extra to have the RBAC controls.
And so we've talked to a lot of customers for whom this is the beginning of their DLP problems, not necessarily just scanning for data, but how do I, in a single control plane, think about who's able to reach what and who's able to do what, who can view that customer record in your CRM or who's able to download a file from this Jira project, for example.
And by starting there, by using our Zero Trust products in our network, we begin by saying, hey, these are the rules about who's able to manipulate or interact with this data.
We then move on to more of the traditional scanning type features as well, some of which Daniele and team are implementing very soon around the corner with our reverse proxy, and others are more forward-looking, thinking about data in all directions.
Yeah. And I think if we do start a company, I think getting the food delivery software as a service up and running would be my priority.
But once we tackle that, maybe we can get our CRM running and things like that.
Oh, good. You saw my pitch deck. And we don't have it actually, but share it with me.
No, I think the Salesforces and the Workdays, right?
The apps that have been around for a long time, presumably have a lot of this stuff built into them, right?
So they've gotten, as you've been a product manager here and product director here, you field a lot of requests from customers.
And over time, you build a lot of that stuff in, especially as you get larger and larger customers you're serving.
And Salesforce probably went through this a little bit earlier than we did, late 90s, early 2000s, and bolted this stuff on.
But if you're starting a SaaS application today, it's probably not the first thing you build, right?
And so I think that the thing that I'm pretty excited about is the ability to bolt on those role-based access controls to SaaS apps, right?
So you don't have to do the least common denominator. You don't have to figure out, okay, well, Salesforce does it this way, and Workday does it that way, and some other application does it a third way.
I think there's a lot of parallels there to how we think about our web application firewall, for example, right?
Where you can establish a security posture, and a security administrator, and somebody on Misha's team, or her team broadly defined, could go in and review that and say, here's all the controls across all of my origin on-premise systems, not just one particular set of security tools.
So that's pretty cool. So is there anything that you do before the role-based access control?
I think in your blog post, you walk through some...
I like how you did it. I'm going to copy it for a future blog post online, but walking through the layered approach to what you should be doing.
What else besides the role-based access control? Well, the first comes...
I'm going to quote the leader of Misha's team, Joe Sullivan, the Cloudflare CSO, who describes it as high visibility, low effort.
Because the first step in thinking about, do I have control over my data, is being able to answer that question.
And that is incredibly difficult right now. If you think about the MacBook, I'm here in our Lisbon office, I'm in my Lisbon apartment.
If you think about the MacBook sitting on my desk, I'm not in a Cloudflare network, or an office network of any sort.
And if I'm going to use a tool later today, something let's say Jira or Salesforce, I'm going to connect through my home Internet connection to Salesforce, which is not an application that Cloudflare hosts ourselves.
It's a SaaS application and we use it.
And in theory, as an organization, the default is to have no visibility into what just happened.
Because I'm not using a network where it's logging that I made these requests to this customer record.
Maybe I'm using a corporate device, but for some organizations and some enterprises, even installing an MDM tool that would look at all the goings-on on the device is prohibitively cumbersome or expensive.
And so they just ignore it altogether.
I'm going to a SaaS application, which maybe if you want the logs about a user session, you have to pay extra or open an incident ticket.
It's really difficult to do, to have high visibility.
It takes high effort right now. So with what we've built out using a product that we call Cloudflare Gateway, an organization can deploy a very lightweight agent on any device.
And that agent both does DNS and HTTP filtering, but it also maybe even more powerfully does DNS and HTTP logging.
So that if you're ever operating in a model where you're assuming breach, you're assuming that someone has already compromised your organization, and you're looking for signals that would suggest that something's gone terribly wrong, the comprehensive logs from both DNS queries and HTTP requests allow you to play back a user session.
So maybe someone's laptop was stolen, right?
And maybe they were in a coffee shop, and it was stolen while open.
And we want to know for the next hour, what did that open laptop reach?
And with these tools, you're able to do that, able to play back that user session.
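As a rough sketch of what playing back a session from logs might look like, the snippet below filters a list of log events down to one device and the hour after a reported theft. The `LogEvent` fields and the `replay_session` helper are hypothetical; they do not reflect the actual Gateway log schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEvent:
    # Hypothetical log record; field names are assumptions for illustration.
    timestamp: datetime
    device_id: str
    kind: str          # "dns" or "http"
    destination: str


def replay_session(events, device_id, start, window=timedelta(hours=1)):
    """Return one device's events in [start, start + window), oldest first."""
    end = start + window
    matching = [
        event for event in events
        if event.device_id == device_id and start <= event.timestamp < end
    ]
    return sorted(matching, key=lambda event: event.timestamp)
```

Calling `replay_session(events, "laptop-42", stolen_at)` would then give the ordered list of DNS queries and HTTP requests that device made in the following hour.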
And maybe with that stolen laptop, the attacker's smart enough to know to turn off a gateway solution.
If turning it off is an option, that is; you can configure it so that no one can turn it off.
But we can also block users from connecting to these SaaS applications if they're not using this gateway product.
So that you can have complete confidence that any event, any interaction, any request to the SaaS application was logged by gateway.
So that's the first step that we recommend is way before the kind of more comprehensive or complex scanning, just have visibility into what's happening in your organization.
And so that's interesting, right?
These applications, Salesforce or Workday or Zendesk or whatever the application would be, you might not normally think you'd have visibility on.
Again, you can do it from a DNS lookup perspective. I can see you're going through it.
How are we actually, for the technical audience at home, forcing that connection through?
Can you dive a little bit deeper? Yeah. It starts with the login.
So we have a product called Cloudflare Access, which is a zero trust solution that organizations deploy, in many cases to replace their VPN, but increasingly just to have very granular identity-based controls over who can do what with the resources in their organization.
But one of the most powerful things about Cloudflare Access is that it becomes this identity engine.
It becomes this aggregator of maybe you use Okta and a company that you've just acquired uses Azure AD, or maybe you have contractors who are logging in with GitHub.
Access takes in all of your identity provider logins.
It consumes the device posture signals about your device.
Maybe you want to check for countries or multi-factor method.
It brings all these in and aggregates them into kind of a single representation of the user in the form of a JSON web token in many cases.
But to insert ourselves into the flow of a SaaS application, the SaaS application thinks that Access is the identity provider.
So when you go log in to something like Workday and it redirects you to your SSO, that first hop is through Cloudflare Access. Now, we're not an identity provider.
We then send you on your way to Okta or Azure AD or GitHub while also collecting those additional signals like country and multi-factor method.
And then we bring you back and we send the SAML assertions back to the SaaS application if you're allowed to go ahead and proceed.
So in doing that, we don't need organizations or customers to migrate their identity provider.
They can bring the existing one that they have, but it does allow us to insert Access and insert the Cloudflare network into the flow of a user attempting to reach that application because Access can add additional rules that say the user reaching this application must be using the gateway product.
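The decision step in that flow, where the aggregated signals are checked before the SAML assertion is sent back to the SaaS application, could be sketched like this. The `LoginSignals` and `AppPolicy` types, their field names, and the `evaluate` function are hypothetical and are not the Cloudflare Access policy model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginSignals:
    # Hypothetical aggregate of what was collected during the login.
    email: str
    identity_provider: str   # e.g. "okta", "azure_ad", "github"
    country: str             # ISO country code of the request
    mfa_method: str          # e.g. "hard_key", "totp", "sms"
    using_gateway: bool      # whether traffic came through the gateway agent


@dataclass(frozen=True)
class AppPolicy:
    # Hypothetical per-application rule set.
    allowed_providers: frozenset
    allowed_countries: frozenset
    required_mfa: frozenset
    require_gateway: bool = True


def evaluate(policy: AppPolicy, signals: LoginSignals) -> tuple[bool, list[str]]:
    """Return (allowed, reasons); the assertion is only issued when allowed."""
    reasons = []
    if signals.identity_provider not in policy.allowed_providers:
        reasons.append("identity provider not allowed")
    if signals.country not in policy.allowed_countries:
        reasons.append("country not allowed")
    if signals.mfa_method not in policy.required_mfa:
        reasons.append("MFA method not accepted")
    if policy.require_gateway and not signals.using_gateway:
        reasons.append("gateway agent not in use")
    return (not reasons, reasons)
```

Returning the list of reasons alongside the decision keeps the sketch useful for logging why a login was refused.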
That makes sense. And I think from an adoption perspective, I know Okta is a partner of ours and a lot of people have deeply invested in Okta and getting up and running there.
And so that lightweight tie-in is probably a bit easier of a lift or a migration path than having to rip and replace them.
Misha, I actually talked to Joe a few weeks back about how we can better protect not just our external applications, but some of our internal administrative panels.
So there's oftentimes where our support team would need to go and a customer would request something and take some action on it.
I think you had written in your blog post, one of the hacks that you discussed was Twitter, right?
And so can you tell us a little bit about how some of these tools that Sam has been describing, and multi-factor and hard tokens, would have helped secure against some of those breaches there at Twitter?
Yeah. And for anyone who's not familiar with the Twitter hack, what basically happened was, and also social media companies getting targeted is not anything new, right?
There are well-documented examples, and a single compromised account could have pretty severe consequences for companies like that.
So the hackers took over the Twitter accounts of a few prominent customers, including politicians and celebrities, Barack Obama, Elon Musk, and Jeff Bezos, I think.
So it was suspected that they used social engineering to get access to a user's account that had access to an internal administrative tool that was able to make changes on customers' accounts.
And there's a few controls here that I think could have prevented an attacker from making changes.
So the attackers were able to post on the customer's behalf and also make changes to their settings, right?
Like their email or their 2FA method, they were able to export direct messages from these accounts too.
So if you had a customer make an approval when a change is made on their account, when it's coming from this internal tool, as opposed to someone on their mobile phone making the change, that would require an action or an approval from the customer, which probably would have stopped that from happening.
You could also implement 2FA and not all 2FA is made equal.
So hard keys, and I know Evan loves saying that, I stole that phrase from him, but hard keys are definitely one of the more secure ways of implementing 2FA and probably would have stopped the attackers right in their tracks.
You referred to role-based access control, right?
So multi-factor authentication would have been a must here.
And regardless of the type of role-based separation that you have, to be able to access certain customers, it would be good if you had like a privileged list of users that were able to make these changes in the first place, or even look at those customers' accounts to get to even that page, right?
And this list of users that have this access should be regularly vetted or have some sort of approval process to even get your name on there.
And ideally it's a very short list. Sure.
By the way, I'm kind of disappointed that my Twitter account wasn't popular enough to be targeted, but maybe one day.
You need the blue check. Yeah. So just going back to the hard tokens, I know that's something that we've deployed at Cloudflare.
I have one sitting here in front of me. I accidentally fat-finger it sometimes and have characters show up, but can you just talk a little bit about what do you mean by this not created equal?
Like why is that useful versus like SMS, for example?
Yeah. So hard keys work as a form of 2FA and it's a physical requirement, right?
The person literally needs this hard key. I have mine right here.
Oh, you can't see it. But yeah. So with the YubiKey, the attacker won't be able to steal that very easily.
They'd have to physically be able to get that from you. And you mentioned one-time passwords or TOTP, where you use an authentication app and those have been known to be intercepted.
Those can be redirected. And if an attacker was able to vish a user, then they can convince that person to read out those numbers, right?
That has also been known to happen when you're in a stressful situation and you're convinced to do that.
Hard keys are a way to protect yourself against that kind of manipulation.
We'll get to vishing in a second.
That was a new term for me, but it's one of those things, you see it now, you see it everywhere.
So we'll come back to that. But Sam, I think the thing that's interesting, as you and I talked about this product in the early stages, was from an access perspective, right?
In controlling access to SaaS applications, there may be things like read-only mode, right?
That aren't necessarily built into the product.
And if you can actually define on some of the data loss controls, whether you can do things, just even different HTTP methods, right?
So GET versus POST versus PATCH or whatever that may be, right?
Can you build that into these access policies?
Yes. And that's what has been really fun about it. And first let me apologize.
I now realize that my background is making my mullet appear and then disappear.
So I know that's distracting for me at least. But one of the things that we've been really focused on with building the solution is making sure that you can layer all of this on together.
So if you want to build that requirement we talked about earlier, where you only are allowed to reach this sensitive customer record when using a hard key or YubiKey, like what Misha had on the screen, you can do that.
But it doesn't stop there. You can take all of the other features that we describe in Cloudflare Gateway, including throwing something into read-only mode, whether or not the application supports it, on an app-by-app or record-by-record basis.
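One way to picture a read-only mode enforced in front of an application is as a filter on HTTP methods, as mentioned a moment ago. The sketch below is an assumption about how such a rule could be expressed, not the Gateway implementation; `SAFE_METHODS`, `is_allowed`, and the host names are made up for the example.

```python
# Methods that only read data; everything else is treated as a write.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def is_allowed(method: str, host: str, read_only_hosts: set[str]) -> bool:
    """Allow every request unless the host is read-only and the method writes."""
    if host.lower() not in read_only_hosts:
        return True
    return method.upper() in SAFE_METHODS


# Hypothetical list of applications placed in read-only mode.
READ_ONLY_HOSTS = {"crm.example.com"}
```

With that rule, `is_allowed("PATCH", "crm.example.com", READ_ONLY_HOSTS)` is refused while a `GET` to the same host goes through, regardless of whether the application itself offers a read-only role.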
And you can even go a step further. Maybe if you want to say for these particular records, use browser isolation, a product that we announced in GA yesterday.
You're able to add that later on as well. So for a lot of our customers, it either already is, or we expect it to be the case, a kind of choose your own risk model approach, where, you know, think about the applications, the data in your organization.
What is an acceptable level of sensitivity for how a user should be able to reach it?
We are big proponents that, frankly, no one should be able to get to anything of any sensitivity level without something like a hard key.
I think that's just a great posture to have for any company. But then beyond that, you can begin to think about maybe more advanced rules, like a read-only mode, which is just a few clicks in the Cloudflare for Teams dashboard, or the browser isolation tool, where something quite magical is that you can isolate just certain sites or host names or applications.
And for the user, they don't have to leave their normal browser.
It just happens, in some cases without them even knowing they're suddenly using an isolated browser.
So by giving organizations this layered approach where they can choose their own adventure on how they want to deploy it, it's both easy and also respects the kind of classification that an organization wants to have about the data in their team.
You're bringing back some funny memories on the choose your own adventure.
Were you a guy that kept your finger there so you could go back if you chose the wrong path?
Yeah, of course.
Yeah, you got to play out all variables. Of course. So yeah, I think the thing that's neat in my mind there is the multi-factor.
I think our CTO John Graham-Cumming talks about five-factor auth, where you can have, like, you must be using a Cloudflare-issued device.
And we didn't really get into that, but I think there's threat models where you could take, to Misha's point, like somebody did break into your house and steal your security key.
Them putting that in their own laptop is not actually going to be effective, right?
Because you can say this has to come.
And I think we've got a post coming up on this later this week by someone in your team, Sam.
I don't want to fully spoil the announcement, but ability to sort of restrict stuff to company-issued laptops, coupled with hard keys, being in particular locations, time of day, so on and so forth.
It is really like a choose your own security posture.
And there's no way that those would all be baked into those SaaS applications, right?
And so I think the cool thing for me is that uniform experience.
And as somebody who used to administer security, I think that would be really nice to have.
I wanted to ask you, where does this actually get enforced, right?
So we've got people that are working at home today, right? Hopefully, at some point, we'll get back into the office at least a couple days a week.
But where are these controls getting enforced on the network, right? And where would a customer of this product's controls get enforced?
Well, before a solution like this, where it would have been enforced is for me back in San Francisco.
I'm here in Lisbon. I have a company-issued laptop. If we were going to roll out a solution like this a few years ago, and in many cases this is still what's deployed in many enterprises, all of my traffic would go back through a physical appliance in San Francisco, where it would be inspected, filtered, evaluated, and logged, before maybe breaking out to something I was searching for here in Portugal, a website or host that's here in Portugal.
That's, of course, a pretty miserable experience.
Whether you are the organization responsible for maintaining that hardware and upgrading that hardware, and we were talking to a customer just this morning where one of their biggest pain points is literally that the upgrade process of their security appliances causes them downtime and headache and weekend work.
So it's a big problem for that side of the house, but it's also a problem as an end user, because all of my traffic now goes through San Francisco, which moving closer and closer to the speed of light is pretty powerful, but at some point, it's going to slow me down.
With this solution, we're taking advantage of Cloudflare's network. Quite frankly, its ubiquity in all corners of the earth, including a data center here in Lisbon.
So my laptop is using a lightweight agent that we call Warp, which opens up a WireGuard tunnel to, in this case, the Lisbon data center just down the street there, a few milliseconds away, and all of the features that we're talking about are enforced there in the Lisbon data center.
So there's not a good, better, best data center topology.
Every data center in every city that Cloudflare operates around the world, it's over a hundred countries, has all the features that we're describing, which means for me as an end user, the Internet not only is not slower, it can feel even faster because once that traffic hits that data center, as much as we can, we're going to accelerate it to its destination using what we know about the Internet.
But again, what's really important is that all of these filters, all of these policies are being applied there in the Lisbon data center, not back in San Francisco or for Daniele in London, that would be in our London data center.
And so that gives organizations both the removal of that hardware that they used to have to maintain, but also the end user experience is just phenomenal.
I think one thing you mentioned that I hadn't really thought of with this solution yet, but that makes a big difference, especially where you are in Europe is the sort of the hyper-localization or the regional services aspect, right?
So we've announced some controls where you can decide where your data gets processed.
And I think that's actually, as we look at what are the killer use cases of workers, for example, it's yes, the performance is cool.
And at some point maybe we're orchestrating flying cars or something from a local data center that's a couple of milliseconds away, but the ability to process that traffic in region and have very granular controls over that is really important in Europe in particular, but also around the world as companies want to decide where their data actually gets routed and processed.
And so that sounds like another advantage of DLP that maybe we can edit your blog post and add that in.
You mentioned Warp and I want to get to Daniele's stuff next, but I just want to round up with you.
Warp, you described how it leverages WireGuard, but what is the difference between Warp and Gateway?
Did we learn stuff from one or how do you think about that?
Well, the difference between Warp and Gateway is a point of confusion.
We are transparently trying to better elaborate because one is a kind of technology and an agent and the other is a product.
But Warp, which is one of my favorite named things in the Cloudflare product universe, Warp started a few years ago as a mobile agent for any user of the Internet.
So not a business, not a filtering tool or inspection tool, but an agent that I could run on my device that would open up this encrypted and performant tunnel from my device to Cloudflare data center for all of my traffic.
And a few advantages of that include, the application itself is literally just a single toggle.
It's a giant button in the middle of the screen.
So for people who might be less familiar with how to change the DNS settings on their resolver at home, but want a secure path to the Internet, they download the app and just click a single button and they're there.
Other advantages include using Cloudflare's global private backbone on a paid version, which would accelerate traffic even faster.
So getting you to your destination, not just using the Internet, but in a pace that is better than the Internet.
And that's been out and used by anyone who wants a faster, safer Internet for a couple of years now.
And what's been fantastic about having that in the hands of all these users who have a choice about how they connect to the Internet, right?
Like you want a fast connection to the Internet. You don't have to run any agent on your personal device.
This has given them a much more private, safer, faster Internet.
And it's also helped us understand where does that break down?
So what are the strange patterns of behavior for how certain carriers implement IPv6?
Or what happens when I go to a hotel, back when we traveled, and I'm in a captive portal, and that might cause issues for a tunnel out to the Internet.
And so we've been able to, based on all the feedback from users, who we love and who are vocal when it works well and when it doesn't work well, make a version that we can offer to the enterprise. Even though it's a newer product in the Cloudflare offerings (this version of it was launched last October), it's built based on years of testing at scale, at a scale that enterprises don't operate at today, because it's just anyone out there on the Internet, to make that a better product.
And so we've been really thrilled to kind of be able to, from day one, offer organizations of any size a superior experience based on all of that data and all that feedback.
Yeah, I think that we talk about that internally as our superpower, right?
The ability to leverage that just extremely broad customer base we have, work out a lot of the kinks and incorporate the fixes into the product and streamline the user experience.
And if you can operate something for millions of people using it, it becomes a whole lot easier to put it in an enterprise organization with a much smaller number of users.
And so we definitely find those very obscure use cases. And I've seen some of those bug reports come in and I appreciate all the work that you and the engineering teams have done there.
So, cool. I want to switch gears to some of the other stuff we announced today.
So Daniele, API Shield, something we announced originally back in October.
Tell me more about what we launched today and sort of why did we release...
Let's say, what are you seeing in the world that led us down this path?
Yeah, definitely. So for context, let's go back to October, one second.
So in October, during birthday week, we announced API Shield. So it was a brand new product that aimed at protecting API traffic.
At Cloudflare in general, all our security products have always been deployed in front of both web and API traffic.
And basically, out of the box, we've been protecting API traffic since the start.
But what we noticed is that there is a trend towards a larger volume of traffic coming from APIs, and also, let's say, it's becoming more risky in terms of attacks.
So if you take, for example, Gartner, which is the world's leading research advisory company, they predicted that by 2022, API abuses will move from an infrequent attack vector to the most frequent attack vector on the Internet.
And so we saw this trend by talking to customers, by talking to analysts, and we decided to design something which is specifically tailored towards protecting APIs of our users and customers.
And so this was the mission we started with back in October, and we launched with two features.
One was mTLS, and we can discuss more about that later, and the other was schema validation, which was in beta back then.
So what we did in the last few months after Birthday Week was to build upon those solutions and improve them and make them generally available.
So what we launched today was schema validation for all of our enterprise customers.
And also we built upon data loss prevention, which we were just discussing before, giving a different spin, if you want.
So more tailored towards protecting API traffic or avoiding data leakage or exfiltration of data in API responses.
So these are two main features that we launched today.
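To make the idea of schema validation concrete, here is a minimal, hand-rolled sketch that checks a JSON request body against a declared shape and rejects anything that does not fit. The schema format (a dict of field rules) and the `validate` function are assumptions for illustration; API Shield itself works from an uploaded API schema, and this is not its implementation.

```python
def validate(payload: dict, schema: dict) -> list[str]:
    """Return a list of violations; an empty list means the payload conforms."""
    errors = []
    for name, rule in schema.items():
        if name not in payload:
            if rule.get("required", False):
                errors.append(f"missing required field: {name}")
            continue
        value = payload[name]
        expected = rule["type"]
        # bool is a subclass of int in Python, so reject it unless asked for.
        type_ok = isinstance(value, expected) and not (
            isinstance(value, bool) and expected is not bool
        )
        if not type_ok:
            errors.append(f"wrong type for field: {name}")
    for name in payload:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return errors


# Hypothetical schema for one endpoint's request body.
UPDATE_NOTE_SCHEMA = {
    "user_id": {"type": int, "required": True},
    "note": {"type": str},
}
```

A request whose body returns a non-empty list from `validate` would be blocked or logged before it reaches the origin, which is the positive security model being described: only traffic matching the declared shape is let through.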
And I definitely want to dive into those features in a second, but what are the threats and vulnerabilities that you think about with respect to APIs, and what are the things that are top of mind that you're hearing based on your conversations with customers?
Yeah. I would say, if you think really high level, there are three main groups or three main areas where attacks and vulnerability happen.
So the first one is around authentication, authorization.
So trying to steal credentials or certificates, or trying to get access to resources that a specific user, a specific actor wasn't supposed to get access to.
The second area is more abuse of resources. So is there anyone trying to make too many requests, for example, API requests to your origin, or perhaps trying to scan for specific endpoints that have been...
I can call them shadow API or perhaps endpoints that have not been maintained over time.
So they actively expose part of the business logic or data.
And then the last part, the last area is more about data validation and data loss.
So APIs, by their nature, are requests and responses that carry a lot of information, and they directly connect to applications.
So the probability of exposing data to the broader Internet or returning data that the requester wasn't supposed to get or see in the first place is very high.
So when we think about products, we usually think about covering these three areas.
And actually, Misha, in your post that I was reading this morning, I think you talked about some API specific threats.
I think the example was Tesla, and really interesting use case.
I was actually looking at buying some of the Powerwall systems, a bit of a fanboy myself on some of the stuff that's coming out of Tesla, particularly on the solar panel and the energy side.
Can you tell me a little bit about, maybe let's talk through the API vulnerabilities there, and then Daniele, I would like to talk about how we could use API Shield to protect those.
So what did you see, Misha, in your review of that? Yeah, so the Tesla backup gateway is a component that's used to manage a tri-power source, so battery, grid, and solar.
So it helps you determine when to charge the battery, when to send the energy to the grid, and what source of or mix of energy to send to the house that's using this component.
And it allows you to monitor all of these installations through Wi-Fi, AT&T mobile, and the ethernet.
So a few things were exposed to their API through cURL without any authentication.
I mean, that's already a red flag right there.
And a couple of security researchers were able to grab sensitive information from these APIs.
And because they were researchers and not necessarily attackers, they didn't make any changes.
But theoretically, you could use these exposed APIs to have physical consequences, right, like make changes to the grid itself.
And these were accessible on the Internet, so anyone would have been able to push these changes out with a few key pieces of information.
And they were also able to access the Tesla power pack, which is significantly more powerful than a residential power pack, like 15 times the size of a regular one.
So I think in a hack like this, and Daniele can speak more to how API Shield can help, but it's important to consider how you protect your API.
So one, they weren't authenticated. And two, they could have required the customers to immediately change their default credentials right when they first log in, or even shipped the products with randomized default credentials, and of course required 2FA, such as a notification sent to the customers if they're accessing this through a mobile app to make changes to it.
Schema validation, I think alongside device validation would have been really great here.
For example, are these devices that are querying this endpoint supposed to be querying this specific endpoint?
And if this specific device is okay, is it located where it's supposed to be and where the query is originating from?
Because it's supposed to be a static location, right? Yeah, that's a good point.
I think that the source of those queries, where they're coming in from, is interesting, and I'm going to tease some stuff we're announcing later this week, or actually, I think it's on Friday.
The first challenge in this case was these are APIs, my understanding was that they were running locally, and the expectation was that they were not exposed to the Internet, right?
And so I think I spoke with Joe Sullivan, who formerly was at Uber, and he was telling me that, you know, one of the things he was most afraid of was APIs that were exposed to the Internet that he didn't have visibility on, right?
And so I think in this particular case, given that all traffic is reverse proxying through us, right, and we have that visibility across the entire domain, typically, to be able to provide, you know, TLS termination, web application firewall, all the stuff that we do from a security perspective, we have the ability to surface those APIs that our customers may not know are exposed.
And so stay tuned, I might have some announcements there later this week.
And by might, I mean, definitely.
So first is surfacing those. And then I think like, once you surface them, applying a consistent, you know, security posture to them.
And so, Daniele, I think, you know, we talked a little bit about, or Misha pointed out a little bit about the weak credentials used initially, I think there was a brute force attempt.
So, you know, can you talk to me a little bit about like, can you use things like rate limiting?
And what about other stronger authentication capabilities?
Like, how would you use, in this particular case, probably, you know, client certificates is maybe not the right fit.
But in some cases, you know, we're going to make some announcements later talking about some hardware security modules, like that is absolutely the right path.
And so can you talk a little bit about what authentication capabilities that we've recently added to API Shield?
Yeah, sure.
So, well, one feature that API Shield offers out of the box is mutual TLS.
So, this has been developed with mobile application and IoT devices in mind, right?
So you can essentially generate certificates directly from the Cloudflare dashboard.
And you can embed those certificates in your mobile app or in your IoT device, for example.
And then any request, every request the device sends out to the origin via Cloudflare, we validate the certificate, we check the certificate, is it valid or not.
And you can enforce that all the requests without certificate get blocked, essentially.
What we also released this week is more controls on certificates.
So now you can, for example, revoke certificates that, for example, have been compromised.
So if you think about a case where you have a device that you know has been compromised, perhaps the credential has been stolen, you could revoke the certificate and start blocking any request that was originated from that device.
Of course, this is given the option of having the ability to add the certificate, either while manufacturing the device or later on in the lifecycle.
If you don't have that ability, of course, you can deploy other solutions and products.
One of them, as you hinted before, is rate limiting. So when you think about brute force attacks, it usually involves a very high number of requests where you're trying your way through, right?
You're trying to break, to find the password of the system or something like that.
So if you can spot that it's happening by looking at the rate of requests, and then block when the rate exceeds a threshold, then the attack is not carried out.
And this is something we offer already, rate limiting as part of the Cloudflare toolbox.
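To make the rate-based idea above concrete, here is a minimal sketch in Python of a sliding-window counter keyed by client. It is an illustration only, not how Cloudflare's rate limiting is implemented; the class name, the `limit` and `window` parameters, and the use of an IP address as the key are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per `window` seconds for each client key."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # client key -> timestamps of recent requests

    def allow(self, key, now=None):
        # `now` can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the threshold: block (or challenge) this request
        hits.append(now)
        return True
```

A brute-force login attempt would show up as many calls to `allow()` with the same key in a short period, and every call past the threshold returns `False` until older requests age out of the window.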
We are also working on more sophisticated and evolved solutions for that.
That includes also the ability to include the API key, for example, in that.
But that's something that we are working on right now as we speak.
Just to drill into a couple of those things for a second before we move on, I think the thing that I find interesting is looking at these APIs, for example, will return certain error codes, right?
If the credentials were not successful, right? So 401s or 403s, I always get those two confused, right?
So I know we can kind of look at those response codes from the origin and actually control the rules.
I think it's neat that we're adding a lot more capabilities around other things to match on.
And so arbitrary headers and parts of the request, it's going to be great to see that roll out.
I think what you and I've talked about as well is a lot of these attacks, we'll talk later in the week as well about some account takeover protection stuff.
A lot of these attacks will come from behind open SOCKS proxies and things like that, right?
And so can you tell me a little bit about the managed list aspect that is coming up or that came up?
Yeah. So we also launched a new IP list, which is managed by Cloudflare.
So back in July, we launched IP list, which was a feature where customers could write their own list of IPs to either block or allow, for example.
But today we launched something that Cloudflare takes care of.
So we pre-populate this list with IPs that are the exit IPs of open proxies.
And we offer this to the customers.
So a customer can now write a rule that, for example, blocks all traffic generated from open proxies or challenge in case there's, I don't know, perhaps there's an eyeball behind the request.
But yeah, essentially it gives some sort of threat intelligence to the customer and they can use it when they write the rules.
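As a rough illustration of how a rule might consume such a list, the Python sketch below checks a request's source address against a set of networks and picks an action. The list contents are documentation ranges from RFC 5737 rather than real proxy exit IPs, and `action_for` and `has_browser_signals` are hypothetical names invented for this example.

```python
import ipaddress

# Hypothetical example data: documentation ranges (RFC 5737), not real proxy exits.
OPEN_PROXY_EXITS = [
    ipaddress.ip_network(n) for n in ("192.0.2.0/24", "198.51.100.7/32")
]


def action_for(client_ip, has_browser_signals=False):
    """Return 'allow', 'challenge', or 'block' for a request's source address."""
    addr = ipaddress.ip_address(client_ip)
    if not any(addr in net for net in OPEN_PROXY_EXITS):
        return "allow"
    # A listed address is challenged if a person may be behind it, otherwise blocked.
    return "challenge" if has_browser_signals else "block"
```

The point of a managed list is that the contents of `OPEN_PROXY_EXITS` would be maintained by the provider, while the customer only writes the rule that refers to it.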
I think that the thing that I found really neat about that and all the tools that you're building is that you're essentially the way I talk about it with customers is we have this toolbox that has this really simple, expressive, powerful syntax that's based on Wireshark's display filter syntax.
And so as we add these capabilities, custom lists originally and now the managed list, you can interpolate those lists into a rule maybe that also references bot management capabilities, for example, or some sort of threat score.
And so as we add into these capabilities, what I love about it is that the existing capabilities get better.
And I think we did something similar here with revocation as well.
And so there's attributes of a certificate where you may want to check that as part of a firewall rule.
And so you can use our guided API Shield path that we're really streamlining for those controls, but you can also combine those expressions.
And so you can say, is the certificate in a revocation list?
And how about bot score below this? And does the schema validate?
And so on and so forth. Talk to me a little bit about the revocation piece.
So I actually met at one point with a customer who makes video cameras.
And so they're live streaming. Somebody would walk up to them and steal them, which I thought was kind of crazy.
You're live on camera there. But they wanted to block these devices from ever connecting back to their network through Cloudflare.
What is unique about the way that we did revocation versus kind of the traditional OCSP, CRL?
Yeah. So we basically take care of everything, right? So at the edge.
So essentially what the customers can do is they issue a certificate from the dashboard, and they can simply click a button and say, this certificate is no longer valid for me.
Please revoke it. And they could also restore it in case, of course, this was, for example, a mistake or the situation changes.
So really this streamlines the process of issuance of certificates, but also revocation.
So it's something that it can be done, yeah, basically without an API call or a simple click of a button.
And yeah, something to build upon what you mentioned before, we really expose the certificate validation and the certificate revocation as two different attributes.
And we, for example, talked to a few customers that they want to treat this thing separately.
So for example, they want to revoke certificates, but it's not necessarily that they want to block traffic from those devices.
Perhaps they want to redirect them to a different origin or perhaps a page and kind of like still provide kind of a user experience, for example, in case of eyeballs, that is not really like broken.
For example, if you want, if you block all traffic from revoked certificates, the device, for example, or the user will have really probably a bad experience.
While if you can handle that exception, everything flows much better for the users of our customers, essentially.
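One way to picture validation and revocation as two separate attributes is the short Python sketch below. The field names, the action names, and the `/device-reinstatement` path are illustrative assumptions, not the actual rule fields; the sketch only shows that a revoked certificate can be routed somewhere helpful instead of being blocked outright.

```python
from dataclasses import dataclass


@dataclass
class ClientCert:
    presented: bool  # did the client send a certificate at all?
    verified: bool   # did it chain to the expected CA and pass validity checks?
    revoked: bool    # has the operator marked it as revoked?


def decide(cert, revoked_redirect="/device-reinstatement"):
    """Return an (action, target) pair; names and path are illustrative."""
    if not cert.presented or not cert.verified:
        return ("block", None)
    if cert.revoked:
        # Revoked but otherwise genuine: send the device or user to a help page.
        return ("redirect", revoked_redirect)
    return ("allow", None)
```

Keeping the two checks separate is what lets an operator choose a softer action for a revoked device than for a request with no valid certificate at all.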
Yeah, that's a really good point. So I think the separation of, I remember when we deprecated TLS 1.0 and 1.1 for our dashboard, for example, right.
And there was just an RFC published yesterday or the day before talking about deprecating it, full stop.
But this was for our dashboard a couple of years ago. What we didn't want to do is we didn't want to just serve some ugly error page and abort the TLS handshake, right.
And show something that's like unintelligible back to the user.
We did want to actually terminate it and then expose the status of how the handshake was done.
And I think this is another variable, right. Did you TLS handshake at 1.0, 1.1, 1.2, 1.3.
And you can decide, using our Workers platform, what content you actually show there.
And so I kind of see this as a similar technique.
And I didn't actually realize that people were doing rewrites and redirects, but that's really cool because they can do something similar where if you've had an IoT device in the field that was reported stolen, instead of permanently breaking it, it may be that you take them to a page that says, call here, file a ticket here to reinstate this device.
And so that's a pretty unique use case that we could accomplish.
What companies are you kind of seeing in this space that are looking for these API protections?
Like what industries are the most common?
I know there's been a lot of adoption of API Shield since it came out, especially around like managing the full lifecycle of the cert issuance, but what other industries are popping out to you?
So I think a trend is mobile apps and IoT devices and all the industries that really rely on those.
So if you think, for example, about mobile apps, financial services is a big, big industry.
So there is an increase in the reliance of APIs for, for example, open banking or mobile banking.
So that's, we see that picking up, but also healthcare companies, for example, they also are developing their apps in general and for IoT devices, that can be quite broad, right?
So the industrial sector, where devices are very high value and any break-in or attack on them can cause a lot of damage.
That's something we are seeing out. Great. Yeah. Sure. Go ahead. I didn't mean to cut you off.
Were you about to say something? I just lost my thought. Okay. Well, I will try to jar it back with another question.
This is easy. I just get to ask questions and listen to y'all speak or y'all say, I'm sorry.
I'm still getting my Austin vernacular down.
One step closer. The thing that we'd asked today, data loss prevention.
So there's a couple techniques, I think you wrote about this in your post to help protect these APIs from leaking data.
One is schema validation and another is actual regular expression based matching of those responses and obfuscation before they go to the eyeball.
Can you talk a little bit about how these features are used to protect against data loss?
Yeah. So let's start from schema validation.
So they can both be used in the two phases of request response, but in a way schema validation essentially really is designed to validate the request coming in to the origin.
And then data loss prevention, at least the way we have designed it for the first release is more on the check on the response side.
So for schema validation, what we are doing essentially is building a positive security model.
And what do I mean by that? So traditionally a negative security model is more like I allow everything and then I write exceptions on what I want to block.
So for example, I want to block traffic coming from these open proxies, or I want to block traffic coming from a specific country.
The positive security model is of course the opposite.
And what it does is just it blocks everything and only allows traffic that complies with your expectation of good traffic or traffic that you want to receive.
So schema validation follows the second model.
So with API, you might have like a schema, which is essentially a contract that defines how your API behaves that you can leverage to build this positive security model.
So think about OpenAPI, or Swagger, as other people refer to it.
So there you have essentially every endpoint of the API is defined. And for every endpoint, you have a definition of what type of parameters it takes, what's the name of the parameter, what data type it accepts.
And that defines essentially how your API operates.
What we built is essentially a module in the firewall where you can upload this file, like an OpenAPI schema, and the firewall automatically creates rules that check every endpoint and check whether every request complies with the schema.
When you find a request that doesn't comply with the schema, then you block it or you log it or you perform whatever action the customer decides to perform.
So this prevents attacks by really making sure that all the requests, they contain the parameters that are needed and they don't contain parameters that were not supposed to be there in the first place.
And also prevents, for example, attacks like SQL injection.
So perhaps there was a parameter and there it was expecting an integer or a string or something else.
And then there was something that is not really as the schema was predicting, and then you block that.
And that's for schema validation. So if the request reaches the origin, then it means that it complies with the schema.
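The positive security model described above can be sketched in a few lines of Python. The `SCHEMA` dictionary here is a deliberately simplified stand-in for a real OpenAPI document, and the endpoint, parameter names, and `validate` function are assumptions made for illustration; the actual firewall module is not shown in this discussion.

```python
# Simplified stand-in for an API schema:
# (method, path) -> parameter name -> (type, required).
SCHEMA = {
    ("GET", "/orders"): {"customer_id": (int, True), "status": (str, False)},
}


def validate(method, path, params):
    """Return a list of violations; an empty list means the request complies."""
    spec = SCHEMA.get((method, path))
    if spec is None:
        # Positive model: anything not described by the schema is rejected.
        return [f"unknown endpoint {method} {path}"]
    errors = []
    for name in params:
        if name not in spec:
            errors.append(f"unexpected parameter {name!r}")
    for name, (expected_type, required) in spec.items():
        if name not in params:
            if required:
                errors.append(f"missing parameter {name!r}")
            continue
        try:
            expected_type(params[name])  # e.g. int("1 OR 1=1") raises ValueError
        except (TypeError, ValueError):
            errors.append(f"parameter {name!r} is not a valid {expected_type.__name__}")
    return errors
```

With this shape, a request such as `customer_id=1 OR 1=1` fails the integer check before it reaches the origin, which is the kind of injection attempt the type constraint is meant to catch.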
Then the issue is, does the origin leak any sensitive data, for example.
And here is where DLP comes into play.
And data loss prevention here is more like scanning the responses for known patterns or sensitive data.
And as Sam mentioned before, we're talking about social security numbers, credit card numbers, or perhaps secrets like API keys.
So you don't want this to leak out, for example, in an API response that then becomes public.
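A minimal Python sketch of pattern-based response scanning might look like the following. The two regular expressions and the Luhn checksum are common techniques chosen for this example and are assumptions, not the patterns the product actually ships; `redact` is a hypothetical helper name.

```python
import re

# US social security number shape: 3-2-4 digits separated by hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits):
    """Luhn checksum, used to cut false positives on long digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(body):
    """Mask SSN-shaped strings and Luhn-valid card numbers in a response body."""
    body = SSN_RE.sub("***-**-****", body)

    def mask_card(match):
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED CARD]" if luhn_ok(digits) else match.group()

    return CARD_RE.sub(mask_card, body)
```

The checksum step matters because a plain digit-count pattern would also flag order numbers and timestamps; only sequences that pass the Luhn check are masked here.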
And so we built some patterns initially, right, that are kind of easy buttons that you can click.
But what about like in the future, can people define arbitrary patterns on what they want to do?
Yeah. So yeah, exactly. We started by providing something that customers can turn on very easily, right?
So that's in a way the idea also behind other products from Cloudflare.
So easy deployment, you don't need to take like ages to set it up for your environment, for your system.
You just click a button, perhaps you give minimal configuration if you really want, but then it's already on.
And that's the way we are releasing now. So we have categories of sensitive data, like personally identifiable information and financial data, and you can select which of those you want.
Going forward in the future, of course, we are planning to add more functionalities and let, let's say, the power user to define their own sensitive data, their own data type they want us to look for in the responses.
Almost sounds like we're kind of backing our way into an API gateway or an API secure gateway.
Is that something that you're thinking about? Well, yeah, I think the space is very interesting.
And the boundaries between security and API gateway functionality in general are merging more and more, right?
So you are performing security basically at every stage of your API management and handling of APIs.
So it's natural to start providing security products for all these stages that are traditionally exposed in API gateways.
Yeah. And I know that a lot of our customers have, similar to how they start sometimes with hardware appliances from a web application firewall perspective behind Cloudflare, and then either don't buy more or eventually phase those out.
I know a lot of our customers are paying quite a bit of money for API gateways that they're just using a fraction of the functionality of.
And so I think it's neat that we're able to build in a lot of those capabilities.
And then over time, I think listening to customers and as a product team, we like to get stuff out early and often and iterate on it.
And I know you've been spending a lot of time talking to customers and getting that feedback.
But I think for those out there that are starting to adopt API Shield, if you've got stuff that you're doing in your gateway today that you'd like to be able to do at the edge in front of all of your solutions, Daniele is the person to talk to.
And so we'll get your Twitter handle out there and get your email out there.
And if you're an existing customer, customer success can connect you with Daniele.
And of course, file a support ticket and things like that, if you do get stuck.
So that's really interesting. I think I want to spend, I just want to make sure I covered everything here that you launched today.
If there's anything I missed or is there anything that's coming that you're thinking about in the future that you want to share?
And Sam, I want to ask you this as well after, but what else is in the works, Daniele?
Or what else are you excited about?
Yeah, I'm excited about a project of some other colleagues, more like specifically the bot management team.
So they are also working on API security products and they are about to launch a product on Friday.
So stay tuned. It's about anomaly detection and API discovery.
So traditionally bot management doesn't work perfectly for API traffic.
And we basically decided to fix that. So this is going to be very exciting.
Also, it's going to be very exciting because it's going to work in conjunction with all the other API Shield products.
So it's great to see expanding our solution.
Yeah, I think that's a really great partnership that we've developed where a lot of the intelligence and the machine learning experts are within the bots team, right?
And so they are, every request that's coming in, they're building these machine learning models and the way that we're exposing those controls to customers is in your area, right?
So firewall rules, interpolating that bot score.
And we talked about bringing some of the bot mitigation controls down to our Pro and Business plans.
And so stay tuned for the specifics of that announcement.
But I think what I'm excited about on the API shield side is that once you're able to discover like surface those APIs and then link them to a policy, I know you've been working on some great designs where you can say, I want to require a client certificate that's not revoked.
I want to do rate limiting.
Obviously DDoS protection is all built in here. I want to do schema validation, and then I want to also do some of the API anomaly detection stuff.
I got a demo of that the other day. I can't wait to talk to that team about it.
So that's really exciting. Thanks for sharing. Sam, what about you? What is coming up that you're excited about?
We're done. We've shifted. No. One thing I'm really excited about is how all of this begins to tie together on Monday.
And gosh, I can't believe this week's only been three days, but on Monday Anika and the Magic ecosystem announced Magic WAN, which is going to be so powerful in giving organizations the ability to use Cloudflare's network to run their network and integrate with partners who want to give their own customers that type of feature set, that level of connectivity and performance and availability.
And what we're really excited about is bringing in more and more security features, starting with the Magic firewall they announced on Monday, but also things like the DLP level controls or the identity-based controls, things that are available today with a couple of the on-ramps, like the mobile agent that we were talking about earlier, but making those consistent regardless of how your traffic arrived at Cloudflare.
Because we know that customers, one thing we hear quite frequently, especially from different security teams is, one of the goals is just to have the same policy enforced everywhere.
If this is traffic leaving my office, traffic leaving a device, traffic leaving an application, like what Daniele and the team are building.
And so what we're excited about is bringing more and more of these security capabilities into other on-ramps, like the Magic WAN and that partner ecosystem alongside it.
Yeah, I think the exciting thing to me is we're building this DLP into the network at large, right?
And so wherever your traffic, however your traffic is getting to us or leaving us, you'll be able to have these controls.
And like you mentioned Magic firewall, I forgot to bring up with Misha, we were discussing before the segment about how you could use that to actually segment some of these devices.
And so if you think about your Tesla backup API, I can't remember what it was called exactly, but the ability to segment that network traffic and say, this device can only talk to this device I'm using in my home setup.
I'm using VLANs on my Ubiquiti equipment, but I think making that really easy to use and defining that in the cloud at the edge will be great.
And Misha, I wanted to, I mentioned this early on, but vishing, what is vishing?
I think 50 minutes later, we'll circle back to it, but what exactly is that and how does some of the access stuff prevent it that we talked about today?
Yeah. So vishing is, as it sounds, a combination of voice and phishing.
And it's a form of social engineering that's typically voice-based or text message based.
And it's effective because unlike email, people have to respond and react right in that moment.
When you get an email, you can typically read it over, take some time to think about what you're actually looking at.
But it's easier to take advantage of people's emotions when you're attacking them through a vishing type of call.
So more often than not, companies have email protections, but it's more difficult to protect employees on their personal devices or if they get a message and they respond to it.
This is not something that security teams really have control over.
So we're really relying on our users to be able to protect themselves and be cautious there.
Yeah. And I think it's great to see the multi-factor and all those controls and all the data loss stuff as layered approaches, building on top of it.
And this is probably why I don't pick up my phone.
I don't like synchronous communication. And so one easy way to avoid that.
So anyway, we're just about out of time here. So I really want to thank everyone for joining today.
Really interesting updates on the product side.
Misha, I appreciate the color from the security side and describing how these tools can be used to protect against some of these threats.
And so thanks everyone and have a great day, everyone watching at home.
Thanks, Pat. Take care.
Bye.