Cloudflare TV

1️⃣ Area 1 Threat Intelligence Integration

Presented by Jesse Kipp, Blake Darché, Isaac Rehg
Originally aired on 

Join our product and engineering teams as they discuss what products have shipped today during Cloudflare One Week!

Visit the Cloudflare One Week Hub for every announcement and CFTV episode — check back all week for more!


Transcript (Beta)

Hello. Welcome to Cloudflare TV. We're here today to talk about the blog post we released Tuesday about integrating Area 1 threat indicators into Cloudflare Zero Trust.

I'm Jesse Kipp.

I'm the engineering manager of the intel team here at Cloudflare and our responsibilities are to bring in threat indicators and distribute them to products all around Cloudflare.

Also on the call, we have Blake, who is the new head of Cloudforce One, which was announced on Wednesday.

Blake, you want to give an introduction to yourself?

Sure. I'm Blake. I'm, as Jesse said, running Cloudforce One.

Before that, I worked for Area One Security for the last six, five, six years...

eight years.

Then before that, I worked for CrowdStrike for a number of years. And before that, the NSA.

Also on the call here, we have Isaac, who's a member of my team. Isaac, you want to give an introduction to yourself as well?


Yeah. My name is Isaac. I'm a systems engineer on the intel team.

I've been working here for almost a year now and most of my background is in computer vision, but I've been working on new phishing detection systems.


So, one of the things I wanted to start with talking about here is, you know, Area One's product is an email security product and Gateway is a product that protects users when they're browsing the Web.

And so when we think about pulling indicators from Area One into Gateway and Cloudflare Zero Trust, I was interested in hearing from Blake why phishing protection is so important as part of a company's cybersecurity infrastructure, and why Area One built infrastructure to identify phishing websites as opposed to just phishing emails.


So as we see with a lot of attacks today, phishing is one of the top causes of cyber hacking events, right?

So from things like the Target breach several years ago, where someone got into a downstream system, typically these all occur via a phishing attack, and there are many different kinds of phishing attacks.

Some of them are link-based, some of them are text-based where they're trying to get someone to fraudulently send money.

Some of them are an insider threat possibly, and some of them might be a file, right?

And so there's many different kinds of attacks that we see day to day.

And the problem just continues to get worse and worse as attackers become more and more novel.

Today, attackers really take advantage of a lot of cloud services, so they'll use online file sharing services to host payloads, send links out to the payloads in the hopes of evading inbound email security solutions.

So early on at Area One, Phil had the idea that we needed to design a web crawler to identify some of these malicious URLs out in the wild and look at them.

So over time, we developed several different web crawlers from a static approach to a dynamic approach, and today we have a super dynamic web crawler that has a bunch of different detections in it that helps identify things and those are some of the indicators that are now on the Cloudflare Gateway and other services that your team provides.


So as I understand, a lot of your detections are designed not just to identify a single domain, but to identify a campaign or a whole suite of phishing content.

Do you have any interesting stories you can share with us about some phishing campaign in which you found an unexpected connection between some different attacks?

Yeah, so we've seen a lot of different ones.

I think one that will really resonate, a great example, is a landscaping service.

So a big company might have a landscaping service that's cutting their lawn.

But at the end of the day, this little landscaping service of like seven guys gets hacked and then all of a sudden that landscaping service is now phishing their company, right?

And so in the past, we've actually had successes where our crawlers come along, determine that, hey, this is a hacked website and we shouldn't allow traffic from this domain anymore, and this is the malicious URL.

And then all of a sudden, the landscapers are like, we can't talk to our client.

And it's like, yeah, well, your website's compromised and there's like 13 credential harvesters, Gmail, Office 365, etc., all running on this website.

And the landscaping guy has no idea because he's doing landscaping and the attacker's running a full phishing attack from that website.

So there's a lot of interesting scenarios where a little company works with like a much larger organization and unfortunately, a lot of those little organizations have inferior security because they just haven't spent the time and resources on the security to protect their systems.

And then they all of a sudden are the next vector into an attack. We've seen some things in the past where an HR site for Doosan Power was being used to phish Doosan.

We issued a report several years ago called Operation Doos that goes over this, and there's full malware involved and everything, and it's just this little tiny site that got rolled over.

And so that's a really common theme and a common vector into a lot of organizations today.


So prior to Cloudflare's acquisition of Area One, in support of Cloudflare Zero Trust, we also had a data pipeline that both pulled in indicators and did some active crawling looking for indicators.

Isaac, can you give us a bit of an overview of how that system works?

Yeah, sure.

So what we start from is a seed of newly registered domains that we pull from a set of different registrars.

And so from that, we get these domains that have really only been up for maybe a day.

And so they don't really have this reputation. And so that kind of introduces some level of risk around these domains.

We don't know whether they're doing anything wrong, but there's some level of risk associated with them.

And so we take these domains then and we pass them to a Web crawler that we've developed that then goes and collects the DOM from that website, takes a screenshot of the website, as well as downloads different images that are being loaded on that website.

And then it collects features from the DOM, things like the amount of JavaScript in the page and the number of images being rendered on the page, and then basically runs a random forest classifier on these DOM features.

And so we have this model that's been trained to classify whether these sites are phishing based on the features that we've crawled.
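
As a rough illustration of the feature-collection step described above, here is a minimal sketch using only Python's standard library. The class and function names are hypothetical, and the password-field count is an assumed extra feature that the speakers did not mention. The resulting feature dictionary is the kind of input that could then be handed to a classifier such as a random forest, which is not shown here.

```python
from html.parser import HTMLParser


class DomFeatureCounter(HTMLParser):
    """Counts a few simple DOM features while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.num_images = 0
        self.num_password_inputs = 0  # assumed feature, not from the talk
        self.script_chars = 0
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img":
            self.num_images += 1
        elif tag == "script":
            self._in_script = True
        elif tag == "input":
            if (attributes.get("type") or "").lower() == "password":
                self.num_password_inputs += 1

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Only text inside <script> elements counts toward the JavaScript size.
        if self._in_script:
            self.script_chars += len(data)


def extract_features(html):
    """Return a feature dictionary for one crawled page."""
    parser = DomFeatureCounter()
    parser.feed(html)
    parser.close()
    return {
        "num_images": parser.num_images,
        "num_password_inputs": parser.num_password_inputs,
        "script_chars": parser.script_chars,
    }
```

A page that is little more than a login form would show up here as a low image count, a small amount of script, and at least one password field.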

And it works reasonably well, I would say, on the majority of cases of phishing that aren't super highly developed.

So cases where they're maybe just loading a login page and not too much else, no market tracking or any of that sort of thing.

So we've seen a reasonable amount of success with this model and are looking forward to developing some computer vision-based approaches for detecting in the future.

And well, we'll come back to that.

So in addition to crawling and looking at URLs that come in, we also curate a variety of third-party feeds that we think provide good coverage for the different types of security threats that Zero Trust customers are facing, phishing included.

So within the Gateway product, our coverage for phishing attacks in particular is primarily provided by three different categories of websites.

So within the Gateway product, customers can build rules that log, block, or isolate (within Browser Isolation) different categories of websites. Particular to phishing attacks, we have a category specifically for phishing, and this category sits under the parent category of security threats.

And what we find is that most of our Gateway customers and our Zero Trust customers just go ahead and set everything that's in the security threats category to be blocked.

And this is also a group of categories that's blocked by the 1.1.1.1 for Families public DNS resolver.

And then we also have another category that we call brand embedding.

And this is basically domains that contain keywords or misspellings, potentially typosquatting, of common brands and websites.
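
To make the idea of brand keywords and misspellings concrete, here is a small sketch that flags a domain when one of its tokens contains a watched brand name or sits within one edit of it. The brand list, function names, and distance threshold are all assumptions for illustration. The speakers do not say how the brand embedding category is actually computed.

```python
def edit_distance(a, b):
    """Levenshtein distance using the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Hypothetical watch list; a real one would be much larger.
BRANDS = ["paypal", "microsoft", "google"]


def brand_embedding_match(domain, brands=BRANDS, max_distance=1):
    """Return the brand a domain appears to embed or misspell, else None."""
    # Simplification: treat the last label as the TLD and ignore it.
    labels = domain.lower().split(".")[:-1]
    tokens = [tok for label in labels for tok in label.split("-") if tok]
    for brand in brands:
        for token in tokens:
            if brand in token or edit_distance(token, brand) <= max_distance:
                return brand
    return None
```

Note that this sketch would also match a brand's own legitimate domain, so in practice it would need an allow list of domains the brand really operates.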

We also have a category for domains that are new and newly seen.

And what action to take within Gateway on these two categories is up to a particular customer's preferences around false positives and false negatives.

And so some customers choose to just log those or send those websites to Browser Isolation and other customers just go ahead and block those.

And I've seen tweets from customers telling us that they got sent a phishing website that was registered 6 hours ago, and it was blocked by Cloudflare's new domains category.
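
The category-to-action choices described above can be pictured as a small lookup table. The sketch below is not the Gateway API: the category keys, the `Action` enum, and the "strictest action wins" rule are all assumptions made for illustration, and Gateway's real rule ordering may work differently.

```python
from enum import Enum


class Action(Enum):
    """Possible outcomes, ordered from least to most strict."""
    ALLOW = 0
    LOG = 1
    ISOLATE = 2
    BLOCK = 3


# One hypothetical customer's preferences.
POLICY = {
    "security_threats": Action.BLOCK,
    "phishing": Action.BLOCK,
    "brand_embedding": Action.ISOLATE,
    "new_domains": Action.LOG,
}


def decide(categories, policy=POLICY):
    """Return the strictest action configured for any of a domain's categories."""
    actions = [policy.get(category, Action.ALLOW) for category in categories]
    return max(actions, key=lambda action: action.value, default=Action.ALLOW)
```

A customer who is more worried about false negatives than false positives would simply change the `new_domains` and `brand_embedding` entries to `Action.BLOCK`.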

So with the data that we're pulling in from Area One's threat indicator system, they've developed their system to have pretty high confidence, because email deliverability is really important to customers.

So we've just gone ahead and fed that data straight into our phishing category.

So customers who have Gateway set up today, or are using 1.1.1.1 for Families today, will get the benefit of that data from Area One without any additional configuration on their side.

So, yeah, Isaac, can you...

I think few people realize just how many new domains are fraudulent.

You know, in the first 24 hours, something like 50% of new domains will be invalidated for chargeback, fraud, phishing, spam, or malware.

And then in the first 48 hours, another 25% will get invalidated.

And after two days, you're looking at something like a 75% invalidation rate on those, so we tend to recommend customers be fairly aggressive in this area, right?

New domains are rarely good. Most businesses are not working with a company that just was created in the past 24 hours, it turns out.

And there's not a lot of good reason for that.

I know marketing people in particular get a little upset at this because they're launching their new campaigns, but they could just register their domain a couple of days earlier and it would have a better score, if you will.

So there's just a lot of invalidation of new domains, and that is a huge area of badness.

This is one of the areas where Cloudflare really has a great advantage, in that the 1.1.1.1 resolver gives us early insight into when those domains are created and helps us see them early on in the process.

So, every 24 hours there's a set of files published that lists all the domains in many of the top-level domains, .com for example.

But a lot of phishing attacks are launched and taken down within a 24-hour period.

And so you don't even have the time to see and gather that information from those public zone files before the phishing attack has already taken place and gone.


So this is an area where use of the 1.1.1.1 resolver and use of Cloudflare Gateway actually helps customers protect one another as well, because if some customer sees a domain, or gets sent an email, or looks up a domain in the resolver, that goes into our system, and we are able to track it and label it as a new domain, and that impacts all other customers as...

And it's the same thing on the Area One side, right?

So with email, on the email side, we allow customers to set flexible policies on new domains.

They can basically drop all traffic from a new domain in the first 24 or 48 or however many hours they decide.

And as new domains are added, all customers get that information.

So that helps protect all customers from these domains.
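
The shared "new domain" idea discussed here can be sketched as a registry that remembers when each domain was first observed and lets each customer pick their own window. The class name, method names, and sample domain below are hypothetical, and treating a never-seen domain as new is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone


class FirstSeenRegistry:
    """Shared record of when each domain was first observed by anyone."""

    def __init__(self):
        self._first_seen = {}

    @staticmethod
    def _normalize(domain):
        return domain.lower().rstrip(".")

    def observe(self, domain, when):
        """Record a sighting and return the earliest sighting on record."""
        key = self._normalize(domain)
        if key not in self._first_seen or when < self._first_seen[key]:
            self._first_seen[key] = when
        return self._first_seen[key]

    def is_new(self, domain, now, window_hours=24):
        """True if the domain is unknown or was first seen inside the window."""
        first = self._first_seen.get(self._normalize(domain))
        if first is None:
            return True  # assumption: never-seen domains count as new
        return now - first < timedelta(hours=window_hours)


registry = FirstSeenRegistry()
first_lookup = datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)  # arbitrary time
registry.observe("fresh-domain.example", first_lookup)
```

Because the registry is shared, one customer's lookup sets the first-seen time that every other customer's 24-hour or 48-hour policy is then evaluated against.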

Isaac, how about you tell us a little bit more about some of the things you've been experimenting with recently in the world of computer vision and phishing detection on the Intel team?

Definitely, yeah.

It's an exciting world. So our two main approaches that we're working on right now are, first, logo detection, so just looking for particular logos of particular brands that we've registered, on sites where we know that the brand isn't maintaining that infrastructure, and second, an approach that actually looks at similarity between the overall layout of a particular page that we're looking at and some set of potential target pages that we think this page could be impersonating.

And so I think the main utility of these vision-based approaches is to overcome a limitation in some of these more hard-coded indicator approaches.

So, for instance, when identifying specific threat families or a specific phishing campaign, you might look at patterns in the DOM of a particular site to see that it's using something that acts as a signature for a particular campaign or a particular phishing kit.

And so that works really well. But the issue is, of course, that new campaigns or new phishing kits are being designed and put out every day.

And so for the attacker to overcome that, all they have to do is write a new website in a way where that particular pattern isn't being detected anymore.

And so with the vision-based approaches, we're hoping to get to what phishing is at the core, which is presenting a page that is almost identical to some other page that they're trying to harvest credentials for.

And so the logo-based approach is probably the simplest, in that you can literally just create a visual filter from a PNG of the logo that some customer provides you with, or from some common big brands that we know are already out there and are being actively phished.

And it has one fundamental issue, which is that a lot of brands, like Facebook or other social media companies, might legitimately be displayed on certain pages.

You might have like a Facebook login or a Google login or something like that.

And so, in certain cases, an instance of a logo isn't necessarily an indicator of a phishing instance, although it can be helpful when combined with the other detection approaches that we've mentioned earlier.

But the main reason for this similarity-based approach, where we're looking more holistically at the overall page layout and trying to identify whether it is in fact almost a mirror image of the other page, is to overcome this potential shortcoming of the logo detection method.
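
One simple way to picture a layout-similarity comparison is a perceptual hash. The sketch below uses an average hash, which is a technique swapped in here for illustration: the speakers do not say which similarity method they are building. It assumes each screenshot has already been reduced to a small grayscale grid of the same size (for example 8x8), represented as a list of rows of pixel intensities. All function names are hypothetical.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 where the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(hash_a, hash_b))


def layout_similarity(pixels_a, pixels_b):
    """Fraction of matching hash bits, from 0.0 (none) to 1.0 (all)."""
    hash_a = average_hash(pixels_a)
    hash_b = average_hash(pixels_b)
    return 1 - hamming_distance(hash_a, hash_b) / len(hash_a)
```

A crawled page whose score against a known login page is close to 1.0, while being served from an unrelated domain, would be a candidate for closer inspection. Small color changes leave the hash unchanged, which is the point of comparing coarse layout rather than exact pixels.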

So, Blake, on Wednesday, you got to announce in the blog the creation of Cloudforce One, a new security research team.

And I'm wondering if you could tell us a little bit about the perspective on security research you're hoping to bring to Cloudforce One?


So I think we're looking to provide customers a more in-depth look at the threats that might be targeting them, especially from the nation-state side, and at understanding more complex threats and what vectors they might be coming in on, right?

Whether that be an email link, a file, or someone trying to hack their server through a remote code execution vulnerability in a web service, right?

And so we've seen attackers over the last years, they'll use all of these different attack techniques.

Whatever is easiest is what attackers will use.

But, you know, attackers have deficiencies, and one of the deficiencies they have is they like things to be easy.

So there will always be some way to detect an attack because an attacker will inevitably repeat the exact same way they do something time and time again.

Otherwise, they are just not able to hack as many people as they like.


So the more automated their operations are, usually the more detectable they are.

So we just hope to be able to highlight these things to our customers and produce more insight into the types of attacks that are ongoing out there in the wild.


So one of the things I'm really excited about, looking forward at the integration between Area One and Cloudflare, is this idea of seeing the same attack patterns repeated over and over again and, instead of just flagging domain names, really pushing some of those detections as close to our users as we can.

So actually running them in Browser Isolation.

If we're inspecting the DOM or performing logo detection or computer vision, really performing that in Browser Isolation, or, for websites where we're intercepting the traffic and proxying it with Cloudflare Gateway, looking at the traffic as it goes through Cloudflare Gateway.

And I think that's really exciting because it means that we have the opportunity to target some of these threats without having seen them before they ever arrive at the customer.

We can take these patterns that we see over and over again and detect them the first time they're used, even as the indicators that fluctuate fast, like the domain names that are purchased and the URLs that are used, keep changing.

100%. I think social media is a huge area for web Browser Isolation in particular, where there's been lots of trouble in the past. Many companies, especially in the defense industry, have been attacked on the social media side, where an attacker will create a fake profile, reach out to an unsuspecting user, send them some sort of payload or a link to a payload, and try to get the user to open that payload. Social media is a great spot for web Browser Isolation because there's high engagement with an unknown set of individuals that often falls outside of the typical security boundary.

So being able to extend detections and protections into these areas will derive a lot of value for most customers and really help secure their architecture.

So Blake, do you have any things from Cloudflare that you're excited to bring into Cloudforce One?

I think we have a lot of things.

We just have to keep looking at which ones to go with first.

But I think there's a lot of similarities in different areas that will allow us to create a lot more compelling products, not just at Cloudforce One, but at Cloudflare and throughout our entire ecosystem.

So, you know, web Browser Isolation, looking at like a BEC attack that might be occurring, say, on LinkedIn.

That's a great area of exploration.

I think looking at strange, unknown links coming in on web Browser Isolation or on Gateway, I think these are great areas to look at.


So, a user's got a link, and the domain was just registered in the last 24 hours. What is this link doing, right?

Are we able to somehow crawl this link in real time and then interrupt that connection to the user before they receive a malicious payload?

So things of that nature will all feed into Cloudforce One, looking at the overall threat landscape and providing a lot more detection and protection for users.

Isaac, anything from Area One that you've seen that you're excited to bring into our intel team data pipelines and detections?

Yeah, well, when we met, they mentioned they had done some work already in logo detection.

So I'm excited to get some feedback on our approach from them, and definitely excited to learn more about their web crawler.

It would definitely be helpful to have a discussion on some best practices with Selenium.

So yeah, that's all lots of good stuff to add to our approaches.


We're winding down with a couple of minutes left here. Any parting thoughts?

I think generally attackers just become more and more novel every day.

So they'll just think about what's this new crazy service that everyone likes, and they'll try to figure out how to use that service to attack people.

So that's just generally the way I think attacks are moving, whatever is the new most novel thing.

They just move to that.

All right.

Thanks everybody for tuning in and listening to our discussion here about Area One Threat Intelligence integrated into Cloudflare Gateway and stay tuned for the next segment coming up very soon.

We're betting on the technology for the future, not the technology for the past, so having a broad network, having global companies now running at full enterprise scale gives us great comfort.

It's dead clear that no one is innovating in this space as fast as Cloudflare is.

With the help of Cloudflare, we were able to add an extra layer of network security controlled by alliance, including WAF, DDoS, Cloudflare Users, CDN and so it allows us to keep costs under control and caching and improve speed.

Cloudflare has been an amazing partner in the privacy front.

They've been willing to be extremely transparent about the data that they are collecting and why they're using it, and they've also been willing to throw those logs away.

I think one of our favorite features of Cloudflare has been the Worker technology.

Our origins can go down and things will continue to operate perfectly.

I think having that kind of a safety net provided by Cloudflare goes a long ways.

We were able to leverage Cloudflare to save about $250,000 within about a day.

The cost savings across the board is measurable, it's dramatic, and it's something that actually dwarfs the yearly cost of our service with Cloudflare.

It's really amazing to partner with a vendor who's not just providing a great enterprise service, but also helping to move forward the security on the Internet.

One of the things we didn't expect to happen is that the majority of traffic coming into our infrastructure would get faster response times, which is incredible.

Like Zendesk just got 50% faster for all of these customers around the world because we migrated to Cloudflare.

We chose Cloudflare over other existing technology vendors so we could provide a single standard for our global footprint, ensuring world-class capabilities in bot management and Web Application Firewall to protect our large public-facing digital presence.

We ended up building our own fleet of HAProxy servers, such that we can easily lose one and then it wouldn't have a massive effect.

But it was very hard to manage because we kept adding more and more machines as we grew.

With Cloudflare, we were able to just scrap all of that because Cloudflare now sits in front and does all the work for us.

Cloudflare helped us to improve the customer satisfaction.

It removed the friction with our customer engagement.

It's very low maintenance and very cost effective and very easy to deploy and it improves the customer experiences big time.

And Cloudflare is amazing.

Cloudflare is amazing.

Cloudflare is such a relief.

Cloudflare is very easy to use.

It's fast.

Cloudflare really plays the first level of defense for us.

Cloudflare has given us peace of mind.

They've got our backs.

Cloudflare has been fantastic.

I would definitely recommend Cloudflare.

Cloudflare is providing an incredible service to the world right now.

Cloudflare has helped save lives through Project Fair Shot.

We will forever be grateful for your participation in getting the vaccine to those who need it most in an elegant, efficient and ethical manner.

Thank you.

You run a successful business through your e-commerce platform.

Sales are at an all time high.

Costs are going down and all your projection charts are moving up and to the right.

One morning you wake up and log into your site's analytics platform to check on current sales and see that nothing has sold recently.

You type in your URL only to find that it is unable to load.

Unfortunately, your popularity may have made you a target of a DDoS or distributed denial of service attack, a malicious attempt to disrupt the normal functioning of your service.

There are people out there with extensive computer knowledge whose intentions are to breach or bypass Internet security.

They want nothing more than to disrupt the normal transactions of businesses like yours.

They do this by infecting computers and other electronic hardware with malicious software or malware.

Each infected device is called a bot.

Each one of these infected bots works together with other bots in order to create a disruptive network called a botnet.

Botnets are created for a lot of different reasons, but they all have the same objective: taking web resources like your website offline in order to deny your customers access.

Luckily, with Cloudflare, DDoS attacks can be mitigated and your site can stay online no matter the size, duration and complexity of the attack.

When DDoS attacks are aimed at your Internet property, instead of your server becoming deluged with malicious traffic, Cloudflare stands in between you and any attack traffic like a buffer.

Instead of allowing the attack to overwhelm your website, we filter and distribute the attack traffic across our global network of data centers using our Anycast network.

No matter the size of the attack, Cloudflare Advanced DDoS protection can guarantee that you stay up and run smoothly.

Want to learn about DDoS attacks in more detail?

Explore the Cloudflare Learning Center to learn more.
