🔒 Linux kernel security tunables everyone should consider adopting
Presented by: Ignat Korchagin, João Pedro Lima, Daniele Molteni
Originally aired on March 6, 2024 @ 12:00 AM - 12:30 AM EST
Welcome to Cloudflare Security Week 2024!
During this year's Security Week, we'll make Zero Trust even more accessible and enterprise-ready, better protect brands from phishing and fraud, streamline security management, deliver dynamic machine learning protections and more.
In this episode, tune in for a conversation with Cloudflare's Ignat Korchagin, Daniele Molteni, and João Pedro Lima.
Tune in all week for more news, announcements, and thought-provoking discussions!
For more, don't miss the Cloudflare Security Week Hub
Transcript (Beta)
Hello, welcome everybody. My name is Daniele Molteni and I'm a product manager here at Cloudflare and I'm introducing a very exciting Cloudflare TV segment about one of the blogs we launched during Security Week.
If you haven't heard about Security Week, it's a very exciting time of the year when Cloudflare launches new products and features in the security space.
And it's also during this week that we release a number of blogs about how we do things at Cloudflare, what we build internally to make our jobs and lives easier, and how we better secure our infrastructure.
And with me today, there are two of my colleagues from the infrastructure team.
We have Ignat, who is an engineering manager, and João, a systems engineer, who are both very knowledgeable about the Linux kernel that we use.
And they just wrote a very interesting blog about the tunable features of our Linux kernel, which they are going to tell us a little bit more about.
So let's start from Ignat.
Can you tell us a little bit more about you, what you do at Cloudflare, and also what's the blog about?
Yeah, hi everyone. I'm Ignat. I'm the engineering manager of the Linux team at Cloudflare and our team is responsible for the operating system we run in production.
So you might have heard about different products and services we run in Cloudflare.
You might also have heard about different internal infrastructure we run, clusters like Kubernetes or Kafka or whatnot.
But everything we run runs on Linux servers and my team is kind of responsible for ensuring their setup is properly secure and performing.
Yeah, and what about the blog today that we have released?
Tell us a little bit more about the story there.
Why did you feel like writing this blog?
Well, I guess we will discuss it during our meeting today, but the Linux kernel is the foundational engine, the heart of any operating system in production.
And basically, it's a piece of code which runs at the privileged level.
So it enforces all the other security mechanisms you might want to design.
So if we want to separate several users in the operating system, so that this user can access this type of resources or files and this program can access this hardware, the thing that does it is actually the Linux kernel.
So if you read old NGINX blog posts, they describe NGINX as being this high-level orchestration mechanism where actually everything which it does is done by the Linux kernel.
So NGINX doesn't send data over the network.
It's the Linux kernel that does it on behalf of NGINX. So the Linux kernel is the foundational piece of any software stack, and this is why it's important to keep it secure as well.
So it's the actual orchestrator.
Great. What about you, João? Tell us a little bit more about Cloudflare.
Yeah, and let's go with that first.
Hello, folks. So I joined Cloudflare about four years ago. I was initially part of the security team, and some time ago, I moved to the Linux team under Ignat.
And so far, I've been mostly working on the boot process and more specifically on improvements to the security of the boot process that we can make.
So yeah, I think I would probably start there.
And Ignat very well described the kernel as being like the beating heart of any computing system, but there is a bunch of things that happen before we have a properly booted kernel.
And all of those things are important from a functional standpoint as well as from a security standpoint, which is the core topic of the blog post that we are covering today.
So I just wanted to share a tiny graphical representation, very high level of what the boot process of a computing system generically looks like.
So if we start with a cold server, that is, a server that has no power, we ask it to power on and it loads a bunch of firmware, which in our case is UEFI, one of the modern incarnations of computer firmware.
And from there, it generally proceeds to load what's called a boot loader.
So it's an intermediate piece of software that prepares the system to receive the kernel.
And eventually it loads the kernel itself and the kernel executes several procedures to initialize the system further.
And eventually it will start loading both drivers and kernel modules, as well as application services, which are the actual applications that we will be running.
So if we look at this from a security standpoint, there is a bunch of very sensitive stuff happening here.
And one of the components of our boot process, which is called UEFI Secure Boot, attempts to protect some of these phases that we already talked about.
So the idea behind UEFI Secure Boot is that we are going to ask the system to execute a certain thing.
And we want to ensure that the thing we are executing is something that we have allowed to be executed before.
So in the case of UEFI Secure Boot, that is done with cryptographic signatures.
So looking at the kernel specifically, when we build the kernel, we sign it with a specific set of keys.
And then when the firmware is preparing the kernel to be executed, it verifies that the kernel we are asking it to execute has been signed with the keys that we accept, and starts running it if that validation is successful.
That's great.
Thanks also for the overview. It's worth mentioning why we are doing this.
Because as I mentioned, the Linux kernel is this foundational piece. It's a big program running on every machine, and part of that program is actually providing security to its applications and services.
If that program is compromised somehow, if the code within the program says instead of enforcing security, allow everything, that's bad.
This is why, before even running this program, the Linux kernel, we also want to ensure it's the program that we wrote and not something somebody else modified and made us run.
So this is where we built a secure boot chain.
The firmware ensures that we can trust the boot loader, the boot loader ensures that we can trust the Linux kernel, and then we can trust the Linux kernel to provide security for drivers and applications and services.
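The chain just described (firmware trusts the boot loader, the boot loader trusts the kernel) can be modelled in a few lines. This is only a toy sketch, not how UEFI Secure Boot is implemented: real Secure Boot verifies cryptographic signatures against enrolled keys, whereas this example substitutes an allow-list of SHA-256 digests so that it runs with the Python standard library alone. The `Stage` class, the `boot` function and the image contents are all hypothetical names made up for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Stage:
    name: str
    image: bytes
    # SHA-256 digests of the next-stage images this stage is willing to run.
    # Assumption: a digest allow-list stands in for real signature checks.
    trusted_next: set = field(default_factory=set)


def digest(image: bytes) -> str:
    return hashlib.sha256(image).hexdigest()


def boot(chain):
    """Walk the chain; each stage checks the next one before handing over.

    Returns the names of the stages that ran. Raises PermissionError as
    soon as a stage is asked to run an image it does not trust.
    """
    ran = [chain[0].name]  # the firmware acts as the root of trust
    for current, nxt in zip(chain, chain[1:]):
        if digest(nxt.image) not in current.trusted_next:
            raise PermissionError(f"{current.name} refuses to run {nxt.name}")
        ran.append(nxt.name)
    return ran


# Hypothetical images: each stage only knows the digest of its successor.
kernel = Stage("kernel", b"kernel image")
bootloader = Stage("bootloader", b"bootloader image", {digest(kernel.image)})
firmware = Stage("firmware", b"uefi firmware", {digest(bootloader.image)})
```

Calling `boot([firmware, bootloader, kernel])` walks all three stages, while swapping in a kernel whose bytes differ from the trusted one stops the chain at the boot loader.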
Yeah, as in you want to make sure that security starts from the very beginning, for the first thing you actually execute.
Tell me a little bit more, in the worst case scenario, tell me an example of exploitation scenarios that we could run into if this is not secured properly.
Yeah, in the blog post, we have an example of why we need to ensure the integrity of the kernel and, for example, what can happen if we don't have enough protections.
And we can actually repeat it here just for an interactive demo, and let's hope the demo gods are working today properly.
Yeah, while you're bringing that up, anyone who is interested to learn more, because probably some of our viewers haven't read the blog, so if you're interested to have all the details, you can go into the Cloudflare blog and you can read more and all the details there.
Yeah, so I have this simple virtual machine right here.
It can be dual-booted into either the Cloudflare kernel or the stock Debian kernel.
So at Cloudflare, we actually use Debian as our production operating system, but we don't use the Debian kernel.
But as an example, right now it's running the stock Debian kernel.
So if you just install Debian, like the official Debian operating system from the Internet, this is what you will likely get.
And obviously, because security is always at odds with, let's say, flexibility, right?
So traditional distribution kernels are more relaxed from the security point of view because they need to support more users and more use cases.
So therefore, what is possible in the stock kernel is probably not possible in the Cloudflare kernel.
That's why actually I'm showing the exploit on the Debian kernel, just to show what is possible, right?
But on this system, I have also configured a security feature called SELinux.
So SELinux is like an add-on, a security module for the Linux operating system originally developed by the NSA, but I would say it's a good part of the NSA, to provide more protections for applications running on the Linux platform.
It's now part of the Linux kernel itself, so it's a security mechanism implemented inside the kernel.
And now we will see that if we subvert that mechanism by bypassing the secure boot chain, we end up with less security on the operating system.
So if you're interested in SELinux, there are links in the blog post, but what you generally need to know now is that SELinux has to be configured and can then be in either an enforcing or a permissive state.
So, I don't remember what state it's in here, so I can check my state now.
So now my system is in permissive state. What it means is the security mechanisms provided by SELinux are not enforced, but rather, if we violate them, a log entry is generated.
But we can actually enforce these mechanisms now.
So we can set the system in the enforcing mode. So now I would say my operating system is in a more secure state because the Linux kernel will do more checks for things to happen.
Now let's actually see how, by not having a secure boot chain, we can subvert that mechanism.
So I will just create a simple kernel module.
So I will make a directory here, my module. I'll go inside and I will just, I will not write the code from scratch.
I'll just copy paste it from the blog post directly.
So to show you, it's the same thing.
All right, before that, we actually, I wanted, before we start writing any code, I wanted to do some introduction.
What are we trying to do, right? So what we want to do now is we want to attack this state.
So my system is enforcing, but my malicious code wants to disable that.
It wants to set it back into permissive, where security checks are not enforced, right?
So Linux is an open source operating system.
So we can just take a sneak peek at how it is implemented. Oopsie, wrong screen.
Yep, so we will be using this external resource, which is very convenient for introspecting Linux kernel source code, because the code is kind of very well indexed.
You can basically select any kernel version you want, because the code differs.
So now we are running a Debian kernel and we saw that it runs kernel version 6.1.76.
So, for completeness, let's explore this one. And we'll just go directly and see where the SELinux source code is.
So we can go to security, selinux, and this is basically the source code for SELinux.
There is, I know because I explored it earlier, a variable called selinux_state.
So we can actually try to search for it here. Yeah, and it's defined in this header file.
So this selinux_state variable is the in-kernel representation of the current SELinux configuration.
And what we can see here is that, if our kernel is configured with this flag, the structure will have this Boolean here called enforcing.
And this Boolean is checked by the kernel code on each action it wants to enforce.
So for every code path inside the kernel which needs to be enforced by SELinux, the code will first check if this Boolean is true or false, right?
If it's true, the action will be enforced.
If it's false, the action will not be enforced. So to basically disable SELinux, all we have to do is to somehow set this variable from true to false.
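To make the role of that Boolean concrete, here is a deliberately simplified model of the check. It is not the kernel's code: the `SelinuxState` class, the `check` function and the log format are hypothetical, and the real policy decision is far richer than a single `policy_allows` flag. The sketch only shows why flipping one flag is enough to turn denials into log entries.

```python
from dataclasses import dataclass, field


@dataclass
class SelinuxState:
    # Toy stand-in for the kernel's state structure and its enforcing flag.
    enforcing: bool = True
    audit_log: list = field(default_factory=list)


def check(state: SelinuxState, policy_allows: bool, action: str) -> bool:
    """Return True if the action may proceed."""
    if policy_allows:
        return True
    # A policy violation is always recorded...
    state.audit_log.append(f"denied: {action}")
    # ...but it is only blocked while the flag is true.
    return not state.enforcing


state = SelinuxState()
blocked = check(state, policy_allows=False, action="load module")  # False

# What the malicious module does in spirit: write the flag directly,
# without going through any interface that would record the mode change.
state.enforcing = False
allowed = check(state, policy_allows=False, action="load module")  # True
```

In this model the same forbidden action is refused before the flag is flipped and permitted afterwards, and nothing in the log records that the mode itself changed.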
So an attacker, if they manage to do that, can basically disable any security.
Provided by, yes, provided by this. Yeah. So once we get back to our code, actually, this is exactly what this getenforce command says.
It provides the internal value of that variable.
If it's zero, it will say permissive. If it's one, it will say enforcing, right?
And we can change, officially change the variable with this command.
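For readers who want to read the same value from a script, the sketch below does roughly what the check above reports. It assumes the SELinux filesystem is mounted at `/sys/fs/selinux`, where the `enforce` file holds `0` or `1`; the function name `selinux_mode` is made up here, and treating a missing file as "disabled" is a simplification rather than the tool's exact logic.

```python
from pathlib import Path


def selinux_mode(enforce_file: str = "/sys/fs/selinux/enforce") -> str:
    """Report the SELinux mode based on the kernel's enforce file.

    Assumption: selinuxfs is mounted at /sys/fs/selinux. If the file is
    missing we report "disabled", which is a simplification.
    """
    path = Path(enforce_file)
    if not path.exists():
        return "disabled"
    return "enforcing" if path.read_text().strip() == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

The path is a parameter so the function can be pointed at a test file instead of the live system.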
But let's try to write a malicious kernel driver which actually changes the variable quietly, without our knowledge.
And here we will just write a bare-bones kernel driver.
I will just copy-paste the code from the post. I'm using nano.
So sorry, all you Emacs and Vim users, if you find it unholy or something.
But nano is quite simple here. Yeah. I'm just going to copy-paste the code from the post now.
Okay. Like a true manager, I'll Ctrl+C and Ctrl+V the code.
So this is like a bare-bones Linux kernel module.
Some traditional bits here, but it's a working example.
So you have some metadata. You declare two functions, mod_init and mod_fini.
This function will be executed by the kernel when the module is loaded.
And this function, which is a no-op, will be executed when the module is unloaded.
And the code is quite simple.
What we have here is the runtime address, a potential runtime address, of this flag inside the Linux kernel memory.
And we just set it to false, to zero.
Quite simple. So our module will maliciously write zero directly into the memory pointed to by this address.
And if this address points to that flag, we influence the behavior of the kernel, right?
So there are two things we actually need to do here, since I copied this from the post and kernel address space layout randomization is in play.
In my current instance, that address will be different now.
So I'm going to double check it first. And to double check it, I can just do it from here.
I need to be an admin to do that. So the kernel exposes this magical file, which exposes the addresses of all internal kernel symbols.
And here is the current address of this selinux_state variable.
So we just take it, put it here. Forgot the X.
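The lookup done by hand here can be scripted. The sketch below assumes the "magical file" is `/proc/kallsyms`, whose lines have the form `<hex address> <type> <symbol> [module]`; the helper names `find_symbol` and `read_symbol` are made up for illustration. Without sufficient privileges the kernel typically reports the addresses as zeros, which is why the demo needs an admin account.

```python
def find_symbol(kallsyms_lines, name):
    """Return the address of `name` as an int, or None if it is absent."""
    for line in kallsyms_lines:
        fields = line.split()
        # Each line is "<hex address> <type> <symbol> [module]".
        if len(fields) >= 3 and fields[2] == name:
            return int(fields[0], 16)
    return None


def read_symbol(name, path="/proc/kallsyms"):
    """Look the symbol up in the live kernel's symbol table."""
    with open(path) as symbols:
        return find_symbol(symbols, name)


if __name__ == "__main__":
    address = read_symbol("selinux_state")
    # hex() adds the 0x prefix that a C source file expects.
    print(hex(address) if address is not None else "symbol not found")
```

Because of address space layout randomization the value differs from boot to boot, so a hard-coded address copied from somewhere else will not match.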
Yeah. So now technically, if we compile that module, it should just work.
To compile a kernel module, we'll also need a so-called Kbuild file, which just describes how to build this module.
I'll also copy-paste it from the blog directly. It's a one-liner.
What this Kbuild file says is that our module consists of only one object file, my module's .o, which will be generated from my module's .c file.
And now I can compile this module.
All these commands are taken directly from the kernel build documentation.
Okay. I don't have this because I haven't installed the special Linux headers package, which you need to have to compile kernel modules.
So again here, what you're trying to do is to simulate an attack by installing a malicious driver that you just built.
I'm simulating an attack and also trying to show how important it is to keep the integrity of the kernel protected, just because once the attacker gets inside the kernel with their malicious code, they can do basically anything, because the kernel is that all-powerful program which has all the privileges on your system.
So now we have compiled the module and we can actually try to load it.
Yeah.
Actually, SELinux is preventing me from loading the module, and I had the same thing in the blog.
That's because the policy is configured so that modules can be loaded only from a special modules directory, but we can actually just copy it there.
Right. So /lib/modules, 6.1.0-18.
This is our current kernel and some directory there.
I use crypto, for example.
So from there, it might work. Now the module is loaded and now we can check the SELinux flag.
And now it's back to permissive mode.
So our module loaded, modified the kernel memory, and disabled SELinux completely.
So this is why... You successfully attacked the kernel in this case.
Yes, yes. And disabled the additional protection mechanism.
That's great. Thanks a lot. So I think one thing worth mentioning here is that we are using a lot of root-level capabilities to execute this attack.
But one key aspect that Ignat briefly mentioned, but just to reiterate on it, is that we are doing this silently.
So when you disable SELinux through normal mechanisms, you will produce an audit log, which will make it obvious that this was disabled.
But using this strategy, we can just attack the kernel, disable SELinux, execute whatever the attacker wants to do, put it back into enforcing mode, and the system administrators will be none the wiser.
So I think it's an important feature to highlight from this attack.
So you fly under the radar, basically.
And secondly, it's worth mentioning that SELinux specifically is an internal protection mechanism which was designed to be able to restrict even root access.
But we can see that root is not restricted enough here: root can disable the very security mechanism which was designed to restrict the root user, right?
And the main problem that we see on the Debian kernel is that, if we go to the kernel log, we will see that it supports signed modules, but our module is unsigned.
But the Debian kernel still allowed that module to be executed, right?
So the module doesn't have any signature, and the Debian kernel notified us that we were trying to load an unsigned kernel module, but it didn't prevent it.
The difference between the Debian kernel and the Cloudflare kernel is that in the Cloudflare kernel, we actually don't allow that.
So technically, you don't allow modules that are not signed.
So the way we try to protect from this kind of attack is that everything we load inside the kernel should be signed.
Therefore, it is part of our secure boot chain.
So if a third party gets into the system, generates this kind of a module, they will not be able to load it, and that malicious code will not execute.
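One quick way to tell whether a module file carries a signature at all is to look at how it ends: signed modules have the signature appended, followed by the marker string `~Module signature appended~` and a newline. The sketch below checks only for that marker. It is an illustration, not a verifier: it says nothing about whether the signature is valid or made with a trusted key, which is what the kernel checks when signatures are enforced. The function name is hypothetical.

```python
# Marker that follows an appended module signature at the end of the file.
MODULE_SIG_MAGIC = b"~Module signature appended~\n"


def has_signature_trailer(module_bytes: bytes) -> bool:
    """True if the file ends with the signature marker.

    This only detects that *some* signature was appended; it does not
    verify the signature or the key it was made with.
    """
    return module_bytes.endswith(MODULE_SIG_MAGIC)


def file_has_signature_trailer(path: str) -> bool:
    with open(path, "rb") as module_file:
        return has_signature_trailer(module_file.read())
```

A freshly compiled module that was never signed, like the one in this demo, would fail this check.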
That's great. And that's a perfect segue to something I wanted to ask you, João.
So can you tell us a little bit more about the kernel module signing approach we have?
Also, the way we manage keys at Cloudflare?
Yeah, yeah, absolutely.
So, as Ignat just mentioned, the most viable strategy to deal with attacks like the one we just showed is to ensure we maintain the secure boot chain throughout kernel module loading and execution.
And having those modules signed and the signing enforced by the kernel is the best strategy to do that.
So at Cloudflare, the strategy we chose, especially with regards to key management, is, as mentioned in the blog post, that the best way to deal with managing keys is not having to manage keys at all.
So the kernel build process itself provides ways to generate keys on the fly.
So the idea is you generate a key pair to sign the kernel modules with, you compile the modules just like we showed for the attack, and you sign those modules with that key pair, and the kernel is built to trust modules signed with that key pair.
So that specific kernel build will trust all the modules signed with the ephemeral key that was generated as part of the build process.
And once we are done building the kernel and building all the modules, we just discard this key and don't have to worry about the key being leaked or the key being lost.
So if we look at it, there's definitely the advantage of not having to manage that key, with the relatively small downside of not allowing us to compile modules anymore once the kernel build is finished.
So once we finish the build of one of our kernels, it's locked and we can't build any more kernel modules for it.
And in our case, that's not so much of a problem, because we build kernels pretty frequently and release them with a much higher frequency than we need to deal with kernel module updates.
But the aspect that brings a little bit more complexity is when we are dealing with what's called out-of-tree modules, so kernel modules that are not provided by the main Linux source code repo.
And in that case, we have to find a way to bring those source code repos into our Linux build process and build those kernel modules and sign them with the same key.
But other than that, it's a pretty low-effort strategy to deal with kernel module signing and doing this in a secure way without having to worry too much about the process.
Yeah, that's good. While João was talking, I just rebooted my VM into the Cloudflare kernel to show a little bit of how it's different on the Cloudflare kernel.
So now we're on the Cloudflare kernel, one of our latest production kernels, and we can actually just try and load this malicious module again.
So I'm just taking it directly from the history. But now we'll see that the Cloudflare kernel will not allow this module to be loaded, because this error means that we weren't able to verify the signature on the module, and therefore the module will not be loaded.
And we can also confirm it here.
So here we have a kernel log message that loading of unsigned module is rejected.
So you will get also the logging out of this, right?
So if something was attempting to...
Okay, then you will also get some visibility on that activity. Yeah, João, you were saying something.
Yeah, I wanted to shift gears a little bit.
We looked at one attack primitive just now. The idea was to use a kernel module to override the SELinux protections.
But as Ignat was mentioning before, the Linux kernel is a vast piece of software that offers numerous features, and it's a massive platform for us to build our stuff on top of.
So is there something else we could leverage to carry out a similar attack to this?
And another example that we talked about in our blog post is a facility provided by the Linux kernel, which is called kexec.
So the idea with kexec is that you can use a running kernel to prepare and jump into a different kernel.
So you basically use your kernel to prepare another one and boot into that kernel.
Switch kernels in runtime.
Yes. So that is a feature that has long been in the Linux ecosystem, but its security stance has changed over time.
So the initial version of this functionality allowed you to just prepare a set of buffers in user space.
So you open up the kernel image, you prepare the artifacts to boot, and then you tell the kernel, just boot into this thing.
And the running kernel would trust it blindly and not perform any validation on this.
So if we apply a similar reasoning, as we applied to the attack we just showed, we could use the new kernel to do the same thing, like disable security functionalities.
So the approach we've used on our Cloudflare kernels is to disable...
So there are two flavors of this kexec functionality.
One of them, which is more recent and also more secure, is called kexec file.
And rather than receiving a bunch of memory buffers with the prepared kernel, it receives files.
So it receives a kernel file, it receives one or more initrd files, and it receives a kernel command line.
And in this version, it allows you to perform signature checks, just like Secure Boot did for the initial kernel.
And that's how we have our Cloudflare kernel configured.
So we don't allow the older kexec with no security validation, and we do enable the newer version and enforce that only kernels that have undergone the signature checks and passed them can be executed through kexec.
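The two policies discussed in this segment, mandatory module signatures and signature-checked kexec, map onto build-time kernel options. The sketch below audits a kernel config file for them. The option names are upstream Kconfig symbols (`CONFIG_KEXEC_SIG` is the name on recent kernels), but treating exactly this set as the required one is an assumption made for illustration, not a copy of Cloudflare's configuration; `parse_config` and `audit` are hypothetical helpers.

```python
# Assumed policy: options that should be built in ("y").
REQUIRED = {
    "CONFIG_MODULE_SIG": "y",        # module signature support
    "CONFIG_MODULE_SIG_FORCE": "y",  # refuse unsigned modules
    "CONFIG_KEXEC_FILE": "y",        # file-based kexec
    "CONFIG_KEXEC_SIG": "y",         # verify the kexec'd kernel signature
    "CONFIG_KEXEC_SIG_FORCE": "y",   # refuse kernels without a valid one
}
# Assumed policy: the older buffer-based kexec should not be built at all.
MUST_BE_OFF = ["CONFIG_KEXEC"]


def parse_config(text: str) -> dict:
    """Parse KEY=value lines; '# CONFIG_X is not set' lines are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            options[key] = value
    return options


def audit(text: str) -> list:
    """Return a list of human-readable problems; empty means compliant."""
    options = parse_config(text)
    problems = [
        f"{key} should be {value}"
        for key, value in REQUIRED.items()
        if options.get(key) != value
    ]
    problems += [f"{key} should be unset" for key in MUST_BE_OFF if key in options]
    return problems
```

On many distributions the running kernel's configuration can be read from a file under `/boot`, so `audit(open(path).read())` gives a quick report for a path of your choosing.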
So it's also worth briefly mentioning why we still enable this feature in our infrastructure.
So we reboot quite often. So why do we need to have a kernel booting another kernel rather than rebooting the whole machine altogether?
And the main use case for this is that even though the Linux kernel is very good and generally works well, sometimes it crashes for some reason.
And it's important for us to be able to produce what's called a crash dump so that we can analyze that crash and see what went wrong.
So that's why we still have kexec enabled.
It's the feature that is used by the crash dump utilities to boot a crash kernel and extract a memory image of the running kernel for analysis purposes.
Just to summarize, there may be many avenues and different ways in which you can try to attack the kernel, but the fundamental concept is simple and it always goes back to the secure boot chain.
You need to build a system in such a way that before you execute new code, you ensure that code is what you want. You can do that by signing it, and the thing that executes that code needs to verify that signature.
Without that, you can't be sure of anything because that new code might be malicious.
So if you employ secure boot functionality, always sign your code, always verify the signature, and don't allow unverified code to execute, you're in a much better spot for the protection of your system than on a regular system where everything is allowed.
Fantastic. Thanks a lot for that final summary.
That was very useful. And thanks both Ignat and João for joining me today for this segment.
I had a lot of fun and learned a lot, and I hope our audience did too.
So if anyone is interested to learn more, you can go to the Cloudflare blog and read the details in the blog we just launched today.
And with that, thanks again and I'll see you for the next Cloudflare TV.
Thanks, everyone. Bye.