Internet Application Security Report

Presented by: Catherine Newcomb, Daniele Molteni

Originally aired on August 8, 2024 @ 3:00 PM - 3:30 PM EDT

Over the last twelve months, the Internet security landscape has changed dramatically. Geopolitical uncertainty, coupled with an active 2024 voting season in many countries across the world, has led to a substantial increase in malicious traffic activity across the Internet. In this report, we take a look at Cloudflare’s perspective on Internet application security.

This report is the fourth edition of our Application Security Report and is an official update to our Q2 2023 report. New in this report is a section focused on client-side security within the context of web applications.

Transcript (Beta)

All right. Hello, everyone, and welcome. I'm excited to talk to you all today about our 2024 Application Security Trends Report. My name is Catherine Newcomb, and I'm a Product Marketing Manager for Application Security here at Cloudflare, and I'm joined by Daniele Molteni. Can you tell us a little bit about what you do, Daniele?

Sure. Hello, everybody. My name is Daniele. I'm a Product Manager in our WAF product group, and I lead our web application firewall, and I'm based out of the London office in the United Kingdom.

Thank you so much. So let's dive right into our Application Security Trends Report. You can see the report here on the screen. You have a QR code, which will take you directly to that report, and we'll have this link at the end as well, so you can follow along if you'd like. Over the last 12 months, the analysis period for this report, the Internet security landscape has changed drastically. There's been a lot of geopolitical uncertainty, as well as a large 2024 voting season, both in the US and globally, and there has been a substantial increase in malicious traffic across the Internet. So in this report, we'll take a look at Cloudflare's view of cybersecurity and go through some of the high-level details.

Looking at some of the key findings for our report, the first finding is that DDoS attacks on applications remain the most common HTTP attack type at 37.1% of all mitigated traffic. Additionally, we saw that CVEs, including zero-day vulnerabilities, were exploited faster than ever. In one example, we saw a CVE exploited just 22 minutes after a proof of concept for that CVE was posted online. Additionally, and this is new research for Cloudflare this year, we focused on third-party code and the software supply chain risk that many websites are dealing with.
For example, we found that enterprises have on average 47.1 third-party scripts on their websites, and those sites make an average of 49.6 outbound connections. Additionally, a lot of organizations use cookies. We found that an average of 11.5 HTTP cookies were used on websites. Additionally, we did some research around bots in this report, and we found that a third of all traffic, and this number has stayed quite consistent over the last few years, is associated with bots. Of those bots, 93% are potentially malicious and unverified, and we'll go into a little bit more about what unverified bots means later. Additionally, we did some research on API security as part of this report, and we found that organizations are overwhelmingly using outdated and non-recommended approaches to API security. We'll get into some more details on what that means later.

The first section we'll dive into is application layer DDoS threats. I'm going to turn it over to Daniele a little bit. Talking about DDoS attacks, let's start at a very high level: why do attackers DDoS? What is the goal of these attacks?

Yeah, sure. A DDoS attack is essentially an attempt to overwhelm a server and take it offline. It's done by sending a flood of requests to that server, that application. And this can happen anywhere in the world. One of the reasons these attacks are carried out is, of course, to take a specific server offline and exploit that, for example, to ask for a ransom and get money out of it, or for political reasons. This is something we've seen happening more. As you mentioned, there has been geopolitical instability, and more reasons from all around the world to try to target specific institutions. An example is March 2024, when we saw a spike in DDoS attacks directed towards Sweden after its acceptance into the NATO alliance.
And we saw the same spike when Finland joined back in 2023. So there's really some motivation that comes from the political world or, of course, from a financial perspective. And if you think about one of the organizations like Fancy Bear, which is a malicious actor, what they use DDoS for is to gain a financial advantage over the target and try to extort money.

That's really interesting. So I'm curious, how have DDoS attacks changed in recent years, maybe looking at volume or how attackers are leveraging botnets?

Yeah, sure. So again, we've seen a massive increase this year, around a 93% year-on-year increase in DDoS attacks. One of the reasons we've seen so many more attacks is that DDoS is becoming a product. If we look at the cost of carrying out an attack, it's extremely low. In 2023, you could spend $10 to carry out a DDoS attack that lasts one hour, or around $35 to $170 for a full-day attack. And the cyber groups that offer DDoS attacks also promote bundles where you can save if you buy the possibility to run more attacks. So what we're really observing is that DDoS is being productized, and there are a lot of customers that are actually willing to use those services. And the way the attacks are carried out is by leveraging large botnets. If you think, for example, of the Mirai botnet, it's a pretty infamous network, which is composed of infected IoT devices. The IoT devices have been infected with malware, and then they can be used by those malicious actors to carry out attacks that a customer pays for.

Yeah, that's interesting. So looking at this chart that we have on this screen, we see a pretty big increase in DDoS attacks at the HTTP layer around the end of August. I know that corresponded with the HTTP/2 Rapid Reset attack and vulnerability. So can you tell us a little bit about that DDoS attack and vulnerability and why it was unique?

Yeah, so that's a very interesting attack.
We basically realized there was an ongoing attack at around 200 million requests per second, which was around three times bigger than anything we'd seen previously. And we also realized it was coming from a relatively small number of machines: around 20,000 machines were launching this highly volumetric attack. Usually you have hundreds of thousands of devices that send requests at the same time. So what we found out is that this type of attack was leveraging a bug, or if you want, a vulnerability in the HTTP/2 protocol. The HTTP/2 protocol allows you to send multiple requests within the same TCP connection. This is the basis of multiplexing and concurrency. And so what the attacker was doing was sending a flood of requests over just a few TCP connections. Of course, this surprised us, but we managed to deploy a fix to our DDoS system very rapidly. So our customers, anyone behind the Cloudflare network, started to be protected from that type of attack, and they didn't see any disruption to their operations.

Yeah, so that leads us very well into how we can mitigate these DDoS attacks. What sort of resources and tools do organizations have at their disposal to address this large threat?

Yeah, I think the recommendation is to look for a DDoS provider that allows you to absorb this huge amount of requests and these peaks of attack, and leave your server, your infrastructure, intact. Usually the providers of protection protect against the attack by adding extra capacity to their network. So for example, at Cloudflare we have such a big network around the globe that it doesn't matter how big the attack is; we have enough spare capacity that we basically are never affected. And so we can filter all the traffic before it reaches any server. And one thing I think to keep an eye on is how it's being billed and how it's being charged for.
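As a rough illustration of why a few connections could generate so much load: in the publicly documented Rapid Reset technique (background that goes beyond what is said in this segment), a client opens many HTTP/2 streams on one TCP connection and cancels each one almost immediately. The sketch below flags a connection whose ratio of cancelled to opened streams is unusually high. The class, counters, and thresholds are all hypothetical and chosen for illustration; this is not a description of how Cloudflare's mitigation works.

```python
from dataclasses import dataclass


@dataclass
class ConnectionStats:
    """Hypothetical per-TCP-connection counters for one observation window."""
    streams_opened: int = 0  # HEADERS frames that opened a new stream
    streams_reset: int = 0   # RST_STREAM frames sent by the client


# Illustrative thresholds only; real systems would tune these from traffic.
MIN_STREAMS = 100       # too little traffic below this to judge the connection
MAX_RESET_RATIO = 0.5   # flag when more than half of the streams were cancelled


def looks_like_rapid_reset(stats: ConnectionStats) -> bool:
    """Return True if the connection cancels an unusually large share of streams."""
    if stats.streams_opened < MIN_STREAMS:
        return False
    return stats.streams_reset / stats.streams_opened > MAX_RESET_RATIO


normal = ConnectionStats(streams_opened=200, streams_reset=3)
abusive = ConnectionStats(streams_opened=5000, streams_reset=4990)
print(looks_like_rapid_reset(normal), looks_like_rapid_reset(abusive))
```

A per-connection ratio is only one possible signal; it is used here because the attack concentrates its requests on very few connections.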
Some companies meter how many requests they've received from the attack, while other providers offer unmetered, unlimited DDoS protection. And of course, that will give you peace of mind.

That's really interesting. So moving on to zero-day vulnerabilities and CVEs, what trends are we seeing in exploitation of zero-day vulnerabilities?

Yeah, so again, this is a classic type of attack in application security. A CVE, which stands for Common Vulnerabilities and Exposures, is essentially a bug that is present in one of the pieces of software that run in the stack of our web server. And because our web server runs a lot of different components and libraries, potentially there are a lot of different vulnerabilities that an attacker can exploit to extract data, for example, or to take over an account and basically carry out attacks and manipulate the server. A zero-day is by definition when you have never heard about the vulnerability before, and then all of a sudden an attacker targets it by sending a malicious payload and exploiting it, for example by extracting data. What we've seen in recent times is that there's been an uptick in the attacks that target those vulnerabilities, but we've also seen a lot more CVEs, so bugs, discovered in 2023. More than 29,000 bugs were identified and we had 100 zero-days in 2023. So 100 times, a new malicious payload was being leveraged. And this is an increase of about 50% with respect to the previous year. And if you look at the activity in traffic, we've seen a lot of activity in the scanning... this one will probably need to be edited. So let me start again. So yeah, what we're seeing is an increase in the number of CVEs being disclosed, more than 29,000 in 2023.
Again, those are bugs that affect the software running on servers, but we've also seen an uptick in the number of zero-days, so in the number of times attackers are trying to leverage those CVEs: around 100 times in 2023, a hundred zero-days, which is a 50% increase from the previous year. And if we look generally at the total traffic on our network, we've seen a trend in vulnerability scanners. Those are automated systems that try to penetrate systems by leveraging old and new CVEs. What's interesting is that we still see old CVEs being used and attackers trying to exploit them. I think this points to the fact that attackers try to find the least protected servers and break into them, because it's easier than trying to target ones that are very well protected. And I think the learning from this is that having a WAF, having a layer of protection against CVEs and zero-days, even if it's a very simple and basic one, already raises the bar and the difficulty for attackers to break into your application. So it makes you much more resilient.

Yeah. So let's dive in a little bit more to that. What sort of tools is Cloudflare developing specifically to respond faster to these zero-days? Obviously, we saw this example of a vulnerability that was exploited in just 22 minutes. So effectively, organizations need to be able to defend within zero seconds or zero minutes of a vulnerability being disclosed. What sort of tools does Cloudflare have that can help with that?

Yeah, you brought up a great point. It's all about timing here. When a new zero-day or new CVE is made public, a race starts where you have attackers that try to exploit it. And then you have the security world, where new rules and new protections are being put in place and deployed.
And of course, the quicker the attacker can create a malicious payload, send it, and exploit the vulnerability, the more likely it is that they are going to get through the system while no protection is in place yet. So the key here is to have a system where a new rule, a new protection, is deployed before a malicious attacker targets your specific organization. Ideally, you have the protection by the time a new vulnerability is made public. For example, here you see on the right that we have a CVE related to TeamCity authentication. We saw the exploitation of the vulnerability 22 minutes after it was public. So technically, we only had 22 minutes to create a rule and push it to all our customers. This, of course, is a challenge, because rules in a WAF were traditionally created, or are created, by humans, who take time to develop and craft the rule. That approach doesn't scale with the threat we see today. So what we're doing is actually using machine learning, AI, to identify attacks and mutations of attacks before they are public, before they're used and leveraged by threat actors. For example, at Cloudflare we have a product called Attack Score that is basically a machine learning model trained on past attack data, and that allows you to identify threats before they're well known. And we've seen it work over the last year and be pretty effective on those types of attacks. I could go on and talk about this for a long time, but Catherine, I think there are other parts of the report that are very interesting. So I'd like to hear a little bit more about the supply chain side of things: client-side security, which is, if you want, the other side of security when it comes to applications. The WAF protects the server, but then of course you have client-side security for the browser. Can you tell us a little bit more about that?

Sure.
So this year we did a lot of research around client-side security and the supply chain risk that is present in web apps. I went over some of these findings briefly at the beginning, and some of them are pretty concerning. We're seeing that organizations have a lot of supply chain risk. And there's something really interesting about client-side security versus server-side security. As modern application development has evolved, where these third-party components are loaded has changed, right? These third-party components used to be loaded on the server side, so any compromise of them would expose the organization itself to risk on the server. But now, due to operational efficiency and whatnot, these components are loaded in the user's browser. And that means that the users themselves are being exposed to supply chain risk. So it's very important that you have a good sense of what third-party components you have on your website and what risks they pose if they're compromised by an attacker, often for compliance reasons, right? Because if a user comes to your website and is compromised by a third-party component on your website, even though you don't have direct ownership or control over that component, you could be responsible for any damages that they suffer, for example, credit card theft.

Yeah, sure. And what about industry? Did you see any pattern in which industries are most exposed when it comes to supply chain attacks?

Yes, that's a great question. You can see on this image that the median number of third-party components on websites is much lower than the average number of third-party components. And that is really due to SaaS providers on the Cloudflare network.
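To illustrate why a median can sit far below an average, here is a minimal sketch with made-up numbers (not data from the report): a couple of sites with very large script counts, standing in for SaaS platforms with many subdomains, pull the mean up while the median barely moves.

```python
from statistics import mean, median

# Hypothetical third-party script counts per site; the two large values
# stand in for platforms that aggregate many subdomains.
scripts_per_site = [2, 4, 5, 7, 9, 11, 14, 18, 120, 210]

avg = mean(scripts_per_site)
mid = median(scripts_per_site)
print(f"mean={avg:.1f} median={mid:.1f}")  # mean=40.0 median=10.0
```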
SaaS providers are way more likely to be exposed to third-party risk than other organizations due to the number of subdomains that they have within their organization. Each of those subdomains could have 10 or 20 third-party components. Maybe they're bringing in a chatbot. Maybe they're bringing in a credit card payment processing system, such as Stripe. So all of these subdomains have many of these third-party components. And the SaaS provider has to help all of these subdomains and is responsible for making sure that the supply chain risk is mitigated and that end users of the SaaS platform are not negatively affected by the downstream risks of a compromise.

That makes a lot of sense. And when we talk about client-side, if you want, the browser, there is always a conversation around privacy, right? The user is concerned about the information they input in the browser: where does it go, and who gets access to this type of data? Can you tell us a little bit about how client-side security relates to privacy in this case?

Yes. Privacy is really important when it comes to the software supply chain. Let's take the example of a third-party component that processes credit card payments. This is really useful for web developers because they don't have to build a payment card processing system from scratch themselves. They can just use a third-party component to process those credit card payments and perform transactions on their websites, for example to sell goods on an e-commerce site. If you think of that third-party script that's used to process credit card payments, more than likely it's used by many, many websites. Many folks are probably using it to process credit card payments on their websites.
If an attacker is able to compromise that script, for example by inserting a data exfiltration call to something like a cloud bucket, everybody who inputs their credit card information into that script will potentially have their data exfiltrated and their privacy compromised in that way. A lot of these third-party components are used to process sensitive information. So it's really critical that web developers have control and visibility over these third-party components so that the privacy and sensitive data of their end users are appropriately protected.

Great, yeah, and what about regulation? Do you think this is affecting regulation and compliance and standards? Where is that going?

Yeah, that's a great question. When it comes to third-party supply chain risk, there are many regulations that touch on this. For example, GDPR requires that you have visibility and control over any third-party vendors that you share information with, and additionally that you're responsible for how these third-party vendors process your data. That's just one example, but in recent years, and in an upcoming regulation called PCI DSS, we've actually seen client-side requirements specifically called out, beyond just third-party vendor ecosystem language. So let me talk a little bit about PCI DSS. PCI DSS stands for the Payment Card Industry Data Security Standard, and we are now on PCI v4, which is going to be required in the March 2025 timeframe, and this regulation impacts everybody who processes credit card payments.
If you process any credit card or debit card payments with any of the major credit card providers and you don't follow PCI DSS standards, you risk not being able to process credit cards, which could be massively damaging to your business because you wouldn't be able to process any transactions, right? So it's not a government regulation; rather, the major credit card companies are regulating folks who process their credit card payments. That's how PCI DSS works. For the first time, this new version of PCI DSS, which is version 4, actually takes client-side security into account when it comes to end-user privacy and security, right, and credit card data integrity. In PCI DSS in particular, there are two requirements that are related to client-side security: the first is 6.4.3, and the second is 11.6.1. These two requirements make sure that organizations are ensuring the integrity of each script and confirming that the scripts are authorized and allowed on the website. They require an inventory of all the scripts, and additionally they require alerts when anything changes in a script, especially unauthorized modification of these third-party components. These regulations also make sure that organizations check this quite regularly. So this is very important looking forward to PCI 4, which, as I said, is going to be mandated in March of 2025. This is a new requirement that many organizations are scrambling to address for the first time. Looking at how Cloudflare specifically can help you address this, we have a product called Page Shield, which can really help with this client-side component of third-party script risk.
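The three ideas named for requirements 6.4.3 and 11.6.1 (an inventory of scripts, confirmation that each script is authorized, and an alert when a script changes) can be sketched as a simple hash comparison. This is a minimal illustration under stated assumptions, not a compliance implementation and not how Page Shield works: the function names, the URLs, and the choice of SHA-256 are all hypothetical.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of a script body as a hex string."""
    return hashlib.sha256(content).hexdigest()


def audit_scripts(approved: dict[str, str], observed: dict[str, bytes]) -> list[str]:
    """Compare observed scripts with the approved inventory and return alerts.

    approved maps script URL -> expected SHA-256 hex digest (the inventory).
    observed maps script URL -> script body as currently served on the page.
    """
    alerts = []
    for url, body in sorted(observed.items()):
        if url not in approved:
            alerts.append(f"unauthorized script: {url}")
        elif sha256_hex(body) != approved[url]:
            alerts.append(f"script changed: {url}")
    return alerts


# Hypothetical inventory with one approved payment script.
inventory = {"https://cdn.example.com/pay.js": sha256_hex(b"console.log('v1');")}

# Hypothetical page scan: the approved script was modified and a new one appeared.
scan = {
    "https://cdn.example.com/pay.js": b"console.log('v1'); sendCardData();",
    "https://tracker.example.net/t.js": b"/* unknown */",
}

for alert in audit_scripts(inventory, scan):
    print(alert)
```

Running a check like this on a schedule is one way to approximate the "check this quite regularly" part; how often is a policy decision outside the scope of the sketch.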
We help you build an inventory of third-party scripts and components, and we additionally scan constantly for changes to these third-party components. Then we take each script, download it into an ML classifier on our edge, and check it for any malicious activity like data exfiltration calls, malware, etc., basically making sure that the clients are protected from attacks like that. Page Shield specifically can help with the PCI 4 client-side requirements, so that's one way that our customers are addressing the additional requirements for PCI 4.

So moving on to API security. As we've secured APIs, and as our customers are working to secure APIs, we've noticed pretty interesting trends in API security and how our customers are securing APIs. But first, I'd like to talk a little bit about API security generally. Our report indicates that 58% of all Internet traffic is API-related. When it comes to security, how do organizations have to handle APIs differently than traditional web traffic?

Yeah, sure. I think what's interesting about APIs is that, although they power a lot of the same applications we see in a web browser, it's still web traffic. But while web traffic and pages change all the time, so a flexible solution like a WAF is the best solution, when it comes to APIs the contract of how those requests and responses work is very well defined. So security there can go one step further and actually validate every request and every response. And what we are seeing here, of course, is a trend in the number of APIs and the volume of API calls. Now 58% of all dynamic traffic we see on our network is API-related. So there is really an opportunity here to have products that are tailored specifically for this type of traffic. And when it comes to, if you want, problems or challenges for APIs, one of them is the visibility component.
A lot of the organizations we talk to have many teams contributing to the same code, to the same applications. And so a lot of the APIs that are open to the Internet are not well documented, or in some organizations, they don't even know those APIs are actually up and running. And that, of course, poses a security challenge. For example, if we look at one of our products, which is called API Shield, we have a solution to automatically scan traffic and discover the APIs and new endpoints. And we saw that about 30% of APIs are actually new. We found out that there are about 30% more APIs than originally thought. And this tells us that a lot of the APIs are what we call shadow APIs, which are essentially APIs that are not documented and can be exploited by malicious actors.

Yeah, that's really interesting, especially when it comes to shadow APIs and developers and other folks creating APIs without the security team being involved. So in a similar vein, what types of attacks and threats affect APIs?

Yeah, so a lot of the attacks are the same ones that we see for traditional web traffic. If we think about the zero-days and the CVEs we discussed before, those are still valid for API traffic. But then, of course, there is a new group of attacks, like the OWASP Top 10 for APIs, which involves authentication and authorization, and which also involves data exfiltration and abuse like credential stuffing attacks, but also account takeover that can be run through automation of those API calls. And to prevent and counteract those types of attacks, we see the adoption of more of a positive security model as opposed to a negative security model, which is what is traditionally used in a WAF. A negative security model is where you define what you want to block. So for example, you say I want to block all traffic coming from a specific country, let's say France.
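The contrast between the two models can be sketched in a few lines. In the negative model, everything is allowed unless a rule blocks it (the country example above); in the positive model, only requests that match an explicit description of good traffic are allowed. The request shape, the endpoint list, and the function names below are hypothetical, chosen only to make the difference visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    method: str
    path: str
    country: str  # two-letter country code of the client, e.g. "FR"


# Negative security model: list what to block, allow everything else.
BLOCKED_COUNTRIES = {"FR"}


def negative_model_allows(req: Request) -> bool:
    return req.country not in BLOCKED_COUNTRIES


# Positive security model: list what good traffic looks like, block everything else.
# Hypothetical documented endpoints, as they might appear in an API schema.
ALLOWED_ENDPOINTS = {("GET", "/api/orders"), ("POST", "/api/orders")}


def positive_model_allows(req: Request) -> bool:
    return (req.method, req.path) in ALLOWED_ENDPOINTS


# An undocumented ("shadow") endpoint called from a country that is not blocked.
shadow_call = Request(method="DELETE", path="/api/internal/users", country="GB")
print(negative_model_allows(shadow_call))  # True: no block rule matches
print(positive_model_allows(shadow_call))  # False: not in the documented set
```

The shadow endpoint passes the negative model simply because nobody wrote a rule for it, which is why the positive model pairs naturally with API discovery and schemas.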
A positive security model is when you define what good traffic looks like, and then you only allow that. And we see more and more customers and users relying on this type of protection to prevent those types of attacks.

So that's a great segue into talking about best practices for protecting APIs. What solutions can organizations deploy to protect against some of these API-specific threats?

Yeah, absolutely. I think, again, using discovery to identify your attack surface area is the first step that anyone should take. Make sure that you know what you have exposed and where the attacks can come from. Then if you have things like schemas for your APIs, you can use a schema validation solution, so positive security models again. Make sure you have a solid rate limiting posture and policy in place for the very sensitive endpoints that you have exposed. And we have a few more solutions that we recommend putting in place. By the way, we do have a report we released this year in January, which is about API security. So if you're interested in learning more about specific API trends and about how you can protect APIs, make sure you check it out and read that report as well.

Great. So let's move on to the next topic, which is bots and automated traffic. This, again, is a very interesting topic, which is top of mind for everyone involved in application security. Catherine, tell us a little bit more about what automated traffic is and whether all automated traffic is bad.

Yeah, so that's a great question. A bot is basically a piece of code that automates an attack or an action on a website, for example. Not all automated traffic is bad. And this is really crucial when it comes to defending against bots, because we don't want to block all bots. For example, a script can be used to crawl websites, to index them for search engines.
For example, a bot could be looking for key terms to make sure a website is accurately indexed in Google. Bots can also do things like scraping content, which can be negative when it's done for competitive purposes. But in some cases, it can be really valuable. For example, if you're a researcher, it can be very valuable to mass scrape content off of certain websites. We saw this year that 31.2% of all application layer traffic comes from bots. This number has stayed pretty consistent at around a third of all traffic over the last few years. But about three years ago, it was about 29%, and we've seen that percentage steadily rise to what it is today. So it's very crucial to be able to defend against bots due to how pervasive they are.

Yeah. And what kind of risks do bots pose to websites? Why are they dangerous?

Yeah, that's a great question. As we mentioned, some bots are good and we want to be able to allow them onto our website. But many bots are being created by attackers or cybercrime groups to carry out actions like the DDoS attacks that we talked about earlier, right? Botnets, for example, are launching these hyper-volumetric attacks like HTTP/2 Rapid Reset. Additionally, bots can be used in a smaller-scale capacity to do things like credential stuffing, which is where you take leaked credentials from the dark web and try username and password combinations to log into accounts, for example banking accounts or e-commerce accounts, with the goal of taking an account over, stealing information or money, etc. So that's another pretty big example of what a bot can do. And just to give one more example that's quite common in the industry: inventory hoarding. A lot of e-commerce companies, for example, see this very heavily on the launch day of new products.
So, for example, if you're a gamer and you tried to get a PS5, a PlayStation 5, a few years ago, many folks found that they weren't able to get a PlayStation right on launch because bots, using an automated script, bought up all the inventory right away and resold it at a higher price.

Yeah, that's a great example. And what about industries? Do you see any pattern in which industries are affected more by this problem?

Yeah, great question. This is something that we added new to this report this year. Unsurprisingly, because I mentioned that inventory hoarding problem that affects the consumer goods and e-commerce industry, manufacturing and consumer goods is the top industry affected by bot attacks. This chart here shows the percent share of traffic to their websites and domains that is bot related. As we saw in the last slide, 31.2% is the average across the whole Internet. So manufacturing and the rest of this top 10 here are all massively above that average of about a third, right? For consumer goods, over two thirds of their traffic is bots. Additionally, in this top 10 list, we see a lot of critical infrastructure industries like energy, U.S. federal government, IT and security organizations, and pharmaceuticals. And then we see other industries like music, for example. And this is not particularly surprising either. A couple of years ago, there was a pretty large ticket purchasing fiasco with Ticketmaster and Taylor Swift's Eras Tour, where folks weren't able to buy tickets at launch, and bots and people together crashed the website. So music is definitely dealing with that inventory hoarding problem when it pertains to ticket sales.

Yeah.
And when it comes to solutions, what can anyone in those groups especially, but I guess anyone with an Internet application, do to manage the bot problem?

Yeah, so this is a great question. If you are in any of these industries and you don't have a bot mitigation solution that's specifically tailored towards identifying bad bots, definitely look into how you can proactively protect yourself in that way. When evaluating a bot mitigation vendor, it's very crucial to look for one that can identify good and bad bots with a low false positive rate, right? Because you want to be able to let those good bots through; as I mentioned, they can do things like make sure your website is appropriately indexed and showing up on search engines so users can find it. If you have a false positive there, you could potentially risk being unindexed or ranked lower on search engines. Additionally, make sure you have a well-trained ML model that can identify bot attacks. This is really important because traditional rules, like regular expressions, are not always the best at identifying bots. They're not necessarily going to identify bots and bot behavior with as high accuracy as machine learning rules. Additionally, when you're looking at a bot mitigation solution, we recommend a CAPTCHA-less solution. CAPTCHAs are those "click all the traffic lights, click all the sidewalks" types of puzzles that you'll see on many websites to prove that you're a human. At Cloudflare, we have a product called Turnstile, which runs a series of challenges in the web visitor's browser that don't require interaction. And this is really valuable because we found that a lot of users really dislike having to complete CAPTCHA puzzles, and the puzzles also pose things like accessibility concerns and globalization concerns.
For example, folks who are colorblind might have more difficulty completing CAPTCHAs, as might folks who are blind. Additionally, a lot of them have been found to be quite US-centric. For example, a sidewalk doesn't look the same in the US as it does in India or the UK. So it's very important to make sure that these websites are accessible for folks of all abilities and throughout the entire globe. Having a bot mitigation solution that can check if a user is human and that just runs in the background without requiring any user interaction is much preferable and has much less friction for end users.

So, I think we've reached the end of this discussion about the Application Security Trends Report. There are many more findings, details, and recommendations in the report itself. If you're interested, copy that link or scan the QR code for this report and you can read our full findings. We definitely recommend checking that out, as well as the API Security and Management Trends Report that Daniele mentioned, which was published in January, if you're interested in API-specific findings. So, thank you so much for tuning in today and we appreciate your time. Take care.