🔒 Security Week Product Discussion: Introducing Advanced Rate Limiting
Presented by: Daniele Molteni, Richard Boulton
Originally aired on October 2, 2022 @ 5:00 PM - 5:30 PM EDT
Join Cloudflare's Product Management team to learn more about the products announced today during Security Week.
Read the blog posts:
- Introducing Advanced Rate Limiting
- Announcing Friendly Bots
- Envoy Media: using Cloudflare's Bot Management & ML
- Announcing the Cloudflare API Gateway
Tune in daily for more Security Week at Cloudflare!
Transcript (Beta)
Hello, everybody. My name is Daniele Molteni. I'm the PM for rate limiting and WAF here at Cloudflare.
And with me today is Richard.
I'm Richard Boulton.
I'm currently the engineering manager for the firewall at Cloudflare.
So we are both based in London, and we've been working a lot over the last couple of weeks on new products to launch during Security Week.
As I'm sure you're aware, this is Security Week.
It's a few days where we launch new products focused purely on security.
And one of the big announcements of these days, at least as far as the WAF is concerned, is the new Advanced Rate Limiting.
We think it's a great tool for protecting web application and API traffic, and it's basically a great defense against things like targeted DDoS attacks and brute-force attacks.
So think about someone trying to break into a login endpoint, for example, but also attacks that cause resource exhaustion on a server or an application.
In that case, attackers send a huge number of requests to, for example, a specific endpoint, and try to bring down the application.
But of course, rate limiting is not a new product, right?
Cloudflare has had a rate limiting product for a long time, and Richard has been here at Cloudflare for a few years.
So why don't you tell us a little bit more about the history, when we started off with rate limiting, and how we came to have this product?
Sure.
I've been at Cloudflare for four years, but the rate limiting product is actually older than that.
I think we launched our first version in 2016, and it went to all customers in 2017.
There's an interesting difference between rate limiting and most of our other firewall features, and that is that there is state in the system.
Most of the firewall systems that we provide will look at the properties of a request, or maybe the properties of a response coming back from the origin, and decide whether to allow it or take some other action on it.
The rate limiting system works across many requests. The first version, the one launched to everyone in 2017, had quite a powerful engine behind the scenes but was limited in the way you could configure it.
Essentially, the main way it counts is by looking at how much traffic comes from each individual IP address.
And the main way you control whether it runs on a particular request is by looking at the URL.
And actually we have used that internally as well as externally.
There are various internal systems which apply rate limits that are slightly more flexible than that.
But in terms of how a customer could configure it, that's really the only way you could do it: URL and IP. It has a few extra features, such as counting differently depending on whether requests hit the cache, and counting based on response headers, which was added in 2018.
But it meets a specific set of use cases that existed in the world of 2016 a bit more than they do now, in particular around IP addresses.
How important an IP address is, is something that is changing all the time.
We're getting more and more usage of IPv6, so we have many more IPs able to send traffic.
We are also doing a lot of very interesting work on privacy, where we are trying not to see IPs and not to track people by IP too much.
So rate limits based on IP are a bit too limited for the use cases we have.
So Daniele, you know a lot more about what customers need here.
Could you talk about why you asked me to start working on a more powerful version?
Yes, definitely.
A couple of other use cases we came across connect to what you just said. For example, botnets now tend to send distributed attacks.
They use a lot of VPNs, and you get very low-volume traffic from each one of those IPs.
But of course, in aggregate, that traffic is attacking a specific endpoint.
And something else you hinted at is that we now have, for example, network address translation.
Think about a mobile provider or even a school: you have a single IP, but behind that IP you have hundreds or thousands of individual devices or users, and it becomes very hard to separate the traffic from each one of them.
This is one of the problems that comes with relying on IP: the IP is becoming something that is not really linked to a specific device or user.
I think the old idea of rate limiting was based on the equation that one IP equals one individual user, and that worked fine for all those years, basically until today.
When talking to customers more recently, what I realized is that a lot of traffic has other properties we could use to identify individual sessions or individual users, especially API traffic.
Some APIs are authenticated.
Perhaps they have a session ID in a header, or they might have a cookie, which is a better indicator of the session and of how many requests come from the same device or user.
And of course, there is a positive side on privacy as well.
Because we are no longer relying on IP, it's easier to forget the IP, if you like, and not have to make that connection.
And so this is where Richard and I started a year ago, thinking about how we could take rate limiting to the next level.
What can we build around rate limiting to avoid this problem with IP?
And there is another component I think we should mention, which is the scope of the rule.
You mentioned that traditionally it's more like a path, right?
So you're rate limiting based on, say, /login, a specific endpoint or a specific path.
Another thing I realized by talking to customers is that often the rule needs to be more granular.
So perhaps you want to create a rate limiting rule that only applies to certain geographical regions, because you expect a level of traffic there that is different from other regions.
Perhaps you want to limit bots in a different way.
You have verified bots, and you want to give them some room, but not too much.
And so yeah, there is a little bit more on that front as well, which I think can be used by customers.
So I think what we could do is just jump straight into the product. I'll show you what we launched, and then we can dive deeper into how we built it and what challenges we had to face.
So let me bring up my screen and I'll show you our new dashboard.
There are a couple of changes here, by the way.
It's not just rate limiting.
You might notice that we have a new navigation section called Security. Before, we called it Firewall.
Now we've changed it to Security because, of course, we are adding more products.
We wanted to make this more general under Security. We still have the Overview tab, where you can review the logs and the requests that have been blocked by the firewall.
And you can basically check traffic in general.
And then we have created this new WAF tab. By talking to customers, we realized that the WAF, in their mind, includes things like firewall rules,
which are the rules they can create themselves, as well as rate limiting rules and the managed rules.
The managed rules, if you want a more robust implementation, are the rules that have been created by Cloudflare and that we manage on behalf of customers.
So if we go to the WAF, we have this new tab, which is the rate limiting rules.
Here we are showing both: the old way, the previous version of rate limiting, which you'll see at the bottom, and also Advanced Rate Limiting, the new product we just launched today.
For the new product you have the same pattern, with the card and the list of rules, and you can create a new rate limiting rule.
Here you will see a very familiar view of the rule builder.
So at the top you will give a name to your rule.
And then the second bit here is the filter.
This is where you define which portion of traffic the rule applies to, and you get access to all the fields of the HTTP request.
This includes the IP address, the URI or the path, and the request method as well.
But you can also get access to dynamic fields like verified bots or bot score, if you are on the Bot Management plan.
And so you can choose exactly where you want to scope the rules.
So let's say, again, you want to create it for a login endpoint.
Perhaps you can also restrict it to a specific header that the request has, or even to the user agent.
Right.
Once you have defined what the rate limiting applies to, you can choose the action you want to take once a certain threshold is reached.
So let's say you want to block. You can customize the response, of course.
And then here are the other settings.
You'll be able to set the duration for how long you're going to block.
So let's say you're blocking for 10 minutes. And finally you define the threshold, the rate you want to limit to.
So let's say you want to allow only 10 requests over a period of 2 minutes.
If you set it this way, you will allow only 10 requests over 2 minutes,
and once that threshold is exceeded, requests are blocked for a period of 10 minutes.
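The rule configured in this demo (10 requests per 2 minutes, then a 10-minute block) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and parameter names are hypothetical, and the fixed counting window is a simplification, since the talk does not say how Cloudflare counts internally.

```python
class RateLimitRule:
    """Hypothetical model of the demo rule; not Cloudflare's implementation."""

    def __init__(self, requests_per_period=10, period_seconds=120,
                 mitigation_timeout_seconds=600):
        self.requests_per_period = requests_per_period
        self.period_seconds = period_seconds
        self.mitigation_timeout_seconds = mitigation_timeout_seconds
        self._window_start = {}   # client -> time the current window began
        self._count = {}          # client -> requests seen in that window
        self._blocked_until = {}  # client -> time the block expires

    def allow(self, client_ip, now):
        # While a block is active, reject everything from this client.
        if now < self._blocked_until.get(client_ip, 0.0):
            return False
        # Start a fresh window if none exists or the old one has expired.
        start = self._window_start.get(client_ip)
        if start is None or now - start >= self.period_seconds:
            self._window_start[client_ip] = now
            self._count[client_ip] = 0
        self._count[client_ip] += 1
        # Exceeding the threshold triggers the block for the full timeout.
        if self._count[client_ip] > self.requests_per_period:
            self._blocked_until[client_ip] = now + self.mitigation_timeout_seconds
            return False
        return True
```

With the defaults, the first ten calls to `allow` for one client inside two minutes pass, the eleventh is rejected, and that client stays rejected until ten minutes after the eleventh request.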
And that's really interesting, what you've done there, apart from the detailed filter.
We've got two things here: the scope, which is that filter at the top, and then what we do for a request which matches that scope.
And actually, what you've got there is what you could do with the old product.
And I think what you're about to talk about is the things you couldn't do with the old product.
With the old product you could count based on IP: have you got 10 requests in 2 minutes?
But what we've done now is start adding much more flexible counting.
And I've always found it quite difficult to think about.
There are quite a few concepts to keep in your head. The request comes in, and we decide whether to apply the rate limiting at all.
That's the rule builder at the top of the screen. And then, once it does match, what do I do with it?
And that's where the real new flexibility of the system is coming in now.
Absolutely.
And so let me expand on that point. So what's new here?
One of the big innovations is this section, "With the same". With the same IP is what the old rate limiting does.
It counts requests from each individual IP. But here you can expand this and pick and choose many more characteristics of the request.
So you could count on a specific header.
Let's say you have a session ID in a header, or you could count on a cookie.
And in this case, by the way, you don't specify the value of the session ID; you are basically bucketing requests with the same session ID, right?
So you're making sure that each session does not exceed this rate of 10 requests over 2 minutes, and you can do the same with a cookie.
You can also do that with a query parameter.
I can tell you a bit more about that later, but it's especially useful for e-commerce sites.
Of course, you have the IP, as in the old product, and then you have one very interesting feature that a few customers have asked us for,
and also the Bot Management team, which is the JA3 fingerprint.
So this is essentially a fingerprint that each bot has.
So this is a very specific way for us to track bots. By building the rule in this way, you are essentially tracking the number of requests that each individual bot makes, and we limit that once it exceeds the threshold.
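The idea of bucketing requests "with the same" characteristic, without naming a specific value, can be sketched as building a counting key from the request. The request layout, the `kind:name` characteristic strings, and the function name below are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def bucket_key(request, characteristics):
    """Build the counting key from the chosen request characteristics."""
    key = []
    for characteristic in characteristics:
        kind, _, name = characteristic.partition(":")
        if kind == "ip":
            key.append(request["ip"])
        elif kind == "header":
            key.append(request["headers"].get(name))
        elif kind == "cookie":
            key.append(request["cookies"].get(name))
        elif kind == "ja3":
            key.append(request["ja3"])
        else:
            raise ValueError(f"unknown characteristic: {characteristic}")
    return tuple(key)


# Three requests from one shared (NAT) address but two different sessions.
requests = [
    {"ip": "198.51.100.1", "headers": {"x-session-id": "alice"}, "cookies": {}, "ja3": "fp-a"},
    {"ip": "198.51.100.1", "headers": {"x-session-id": "alice"}, "cookies": {}, "ja3": "fp-a"},
    {"ip": "198.51.100.1", "headers": {"x-session-id": "bob"}, "cookies": {}, "ja3": "fp-b"},
]

# Counting by IP lumps both users together; counting by session separates them.
by_ip = Counter(bucket_key(r, ["ip"]) for r in requests)
by_session = Counter(bucket_key(r, ["header:x-session-id"]) for r in requests)
```

Each distinct key gets its own counter, so two users behind the same address no longer share one budget when the rule counts by session header instead of by IP.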
Is there anything to add, anything I might have missed?
I think, just to continue on that, what we've built now is not just exactly those features.
We've built a really powerful generic engine behind this.
So whenever we get new ways to think about bucketing requests, or maybe the JA3 fingerprint becomes less useful as the threat model evolves,
we can easily provide new characteristics, new pieces of information to group and bucket requests by, and we'll be able to do that very quickly.
As the world evolves, in a few years' time I'm sure there will be different options here to match different threat models.
But we've done a lot of engineering to build a powerful engine behind this, and we can adapt it very quickly.
That's a great way to put it.
And there is also a way here to expand and get even more control.
Right.
The counter section. Why don't you walk us through a little bit how this works?
So, if it's not complex enough to think about already, and we've done a lot of work with design to make it easy to work with, we've got two things at the moment.
We've got: how do I select which traffic I'm going to apply the rate limiting system to?
And then we've got: how am I going to group the requests I see into buckets, say by IP or by fingerprint?
But there's a third thing you can do, which is to separate the requests that we're going to block from the requests that increment our counters.
To think of a use case, imagine you're running a site and you're worried about credential stuffing.
So you have a login endpoint.
You want to make sure people aren't trying to log in to that endpoint multiple times each minute.
So maybe ten requests every 2 minutes is actually probably quite a generous threshold.
But something along those lines, you are saying people aren't manually going to hit that, but an automated bot trying out a list of thousands of passwords is going to hit that quite quickly.
So in that situation, you only want to count requests to that login endpoint. It might be perfectly reasonable that when someone opens the front page, it downloads maybe a hundred subresources,
so the limits for general browsing on the site need to be much higher.
So in that situation, what you can do is write a rule which will only count the traffic that hits, say, the login endpoint or a particular part of the site.
So you would write a custom expression here, maybe: the path equals or starts with the login endpoint.
Or it's a contains, or a matches with a regex, to select the part of the site you care about.
With the old system, you could have selected that login endpoint, but when someone hits that limit, they can still browse other parts of your site.
They can still cause issues elsewhere.
Maybe you've got multiple parts of the site where there can be logins.
Maybe you've got multiple other sensitive things.
What you may want to do instead is say: if they hit that limit, I know they're a bad visitor, someone I don't wish to allow to use my resources,
so I'm going to block them from everything. So you can apply the mitigation to a different expression than the counting expression, essentially.
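The split between a counting expression and a mitigation expression might look like the following sketch. The class name, the two predicate parameters, and the fixed-window counting are assumptions made for illustration, not the product's actual configuration model.

```python
class SplitRule:
    """Count one set of requests, but mitigate a (possibly wider) set."""

    def __init__(self, counts, mitigates, limit=10, period=120, timeout=600):
        self.counts = counts        # predicate: does this path increment the counter?
        self.mitigates = mitigates  # predicate: is this path blocked once over the limit?
        self.limit, self.period, self.timeout = limit, period, timeout
        self._window = {}           # ip -> (window_start, count)
        self._blocked_until = {}    # ip -> time the block expires

    def handle(self, ip, path, now):
        blocked = now < self._blocked_until.get(ip, 0.0)
        if self.counts(path) and not blocked:
            start, count = self._window.get(ip, (now, 0))
            if now - start >= self.period:
                start, count = now, 0  # previous window expired
            count += 1
            self._window[ip] = (start, count)
            if count > self.limit:
                self._blocked_until[ip] = now + self.timeout
                blocked = True
        # A blocked client is only rejected on paths the mitigation covers.
        return "block" if blocked and self.mitigates(path) else "allow"


# Count only login attempts, but block the offender from the whole site.
rule = SplitRule(counts=lambda path: path.startswith("/login"),
                 mitigates=lambda path: True)
```

General browsing never touches the counter here, so a page that pulls in a hundred subresources is unaffected, while an eleventh login attempt within two minutes blocks that client everywhere for ten minutes.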
And this is an example of the power of the engine we've built here. You can use this for all sorts of use cases that aren't directly protecting your origin or protecting against scraping.
You can configure it in a much more flexible way to get all sorts of different behaviors and to meet quite a few different, less frequent use cases.
What we've found with the firewall, and this is probably a general theme of the last four years, is that we've been building much more general firewall systems.
The first firewall systems we built at Cloudflare were systems that we would entirely manage: we would write a set of rules, and you could enable or disable a rule, but it was not much more configurable than that.
What we've been doing more recently is allowing you to essentially write a programmable firewall.
And from my engineering point of view, the way I think of rate limiting is that we're adding memory to that programmable firewall.
So rather than just being able to select traffic in really complex ways, you can select traffic in complex ways, keep track of behaviors in that traffic, and then take an action based on that.
And as you can see from the way we're thinking about it, this is not the end of what we're going to build.
There are a lot more powerful, customizable features that we are working on to make this whole engine able to meet the more and more complex demands of our customers.
Yeah, I think a natural consequence of this is that we can combine those new products to create solutions to specific use cases.
Right.
So one use case for this new rate limiting is, for example, rate limiting in a credential stuffing context.
We have another product called Exposed Credentials Check, where we validate pairs of credentials, like a username and a password, and check whether they've been compromised somewhere on the Internet.
So we could essentially use rate limiting in conjunction with Exposed Credentials Check to make sure that we limit the rate of requests from individual IPs that try to break into accounts.
But not just that, right?
We have a bunch of new products that we're building on the engine that can leverage one another, right?
Absolutely.
So another example, if I take that a bit further: perhaps you want a rate limit not to affect human visitors, but you want it to affect bots, so you can combine it with the Bot Management product.
You can say: only count requests that are likely to be from bots and that are sending more than one exposed credentials match per minute.
So you can be very, very specific with the things you are doing there. And why would you want to do that?
Well, the goal with the firewall is always block the bad traffic and don't block the good traffic.
And the more powerful the ways you can write things, the more likely you are to be able to select just the right traffic.
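Combining signals in the counting condition, as described here, could be sketched like this. The field names `bot_score` and `exposed_credentials` and the cutoff of 30 are assumptions for the example; they stand in for whatever the bot management and exposed credentials products actually expose.

```python
def counts_toward_limit(request, bot_score_cutoff=30):
    """Count only requests that look automated AND carry exposed credentials."""
    # Assumption: a lower score means the request is more likely from a bot.
    likely_bot = request["bot_score"] < bot_score_cutoff
    return bool(likely_bot and request["exposed_credentials"])


def exceeds_limit(requests_in_last_minute, limit=1):
    """True when more than `limit` counted requests arrived in the last minute."""
    counted = sum(1 for r in requests_in_last_minute if counts_toward_limit(r))
    return counted > limit
```

Because human-looking traffic never increments the counter, a real user who mistypes a reused password a few times is not caught by the rule, while a bot replaying a leaked list trips it on the second match.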
And that leads me on a bit, actually. I think something we need to talk about is how we can make it easier for customers to set up all these rate limiting rules.
Because, as I say, this is a complex system.
It's a complex form.
We've done a lot of work to make it usable, but it's inherently a complex system.
So how can we tell customers what rate limits they should build?
Yes, I think that's a very good point.
It's something we realized while building this: the more power we give to customers, the more complexity we need to expose.
And so I think one of the directions we want to take in the near future is to provide more tools to customers that can help them select the right threshold or select exactly what type of use cases they need to fulfill.
So a couple of things we are working on: one is providing better analytics, which is something we are planning.
And another thing that is important, I think, is defining the threshold. As we define the traffic in a more granular way, perhaps adding components we are counting on like headers or session IDs, it becomes harder for users to understand what a reasonable threshold is, the threshold they should use to, as you said, block the bad traffic but let the good traffic through.
And so suggesting thresholds automatically, given the configuration of a rule, is something that would be very powerful.
And I think this is something that Cloudflare in general is already doing, or is moving toward.
If you look at, for example, API discovery, API discovery is a product that allows you to map all your API endpoints.
But at the same time, we provide a recommendation for a rate limiting threshold that you can start enforcing, essentially a baseline.
It says: this is the number of requests that I'm quite confident is a good number to allow all of the good traffic.
So there is definitely space for us to create those tools and integrate them with the product to make it easier to use and easier to understand.
One more point, I guess, is what we do with the old rate limiting.
What is the roadmap for releasing the new one, and what are we going to do with the old one?
Advanced Rate Limiting, with all the power of its configurability,
we're launching today in general availability for all Enterprise customers on the Advanced plan.
But, let's say, the Pro, Business, and Free zones still get access to our traditional rate limiting product going forward.
Of course, we will bring down to those plans as well some of the advantages of the new rate limiting, for example the new UI, the new interface, and also the fact that it's going to be built on top of the new engine.
So with a much more powerful API and a more integrated experience with the other products.
Richard, what about from an engineering perspective? What are the big challenges you think we're going to run into in the near future?
So the challenge we always have with this is making a more powerful product that's still easy to use and still performs reliably under heavy attacks.
The product, as you described, is used very frequently to protect people against very large attacks.
It's maybe a slightly more customizable counterpart to our DDoS protection systems, and it also protects against attacks which are not just denial-of-service attacks, like scraping attacks, and other types of threats like credential stuffing.
So the system has to work under very high load.
We are constantly checking this; we have a fairly extensive benchmark system.
We've made it so you can write much more complex rules.
How can we let you do that and be sure they can be evaluated quickly enough to handle vast attacks?
And the answer is, we've done a ton of work building a really powerful matching system.
We've put a few blog posts out about this, but our engine is written in Rust.
It takes these filters that you can build with the rule builder that Daniele was showing and can evaluate hundreds of them in fractions of a millisecond.
So we are able to fairly confidently release huge, complex new functionality, knowing that it all gets normalized into this single form on our edge to match against.
And because we have all these expressions in that single central form, we can spend lots of engineering effort optimizing it, making sure it runs fast, and all of our products benefit.
So the engine is not just part of the firewall; it's part of Transform Rules, and it's part of many of the products we're building now that allow traffic, or aspects of requests, to be selected.
We also have things like custom logging, where you can select requests with the same engine.
So the challenge really comes down to: how do we make that really flexible system something which customers can effectively configure?
We've done quite a lot of work making sure we have Terraform integrations, and making sure that we have the right access-level permissions, so that you can use API tokens for your APIs and so that zone-level access control works with it.
So there's a lot of customizability to make this powerful engine fit in very well with all aspects of Cloudflare's products.
And I guess another aspect of this is the UI.
So the engine is becoming more and more powerful.
As we discussed before, there is more pressure on design and UI to make it really usable.
Of course analytics, as we mentioned before, and suggested thresholds are going to be useful for customers, but so is having flows that are more directly aimed at specific use cases.
So having more templated rules, if you like, or paths that will guide customers towards fulfilling their use case.
We talked about creating a rule against scraping attacks, or perhaps distributed bot attacks as well.
All those use cases can be presented to the customer in a more opinionated way, guiding them through all the settings that need to be decided and configured to reach the goal of protecting their infrastructure.
Yeah, absolutely.
The system we've built is generic and powerful. How we help people use it is probably the big challenge for the next year.
How do we make it as easy as possible to use, and to build the custom, powerful protections that you might want?
And then there are other new features we're building as well.
So there's the idea that maybe you don't want to just increment your count by one for every request.
Maybe some requests are more expensive to serve than others.
So we're building ways for our customers to tell us that one request is more expensive to serve.
And then we will allow complexity-based rate limiting, essentially.
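Complexity-based counting, as outlined here, amounts to adding a per-request cost to the counter instead of always adding one. The sketch below is hypothetical: the feature is described as still being built, so the function name, the `cost` value, and the budget are illustrative only, and window expiry is left out for brevity.

```python
def charge(spent, key, cost, budget):
    """Add this request's cost to the key's running total.

    Returns True while the key is still within its budget for the period.
    """
    spent[key] = spent.get(key, 0) + cost
    return spent[key] <= budget


spent = {}
cheap_ok = charge(spent, "session-1", cost=1, budget=10)       # total is now 1
expensive_ok = charge(spent, "session-1", cost=8, budget=10)   # total is now 9
over_budget = not charge(spent, "session-1", cost=5, budget=10)  # total is now 14
```

A client making a few expensive requests can exhaust the same budget that would cover many cheap ones, which is the behavior the weighting is meant to capture.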
And I think we're coming up on time.
There's so many more things we can talk about.
The firewall is evolving very quickly.
Do you have any summary thoughts to share, Daniele?
I mean, we would just love to hear feedback from customers.
So if you are going to try the new product, please let us know what you think.
What would you like to see next? Of course, we built it by talking to you, by talking to customers and collecting your use cases.
So we want to continue that trend.
And I think with that we are out of time.
So thanks.
Thanks a lot, Richard.