API Schema Validation
Presented by: Daniele Molteni
Originally aired on October 2, 2023 @ 1:30 AM - 2:00 AM EDT
Tune in to learn about validating API schemas, presented by Cloudflare product manager Daniele Molteni.
Transcript (Beta)
Good morning, good afternoon everybody. My name is Daniele Molteni. I'm the product manager for Firewall and API Shield here at Cloudflare.
And I'm very excited to be here today to be able to present one of the products we launched recently.
It's actually been quite a long time since I've been here on Cloudflare TV.
I think probably the last segment I had was a few months back.
So it feels very, very good to be back here, talking to everybody and sharing one of the amazing products we recently launched.
So let me share, I've prepared some slides to guide you through what we just built.
As you probably know, we are also accepting some questions. So if there is anything you want to ask, please do send those questions and I'll make sure to answer during this time.
So today, what we're going to talk about is API security.
So traditionally, if you want security tools from Cloudflare, the WAF, the Firewall, they've always been used to protect API traffic.
So natively, all our tools, they work for API traffic as well as web traffic and other protocols.
However, we have seen a big change in the last years, which called for specific and very targeted products that secure API communication.
And the reason why, I'm going to tell you in just a second.
So you probably are seeing right now a slide where I'm comparing two situations, one for web HTML traffic and the other one in the API world.
So traditionally, web and HTML traffic, they worked in a way that was more static.
So if you were on a browser, you were asking for a page.
So you were sending a GET request to a backend application. The backend application was handling that request, pulling data from a database, populating that page, creating that page and returning to the eyeball.
So the application, although of course dynamic, it was fairly static in the way it was executed at the side of the eyeball.
And the advantage of this is, of course, the database was protected by the application that was running in the backend.
So there was no direct connection with the database.
So the possibility of leaking data was limited or managed by that frontend application.
In the API world, this changes a little bit.
So we have a different dynamic here. So you have on the browser, you have the app that runs.
So whenever you want to render some information, like in this case, for example, it's a profile page, you send a request for data, right?
You send a GET profile with a JSON body, for example, you send it back to the backend API.
The backend API retrieves the information from a database and then ships it to the frontend app that runs in the browser.
Of course, the advantage of this is that the application feels smoother, it's way more dynamic, it's a better user experience, and in general, there's more freedom of innovating.
The downside is, of course, that the data exposed by the API is one layer less separated from the eyeball and from potential attackers.
So attackers can send calls to request data or get dumps from the database directly.
And this is, of course, very dangerous.
However, API usage is growing extremely fast, right, judging by the level of API traffic we see.
And the reason is, of course, that businesses are being built on APIs, because APIs allow businesses to ship features much, much more quickly and also allow them to decouple how the application looks in the frontend from the way the actual business logic works in the backend.
Let me show you a couple of data points on why this is becoming so interesting and intriguing, because we looked at the traffic that is flowing through Cloudflare.
And 50%, almost 50% of the requests are API related. And specifically, JSON is by far the biggest protocol that is being handled by a proxy like Cloudflare.
And because Cloudflare is handling 18% of the total proxied Internet traffic, this is a huge number, right?
A massive amount of requests or traffic is JSON API related.
Of course, there's also XML and there are other protocols in this.
But in general, API basically is a big chunk of what we see at our edge.
And of course, there are other protocols. Web is still 20%, although it's actually shrinking.
We have binary protocols, gRPC. We've seen an increase in uptake in gRPC since we launched it.
We were one of the first proxies to launch it, to launch support for gRPC more than a year ago.
And of course, the media, which increased for COVID.
But the actual very interesting dynamic here is that API is taking the lion's share of the requests while web is shrinking.
And the growth actually over time, this is incredible.
So 29% of growth has been observed on zones of our customers since last year.
So they've seen a 29% increase in API traffic since last year.
And if you look at the industries, it's even more telling. So you see cryptocurrency, financial services, they are leading the pack.
And of course, cryptocurrency is a relatively small industry, but financial services is massive.
And it's actually, you see here the monumental shift towards let's use API for our service.
And this calls for tools and in general, solutions to protect those endpoints and the backend, and also all the data, the sensitive data that those services hold and expose.
To analyze the type of attacks, so if you want type of events we see hitting specific APIs, we can look at the error codes that are returned by our edge.
So this tells us a little bit what's the composition of type of requests that fail to reach the origin.
This can be of course, due to unintentional mistakes, perhaps a request has been malformed by a bug in the application or perhaps was intentional with some attacks here going on.
Anyway, if we look at the statistics or if you want the volume of different error codes, it tells a little bit of what's going on and perhaps where we should focus on when we develop new solutions and tools.
So here are the three main categories. So we have a category of failed authentication and authorization.
So this is when you fail to provide a valid authentication or of course authorization, then we return a 401 or a 403.
So if you look at 401 and 403, that's kind of one bucket of attacks or failed requests.
There is another bucket which is the 400, which is the failed request validation.
So the request was not as we expected.
And then the third bucket is the 429, which is excessive rate of requests, the classic rate limiting response code.
There is of course the 404, which we don't care about at this point.
But for those three, we see that each bucket accounts for 20-22 percent of error codes.
So, and they're actually quite equally distributed, which is interesting.
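The bucketing described above can be sketched in a few lines. This is only an illustration: the bucket names and the sample list of status codes are made up, and only the grouping of 401/403, 400, and 429 comes from the talk.

```python
from collections import Counter

# Hypothetical bucket names; the status-code groupings follow the talk.
BUCKETS = {
    401: "auth_failed",
    403: "auth_failed",
    400: "validation_failed",
    429: "rate_limited",
}

def bucket_shares(status_codes):
    """Return each bucket's share of all error responses (status >= 400)."""
    errors = [code for code in status_codes if code >= 400]
    if not errors:
        return {}
    counts = Counter(BUCKETS.get(code, "other") for code in errors)
    return {name: count / len(errors) for name, count in counts.items()}

# Made-up sample, not Cloudflare data.
sample = [200, 401, 403, 400, 400, 429, 429, 404, 200, 500]
shares = bucket_shares(sample)
```

Here 404 and 500 land in an "other" bucket, matching the remark that 404 is not of interest for this analysis.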
So when we think about new products here at Cloudflare, we look at the data, we look what's going on, we look at different attacks, we also hear from customers and we prioritize the features we want to build.
And of course, if you look at this, then there are these three macro areas where we should really focus on and develop new products.
One is to protect against brute force attack or excessive rate of requests, which of course we cover with products like rate limiting, for example.
There is authentication and permission, which we cover with other products like mTLS, for example.
And then there is the malformed malicious request space.
And we basically want to prevent requests that are malformed or malicious in nature, by the way they are formed, from reaching the origin of our users or our customers.
And this is where API schema validation comes into play and where it sits, why we develop this new product.
So there was a pretty long preamble, but I thought to give some context of why we're developing these features and where they sit in terms of like, okay, what type of attacks are we addressing with this type of product?
So now I'm going to tell you, at a high level, a little bit about how the product works, but also, towards the end, I have an example of an attack that we can mitigate with schema validation.
And hopefully that makes everything more clear.
So before jumping into the details of the product, let me tell you a little bit of the difference between schema validation and more traditional security tools like the WAF, like a firewall.
So traditional security tools like WAF, they follow a negative security model.
The negative security model follows the principle of allowing everything, you allow all requests, and then you write exceptions for what bad traffic looks like or unwanted traffic.
So you say, okay, allow all requests, but block traffic coming from Italy, which is my country of origin, or allow all requests, but block traffic from this set of IPs that perhaps is, I don't know, the IPs of a VPN or a Tor node or something like that.
And this is kind of, yeah, the approach that we commonly use when we write those type of rules.
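A negative security model like the one just described could look roughly like this. It is a minimal sketch, not how the WAF is implemented: the function name is hypothetical, and the blocked IP range is a documentation range used as a stand-in for a real VPN or Tor address list.

```python
import ipaddress

# Hypothetical deny rules: one blocked country and one blocked IP range
# (203.0.113.0/24 is a documentation range, used here as a placeholder).
BLOCKED_COUNTRIES = {"IT"}
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

def negative_model_allows(country, ip):
    """Allow every request unless it matches one of the deny rules."""
    if country in BLOCKED_COUNTRIES:
        return False
    address = ipaddress.ip_address(ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return False
    return True
```

Anything the rules do not mention gets through, which is the defining property of this model.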
On the other side, we have the positive security model, which is becoming very, kind of like a, yeah, a bit of like a common buzzword in the space, right?
But essentially what it does, it does exactly the opposite.
So what you do here, you block everything, you block all requests, except you define what good traffic looks like, and you allow it, and you just let it go to the origin.
An example is, you know that every request needs to have a valid user ID made of eight digits.
So whenever you receive a request, you look for a user ID: is there a user ID?
Does it have eight digits? Yes. Okay. Then you can forward the request.
If it doesn't, then you block it. And when you deal with APIs, sometimes it's much easier to create a positive security model.
And of course, you have probably guessed it, the positive security model is much stricter, right?
It defines what good looks like, and then you block everything else. So you don't really write the exception, you just write what you really want to see.
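The eight-digit user ID example can be sketched as a positive security check. The field name `user_id` and the assumption that it arrives as a string are illustrative only.

```python
import re

# Assumption: the user ID arrives as a string field called "user_id".
USER_ID_PATTERN = re.compile(r"[0-9]{8}")

def positive_model_allows(request):
    """Forward a request only if it carries a user ID of exactly eight digits."""
    user_id = request.get("user_id")
    return isinstance(user_id, str) and USER_ID_PATTERN.fullmatch(user_id) is not None
```

Note that there is no list of bad patterns here: a request with a missing or odd-looking user ID is rejected simply because it does not match the one definition of good.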
And so if you have the ability to build a positive security model, then you should definitely do that.
That actually would be good practice. And when it comes to APIs, sometimes you have the perfect conditions to be able to write a positive security model.
And the reason why there's a perfect condition in place is because you have a schema.
And what is a schema?
A schema is the contract of how the API works. So the API is essentially a call, a function call, right?
So you're calling a function and asking for some data, giving some parameters as input.
So you can define how those interactions work.
And schemas are sometimes generated during the software development process, right?
So you're defining, you are creating a new endpoint.
The developer is creating a new endpoint. And so, while writing the software, they can also automatically generate the schema that defines the interaction between the outside world and the specific endpoint.
And in this contract, in the schema, you will define operations.
And each operation includes a URL and a method.
So for example, a GET from a specific URL. And also you define where to expect parameters.
So the parameters can be in the path, in the query, header, or in the body.
And also defines the structure and the data type of those parameters.
So for example, if you define the body, if you say, okay, I want those parameters, then you will define how the structure looks like, what parameters are integers, what parameters are strings, and so on and so forth.
And there are also standards, right?
They allow you to have something which is uniform across the industry, which is the OpenAPI or Swagger type of file.
And so if you follow the standards, then you will automatically create a file and a schema which is usable, for example, by Cloudflare to create this positive security model, which is what schema validation, of course, does.
But just to give a little bit more color and details on how this works, let me pull up an example of an API call.
Okay, let's assume we have created a software, an application, which is a pet store.
A pet store is an application that contains a number of pets, like profile of pets, with their data and their information.
This can include their name, the owner, a picture of them.
Just imagine a kind of social network for pets. You send a request, and that is a GET request to an endpoint, which is slash pet slash pet ID.
In this case, the ID is one, two, three, four.
And then you receive a response, which is the ID of the pet, additional information like the name of the pet, and perhaps other data like the URL of a photo or the picture.
And the application, running in this case on a mobile, will take this information; the application knows how to render a page or the UI, and it will use this data to populate the page and show you this information.
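The round trip just described can be imitated with a toy backend and a toy client. Everything here is hypothetical: the in-memory data, the field names, and the error messages are made up to show the shape of the exchange, not the real pet store API.

```python
import json
import re

# Hypothetical in-memory "database"; field names and values are made up.
PETS = {1234: {"id": 1234, "name": "Rex", "photoUrl": "https://example.com/rex.jpg"}}

def handle_get_pet(path):
    """Toy backend: map GET /pet/<id> to a (status, JSON body) pair."""
    found = re.fullmatch(r"/pet/([0-9]+)", path)
    if found is None:
        return 400, json.dumps({"error": "invalid pet ID"})
    pet = PETS.get(int(found.group(1)))
    if pet is None:
        return 404, json.dumps({"error": "pet not found"})
    return 200, json.dumps(pet)

def render_profile(body):
    """Toy client: turn the JSON response into text for the UI."""
    pet = json.loads(body)
    return f"{pet['name']} (#{pet['id']}): {pet['photoUrl']}"

status, body = handle_get_pet("/pet/1234")
profile = render_profile(body)
```

The backend only returns data; how it is displayed is decided entirely by the client.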
So this is, at a very high level, how an API call works. What does the schema of this API call look like?
So here is an extract on the left of the schema for this API call.
You'll see on the top, you have the URL of the operation.
So this is essentially an operation. So you'll see the URL, so the destination of the call should be slash pet slash pet ID.
Then you have the method, which is get.
You have the description of what that operation does. And then you have a block that defines the parameters.
The parameters include the name, of course, of the parameter, which is, in this case, pet ID, where it's expected in the path.
Is it required for this specific request? Yes, that's true. That's required.
So if there is no pet ID, then we don't want to see this request at all.
And the type is an integer. And then you have another block here at the bottom, which is security.
So it defines how the security should work. You have also a block for responses.
What are the responses that you expect? And there are other blocks that are not shown here for other type of calls.
It could be the body, for example, how the body is structured, or the headers.
So any part of the request can be specified here, and really tells you how to expect this specific request.
And on the right hand side, you see the request that we just discussed in the example before, and how every block of the schema maps to a specific part of the request.
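The slide itself is not part of the transcript, so here is a rough reconstruction of the idea: a schema excerpt written as a Python dict and a small check of a request against it. The layout follows OpenAPI 3 conventions (type nested under `schema`), which is an assumption; the description text and both function names are made up, and the security and responses blocks are omitted.

```python
import re

# Excerpt shaped like an OpenAPI operation, written as a Python dict.
# Only fields mentioned in the talk are shown; the description is made up.
SCHEMA = {
    "/pet/{petId}": {
        "get": {
            "description": "Find a pet by ID",
            "parameters": [
                {"name": "petId", "in": "path", "required": True,
                 "schema": {"type": "integer"}},
            ],
        }
    }
}

def match_path(template, path):
    """Return the path parameters as strings if path fits template, else None."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    found = re.fullmatch(pattern, path)
    return found.groupdict() if found else None

def complies(method, path):
    """Check method, URL, and path parameter types against SCHEMA."""
    for template, operations in SCHEMA.items():
        params = match_path(template, path)
        operation = operations.get(method.lower())
        if params is None or operation is None:
            continue
        for spec in operation["parameters"]:
            value = params.get(spec["name"])
            if value is None:
                if spec["required"]:
                    return False
                continue
            if spec["schema"]["type"] == "integer" and not re.fullmatch(r"-?[0-9]+", value):
                return False
        return True
    return False
```

With this sketch, `complies("GET", "/pet/1234")` passes, while a non-numeric ID, a missing ID, or an unlisted method does not.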
So what schema validation, the product, does is create this positive security model by leveraging the schema.
So if you are a customer, you can upload an OpenAPI version of the schema to the dashboard, where we automatically parse the schema and understand the endpoints.
We provide a list of endpoints. This is an example here with associated methods.
We also show which part of the request we validate. Is it the path?
Is it the query? Is it the header, cookies, or body? And then we also provide a way for customers to customize the behavior.
So here you'll see a couple of actions you can set.
So the first one is the endpoint action, and then you have the fall-through action.
The endpoint action defines what action you want to take when a request doesn't comply with the schema.
So in this case, you see there is a log.
So it means that if we receive a request that perhaps doesn't have the pet ID parameter, then in this case, you will log it.
You could also change it to block.
So you will enforce that positive security model by blocking the request without the pet ID.
The fall-through action applies when you don't match with any of the listed endpoints.
So sometimes you just have an endpoint which is not present in the schema, and you want perhaps to block the request that goes to that endpoint because you don't expect any other endpoint outside of the one you have defined.
And so you will change the fall-through action to block in that case. However, that's not always the case.
Sometimes developers are publishing new endpoints without updating the schema, and perhaps they don't want to break the traffic going to those new endpoints.
So perhaps for the fall-through action, you'd like just to simply allow the request to go to another endpoint.
Of course, in that case, you won't enforce that security model for new endpoints, but at least you won't be blocking or disrupting the traffic to new URLs that perhaps you just launched for testing or for early development.
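How the two actions interact can be summarized in one small function. The function name, the argument names, and the default values are assumptions for illustration; only the log, block, and allow behaviors come from the talk.

```python
def decide(request_matches_endpoint, request_complies,
           endpoint_action="log", fallthrough_action="allow"):
    """Return the action taken for a single request.

    request_matches_endpoint: the method and URL are listed in the schema.
    request_complies: the request satisfies that endpoint's schema.
    """
    if not request_matches_endpoint:
        # Unknown endpoint: the fall-through action decides.
        return fallthrough_action
    if request_complies:
        return "allow"
    # Known endpoint, but the request violates the schema.
    return endpoint_action
```

Setting `endpoint_action="block"` enforces the positive security model on listed endpoints, while `fallthrough_action` controls how strict you are about endpoints the schema does not mention.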
Let's take a look at an example of an attack we've heard from customers that can be mitigated by this product.
So this is an attack that involves a malformed request and a targeted denial-of-service attack.
So let's assume you have an e-commerce store, or any store for that matter, and you have an endpoint that can be queried to retrieve the balance of an account.
In this case, the URL is account balance, as you can see on the right, and then it's a public API.
The payload is expected because you need to provide more information.
And in this case, in the body, in the payload, you will have a field, which is account, which is a number, a set of numbers, and a pin, for example, or any other parameters that you can imagine.
And this request is not authenticated. It's not behind any login endpoint in this case.
So you can't really track the session, for example. So the attack we are seeing happening in this type of situation is that there are distributed attacks made by botnets that send requests with dummy payload, and perhaps they send two, three requests maximum per IP, or per bot.
So each bot has a different IP, and they send a very low rate of requests over time.
And they hit the endpoint and try to basically steal data, or try to exhaust the resources of the particular endpoint or server.
The reason why bots, they use such a small volume of requests is to avoid rate limiting, for example.
So rate limiting usually works when you have a spike of requests coming from the same IP, or the same session, or with the same cookie.
If you see a spike, and you have a threshold, then you can handle that request by limiting or blocking the requests coming from the specific IP.
If you send just a very few requests from the same IP, that's a way to basically fly under the radar.
And we're seeing this happening more and more often.
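A small simulation shows why a per-IP threshold misses this pattern. The threshold, the request counts, and the IP addresses are placeholders, and the limiter is a simplified single-window counter rather than any real product's algorithm.

```python
from collections import Counter

def blocked_requests(source_ips, threshold):
    """Count requests rejected by a per-IP limit within one time window.

    Each IP may send up to `threshold` requests; anything beyond is blocked.
    """
    seen = Counter()
    blocked = 0
    for ip in source_ips:
        seen[ip] += 1
        if seen[ip] > threshold:
            blocked += 1
    return blocked

# One noisy client: 100 requests from a single IP.
single_source = ["198.51.100.1"] * 100
# Botnet: 50 bots sending 2 requests each (addresses are placeholders).
botnet = [f"10.0.{i}.1" for i in range(50)] * 2

blocked_single = blocked_requests(single_source, threshold=10)
blocked_botnet = blocked_requests(botnet, threshold=10)
```

Both traffic patterns contain 100 requests, but only the single-source spike crosses the threshold; the botnet stays under it on every IP.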
In general, we see attacks moving away from using specific or individual IPs and towards very, very wide botnets that share and rotate IPs, and this is why we're moving away from IPs.
But that's a different story for another time, for another product.
So let's focus on this.
So you send a lot of requests coming from different bots. How do you mitigate this type of attack?
So we can use, of course, schema validation in this case.
So you upload the schema where you define that you expect an account and a PIN in requests hitting the specific URL.
And whenever you see a different request coming from there, then you block it.
And in this case, very easily, you can mitigate a denial of service attack that doesn't generate requests that comply with what you were expecting.
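The body check for this example could be sketched as follows. The field names come from the talk, but their types are an assumption (both treated as JSON integers here; a real PIN with leading zeros would more likely be a string), and rejecting extra fields is a design choice of this sketch.

```python
import json

# Assumption: both fields are required JSON integers; types are illustrative.
EXPECTED_FIELDS = {"account": int, "pin": int}

def body_complies(raw_body):
    """Accept only a JSON object with exactly the expected, correctly typed fields."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return False
    if not isinstance(body, dict) or set(body) != set(EXPECTED_FIELDS):
        return False
    # bool is a subclass of int in Python, so exclude it explicitly.
    return all(
        isinstance(body[name], kind) and not isinstance(body[name], bool)
        for name, kind in EXPECTED_FIELDS.items()
    )
```

A bot sending dummy payloads fails this check on its very first request, so the mitigation does not depend on how many requests each IP sends.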
So let me show you how the product works.
So this is the dashboard, the Cloudflare dashboard. So you see here you have API Shield, which is a new tab.
And you can deploy a new shield in this case. So of course, we prompt for a name, my new shield in this case.
And then we ask you where you want to deploy the API.
What's the host name, and perhaps a base path of your API.
So often, we have APIs that sit under, for example, slash v1 or slash API.
So you can add that information about where you deploy the specific product.
You click Next. You select schema validation from here. And then let me pull the schema I use for the demo.
And then you drag and drop your schema file into the box. You click Save.
And under the hood, the file will automatically create the rules we discussed.
So here you see the list of rules. You can change the, as I mentioned, the endpoint action.
Or you can also change the fall-through action for whenever the request doesn't hit any of these endpoints.
And you'll see here that for every URL, you have different methods that are accepted.
You click Save and Deploy. And of course, it goes back to your original window.
And you'll see the list of your shields.
Yes, this is very automatic. So in a way, the idea here is that customers don't have to set up anything apart from the schema.
Once you have the schema, we can handle all the process of setting up the rules from our side.
We have a few more minutes. Let me take those minutes to discuss where this product sits when compared to other products that we're developing for API security.
So as I mentioned, API security in general is a growing field because of the increasing number of requests we see at our edge.
And API schema validation is only one of the features we are developing inside Cloudflare.
So when a customer with an API problem comes to us, we tend to suggest a strategy that we've seen working very well for many of our customers and a lot of the traffic we see at our edge.
And it's essentially a way to remove noise incrementally.
So here you see four main stages.
So the first stage at the top is about blocking DDoS attacks. So these are more about volumetric, network layer attacks.
Then once you pass that stage, it's more about enforcing authentication and authorization.
And if you remember what we were discussing before about the error codes we see at our edge, authentication and authorization is one of the three main buckets where we see attacks and blocked requests.
Of course, the third bucket is about the request validation. So it's the business logic of the request value.
And this is about the malformed request, the other bucket.
And finally, the load control. Are we able to handle the rate of requests in a correct manner?
Are we sure we are not overloading the origin server with too many requests?
So this is kind of the idea or the structure we suggest customers follow when deploying tools, and the way we deploy tools for them.
And here is just a quick recap of other products we are developing in this space.
So API DDoS protection: of course, Cloudflare, the entire network of Cloudflare, provides that always-on DDoS protection without a single point of failure.
Authentication and authorization: we have a few products, mTLS, for example, and other products we are working on.
The request validation part is, well, schema validation, as we discussed, but we're also providing other products that fall into this category.
So for example, the validation of credentials against databases.
So this is a product against credential stuffing attacks.
We have also discovery and inventory, which is another product that provides visibility on the type of traffic, on the traffic and what API endpoints are reached by this traffic.
Payload scanning is another feature.
And finally, load control: that's about preventing brute force attacks and protecting the origin against abuse in general from excessive requests.
And of course, here, rate limiting is the king of this space, right?
Because it's a great last resort when it comes to protecting origins.
And we have been building a new rate limiting that moves away from IP.
So this is something I just mentioned a moment ago.
So we have a new rate limiting that can act on dimensions other than IP: it can count on user ID, session ID, or query parameters.
So it really gives customers the flexibility of setting up rules that look for patterns in the request rather than focusing on the origin IP.
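Counting on something other than the IP can be sketched with a limiter that takes the counting key as a parameter. The request shape, the field names, and the threshold are hypothetical, and the counter covers a single window only.

```python
from collections import Counter

def make_limiter(key_fn, threshold):
    """Build a limiter that counts requests per key within one window."""
    counts = Counter()

    def allow(request):
        key = key_fn(request)
        counts[key] += 1
        return counts[key] <= threshold

    return allow

# Hypothetical request shape: a dict with "ip" and "session_id" fields.
by_session = make_limiter(lambda r: r["session_id"], threshold=3)

# Five requests from five different IPs that all share one session ID.
traffic = [{"ip": f"10.0.0.{i}", "session_id": "abc"} for i in range(5)]
decisions = [by_session(r) for r in traffic]
```

Because the key is the session ID, rotating the IP on every request does not help: the fourth and fifth requests are still rejected.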
And then of course, in the future, this could become more like a quota management tool.
And one of the tools also that customers are using, which sometimes we forget about, is logging and monitoring, which is one of the key, if you want, tools that customers have available or is available when it comes to security.
And here is just a very high level summary of the products we already talked about.
So the bot management team, for example, developed a new, very interesting product, which is called API Discovery.
This is a product that automatically provides you a list of endpoints, just by looking at the traffic that hits your host name or your site.
We have mTLS, which is a very useful tool for mobile traffic and IoT devices.
Of course, on the top right, schema validation.
Advanced rate limiting is something I just mentioned and discussed.
And finally, data exfiltration is a massive problem also with API.
And there is a product we recently launched, which is for identifying sensitive data leaving the origin server of our customers.
This is a very, very final slide that recaps products we are building for APIs.
And I'm sure that there will be more opportunities for me to talk about this in upcoming segments on Cloudflare TV.
So if you are interested in learning more about any of these, stay tuned because new segments will come up with me giving you more information about this product.
I hope you really enjoyed this segment. And with that, I think my time is almost up.
So I wish you a great rest of the day.