💻 Developer Week: Security Spotlight
Presented by: Dina Kozlov, Simon Thorpe, Reid Tatoris
Originally aired on November 2, 2023 @ 10:30 PM - 11:00 PM EDT
Welcome to Cloudflare Developer Week 2023!
Cloudflare Developer Week, May 15-19, is our week-long series of new product announcements and events dedicated to enhancing the developer experience to fuel productivity!
Tune in all week for more news, announcements, and thought-provoking discussions!
Visit the Developer Week Hub for every announcement and CFTV episode — check back all week for more!
Transcript (Beta)
Hi, I'm Simon Thorpe here with Cloudflare, excited to be part of Developer Week, and I'm joined today by Dina and Reid. Dina is going to talk about a new secret store capability that we're bringing online, and Reid's going to talk to us about securing your large language model APIs with Cloudflare.
So let's start with Dina.
Hey Dina, how are you doing? Hi, doing great. I'm happy to be here. Great. Could you introduce yourself and your role at Cloudflare?
Yeah, I'm Dina Kozlov. I've been a product manager here at Cloudflare for about four years now and I've worked on DNS before and now I'm currently on the SSL TLS team.
Awesome. So we've just announced a new secrets management capability.
Can you talk to us about what you've been working on and what this capability is going to bring?
Sure. So before I talk about the actual capability, I'll take a step back and talk about the problems that developers face and why we're even building this out.
So essentially today, there's lots of variables that engineers need to handle.
So even if you think within the Cloudflare ecosystem, we have our serverless platform that allows customers to deploy their code within a worker.
And so a code can have multiple variables and some of those may be things like API tokens, authentication tokens.
And so this is information that shouldn't just be visible by anyone. It shouldn't live in plain text.
It should be encrypted and stored securely. And only your application should be able to grab it when it needs to be used.
But ideally, no one has read access to it outside of authorized users.
And so even within the Cloudflare ecosystem, we saw a lot of use cases for customers wanting to keep different values and variables encrypted and make sure that they have the proper permissions around those and making sure that there's proper auditing around it.
So what we are planning, what we are working on building out is a secret store that's going to allow you to manage different values that you use across Cloudflare products.
And you're going to be able to manage it in one centralized location. And so that way, if you're using one variable across, for example, multiple worker scripts, you only have to keep one version of that variable alive in one centralized location.
It's going to be securely stored. And then when you need to rotate that value, you only need to rotate it in one place and it will automatically be updated and deployed wherever it's being used.
And so that is the core thing that we're working on right now.
But even beyond that, there are so many use cases for a secrets manager.
And so we know that customers have a need for secrets, not just within the Cloudflare ecosystem, but also within their own applications, whether they're hosted on cloud or on-prem.
And so the next thing that we're going to work on building out is allowing customers to store their general secrets, ones that are not used across Cloudflare products, store those within the Cloudflare secret store, and then have a way to be able to fetch them.
And so that way they can maintain the secure storage, the access controls, and so on.
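As an illustration of the "rotate it in one place" idea described above, here is a minimal sketch. It is not the Cloudflare secret store or its API; `SecretStore`, `Worker`, and the secret name `API_TOKEN` are hypothetical names invented for this example, and encryption and access control are left out to keep it short. The point is only that consumers hold a reference to a named secret rather than their own copy, so one rotation reaches all of them.

```python
from dataclasses import dataclass, field


@dataclass
class SecretStore:
    """Hypothetical account-level store: one copy of each value, looked up by name."""
    _values: dict[str, str] = field(default_factory=dict)

    def put(self, name: str, value: str) -> None:
        # Creating and rotating are the same operation: overwrite the single copy.
        self._values[name] = value

    def get(self, name: str) -> str:
        return self._values[name]


@dataclass
class Worker:
    """Stand-in for a worker script that holds a reference, not a copy."""
    name: str
    store: SecretStore
    secret_name: str

    def auth_header(self) -> str:
        # The value is resolved at use time, so a rotation is picked up on the next call.
        return f"Bearer {self.store.get(self.secret_name)}"


store = SecretStore()
store.put("API_TOKEN", "token-v1")
workers = [Worker(f"worker-{i}", store, "API_TOKEN") for i in range(3)]

store.put("API_TOKEN", "token-v2")  # rotate once, in one place
headers = [w.auth_header() for w in workers]
```

After the single `put` call, every entry in `headers` carries the rotated value, without any per-worker update.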
Gotcha. Okay.
So let me break this down a little bit to understand what we're talking about.
So when I'm building an app and if that application needs to talk to some external API and I need to pass an API token, usually I'm not going to store that in code.
That would be a really bad idea, especially if I publish that code to a repo, which is publicly accessible.
I'm now giving everybody access to my API keys.
So usually I go and reference where I need to use that API key, referencing an environment variable on the local machine.
And then I've used other products like Twilio functions, for example, where in the Twilio function, I have access to virtualized environment variables, which I can use in exactly the same manner.
So how is this different to just having an environment variable per function?
So it's very similar.
I think the main parallel there is, for example, on our serverless platform, you have your worker scripts.
And so today, slash in 2020, we built out the ability to, within a worker, have a local variable that's used within that worker.
And then you can also encrypt that value to turn it into a secret. But if you're building on top of our workers platform, it's likely that you have many workers for many different services.
And it's possible that a few of those workers share the same secret or token.
It could be an API token. It could be a zone ID. And so instead of storing multiple copies of the same value, what we're going to allow you to do is have one account level version of it.
So that way, you only need to manage it in one place.
And you won't need to replicate it all across your workers, which is very similar to, for example, local variable functions.
And so we will allow customers to maintain the per worker secrets.
But we will also give them the ability to have global variables that they can use across different worker scripts.
Gotcha. So it's like service-wide scoped to my account variables, which happen to be secrets that I can use anywhere in any of the code across any of the Cloudflare Workers and pages and all those types of future functionality.
Exactly.
And there's many use cases for these secrets. So for example, we've mainly talked about using them within workers.
But you can also, for example, if you're setting up a webhook, you usually have a shared secret.
And so that may be another place where you want to keep that value stored and encrypted within the Cloudflare ecosystem.
Another example is if, for example, you're setting up firewall rules.
In your firewall rule, you may indicate here's the auth header that we need to look for in this firewall rule.
But you don't want that to just live in the Cloudflare dashboard where anyone with WAF access can go and see it, because that is sensitive information.
So what the secret store is going to allow you to do is to manage that value in that secret, keep it encrypted in the secret store, and then within your firewall rule, just reference it.
And so that way, you can really scope it down.
And in terms of access controls, we will have access controls based on users, but also based on different products.
So for example, you'll be able to specify only allow workers one and two to be able to use secret X or restricting it only to a firewall rule and so on.
And I also think it's just as we build out more Cloudflare products, we're going to see more and more use cases for this.
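The product-based access control described above ("only allow workers one and two to be able to use secret X") can be sketched as an allow-list check. This is an assumption-laden illustration, not how Cloudflare implements it: `ScopedSecret`, `read_secret`, `AccessDenied`, and the consumer identifiers such as `worker:one` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScopedSecret:
    value: str
    # Hypothetical consumer identifiers, e.g. "worker:one" or "firewall:rule-7".
    allowed_consumers: set[str] = field(default_factory=set)


class AccessDenied(Exception):
    """Raised when a consumer outside the allow-list asks for a secret."""


def read_secret(secrets: dict[str, ScopedSecret], name: str, consumer: str) -> str:
    secret = secrets[name]
    if consumer not in secret.allowed_consumers:
        raise AccessDenied(f"{consumer} may not use {name}")
    return secret.value


# Secret X is usable by workers one and two only.
secrets = {"X": ScopedSecret("example-value", {"worker:one", "worker:two"})}
value_for_worker_one = read_secret(secrets, "X", "worker:one")
```

A call such as `read_secret(secrets, "X", "worker:three")` raises `AccessDenied`, which is the scoping behavior the discussion describes.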
That sounds really interesting. So as you describe this to me, I'm thinking that web application firewall rules, not only am I probably using sensitive data in there, but would it be fair to say that I might just want to use a variable that references maybe a protocol or maybe some identifier that's not necessarily secret, but I don't want to have to keep repeating this piece of information in every rule that I happen to set for different applications.
We could just use this as a regular, like a global variable store across policies I'm defining all over the product.
Yes. And even though we're talking about it as a secret store, you can also just store plain text variables in there that you can reference across.
And then you also have the option to encrypt them, to turn them into secrets.
That's super interesting.
So what's the plan to get this in customers' hands? So if you go to our blog post, which we launched today, at the bottom of the blog post, you can find a form that you can fill out.
And so as soon as this is ready to use in beta, we will reach out to you.
Okay. And are you in a position to talk about timeframes or is this coming soon?
It's coming soon, but we're hoping to have something in our customers' hands in the next few months.
Fantastic. All right. I really appreciate that.
Thanks very much, Dina. Okay. Over to Reid, who's going to talk to us about securing large language models with APIs.
So Reid, thanks for joining us.
Just describe to us again your role at Cloudflare and how does that relate to language models and AI in particular?
Yeah. So I am on the product team for our application security products.
And so nothing about those products is really unique to generative AI or to large language models.
But as this is a new category that our customers have started creating tools based on large language models, they've turned to our application security tools to protect them.
And then there's a couple of things that are really unique about LLMs. And so we've had to build some unique tools that help protect them.
Let's start with what is, it's a bit of a mouthful, a large language model.
What is that? So let me start with everyone in the world, I'm pretty sure has heard of ChatGPT that launched at the end of November.
That is kind of the really big application in the category that most people call generative AI.
So really high level generative AI is you give a prompt and it returns some type of content in a particular style.
So like, give me some Taylor Swift lyrics or give me an image that looks like Banksy.
So ChatGPT is the most popular of these, but like there've been just hundreds and hundreds created.
All of these generative AI applications are essentially large language models.
Large language models, they are all based on the transformer model type.
All large language means is that when you are training an artificial intelligence model, you can kind of tweak the number of parameters that go into the model.
You can also tweak the amount of training data that goes into the model. And so a large language model has just tens of billions of parameters.
So that is the large, and then they are generally trained on just insane amounts of training data, like the entire Internet.
So large language model is a model with billions of parameters trained on pretty much all of the Internet.
Gotcha. So you mentioned that you're kind of involved in web application security.
So there must be some convergence here. If we talk more in terms of ChatGPT, because I can visualize that a little bit easier:
a lot of the image AI tools that we've seen, are they also considered to be built off the back of large language models?
Yeah. Generative AI is a model that generates content, but the content could be words, that content could be images, the content could be video if you've looked at like Midjourney, but like any content response.
The thing that's interesting that all these different models share is that like at a really high level, whether it's ChatGPT or Midjourney or Stable Diffusion, or any of these, they are essentially one API.
That API accepts free form text.
So you type in into the model, give me some content that looks like this. And then they make a response.
This leads to a couple of pretty unique challenges.
So one, these are generally open.
You want tens of millions of people using them. And so traditional API security tools, like setting up mTLS, like doing token authorization, don't work in this case because you don't have a few partners here.
You want as many people as possible using them.
And the second thing is that free form text input makes security really difficult.
So if you think of, we have an API at api.cloudflare.com and that API has what's called a schema.
And that schema essentially says, we will only accept these 10 parameters.
And each of those parameters has to have this data type.
And that means that an attacker has to meet a really specific format before they can perform any type of abuse.
And it helps you put some structure in place and weed out a lot of your basic attacks.
With these generative AI APIs, because they're free form, you really have no structure in place.
And so you're accepting input from basically anyone.
And the input you're accepting is basically anything. And so the number of attack vectors can be exponential.
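To make the contrast with a schema concrete, here is a minimal sketch of the kind of check a structured API can apply and a free-form prompt API cannot. The schema, its three parameter names, and the `validate` helper are invented for illustration; they are not the schema of api.cloudflare.com or of any Cloudflare product.

```python
# Hypothetical schema: parameter name -> required Python type.
PRODUCT_QUERY_SCHEMA = {"product_id": int, "currency": str, "include_reviews": bool}


def validate(params: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the request matches the schema."""
    problems = []
    for name in params:
        if name not in schema:
            problems.append(f"unexpected parameter: {name}")
    for name, expected in schema.items():
        if name not in params:
            problems.append(f"missing parameter: {name}")
            continue
        # type() rather than isinstance(), so that True is not accepted where an int is required.
        if type(params[name]) is not expected:
            problems.append(f"wrong type for {name}")
    return problems


well_formed = validate(
    {"product_id": 42, "currency": "USD", "include_reviews": False},
    PRODUCT_QUERY_SCHEMA,
)
malformed = validate(
    {"product_id": "42 OR 1=1", "currency": "USD", "include_reviews": False, "debug": 1},
    PRODUCT_QUERY_SCHEMA,
)
```

The malformed request is rejected before any business logic runs. A generative AI endpoint whose single parameter is arbitrary text has no equivalent structural filter, which is the difficulty being described.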
So the thing that immediately pops to mind is I type in drop all from, right?
Instruct the AI to drop its own database, which I'm sure there's not just some simple secret database behind all of this.
But I understand what you're saying is that there is no scope to the input that can come into the AI engine.
It is, you're explicitly saying, type whatever you want.
That is the whole purpose of these AI tools is it's free form language in the context of the user that's being typed in.
And the AI is smart enough to figure out, oh, I understand what you mean.
Let me interpret that and deliver back the response you're looking for.
Which from a security perspective is a nightmare.
Because even from a privacy perspective, I've seen news of struggles where they're trying to stop people from asking certain questions that have certain political or certain issues.
And so people are getting around that by telling the chatbot to say, this is what I want you to do.
And then sneak in the query inside the context of another context, which is incredibly hard.
So yeah, I see how this is super difficult.
So what advice then, knowing that essentially you have this massive model of data, this really very kind of exploding industry where people are all over this trying to find new use cases for this type of API.
What advice do you have for developers that are building new applications based on top of this to help them?
And I think there's two great impacts here.
One is just the general security issue, right? The threat vectors, people trying to do a denial of service or trying to get it to do something that will break it or something of that nature.
But also I think I remember you talking about in the blog post, there are cost issues here.
That normal APIs, you hit them and it's a small amount of data that's returned and it may be a small amount of compute on the back end.
Well, when you've got these very big open queries, it requires a lot of processing time on the back end to actually deliver an answer.
Yeah. And that cost aspect is really the biggest thing that the customers we've talked to are concerned about.
And so think of like a traditional API, like an e-commerce site will have an API that if you query it for product information, it'll return some attributes about a product.
As an e-commerce site, you don't want people scraping that data, that data is valuable.
However, if I'm an attacker and I call your API 10 million times in a day, that's a few cents total of cost impact to you, right?
It doesn't really matter. Whereas if you think something like we have customers where the response is they'll generate video and that video can be like up to 10 cents per API call in terms of compute cost to generate.
And so think about an attacker that comes in and generates 4 million additional API calls that they're not paying for.
That is a massive cost impact.
And it's really unique because the vast majority of APIs that all of our customers are using today, they don't have this unique cost aspect that generative AI APIs can have.
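The cost figures mentioned above can be worked through directly. This sketch uses only the two numbers from the discussion (up to 10 cents of compute per call, 4 million unpaid calls), so the result is an upper bound; the function name is made up for the example.

```python
CENTS_PER_CALL = 10        # "up to 10 cents per API call" from the discussion
UNPAID_CALLS = 4_000_000   # attacker-generated calls from the discussion


def abuse_cost_dollars(calls: int, cents_per_call: int) -> float:
    """Compute cost in dollars for a given number of calls."""
    return calls * cents_per_call / 100


upper_bound = abuse_cost_dollars(UNPAID_CALLS, CENTS_PER_CALL)
print(f"${upper_bound:,.0f}")  # prints "$400,000"
```

At that rate, 4 million unpaid calls cost up to $400,000 in compute, compared with the "few cents total" cited for 10 million calls to a conventional product-lookup API.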
Right. And of course, you want to get this software into people's hands to get the data, to get the use cases.
So you really don't want to limit them in any way.
You want to attract all these queries to kind of help you build and help you get exposure and usage.
So I think the number one recommendation we'd make is like there is no silver bullet.
There's no one neat trick.
So in the blog post, I think we had 11 different recommendations, but I think my number one is like, you're going to have to try a multitude of different approaches and you really want to have a layered defense in depth because your attackers are going to be trying so hard to get around them.
Right. One of the things that I would recommend is you're not going to be able to set up like token authorization for every single user because you have so many, but what you can do is track users that are calling the API and set up some kind of smart quotas and do rate limiting based on something other than just user ID.
So something like a fingerprint.
And what that means is that is going to help you detect like if a valid user has their credential stolen, for example, and you can see that there's a spike in calls coming from a particular login.
So when you say fingerprint, so we're talking, let's say there's an anonymous request coming to the API.
It's unauthenticated.
What do you mean by a fingerprint? What type of information is used in that?
Yeah. Like we want to try and do a sticky identifier. And so you can look at things like browser plugins.
You can look at things like ASN, like IP, like other browser characteristics and request characteristics of where that request is coming from.
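A rough sketch of rate limiting keyed on a fingerprint rather than a user ID follows. The choice of signals (IP, ASN, user agent, accept-language), the hashing scheme, and the `SlidingWindowQuota` class are all assumptions made for illustration; real fingerprinting uses many more signals and this is not how any Cloudflare product computes one. The IP and ASN below come from ranges reserved for documentation.

```python
import hashlib
from collections import defaultdict, deque


def fingerprint(ip: str, asn: int, user_agent: str, accept_language: str) -> str:
    """Hash a few request characteristics into a sticky identifier (illustrative signal set)."""
    raw = "|".join([ip, str(asn), user_agent, accept_language])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


class SlidingWindowQuota:
    """Allow at most max_calls per key within any window of window_seconds."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        calls = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


quota = SlidingWindowQuota(max_calls=3, window_seconds=60)
fp = fingerprint("203.0.113.7", 64500, "ExampleBrowser/1.0", "en-US")
decisions = [quota.allow(fp, now=t) for t in (0, 10, 20, 30, 61)]
```

With a limit of three calls per minute, the fourth call at 30 seconds is refused and the call at 61 seconds is allowed again because the first timestamp has left the window. Because the key is derived from the request environment, the quota applies even when the request is unauthenticated.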
So like a network and the environment as opposed to an authenticated user.
Correct. That is kind of step one though. And in this case, what I'd say is what's going to be much more useful is actually tracking what happens like post connection or post login.
Like after the user hits the first path, what are they doing on their second API call on their third API call.
And so how we describe this is sequential analytics.
And that means like understanding the flow of what is a particular user doing when they call your APIs multiple times in sequence.
And that flow is really useful in identifying potential abuse.
And so for example, you might have an API that is generally called like four times per hour.
And then all of a sudden you see that spike to eight times in five minutes.
That is not like a large spike that's going to have any DDoS type of implication, but it is unnatural from the way that your users generally behave.
And if you track that granular detailed level of what sequence and frequency at which users are calling multiple APIs in conjunction, that's kind of the level of granularity you have to get to, to understand is abuse happening.
Because we see a lot of the abuse is generally low and slow and distributed.
And that means it's going to be hard to put in like any standardized rules, like set one specific rate limit is generally not going to be super useful.
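The frequency half of that example can be put in numbers. This sketch normalizes the observed burst to calls per hour and compares it with the baseline; it does not attempt the sequence analysis itself, and the `tolerance` multiplier is an assumed tuning knob, not a value from the discussion.

```python
def calls_per_hour(call_count: int, window_minutes: float) -> float:
    """Normalize a call count observed over a window to an hourly rate."""
    return call_count * 60 / window_minutes


def is_unusual(observed_per_hour: float, baseline_per_hour: float, tolerance: float = 5.0) -> bool:
    """Flag a rate more than `tolerance` times the baseline (tolerance is an assumption)."""
    return observed_per_hour > baseline_per_hour * tolerance


baseline = 4.0                                  # "generally called like four times per hour"
observed = calls_per_hour(8, window_minutes=5)  # "eight times in five minutes"
ratio = observed / baseline
flagged = is_unusual(observed, baseline)
```

Eight calls in five minutes works out to 96 calls per hour, 24 times the baseline. That is far too small to matter for DDoS purposes, yet clearly outside normal behavior, which is why a single fixed rate limit would miss it.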
It's almost like you need a dynamic hysteresis based rate limit model that allows you to kind of have, you know, follow a trend, you know, allow access to a certain point and then limit it as it kind of ebbs and flows whilst, you know, because legit, like I'm just trying to think of my own use here of ChatGPT.
I might be working on a project and all of a sudden I'm kind of hitting it with, you know, quite a lot of questions in a short space.
And then I get my answer, understand the problem, and then I don't touch it again for another couple of days.
So my own usage is going to have quite a few blips. So obviously there's going to be a fine tuning of which of those blips you determined to be legitimate and which of those blips you determined to be satisfactory.
Yeah. And the benefit here is when you have an application with millions of users, like your individual patterns may be fairly spiky, but if you compare you to millions of other users, you can come up with generally a distribution of what real usage looks like.
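One way to turn "a distribution of what real usage looks like" into a threshold is to take a high percentile of per-user call counts. This is a sketch under stated assumptions: the data is synthetic, the 99th percentile is an arbitrary choice, and `usage_threshold` is a hypothetical helper rather than anything Cloudflare has described.

```python
import statistics


def usage_threshold(calls_per_user: list[int], percentile: int = 99) -> float:
    """Return the given percentile of per-user call counts (needs at least two users)."""
    cut_points = statistics.quantiles(calls_per_user, n=100, method="inclusive")
    return cut_points[percentile - 1]


# Synthetic population: 101 users who made 1, 2, ..., 101 calls in the same period.
population = list(range(1, 102))
threshold = usage_threshold(population)

spiky_but_normal = 50 > threshold   # a busy afternoon still sits inside the distribution
far_outside = 500 > threshold       # well beyond what almost any real user does
```

An individual user's bursts are judged against the whole population, so a legitimate spike stays under the threshold while usage far outside the distribution stands out.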
Right. And so you work in application security at Cloudflare. What kind of products or techniques within Cloudflare relate to this type of security concern?
Yeah. I go back, I mentioned this a few minutes ago, but I think the two answers are one, it depends and two, all of them.
So like generally when we're working with customers that have these popular tools is that we have to throw lots of different application security tools at them.
So we can look at generating really unique session identifiers for an API.
We can look at mapping the sequences, like I said.
The other thing that's often really effective here is having an option to slow down abusers.
So I think generally we will talk about stopping abuse as blocking traffic.
But remember the thing that we want to prevent here is we want to make it uneconomical for someone to abuse your API at scale, right?
You don't have to stop them from ever hitting it, but if you can stop them from hitting it with a frequency and a scale that they can then go and resell and become profitable, that's really what you want to slow down.
And so for example, you could put in place, like we are going to do something to challenge users.
So human users can get through. It's going to be a bit of a pain for them, but let's say a 15 second delay for a human user doesn't really matter to them if you're logging in one time, but that delay for someone who is trying to scrape your API and then kind of build a product that takes the responses from that API and resells it, that is going to make that uneconomical at scale.
The other thing you could do is like we have a waiting room product where you can put a delay in place and actually randomize that delay.
So like I am just not even going to challenge users, but to some random frequency, I'm going to make the user pause for eight seconds, for 15 seconds, et cetera.
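The randomized pause can be sketched as follows. This is not the waiting room product's configuration or API; `pick_delay`, the 30% probability, and the 8 to 15 second range (taken loosely from the numbers mentioned above) are assumptions for illustration.

```python
import random


def pick_delay(
    rng: random.Random,
    delay_probability: float = 0.3,
    min_seconds: float = 8.0,
    max_seconds: float = 15.0,
) -> float:
    """Return 0 for most requests and a random pause for an unpredictable subset."""
    if rng.random() >= delay_probability:
        return 0.0
    return rng.uniform(min_seconds, max_seconds)


rng = random.Random(2023)  # seeded only so the example is repeatable
delays = [pick_delay(rng) for _ in range(1000)]
delayed = [d for d in delays if d > 0]
```

Because both the selection of requests and the length of the pause are random, a scraper cannot learn a fixed pattern and tune its request timing around it, while an occasional pause costs a human user very little.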
Gotcha. So anybody who's trying to abuse the system can't learn the behavior and then optimize their abuse to the behavior.
Wow.
That's quite powerful. So another thing that I kind of am thinking while you talk about this is, is it possible to actually use AI to help detect and analyze the illegitimate traffic in the first place?
Because I know that Cloudflare themselves, we have our own models which help us look at and understand what is legitimate versus illegitimate traffic.
Can you end up with the phrase, we use AI to protect our AI?
Yeah, it's funny. We have a lot of internal discussions at Cloudflare around AI versus machine learning, right?
And so AI is the trendy word that everyone wants to talk about.
I would say that we heavily use machine learning at Cloudflare, which generally machine learning is you are training models that do pattern recognition and finding either particular patterns or anomalies to normal patterns, which is different from AI, which is generally generating, it's not totally unique, but generating unique responses.
And so at Cloudflare, we have multiple teams that are heavily using ML.
And so the vast majority of the time when we are finding anomalous traffic patterns, we are doing that not based on a heuristic, but based on, we looked at the traffic either across the millions of sites that Cloudflare protects, this traffic pattern is anomalous, we're going to flag it.
Or in some cases, we looked at the traffic to your particular API endpoint, and we see that this type of traffic is anomalous to your endpoint.
So we use those heavily today. We have multiple machine learning teams, and we generate lots of ML models that we will use that are targeted to particular attack vectors.
Right. So if I was a developer today working at a well-funded startup that's kind of in this space and you piqued my interest, let's say I've read your blog article, I've gone through a lot of these high levels and understand it.
If I was very keen on understanding how Cloudflare can help and how I can put some of these products in front of my API to increase my confidence of protecting the platform, what would you suggest I do next?
Is it continue to read your blogs, follow Reid on Twitter?
What would be the next step to get into the second level of detail to understand this more?
Yeah. Start with the blog post. I think we tried to give more detailed links to each of those.
So I think I'd recommend, one, look at the recommendations we gave.
If any of those jump out to you as, oh, these are pretty reasonable, we should have a link that goes out to how you can actually enforce that specific recommendation.
And then if you try some of those and they're not working, reach out to us.
If you're a small site, go onto our Discord.
We love getting in the weeds with community members and we get lots of our great ideas from community members who recommend some edge case that we're not protecting well.
If you're someone that's an enterprise customer that has an account team at Cloudflare, talk to that account team and then we'll come back and come to you with recommendations.
But yeah, we love hearing from the community and our Discord community is a pretty good place for you to go make recommendations.
Discord. Okay. That was the thing I was going to ask next.
Where's the best place to go to be a part of that community? And it sounds like we've got a Discord server that if you're not familiar with, go hit up.
That's developers.cloudflare.com is the best place to find that information.
Yeah. That's where you'll find all of our developer docs and then you can get links to where you can ask more specific questions.
That's great. Well, thanks very much to both of you for joining us today.
That's all we have. And anything, Dina, you want to say?
Is there any comment you'd like to leave us with? No, just check out our blog post and sign up to learn when the secret store is ready to use.
All right. Smashing.
All right. Well, thanks so much, guys. I hope you have a good week. Thanks a lot, Simon.
Take care. Thank you. Bye.