Ask a Solutions Engineer
Presented by: Kabir Sikand, Jason Farber
Originally aired on August 17, 2020 @ 4:30 PM - 5:00 PM EDT
Get ready for a live Q&A session with Cloudflare's Solutions Engineer team, who will be ready with answers, expertise — and unparalleled whiteboarding skills. Send technical questions about Cloudflare products (or the Internet in general!) to [email protected]
English
Q&A
Technical
Transcript (Beta)
Hi Jason. So we are doing Ask a Solutions Engineer right now. This is a recurring series that we've been doing with a variety of solutions engineers across the Cloudflare org.
So today we have myself, Kabir Sikand. I'm a solutions engineer. I've been here for just north of a year.
And Jason.
Hi, I'm Jason Farber. I'm a solutions engineering manager, and I'll be approaching my four-year mark in December.
Great.
So I'm going to share my screen and we can go through a few of the questions that we've gotten in over the past few weeks.
But for anyone watching, if you do have questions, feel free to send over an email.
You'll see the email addresses at the bottom of your screen below the video player.
And we'll try to get to those as well, either today or in a future segment.
All right.
So one of the first questions we got in, I'm migrating a few dot cloud TLDs and was wondering if Cloudflare would support these.
So I think to answer this one, there are a few things that we can support and a few things we wouldn't with regard to dot cloud.
So on our registrar product, we do support a variety of TLDs. Dot cloud is not one of them quite yet.
We do have more information on that on our website.
But with regards to just proxying that traffic and providing the rest of our services, any of these TLDs would certainly work.
Anything that's really DNS routable is something Cloudflare can support on our edge.
We would just advertise our IP addresses instead of your origin and the traffic would come over to us.
We could proxy through any of our layer seven services and then send that traffic back to your origin.
Anything else to add on that one, Jason?
I think you nailed it.
Just to summarize, we have the ability to proxy any TLD that you own. Making Cloudflare your authoritative name server provider and making Cloudflare the registrar for your domain are two different things.
So if you are interested in Cloudflare supporting dot cloud as a registrar, please reach out to us and send a support email.
We funnel all of the feedback that we see to the product team.
So the more requests we see for various TLDs, the more likely the registrar team is to prioritize them.
So definitely reach out if you have any requests.
Definitely a good recommendation there. So on to the next one. I actually read this one a little bit before we hopped on and it seems like a pretty interesting one.
There's definitely a few layers here, but just to shout this one out, can Cloudflare provide any assistance against spam, multiple registrations, using browser fingerprinting, and that kind of thing?
The classic case would be banning a user, only for them to sign up with a new account again.
Also in general, the IPs are dynamic, so they change every day and they can't necessarily be blocked effectively.
So a lot to unwrap here. I think first I'd say we can talk a little bit about the attack vector and then maybe dive into what Cloudflare can do to help.
So on the attack vector itself: multiple registrations can often harm a site because you're potentially paying for more traffic going to your site.
You have things like fraud detection systems that you need to put in place.
You don't want real users to lose trust in your system because they can't sign up as a result of a fraudster or a spammer signing up with their information.
In the cases of highly regulated environments, some of these organizations need to pay for validation mechanisms through third parties, and those can get expensive.
So there's a lot that kind of goes into that. Anything else on the attack vector?
Well, depending on what type of data they're inputting into the email and password fields, that could be also a potential risk.
SQL injections, cross-site scripting injections.
But we'll try and focus on the specific use case that you guys had and not go crazy.
But as far as like paying for services that aren't for real users or real humans that are visiting your site, I mean, also ads is like a really common use case that I see.
Nobody wants to pay for services from bots, and ads are a big cost for many companies, especially if you're like e-commerce or some type of platform where you're selling a product.
Yeah, I actually used to work a little bit in the ad tech industry, and a big problem there was making sure that your impressions and your click-throughs are from real humans and not from spammers.
Otherwise you end up paying a large premium on the clicks and impressions you're getting, only to get a much, much lower return on those investments.
Yeah, so I used to work at another company that specialized in bot detection and protection.
And one thing I saw was when you were, and I know we're going on a small tangent, but like one thing I saw was that companies who were trying to get refunds from the ad companies by trying to show them that the users were not actually human, the ad companies would ask them for proof.
That's a really challenging problem.
So, okay, so moving on to how can Cloudflare help, and what are the challenges with this?
So I think the challenges are a couple of things.
One, you don't actually know who is visiting your site to create these new accounts.
So, for example, the products that Cloudflare has to help you stop this sort of problem would be a combination of our security products.
So bot management, for one. Bot management is one of the products we have that helps identify automated traffic.
So are you human or are you performing some type of task that's automated from a script or another tool?
And then on the flip side, are you a human who's maybe rotating IPs?
So like two different problems both have the same outcome where they're signing up for a new account.
One is easy for us to identify and the other one is slightly more challenging.
So if you are a malicious user who's scripting against a site, creating new accounts, bot management or rate limiting at Cloudflare would be able to help.
If you surpass a certain number of requests in a given time period, no problem, rate limiting handles that.
Bot management: are you coming from curl? Are you scripting from PhantomJS or Selenium?
Sure, bot management would detect that very easily.
Are you a human who's using a VPN to change your IP address, create new accounts, but you're using the browser like a regular person?
That's a little bit harder.
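To make the rate limiting idea above concrete, here is a minimal sketch of a fixed-window limiter. It is not Cloudflare's implementation: the class name, the per-client key (an IP address here), and the limits are all assumptions chosen for illustration.

```typescript
// Hypothetical fixed-window rate limiter keyed by a client identifier.
// Illustrative only; not how Cloudflare's rate limiting product is built.
class FixedWindowRateLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly counts = new Map<string, { windowStart: number; count: number }>();

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is within the limit, false once the client
  // has exceeded maxRequests inside the current window.
  allow(clientId: string, nowMs: number): boolean {
    const entry = this.counts.get(clientId);
    if (entry === undefined || nowMs - entry.windowStart >= this.windowMs) {
      // First request from this client, or the previous window expired.
      this.counts.set(clientId, { windowStart: nowMs, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.maxRequests;
  }
}
```

With a limit of 2 requests per second, a third signup attempt from the same address inside that second would be rejected. As noted above, this does nothing against a human who changes IP address between attempts.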
Anything to add, Kabir?
Certainly. So some folks that I've talked to in the past have worked at companies that specialize in that specific problem, and it's a whole set of problems there.
Tools that Cloudflare tends to build out are designed to stop risky threat actors from attacking your site in some capacity.
And that, as you touched on Jason, can be things like bots scraping your site or signing up for user accounts using some sort of automated system where it's really rapid or it's something that's very low cost for the attacker once they've built it out.
There's also things like botnets that can attack you. We have an IP reputation database and threat feeds that we consult to identify those.
There's DDoS mitigation for volumetric styles of attacks, and a web application firewall for anything that's attacking your actual application infrastructure.
But when you start to get into things like identifying a user using the data that you have, maybe that's very specific to your application and how people sign up in general versus how maybe a scammer or spammer might sign up, those types of fraud detection systems require very different types of information.
And they're usually very application aware, in my opinion, just because you need a lot of information from the application to do that.
And so since Cloudflare sits at the edge, we're able to be really, really effective in identifying that automated traffic and stopping that from ever getting to you.
But when you're talking about very context aware and application aware types of fraud detection, it's a much larger and much more difficult nuanced problem to solve.
Agreed.
So I know, Kabir, that you're also the Workers SME, is that right?
Yeah.
Have you seen any use cases where there might be a solution via Workers, maybe using the subrequest feature, where you can integrate a third-party API to enrich the request or inform the origin?
Have you seen any use cases like that that you can talk about?
Yeah. So I'll obviously be pretty generic about the answer there, but the Workers platform is pretty interesting when it comes to this problem. You can use data that Cloudflare has gathered while inspecting a request and running it through our security mitigations, including things like the bot management score, as well as data that you know about your own application that we're not necessarily always looking at, right?
So things like: who is the user coming in? Are there specific cookies or headers or data within the body of a request that you can use?
And then, as you said, are there third party services that are potentially integrated with your application?
And then kind of combine all of those at Cloudflare's edge using workers and decide, yes or no, is this someone I want to let through or is there some further mitigation I want to put into place or maybe a challenge that I want to implement?
Those are all things that you can do with the workers platform. And other things that we've started to see as we've released or talked about things like building more applications on the workers platform is, if you start to bring your application logic to the edge, now suddenly a lot of that context is there that wasn't previously that maybe was at your origin.
And so now you can start to enrich using the data that we have, the data you have, data maybe third parties have to start to make decisions around these things.
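A minimal sketch of that kind of combined decision, assuming three hypothetical inputs: a bot score on the 1 to 100 scale discussed in this session, an application-specific cookie check, and a verdict from some third-party fraud API. The field names, the thresholds, and the `decideSignup` function are illustrative assumptions, not a Cloudflare API.

```typescript
// All names and thresholds here are hypothetical, for illustration only.
type SignupDecision = "allow" | "challenge" | "block";

interface SignupSignals {
  botScore: number;               // 1 = almost certainly automated, 100 = almost certainly human
  hasKnownSessionCookie: boolean; // signal the application itself knows about
  thirdPartyVerdict: "clean" | "suspicious" | "unknown"; // from an assumed external API
}

function decideSignup(signals: SignupSignals): SignupDecision {
  // Very low scores are treated as automated traffic and stopped outright.
  if (signals.botScore <= 30) return "block";
  // A third-party warning is not proof, so ask for a challenge instead.
  if (signals.thirdPartyVerdict === "suspicious") return "challenge";
  // A borderline score with no application context also gets a challenge.
  if (!signals.hasKnownSessionCookie && signals.botScore < 50) return "challenge";
  return "allow";
}
```

In a Worker, a function like this would run before the request is forwarded, with the third-party verdict fetched through a subrequest.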
I think I've had a couple of customers who take the bot score and just pass it along to their origin.
They don't use Cloudflare to actually take any action on their request.
Yeah. And that's a common pattern.
Yeah. I think it's pretty common. Yeah.
And then you spoke about the bot score being something interesting. We haven't really dived into what the bot score is yet on this actual segment.
But I think it's important to give a one-sentence primer on this.
There's a scale from one to a hundred, a hundred being human and one being bot. If you get a one, you're definitely not human.
If you get a hundred, you're definitely human.
And you can see that score in workers and use it to write some custom logic.
Say like if the score is between one and 30, there's a pretty good chance that you are malicious or a bot or certainly automated.
It doesn't necessarily mean that you're maliciously automated.
It could be a good script that you just aren't aware of or something like that.
And you can use that as a condition to take some type of action.
Oh, and I'm not sure if you have any more thoughts on that, but I wanted to talk about how you can use Workers to read the bot score and funnel requests into a honeypot.
So this is like a really powerful function of workers with regards to new accounts, signups, registrations.
If you're not sure, for example if you get a bot score that's less than 50, you can send that request down a different path.
You can have an isolated origin: if you're an e-commerce site, you can return fake prices, or you can send them into a staging server that doesn't have any production data on it.
So lots of different things you can do with workers.
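The honeypot idea can be sketched as a small routing function. This is an illustration under assumptions: the two origin hostnames are placeholders, the threshold of 50 is the example figure mentioned above, and in a real Worker the score would be read from the incoming request's bot management metadata instead of being passed in as a plain number.

```typescript
// Placeholder hostnames; neither is a real service.
const PRODUCTION_ORIGIN = "https://origin.example.com";
const HONEYPOT_ORIGIN = "https://honeypot.example.com";

// Pick an upstream based on the bot score. A missing score is treated as
// "no signal" and falls through to production (an assumption of this sketch).
function pickOrigin(botScore: number | undefined, threshold: number = 50): string {
  if (botScore !== undefined && botScore < threshold) {
    return HONEYPOT_ORIGIN;
  }
  return PRODUCTION_ORIGIN;
}

// Build the upstream URL for a request path such as "/signup".
function upstreamFor(path: string, botScore: number | undefined): string {
  return `${pickOrigin(botScore)}${path}`;
}
```

A request to `/signup` with a score of 12 would be sent to the honeypot host, while one with a score of 75 would reach production.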
And yeah, I think it's pretty interesting.
Just the idea of using the heuristics that we build up over the time that we're inspecting and then doing custom things with them.
And it's all kind of a sandbox that anyone using the service can leverage.
As you said, folks might send this back to the origin and just use it as another heuristic in fraud detection systems that are on the backend, or simply as something that they can log and monitor over time, all the way to things like blackholing or honeypotting certain requests.
Yeah, so there's certainly a lot to talk about on this topic.
But I think just in the interest of time, we can try to get through a few more of the questions that came in this week.
So a bigger one here, or rather a broader one here, how does Anycast work?
So Jason, I think I'll let you start on this one and we can dive into a little bit of the nitty gritty, but there's definitely a lot to talk about.
This is kind of the backbone of how Cloudflare has built our network and our edge.
You stole my opening line. So I would say, if I was to describe Anycast to somebody, I would start by describing Unicast to them.
I think that's probably the concept that most technicians or people in the tech space are familiar with.
When you explore a CDN, I think almost all of us use Anycast to some degree.
It's super useful on many different levels.
So Unicast is communication from one device to another.
Each of those devices is the only one announcing its IP address to the world.
So you know that if you have a server located in San Francisco and you're somewhere else in the world and you connect to it, that you're definitely connecting to that server, point to point.
Anycast is the concept of advertising that same IP address from many different locations, as many locations as you want.
I'm not going to begin to try and deep dive into exactly how that works from a BGP perspective, but it uses BGP and it works by routing the user to the closest point of presence.
So we have 200-plus POPs. When you sign up on Cloudflare, you'll get a zone, you'll get a pair of IP addresses, and we'll advertise those IP addresses from every single one of our POPs.
So if you are in Europe, you're going to go to the closest POP to you because BGP will identify that location as the fewest hops away.
And that's how BGP actually operates.
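As a toy model of the behavior described above, the sketch below treats each POP as announcing the same address and picks the one with the fewest hops from the user. This is a deliberate simplification: real BGP path selection weighs several attributes, and the POP names and hop counts here are invented for illustration.

```typescript
// One candidate path to the same anycast address, as seen from a single user.
interface PopRoute {
  pop: string;  // illustrative POP label
  hops: number; // assumed path length from the user to that POP
}

// Every POP advertises the same IP; traffic lands at the one with the shortest path.
function selectPop(routes: PopRoute[]): string {
  if (routes.length === 0) {
    throw new Error("no route to the anycast address");
  }
  return routes.reduce((best, route) => (route.hops < best.hops ? route : best)).pop;
}

// Example: a user in Europe sees three announcements for the same address.
const fromEurope: PopRoute[] = [
  { pop: "SFO", hops: 9 },
  { pop: "LHR", hops: 2 },
  { pop: "FRA", hops: 3 },
];
const chosen = selectPop(fromEurope);
```

With these made-up hop counts, `chosen` is the London POP, even though the user connected to the same IP address that a user in California would.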
Kabir, do you want to talk about how Anycast helps with DDoS mitigations, for example, and maybe some other benefits?
Yeah, I was going to dive into that. So there's kind of a lot of really interesting side effects of this model.
One of the big ones and kind of the bread and butter of where Cloudflare as a company started selling services is around DDoS mitigation.
So inherently, a DDoS attack, a distributed denial of service attack, is generally built on some sort of a network of bots or hijacked computers or IoT devices, whatever it is, that all will send requests to the target from a large variety of sources distributed around the globe.
When you put an Anycast network in between that, and specifically an Anycast network where every single one of our data centers has the capacity to handle that type of request volume, we are able to distribute the load of a DDoS attack across our network.
And then from there, because we're also terminating SSL connections, we're able to look inside of a packet that's coming in and do inspection.
And we can identify the volume of attacks.
We have information around the reputation of the IP addresses coming in.
Is it a known botnet, things like that? And does it look like an attack?
And if so, we can just drop that at our edge and never have our users worry about or see any of those volumetric attacks come through.
So, the idea of distributing the network like this and distributing traffic definitely comes with good security benefits.
On the flip side, we also have benefits around the ideas of localizing things like cache and SSL termination.
So, getting really close to the user and making that experience of connecting to your website even better.
So, SSL termination takes a lot less time.
Fetching something from a data center that is 100 milliseconds from you versus something that is 1200 milliseconds from you is a way different experience.
You can start to make multiple requests in the amount of time that it used to take just to make one.
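The arithmetic behind that comparison can be written out directly. The function name is an assumption for illustration, and the model ignores everything except round-trip time.

```typescript
// How many back-to-back requests fit in a time budget at a given round-trip time.
function sequentialRequestsWithin(budgetMs: number, roundTripMs: number): number {
  if (roundTripMs <= 0) {
    throw new Error("round-trip time must be positive");
  }
  return Math.floor(budgetMs / roundTripMs);
}

// In the time one 1200 ms fetch takes, a data center 100 ms away can serve:
const nearbyFetches = sequentialRequestsWithin(1200, 100); // 12
```

So with the figures quoted above, twelve sequential requests to the nearby data center fit in the time of a single request to the distant one.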
So, you can see how all of these effects start to domino to build out the core of what Cloudflare's early and continued services are.
So, I think we could probably talk about Anycast for over 30 minutes, but we'll try to get through a few other ones as well.
So, is rate-limited or blocked traffic included in monthly bandwidth usage?
And this one kind of gets a little bit into how Cloudflare prices traffic and is also related to that DDoS question or that DDoS piece that we just talked about.
So, a side effect of us hosting many, many, many sites is that we will see many, many, many attacks.
And we've gotten pretty good, very good at blocking those attacks.
We've given our users many different tools they can use to block attacks, including things like rate limiters, web application firewalls, and the automated DDoS mitigations we have.
And since our mission statement is to kind of help build a better Internet, we don't want our users paying for that.
On one side, obviously, we could be charging our users for the amount of traffic that we're blocking, and that is potentially a missed opportunity.
But I think it really helps get to the mission statement. Let's build a better Internet.
Let's not encourage attackers to attack our users. Let's build a wall in between and kind of help them only see the clean traffic and only pay for that usage.
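A toy model of "only pay for the clean traffic" might look like the following. It is only a restatement of the principle described above: the record shape and action labels are assumptions, and this is not how Cloudflare's billing is actually computed.

```typescript
// Hypothetical traffic record; the action labels are illustrative.
interface TrafficSample {
  bytes: number;
  action: "allowed" | "blocked" | "rate_limited";
}

// Sum only the traffic that was allowed through to the customer's site.
function billableBytes(samples: TrafficSample[]): number {
  return samples
    .filter((sample) => sample.action === "allowed")
    .reduce((total, sample) => total + sample.bytes, 0);
}

const monthOfTraffic: TrafficSample[] = [
  { bytes: 500, action: "allowed" },
  { bytes: 90000, action: "blocked" },     // attack traffic dropped at the edge
  { bytes: 3000, action: "rate_limited" }, // over the configured request limit
  { bytes: 700, action: "allowed" },
];
const billed = billableBytes(monthOfTraffic); // 1200
```

In this made-up month the attack dwarfs the legitimate traffic, yet the billed figure reflects only the two allowed samples.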
Yeah. I think it's also really comforting for our customers that, you know, nobody wants to be hit with a big bill at the end of the month.
You know, you sign up on Cloudflare, you're protected by Cloudflare.
And then if you get hit with some massive attack, it's okay.
It's just, you know, you're not going to get charged for it.
So I think that's a huge plus where we differentiate ourselves as well in the market.
Predictable billing is, I think, something that all finance people look for.
Yeah. And certainly, you know, this helps our customers know that we're incentivized to block traffic.
We don't want to pay for that traffic ourselves because, of course, the cost is not being passed on, right?
So it's something that we need to just make sure that we're continuing to do better and better.
So this question, does Cloudflare have mitigations against competitors or copycats duplicating my website?
I think this gets towards a few of the things that we talked about in our second question around kind of mitigations that we have around scammers and things like that.
But this really touches on a specific type of attack, which is content scraping.
So many botnets, competitors, or copycat sites might come to your site and look for things like, are you having a sale?
They'll run their bots against all of their competitors in the same retail space, scrape the content to see if you're having a sale and what the sale prices are, and maybe try to undercut you by 10% or 5%.
So this is something that's really big in the retail industry, certainly, and in many others.
But our bot mitigation service can definitely help against a lot of these automated versions of these attacks.
Yeah, actually, I have an SE on my team that will demo this exact use case when talking to prospects about bot management at Cloudflare.
So they run a script that starts crawling a site, iterating on every link that comes back on each of the pages, and then builds a version of the site locally.
And you just open up the main HTML locally in your browser, and it's actually a copy of the website.
And it only takes maybe like, you know, 30 seconds or a minute, depending on how big the site is.
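The crawl described in that demo is essentially a breadth-first traversal of a site's links. The sketch below shows that traversal over an in-memory map of pages, which stands in for fetching and parsing real HTML; the `Site` type and the page paths are assumptions for illustration, and this is not the script used in the demo.

```typescript
// Page path -> links found on that page. A stand-in for fetched HTML.
type Site = Record<string, string[]>;

// Visit every page reachable from the start page, each exactly once,
// in breadth-first order.
function crawl(site: Site, start: string): string[] {
  const visited = new Set<string>([start]);
  const queue: string[] = [start];
  const order: string[] = [];
  while (queue.length > 0) {
    const page = queue.shift() as string;
    order.push(page);
    for (const link of site[page] ?? []) {
      if (!visited.has(link)) {
        visited.add(link);
        queue.push(link);
      }
    }
  }
  return order;
}

const demoSite: Site = {
  "/": ["/products", "/about"],
  "/products": ["/", "/products/widget"],
  "/about": [],
  "/products/widget": ["/products"],
};
const crawled = crawl(demoSite, "/");
```

The request pattern this produces, many pages fetched in quick succession while following every link, is the kind of automated behavior that bot management is meant to flag.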
So yes, we have bot management.
And it's a really interesting use case.
Yeah. And one of the interesting things about that bot management solution, I think, is that instead of making assumptions about what's good and bad, it recognizes there might be use cases where you actually do want to copy your site, or have a bot script against it.
And that could be something like integration testing.
We would just allow you guys, as a customer, to use tools like our firewall rules engine to punch holes into that firewall, and allow your own testing services to hit your site, while still blocking things that are malicious to you.
So it's kind of like putting you guys in the driver's seat as a customer at Cloudflare.
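The "punch holes in the firewall" idea can be sketched as a rule evaluation where an allowlist is checked before the bot score. This is a simplified illustration, not the firewall rules engine's actual syntax or semantics: the type names, the threshold of 30, and the addresses (taken from documentation ranges) are all assumptions.

```typescript
// Hypothetical view of an incoming request.
interface IncomingRequest {
  ip: string;
  botScore: number; // 1 = automated, 100 = human
}

type FirewallAction = "allow" | "block";

// Allowlisted sources (for example, your own integration tests) are let
// through even though they are automated; everything else is judged by score.
function evaluateRequest(
  request: IncomingRequest,
  allowlistedIps: ReadonlySet<string>,
  botThreshold: number = 30,
): FirewallAction {
  if (allowlistedIps.has(request.ip)) {
    return "allow";
  }
  return request.botScore < botThreshold ? "block" : "allow";
}

const testRunners = new Set<string>(["192.0.2.10"]);
```

Here a scripted test runner at `192.0.2.10` with a score of 2 is allowed, while the same score from any other address is blocked.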
Right. So I think we have time for maybe like one, two more, if we're fast.
Yeah. Yeah, let's try to do this one. So this should be a pretty quick one. Can you increase origin response timeout?
The answer is yes: on the Enterprise plan, we're able to increase that origin response timeout.
But in general, as a consultative individual, I'd want to look at what the reasoning is and ask whether there are other optimizations we can make to help with how long the origin is taking to respond.
And certainly, there's a lot of avenues that can come out of that.
I have one more thing to add on that last one.
Yeah, just real quick, with our recent announcement of Workers Unbound: if you have a process at your origin that's actually causing the really long request cycle, you might want to consider putting it in Workers, right? You can just have a process that runs in Workers for an unlimited amount of time, and you'd be good to go.
Yeah, maybe, do you have anything you want to add on that, since you're the Workers SME?
Yeah, no, that's a really good point. So I think that touches on a few things.
And one of the big ones is there's an idea in the software development world, and I may get it wrong, because it's been a few years since I've been in that world.
But the idea was around kind of shadow development.
So when you wanted to replace a service, you in the past really needed to like build up your new replacement while still serving the kind of older version of that.
And with a tool like Workers, once you've built out that replacement, you can put it in front of the origin, at the edge, and decide how much traffic goes through that particular new version of the service and have the rest go back to your origin.
So there are various ideas along those lines where you can start to build out new pieces of functionality that replace old monolithic pieces of code, maybe at a specific endpoint that's taking too long to respond, and start to optimize it.
Yeah. And then, as you said, you can let the Workers process start to take that and respond to the user in a more timely manner.
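One way to decide how much traffic goes to the new version is a deterministic percentage split. The sketch below hashes a stable key (a user or session identifier, assumed here) into one of 100 buckets, so the same user consistently lands on the same side. The function names, the choice of hash, and the key are illustrative assumptions.

```typescript
// Map a key to a stable bucket in the range 0..99 using a 32-bit FNV-1a style hash.
function bucketFor(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value as unsigned 32-bit
  }
  return hash % 100;
}

// Send roughly edgePercent of keys to the new edge version, the rest to the origin.
function routeTo(key: string, edgePercent: number): "edge" | "origin" {
  return bucketFor(key) < edgePercent ? "edge" : "origin";
}
```

Starting `edgePercent` at a small value and raising it over time shifts traffic gradually from the old endpoint to the replacement, and setting it back to 0 sends everything to the origin again.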
I think we have time for the last one.
Shall we?
Yeah. You want to give it a go?
Yeah. So what is BYOIP? This is an acronym you guys may have seen on our blog recently.
This is bring your own IP. What this allows you to do as a customer at Cloudflare is bring an IP address to our edge.
And then users will then connect to Cloudflare instead of your origin.
And that allows you to take advantage of all of our services without having to have your partners change IP firewall entries or DNS entries that point to things like that.
10 seconds left.
Yeah. Well, thank you.
Do you have one more thing you want to say?
No, I was just gonna say, thanks for the time today, Jason.
You too.