Deploying Complete Web Experiences with Workers
Presented by: Kristian Freeman, Gabbi Fisher
Originally aired on June 4, 2022 @ 7:00 AM - 8:00 AM EDT
Best of: Cloudflare Connect NYC - 2019
Session 1
Deploying Complete Web Experiences with Workers: Learn how to leverage Cloudflare Workers to deploy custom code and applications to the network edge, brought to you by members of the Cloudflare Workers developer experience team.
Session 2
Introduction to Bot Management: Stop the bots! Cloudflare Director of Product Sergi Isasi provides an introduction to Bot Management.
Transcript (Beta)
and they're going to talk about how to use Cloudflare with workers. So let's get started.
Okay. Hello, everyone.
Can you hear me? Oh, hot mic, hot mic. So we're going to be talking about a couple different things today.
I think the official name of this talk is, like, developing Cloudflare with workers or something.
Like everyone's kind of said up until now, we dogfood a lot with Workers, and we have a lot of different things, especially around Workers Sites, that we've done with this.
We're going to kind of cover some of that here.
So real quick, does that button work? Oh, that's the Zoom button.
Okay. So my name is Kristian Freeman. I work as the developer advocate for Workers.
You can find me on Twitter at @signalnerve.
And I'm Gabbi Fisher.
I'm a systems engineer on the Workers developer experience team, and you can find me pretty much everywhere as gabbifish.
So we're both on the developer experience team.
And I think I would describe what we do as, like, trying to build cool stuff with workers.
And that's how a lot of this stuff kind of happened. So Steve mentioned in the last talk this idea of Jamstack.
How many people in this room know what Jamstack is?
Okay. That's a pretty good number. So we're going to be talking about what Jamstack is and why Workers Sites, in particular, turns out to be an exceptionally good fit for it as an architecture pattern.
I kind of describe it as a new way of writing applications.
Well, I don't know about new.
In a lot of ways, it's actually kind of old school. But particularly, like, in this sort of new stack definition, it's applications that are performant, scalable, and easy to debug.
So the Jam, it always makes me hungry when I think of Jamstack.
But it actually stands for JavaScript, APIs, and Markup. I'm going to go through each of those and talk about the specifics.
And then we'll talk about how Workers, and Workers Sites in particular, is a really great fit for implementing this stuff.
So first, starting with JavaScript. So the idea here, and this is, like, from the Jamstack site, but we'll kind of put our own spin on it on the workers' platform.
So any dynamic programming during the request-response cycle, like, you've seen a lot of examples, I think, in this track about, like, requests and responses.
So that should feel pretty familiar. But in this case, it's actually talking about the client.
So this is, like, front-end frameworks like React or Vue, or if you want to, like, write your own vanilla JavaScript framework.
I know some people are, like, extremely not into React and Vue.
That's all fine. The idea here is, you know, there's no server running. You know, I don't have a Node server.
I don't have, like, Ruby or something like that. It turns out, you know, this works really well with a serverless architecture.
Right?
We are basically trying to just send things to the client as much as possible.
We also have APIs. So server-side processes, database actions, all that kind of stuff is abstracted into reusable APIs, and it's just accessible over HTTPS.
Again, with Workers, you know, the way you talk to a Worker function is over HTTPS, so it works really well for Jamstack as well.
An important call-out here is this can be custom-built.
So, you know, I'm writing my own workers function. That's one way of doing it.
Or you can leverage third-party services. So it seems like a lot of people here know about Jamstack.
You probably also know that there are, like, a million Jamstack tools.
There's CMSs and observability, all this stuff going on in the Jamstack space.
That's a really big part of the API section of this as well.
The final part is markup. So there's this idea of particularly pre-building your assets and pre-rendering stuff like that.
So say, for instance, I'm using a site generator, something like Gatsby or Hugo.
We have this idea of, you know, on my local machine, I'm going to try and build out as much of this as possible before sending it up to Cloudflare, because that kind of stuff is expensive, right?
So I want to take care of it locally and then just throw some simple static HTML up to wherever the site's going to live.
Of course, you can kind of guess where the site's going to live. It's going to be Cloudflare Workers.
So this is just kind of a couple of the tools, a lot of tools I think that people know.
So things like Gatsby, which has been mentioned a couple times.
I am pretty partial to Gatsby myself. I think it's a really awesome tool.
Hugo, which we use for our Workers docs, which, as Ashley mentioned, are open source, so that's all just Hugo and Markdown.
It's very straightforward.
Jekyll, Create React App. We have a tutorial in our docs for deploying a React app to Workers Sites.
Has anyone done that? A couple of people.
So we think that's going to be a really big one as well. Anything that builds out to just HTML, JavaScript, and CSS is a perfect fit for Workers Sites.
So just real quick, to talk about kind of the workflow here. Like I said, it's been actually really interesting to be sitting and kind of soaking in all the talks where everyone has been talking about the same sort of thing.
Steve mentioned the eyeball thing is kind of creepy, I agree. But the idea is we have clients, right?
We have this client, and in the past we had a server somewhere.
Let's say it's like US East 1 or some kind of big scary warehouse sounding name like that.
All of my stuff lives there. I have some bucket or something like that.
Did I turn on your Google Assistant? Awesome. I turned on Siri.
So all of that stuff is going and living at some bucket somewhere.
And then me as the developer, I'm spending much of my time writing code, I'm deploying, I'm doing all this sort of classic developer stuff.
And what Jamstack does is it tries to sort of simplify that.
I'm going to try and use the word easy and have it be okay in this context because I think Jamstack really does try and make this stuff easier for you.
But the idea is I have my website code, I'm going to do all this stuff to make my application ready for, say, a new deploy.
I'm going to build it locally, and I'm just going to send up that HTML, CSS, JavaScript to a CDN.
I think you probably all know what CDN I'm talking about.
Cloudflare again. And the idea here is that a user is going to make a request to the CDN, not to an origin.
So Steve talked about originless.
Jamstack is a really great fit in that way as well for the kind of stuff we want to do.
So for me, I spent a lot of time in the last couple years building this sort of stuff.
And then about a year and a half ago, I heard the term Jamstack, and I was like, what is that?
And I realized it was like everything I'd been working on for the last couple years because I really don't like infrastructure stuff.
I have like a Ruby on Rails background, and then I kind of messed around with Kubernetes, and I was like, no, I'm good.
So all this stuff has been really, for me, a really great fit because I just want to have a folder that's called build or something.
I just want to put it somewhere, and I just want to step away from it as much as possible.
And like I said, Cloudflare Workers in particular is a really great fit for this.
So in the past, we already had a really great platform for serverless functions.
I can take my JavaScript, I can put it up somewhere, a request comes in, and that JavaScript gets run.
That's pretty easy for me to understand.
But we didn't really have a great way until recently to store HTML, CSS.
Steve kind of started paving that path with his own site, which was really cool, but we wanted to have kind of a better solution for that.
So Gabbi's going to talk about that.
So lots of preexisting static site hosting services also exist, such as options with Google Cloud Platform, Amazon Web Services, Zeit, or Netlify.
And if you wanted to use those hosting solutions with Cloudflare as your CDN, you would have to maintain two services.
And that kind of begs the question, could we provide that hosting service to customers as well?
And if we helped customers with asset hosting, could we help customers reduce their dependencies on multiple services and be able to manage their site deployment and their CDN configurations in one place?
This idea gave way to Workers Sites, our serverless site hosting functionality, which we're excited to talk to you about today in more depth.
Workers Sites empowers users to deploy both their static site assets and their dynamic Workers functions in one place, so that they can build interactive web experiences on top of our serverless technology and deploy these sites to our global network, where they are quickly reachable by users all around the world.
Let's briefly go over how Workers Sites is built on top of the Workers platform.
So there are a couple of building blocks we were able to use to create the service.
So when it comes to hosting static assets, we actually have an underlying storage resource for that, and it's been around for a while, and that's going to be Workers KV, which Steve just talked about.
Theoretically, anybody could save their site assets, from HTML pages to images, in Workers KV and serve them from there.
But while all of these components have a place to live, there was no straightforward or standardized way of actually serving these assets from KV in response to requests to a Worker.
Some advanced Workers users even started rolling their own complex logic for serving and even rendering content from Workers KV.
We wanted to see if there was a way that we could simplify this logic and help all Workers users leverage Workers KV to actually make their websites.
And maybe after Ashley's talk, this friendly crab will look familiar to you, Ferris.
It's our mascot for Wrangler, the official Cloudflare Workers command line interface.
We added a hat with a WebAssembly chip in it and a lasso to cowboy Ferris.
So this became another building block for Workers Sites because we could use Wrangler to wrangle some baseline logic for serving sites and abstract that logic away from users to make the process of deploying a website to Workers as simple as possible.
Our third building block is our Workers Cache API, which allows Workers to directly cache request and response pairs as well as retrieve cached responses.
And you can directly access the cache from a Cloudflare Worker.
So we decided to leverage the Cache API to cache assets served from Workers KV and to further streamline the performance of a Workers site.
So this allows a Workers site to be really fast. And as Sonic the Hedgehog says in the trailer for the highly anticipated Sonic movie, I'm excited about it.
Gotta go fast. I'm glad that landed. I thought this would be a tough audience. So let's go over a high-level overview of what the Workers developer experience team built.
We'll start with what happens when a Workers user uses Wrangler to upload a Workers site.
So first, we need a place to actually store our assets. And in Jamstack land, remember that the M includes Markdown files that are generated into HTML, and other static site assets (tongue twister) that we serve to web browsers.
So we introduced logic in Wrangler to assist with the encoding, versioning, and uploading of assets to Workers KV.
This constitutes step one in this GIF.
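To make the versioning idea concrete, here is a minimal TypeScript sketch of one way an upload tool could derive a versioned key for an asset: hash the file contents and splice the digest into the file name, so that a changed file gets a new key. The helper names (`contentDigest`, `versionedKey`) and the FNV-1a-style hash are assumptions chosen for illustration; the talk does not say how Wrangler actually computes its keys.

```typescript
// Hypothetical helper: an FNV-1a-style 32-bit hash over the UTF-16 code units
// of the content, returned as 8 lowercase hex characters.
function contentDigest(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, then wrap to 32 bits.
    hash =
      (hash +
        (hash << 1) +
        (hash << 4) +
        (hash << 7) +
        (hash << 8) +
        (hash << 24)) >>>
      0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Hypothetical helper: insert the digest before the file extension, or append
// it when the file has no extension (or is a dotfile).
function versionedKey(path: string, content: string): string {
  const digest = contentDigest(content);
  const dot = path.lastIndexOf(".");
  const slash = path.lastIndexOf("/");
  if (dot <= slash + 1) {
    return `${path}.${digest}`;
  }
  return `${path.slice(0, dot)}.${digest}${path.slice(dot)}`;
}
```

With this scheme, `versionedKey("css/site.css", fileContents)` yields a key shaped like `css/site.<8 hex characters>.css`, and editing the file changes the key, which makes stale copies easy to tell apart from fresh ones.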
The next step, two, is to actually maintain a mapping of paths from URLs to the actual static site asset that they're supposed to serve.
All Workers sites rely on a mapping from a path to the KV asset it corresponds to.
And these mappings are actually injected into a project by Wrangler.
Wrangler is able to pull a basic static site lookup and routing template that handles all of this mapping between your website paths and the file we're actually supposed to serve from Workers KV.
So we handle all of that translation for you and keep it out of your hands so you can focus on just developing your website and not on the process of deploying it.
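As a rough illustration of that mapping step, the TypeScript sketch below resolves a request path to a storage key through a manifest object. The `AssetManifest` type, the function names, the `index.html` fallback for directory paths, and the example keys are all hypothetical; the template that Wrangler pulls in may handle this differently.

```typescript
// Hypothetical manifest: site-relative file path -> key the asset is stored under.
type AssetManifest = Record<string, string>;

// Turn a URL pathname into a site-relative file path.
// Assumption: directory-style paths ("/" or "/docs/") serve an index.html.
function toAssetPath(pathname: string): string {
  const trimmed = pathname.replace(/^\/+/, "");
  if (trimmed === "" || trimmed.charAt(trimmed.length - 1) === "/") {
    return trimmed + "index.html";
  }
  return trimmed;
}

// Look up the storage key for a request path, or null when nothing matches.
function resolveAssetKey(pathname: string, manifest: AssetManifest): string | null {
  const assetPath = toAssetPath(pathname);
  if (!Object.prototype.hasOwnProperty.call(manifest, assetPath)) {
    return null;
  }
  const key = manifest[assetPath];
  return key === undefined ? null : key;
}

// Example manifest with made-up keys.
const exampleManifest: AssetManifest = {
  "index.html": "index.3f2a9c1b.html",
  "docs/index.html": "docs/index.77aa01de.html",
  "css/site.css": "css/site.0b5c2e9f.css",
};
```

Here `resolveAssetKey("/docs/", exampleManifest)` returns the key for `docs/index.html`, while a path that is not in the manifest returns `null`, which a Worker could turn into a 404.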
Now, let's see how Workers Sites are served up to a browser and how the Cache API plays a role.
So at first, this GIF may look a little complex.
There are admittedly a lot of steps, but we're going to break down what's happening here.
In step one, a user on a browser sends a request for your website, which hits the Workers Sites script that's been deployed on your zone.
And in step two, the Workers Sites script checks the Cloudflare cache to see if a response for that request is cached.
And if it is, you can serve the response immediately and this request has been fulfilled.
But if the response is not actually stored in the cache, we have to move to step three.
And in step three, what we do is fall back to actually getting that asset from Workers KV.
If we go through step three, that means that in step four, we can then cache that asset that we've pulled from KV so it's pulled up faster in subsequent similar requests.
And then we respond to our original request with that asset that the user asked for.
So this architecture ensures that we maximize cache usage so that your site is as responsive as possible.
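The four steps can be sketched in TypeScript as follows. This is only an illustration of the control flow: `Store`, `MemoryStore`, and `serveAsset` are hypothetical names, both the cache and KV are replaced by in-memory stand-ins, and the code is synchronous for readability, whereas the real Cache API and KV calls in a Worker are asynchronous.

```typescript
// Minimal key-value interface standing in for both the edge cache and Workers KV.
interface Store {
  get(key: string): string | undefined;
  put(key: string, value: string): void;
}

// In-memory stand-in that counts reads, so we can see which store was consulted.
class MemoryStore implements Store {
  private items: Record<string, string> = {};
  reads = 0;

  get(key: string): string | undefined {
    this.reads += 1;
    return Object.prototype.hasOwnProperty.call(this.items, key)
      ? this.items[key]
      : undefined;
  }

  put(key: string, value: string): void {
    this.items[key] = value;
  }
}

interface AssetResponse {
  status: number;
  body: string;
  source: "cache" | "kv" | "none";
}

// Step 1 is the incoming request for `path`.
function serveAsset(path: string, cache: Store, kv: Store): AssetResponse {
  // Step 2: check the cache and answer immediately on a hit.
  const cached = cache.get(path);
  if (cached !== undefined) {
    return { status: 200, body: cached, source: "cache" };
  }
  // Step 3: on a miss, fall back to the asset stored in KV.
  const stored = kv.get(path);
  if (stored === undefined) {
    return { status: 404, body: "Not found", source: "none" };
  }
  // Step 4: cache the asset for later requests, then respond with it.
  cache.put(path, stored);
  return { status: 200, body: stored, source: "kv" };
}
```

Calling `serveAsset` twice for the same path shows the intended behavior: the first call is answered from the KV stand-in and populates the cache, and the second is answered from the cache without touching KV again.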
Now that we've looked at how Workers sites work, let's see how quick they are to deploy.
Deploying a Workers site typically can be done in five minutes or even less.
And you can actually do it in just, I think, five command line instructions.
So I'll go over those now. The first thing you need to do is install Wrangler, which is the tool you must have to deploy a Workers site.
You can do that through npm install, or through Cargo if you're more Rust inclined.
After that, you run wrangler config to provide your authentication credentials so that Wrangler can upload things to Cloudflare on your behalf.
After installing Wrangler and providing those credentials, you run these three steps to actually deploy a site.
First, you have to generate a Workers Sites project.
What I'm doing here is generating a Workers Sites project that's going to live under the my-site directory.
Then you have to go into that directory and edit a file called wrangler.toml.
This wrangler.toml config file is used to store information about where you're deploying this Worker and what account it's on.
So typically identifying information like that.
And finally, you call wrangler publish and your site will be live.
We already use Cloudflare Workers Sites to host our own assets. So one site that we actually serve through Workers Sites is all of our documentation for Workers.
So I wanted to go through an example of where I actually deploy these docs to a Workers site.
So wish me luck. I'm going to be doing a live demo on a Linux machine in front of all of you.
So let's see how this goes. The first thing to keep in mind is that our docs are stored and generated by Hugo.
So I've got an ls here.
People who have used Hugo for static site generation may find a lot of these files to be familiar.
But the key thing you need to know is that to generate static site contents, from Hugo and its templating, you just run Hugo, and that should do it all for you.
So now you'll notice that there's a new directory called public.
And that public directory is what stores all of our generated HTML and CSS that we're going to be serving up to our website visitors.
Now what I need to do is actually init a site.
So instead of using generate, I use wrangler init. The difference is that init lets you use a preexisting directory that has a bunch of static site content in it.
wrangler generate actually makes one for you from scratch.
So since we are already using Hugo, we use wrangler init, and we pass in the --site argument.
Is your Wi-Fi working?
So one little hiccup about the Wi-Fi in this building is that some of the endpoints that we use are kind of blocked by the New York Stock Exchange Wi-Fi.
Literally, we had to send them a list of endpoints we needed them to whitelist, and it's a little touchy.
It worked when I just did it before, but everybody says that.
It worked for me yesterday. Okay. Yeah, so let's remove my... oh, wait. wrangler.toml.
So let's try that again. There we go. Come on, Internet. I believe in you.
Even tethering is weak from here, and that's what we're relying on right now.
There we go. All right, so we have generated a Workers Sites project. This basically just does two things.
It creates the wrangler.toml config file I mentioned, and it also pulls in a template with a Worker for actually serving the Workers site.
That's kind of like the mapping Worker that I mentioned before, which maps requests to your assets in KV.
All right, so now that we have that running, I'll show you the new Wrangler files.
You'll see that there are two. One is wrangler.toml, and the other one is workers-site, where our template Worker lives.
There are a couple of other ones, too, but those aren't critical to think about at the moment.
So we're going to change wrangler.toml. I'll show you what it looks like.
So to deploy to your own zone, for example, you need to put your account ID.
I can't spell.
And then you're going to use zone ID and put in your zone ID in here.
And another important step is to put in your directory where all of your statically generated assets are placed.
So that would be public for us. What we're going to do is I'm going to deploy this to a preexisting website of mine.
So I'm going to replace this wrangler.toml with one that I already made.
So you can see that I put in my own account ID and my own zone ID, and I also specified a route. That's important because, as with every Worker you publish, you have to provide a route for that Worker to run under.
What I'm doing is running it under every possible route for my own subdomain, docsmirror.gabbi.fish.
I actually do own the gabbi.fish domain, and I'm really proud of it.
So now I'm going to run wrangler publish. Fingers crossed the Internet's happy with us right now, so let's see how this goes.
I don't know if going to the window will work.
It seems to have hit the API, so this is promising. Okay, we might get lucky.
Yeah. So usually what this would do is upload all of these assets to Workers KV for you, and then deploy this Worker template that handles all the routing.
Okay, let's try connecting to the Wi-Fi here, and maybe it'll work.
I did it from my machine.
Let's see.
Hello, Mr. Wi-Fi? There we go. Okay, so it looks like we're connected to the Wi-Fi here again.
Let's give this another try.
Okay, so wrangler publish.
There we go.
So it does all of this for you, and now your website is actually live.
The caveat, and the reason why we used Kristian's phone for tethering, is that since I didn't ask the New York Stock Exchange to whitelist gabbi.fish, we're going to get a nice little error saying this is a blocked domain.
Why don't we give it a shot anyways?
Okay. Yeah, so go to – oh, that's a great idea. So let's go to docs. Oh, wait, what?
It worked? That's weird. Anyways, I deployed all of our docs. But that's all it takes.
We were really focused on making sure that this would be as simple and streamlined as an experience as possible for our users because in all of my previous experiences deploying my own blog, for example, I found that there were a lot of hiccups that could just show up along the way, and we wanted to make sure that people had as delightful a time deploying their websites as possible.
So I'm glad that worked out happily.
So that's our demo. And one thing I wanted to mention before I hand this back to Kristian is that Workers Sites are indeed fast.
I feel like this has been touched upon before, but I wanted to dive into the graphs that Ashley mentioned a little more because I think they're really exciting.
So when we were releasing Workers Sites, Rita was testing the performance of our hosting solution by using Google Lighthouse.
And you can see here that whether you're in Denver, Chicago, Boston, or Atlanta, Workers Sites tends to be the fastest web hosting service you can use to serve up your content to your users.
And the same applies for Workers Sites deployed around the world.
Cloudflare's points of presence all over the globe ensure that your Workers site will be delivered from as geographically close as possible to your website visitors.
And Lighthouse web rankings are pretty cool.
We were really, really excited that we were able to actually hit a 100% performance rating with Lighthouse, because if Google is setting forth these standards, that means they're going to have really high expectations of what a website can achieve in terms of performance.
And to hit 100 is just a really meaningful metric for Workers Sites hosting.
And now that we've deployed a website, let's see how we can automate Workers Sites builds and releases.
After all, you noticed that I had to actually type wrangler publish to publish a website and put it into production?
It would be really nice if we had continuous integration, CI solutions, to make sure that the deployment of sites could be entirely automated.
So I'm going to hand the mic back to Kristian.
It could also be thought of as what happens if you type Wrangler wrong 50% of the time, which is me.
So yeah, we do have a CI solution that I built in the last probably month and a half or so with GitHub Actions.
Has anyone used GitHub Actions yet?
It's pretty new. It's great. They describe it as build, test, and deploy your code right from GitHub.
So it's really awesome. I'm going to try and cover this in not too much detail, because the README on the Wrangler Action that we've published is pretty good at describing all the different use cases.
But the general idea here is that I have this sort of workflow. It's just a collection of steps.
That can be things like, for instance, running Hugo to build my project, running npm install, all of that kind of classic stuff that's probably in your readme for when someone new joins your team.
It's that kind of stuff.
And importantly, there's also this idea of attaching it to events. So for instance, I can say when a new commit comes in on my master branch, maybe that's when I want to deploy my project.
So it's a pretty classic CI solution. We're going to do another version of a live demo, which is a video.
Yay! Oh, where's the mouse?
There it is. Because I just want to show kind of how quick this can be. So in this example, I have a create React app application.
It just says hello. And I'm just going to make a little change here.
So hello connect NYC. It's just a really small change.
But I also want to add a GitHub workflow here. Let's see. We'll switch.
There we go. So this is just a YAML file. How many people love YAML? Really?
Okay. So I'm just going to call it deploy. Again, this idea of adding an event.
There's a couple other events. There's scheduled events and webhook events.
In this case, I'll just say on master when a new push comes in. I'm just going to set up a job here called deploy.
It runs using Docker. It's going to be using Ubuntu just because that's the best way to get everything to mostly just work.
And like I said, there's kind of a collection of steps here as well.
So first, I'm going to check out my repo.
That's going to clone down your repo. In this case, the master tag into the action.
It's going to run npm install. And in this case, for React, I'm going to run npm run build.
In Gabbi's example, this would be like hugo or, like, gatsby build, I guess, or whatever.
It's different for every project.
And then finally, our Wrangler Action here. So we just say uses cloudflare/wrangler-action and pass in our API key and email.
These are encrypted by GitHub as part of their secrets functionality.
And that's pretty much it. So I'm going to open up a terminal.
Past self is going to open up terminal. There we go.
And I'm just going to make a new commit here. And, you know, from the first commit where I add this GitHub action, it's already going to be building something.
It's going to be a master commit that I'm going to push. And when I push it, I do do a force push because my demo broke.
So just trust me. You don't have to force push to get GitHub actions to work.
But we have this kind of, you know, nifty actions UI in GitHub.
GitHub actions is really good at showing a lot of text on the screen and then immediately getting rid of it.
So this is a really sort of goofy looking, but like that, like that.
It just does it over and over again. But you can see each of these is like our separate step.
So, you know, installing NPM, building the application, publishing it.
This is installing Wrangler and configuring it and doing all that stuff for you.
And the result is a final deployed application.
This will happen, you know, every single time you push a commit.
Or, you know, if you have like some sort of special deployment style that you need, you know, if you cannot run or type Wrangler successfully and you also should not be trusted pushing to master, you can set up something that works for you there as well.
There it is.
So talking about dogfooding again. We use this for our docs. You know, Gabbi showed deploying with just a pretty straightforward wrangler publish.
We actually are on the GitHub action ourselves for the docs doing a couple different things.
So first we have both a staging and production version of the site.
What is our staging? It's like fluffy cloud or something. Big fluffy cloud player.
It's very cute. So we have this like separate staging section where like internally we can say, you know, does this new version of the docs look okay?
And we also have production, of course.
So when a new commit comes in to master, we get a production build.
But we also have just like pretty straightforward like build site.
So every commit that comes into the repo, we build both the worker and the Hugo docs and make sure that everything works as expected.
So, you know, it's pretty straightforward stuff.
But it just goes to show that like we can plug this stuff in with GitHub actions or, you know, there's a number of other sort of CI solutions out there that we want to kind of tackle that would be a good fit for this.
So why does this matter? It turns out that Jamstack also talks about deploy stuff, but it's not called Jamstack, like deployment.
It doesn't sound as nice.
But the idea is that like your site is now just a folder of static assets.
So deploying should be easy. I think I'm okay to say easy in this context.
Because it should be easy, hopefully, and also safe. And so if deploying your site is easy, we should make it even easier and even safer (don't trust me with deploying stuff) by automating it as much as possible.
So we're super excited about this. I have so many projects using this already.
We use it for the docs. You can find it in GitHub's marketplace. It's called Wrangler Action.
I actually made a little bit.ly link as well if you want to check out Wrangler Action.
It's really cool. It's really, really useful. I think it was like kind of a missing piece of our workflow.
Do I have time to talk about Built with Workers?
Okay. So one last thing, I think it's been mentioned a couple times, is like, again, we're sort of dogfooding this process of like using Jamstack for our own stuff with a project we're calling Built with Workers.
It's Jamstack.
It's built with Gatsby. Sanity.io. Has anyone heard of Sanity.io? Cool. So it's like a headless CMS.
I'll show you what that means here in a sec. And then obviously workers.
So we're starting to build it out right now. This is kind of what it looks like.
And the idea is that you can come to the site and say, okay, I've heard of Workers Sites.
I've heard of KV. But like what do I actually build with it?
What is something cool that I can build with it? So you'll see all these example projects.
Things like the invoice tool, which Rita showed earlier. Obviously our own internal stuff.
All those kind of things. We're really excited in particular about the, oh, my gosh, are you serious?
Need permission. Okay. Well, that's my bad.
I can describe in great detail what the video was. Yeah.
Yeah. I swear this happens at presentations at Cloudflare almost every time.
Live demos. Yeah. So the idea with headless CMSs basically is that you can sort of live edit things.
And because it's still static, so your site is HTML, CSS, JavaScript, I'm going to plug in an API.
Will it work? Wow. Okay. So the idea is on the right here I have this headless CMS called Sanity.
This is an earlier version of the Built with Workers site.
You know, when I make changes to this project, it's going to live update locally.
So I have this really sort of like virtuous cycle of like I make a change.
I see the new version of the site. And then when I deploy this, you know, I can be super confident that the deployed version of my site.
It's like a 15-second video. I'll start it over one more time. I need a GIF. The idea here is that I can be really confident that the version of my site I see locally is going to look the same in production.
So we're super excited about it in like this workflow generally.
We think Jamstack is like a really, really good fit for workers.
Should I try and plug yours back in now? Okay. This is like a fun game of...
It's like a game where you switch chairs. We were so close to like a perfect...
No, that's not true, actually. Our live demo earlier didn't work.
Okay. I also have like the longest password in existence. All right. Okay. So, yeah, one thing we want to call out is that if you're building something interesting with Workers and you want to showcase it, we would love to talk to you and learn a little bit more about what you're building and, you know, help showcase the amazing stuff that people are already building with Workers using KV, using Sites, stuff like HTMLRewriter, things that we've been working on recently.
So come say hi to me or Rita, Ashley, Gabbi, Steve. Is that everyone? Who else is here?
Yeah. Cool. And finally, I'll just put up this QR code. I don't actually know where this goes, but I'm going to assume if it's in your slides that it's good.
That's probably bad, right? Yeah. Okay. So we would love to have you try all this stuff out, and hopefully Workers Sites was interesting and exciting.
Thank you.
The release of Workers Sites makes it super easy to deploy static applications to Cloudflare Workers.
In this example, I'll use Create React App to quickly deploy a React application to Cloudflare Workers.
To start, I'll run npx create-react-app, passing in the name of my project.
Here, I'll call it my-react-app. Once Create React App has finished setting up my project, we can go into the folder and run wrangler init --site.
This will set up some sane defaults that we can use to get started deploying our React app.
wrangler.toml, which we'll get to in a second, represents the configuration for my project, and workers-site is the default code needed to run it on the Workers platform.
If you're interested, you can look in the workers-site folder to understand how it works.
But for now, we'll just use the default configuration.
For now, I'll open up wrangler.toml and paste in a couple configuration keys.
I'll need my Cloudflare account ID to indicate to Wrangler where I actually want to deploy my application.
So in the Cloudflare UI, I'll go to my account, go to workers, and on the sidebar, I'll scroll down and find my account ID here and copy it to my clipboard.
Back in my wrangler.toml, I'll paste in my account ID, and bucket is the location that my project will be built out to.
With Create React App, this is the build folder. Once I've set those up, I'll save the file and run npm run build.
Create React App will build my project in just a couple seconds, and once it's done, I'm ready to deploy my project to Cloudflare Workers.
I'll run wrangler publish, which will take my project, build it, and upload all of the static assets to Workers KV, as well as the necessary script to serve those assets from KV to my users.
Opening up my new project in the browser, you can see that my React app is available at my workers.dev domain, and with a couple minutes and just a brief amount of config, we've deployed an application that's automatically cached on Cloudflare servers, so it stays super fast.
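The serving step described above, a script that maps an incoming URL to a stored static asset, can be sketched roughly as follows. This is a minimal illustration and not the code that wrangler generates: the `serveAsset` and `pathToAssetKey` names, the content-type table, and the use of an in-memory `Map` in place of the KV namespace are all assumptions made for the example. The real KV binding is asynchronous, which this sketch leaves out.

```typescript
// Hypothetical response shape, used for illustration only.
interface AssetResponse {
  status: number;
  contentType: string;
  body: string;
}

// Hypothetical content-type table, keyed by file extension.
const CONTENT_TYPES: Record<string, string> = {
  html: "text/html",
  js: "application/javascript",
  css: "text/css",
  json: "application/json",
};

// Map a URL path to an asset key; "/" and directory paths fall back to index.html.
function pathToAssetKey(pathname: string): string {
  const trimmed = pathname.replace(/^\/+/, "");
  if (trimmed === "" || trimmed.endsWith("/")) {
    return trimmed + "index.html";
  }
  return trimmed;
}

// Look the asset up in the store (a Map standing in for the KV namespace)
// and answer with a 404 when the key is missing.
function serveAsset(pathname: string, store: ReadonlyMap<string, string>): AssetResponse {
  const key = pathToAssetKey(pathname);
  const body = store.get(key);
  if (body === undefined) {
    return { status: 404, contentType: "text/plain", body: "Not found" };
  }
  const extension = key.split(".").pop() ?? "";
  const contentType = CONTENT_TYPES[extension] ?? "application/octet-stream";
  return { status: 200, contentType, body };
}
```

With a store holding the contents of the build folder, a request for "/" would resolve to the index.html entry, and any path without a stored asset would get a 404.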
If you're interested in learning more about Workers Sites, make sure to check out our docs, where we've added a new tutorial to go along with this video, as well as an entire new Workers Sites section to help you learn how to deploy other applications to Cloudflare Workers.
Hi, everybody.
I'm Sergi Isasi. I'm also a director of product here at Cloudflare, and I'm directly responsible for our bot management solutions.
And today, we're going to talk about bots in general, Cloudflare's relationship with bots, walk you through a day in the life of www.Cloudflare.com, so what we actually see as bots on our own website, and then I'm going to invite one of our customers on stage to talk about their experience as an early bot management customer.
So first, let's kind of level set on what we're talking about.
A bot is an automated system on the Internet, or a computer task, and specifically, what we look for at Cloudflare is to give you tools to block malicious bots, while still allowing through the good bots, the things that crawl your website and raise your SEO rankings.
Our history with bots is actually quite long-running. If you go back to our very first presentation at TechCrunch Disrupt, we were talking about bots then, and we've given our customers lots of capabilities over the years to help manage bots.
So first was our denial-of-service protection. So bots typically took websites down 10 years ago, and we gave our customers the ability to keep their websites up just by having that volumetric DDoS protection.
The very first action that we gave customers for bots was I'm Under Attack mode.
A quick show of hands, how many in this room have actually turned on I'm Under Attack mode in their Cloudflare dashboard for a site?
A decent amount. So I'm Under Attack mode was our first attempt at slowing down a bot.
It would throw a JavaScript challenge. It still does throw a JavaScript challenge at inbound connections when your site is under attack.
And at first, it stopped bots cold. Bots were not capable of solving JavaScript, and they just couldn't get past that.
Eventually, bots got smarter and were able to solve that challenge and get through that block.
So we gave additional tools.
So we allowed you to have rate limiting. We have a web application firewall that allowed our customers to identify sketchy user agents and IP addresses, and start blocking, challenging, and even serving a CAPTCHA to potentially malicious traffic.
We, earlier this year, released our actual bot management solution to our enterprise customers.
And this takes everything that we have learned over the last 10 years and brings it into a very specific point, which is we will give you a score from 0 to 99 indicating whether any given request is likely to be malicious and automated.
Let's see what that actually looks like.
So this is one day on www.Cloudflare.com. I'm sure you've all been there.
This was last week, and we have a rotating set of tiles, but I chose the Connect one.
And you would think, what would a bot do on www.Cloudflare.com?
There's not a lot there to really attack. But if you'll notice on this slide, there is a little on the lower right, it says Contact Sales.
And it turns out lots of people like to take this particular page, or lots of bad actors like to take this particular page, and send lots of bogus leads into our sales team, or attempt to send bogus leads into our CRM.
And in one given day, this is what an attack looks like on www.Cloudflare.com.
Let me explain this slide a little bit.
So this is a histogram of requests over time for one given day. Anything in dark red is marked as bad by our system.
So this was an identified bot. And usually on our traffic, we see about 60% good or human traffic, and about 40% bots on a given basis.
If you look at around 16:20, I believe the time is, we had a massive spike in bad traffic.
And that was specifically an attack on the leads page.
14:20 to 17:40 UTC. During this attack, we had about a sevenfold increase in our requests per second.
That's not optimal when you are trying to keep your systems running as lean as possible.
So any origin that was not capable of scaling up to seven times its usual request rate would be knocked down by this type of attack.
In our case, because we were blocking it, the system handled it quite well.
This is one of the things that bots can do.
So that would be content spam or a form of it.
There are a number of other things that bots will do on different sites.
We don't see as much of things like content scraping or inventory hoarding as we have no inventory.
But we have lots of customer stories of things like credential stuffing, where a bot is just trying to gain access to either your site or to test credentials that may be valid on another site.
Content scraping, as I mentioned, is pulling information off the site.
Content spam can be anything from lead forms to ruining an online community by making their forums completely useless.
Inventory hoarding is probably our most interesting customer use case that we're seeing a lot.
I'm not sure how many of you are familiar with the world of low inventory sneakers, but those types of sites tend to get the most impact from bots.
We also have credit card stuffing, which we'll talk about in a little bit, where an attacker will actually use an e-commerce site to validate that a credit card is valid and sell that credit card to a third party.
And then lastly, there's just straight application DDoS, taking your site offline.
So what do we do to protect against that?
We've created a solution that uses three dedicated detection models, so behavioral analysis, machine learning, and fingerprinting, which I'll go into in a bit of detail for each one of those.
This all happens after our DDoS protection and after our rate limiting, and it happens in conjunction with our WAF.
So rules that you have written with the WAF will either take precedence or integrate directly with our bot management score.
And you can use any of the characteristics in the WAF to make even more specific rules.
Even more exciting is that you can consume the bot score in a worker.
And that allows you to do something very simple, like insert the bot score into the request header back to origin and treat it differently there, or perhaps give an alternate version of your website to a bot.
I actually have a customer who has a bot doing content scraping and taking the prices of its airline tickets, so when it detects an obvious bot using bot management, it presents it with random prices in the web form.
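The two uses just mentioned, tagging the request with the score on its way to origin and sending obvious bots to an alternate version of the site, can be sketched as a single decision function. This is a hedged illustration only: the `IncomingRequest` shape, the `x-bot-score` header name, and the cutoff of 5 are assumptions made for the example, and a real worker would read the score from the request rather than receive it as a plain field. It follows the talk's convention that the lowest scores mark automated traffic.

```typescript
// Hypothetical request shape; a real worker would read these from the incoming request.
interface IncomingRequest {
  path: string;
  headers: Record<string, string>;
  botScore: number; // 0 to 99; per the talk, the lowest scores mark automated traffic
}

interface RoutingDecision {
  target: "origin" | "decoy";
  headers: Record<string, string>;
}

const BOT_SCORE_HEADER = "x-bot-score"; // hypothetical header name
const OBVIOUS_BOT_THRESHOLD = 5; // hypothetical cutoff for "an obvious bot"

function routeByBotScore(request: IncomingRequest): RoutingDecision {
  // Copy the headers so the caller's object is left untouched, then add the score.
  const headers = { ...request.headers, [BOT_SCORE_HEADER]: String(request.botScore) };
  // Obvious bots get the alternate (decoy) content; everything else goes to origin.
  const target = request.botScore < OBVIOUS_BOT_THRESHOLD ? "decoy" : "origin";
  return { target, headers };
}
```

The origin can then read the header and treat the request differently, while the decoy branch is where something like the randomized prices would be served.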
And then lastly, once it passes through and we decide that the traffic is good, we'll pass it back to your web app or server.
All of our mitigations are available, so you can choose to block, challenge, or CAPTCHA, or just log the bot score itself.
So now let's go into the different detection techniques.
So the first is machine learning. Machine learning for us is very important because we have a view of the Internet that is pretty much unparalleled.
Michelle mentioned earlier, we have 20 million properties on Cloudflare.
All of them feed bot management. We see 750 billion HTTP requests on a given day, and we see generally about one billion unique IP addresses going through our system.
With that, we identify connections that appear to be automated, and how do we do that?
So if we go back to I'm Under Attack mode or any of the challenges that our customers do, when we do a challenge to a customer or an end user, we get a lot of information about that connection.
We know what browser it used, what the TLS stack was, what ISP it's on, whether it passed the JavaScript and whether it passed the CAPTCHA.
And our machine learning system takes in all of those characteristics and groups them together to find bad automated traffic.
Similarly, it does the same thing for good traffic. If there is a user who was challenged, gets through, and does normal behavior on a site, then we're able to use that information to allow that user through.
Behind the scenes, this is running, it's no secret, we're a big ClickHouse shop at Cloudflare.
Almost all of our logging is in ClickHouse, and we're running CatBoost on a dedicated GPU cluster.
We take that one step further with fingerprinting.
So again, we're looking at that same data, 20 million customers, 750 billion requests per day.
And we find groups of characteristics that never appear to be human.
This is actually from our bot analyst's dashboard.
I've had to cut out some things here that are very specific to Cloudflare.
But this specific fingerprint had about, I think this was over a three-hour period of time, 225,000 requests, over 2,000 user agents, so that's a bit odd for a signature, about 250 zones, and over 305 different ASNs.
So, networks. Out of all of these connections, this particular signature has never solved a CAPTCHA.
It has been challenged 225,000 times, and has never once actually completed the challenge.
What that allows us to do is automatically say that this fingerprint is automated and is malicious, and we assign it the lowest possible score.
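The rule described here, a fingerprint that has been challenged many times and never once solved a challenge gets the lowest score, can be written down as a small check. This is an illustration under stated assumptions and not Cloudflare's detection logic: the `FingerprintStats` shape and the minimum of 1,000 challenges are made up for the example, and the lowest score is taken as 0 from the 0 to 99 range given in the talk.

```typescript
// Hypothetical per-fingerprint counters aggregated over some time window.
interface FingerprintStats {
  requests: number;
  challengesIssued: number;
  challengesSolved: number;
}

const LOWEST_SCORE = 0; // bottom of the 0 to 99 range mentioned in the talk
const MIN_CHALLENGES_FOR_VERDICT = 1000; // hypothetical: require enough evidence first

// Returns the lowest score when the fingerprint has never solved a challenge
// despite many attempts, or null when this heuristic has no verdict.
function scoreFingerprint(stats: FingerprintStats): number | null {
  const enoughEvidence = stats.challengesIssued >= MIN_CHALLENGES_FOR_VERDICT;
  if (enoughEvidence && stats.challengesSolved === 0) {
    return LOWEST_SCORE;
  }
  return null;
}
```

Returning null instead of a high score keeps this check from vouching for traffic it has simply not seen enough of; the other detection models would decide in that case.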
Our last technique is behavioral analysis.
Now, everything we talked about prior was across the Cloudflare network.
Behavioral analysis is on a specific domain. So, for customers who have activated bot management, we look at what your traffic normally looks like, and we cluster regular usage.
So, most zones have typical usage patterns where most of your users have a very, very similar pattern on your specific zone.
So, maybe they look at five individual pages before they go away, and they do it at a specific timing interval.
Then you have usually some level of power users that kind of look at a much higher rate.
We calculate these, we cluster them, and we make a baseline for every bot management enabled domain, and then we'll ratchet down the score as things go beyond the baseline.
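The idea of a per-domain baseline with a score that ratchets down as a visitor goes beyond it can be sketched with simple statistics. To be plain about the substitution: the talk describes clustering usage patterns, while this sketch swaps that for a single mean and standard deviation of pages per visit. The function names, the 20 points per standard deviation, and the floor of 1 on the spread are all assumptions made for illustration.

```typescript
interface Baseline {
  mean: number;
  stdDev: number;
}

// Build a baseline from observed pages-per-visit samples for one domain.
function buildBaseline(samples: number[]): Baseline {
  if (samples.length === 0) {
    throw new Error("need at least one sample");
  }
  const mean = samples.reduce((sum, value) => sum + value, 0) / samples.length;
  const variance =
    samples.reduce((sum, value) => sum + (value - mean) ** 2, 0) / samples.length;
  return { mean, stdDev: Math.sqrt(variance) };
}

const MAX_SCORE = 99; // top of the 0 to 99 range mentioned in the talk
const POINTS_PER_DEVIATION = 20; // hypothetical penalty per standard deviation above baseline

// Start from the top score and ratchet it down the further the visitor is above the baseline.
function behavioralScore(observed: number, baseline: Baseline): number {
  const spread = Math.max(baseline.stdDev, 1); // floor keeps near-uniform traffic from over-penalizing
  const deviationsAbove = Math.max(0, (observed - baseline.mean) / spread);
  const score = MAX_SCORE - deviationsAbove * POINTS_PER_DEVIATION;
  return Math.max(0, Math.round(score));
}
```

A visitor at or below the baseline keeps the top score, and one far above it bottoms out at 0. A clustering approach would also accommodate the power users mentioned above, which a single mean does not.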
In practice, if we go back to that attack that we talked about earlier, this is what it looks like.
So, in this particular attack, 42% of the detections were due to the behavioral analysis, so specific to www.Cloudflare.com.
30% were on the machine learning, and 28% were based off the heuristics and the fingerprinting.
This will vary on a zone-by-zone basis, and different zones will have different levels of behavioral analysis, machine learning, or fingerprinting, but the defense in depth and the different approaches that we take allow us to find different types of bots.
This is effectively an arms race.
We have a dedicated group of individuals across the globe, so the bot management team runs out of San Francisco, Austin, London, and Warsaw, and those teams are looking at our data and what else we can do to help increase our capture rates and lower our scores, I should say.
Now I want to invite Nick on stage.
Thank you. Nick is the co-founder of RevZilla.
I guess before we get started, Nick, why don't you tell everyone what RevZilla is?
What is RevZilla? I don't know what the intersection is between potential Cloudflare customers and motorcycling, but that's what we do.
We are a direct-to-consumer e-commerce company based out of Philadelphia, and we sell everything but the bike.
Helmets, jackets, apparel of that sort, gloves, along with aftermarket parts and accessories, so exhaust systems.
I founded the business I guess 12 years ago with two of my best friends.
It's what happens when you take computer scientists that are also super passionate about bikes, you get RevZilla.
My last role at the company was as Chief Technology Officer, and it's in that capacity that I found Cloudflare as a great vendor.
When was RevZilla founded? 2007. How long have you been a Cloudflare customer?
We've been a customer for almost five years, and our story in terms of our path to Cloudflare is probably not too dissimilar from others in this room, which is to say we found ourselves under attack.
We went on Cloudflare.com, and we saw the big orange button.
We said, please help us, Cloudflare. It was the Bitcoin ransom, and ultimately this was before we had moved to the cloud ourselves.
We were on dedicated hardware.
Our hardware wasn't able to handle the volumetric attack that was presented to us.
We were working with our ISP at the time to also mitigate the attack.
Eventually, they shut us off because we were affecting their other customers.
None of that's a good solution. Ultimately, our mission as RevZilla is to advance the experience of the motorcycle enthusiast.
When your website goes offline, you can't really do that, and so we engaged with Cloudflare, and essentially as quickly as DNS could propagate, we were back on the Internet and didn't have to continue to worry.
Can you tell us about how you use Cloudflare today, and then specifically the challenges you had with bot management, with bots, and what you tried before using Cloudflare?
Sure. We use lots of different features of Cloudflare.
It's been a great day today hearing about all the great new tech that's come out, but even in the past four or five years, we've been able to take advantage of lots of the parts of the offering from DNS to the WAF to general rate limiting and the CDN and all these things.
Even though they're not a CDN, they still have a great CDN.
We've been able to take advantage of a lot of those different tools.
Most recently, we added the bot mitigation to our account, and this actually came about I guess it must have been maybe in the spring of this year.
We noticed in our checkout funnel that all of a sudden we were getting a lot of requests to the endpoint that actually validates a credit card and ensures that a credit card works and is billable and all that stuff.
We noticed this huge uptick in requests and we're like, uh-oh, this isn't good.
We think that we're pretty competent engineers, developers, so we built our own rate limiting tool in-house using an Elixir microservice, looking at lots of different elements of the request, the IP address, GeoIP, all sorts of different things.
Ultimately though, it looked like we were stopping good traffic and also not actually stopping the bad traffic.
This credit card stuffing essentially got us back into contact with Cloudflare and we quickly said, no, let's test out the bot mitigation.
Literally, it's like you press the button and you just see the traffic just drop off the WAF, completely obliterate the attack.
We were like, yep, we're sold. This makes a ton of sense. At the time, we were super nervous because Braintree and our upstream merchant services accounts were like, hey guys, you got to stop this attack.
This isn't good for the banks and their customers and all that stuff.
Have you expanded your deployment of bot management?
Yeah. In addition to protecting our checkout funnel, we use it now.
Historically, we were just always rate limiting login attempts but now we actually use bot mitigation there too.
That ends up just being a better experience for our customers because fewer customers actually have to get a CAPTCHA or be checked in some way.
We use it there and then we're also starting to test it on certain aspects of our content and catalog.
You can imagine the 30,000 motorcycles that exist and all the different parts and configurations that might fit on a bike.
We've done a lot of internal cataloging work to figure out what fits what.
We see that as competitive advantage so to the extent that we can limit competitors from scraping some of this information, we're looking to leverage Cloudflare to help us with that too.
And then the kind of dangerous question, since you have the product manager on stage, is there anything that you want the bot management solution to do better or do differently?
Oh no. I think for us, what we always struggled with, and the tool is nice in that you can simulate it.
You don't have to actually block any real traffic. You can see what would be blocked or challenged and then ultimately make the decision of does this make sense for us?
And what we wanted to do ultimately is can we just roll this out across the entire site?
Wouldn't it be great if we could just have a set-it-and-forget-it rule?
But we never felt like we got comfortable saying, just protect the entirety of revzilla.com with bot mitigation.
Instead we felt like we kind of had to slice and dice across a couple different paths.
So maybe it's just partnering more with Cloudflare to kind of figure out what that might look like.
But overall we've been really happy with the product.
Lastly, any fun Cloudflare stories or any attack stories over your time at revzilla?
Oh no. So attack stories are never fun.
That's rule one. I think one of my favorite, it's not an attack story, but something that's interesting is we always want to A/B test things.
Anytime we build a new feature, anytime we deploy a new product, we want to test that it has the efficacy that we expect it to have and it can kind of prove out its value and its worth to our organization.
And so we were looking at Rocket Loader. We didn't hear about it today, but it's a way to speed up JavaScript in the client side execution.
It's a product that Cloudflare has. And we wanted to test it, but we just wanted to make sure that if we turn this on, does this turn into real conversion dollars?
And so we used Cloudflare Workers to construct an A/B test that would allow us to turn on Rocket Loader for a subset of the customers and actually prove out what at the time looked to be about a 5% increase in overall conversion rate just by speeding up the front end execution of JavaScript.
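The core of an A/B test like this is assigning each visitor to the same group on every request. A minimal sketch, assuming a stable visitor identifier such as a cookie value is available, hashes that identifier into one of 100 buckets. The function names and the choice of a 32-bit FNV-1a hash (computed here over UTF-16 code units) are assumptions made for illustration; the talk does not say how RevZilla assigned visitors, and switching Rocket Loader on for the treatment group is not modeled.

```typescript
type Variant = "control" | "treatment";

// 32-bit FNV-1a hash over the string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits unsigned
  }
  return hash >>> 0;
}

// Deterministically place a visitor in the treatment group for roughly
// `treatmentPercent` percent of identifiers (0 to 100).
function assignVariant(visitorId: string, treatmentPercent: number): Variant {
  const bucket = fnv1a(visitorId) % 100;
  return bucket < treatmentPercent ? "treatment" : "control";
}
```

Because the assignment depends only on the identifier, a returning visitor always lands in the same group, which is what lets conversion rates be compared between the two groups.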
So not an attack, but also using Cloudflare to test Cloudflare.
Great. Thanks, Nick. And I think we're at time.
Thank you. (Applause) (Music) Hi, we're Cloudflare.
We're building one of the world's largest global cloud networks to help make the Internet faster, more secure, and more reliable.
Meet our customer, BookMyShow.
They've become India's largest ticketing platform thanks to its commitment to the customer experience and technological innovation.
We are primarily a ticketing company.
The numbers are really big. We have more than 60 million customers who are registered with us.
We're on 5 billion screen views every month.
200 million tickets over the year. We think about what is the best for the customer.
If we do not handle customers' experience well, then they are not going to come back again.
And BookMyShow is all about providing that experience.
As BookMyShow grew, so did the security threats it faced. That's when it turned to Cloudflare.
From security point of view, we use more or less all the products and features that Cloudflare has.
Cloudflare today plays the first level of defense for us.
One of the most interesting and a-ha moments was when we actually got a DDoS and we were seeing traffic burst up to 50 gigabits per second, 50 Gbps.
Usually, we would go into panic mode and get downtime. But then, all we got was an alert and then we just checked it out and then we didn't have to do anything.
We just sat there, watched the traffic peak and then be brought under control.
It just took less than a minute for Cloudflare to kind of start blocking that traffic.
Without Cloudflare, we wouldn't have been able to easily manage this because even at our data center level, that kind of pipe is not easily available.
We started with Cloudflare for security and I think that was the a-ha moment.
We actually get more sleep now because a lot of the operational overhead is reduced.
With the attacks safely mitigated, BookMyShow found more ways to harness Cloudflare for better security, performance, and operational efficiency.
Once we came on board on the platform, we started seeing the advantage of the other functionalities and features.
It was really, really easy to implement HTTP/2 when we decided to move towards that.
Cloudflare Workers, which is the computing at the edge, we can move that business logic that we have written custom for our applications at the Cloudflare edge level.
One of the most interesting things we liked about Cloudflare was everything can be done by the API, which makes almost zero manual work.
That helps my team a lot because they don't really have to worry about what they're running because they can see, they can run the test, and then they know they're not going to break anything.
Our teams have been able to manage Cloudflare on their own for more or less anything and everything.
Cloudflare also empowers BookMyShow to manage its traffic across a complex, highly performant global infrastructure.
We are running not only hybrid; we are running a hybrid and multi-cloud strategy.
Cloudflare is the entry point for our customers.
Whether it is a cloud in the back end or it is our own data center in the back end, Cloudflare is always the first point of contact.
We do load balancing as well as we have multiple data centers running.
Data center selection happens on Cloudflare.
It also gives us fine-grained control on how much traffic we can push to each data center depending upon what is happening in that data center and what is the capacity of the data center.
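Controlling how much traffic goes to each data center comes down to weighted selection. The sketch below is an illustration of that general technique and not of Cloudflare's load balancer: the `DataCenter` shape, the weights, and the `pickDataCenter` name are assumptions, and the random roll is passed in as a parameter so the behavior is easy to check.

```typescript
// Hypothetical description of a data center and its share of traffic.
interface DataCenter {
  name: string;
  weight: number; // relative share of traffic; 0 drains the data center
}

// Pick a data center in proportion to its weight. `roll` is a number in [0, 1),
// for example the result of Math.random().
function pickDataCenter(pools: DataCenter[], roll: number): string {
  const total = pools.reduce((sum, pool) => sum + pool.weight, 0);
  if (total <= 0) {
    throw new Error("no data center has capacity");
  }
  let remaining = roll * total;
  for (const pool of pools) {
    if (remaining < pool.weight) {
      return pool.name;
    }
    remaining -= pool.weight;
  }
  // Only reachable through floating-point rounding at the very top of the range.
  return pools[pools.length - 1].name;
}
```

Lowering a data center's weight when it is short on capacity shifts new requests to the others without any other change.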
We believe that our applications and our data centers should be closest to the customers.
Cloudflare just provides us the right tools to do that.
With Cloudflare, BookMyShow has been able to improve its security, performance, reliability, and operational efficiency.
With customers like BookMyShow and over 20 million other domains that trust Cloudflare with their security and performance, we're making the Internet fast, secure, and reliable for everyone.
Cloudflare helping build a better Internet.