🏗 Building serverless APIs: how Fauna and Workers make it easy
Presented by: Rob Sutter
Originally aired on September 7, 2022 @ 2:00 PM - 2:30 PM EDT
Cloudflare Platform Week: Developer Speaker Series
As part of Cloudflare's Platform Week, we're thrilled to feature an array of expert web dev speakers, developers, and educators here on Cloudflare TV.
Building APIs has always been tricky when it comes to setting up architecture. Fauna and Workers remove that burden by letting you write code and watch it run everywhere.
Visit the Platform Week Hub for every announcement and CFTV episode — check back all week for more!
And join the community and members of the Cloudflare team at the Cloudflare Developer Discord
Transcript (Beta)
Hi everybody. Thanks for joining us for Cloudflare TV. The session is Building Serverless APIs – How Fauna and Cloudflare Workers Make It Easy.
I'm Rob.
I'll be taking you through today's session. Give me just a minute to get everything set up.
And away we go.
So again, in today's session, what you can learn is what a data driven API at the edge is and how it's different from a traditional API.
What some good use cases are for building data driven APIs on the edge instead of in your data center or region.
What are some less appropriate use cases where you may not want to do this?
Some advantages, some disadvantages.
And then we're going to get hands-on with a demo, including a template that you can use to build your own REST APIs with Cloudflare Workers and Fauna.
So again, real quick, give you a little bit of an introduction to Fauna and myself.
We'll give you a piece on authentication at the Edge and how Cloudflare enables some really specific capabilities here.
Talk a little bit about Fauna, give you the solution overview.
And away we'll go.
Unfortunately, we won't have live Q&A during today's talk, but I will be joining the Cloudflare Workers Discord after this talk to answer any questions you may have.
So feel free to go ahead and ask them there and tag me in them.
Or if you're viewing this later, send me your questions on Twitter, @rts_rob.
You'll see that on every slide.
So who am I?
Well, I'm Rob Sutter.
I'm the head of developer advocacy at Fauna, and I'm a previous SaaS startup co-founder.
And the reason that's relevant to this talk is that our users were located all over the world.
They were both geographically dispersed and traveling.
So we never knew exactly where they would wind up and we never knew where exactly they were signing up from.
The trick with that was that we had to manage regions like this and connect them together. This was easy at the time and worked well if users happened to be in that same region.
But it didn't work well when they started getting spread all over the globe, whether that was from new signups, travels or other reasons.
And you see in this case, you can't have a single database that handles all of that and you can't have a single compute workload that handles all of that.
And this is where APIs at the edge, with a solution like Cloudflare Workers and Fauna, become really powerful.
You can take advantage of all of Cloudflare's 275 points of presence.
You take advantage of a distributed database that's provisioned and managed by Fauna that you don't have to worry about operational concerns for and you can improve the performance for your users.
So what kind of users or use cases do you have that could benefit from this architecture?
The first is anything where time is money. And this is where we talk about the classic study that for every 100 milliseconds of latency that you add onto your responses, you lose 1% of revenue in an ecommerce scenario.
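To put rough numbers on that rule of thumb, here is a minimal arithmetic sketch. It assumes the loss scales linearly at 1% per 100 milliseconds, which is a simplification, and the function name and the revenue figure are hypothetical.

```typescript
// Assumption: revenue loss scales linearly at 1% per 100 ms of added latency.
const LOSS_PER_100_MS = 0.01;

// Estimates the revenue lost for a given amount of added latency.
function estimatedRevenueLoss(baselineRevenue: number, addedLatencyMs: number): number {
  // Cap the loss at 100% of revenue so large latencies cannot go negative.
  const lossFraction = Math.min(1, (addedLatencyMs / 100) * LOSS_PER_100_MS);
  return baselineRevenue * lossFraction;
}

// Hypothetical figures: 250 ms of extra latency on 1,000,000 in revenue
// works out to roughly 2.5% of revenue under the linear assumption.
console.log(estimatedRevenueLoss(1_000_000, 250));
```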
Another is when users require a fast response for non-purchase decisions, whether it's performance indicators or telemetry, things like that, where the response time itself is critical.
Web and mobile applications are a good example of this.
You want your applications to feel snappy so that your users feel like they're getting things done and aren't waiting around for your app.
And finally, distributed Internet of Things data ingestion is a great use case for this.
Although the requests are asynchronous, so long as they're holding open requests, they're using battery on the device.
And that is something that you want to minimize or avoid, if possible.
Having the closest geographical and speed-of-light endpoint to make requests is a really powerful way to shorten that request time for IoT devices.
With that in mind, some less appropriate use cases are sort of the opposite of that, as you might expect.
If all of your users are constrained to a particular region, for example, all of your employees inside a single office building in the old times, using a single line of business application, well then you don't get the benefits of distributed data and edge functions.
This also matters for those line of business applications themselves.
If all of your data is being created and operated on internally, like analytics data or industrial IoT data that's generated in a factory with co-located servers.
Well, those probably aren't good use cases for edge functions and Fauna.
The advantages of using an edge API are correlated to the key characteristics.
Anything where time is money means that faster responses can mean both lower costs and greater revenue for you as a company.
And if you're choosing the right service provider, you can get a reduction or savings in egress bandwidth costs.
You're also delivering that more responsive, better user experience for users who require a fast response.
Similarly, the disadvantages are correlated to the key characteristics of inappropriate use cases.
If all your users are located in a single region, you may save money by just hosting a regional resource and data that's generated internally still needs to be generated by some centralized process.
Now there are some things that you might think of as disadvantages, like log aggregation and monitoring.
But distributed systems in a single region require robust solutions, and edge solutions aren't much different.
Cloudflare Workers also gives you some utilities like the live tail that can make the edge solution actually more intuitive.
So of course, these aren't prescriptive, one-size-fits-all guidelines. You should still explore and see.
Maybe the benefits outweigh the disadvantages for your particular use case.
An important part of any API is authentication.
Authentication enables you to be sure that any user is an authorized user who should be able to access the system.
It also lets you perform things like rate limiting, backoffs, and billing.
Authentication at the edge works a little differently from authentication at the back end.
In this model, this is a typical back end authentication model where your bearer token is sent to the edge and forwarded to the back end and then a decision is made in the backend.
Either you're authorized and the requested data is returned or you're not, and the request is rejected.
But you can see here that that requires a round trip to the back end on every turn.
One way around this is using an authentication provider or an ID provider as a service.
And in this case, your requests are going to be forwarded to that IDP and the validation result can be cached or can be recalculated at the Worker itself so that only valid responses are sent on to the back end.
As you see in this slide, invalid or cached responses can be returned without that additional round trip.
This saves you time and it saves you money.
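As a rough sketch of that caching idea, the snippet below keeps validation results in memory for a short time so that repeated tokens don't need another round trip. The names `createValidationCache`, `isAuthorized`, and `askIdp` are hypothetical, and the in-memory map is an assumption for illustration rather than how any particular identity provider or Worker implements it.

```typescript
interface CachedResult {
  valid: boolean;
  expiresAt: number;
}

// Builds a small time-limited cache of token validation results.
function createValidationCache(ttlMs: number, now: () => number = Date.now) {
  const entries = new Map<string, CachedResult>();
  return {
    // Returns the cached verdict, or undefined if missing or expired.
    get(token: string): boolean | undefined {
      const hit = entries.get(token);
      if (hit === undefined) return undefined;
      if (hit.expiresAt <= now()) {
        entries.delete(token);
        return undefined;
      }
      return hit.valid;
    },
    // Stores a verdict (valid or invalid) for ttlMs milliseconds.
    set(token: string, valid: boolean): void {
      entries.set(token, { valid, expiresAt: now() + ttlMs });
    },
  };
}

type ValidationCache = ReturnType<typeof createValidationCache>;

// askIdp is a hypothetical stand-in for the call to the identity provider.
async function isAuthorized(
  token: string,
  cache: ValidationCache,
  askIdp: (token: string) => Promise<boolean>
): Promise<boolean> {
  const cached = cache.get(token);
  if (cached !== undefined) return cached; // No round trip needed.
  const valid = await askIdp(token);
  cache.set(token, valid);
  return valid;
}
```

Both valid and invalid verdicts are cached here, so a rejected token is turned away at the edge without contacting the provider again until the entry expires.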
But one really interesting thing about using Workers is to remember that Cloudflare got its start with network protection.
So if your application comes under attack, Cloudflare is applying that expertise to protect your application at the edge before those packets even go either to your IDP or to the back end.
This allows you to continue serving traffic to users in regions that aren't under attack while shedding that load from your back end.
And this is a pretty unique capability that Workers offers to your API.
So that's the distributed compute part.
How does the distributed data part work with Fauna?
With apologies for whatever happened to this slide here, Fauna gives you a single endpoint that you can read from and write to, and it handles routing to the nearest location for you.
Fauna is transactional, and it determines those transactions via a capability called the OCC, or optimistic concurrency, or optimistic... I always blank on this term, I'm sorry... contention checker, behind the scenes.
That means that you don't pay for the data transfer costs for replication, Fauna does.
Once you get your record into the Fauna endpoint, we're handling replication and distribution across regions for you.
So this is in line with Cloudflare philosophy of not paying for data egress costs.
You can see that this is a pretty performant approach as well. Because of the design, you get replicated ACID transactions without locking.
And that means when you look at a region group latency like this, you see typically 40 milliseconds of latency for writes. Now, these are strongly consistent, fully distributed ACID transactional writes. In region groups, they complete in around 40 to 50 milliseconds. On the global database, they complete in around 100 milliseconds.
But you have that strongly consistent data. So your application becomes much less complicated to program for because you don't have to handle eventual consistency.
Note here that you also get these beautiful single digit millisecond reads because you're reading from the closest location to wherever your worker point of presence is.
So we consider this a next generation serverless architecture. This architecture is optimal for business logic that you want to run at the edge and especially good if you can take advantage of caching.
You get really fast reads whenever your cache misses.
Now, that's beyond the scope of this talk, but of course, here you'd be thinking about integrating Workers KV to provide that caching layer.
For the demo that we're going to do today, we're just going to integrate Cloudflare Workers and Fauna.
And of course, you probably know the benefits of Workers are that they deploy everywhere across Cloudflare's global CDN.
They use Cloudflare DNS, they have zero millisecond cold starts worldwide, and as a database partner, Fauna is offering you those global writes, and reads from the lowest latency replica.
So a very simple overview of the solution is that Cloudflare is forming that Smart API layer over and around Fauna so that your clients access Cloudflare giving all the benefits that we talked about previously while storing all your data in Fauna.
Keep in mind that it's not a single monolithic instance of Fauna. These are distributed instances of Fauna that are pushed out closer to Workers.
And this image does a good job of depicting how that distance and thus time or latency gets cut down by that distribution.
Now a lot of this may seem like Advanced Distributed Systems magic.
And on the implementation side, honestly, sometimes it seems like that to me, but it's really easy for you to get started and build that way.
And in fact, we're going to go ahead and build a REST API using Workers and Fauna right here in this talk.
To do that, we're going to go to this template. This template's available in Fauna Labs, where we provide source for you.
But I'm also going to give you a link at the end of the demo.
That link, if you want to follow along right now, is fauna.link/workers-repo.
For later, there's also a tutorial that we have in the Cloudflare Docs that's at fauna.link/workers, and I'll give you that link at the end of this presentation as well.
But we go back and we want to create a new Wrangler application, or a new Cloudflare Worker, using the Wrangler CLI, and we get the instructions here.
We just generate a named API by passing it the template url.
In our command line.
That looks like this: wrangler generate, the name of the API, and the template URL.
Only a second to generate that.
We go in here and we see that we have a Fauna directory with some resources that we're going to use, we have our Worker, and we have our wrangler.toml.
Now, this is already set up and ready to go.
So after you install your dependencies, of course, the curse of a live demo.
This is already set up and ready to go.
With not much that does anything at all.
But that's okay because it's giving you this blank template.
We can simply curl that template.
And we see our Worker's deployed.
So let's open this up and let's take a look at what we need to do to make this actually an API that's fronting our Fauna database.
We need to go into our dashboard and create a database.
Now, a quick note.
If you don't have a Fauna account, you can sign up for one.
It only takes a few seconds.
You can sign up using your GitHub account.
You don't have to provide payment information until you're ready to start building your application.
And we have a free tier that's free for life.
So we'll go over here where I'm already signed in, and we have a database that we've created just for this application.
Now the next thing we need to do is to deploy our Fauna resources.
To do that, we're going to need a new key.
I'm going to copy this key out of here.
This is an admin key.
Which means I'm going to pull this off the screen so that it's a little bit hidden, because this allows you to do anything with your application, including creating infrastructure, destroying infrastructure, etc.
We go back to our instructions and we need to run this Fauna schema migrate command that you see here.
Fauna schema migrate is infrastructure as code for your Fauna database.
And what that does is take these resources that we've defined here, including a product collection and the relevant functions for our account, as well as security roles.
And this creates them in the database for us.
We'll investigate more on this in a second.
But all you need to know is that this is set up and ready to go with leading practices for you as is, so that you can see out of the box how we recommend implementing CRUD routes using Workers and Fauna.
So to do this, first, we generate our migrations.
And we see that it's going to add all of those resources that we talked about.
Now we're going to apply our migration.
Oh.
I'm actually going to run it so that it'll ask me for the key. Oh, okay.
Give me a second to put that key in here, in my environment. Clear my screen.
And with that key there, we see that we can apply all of these migrations into our database.
Now, I'm going to bring that back down so that you can see what that looks like.
We created this product collection with no documents in it.
A collection, if you're not familiar with document databases, is just like a table in a relational database.
It holds documents which are analogous to rows or items.
We created these functions for CRUD functionality.
Add product quantity.
That's an update statement.
Create a product, delete a product and get products by ID.
This all does what you would expect it to do.
And then we have our key that we created previously, and we have roles for each of these routes.
So again, this isn't a basic way to make this work. This is a leading practice framework for you to start building your application.
Now that we have this framework in place, we need to put a secret for our application.
That's going to be our key, not the one that we used earlier, but a Worker key.
I want to show you what that means.
This Worker key is going to be associated with this Worker role.
The Worker role can't do anything to collections or indexes.
It can't read data directly.
It can't read indexes directly or write.
But what it can do is call these user defined functions.
So the Worker role only has access to perform the actions that are defined in these user defined functions.
User-defined functions are like stored procedures that are stored in your database and can be run.
It's important to note they can also be unit tested using the framework of your choice.
So if you're familiar, for example, with Jest, then you can unit test your user-defined functions using Jest.
Then you call them from a client so that you're sure that they're correct.
So this is our Workers role.
We're going to go back and create a key associated with that role.
And in this case, I'm actually going to leave that on the screen.
Because all that you can do here is invoke those four functions with it.
In general, you would still protect this as a secret, but I want you to see what all of this looks like.
And we return to Wrangler.
Is it wrangler put secret or is it wrangler secret put? There it is, secret put.
Put Fauna secret.
Paste that value and we're done.
Now all we have to do is either publish it or test it locally.
Let's go ahead and test it locally so that you can see that Wrangler dev gives you that option as well.
And you can see that our Worker is listening on port 8787.
Now, in this template, we've given you some preexisting commands to let you exercise all of these routes.
The first is a command to create a product.
I'm going to go ahead and paste this here.
But before we run it, we want to take a look at what's going on in our application.
We have our products collection with no documents in it. We run our command and we get a product ID back.
Now when we reload our products collection, we see that that document was created by the Worker.
So what's happened is we ran the Worker locally.
The Worker stored that data in Fauna, and now it's available for us to read and write against.
Similarly, you can get information back on that product.
This is the retrieve part of your CRUD API.
And the routes work just as you'd expect.
If we run this.
And pipe it to something a little more enjoyable to look at, then we see that we got that same document back with the data that we have in our console.
So all of this is working as expected. Let's look at the code for a second.
Let's see what's actually happening here.
The first place to look is in our Worker.
Going to blow this up so that you can see it a little bit better.
We were using worktop, which is one router that's available for Workers.
You may use something else, but the general format here should be familiar.
The first thing we did was send a post request to create that new product.
In that case, we have a route that's listening on the products endpoint, and it invokes a function that breaks out parameters from the body and calls this user-defined function named create product, passing it the parameters that we specify here.
Then we send back a response with the returned ref ID and handle any errors.
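The flow of that route can be sketched roughly as follows. This is not the template's code: the field names, status codes, UDF name `CreateProduct`, and the `callFunction` parameter (a stand-in for the Fauna client) are all assumptions for illustration, and the router is left out so the snippet runs on its own.

```typescript
// Hypothetical stand-in for the database client: calls a named UDF with arguments.
type CallFunction = (name: string, args: unknown[]) => Promise<{ id: string }>;

interface CreateProductBody {
  serialNumber: string;
  title: string;
  weightLbs: number;
  quantity: number;
}

interface HandlerResult {
  status: number;
  body: Record<string, unknown>;
}

// Breaks the expected parameters out of the request body, or returns null.
function parseCreateProductBody(body: unknown): CreateProductBody | null {
  if (typeof body !== "object" || body === null) return null;
  const { serialNumber, title, weightLbs, quantity } = body as Record<string, unknown>;
  if (typeof serialNumber !== "string" || typeof title !== "string") return null;
  if (typeof weightLbs !== "number" || typeof quantity !== "number") return null;
  return { serialNumber, title, weightLbs, quantity };
}

// Calls the UDF with only the extracted parameters and shapes the response.
async function handleCreateProduct(body: unknown, callFunction: CallFunction): Promise<HandlerResult> {
  const params = parseCreateProductBody(body);
  if (params === null) {
    return { status: 400, body: { error: "Invalid product payload" } };
  }
  try {
    const { serialNumber, title, weightLbs, quantity } = params;
    const created = await callFunction("CreateProduct", [serialNumber, title, weightLbs, quantity]);
    return { status: 201, body: { productId: created.id } };
  } catch (err) {
    const message = err instanceof Error ? err.message : "Unknown error";
    return { status: 500, body: { error: message } };
  }
}
```

Passing the client in as a parameter keeps the handler easy to exercise with a fake, which is one way to test a route without touching a real database.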
So what does that user defined function itself look like?
This is the Fauna Query Language, and it's the language for directly manipulating the database in Fauna.
It's a functional programming language similar to Lisp.
Here we see that we've created a function named Create product.
That's the one that we call.
It accepts parameters for serial number, title, weight in pounds, and quantity that match the parameters we provided.
And then it creates a document in the products collection.
Using that data that we've specified in the parameters.
And you'll see here that we have key and value pairs that match the parameters that were passed in.
Now these names don't have to match. This is the key that we are assigning in the database and this is the variable, the parameter argument that's being passed into the user defined function.
And finally, you'll see that this function runs with the role create product UDF.
So let's take a look at that as well and see how the security boundary is limited by using roles.
The Create product UDF role is defined here, again in FQL.
It's fairly straightforward.
You give it a name, Create product UDF, and then you give it a list or collection of privileges.
In this case, to create a product, you only need one privilege. You need to be able to create a document in the products collection.
So this is a fairly straightforward role.
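To make the two layers of that security boundary concrete, here is a small TypeScript model of it. This is plain data and a lookup function, not FQL and not Fauna's API. The resource strings and the function names `CreateProduct`, `GetProductByID`, `AddProductQuantity`, and `DeleteProduct` are assumed spellings of the four functions mentioned in the talk.

```typescript
type Action = "read" | "write" | "create" | "delete" | "call";

interface Privilege {
  resource: string;
  actions: Action[];
}

interface Role {
  name: string;
  privileges: Privilege[];
}

// The Worker's key maps to a role that may only call the four UDFs.
const workerRole: Role = {
  name: "Worker",
  privileges: [
    { resource: "function:CreateProduct", actions: ["call"] },
    { resource: "function:GetProductByID", actions: ["call"] },
    { resource: "function:AddProductQuantity", actions: ["call"] },
    { resource: "function:DeleteProduct", actions: ["call"] },
  ],
};

// The create-product UDF runs with its own role: one privilege only.
const createProductUdfRole: Role = {
  name: "CreateProductUDF",
  privileges: [{ resource: "collection:Products", actions: ["create"] }],
};

// True only if the role explicitly lists the action on the resource.
function isAllowed(role: Role, resource: string, action: Action): boolean {
  return role.privileges.some(
    (privilege) => privilege.resource === resource && privilege.actions.includes(action)
  );
}

console.log(isAllowed(workerRole, "function:CreateProduct", "call")); // true
console.log(isAllowed(workerRole, "collection:Products", "read")); // false
console.log(isAllowed(createProductUdfRole, "collection:Products", "create")); // true
```

Anything not listed is denied, so a leaked Worker key can invoke the four functions but cannot read or write the collection directly.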
Now we've run create and retrieve, and retrieve or read is basically the same.
But let's change it a little bit by running an update.
Updates are instructive because they require a little bit different permissions from what you would need.
So we'll take a look at this. We'll paste our sample code.
Get our document ID.
Replace document ID in our URL.
And send it.
And now we see that we should expect to have five Bluetooth headphones.
In our products collection. And if we reload this, we do.
So let's look at this code path again.
Similar to create, we start with a PUT route or a PATCH route on the products collection.
We have written a blog post about why we choose to use PATCH and not PUT here.
It's because we're giving instructions on how to update the data rather than the updated shape of the data.
That blog post is linked here from the template in case you want to see it.
In this case, we're listening on this route and we get the product ID to add quantity.
We destructure the variables that we're interested in from the request body and its parameters.
And again, we just call a user defined function passing only those parameters.
Now note here, we haven't passed any information about the weight in pounds, the name of the product or anything else.
We're only passing quantity information.
This is important to remember because when you run a Fauna update, it only updates the fields that you specify.
To remove a field, you need to set that key to null.
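Those update semantics can be modeled in a few lines. This is a simplified sketch in plain TypeScript, not Fauna's implementation: it only handles top-level fields, and the name `applyUpdate` and the sample field names are hypothetical.

```typescript
type Doc = Record<string, unknown>;

// Merges a patch into an existing document:
// - fields not mentioned in the patch are left untouched
// - fields set to null in the patch are removed
// - all other fields are overwritten
function applyUpdate(existing: Doc, patch: Doc): Doc {
  const result: Doc = { ...existing };
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) {
      delete result[key];
    } else {
      result[key] = value;
    }
  }
  return result;
}

const product: Doc = { title: "Bluetooth Headphones", weightLbs: 1, quantity: 0 };

// Only quantity changes; title and weightLbs are preserved.
console.log(applyUpdate(product, { quantity: 5 }));

// Setting a key to null removes that field from the result.
console.log(applyUpdate(product, { weightLbs: null }));
```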
If we go back and look at the add product quantity user-defined function, we see that it's a little more complex, but it's taking in a product ID and a quantity.
It's constructing a reference which is Fauna's native ID type from the ID that you specify.
A reference is a combination of a resource and an ID in this case, the products collection and the ID you specify.
It's getting the entire document as it exists.
It's pulling out the current quantity from that document and it's updating the amount not by setting quantity to five, but by adding five to that quantity.
And that's why we have a patch here instead of a put.
If we were to run this again, we see that our quantity is now increased to ten and that's as expected based on our code, which is adding the quantity to the existing document rather than setting it to the amount that we specify.
This section simply defines the shape of our return.
So we're going to return that object as you've seen it before, and it's a little easier to see if, again, we pretty it up just a little bit.
Let me get here for you, by passing it through something like jq.
And here, now that we've run our update three times, as expected, we have a quantity of 15.
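The behavior seen in the demo, where each call adds to the stored quantity rather than replacing it, can be sketched like this. It is an in-memory model in TypeScript, not the FQL function from the template; the `Map`, the ID `"1"`, and the field names are assumptions, and unlike Fauna this model makes no transactional guarantees.

```typescript
interface Product {
  title: string;
  quantity: number;
}

// Reads the current document, adds to its quantity, and stores the result.
function addProductQuantity(products: Map<string, Product>, id: string, quantity: number): Product {
  const current = products.get(id);
  if (current === undefined) {
    throw new Error(`No product with ID ${id}`);
  }
  // Add to the existing quantity instead of overwriting it.
  const updated: Product = { ...current, quantity: current.quantity + quantity };
  products.set(id, updated);
  return updated;
}

const products = new Map<string, Product>([
  ["1", { title: "Bluetooth Headphones", quantity: 0 }],
]);

addProductQuantity(products, "1", 5);
addProductQuantity(products, "1", 5);
const afterThird = addProductQuantity(products, "1", 5);
console.log(afterThird.quantity); // 15
```

Because the request describes a change to apply rather than the final state, repeating it changes the result each time, which is the reason given in the talk for using PATCH.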
And then the final route that you have is to delete a product.
This is almost exactly the same as reading the product.
You just need to get that product ID.
Paste it at the end of the URL, and note that the delete returns the last state of the object as it existed right before the delete.
So you know what you have before you deleted it.
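A minimal in-memory sketch of that delete behavior, assuming a `Map` stands in for the collection and using the hypothetical name `deleteProduct`:

```typescript
// Removes a document and returns its last state from just before removal.
function deleteProduct<T>(products: Map<string, T>, id: string): T {
  const lastState = products.get(id);
  if (lastState === undefined) {
    throw new Error(`No product with ID ${id}`);
  }
  products.delete(id);
  return lastState;
}

const inventory = new Map([["1", { title: "Bluetooth Headphones", quantity: 15 }]]);
const removed = deleteProduct(inventory, "1");

console.log(removed.quantity); // 15
console.log(inventory.size); // 0
```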
But when we go back over here and reload, we have no more products.
So a quick review here.
In this talk we discussed what APIs at the edge are and how they're a little different from traditional APIs that you might host in your data center or in a single cloud region.
We talked about what are some positive use cases or appropriate use cases for data driven APIs at the edge?
We covered some less appropriate use cases, those that are centralized by their nature or where they generate a lot of data from line of business applications that are already internal to your company.
And then we talked about the advantages, primarily speed and increased revenue and some of the disadvantages.
As promised, there's a link of references here, or there's a few references here on this slide.
The first one takes you to a tutorial in the Cloudflare documentation that walks you through, step by step, building a REST API using Cloudflare Workers and Fauna.
The second is a link to that repo that I just showed you that we used for this talk.
The third, fauna.link/calvin, is a deeply technical paper that forms the basis for how Fauna handles replication as a single-step process and avoids multi-region locking.
If you really want to do a deep dive on "does this actually work, and how," this is the paper for you. I still recommend that you read at least the first page of it, or the abstract, to get a general idea of how Fauna works.
If you'd like to learn more about Fauna, be sure to follow us on Twitter.
We have a YouTube channel with additional talks.
If you're watching this talk later and you have questions, my Twitter handle is there at the bottom of the slide, rts_rob.
And again, I'm going to go ahead and jump over to the Cloudflare Workers Discord Server to see if any of you had any questions during the talk.
Thank you so much for joining us.
Look forward to seeing what you build.