💻 Serverless WebAssembly with Cloudflare Workers
Presented by: Robert Aboukhalil
Originally aired on July 10, 2022 @ 5:00 AM - 5:30 AM EDT
Cloudflare's Full Stack Week Developer Speaker Series
This talk explores how to get started with building APIs powered by WebAssembly on Cloudflare Workers. As a concrete example, we'll take a data analysis tool written in C, compile it to WebAssembly, and deploy it with wrangler. We'll also discuss various debugging tools, along with the advantages and pitfalls of serverless WebAssembly.
Visit the Full Stack Week Hub for every exciting announcement and CFTV episode, and check back all week for more!
Transcript (Beta)
Hey everyone, thanks so much for tuning in. I'm really excited to tell you a bit about how to use WebAssembly with Cloudflare Workers.
So before I get started, just a little bit about myself.
So my name is Robert Aboukhalil, and my background is a mix of software engineering and bioinformatics.
And I've kind of been working at that intersection for a while.
I'm the maintainer of the open source tool biowasm.
It's basically a repository of bioinformatics tools that are pre-compiled to WebAssembly for easier use.
And finally, I'm the author of the book, Level Up with WebAssembly.
So if you're just getting started, this might be a good resource.
All right. So before we talk about serverless WebAssembly with Cloudflare, I figured it's worth spending a little bit of time talking about what WebAssembly is and why it's useful in the first place.
The way I like to think about it at a very high level, you can think of WebAssembly as really just another language alongside JavaScript that can run in a browser.
Now, what does this language actually look like?
Well, here's a very simple piece of code. Now, thankfully, you don't have to develop in this language, but the way that WebAssembly is typically used is as a compilation target.
In other words, you can take code written in other languages, for example, here's the same code in C, and we can compile it down to WebAssembly.
So that's another way to think about it as a compilation target for languages like C and C++ and Rust.
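To make the "another language alongside JavaScript" idea concrete, here is a minimal sketch. The bytes are a hand-assembled WebAssembly module exporting a hypothetical `add` function, roughly what a compiler might emit for a two-integer add written in C. It is an illustration only, not the code shown on the slide.

```javascript
// A tiny WebAssembly binary, written out byte by byte.
// Roughly equivalent to the C function: int add(int a, int b) { return a + b; }
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: function 0 as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add
]);

// Compile and instantiate the module, then call it like any JavaScript function.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});
const sum = wasmInstance.exports.add(2, 3);
console.log(sum); // prints 5
```

Nobody writes these bytes by hand in practice; a compiler produces them, which is the point of treating WebAssembly as a compilation target.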
Okay. What is it useful for?
Well, first and foremost, it's often used to reuse existing code. All these tools that I'm showing here are tools that are millions of lines of C or C++ code that you don't want to have to rewrite in JavaScript just to bring it to the web.
So WebAssembly lets you do that without rewriting everything.
Another powerful use case is performance improvement.
All these tools I'm showing here have examples of replacing slow JavaScript computations with optimized WebAssembly in order to speed up their performance.
Now, obviously, that won't always happen. JavaScript is quite powerful, but there are some use cases where WebAssembly is the right fit there.
And finally, there's also the portability story with WebAssembly, that you can not only run it in the browser, but also in these other environments.
And in particular today, I'm going to focus on Cloudflare Workers. And so what does it actually look like to run WebAssembly in Cloudflare Workers, and why would you want to do that?
And so typically, you have a user's computer with a browser, and you run a WebAssembly module inside of that browser. Now we're looking at a different kind of model, where the user sends an API request and we want to execute the WebAssembly inside Cloudflare Workers. It actually looks pretty similar.
And what's really nice about Cloudflare Workers is that they're powered by V8, and V8 is the JavaScript and WebAssembly engine that powers Chrome.
And so what this means is that if we run WebAssembly modules in Cloudflare Workers, we're technically still running them in a browser engine.
It's just an engine that's hosted on Cloudflare servers instead of the user's computer.
The one benefit that's really nice about this is that I can actually take the exact same WebAssembly binary, run it both on the front end and on a serverless function that runs on Cloudflare using the exact same file, which would not be the case if you used other providers that aren't based on V8.
So that's one of the really nice benefits there. So let's take a look at a very concrete example.
Let's say I want to build an API that calculates the nth digit of pi.
And obviously this is a very complicated algorithm. I don't want to have to rewrite it in JavaScript.
And so we can take this off-the-shelf C program, and essentially what we want to do is compile it to WebAssembly, run it on Cloudflare Workers, and then return the result to the user.
So what is this actually going to do?
If you look at this, these are all the digits of pi, for example.
Not all, but some of the digits of pi. And if you give the algorithm n equals one, it will return these numbers in red.
If you give it n equals two, it returns these numbers, and so on and so forth.
And so what we actually want to do is build an API where you give it a value of n.
It uses the WebAssembly module to compute the result and then returns it back to the user.
So in order to do that, the first thing we have to do is to compile our C program to WebAssembly.
And the way we're going to do that is using a fantastic tool called Emscripten.
So if you ever have to compile C or C++ code to WebAssembly, that is the tool of choice.
It has a lot of great utilities and a lot of ways of managing virtual file systems and so on.
So it's a fantastic tool. And so usually if I wanted to compile the C program to a binary, I would use the compiler gcc.
In this case, because I want to compile it to WebAssembly, I'll use emcc, which is the Emscripten C compiler.
And so you can see here I'm asking for a .js output.
What this will do is generate both the .js and .wasm.
And the .wasm makes sense, right? We're compiling to WebAssembly, but why do we want a .js?
And the reason is that this JavaScript file is actually what's called glue code.
So it helps initialize the WebAssembly module and contains all these utility functions that we're going to want to use later on.
A few other things to note.
We give it a few custom Emscripten settings, like MODULARIZE=1.
We just do that so that we're able to essentially import this as a bundled module from a JavaScript file instead of having it just be defined in the global state.
Another thing of note is TEXTDECODER=0. This is just to highlight one little thing, which is that there are sometimes differences between running it in the browser and in Cloudflare Workers: the TextDecoder API is not yet supported on Cloudflare Workers for certain encodings.
And so we're just telling Emscripten, we need to polyfill this in the meantime.
And we're basically telling Emscripten, give me a way to call the main function from my C program.
And we want to be able to have that accessible to us within this module. Okay.
So now we've compiled this. We have our WebAssembly file. We have our JavaScript file.
Next, I want to deploy this to Cloudflare Workers. And there are three files that I'm going to add in order to do that.
And I just want to go through them and show you really the bare minimum that is needed to do so.
So first you have package.json.
All I'm going to do here, obviously usually your package.json will be much larger, but I just want to show this is all you need, is to tell it my main entry point is essentially index.js.
And we'll talk about what index.js looks like.
But you also need a wrangler.toml file. So this will contain all your configurations.
A few things of note: here I'm calling my worker pi, so it's going to be deployed to pi.robert.workers.dev.
And this is live. You can try it out. The one thing that is WebAssembly-specific is this wasm_modules line.
And so what this will do is map the PI_WASM variable to the pi.wasm file.
And I'll show you why this is needed in a second.
Because we're going to want to use this global variable PI_WASM inside of index.js over here.
And I'll explain in a bit why exactly we do that.
So if you look at this index.js file, the first thing we do at the top is import the glue code under the name Module.
And remember, we could only do that because I passed the MODULARIZE setting here to Emscripten.
And so then the rest should look fairly familiar with the event listener and the handle request.
The one thing that's different here, however, is we're initializing the WebAssembly module and giving it a few parameters.
In particular, we're giving it custom values for print and printErr.
What this does is tell Emscripten, I don't want to send standard out and standard error to console.log.
I want to actually store it in a variable called output that I will then return to the user as the value of the API request.
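The output-capturing idea can be sketched as follows. Emscripten's `print` and `printErr` module options are real, but `createFakeModule` is a hypothetical stand-in for the factory exported by the generated glue code, and its messages are made up. It also assumes the program's entry point is reachable as `callMain`, which the talk does not name.

```javascript
// Hypothetical stand-in for the factory exported by the Emscripten glue code.
// The real one would run the compiled C program; this one only mimics how
// standard out and standard error are routed through the options object.
async function createFakeModule(options) {
  return {
    // Assumed entry point: runs the program's main() with string arguments.
    callMain(args) {
      if (args.length === 0) {
        options.printErr("usage: pi <n>"); // made-up usage text
        return 1;
      }
      options.print(`digits at position ${args[0]} would go here`);
      return 0;
    },
  };
}

// Run the program once and collect everything it writes into one string.
async function runAndCapture(createModule, args) {
  let output = "";
  const instance = await createModule({
    print: (text) => { output += text + "\n"; },    // standard out
    printErr: (text) => { output += text + "\n"; }, // standard error
  });
  instance.callMain(args);
  return output;
}

runAndCapture(createFakeModule, ["2"]).then((output) => console.log(output));
```

In a Worker, the returned string would become the body of the response sent back to the user.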
And then we do something else here. By default, when Emscripten is instantiating the WebAssembly file, it uses the path of the pi.js file to infer where it should download the pi.wasm file from.
But here we're not actually hosting these files because we don't need them to be executed in the browser.
And so this is why we actually take the WebAssembly module from this global variable PI_WASM and instantiate it that way.
And so this is one of the few differences that you'll see in how you actually execute this compared to in the browser.
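One way to do this is Emscripten's `instantiateWasm` hook, sketched below; the talk does not show the exact code, so treat this as an assumption about the approach. The global is called `PI_WASM` here as an assumed name, and it is faked with the smallest valid module (header only) so the sketch runs outside of Workers.

```javascript
// Stand-in for the global the Workers runtime would provide: an already
// compiled WebAssembly.Module. The name PI_WASM is an assumption.
// These 8 bytes are just the WebAssembly header, i.e. an empty module.
const PI_WASM = new WebAssembly.Module(
  new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00])
);

// Options object that would be passed to the Emscripten-generated factory.
const moduleOptions = {
  // Emscripten calls this hook instead of locating pi.wasm relative to pi.js.
  instantiateWasm(imports, successCallback) {
    const instance = new WebAssembly.Instance(PI_WASM, imports);
    successCallback(instance); // hand the instance back to the glue code
    return instance.exports;   // synchronous instantiation returns the exports
  },
};

// Simulate the glue code invoking the hook, to show it in action.
let received = null;
const exportsObject = moduleOptions.instantiateWasm({}, (instance) => {
  received = instance;
});
console.log(received instanceof WebAssembly.Instance); // prints true
```

With the real glue code, `imports` would carry the functions the compiled program expects from JavaScript, rather than the empty object used here.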
So now that we've instantiated our module, we can retrieve the parameters from the URL and then call the main function with that value of n and then return the response to the user.
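Retrieving the parameter might look like the sketch below. The query parameter name `n`, the example hostname, and the input check are all assumptions added for illustration; the talk only says the value is read from the URL and handed to the main function.

```javascript
// Turn the request URL into the argument list for the C program's main().
// The query parameter name "n" is an assumption about the API's shape.
function argsFromUrl(requestUrl) {
  const raw = new URL(requestUrl).searchParams.get("n");
  // With no n, pass no arguments so the program prints its own usage message.
  if (raw === null || raw.trim() === "") return [];
  const n = raw.trim();
  // Added safeguard: only let positive whole numbers through to the C code.
  if (!/^[1-9]\d*$/.test(n)) {
    throw new RangeError(`n must be a positive integer, got "${raw}"`);
  }
  return [n]; // main() receives its arguments as strings
}

console.log(argsFromUrl("https://pi.example.workers.dev/?n=1000"));
```

Returning an empty list when `n` is missing is what lets the program's own usage message reach the user.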
So in order to deploy this, all we have to do is run wrangler publish.
And it basically does it. And you can see in the dashboard, it'll show up and tell us that we have deployed our tool to the pi URL.
And then one thing you'll notice is that the dashboard will list a new section that will show you all your WebAssembly variables and their contents.
And so then if I actually go to this API and don't specify a value of n, it just shows me the default usage message.
And this is text that's defined in the original C code. So I didn't touch any of that.
But then if you actually specify a value of n, then it will do the right thing and give you the decimals that you expect to see.
So great. This is working.
We've gone from a C program, compiled it to WebAssembly so that we can run it on Cloudflare without rewriting this complex algorithm in JavaScript.
However, if you ask it for the digits of pi at position 1,000, then it uses too many resources and gives you this error:
Worker exceeded resource limits. And that is because by default, we're using workers that are of the Bundled type, which means that they only have 50 milliseconds of CPU time.
But because we're running a pretty compute-intensive calculation, what we're going to want to do is pick the Unbound option.
And so, very briefly, the difference between the two is, you know, with Bundled, you have 50 milliseconds of CPU time.
But that doesn't include fetches to the network.
With Unbound, the 30 seconds does include fetches to the network.
In our case, we're not fetching anything. We're just running some math calculations.
So, we're going to go with Unbound. And now, if we run the API call again, now it actually works and gives us the right answer.
But I was kind of curious, you know, what are the limitations of this?
Like how far can I push it for this particular computation?
How many digits of pi can I go up to? And so, here I'm going to show you a graph of the benchmark where on the X axis, you can see the nth digit of pi.
So, the value of n that I give the API. And on the Y axis is the runtime in seconds.
And so, this is what it basically looks like. So, at 1,000, maybe it takes a second or so.
When you go up to around 3,000, now we're at 10 seconds.
And at around 6,000 digits of pi, we're talking about almost 30 seconds runtime.
But anything above that, as you can see with these red triangles, exceeds the limit.
So, it just fails. And it's really interesting to see that, you know, it does actually go up to 30 seconds, which is pretty great.
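A benchmark like this can be sketched as a simple timing loop. Here `callApi` is a hypothetical async function that calls the API for a given n and throws on failure; the fake version below never touches the network, and its cutoff is made up to mirror the shape of the graph.

```javascript
// Time one call per value of n and record whether it succeeded.
// callApi is a hypothetical async function (n) => result that throws on
// failure, for example when the Worker exceeds its resource limits.
async function benchmark(values, callApi) {
  const rows = [];
  for (const n of values) {
    const start = Date.now();
    let ok = true;
    try {
      await callApi(n);
    } catch (error) {
      ok = false; // would be drawn as a failed run on the graph
    }
    rows.push({ n, seconds: (Date.now() - start) / 1000, ok });
  }
  return rows;
}

// Fake API for demonstration only: fails above a made-up cutoff.
const fakeApi = async (n) => {
  if (n > 6000) throw new Error("Worker exceeded resource limits");
  return "digits";
};

benchmark([1000, 3000, 6000, 7000], fakeApi).then((rows) => console.table(rows));
```

Running the calls one at a time keeps each measurement from competing with the others.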
Now, of course, an API call that lasts for 30 seconds, I'm not sure that's the right way to do it.
But in any case, if this fits your model, that can be an approach you can take.
Okay.
Another thing I wanted to talk about is how do you debug these sorts of WebAssembly modules?
One challenge that you have more generally, WebAssembly or not, is with serverless functions, it's sometimes hard to really know what's actually going on in real time.
And you sometimes have to rely on logs that are lagging.
But there are a few ways you can get some pretty good debugging experience with workers.
So, here I'm going to show you a different application. So, we talked previously about looking at digits of pi.
Here I want to talk about simulating DNA sequences.
So, the idea here is that I have an API that uses a WebAssembly module to simulate DNA sequences.
But it actually needs some initial data to get it started. And so, it's going to reach out to an S3 bucket, download a subset of the data, and bring it back to our worker so it can use that as a starting point upon which it simulates more DNA sequences.
So, the way this is going to look in the Workers dashboard is something like this.
So, I'm going to have my script on the left. I'm going to have a bunch of ways to run that script.
So, for example, if I make a GET request and I click send, then I will see that it returns a bunch of simulated DNA sequences.
So, you know, it's working. But what's really interesting about this interface is really what's down here.
And I want to point out this, you know, this is the DevTools, right, that we're all used to seeing.
But this isn't the DevTools that's running in my browser.
This is the DevTools that is actually running on the browser that's executing my script on the Cloudflare servers.
Which is really neat because, you know, I can go to the network tab and see that, in fact, we did make a query to get some data from S3.
And I can, you know, validate that by clicking on this and seeing that, in fact, I did do a request to that bucket.
And, you know, again, I do want to point out, this is not running on my computer.
This is running on Cloudflare servers.
But I'm able to debug this as though it was running on my computer.
And so, in fact, we can preview the response and see what the output is.
As for other tools, you can also use the JavaScript profiler.
I haven't used this as much.
But it can be really useful if you want to see which part of your code is taking up the most runtime that can help you in optimizing your application.
Another thing I really like about this interface is the fact that I can make a modification to my script.
For example, I can add a console log statement saying hello world over here.
And if I click the send button, it will use the modified script.
And I can see the results over here. But I haven't actually modified what's currently running in production.
And so, you're able to make these adjustments without making changes to what's live.
And I find that really neat for quick tests.
A few other debugging tools.
Things that are specific to WebAssembly: when you're compiling, you know, C and C++ code to WebAssembly with Emscripten, you can use these two flags to expose a lot more information about what the error actually is and where it came from.
And just keep in mind that that produces much bigger binary sizes.
So, you probably don't want that turned on in production. You can also, you know, stream the logs with wrangler tail, or directly from the UI.
These are really fantastic.
And they have very low latency. You know, the logs appear immediately, which is a really great experience when you're trying to debug something.
The other thing is this tool called Miniflare that I found to be really great for local development.
And if you're using it with WebAssembly, all I had to do was specify --wasm, and say that this global variable PI_WASM equals this file, pi.wasm, so they hook up together.
And you can make API calls to localhost, and it just works.
So, it's a great way to iterate more quickly on your development. Okay.
I do want to talk about some of the pitfalls of WebAssembly and things you probably want to keep in mind, you know, whether you use it in a serverless fashion or in the browser.
I sometimes talk to people who say they want to write a new tool in C++ or Rust so that they can compile it to WebAssembly and run it in the browser.
That can be a valid way to do things, but you probably want to ask yourself a few questions, like, you know, are there already existing tools that do similar things in those languages that I can leverage and not have to reimplement everything myself?
And in particular, if you're considering going down that route for performance reasons, just make sure that you aren't making too many assumptions about JavaScript being not performant enough.
Because the fact of the matter is that JavaScript is quite fast and has been optimized pretty drastically by the browsers in the last few years.
So, it's not a given that using these other languages through WebAssembly is going to give you more performance.
But it's definitely possible, and I've seen it with my own applications.
So, just something to keep in mind.
So, in other words, you know, obviously compiling off-the-shelf tools is preferable to writing your own because then, you know, you do less work.
But even then, it's also worth asking, well, for these off-the-shelf tools, has someone actually precompiled it to WebAssembly for me so that I don't have to go through that trouble?
So, that's the other even more preferred approach. So, for example, you know, tools like SQLite or the Python interpreter, TensorFlow library, FFmpeg, these have all been already compiled to WebAssembly so you don't have to go through all that yourself and worry about all the intricacies of WebAssembly.
So, that's kind of how I think about the various ways of using WebAssembly.
Now, if you do use WebAssembly, should you use it in the browser or at the edge?
It kind of depends on what your application is.
But there are several benefits to using WebAssembly in the browser, namely that you're using the user's compute power, which can be a positive or negative depending on how heavy the compute is.
But in another sense, it can also help you reduce your cloud costs by distributing the computation over to every single user.
The other benefit is if you are analyzing the user's local data, so, for example, if you have an application where you ask the user to provide you a file and you will then maybe subsample that file and do some computation on it, in some cases it might not even make sense to upload that data before analyzing it if you can do it directly in the browser.
Now, what about just WebAssembly in general?
When should you not use WebAssembly?
And, you know, obviously this is a hard question to answer. But from my experience, it kind of comes down to these three things.
If your application is not compute intensive enough, it's probably not worth using WebAssembly and you should just stick to good old JavaScript.
You know, if you're not doing audio processing or image processing or video processing or simulations or math or any sort of computation, then you probably should not use WebAssembly.
If your app is too compute intensive, on the other hand, right, if you're trying to do too much and analyze too much data and you're using WebAssembly in the browser, that might not be ideal either.
You know, WebAssembly is limited in how much RAM it can use, but also you probably don't want to use up all your user's RAM when they visit your website.
So, just another thing to keep in mind.
And finally, on a less technical aspect, you know, WebAssembly is not simple, so to speak.
And so, it does add some complexity to your code.
It adds complexity to your build process. It also adds complexity to your team in some way in that, you know, everybody needs to be at least a little comfortable with working with these new technologies so that they can build upon them and modify them.
So, I don't want to sound too negative. Obviously, WebAssembly, I think, is a fantastic tool and I've had a chance to use it a lot.
But it is a tool that you have to make sure you're using for the right reasons given all the potential increases in complexity it would add.
So, that's all I had to share today.
If you're interested in learning more about WebAssembly, you can check out my book at levelupwasm.com.
And I also have a whole bunch of free articles and videos at this URL down here, including a couple of articles about serverless WebAssembly running in Cloudflare Workers.
And they also include some sample code if you want to try that out yourself.
And with that, I want to thank you very much for tuning in.
Thank you.