Latest from Product and Engineering
Presented by: Jen Taylor, Usman Muzaffar, Achiel van der Mandele
Originally aired on September 11, 2020 @ 6:00 PM - 6:30 PM EDT
Join Cloudflare's Head of Product, Jen Taylor and Head of Engineering, Usman Muzaffar, for a quick recap of everything that shipped in the last week. Covers both new features and enhancements on Cloudflare products and the technology under the hood.
English
Product
Transcript (Beta)
Okay, hi, welcome to another issue of the Latest from Product and Engineering from Cloudflare's product and engineering teams.
My name is Usman Muzaffar, I'm Cloudflare's head of engineering with me, Jen Taylor, say hi.
Hi, I'm Jen Taylor, chief product officer at Cloudflare.
And we're really excited this week to have Achiel van der Mandele join us.
Achiel is a product manager on Jen's team. Achiel, why don't you say hi and tell everyone what you're responsible for.
Sure.
Hi, thanks a lot for having me and also just like to chat with y'all. So I'm a product manager here at Cloudflare, I've been here, I think a little over 18 months.
My focus area is very much on the edge of Cloudflare. I oversee a whole bunch of things, but very much the first time a byte or a connection or something hits our network.
That's kind of where I try to focus. Edge of the edge, right?
Edge of the edge. So that's like HTTP and advanced protocols, but also other non-website protocols, like FTP, gaming, that type of stuff.
Yeah. So let's talk about that word protocols for a second.
Like what do we even mean by that? It keeps showing up all the time, we keep saying protocols, protocols, and you go back to like, I think the first time I heard that word was when I was watching Star Wars as a little kid and C-3PO says I'm a protocol droid, like this idea that when two parties are contacting, these are the rules, like that doesn't follow protocol.
So why does the word protocol even show up all the time? Why do we have a team called protocols at Cloudflare?
What's going on there, Achiel? The fun thing is we literally have a team called protocols that I happen to work with, but that's a great question.
So a lot of time when we say we follow protocol, it's like, well, these are kind of like the steps that we are following to be able to achieve a task or do anything or interact with other people.
And it's very much the same on the engineering or Cloudflare side.
There are ways that your browser, your laptop, or whatever, like needs to talk to a website and there are like certain steps and certain contracts, if you will, or a protocol of how you interact with the website and retrieve that website from Cloudflare and have it show up in your browser.
It's really the order in which we say hello, right? Literally.
Like, hi, I'm a web browser and I would like some content, and the server says, oh, that's nice to meet you as a client, here's who I am, and here's proof that I'm really this website.
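The ordered hello exchange described above can be sketched as a toy program. This is an invented mini-handshake for illustration, not real HTTP or TLS; the message strings and steps are made up, but they show the idea of both sides following agreed-upon rules in a fixed order.

```python
import socket
import threading

def server(listener):
    """One side of a toy protocol: follow the agreed steps in order."""
    conn, _ = listener.accept()
    with conn:
        assert conn.recv(1024) == b"HELLO client"   # step 1: client greets
        conn.sendall(b"HELLO server")               # step 2: server greets back
        if conn.recv(1024) == b"GET /":             # step 3: client asks for content
            conn.sendall(b"200 <html>hi</html>")    # step 4: server serves it

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
t = threading.Thread(target=server, args=(listener,))
t.start()

client = socket.create_connection(listener.getsockname())
client.sendall(b"HELLO client")
greeting = client.recv(1024)        # the server's greeting
client.sendall(b"GET /")
content = client.recv(1024)         # the content
client.close()
t.join()
print(greeting, content)
```

If either side breaks the order, say the client asks for content before greeting, the exchange fails: that's what "doesn't follow protocol" means.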
So yeah, that's really interesting. Another team that you are responsible for that we're going to talk about is called Spectrum.
Tell us just a little bit of, again, by way of introduction.
What is Spectrum? Why do we call it that?
Sure. So a lot of, most people know Cloudflare as an operator of like HTTP services.
And when I say HTTP services, I mean like mainly websites, right?
Most people know like, hey, I can go to Cloudflare. I can put my website up on there.
You do security and you do CDN and workers, all of that stuff. The funny thing is the Internet is like a lot more than just that, right?
We all know like you play video games, that doesn't go through a browser.
There's no HTTP there. There's no website there, but you're still interacting with the Internet, right?
Or another thing is you might be transferring files through FTP or you're doing email.
All of those services want to get benefits on the security side, but also on the performance and reliability side.
That's where Spectrum comes in. Spectrum is essentially the way for you to put Cloudflare in front of those types of services.
And we offer stuff like DDoS protection, advanced firewall rules, but also allow you to like speed up those protocols by employing technologies like Argo smart routing.
Yeah. Yeah, so it's interesting. Like the picture in my head is a stack, right?
So like there's, you know, the lower layers of the network stack is literally physical wires connecting to each other.
Then one layer above that is, okay, two computers can talk to each other.
Then they can have IP addresses.
So I've got a number on the Internet, you have a number on it. And the top layers of the stack are those applications.
And what we built was a lot of stuff that's specific to websites.
But then we have all this technology, like you said, about protecting websites and protecting against DDoS attacks, you know, protecting against malicious intrusions, protecting against bots.
And that stuff applies to anything on the Internet.
So it's being able to apply all of those infrastructure products for security and reliability to more than just websites.
So it's the whole spectrum I like to think of it as.
You know, everything on the rainbow, the entire spectrum of applications on the net.
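A layer-4 proxy in the spirit of what Spectrum does can be sketched in a few lines: accept any TCP connection and shuttle bytes to an origin without understanding the protocol inside. This is toy code under stated assumptions, a single connection, no security logic, and an invented uppercasing origin standing in for a real service.

```python
import socket
import threading

def origin_server(listener):
    """A stand-in origin service: echoes whatever it receives, uppercased."""
    conn, _ = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def proxy(listener, origin_addr):
    """Accept one client and pipe bytes to and from the origin, opaquely."""
    conn, _ = listener.accept()
    with conn, socket.create_connection(origin_addr) as upstream:
        upstream.sendall(conn.recv(1024))   # client -> origin, bytes untouched
        conn.sendall(upstream.recv(1024))   # origin -> client, bytes untouched

def listen_local():
    s = socket.socket()
    s.bind(("127.0.0.1", 0))
    s.listen(1)
    return s, s.getsockname()

origin_l, origin_addr = listen_local()
proxy_l, proxy_addr = listen_local()
threading.Thread(target=origin_server, args=(origin_l,)).start()
threading.Thread(target=proxy, args=(proxy_l, origin_addr)).start()

client = socket.create_connection(proxy_addr)
client.sendall(b"hello spectrum")
reply = client.recv(1024)
client.close()
print(reply)
```

The point is that the proxy never parses the payload: the same code fronts a game server, an FTP server, or a mail server.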
So, and Jen, I think this is probably a question that I was going to ask you.
Like one of the things that we were asked to work on recently was regional services.
And, you know, just from the product point of view, like what is regional?
What's regional about this?
After all, it's a global Internet. Like we've got points of presence everywhere.
So, you know, where is some of the requirements of regional coming from?
And what is it that Achiel and the engineering team have to work on? Well, it's interesting you ask that, right?
Because we go back to Achiel being responsible for the edge of the edge, right?
And basically being the doormat to the front door of Cloudflare, if you think about it.
You know, one of the things that we're starting to see is that different parts of the industry, different parts of the market, specifically different regions, have different requirements for how their traffic and the data around their traffic should be processed.
And in particular, you're hearing in markets like in Europe, where they want to just ensure that all of that traffic is only processed in Europe.
I can understand why they would want to do that, want to do it for regulatory reasons, privacy reasons, security reasons.
You know, they have some very specific regulatory reasons why they want to do it.
Now, if you step back and you're like, okay, that makes sense.
But if you remember, Cloudflare is an Anycast network, which means that, you know, typically what ends up happening with Cloudflare traffic is it comes in the doormat, the front door of Cloudflare, and it's processed in the colo where it lands.
And then we send a bunch of information back to the central brain that lives in Portland.
The challenge that we posed to Achiel was: Achiel, solve Europe's problem on our Anycast network.
And so Achiel, I pass the challenge off to you.
How did you tackle this? How did you even frame the problem?
So the problem I think originally was that we were increasingly seeing people asking about this.
Like, hey, where do you process? Where is my data?
But the challenge here was that that seems like a very simple question. Like, can you just process or do data stuff in this region?
But there's a lot of nuance into like, what does that mean?
Does that mean data can flow through us? Does that mean we decrypt or apply this product or that product or store it on disk?
So from a personal point of view, I thought this was a really, really interesting challenge.
Not so much from the engineering point of view, but very much from the product point of view.
And I'll tell you why. These things mean very different things to different people.
And it's because a lot of this is up to like the interpreter, right?
We have certain laws and we have certain people who feel a little bit icky about these things.
But a lot of times they just say, like, make it local.
But then don't tell you exactly what that means. So we actually spent... Sounds like a job for a product manager.
A lot of fuzziness. Turn it into something concrete.
Big, important, poorly defined thing. Call a product manager. And if you ask 10 different people how to solve this, you will get 10 different answers.
11 different answers.
Ultimately, what we just did is we're just putting up like straw men.
Like, hey, how would you feel if we approached it this way or that way? And really trying to narrow down into what is processing to you and what does that mean to you?
How does that manifest in your daily life? And that's where the interpretation also came from, right?
A lot of people don't necessarily even care directly about GDPR, but they've been forced to write stuff into their contracts that say certain things.
So a lot of this... Let's pause there for a second. GDPR.
All I know is that right around May of 2018, every single website I ever visited started giving me big warnings about cookies and big buttons that said, accept cookies, accept cookies.
So what would this... In two sentences for everyone to know, what's GDPR and how did that come into this whole thing?
GDPR has a whole set of strict requirements surrounding where data is allowed to flow, who can look at that data, and where you apply processing or products.
Obviously, with us being an Anycast network, that does raise certain questions surrounding how does that work?
How do you operate that? It's not as simple as saying, well, we only have one data center and one server that's handling your traffic.
In many ways, the whole point of the Internet and the whole point of Cloudflare is to make sure we process your request and we will find a computer to process it.
Even if it means sending it far from where the eyeball is originally, that's literally the point.
And while we have... It's very interesting, right? Because networks are highly aware of other networks they talk to.
They have no clue about where the political boundaries are that they are crossing.
They know about autonomous systems and they know about LANs and they know about WANs, but they have no clue where they are crossing.
They're totally not bound to countries in any way.
Yeah, why wouldn't they? In the absence of any definition of country or whatever, when we thought about building our network, the thing that we optimized for is process that information as quickly as possible.
As fast as you can. So we optimized for speed.
And we're like, location is irrelevant, speed is paramount.
Yeah. But then we started hearing from customers that maybe we needed to add another piece into that equation.
A new axis to this whole puzzle. It's three-dimensional.
Yes. So what did we do, Achiel? So in the end, what we discovered, talking to a lot of customers and proposing things: one, most of them just asked for, can you do a regional Anycast in a smaller area?
And the issue with that is it's very antithetical to how Cloudflare operates.
And ultimately, in my opinion, not exactly what customers really want because you want that broad Anycast network, right?
Most of these people care a lot about DDoS protection.
The larger our network, the more we can mitigate. And that's extremely difficult in a smaller area rather than larger.
And what they end up really caring about is very much like, where is traffic decrypted?
The quote I always like to use, which is a verbatim quote from a customer, it's like, we can slice and dice this in a million different ways as long as you can promise me that no machine outside of the EU will see a decrypted bank account from an HTTP request from one of my customers.
We're good.
Everything else is just moot. I just want you to be able to make that promise.
So we've delivered literally on that. We use our global Anycast network, but we make sure that we don't decrypt.
All of that just flows through us and the Internet back to a data center inside the region of customer's choice.
We do EU and US, and we only decrypt and offer processing there.
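The decision just described, use the global Anycast network but only decrypt in-region, can be sketched as a toy routing rule. The colo codes, the region mapping, and the return values below are all invented for illustration; the real system is far more involved.

```python
# Toy model: any colo can pass encrypted traffic through, but only colos
# inside the customer's chosen region may terminate TLS. (All names invented.)
COLO_REGION = {"AMS": "eu", "FRA": "eu", "IAD": "us", "SIN": "apac"}

def handle_connection(ingress_colo, customer_region):
    """Decide what this colo may do with a customer's TLS connection."""
    if COLO_REGION[ingress_colo] == customer_region:
        return "decrypt-and-process"   # terminate TLS and run L7 products here
    return "forward-encrypted"         # shuttle the raw bytes to an in-region colo

print(handle_connection("SIN", "eu"))  # ingress far from the region: stays encrypted
print(handle_connection("FRA", "eu"))  # ingress already in the region: safe to decrypt
```

Either way, the ingress colo can still absorb volumetric attacks, because dropping opaque packets never requires decryption.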
That's great. Hold on a second.
I want to double click on that because you just said something that was really interesting to me, which was we wanted to preserve the power and the strength of what we do for Anycast, but also respect the regional processing decisions of our customers.
So how do we balance that? When a customer has regional services turned on, what still happens in any colo?
And then at what point does the traffic get passed back?
Why was regional Anycast such a bad idea? So regional Anycast very much limits how much network capacity you have.
When you look at large scale DDoS volumetric attacks, those can span many hundreds of gigabits per second.
You want as much network capacity to be able to disperse it across the globe and absorb it across the globe as possible.
Having a regional Anycast just very much limits what you can do there.
It's also a little bit trickier in that we have a round globe, so traffic naturally balances a little bit more nicely around there than if you have one region where all of the attacks potentially come from the outside.
But I guess a part of it is too, right?
With DDoS, I guess you just don't have to decrypt.
With DDoS, we're just seeing, we just look at the volume of the traffic and we're like, that is an unnecessarily large amount of traffic.
I'm going to handle and absorb in Singapore this huge glob of traffic and then pass the traffic back that's shuttled for Europe, back to Europe for the decryption.
Is that it? Yeah, exactly.
Yeah. The capacity aspect of it, another way of looking at it is from the OSI layer model, and I'll try to quickly recap.
The three OSI layers that we most often look at are three, four, and seven, with three and four being very much the network and connection layers, and layer seven being HTTP to a website, decryption, that type of stuff.
If you look at the types of attacks that are very difficult to scale up but also block, those are the layer three, layer four stuff.
And we don't need to decrypt to be able to block that. It's still just data. It's opaque.
So let the power of the network absorb that, but when it comes time to actually open the envelope and look inside, let's make sure we do that part in the data center that matches where the customer wants the regional processing done.
Exactly. It's awesome. Best of both worlds solution. And it's still a heck of a lot of work, because it meant making sure that those private keys and the data needed to decrypt, basically the letter opener, is only available for these customers in the places where they need to be.
And that's another part of it. That's really great.
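The layer 3/4 mitigation described above, blocking a flood without ever opening the envelope, can be sketched as a toy rate limiter. The threshold, the addresses, and the packet shapes are invented; the point is that the decision uses only volume, never payload contents.

```python
from collections import Counter

THRESHOLD = 100  # packets allowed per source in this window (invented number)

def filter_packets(packets):
    """packets: list of (src_ip, opaque_bytes). Returns the packets passed on."""
    counts = Counter()
    passed = []
    for src, payload in packets:
        counts[src] += 1
        if counts[src] <= THRESHOLD:
            passed.append((src, payload))  # payload stays opaque the whole time
    return passed

flood = [("203.0.113.9", b"\x00" * 64)] * 10_000   # one very noisy source
legit = [("198.51.100.7", b"\x01" * 64)] * 5
survivors = filter_packets(flood + legit)
print(len(survivors))  # 100 from the flood source plus all 5 legitimate packets
```

Because nothing here inspects the bytes, this kind of absorption can run in any colo worldwide, regardless of the customer's decryption region.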
That was one of the things we've done recently. The other thing I wanted to ask you about, going back to that protocols part of your responsibility, is HTTP 3, which is pretty new.
So HTTP 2 feels like it was relatively new, and now there's HTTP 3, and there was a little bit of a rename going on there, because at one point it was called QUIC, Q-U-I-C, which is a standard that Google and the IETF were working on.
So what is Cloudflare's role with HTTP 3?
Hold on a second. I've got a question. Before we even get there, if we have HTTP, why do we need a new version of it?
That's a great question. If we've got a protocol that works, why do we need new ones?
I'm going to take a shot at answering that.
But why do we bother to upgrade protocols?
Why do we need 1, 2, 3? That feels like it's actually making things more complicated.
Is the complexity worth it? Why? That's a great question. I think when we go through the different creations, no one exactly knows what you're really going to run into when you develop a protocol, right?
Or just for simplicity's sake, we're going to implement it in a certain way.
So one big example of something HTTP 3 can improve versus HTTP 2 is head-of-line blocking.
So what does head-of-line blocking mean? Basically, normally you can only send one resource at a time, right?
So your browser is talking to a server and you're sending a JPEG and an HTML file.
So you're blocked. If one is slow for whatever reason, then everything kind of breaks down.
With HTTP 3, because it's UDP-based, which is different than TCP, we can send them out of order.
So if there's one thing that is blocked for whatever reason, the other resources can continue to send, which is vastly more efficient in terms of transferring large websites, which have like a whole bunch of different files, right?
If you go to your browser right now and open the network tab, there are dozens, often hundreds, of all these different resources crossing the wire when you go to a website.
And it's great to not be blocked on one, but to be able to move the others at the same time.
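The head-of-line blocking idea can be shown with a toy simulation. The resource names and "tick" costs are invented; the two functions contrast one strict in-order pipe, where a stalled resource delays everything behind it, with independent streams, where the others finish on their own.

```python
resources = {"style.css": 2, "photo.jpg": 50, "app.js": 3}  # made-up fetch costs
STALLED = "photo.jpg"   # pretend this one hits packet loss and takes 10x longer

def cost(name):
    return resources[name] * (10 if name == STALLED else 1)

def in_order():
    """One pipe: each resource waits for everything queued before it."""
    t, finish = 0, {}
    for name in resources:
        t += cost(name)
        finish[name] = t
    return finish

def multiplexed():
    """Independent streams: each resource finishes on its own clock."""
    return {name: cost(name) for name in resources}

print(in_order())      # app.js is stuck behind the stalled image
print(multiplexed())   # app.js is unaffected by it
```

In the in-order case app.js only arrives after the stalled image clears; with independent streams it arrives almost immediately, which is the win the speakers describe for pages with hundreds of resources.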
That kind of makes sense. I mean, I think about it like if I think about the websites that I used to look at in the dawn of the Internet, right?
Where it was just lots of like, you know, text on a, you know, black text on a white background and very simple design.
And I look at the websites that exist today with all sorts of kind of rich graphics and stuff like that.
We've increased the complexity of the page in order to improve the design.
And so what I'm hearing you say is we have to make the protocol smarter to make sure that that experience stays fast.
Exactly. It's no different than and so remember we talked about protocols as just being the rules by which two parties communicate.
And so we came up with this fantastic way for a browser to talk to a server.
And it was simple and it was general and it's part of the reason the web took off.
But as the kind of conversations we're having evolved, it became obvious how the language itself, how the protocol itself could evolve to become smarter.
And of course, it's got to be 100% backward compatible because there are still zillions of devices and servers out there.
So really, it's almost you can think of HTTP3 and any protocol evolution as how do we tune the protocol for the way people are using it in the same way that human languages evolve to get smarter and people develop jargon, people develop shorthands and people develop more efficient ways of communicating using English.
This is protocols evolving so that they can be faster and more efficient at communicating, and that head-of-line blocking that Achiel was talking about is a great example.
But you mentioned something there that I think is actually really interesting, and Achiel, I'd like to understand a little bit more.
Like, you know, you talk about, Usman, you just used the word evolution a moment ago and like part of the way that protocols work is because I speak the same protocol that you speak, Usman.
Right? Like if that's the case, I speak Jen. Yeah, exactly.
Thank goodness. I'm so glad you speak Jen. I sometimes speak Jen. But like, then how do we upgrade it?
Like, how do we decide that we're going to, like, how do we actually get everybody to start speaking Usman?
Like, what, like, who goes first and like, how do you, how do you get everybody to start speaking Usman?
Yeah. Achiel, how would you, how does that work?
How do you solve that? How do you solve that problem?
It seems like we have to be, we have to be talking to the major players on the Internet who control a big chunk of these standards and implementations and, and work through that low level of the stack.
Yeah. And it's, it's even more interesting because now we're talking about like speaking Gen between two humans.
But here we're talking also about like very different parties, right? That have focused on very different things to kind of make that concrete.
When you talk about the implementation of a protocol like HTTP2, you need a client, which is the browser.
You need a server side component, which is a web server, maybe like NGINX, which is what we build our technology on.
So that also maps to parties like, such as ourselves.
We operate a web server, and then there are other parties that operate browsers: Mozilla with Firefox, Google with Chrome, Apple with Safari.
So you, you kind of need both to be able to do this, right?
And then it's kind of an interesting question.
There's a chicken-and-egg problem here. Yeah. Just invent HTTP4 and then hope that everyone follows it.
I mean, that would be interesting, but maybe those people aren't interested in supporting HTTP4.
So there has to be a lot of collaboration.
So that's where parties like the IETF, the Internet Engineering Task Force, come in. They set up these groups of folks, often with people from parties such as Cloudflare and Google and Apple, and they meet together, and together they come to these standards and agree: these are the goals all these parties want to achieve. And that gets you a good mix too, right?
About things that web servers maybe care about, which might be efficiency but also browsers because browsers have vastly different opinions about how things should work and they have a better feeling for well, this user is on mobile so he goes outside a lot so he switches networks that gets you all sorts of new interesting problems to tackle.
So all of those people bring all of those problems together and that's when we start talking about new standards.
And that's also the only way in which you can get people to literally like agree on how to move forward.
And I guess everybody pushes the button on the same day, right? Where it's like, okay, on September 1st we're going to push the button and we'll all start speaking Usman.
Well, and that's just it, right? It can't work that way. It's got to be backward compatible, so it rolls out very slowly, and our servers have to be able to handle it in all directions.
But the cool thing is because Cloudflare sits in front of so many things once we get it right all of our customers can pick up those benefits almost automatically.
And it's so great for Cloudflare engineers who get to be part of these IETF conversations and sit on the committees that are literally designed to be designing the future.
So, it's very exciting work. Yeah, I'm very excited about that aspect of being able to help out here, because nothing's more useful for browser implementers than to have a server that's everywhere, like Cloudflare.
So, many, many different websites that they can test against.
So, with us enabling HTTP3 on our end, all of a sudden we have, I think, 200,000 domains right now that have HTTP3 enabled today.
That's amazing.
With Google and Mozilla and Safari or if you want to roll your own Usman browser and you want to support HTTP3 you can test against those 200,000 websites.
That's amazing. That's so great. It's so great. And it's and it just it makes the collaboration a virtuous cycle because the more feedback they get the faster the protocol can evolve and so it's it's really great.
Hey, listen, one thing I wanted to ask you about, since we're always talking about speed. The simple answer for my parents, in case they're watching: at the end of the day it's all speed. The primary value of everything we just talked about for the last five minutes is making things faster.
You know, and of course everyone from even a child playing a video game on the Internet wants to know like how fast is my connection?
You know, one of the other things we released was a public tool called Speed Test, speed.cloudflare.com. Let's talk a little bit about that.
Why do we do that?
What's it actually measuring? Great question. Yeah, we launched speed.cloudflare.com a few months ago, which is a new way of testing the speed of your Internet connection at home.
Yeah. I think what we really wanted to do is we were looking at a number of these other speed tools which are great but they're often a little bit simple in that they give you one number and if that's what you care about that's totally fine that's totally great.
We really wanted to give you like a better more exact insight into how your network is performing.
So you'll see that when you compare our speed test to others, you get more metrics, more graphs literally showing you the different measurements, and you can download the measurements.
We also show you stuff like latency and jitter, which is how much your latency changes over time, which people care about a lot for gaming.
Yeah. So our MO was very much to offer people more detailed metrics on their network performance.
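The latency and jitter metrics mentioned above can be sketched from round-trip samples. The sample values are invented, and jitter is taken here as the mean absolute difference between consecutive RTTs, which is one common definition; real tools, including speed.cloudflare.com, may compute it differently.

```python
rtts_ms = [21.0, 23.5, 20.8, 35.2, 22.1]   # invented RTTs from five probes

latency_ms = min(rtts_ms)   # best-case RTT is a common "latency" summary
jitter_ms = sum(abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])) / (len(rtts_ms) - 1)

print(f"latency {latency_ms:.1f} ms, jitter {jitter_ms:.2f} ms")
```

A gamer with low latency but high jitter still has a bad time, because frames arrive erratically; that's why reporting both numbers matters.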
That's great. Part of our mission is to just give as much information as possible.
What are some of the things we learned as we built that? So this was really cool, because we just wanted to put this out there, and it gave us some metrics in terms of how fast different Internet providers were connecting. But on day one when we launched, we noticed a lot of people saying, hey, this is not really great, my upload speed isn't really that high, what's up with that?
So that was really great, because it allowed us to look into it, and we noticed that for people on very, very fast connections, the test would report vastly lower speeds.
So we took that to engineering and asked, hey, so what's up with this? And they looked into it and said, hey, that's interesting: if I go directly to the server, circumventing Cloudflare, it's fast, but through Cloudflare it's slow. That's not great.
So we ultimately figured out that the default buffering, which governs the rate at which NGINX, the web server that we build on, accepts data, had sub-optimal tuning in some cases.
So we were able to change that to dynamically scale up, which immediately sped up upload speeds to the speeds people were actually expecting. But it also allowed us to fix this bug, which had been around forever, and apply the fix to all of our customers.
So, all of our customers have fast upload speeds now.
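The shape of that fix can be sketched as a toy model. This is not the actual NGINX patch; the buffer sizes, the doubling policy, and the function below are invented to illustrate the idea of growing a read buffer when a fast client keeps saturating it.

```python
def read_upload(incoming_chunks, start=4096, cap=256 * 1024):
    """Read an upload, growing the read buffer whenever the client fills it.
    Returns (number of reads performed, final buffer size). All sizes invented."""
    bufsize, reads = start, 0
    for chunk in incoming_chunks:
        while chunk > 0:
            take = min(chunk, bufsize)
            chunk -= take
            reads += 1
            if take == bufsize and bufsize < cap:
                bufsize *= 2    # buffer was full: the client is fast, so grow it
    return reads, bufsize

# A fast client pushing 1 MB completes in far fewer reads than a fixed
# 4 KB buffer would need (245 reads), because the buffer scales up.
reads, final = read_upload([1_000_000])
print(reads, final)
```

With a fixed small buffer, each read accepts only a sliver of a fast upload, throttling throughput; scaling the buffer up dynamically is the kind of change that made measured upload speeds match what people actually had.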
So, I really love being able to build stuff on top of Cloudflare really like dog food like use your own software and really look at your own network and see where you can improve.
There's a really famous quote that I've always loved.
I remember hearing it when I was just starting out in this industry which is given enough eyeballs all bugs are shallow and that is exactly it.
The more we embrace the community and be transparent and show everybody everything we're doing like the better it is the more information we have and signal we have to improve things and make things better.
It was great as well for us because after we did this we figured out how to fix this in NGINX and we also happily give back to the community.
So we open sourced a patch and I believe it's under review with NGINX or F5.
That's really cool. It's part of helping to build a better Internet, learning at scale, leveraging the power of our own insights and dogfooding.
Those are all key tenets of how we think about and how we actually build product.
That's part of what makes being here so fun. We only have a few minutes left, but I'm going to test your analogy-generating facilities.
What's a port range?
It's a pretty esoteric thing. It's down in the ocean. We shipped it about five, six weeks ago.
What is support for Spectrum port ranges? Something we could get away with not having for the first year and a half, but then we needed to implement it.
What's up with this feature? Good question. Before I can answer that I need to explain what a port is.
A port is something that's open on your web server or server.
You can look at it, if you'll allow me the analogy, like your house.
Normally you have one door but you might have multiple entrances.
Multiple doors.
Behind every door there's a different service. For instance, on a web server we talk about port 80 and 443.
Different protocols use different ports. A gaming server will use totally different ports than 80 or 443, or an FTP server, which you might have heard uses 21. And your mail server has its own ports; everyone talks about the challenges of configuring mail clients.
With Spectrum you can put Cloudflare in front of these services.
You can put Cloudflare in front of a gaming server or mail server.
Most of those don't just operate on one or two ports. And that wasn't great:
you'd have to go to the UI and click to add port 20,000, then port 20,001, then 20,002.
It's like a ballroom with 18 doors, or 100 doors to the rest of the hotel. It's all the same service, but it has many different doors.
That's not great. We have had customers say, we want to open a few hundred thousand ports across a bunch of IP ranges.
That's a bit clunky but maybe you can build a script to do this.
They went back and two hours later we get paged because our API limits are getting hit.
Because the script is killing the API.
So we built port ranges. What you can now do instead of having to put them in one port at a time, you can say I want ports 20,000 to 30,000 and we'll proxy all of those.
It's somewhere in the UI where you entered a number.
Instead of entering a single number, you can now enter a range, like 200-500, as opposed to just one port.
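The difference this makes on the API side can be sketched as follows. The function name and payload shape below are invented stand-ins, not Cloudflare's real Spectrum API; they just model "one API call per application".

```python
def create_spectrum_app(origin, port_spec):
    """Stand-in for a single API call; returns the payload it would send.
    (Hypothetical helper; see the real Spectrum API docs for actual fields.)"""
    return {"protocol": f"tcp/{port_spec}", "origin": origin}

# Before port ranges: one call per port. For 20000-30000 that's 10,001 calls,
# exactly the kind of scripting that tripped the API rate limits mentioned above.
calls_before = [create_spectrum_app("game.example.com", p)
                for p in range(20000, 30001)]

# With port ranges: a single call covers the whole span.
call_after = create_spectrum_app("game.example.com", "20000-30000")

print(len(calls_before), call_after["protocol"])
```

One request instead of ten thousand is the whole feature from the customer's point of view.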
That's awesome. Protocols like FTP or gaming or video streaming those are often protocols that care about this.
That's cool. Achiel, we're at 29 minutes, I think.
Jen, did you have a last question? Do you want to give one last teaser, in the last minute you have, of where you're going next, Achiel?
Great, thank you.
I'll try to keep this quick. Yes, we're definitely always looking to support more protocols.
One of the protocols that you might have heard a lot of people asking for is gRPC.
If you are interested in this, drop me a line at achiel@cloudflare.com.
That's it. That's the teaser. Thank you, Achiel. It's so great having you, Jen.
Always a pleasure talking to you on a Friday afternoon about all the amazing stuff our team builds that we take credit for.
Achiel, awesome work.
We'll see you again next week on Latest from Product and Engineering.
Thank you everybody. Thank you, everyone. Bye-bye.