Latest from Product and Engineering
Presented by: Jen Taylor, Usman Muzaffar
Originally aired on January 18, 2022 @ 5:30 PM - 6:00 PM EST
Join Cloudflare's Head of Product, Jen Taylor and Head of Engineering, Usman Muzaffar, for a quick recap of everything that shipped in the last week. Covers both new features and enhancements on Cloudflare products and the technology under the hood.
Original Airdate: July 24, 2020
English
Product
Transcript (Beta)
Hi and welcome to Cloudflare TV. This is the Latest from Product and Engineering. My name is Usman Muzaffar.
I'm Cloudflare's Head of Engineering. And I'm Jen Taylor, Cloudflare's Chief Product Officer.
Hi Jen. So much cool stuff coming off the factory floor this week.
So much cool stuff, that's right. In fact, we didn't do an update last week, so we've got two weeks worth of great stuff to talk about.
So where should we start? I don't know, what's your favorite thing? Oh my god, that's a really hard question to answer.
You would ask me this as we were prepping.
I was like, how am I going to choose between all these favorite things? One, I'll just go in reverse order.
One of the things that we shipped last week, and again, it's early, is support for a new protocol that we call HTTP/3.
And so we're not calling it that, that's what the industry is calling it.
It's the next version of how browsers can connect to web servers.
And it's focused on making sure that that handshake, that way that browsers and servers talk to each other, is as optimized as possible.
Because every time your web browser has to talk to your web server and go back and forth and negotiate, are you who I think I'm talking to?
Do I have the certificate? Is this correct? This takes a whole bunch of extra time, and that's where the delay comes from.
And so one of the things that the whole industry has done, and it's so great to see, is that we evolve newer ways of making clients, things like your web browser, talk to servers, things like something that hosts a website.
And I'm really proud to report that Cloudflare takes a lead on that, and we implement some of the earliest ways to try that out.
And one of the cool things is that new browser manufacturers, sorry, new versions of browsers, Chrome and Firefox and the ones that we all use and love, they have experimental versions.
And just last night, I was playing with the Canary version of Chrome, where you can turn on these experimental features and launch them, and you can start to explore how much faster it is.
And you have to actually have a stopwatch and be timing this to actually notice the difference that a browser can talk to a server that much more efficiently.
And so it's still early stages, but one of the things we shipped last week is support for a new standard called the HTTPS service record, because here's an interesting challenge.
How do you know that you can support the new language?
How do you know that the thing I'm talking to can speak something more elevated than HTTP/1.0 or HTTP/1.1, which came out 20 years ago now?
And so the way this works is usually the client starts talking and the server responds and says, and by the way, I speak HTTP/3.
I can speak a more elegant protocol if you switch to it.
And then the client can say, actually, I do too, and so let's switch to that.
But you've already wasted one round trip.
We wasted one round trip in doing it the old way when we could be doing the new way.
So the question is, how do you get that extra bit of information to the client before it reaches out to the server?
Of course, the answer is DNS, right?
The answer is always DNS. DNS is the phone book of the Internet. It's what you consulted in the first place to figure out where the server was.
And so by supporting a new kind of record, DNS can respond and say, by the way, this server speaks HTTP/3.
It speaks the new version, so you can use it. So again, it's not something that's broad.
You can only test it in certain places, and it's still experimental, but it's the kind of forward-leaning thing that I think is a lot of fun.
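To make the DNS hint concrete, here is a minimal Python sketch. It assumes the record data arrives as text in the presentation form used for HTTPS records (for example `1 . alpn="h3,h2"`). The helper names `parse_https_record` and `pick_protocol` are hypothetical, only the `alpn` parameter is handled, and treating the listed order as a preference order is a simplifying assumption.

```python
import shlex

def parse_https_record(rdata: str) -> dict:
    """Parse 'priority target key=value ...' into a small dict (simplified)."""
    parts = shlex.split(rdata)  # shlex strips the quotes around "h3,h2"
    priority, target = int(parts[0]), parts[1]
    params = dict(p.partition("=")[::2] for p in parts[2:])
    alpn = params["alpn"].split(",") if params.get("alpn") else []
    return {"priority": priority, "target": target, "alpn": alpn}

def pick_protocol(record: dict, client_supports: set, fallback: str = "http/1.1") -> str:
    """Return the first advertised protocol the client also speaks."""
    for proto in record["alpn"]:
        if proto in client_supports:
            return proto
    return fallback

record = parse_https_record('1 . alpn="h3,h2"')
print(pick_protocol(record, {"h3", "h2"}))  # h3
```

Because the answer comes back with the DNS lookup the client had to do anyway, the client can open its first connection with the newer protocol instead of discovering it one round trip later.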
Yeah. Well, you're talking about speaking new languages. One of the things I'm super pumped about is this past week, we now speak four more languages.
Four more languages.
What are the four new languages? Four new languages are Spanish, and not just Spanish generically.
It's Spanish with specific locales for Chile, Ecuador, Mexico, Peru, and Spain.
Wow. Then we have Portuguese, Brazilian Portuguese, Korean, and traditional Chinese.
That's amazing. So here's my question.
It's a global company. What does it mean to support these different languages?
Where do we even see all this? Yeah, this is a great question because for so long, we've offered the product in English.
We're like, yeah, you ship it. It's fine.
So what we've been doing over the course of the last year has actually been making the language in our dashboard localizable.
So it means that we have taken all of the text that exists within the dashboard, all of the buttons, all of the instructions, all of the helpers, all of that stuff, and we now have localized it such that if you're a native German speaker, you, for example, can come to the Cloudflare dashboard and select German, and everything that you encounter in that experience in Cloudflare will now be in your native German.
And part of what we do at Cloudflare, part of what I love about our mission is we focus on taking complicated things and making them easy.
Well, it turns out a big part of making things easy is speaking to you in a language you understand.
You understand, yeah.
Exactly. And the internationalization and localization, and those are two really long words, and they're abbreviated in famous ways.
You take the word internationalization, it's an I followed by 18 letters and an N.
So almost on the inside of any software company, it's abbreviated as I18N, and localization, the same nerdy trick, L10N.
And the interesting part about this is that you have to do it right.
You have to do the internationalization first, which means that your product actually doesn't speak any language at all.
It has completely separated all the human facing parts of what the end user sees, all what are called the strings, all the text, so that then a localization team for different languages can go make different versions of those.
And that's how you're able to add four completely different languages, all in one shot.
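A rough Python sketch of that separation: the code only ever refers to string keys, and each locale supplies its own catalog. The catalog contents, the `translate` helper, and the fallback-to-English behavior are illustrative assumptions, not a description of how the dashboard is built.

```python
# Each locale has its own catalog; the application code never hard-codes text.
CATALOGS = {
    "en": {"connect": "Connect", "disconnect": "Disconnect"},
    "de": {"connect": "Verbinden", "disconnect": "Trennen"},
    "es": {"connect": "Conectar", "disconnect": "Desconectar"},
}

def translate(key: str, locale: str, default_locale: str = "en") -> str:
    """Look up a string key, falling back to the default locale, then to the key."""
    catalog = CATALOGS.get(locale, {})
    if key in catalog:
        return catalog[key]
    return CATALOGS[default_locale].get(key, key)

print(translate("connect", "de"))  # Verbinden
```

Adding a language then means adding one more catalog, with no change to the code that calls `translate`.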
But here's an interesting thing that I don't think a lot of people talk about enough is it's not just making sure that things like connect and disconnect are translated into different languages.
People write numbers in different ways around the world. They put the comma in a different place.
I grew up in Pakistan. I was born in the States, and I moved to Pakistan when I was a young boy.
And I was taught, like it's always done in America, that the comma goes after every three digits.
So it's a thousand separator, then the million separator.
If you go to Pakistan, that's not where they put the comma.
It's after the hundreds marker, and then after the hundred thousand marker. And I was like, how is it that I seem to know the rules of commas for numbers, and no one else in this country does?
Finally, I realized this is, like, American, Anglo-centric.
This is not the only way. There's more than one way to do this.
And so that means even things like the formatting of numbers is part of internationalization and is part of what the system needs to switch.
And there are numbers all over the Cloudflare dashboard that need to switch when you pick German from that language menu.
But the good news is internationalization work is mostly done.
So we're going to be able to pour in all those more languages because we do have visitors all around the world, and it is awesome.
We can support this in their language. Well, and also a big part of what we've been doing lately has really been enriching our analytics and enriching the way we give customers the opportunity to view data.
So being able to localize the numbers that are associated with that data, incredibly important.
Because for example, if you're looking at the English dashboard in Pakistan, you might dramatically misinterpret the numbers based on where the commas are.
That's right. Because the commas are in the wrong place. That's exactly right.
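Here is a small Python sketch of why digit grouping has to be locale-aware. The `group_digits` helper is hypothetical. It groups the last `first` digits, then every `rest` digits after that, so the same number can be written in the American style, in the South Asian style (thousands, then groups of two), or in the German style, which uses a period as the separator.

```python
def group_digits(n: int, first: int = 3, rest: int = 3, sep: str = ",") -> str:
    """Insert separators: one group of `first` digits, then groups of `rest`."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    head, tail = digits[:-first], digits[-first:]
    groups = [tail]
    while head:
        groups.insert(0, head[-rest:])
        head = head[:-rest]
    return sign + sep.join(groups)

print(group_digits(1234567))            # 1,234,567  (American style)
print(group_digits(1234567, rest=2))    # 12,34,567  (South Asian style)
print(group_digits(1234567, sep="."))   # 1.234.567  (German style)
```

A production system would normally take these rules from locale data rather than hard-coding them.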
While we're talking about dashboards, another feature that we put in is zoomable analytics. And again, it's cool because of the way the system has been built: once it showed up in one place, it's showing up in different graphs.
So this is the dream that all of us had when we used to watch sci-fi movies, and they would say enhance and zoom in, and you'd see something.
And it's what Google Earth gave us for real when you were able to dive in.
And whenever you see a graph as a system administrator or an operator of any kind, a network operator, and you're like, that blip there at 2:16 UTC, let's dive in on that.
And you know exactly what you want to do.
You want to grab your mouse and just drag it over that window so that it zooms in and you can see what happens underneath there.
And we've added that feature.
And it's a small touch, but it's such a great, cool thing because it makes it so easy to dive in and dive out of those numbers and check out all those things.
Well, and back to the easy theme, right? Because a big part of what we want you to be able to do is quickly understand what's happening in any moment with your traffic and the ability to just click and drag and dive in rather than being like, I got to type a query and I got to think about the syntaxes and where does that begin and where does it end?
Really powerful. For me, it was always the software in the crime shows, the interfaces that they were using to sort through databases, that was for me.
I remember it was Enhance. Yeah, Enhance. Enhance is the best.
And the funny thing is Enhance for all of us, and I'm just going to go off script here for a second, Enhance, anyone who understands data is like, you can't enhance something if there's no data there.
But now with machine learning, you can actually add more information here.
So I think actually Enhance is a legitimate feature of new analytics systems, which is actually, tell me more, tell me more.
What else do you know? Well, I mean, it's like, I always really liked it when they had like a grainy image and they click Enhance, Enhance, so they get more data.
And that's kind of what Zoomable is doing, right? Because you're basically taking a big snapshot and clicking into it and getting more data points.
That's right. That's right. What do we have to do under the hood to make that work?
Because I know that one of the challenges we have is we have huge volumes of data at Cloudflare.
How do we make it so that that's like even doable and performant?
Yeah, so that's a great question. One great thing that the entire tech industry has benefited from is the renewed interest in how you manage large volumes of data. With the success of cloud, there's so much event information pouring into centralized systems, and it gives us all this great insight, if only we can organize it properly.
And so one ingredient is the idea of what's called a columnar data store, something which is very good at being able to handle large volumes of information in a way that makes it very easy to query.
And so, you know, across the industry, there's things which are quite different from what we call a relational store.
So this is not the kind of technology you'd build a shopping cart on, because it doesn't order the information in rows, it orders it in columns.
And the idea is then it becomes very easy for me to just rip through and have the computer quickly identify what is the information I need in this window of time.
And that's really what's powering something like that zoom in and enhance. It's like, I just need to know everything that fits in this window of time, but I need you to answer me fast.
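The difference between the two layouts can be sketched in a few lines of Python. The sample events, field names, and the `bytes_in_window` helper are made up for illustration. Real columnar stores add compression, indexing, and distributed execution on top of this basic idea.

```python
from bisect import bisect_left

# Row-oriented: each event is kept together, the way a relational table holds it.
rows = [
    {"ts": 100, "path": "/", "bytes": 512},
    {"ts": 110, "path": "/cart", "bytes": 2048},
    {"ts": 120, "path": "/", "bytes": 128},
    {"ts": 130, "path": "/img.png", "bytes": 4096},
]

# Column-oriented: one list per field, all kept in timestamp order.
columns = {name: [row[name] for row in rows] for name in rows[0]}

def bytes_in_window(cols: dict, start: int, end: int) -> int:
    """Sum the bytes column for events with start <= ts < end."""
    lo = bisect_left(cols["ts"], start)   # binary search on the sorted ts column
    hi = bisect_left(cols["ts"], end)
    return sum(cols["bytes"][lo:hi])      # only the one column we need is read

print(bytes_in_window(columns, 110, 130))  # 2176
```

Answering the time-window question touches only the `ts` and `bytes` columns, and the `path` values are never read.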
And so that's one thing. And then the other is a programming paradigm, an API called GraphQL, which really gave front-end engineers the ability to interrogate a database in a way that made sense to them.
And so what that means is they were able to say, I want you to return me this information and I want it to be in the following format, because once it's in that format, I can jam it onto a graph, I can put it onto a web page really easily.
And we started adopting GraphQL in a big way, about a year and a half ago, and it really started to power all kinds of analytics inside our system and became an API that our customers can use too.
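As a sketch of what asking for data in the shape you want looks like, the Python below builds a GraphQL request body for a time window and reshapes a response into points for a graph. The schema here (`requests`, `time`, `count`) is entirely hypothetical and is not the schema of Cloudflare's API. The helper names are assumptions too, and no network call is made.

```python
import json

# Hypothetical schema: the query names exactly the fields the graph needs.
QUERY = """
query TrafficInWindow($start: String!, $end: String!) {
  requests(start: $start, end: $end) {
    time
    count
  }
}
"""

def build_graphql_payload(start: str, end: str) -> str:
    """Build the JSON body a client would POST to a GraphQL endpoint."""
    return json.dumps({"query": QUERY, "variables": {"start": start, "end": end}})

def to_series(response: dict) -> list:
    """Turn a response into (time, count) pairs ready to plot."""
    return [(point["time"], point["count"]) for point in response["data"]["requests"]]

payload = build_graphql_payload("2020-07-24T02:00:00Z", "2020-07-24T02:30:00Z")
sample_response = {"data": {"requests": [{"time": "02:16", "count": 42}]}}
print(to_series(sample_response))  # [('02:16', 42)]
```

Zooming in on a graph then amounts to sending the same query again with a narrower `start` and `end`.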
So it's really great. Well, and there were even enhancements and improvements this week in the ways in which customers can access their data from Cloudflare.
That's right. Yeah. Big improvements with Logpush now. We've added the, yes.
Yeah, let's talk about Logpush. Yes, that's right. Very smooth.
Yes. Right on from analytics into Logpush. It is exactly what it sounds like. We have the logs, right?
We're the frontmost thing that receives requests for our customers' websites, so we have a responsibility to send that information back to our customers so that they can study it and analyze it and look at it and draw insights from it and everything else that they have to do.
So the question is, how do they get it?
Well, for the longest time, we had a feature where you could connect to us and pull those logs.
But why connect? I need them. Cloudflare has them.
Why don't we just ship them to you proactively? And even better,
why don't we put them in whatever system you need to hang on to them?
And that usually takes the form of anything that handles big, large, unstructured data, you know, the basic classic cloud storage: Amazon's S3, Google Cloud Storage, Azure, and Sumo Logic, these kinds of systems.
And so we've had custom integrations with each of these, which let our customers just click, pick the destination, put in the credentials.
And that's it. From that point on, Cloudflare will just start pushing the logs directly to your system.
So you can use them as if you had them, you know, all along.
But some people have grown their own systems. And when people build their own homegrown systems that manage logs, or when they use a lesser-known one, not one of the four that I just mentioned, they need a way to integrate with them. So then how are we going to support all these?
And the most common way is to support the S3 interface, the one behind Amazon Simple Storage Service.
Because it's a well documented interface.
And you can implement against that interface, which basically means your storage system, even though it's on your premises or, you know, some other cloud provider, speaks the Amazon S3 protocol.
And what we did last week was make it easy for us to support any S3-compatible destination.
And so that just meant that basically, any place that you can store logs, we can write them to.
And that gives you so much more control over where your information is going.
Yeah, well, I really like, again, I think this is kind of a great example of kind of how on the inside, we think about some of these things, right?
Again, we want to make the data accessible, we want to make it rich and visualized, we want to give you the ability to drill in in the interface.
But then also, you don't have to use our analytics. If you've got a different way, a different store, if you need to combine our data with something else, if you want to filter it, if you want to, you know, basically print it and make wallpaper out of it, we're not going to stop you.
And, you know, we want to make it easy for you to access that information and be able to use it any way you can.
That's right. It's very much our job to make sure that customers have all the information and then some that they would have, if they were running it themselves, right?
So nothing should be, it's almost a company philosophy, nothing is opaque, it should be hyper transparent.
Yeah.
Well, I think one of the really interesting things is for a long time, so much of our data has been talking about how the systems are experiencing things, like, how long does it take for me to load this image?
How long does it take for me to process this request?
Like, how long does it take to go from point A to point B?
And like, that's one view. But if you pivot it, you turn the other way, you're sort of like, what does that feel like when I sew those things together for my customers on the other side?
That's right. Yeah, and it's one of the hardest things to pay attention to.
Because, you know, as an engineer, you have to, to some level, put blinders on to the parts of the system that are out of your control, so that you can focus and say, all right, I'm going to make the following assumptions about the system I'm sitting on top of, I'm going to make the following assumptions about input, I'm going to make the following assumptions about output.
And I'm going to engineer this part of the box, the one that I have control over, the one that I'm building, to be as good as possible, to be as performant as possible, and to catch all of this information.
And what happens is you wind up with a series of systems, each of which is responsible for its part, all the computers and all the pieces of software that are involved when a user just visits their favorite website, from the time that their browser makes that request, talks to DNS, talks to Cloudflare, talks to the origin, and comes all the way back.
And what gets lost is, well, wait a second, what did the end user actually see?
Because you wind up with all these graphs. Like, Cloudflare's got graphs, customers have got graphs, and there are log request graphs and TCP connection times.
But wait a minute, what did the end user see?
What was the actual experience to the person sitting in front of the computer?
And that's, that's hard to measure. Because we're not standing there.
And we don't want to be standing there, so what we need is enough information that's on the web page itself, so that it can track some of these key metrics.
And there's really two places we want a ruler. One is the connection time, because honestly, no user's laptop can do anything once the bits leave the laptop. Regardless of how far they travel, until they come back, that's the request time.
So that's really important to start a stopwatch on the web browser to track that.
And then the other one is the page time, because it takes time for a web page to load.
But the cool thing is modern web browsers have so many metrics in there. They can tell you, wait a minute, do I have all the assets?
Have I painted the picture? Are the pixels all in the place they need to be? That ugly flash you sometimes see on older websites, you can count that.
And so we wrapped all of this into a product last September called Browser Insights.
And we released it. And it was great. People loved it. Because now in the Cloudflare dashboard, you could see how your website is doing from the perspective of the people who matter most. It's not Cloudflare.
It's not server operations. It's the end users, the people who we put this whole machinery in place for in the first place.
And it lets you dive in and say, okay, so these assets are taking too much time, or it's the request time, or it's the page time, or maybe I've got too much JavaScript, or I've got too, too many images that are suboptimal.
So it gives you all this rich insight into where it goes.
So I said, we shipped that in September. So why are we talking about it now?
Well, what we did last week was give a little more control over that feature. So it used to be you either flip this big ruler on or off.
And now what we gave our customers is the ability to turn it on in specific locations, down to specific hosts of specific websites they control, and even specific parts of websites they control.
Because maybe you don't want it on all over the place. Or maybe you only want it on for this new page that you're working on, or a place that you have suspected is kind of slow.
And so by giving customers this control, we make it that much easier for them to then analyze the numbers and make the changes to make it faster, or to raise it to Cloudflare if it's something that we can help make faster.
Well, and also, I really like that. Because again, I mean, if you just think about it, like not every part of my app is created equal, right?
On my homepage, I've got a lot of static content, but my shopping cart is probably almost all dynamic content and pulling from very different systems.
And so if I'm trying to think about how I optimize my app, I probably want that.
I want that greater specificity. Yes, right. That's right.
You want the control. And that's something we see over and over again, right, which is we give our customers blunt hammers, because that's better than no hammer at all.
And then we give them smaller and smaller things all the way down to a scalpel so that they can apply the control and the power and flexibility of Cloudflare down to the most fine-grained part of their system to give them all the good. But of course, it doesn't count if you can't see the impact, which is why it's so important to tie all this to metrics.
Well, and I really like it because, again, in product, we're always trying to walk a mile in our customers' shoes or walk a mile in the end users' shoes.
And this is just a really powerful tool to stitch together the power of everything that Cloudflare is helping you do for your site, and to help you identify, frankly, other ways in which you could probably flip a bunch of things on within Cloudflare and make that experience a step function better right away.
That's right. That's right. Yeah. So, another thing in that same vein of giving people control.
So one of the things that you don't hear a lot of unless you talk to network companies is this expression, BYOIP.
Is that a new, like, party thing? Yeah. I bring my own pasta.
Bring my own ice cream. Yeah. It's bringing your own IP address. And this is something that a lot of people just don't have to worry about except people who run big Internet services.
So let's take a second to talk about this. So if you want a website on the Internet, most people know you start with something that's called a domain registrar, because what you want to do is buy a name.
So you buy the name.
I buy usman.com. I buy jenandusman.com. And we own that name. And then we'll buy hosting service from somebody else.
And we will tell that hosting provider, which is basically a computer in the sky
that's on the Internet, please, we want you to now answer when somebody visits jenandusman.com.
And that's pretty easy. That's DNS being connected to hosting.
But as we all know, computers don't actually understand names. They only understand numbers.
And so those numbers are what are called the IP address. And every computer on the Internet has an IP address.
And so the question is, wait, who owns the number, right?
And that's basically ISPs or big network providers like Cloudflare.
We own the numbers. And we had to get them basically through the same organization that handed out the names.
In fact, the name of the organization is ICANN, right?
The Internet Corporation for Assigned Names and Numbers, right? The other N in there is numbers.
And so you do acquire address ranges if they matter to you, for really big Internet services.
And so our biggest companies, they own big chunks of Internet ranges.
They own the names also, but they also own those IP addresses.
So why do they need that?
Why do they want to own those ranges, though? They want to own those ranges because it gives them that much more control.
Just like DNS points to numbers, people want those numbers to stay stable because they are their numbers.
And it also means they're not beholden to any other provider.
They basically have staked out a corner of the address space that is theirs, and it's flexibly theirs to assign and play with.
And, you know, again, it gives them a lot of other control at a level that most people don't think about.
But here's the wrinkle. Cloudflare also owns IP addresses.
We have to. How else are we serving things? All those computers need numbers.
So what happens if you already have your websites and your Internet properties on your IP addresses and now you want the power of Cloudflare?
The lousy answer is we say, well, sorry, you can't use yours, pick or choose.
And that was the world we lived in years ago. The better answer is we will take over ownership of those addresses and we will serve your websites on your addresses even though it's Cloudflare's computers.
So, again, now we've got a whole new level of control.
That means that Cloudflare is now doing what's called announcing your addresses.
This is kind of like saying I'm going to tell the post office that my address is actually in your house so that all the mail comes to you, even though it's still addressed to me.
So no one ever changed my address.
But now it's actually going to a different place. And so it's not an analogy that makes sense in the physical world, but on computers you can do this.
And so part of what we wanted to do was make it easy to do this for ourselves and for our customers.
And so that, again, means giving control. What we have now is an API and UI in the dashboard, which lets BYOIP, bring your own IP address, customers specify: actually, this address range, even though it's mine, Cloudflare, you go ahead and start announcing it, and furthermore, announce it for the following websites only, the ones I care about.
So, again, all that control that they have, they don't have to give it up when they use Cloudflare.
They keep all that control and get the best of both worlds.
That's super cool. And the nice thing is, again, like you can bring your own IP or we'll take care of it for you.
Like, right. Yeah, yeah, exactly. Yeah. Yeah.
One other thing I wanted to bring up: you know, we keep talking about customer-facing stuff, but there are some changes under the hood that are so amazing and so important to talk about.
And I wanted to touch on a couple of them. We blogged about this extensively: the database that connects our control plane to our edge.
In other words, when customers log in and make changes, that thing is pushing them out to the universe.
You can think of it this way: if you're in the dashboard, anytime you hit save, what just happened next is information got written to this special database called Quicksilver, and Quicksilver then copies it out to the edge.
And one of the things we always are trying to do is to make sure that Quicksilver is performing and Quicksilver can scale.
And so an early interesting piece of logic that I'm proud of is the team is working on a new version of Quicksilver and new tests of how Quicksilver can be scaled inside our data center.
And so early tests of that were very promising in some of our test colos.
And so what does it mean? We don't talk about a lot externally, but like, it's very much what engineering is working on.
What is it? What do you mean when you say it scales within our data center?
What does that mean? I mean that we have so much traffic, and we keep growing all of the traffic in the colo, so we need to make sure that that configuration database can scale with the number of customers and the number of settings and all the information that's in it.
And that means that that database needs to get smarter.
It needs to be able to run on multiple computers that talk to each other and have all the information in a safe and performant way.
And so that's what I mean by scale.
It's getting even better at being able to partition the information inside Quicksilver and make sure that it's still available to all the computers at the edge that need it.
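One common way to partition a key-value configuration store across machines is to hash each key to a partition number. The Python sketch below shows that general technique only. It is not a description of how Quicksilver works internally, and the key format and the `partition_for` helper are hypothetical.

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Hypothetical configuration keys spread over four partitions.
keys = ["zone:example.com:cache_level", "zone:example.org:tls_mode", "zone:example.net:waf"]
placement = {key: partition_for(key, 4) for key in keys}
```

Every machine that computes `partition_for` for the same key gets the same answer, so any reader can find which partition holds a setting without a central lookup. Changing the number of partitions moves most keys, which is why real systems often use more careful schemes.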
Got it. So we have this like ever-growing network, right? We're going to more cities and in cities, we're having more, more, more data centers and more cities.
And in the data centers, we're having more boxes in the data centers.
And then we're adding more features. And so Quicksilver just needs to keep up with the more, more, more, more, more, more, more.
That's right. That's right.
And just like all the big engineering companies know, what worked at 10 needs to be rebuilt for 100 and then rebuilt for 10,000 and rebuilt for 100,000.
Every time you make one of those order-of-magnitude step changes, which is a sign of success, we love the problem.
You sort of go back to understanding, what do I need to build to make this even better, even more capable of handling the growth and the load.
And in a COVID world, that's growing even faster. Yeah. Well, and, and there's also growth coming in the other end, right?
I mean, it's just kind of, you know, phenomenal, all the traffic now that's hitting the edge.
You know, how do we make sure that the edge of the network, as we're routing and pushing and doing all this stuff, how do we make sure that keeps up?
Yeah, exactly.
And that's why there's an entire team of engineers inside Cloudflare that are responsible for traffic management, and they are constantly building new systems that keep track of where the traffic is coming in and where it can be sent out to, where it's safe.
We run what is called an Anycast network, which means that multiple computers, and this is another way that IP addresses can sort of, you know, challenge your thinking.
We think of most computers as having one IP address, but actually, using a technology called Anycast, more than one computer can have the same IP address.
And that's really powerful because what it means is that DNS can say, yeah, you need to go talk to Cloudflare, Cloudflare has an address, and the routers will figure out which actual computer, of the many that have the same address, is the right one to serve it. And traffic management, getting to your question of how do you make sure you can scale this?
Traffic management's mission is to make sure that the request goes to the computer that is best equipped to serve it.
And so the technology that they're building on the inside is to have a better understanding of where Cloudflare is peered, how we are all connected, and how we can route those packets to the right place automatically without having to do anything manual.
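A toy Python model of the Anycast idea: several data centers announce the same address, and the network delivers each packet to whichever one is closest from the client's point of view. The city names, path costs, and the `route` helper are invented for illustration. Real routers choose among announcements using BGP path selection, which is far richer than a single cost number.

```python
ANYCAST_IP = "192.0.2.1"  # an address reserved for documentation

def route(announcements: dict) -> str:
    """Pick the data center with the lowest path cost as seen by one client.

    `announcements` maps data center name -> path cost; every entry is
    announcing the same ANYCAST_IP.
    """
    return min(announcements, key=announcements.get)

seen_from_client = {"London": 3, "Karachi": 1, "San Jose": 7}
print(route(seen_from_client))  # Karachi

# If one location stops announcing the address, traffic shifts automatically.
del seen_from_client["Karachi"]
print(route(seen_from_client))  # London
```

The client never changes the address it connects to. Only the set of announcements changes, which is why nothing manual is needed when a location goes away.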
Got it. So, I mean, it kind of, it means that like at some level, like the Anycast kind of means like we always got your back, right?
There's always somebody who's going to answer your call, and it's making sure that you're not going to end up in a queue or waiting for anybody to pick up, and you can just pass right through.
It keeps it fast. Yeah, that's right.
Very good. Yeah. Good. What else? You know, there's another aspect of the edge, of making sure that the edge works as well as it possibly can, which is making sure that you don't accidentally do more work than any request needs.
And so one of the things that we're very proud of at Cloudflare is we let customers, you know, write firewall rules.
And so there's a lot of extra accounting to make sure that the firewall rules that are written are highly performant and are doing the right thing.
And that they're taking up as little CPU as possible.
And there's really two good reasons for that.
One, the faster they execute, the faster the result goes. So that's obvious, that's a no brainer and everybody wins there.
But also the faster they execute, the less CPU time they take, the more requests we can handle.
So in fact, it's also directly related to our ability to service our customers in an efficient way.
And so another part of the behind-the-scenes work is, you know, new numbers. And the engineers in the audience will appreciate that anytime you add more metrics into a system that can count where something is taking up time, where something is taking up effort, that opens up all kinds of new, interesting places for you to optimize.
And so I'm very proud that the Edge team has also put new numbers and new metrics and new counters around how firewall rules are used so that we can use that to feed into our product development cycles and make everything more efficient and faster for our customers.
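The kind of counter being described can be sketched in Python as a thin wrapper that records how often each rule runs and how much time it takes. The `RuleMetrics` class, the rule name, and the example rule are hypothetical. This only illustrates the measurement idea and does not reflect how the edge is instrumented.

```python
import time
from collections import defaultdict

class RuleMetrics:
    """Accumulate call counts and elapsed time per rule name."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.seconds = defaultdict(float)

    def timed(self, name, rule, request):
        """Run `rule(request)` and record how long it took, even if it raises."""
        start = time.perf_counter()
        try:
            return rule(request)
        finally:
            self.calls[name] += 1
            self.seconds[name] += time.perf_counter() - start

def block_admin(request: dict) -> bool:
    """Example rule: match requests whose path starts with /admin."""
    return request["path"].startswith("/admin")

metrics = RuleMetrics()
metrics.timed("block_admin", block_admin, {"path": "/admin/login"})
metrics.timed("block_admin", block_admin, {"path": "/index.html"})
```

After a run, `metrics.calls` and `metrics.seconds` show which rules are evaluated most and which cost the most time, which is where optimization effort can be aimed.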
That's phenomenal. It's just so interesting to me, as we've been through this conversation, we've talked a lot about measurement.
We've talked about the ways in which you can explore and visualize data, the way that you can dig into it, the way that you can export it, the way that we're constantly kind of adding new metrics and new numbers to continue to optimize.
I mean, I think it's just, it kind of continues to remind me that, you know, we are always striving to be better, faster, stronger, safer, and to put the tools in the hands of customers to do that in a really simple and easy way.
It's super fun.