Cloudflare TV

Leveling up Web Performance with HTTP/3

Presented by: Lucas Pardue
Originally aired on September 7, 2020 @ 8:30 AM - 9:30 AM EDT

Join Lucas Pardue, QUIC Working Group Co-Chair and Cloudflare engineer, for a session on how HTTP/3 is supercharging web performance.

Episode 1

English
Protocols
Performance

Transcript (Beta)

♫ Upbeat Music ♫ The web.

A digital frontier. I tried to picture clusters of HTTP requests as they move through the Internet.

What do they look like? Pong balls? Space Invaders? I kept dreaming of a bad analogy I thought I'd never see.

And then one day, one day, I got in.

So, I am not a computer game character, everyone, but I am Lucas Pardue.

I'm an engineer at Cloudflare. I'm also the co-chair of the IETF QUIC Working Group, and today I'll be talking to you about leveling up web performance with HTTP/3.

So, where do we start? There's a lot to take on board in this respect, so bear with me, we'll get through it.

So, what is IETF QUIC? The IETF is an organization called the Internet Engineering Task Force.

They've been working on a protocol which is a replacement for TCP; it runs over UDP, plus extras.

So, what is it in essence? It is a transport protocol that provides secure communication and reliability, and it's designed to mitigate something called head-of-line blocking.

I'll come on to explain that over the course of this next hour.

So, you may have heard of the HTTP protocol, which is the Hypertext Transfer Protocol.

In essence, it's a commonly used application protocol that allows clients and servers to send messages to each other, and you most commonly use it for web browsing.

And there's a lot of nuance and quirks in the way that this operates with transport protocols like QUIC or TCP that we'll explore in this talk.

But I want to focus on something called HTTP/3, which is the latest version that me, my colleagues, and many people in the industry have been working on for the last, I don't know, three years or so.

And other people at Google, well beyond that.

But effectively, what it does is take the HTTP semantics that we have, the requests and responses that let us transfer resources across the Internet and the web, and encode them into a different wire format than the one we might be more familiar with.

And this HTTP/3 mapping explains how features of the QUIC transport protocol are used to carry those semantics.

As for the history of these things, I could probably spend a whole hour just talking about that.

I spent a long time creating some fun images. They look like trees. I think the technical term is cladogram.

It doesn't matter. Basically, there's a lot of extant documents out there and metadata around them that let you explore the relationships between these things.

And you have documents that say Sir Tim Berners-Lee published a version of HTTP to explain how parties on the web can communicate.

And then people, standards people, decided that this is a pretty good thing.

And actually, to get more computers and vendors to interoperate, you could take these things, these documents, and make them more rigid and define things better.

So, there's a whole load of years of standardization effort for what's known as HTTP 1.x these days.

And then, on to HTTP/2. So, this started with something called SPDY, at Google.

And you can see on the right-hand side of these slides, there's a whole bunch of documents and branching and all kind of clever stuff that goes on here.

But if you're interested, I recommend a post there.

It's pretty lengthy, but it tries to explain what standards are and how the process of writing these documents works.

But anyway, we're more interested in the new stuff, right? So, HTTP/3, that's what we're talking about here.

It stems from something called Google QUIC, which we'll explain in the slides.

And basically, there's been a bunch of people across the industry working on this thing.

And so, we have this big grid of interoperability.

So, every couple of months, people from, say, Cloudflare and Google and Mozilla and various other companies, both software vendors and services, get together and fill in this big grid.

There are letters and color coding, and it's incredibly complicated trying to explore and test and verify that different features of the transport protocol and HTTP/3 work together.

But right now, this thing is pretty hardcore. You might feel, to lean on the gaming aspect of this talk, like you died.

A critical information overload.

And actually, you might just be confused about what HTTP is again.

So, I'd like to go back to basics a little bit here and think analogies, bad analogies, using computer games.

I've got to be honest that these always break down, like every analogy; I could use cars, I could use trips, or anything.

So, you just kind of have to bear with me on this one.

But if we imagine a client and a server: we send a message from the client, it goes over to the server, they respond, and it comes back to you.

That medium between the client and server paddles is the Internet.

And you can see in this case that the client didn't actually manage to catch the response.

Pretty weird, right? If you sent a message out there, you'd want to be able to see what happened.

So, this is a strange game. And you might say the only way to win is not to play at all, but we want to do better.

So, if we make the client and the server a bit more adaptable, say, they can move.

The client still didn't catch the ball here, but it tried.

And what I'm trying to emulate here is the communication between the client and the server; imagine the white ball here is an HTTP message.

There's some work that needs to be done before the server can even get that message.

Again, this is pretty bad analogy, but just bear with me.

You can see here there's a few bounces back and forth of the ball.

These are actually pretty indicative of the three-way handshake that would happen between a client and server.

At a lower layer, you know, below a web server or below a web browser, this would have to occur before, at the very bottom, the HTTP request and response could even be sent and received.

This is pretty accurate. This is what happens on the Internet. And so, that's all well and good.
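
To spell out what those bounces represent, here's the classic TCP three-way handshake, with TLS layered on top afterwards:

client -> server: SYN
server -> client: SYN-ACK
client -> server: ACK (connection established)
(then the TLS handshake, and only after that the HTTP request and response)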

If the client is able to catch the response, then great. But, you know, the Internet's the Internet, and stuff can happen out there.

You have hardware, and any time you involve people and the physical nature of stuff, stuff can go wrong.

You know, there isn't a bad guy out there shooting down your packets, but in reality, packets do get lost.

They can get put into buffers; imagine someone sending stuff very fast.

It can buffer up while you're waiting to read it, and what happens then is packets just get thrown away in order to keep pace and not end up denial-of-servicing a machine.

They can get reordered, too. All sorts of weird stuff can happen at a lower layer than the HTTP request, and the thing to bear in mind here is that the client is none the wiser.

All it knows is it didn't get the response, and that's not a very good position to be in.

So, how can we mitigate problems like that?

Imagine a client like a web browser: what they tend to do is open up multiple connections.

Imagine each of these paddles and balls represents a TCP connection.

You effectively halve your chances of getting disruption by opening up another one, and, in effect, this is what we see with web browsers.

They open up multiple TCP connections for HTTP 1.1. That's the nature of it, but, in reality, we aren't just sending single packets.

What we're doing is sending effectively a stream of things.

If you imagine a request for an image, there's going to be a fair few bytes in there, and they don't fit in a single packet.

In actuality, you might have a small request, and the response the server gives you is pretty big; in HTTP/1.1, it's sent as a stream of bytes on a TCP connection.

And so, if you take that idea, actually, we might be able to do something better.

We might be able to separate out that stream into something a bit more well-formed, into atomic units, frames of bytes. In HTTP/1.1, you have something called transfer encoding, and this kind of represents that view.

Rather than Pong, we've got Snake. And in HTTP/2, there is a thing called a frame.

You split the HTTP metadata and data into two different things.

So, we take a request, for example, here.

A little bit bigger for you. Just a simple GET request for index.html.

In HTTP/2, what you would do is encode that into something called a headers frame.

Split them into these fields, and then the server would receive that request, and it would process it and generate a response for you.

That response would come back in the form of a headers frame again, the metadata, and then the payload of the image, in this case, is a data frame, and there's a bunch of other frames here.
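
As a rough sketch of what that exchange looks like (the pseudo-header names are from the HTTP/2 spec; the values here are just illustrative):

Client sends:
  HEADERS frame (stream 1): :method = GET, :path = /index.html, :scheme = https, :authority = example.com
Server sends:
  HEADERS frame (stream 1): :status = 200, content-type = image/png
  DATA frame (stream 1): the payload bytes of the image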

The important thing to know is that these frames are delivered on logical streams.

They just have a numerical identifier, and they have some properties. The first thing they provide is reliable delivery.

So, if you request something, the way that HTTP is designed is that it expects objects to be delivered reliably.

There may be clever things you can do in the future; imagine a world where partial reliability could help you do clever things. But in this case, it is reliable, and that's needed for the basic way that HTTP is designed.

The second property is that you expect things to be delivered in order, so that you can read the start of a response and just read it through to the end, and not need to do any clever additional reordering or data gathering or anything like that.

So, yeah, we have streams. What that lets you do, if you can imagine, is send different headers and different data, and split messages up into units that can effectively be multiplexed.

In HTTP/1.1, you can't do that.

You just have this kind of amorphous stream of things. If you mixed up different requests and responses, it was really easy to get confused because of the expectations of HTTP.

That can lead to some kinds of attacks, request smuggling as they're known, but HTTP/2 is really strict.

It takes what was a pretty loose, old, textual style of things and converts it into a stricter binary frame format, which has well-defined sizes and all those things.

So, when the server is generating responses, it can decide to intermix the green and the yellow together, or decide to schedule those things however it likes.

What this allows you to do, like we mentioned before where browsers would tend to open up multiple connections:

By multiplexing requests and responses, we're able to reduce that down.

So, the benefit for the client is that it can simplify a lot of the logic needed to manage a connection pool.

They still do need it for different kinds of first-party and third-party resources.

But from a server perspective, this is a big benefit, because what you find in practice is that a web browser could open up a single TCP connection,

have to go through that three-way handshake, which adds latency to the connection, just to retrieve a single resource, and then close that connection.

When we're thinking about web performance, we're thinking there's a lot of aspects here, right?

But the things that I can focus on and help improve are the transport layer and the interaction between clients and servers. Once you've got those things, you've got the whole processing of objects, and there are loads of great web performance tools in all of those areas.

But for me, it's focusing on this HTTP semantics layer down into TCP and QUIC.

So, TCP, what is it? It's reliable delivery, and it's in order.

But it provides ordering across a whole connection. You can have different independent HTTP/2 streams, but they all get funneled into the same TCP connection.

No matter what order you put things in, they'll come out in the same order.

And that's a nice guarantee. It lets you make assumptions about how to process things.

But it has a downside. So, if we go back to our example, and we have this bad guy causing us to lose a packet: even though we're multiplexing different requests and responses, if we lose just one of those packets, then because of the guarantees of in-order delivery, TCP effectively blocks the rest of those frames; the data for different requests gets stuck.

The packets don't get stuck in the Internet, but basically what will happen is that the client's TCP stack will sit there and hold the later data.

It won't present it up to the browser or the web client in order to process it.

And this sucks because, from the browser's perspective, it just took a long time to retrieve that response.

In actuality, it might have received 90% of the response for a picture or something, but it can't present that, which adds latency and an annoying kind of jitter or jank to your web browsing experience.

So we've got this thing called HTTP/3, which I already mentioned. It's shiny.

It's happy. It's great. But, yeah, what is it? So, think of something like Jenga, how things stack up.

We have HTTP/2 that's built on TCP, and we've got TLS on top of that TCP layer.

That's Transport Layer Security, or you might be familiar with it being called SSL from the old days.

This is what you use when you type in HTTPS and get a page.

It provides security by giving you authenticity, confidentiality, and integrity for your communications.

It's a big, important thing.

It lets us build a lot of additional value on the top of the web platform and experience.

We see now that powerful APIs are restricted to secure contexts, and over the last few years, this adoption has gone up.

So, what we have on top of that is HTTP/2.

And within HTTP/2 itself, streams, which we talked about before.

So, we've got QUIC and HTTP/3. We have UDP at the bottom, at the same layer as TCP.

UDP is an unreliable protocol. So, given the needs of HTTP/3, we need to add something called QUIC on top, which provides reliability, authenticity, and integrity.

It borrows features from TLS to be able to do that. So, it uses a handshake very similar to TLS 1.3 these days.

But what it also does is take the valuable streams from the application layer, which used to sit on top in HTTP/2, and incorporate them into the transport layer.

And on top of that, we've got HTTP/3.

So, HTTP/3 explains how to use the transport-layer streams, but it doesn't have streams of its own.

So, what this lets us do, in the standardization work that we've been doing, is pull things like reordering, reassembly, and retransmission for when things go a bit wrong down into a layer that can be reused.

So, you can have a, I don't want to call it simpler, but a less complex transport layer that is able to offer all these benefits to other protocols that could benefit from streams and security, which UDP typically is unable to provide.

There are things like DTLS and other amazing protocols. I mean, you can do whatever you want on top of it.

But the real value from this is providing a protocol that's highly interoperable and can be reused.

And that's the kind of thing that excites me. So, we talked about HTTP/2 frames.

We have all of these things I mentioned already, headers and data and a lot of other things here that all kind of fall into three buckets.

We've got stuff that's about request and response, like we already talked about.

We've got server push, which is complicated.

We might talk about that another time. And there are things that are more about managing the connection.

Exchanging settings to tell your peer, your client or server peer, what facets of the connection you're willing to operate on that one connection.

Things around management of the connection lifetime: telling the peer to go away, or sending a ping to keep the connection alive.

So, these things are there. If you're a web developer, you might not ever see those things.

It's typically something that's managed by a library or by the browser on your behalf.

So, when we came to do HTTP/3, some of those things got pulled down into the QUIC transport.

It kind of went on a bit of a rampage.

You're familiar with that old game where you have these tower blocks and you start to destroy bits and pieces of them and knock them down.

So, HTTP/3, in a lot of ways, is a simpler protocol than HTTP/2 when you look at these things in isolation.

When you combine it with QUIC, it's as complicated, or more so in some cases.

But if you're just thinking about HTTP/3 itself, we have frames: headers and data.

We've got more frames to handle server push, which is why it's a good candidate to explain in a dedicated session of its own.

And we've got some settings.

We won't talk about those too much in this talk. And then we've got QUIC.

And this is what I was trying to explain with the combining and the complexity.

There's a lot here. We don't have the time to talk about everything today, and we don't really need to worry about it all so much.

But what we have is things around connection management, crypto frames, handshaking, stuff to help set up and tear down a connection.

We've got some stuff around streams themselves, managing the maximum number of concurrent streams active in any one session.

Because every stream requires some state.

You might want to control that and manage that.

And we've got things like flow control and acknowledgements and loss recovery.

All really cool and interesting stuff. For me. Maybe some others. But let me know if you've got questions or stuff you'd like to hear about in later talks, then please engage.

And so, taking all of those frames and stuff: if we look at how a QUIC connection handshake works, instead of TCP and TLS messages being sent back and forth, we just have QUIC.

And once the QUIC handshake is complete, we can start to do HTTP requests and responses.

You might notice here that the handshake is a bit shorter.

Rather than a three-way handshake, the protocol is designed in such a way as to eliminate one of the round trips, in effect.

Call it kind of like a half round trip or one and a half round trips.
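
In round-trip terms, the saving looks roughly like this, assuming TLS 1.3 and no 0-RTT resumption on either side:

TCP + TLS 1.3: 1 RTT for the TCP handshake + 1 RTT for the TLS handshake = 2 RTTs before the first HTTP request
QUIC: 1 RTT for the combined transport and cryptographic handshake before the first request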

But what this visualization is trying to show is by cutting out one of the bounces compared to our old TCP example, we're able to mitigate that guy.

This isn't exactly what happens in real life, but it's quicker.

And with things taking less time and fewer interactions, you can mitigate some of the risks that happen in the Internet.

Ultimately, what QUIC can do is speed up resource loading.

And that can provide a faster web experience. So QUIC streams, as mentioned, are a first-class transport primitive.

Independent delivery between streams is the important thing for this talk.

So although there is guaranteed delivery within a stream, delivery across streams is independent.

And this is where QUIC gets really clever. When you create what's known as a QUIC packet, something that is encrypted, it sits within a UDP datagram.

It contains some payload data that could be stream frames for a single stream or for multiple streams.

And if that packet gets lost, the sender is able to detect that through acknowledgements from the peer.

And at that point, it doesn't just retransmit the QUIC packet.

It's able to retransmit what it decides. So it can rebundle that data.

There are a lot of different strategies here. But the benefit is that, on the receive side, the different streams can progress independently.

If you go back to this previous example, where the client is sending two different requests and getting two responses: even if we lose a packet for one of those, in this case yellow, the green stream is able to progress.

As I said earlier, all of that data is still collected into buffers.

But the way that this data can, when you're implementing a library, effectively get stored and presented up to the application reduces that kind of dead-air time where seemingly nothing's happening.

So that's great. But it's not a magic bullet.

There's some complication that happens here, which I might come on to later in the talk.

But yeah, we've got streams. And in HTTP/2, there were two kinds of streams:

server-initiated and client-initiated. In QUIC, there are four.

Streams have an ID, just a plain number. But what QUIC does is take the two least significant bits of that stream ID and use them to encode the stream type.

So you have these things called client-initiated bidirectional and client-initiated unidirectional, and the server-initiated equivalents.

And they have meanings. And so what I'm trying to do in this talk is explain that there's streams.

And later on, we'll show that if you're trying to develop an HTTP/3 application, what you need to do is send data on streams.

And they mean different things.
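
Concretely, since the type lives in the two low bits, it's just the stream ID modulo 4:

ID % 4 == 0: client-initiated, bidirectional (0, 4, 8, ...)
ID % 4 == 1: server-initiated, bidirectional (1, 5, 9, ...)
ID % 4 == 2: client-initiated, unidirectional (2, 6, 10, ...)
ID % 4 == 3: server-initiated, unidirectional (3, 7, 11, ...)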

So, if you're opening up Wireshark, for instance, and you can see the stream ID there: if you're familiar with H2, it means something slightly different, and it won't meet your expectations.

Maybe by the end of this talk, you'll be able to do a Wireshark trace of your own and start to probe what the message data being sent back and forth means.

We've got unidirectional streams. This is a new concept.

This is, you know, where either a client or a server can open a stream by sending a frame with a new stream ID, and do something with it.

And QUIC provides this capability but doesn't tell you what to do with streams.

It's up to something like HTTP/3, an application mapping, to tell you.

So HTTP/3 says the first byte of every unidirectional stream contains a subtype. There's no need to fix stream IDs for special purposes.

So anyone familiar with HTTP/2 would think: oh, stream 0, that's special.

That's the one I'd send connection control frames on.

But in HTTP/3, that's not the case. You just open up a stream as the control stream.

That could be any ID you like. You probably want it to be quite a low-numbered one that happened early in the connection, but it provides some flexibility.
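
For reference, the unidirectional stream subtypes that HTTP/3 defines in that first byte are:

0x00: control stream
0x01: push stream
0x02: QPACK encoder stream
0x03: QPACK decoder stream

Plus reserved "grease" values of the form 0x1f * N + 0x21, which exist purely to exercise the extension points; we'll see one of those later in the Wireshark capture.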

That's an interesting thing. It's a powerful thing, but it comes with some responsibility and it's fun.

So, when we come to set up the connection, there are a few things to do here.

Both sides need to create special streams. I mentioned the control stream.

There's also something called QPACK, which has an encoder and a decoder stream.

Both client and server need to create those three streams. So this is effectively six streams just to get an HTTP/3 connection handshaken and done, ready to make some requests and responses.

But given the way that they're bundled into packets, things can happen slightly more quickly.

You can do these a bit lazily, and you can also, as a client, be a bit cheeky and send a request before opening up these streams, which is legal.

It's all possible with QUIC. What it means in implementation terms is that effectively there are some default settings for how big things can be, or how many concurrent requests there might be able to be.

So the defaults are kind of quite low and sensible to start with.

By sending some messages on the control stream, the server can open up the window there to tell the client what it's actually able to do.

And this QPACK thing, that might not make any sense; I kind of glossed over it a lot in this thing.

It sounds like something in H2 called HPACK. This is effectively HTTP/3's version of header compression.

I'm not going to go into much more detail than that, but it's pretty complicated, even compared to HPACK.

But effectively what it's trying to do is deduplicate data, metadata, that is common across requests and responses.

So, if you think of a web browser within a session sending a long user agent string over and over and over again: by effectively compressing that thing you save some bytes, and saving bytes helps lots of things.
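
Conceptually, glossing over the details of QPACK's instructions and indexing, the saving works like this:

Request 1: user-agent: Mozilla/5.0 (X11; Linux x86_64) ... sent in full, and inserted into a dynamic table shared with the peer
Request 2: user-agent sent as a small index that refers back to that table entry, costing only a couple of bytes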

But the important thing here is HTTP/3 bidirectional streams.

These types, this is where it's similar to H2.

All client-initiated bidirectional streams are HTTP requests.

There's no way to do anything else on that stream type. If you do, it'll be detected as a protocol error: bad stuff.

So when we're doing things like interoperability testing, these are really the kind of basic things that we're focusing on immediately.

Can you do this thing? If I send you a bogus frame type, do you detect that as an error and close the connection?

As we continue to develop these standards and provide implementations and deployments on the web, we build up more complex test cases, which is a cool thing.

But the HTTP/3 spec doesn't do anything for server-initiated bidirectional streams.

That's an extension point, maybe in the future, if someone can think of something.

But given the nature of HTTP being client server request response model, it fits.

Yeah. But anyway, back to our frames model: you know, streams and frames. QUIC has had a rampage through them.

We had this priority frame. It's gone. What does that mean? Again, this is a big topic.

It's to do with the lack of ordering between stream frames. And so, with priorities in HTTP/2, they use a directed dependency tree.

That requires receiving things in order.

If you say something like A depends on B, but the message about A arrives before you know what B is, then it's very confusing.

So, we as a group of people in the IETF and the QUIC Working Group tried to puzzle through all the different edge cases that occur here.

It took a lot of airtime in, you know, standards meetings and other discussions trying to resolve this thing.

And ultimately, it was too hard, and there were too many ways it could go strange, mysterious places.

So, if anyone's familiar with some of the work that Cloudflare has done over the last couple of years on improving the way that HTTP/2 prioritization works, it might be strange to see that in HTTP/3 there isn't any.

But this blog post kind of explores how the working group and community is coming together to provide something else.

And that could be a topic for another talk in the future.

But enough of the boring stuff, right?

I've spent 30 minutes waffling about things. So, how can you try this today?

What's out there? We talk about browsers and whatever in the abstract sense.

This is quiche. A picture of a quiche. What is it? It's an implementation of HTTP/3 and QUIC that was developed by Cloudflare.

It's written in Rust, and it's got a C API.

What this is, effectively, is the library that powers Cloudflare's offering of HTTP/3.

So, if you've signed up and you've gone into the dashboard and enabled HTTP/3: quiche is running integrated with NGINX, which is doing a lot of the HTTP semantics, the handling of requests and responses.

But when it comes down to reading and writing streams and frames, quiche is the thing that powers that.

It's an open source thing. And I do have a T-shirt. Yes. Yes.

So, as I kind of joked about at the start, I'd been involved with QUIC for a few years before joining Cloudflare.

And then, you know, one day I got in, and the mastermind behind quiche, Alessandro, had done some work on this.

And I was able to actually take a lot of the theoretical sides of stuff and start to contribute and implement these things.

It's been great to actually get hands on learning and deploying code that works.

But, you know, just doing the server side of stuff only gets you so far.

What else can you do? You need to be able to interoperate. And the best way to interoperate is just to load a web page.

Like I said, I would recommend that anyone who's really curious just give it a go, especially if they don't want to go and, say, build their own web server or build code from scratch, which you can do.

It's fairly straightforward. But for a lot of us, we just want to maybe install a piece of software and give it a go.

The next couple of slides will show you how to do that.

But you can take your own web page, say, run it through Cloudflare.

And these blog posts provide some guidance on that. What does it look like?

You know, enabling it, what does that do? So, say you were to put your web page, basically, you want to advertise the availability of HTTP.

This is a different transport protocol.

There's a few complications here. And so, clever people a few years ago designed something called HTTP alternative services.

It's RFC 7838.

And what that effectively does, I don't know if this looks complicated or not.

I've seen so many of these things, it's hard to tell anymore.

But in the simplest case, you hit a website and request something like index.html.

And the response to that request will include something like this: an advertisement for an application layer protocol like HTTP/3.

And in this case, we have a draft version.

So, as we are developing the specifications, we find issues, we address the issues, we try to improve interoperability.

And yeah, so far, 28 is good.

We're fixing lots of things. But anyway, the left hand side of the equal sign here is basically the description of the application protocol.

You go and look in your dev tools and look for this header.

You might see different strings, which is to be expected.

But in this case, for a server that wants to talk the most recent draft version of the HTTP/3 protocol, it would put that string.

On the other side of the equals sign, in inverted commas, is a description of the host and the port number.

In this case, there is no host, and what that means is: come back to the same host and connect to this port number.

But because the left-hand side said HTTP/3, that's a UDP port number.

And it's not going to connect directly with UDP, but it is going to use QUIC.
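
Putting those pieces together, the header would look something like this (the draft number and port here are illustrative):

alt-svc: h3-28=":443"; ma=86400

That reads as: this origin also speaks HTTP/3 draft 28 on UDP port 443, and you may remember that advertisement for 86400 seconds, that is, 24 hours.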

And then there's how long that advertisement stays fresh for... Freshness and management of alternative services is a bit of a funny one.

As it always is with caching.

People will do what they like to do. But effectively, what will happen when a browser sees that is it will try to opportunistically upgrade the connection from whatever version of HTTP it was talking at the time to HTTP/3.

And on some networks, that might not work.

There might be problems with QUIC packets getting back and forth.

But the idea is that this is a graceful upgrade. So if something were to go wrong with HTTP/3, you can just fall back, or never switch over in the first place.

And the web browsing and web page loading experience should just work. So how do you do this?

Say that you do hit a web server, just with curl, and you can see that header coming back.

How do you connect? You can use something like Firefox Nightly.

Go into about:config and enable HTTP/3. And there's a good chance it will work.

Sometimes it doesn't. And that's unfortunate.

You can enable logging. Neqo is the library that powers Firefox's QUIC and HTTP/3 implementation.

So, if you were to run it from a command line and provide this environment variable...

Yeah. You might get some extra debug.

You can see the different frames being sent on the screen. And all kinds of fun stuff.
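
By way of example, the knobs look something like this; the exact pref and log module names have shifted between Nightly builds, so treat these as indicative:

about:config -> set network.http.http3.enabled to true
MOZ_LOG=nsHttp:5 ./firefox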

And what this would look like, say, if you were to load blog.cloudflare.com, which has HTTP/3 enabled, and went into your DevTools before you loaded the page, and enabled the protocol column in your DevTools, which you may or may not have.

But if you did have that, you would see the first request would use HTTP2.

And then the following request would use HTTP3.

Because it learned about the availability of HTTP3 on that first response.

Now, the discovery would be cached.

And so, if you were to do a subsequent page load in the future for a non-cached item, there's a good chance it would hit HTTP3 straight away in that case.

But it might not. And so, one area that people have found a bit confusing, and I know it's caught me out once or twice, is that you visited a page, a resource was fetched with some protocol, and it got put into the cache.

And the next time you load that page, DevTools is going to tell you the protocol version that was recorded when the item was cached.

Even while some of the resources are transferring over HTTP/3, like the non-cached ones, it can be pretty confusing: you might think that QUIC is failing for some reason, and it's not.

So, if you're just testing purely because you're interested, try to disable the cache.

Chrome, another browser.

You can use Chrome Canary. Again, the support for the protocol is experimental.

We're still trying to work through the standards. We're doing interoperability, and it's good.

But things change, and things can go a bit wrong.

So, these things aren't in stable versions of the browsers quite yet.

But again, if you run Chrome with these flags, you can look at the protocol column in DevTools and see whether H2 or H3 was used.
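
At the time of this episode, the flags were along these lines; the draft number changes as new drafts ship, so check what your Canary build expects:

chrome --enable-quic --quic-version=h3-28

There's also --origin-to-force-quic-on=example.com:443 to force a specific origin over QUIC, which can be handy for testing.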

And it suffers from the same kind of issue: it can be a bit hard to do these things with page loads, and if something goes wrong, sometimes it doesn't quite work.

So, ultimately, a lot of the time, what you need to do is dig into logs.

So, Chrome has a feature called NetLog, which you can effectively enable.

You need to explicitly enable NetLogging for a particular session.

If you were to do that and open up a page and try to load anything, it's going to capture that session.
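
Two standard ways to capture a NetLog: visit chrome://net-export and start logging before you load the page, or launch the browser with a flag, something like:

chrome --log-net-log=/tmp/netlog.json

The resulting JSON file can then be loaded into the NetLog viewer for inspection.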

In this case, difficult to see because of the imagery, sorry.

But yeah, what this is doing is creating a QUIC stream.

It's sending handshake information, flow control, stuff like that.

And this is really good, say, if you were to discover some kind of bug: a strange behavior that you believe is a bug.

You may want to report this, and for anyone who is a developer of these things, these logs are really useful.

So, that's a Chrome-specific tool. We've got other tools, too, like curl.

It's a command line tool. It actually has two different libraries that help power its HTTP3 and QUIC capabilities.

There's a nice document that Daniel Stenberg has written up, which I refer to quite often, as I go and do a monthly build of curl to try it against some stuff and see what's happening.

So, it's got two different modes of operating.

A kind of forced HTTP/3 mode, where you say: use HTTP/3 and try it against this URL. That may or may not work, because the server you're pointing it at might not talk HTTP/3.

Or you can ask it to use the alt-svc cache to look for an advertisement.

Try and connect to it like we talked about before.
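
In command form, those two modes look like this, assuming your curl build has HTTP/3 support compiled in:

curl --http3 https://blog.cloudflare.com/
curl --alt-svc altcache.txt https://blog.cloudflare.com/

The first forces HTTP/3 directly; the second does a normal request, stores any alt-svc advertisement in the named cache file, and upgrades on subsequent requests.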

Of course, we've got our old friend, Wireshark. So, yeah. Ultimately, one of the best things you can do in these cases is to run Wireshark and see what's happening on the wire.

So many times, you know, this is new code, new code paths, and applications might not necessarily log everything that you need to see, and can give you a false sense of what's happening.

It's always good, at least in my experience, to run a kind of independent verification by looking at what's happening on the wire.

And Wireshark has had great support for QUIC for a while now, and I'm fortunate to work with one of the core developers.

I constantly go to his desk and nag him about stuff I'd like to see in Wireshark when it comes to QUIC.

So, you know, a lot of tools offer an ability to help here, but QUIC being an encrypted transport means it's not so easy to dissect what's happening.

You need the decryption keys.

If anyone's ever tried to dissect a TLS session with Wireshark: you can see what's happening at the TCP layer, but above that, it can be a bit tricky.

And what you can do is run tools with the SSLKEYLOGFILE environment variable, which will dump the session keys and let you decode stuff.

But for QUIC, this is an absolute necessity because the QUIC packet is effectively encrypted.

There are some plaintext header fields, but you're not going to be able to see what's happening with the stream frames themselves, or the application layer above that.
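
So the workflow is roughly: run your client with the key log enabled, then point Wireshark at the dumped keys via Preferences > Protocols > TLS > (Pre)-Master-Secret log filename. Assuming your client's TLS stack honors the variable, that's just:

SSLKEYLOGFILE=/tmp/keys.log <your client command here>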

So, as an example, we have the quiche project's example client and server, which we use for some of our interoperability testing.

And you could, you know, very quickly, if you wanted to see an example of a Wireshark dump, run these commands: build a client, build a server, run them against each other, and fetch index.html from localhost.

We also provide quiche as a library.

You could do this yourself. We've got some documentation.
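
From memory, the commands look roughly like this; the flag names may have drifted since, so the repo's README is the authority:

git clone --recursive https://github.com/cloudflare/quiche
cargo run --bin quiche-server -- --listen 127.0.0.1:4433 --root ./htdocs
cargo run --bin quiche-client -- --no-verify https://127.0.0.1:4433/index.html

And to get the PCAP, capture the loopback traffic while they talk:

tcpdump -i lo -w /tmp/quic.pcap udp port 4433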

By the way, if you were to run those commands, get the dumped PCAP file, and analyze and dissect it, you'd get something a bit like this.

And this is changing as we develop the standards and release new drafts.

And the Wireshark team is pretty agile with keeping up to date, but things change.

So, some of these images might not reflect exactly how it would look today, and you might need to go and download a nightly release of Wireshark, etc.

But what we see here in this particular capture that I have is streams 2, 6, 10, and 14 all being opened by the client.

And marked here in red, you can see the client control stream, the client QPACK streams, and something called the grease stream.

It's used to test out extensibility points of the protocol to help protect us from people not implementing the extension points and therefore never being able to use them.

And that's a story for another day.

And then below that in the bottom half of the diagram, we can see that the server is doing similar here.

But ultimately, on a stream, in this case 0, after doing all of the control and QPACK streams, the client sends a request.

The server responds. And this is what it would look like if you were to open the detailed breakdown view.

You can dissect the QUIC packet. This is IETF QUIC, not Google QUIC.

But we can see the stream ID and the FIN flag. So this is just a stream frame in a QUIC packet that contains some data.

And because I can read hex, or because I know what the client sent in this case, it had a headers frame in there.

It had a grease frame, not a grease stream this time. Again, testing out the extensibility points of the protocol; and the content of that frame was "grease is the word".

And then some data. In this case, the file wasn't found, because there wasn't an index.html on the system.

And some other tools for debugging HTTP/3: something called qlog, and qvis.

So this is something I'll probably talk about in another session, but it's a great toolset for visualizing what's happening at the QUIC and HTTP/3 layers.

And it was developed in acknowledgment that, in a future of encrypted transports, it's difficult to see what's happening between client and server when you're just a person in the middle trying to stare at packets.

Especially some of this stuff is stateful. It can be difficult unless you're there at the very start of the session to figure out what's happening.

There are also implementation choices, like when to send a packet, that endpoints would never expose on the wire.

You see the output of the decision, but not the reason necessarily why it was taken.

So this was an effort to do endpoint logging.

Robin Marx is one of the main people behind this. I first heard about his idea a couple of years ago, and it seemed like, in theory, it could be really useful.

And actually in practice, it's turned out to be very useful, at least for the team at Cloudflare and many other people in the community to be able to explicitly log in a reference format what's happening at endpoints.

Then Robin has a load of visualization tools that are online that let us visualize what's happening.

So, for example, with qlog, you can set an environment variable called QLOGDIR, and it'll dump out logs from the client or the server.

And then we can feed those into a tool that can visualize flow control and congestion windows.
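
For example, with quiche's example tools, which honor the QLOGDIR convention:

QLOGDIR=/tmp/qlog cargo run --bin quiche-client -- --no-verify https://127.0.0.1:4433/index.html

Then load the .qlog files that appear in /tmp/qlog into qvis (hosted at qvis.quictools.info at the time of this episode) to see the timelines and congestion graphs.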

And you can see that this is all stuff that can affect performance these days.

QUIC runs as a user-space, application-space protocol implementation in a lot of cases.

So there's a lot of experimentation and agility that we can do here.

But without a good way to visualize that, it's very easy to do stuff wrong.

So these are very powerful tools to help understand exactly, say, if you're loading a bunch of 10 parallel streams, how you might decide to pack the QUIC packets and send stream frames and do things.

You might not know how to do that best from day one, but by being able to measure and see what you do now, you then have a baseline: try to do this again in the future, see what the differences are, compare yourself to other implementations.

And you can reason about the choices that you're making; which, I think, if you look back at HTTP/2 and some of its early days and where we're at now, people made decisions at the time and didn't have an effective way to assess whether they were good or bad.

So, I talked about quiche, I talked about Chrome and Firefox.

There's a whole load of other implementations on the QUIC Working Group's wiki pages.

There, we track the results of interoperability testing.

It lists the libraries and implementations, what languages they're written in, and stuff like that.

Typically it's C, but, you know, quiche is written in Rust.

We've got some Go implementations, a whole bunch of stuff.

So, you know, this is a very active area of development and it's great that we've got so many different interest groups doing this stuff.

We've got some closed source ones too. It's not just open source, you know, the value of standards is that you can implement them how you choose.

But in our case, the value of open source is being able to show people how this works and get contributions in.

The community is very good in that respect, and we get input both from library users and also application developers like curl asking us to fix an API that's a bit weird, and that kind of benefits everyone.

The whole point of this talk was to talk about performance. Getting low on time, so I'm cognizant of not overrunning.

But, you know, the question after all of that, after all the theory and the basic experimentation, is: how will HTTP/3 improve the web?

And we focused a bit on these contrived examples of losing packets, but really, in effect, the protocol design means that HTTP/3 probably can improve web performance, especially on those adversarial networks where latency is high and packet loss is non-negligible.

Packet loss can happen anywhere, but especially in mobile networks, for example.

And, you know, with Google QUIC, part of the reason it got accepted into the IETF was demonstrable data and statistical significance showing that this is a real way to improve certain aspects and fix certain problems that TCP encounters, in a way that adds security.

But this is all new. Implementations are in development.

There's rough edges. Lots of improvement opportunities.

So, if we were to take a web page like blog.cloudflare.com right now and try to load it in an experimental or Canary build, we might find, you know, it doesn't load everything exactly the same.

There can be some outright failures. There can be different kinds of flow control or different effects at play.

So part of the reason for shipping this stuff is to get experience in deployment.

And so, you know, measuring this stuff is really tricky.

It's a lot of work. There's synthetic testing.

There's real world testing. More testing the better. And in this case, yeah, tools are your friend.

Dissecting the lower layers is tricky, especially when you're focusing between layers.

The opportunities here to improve the performance of HTTP/3 are things like how to prioritize HTTP requests, what to put in your QUIC packets, when to send them, when to retransmit them.

There's a lot of stuff here.

An example that we did was to change the Reno congestion control. There's a great blog post by my colleague Junho.

So do give that a read; that's what it's about. And we can look at this at a later time.

Don't forget stuff like connection migration, which affects where we can send packets.

But yeah, for now, that's me. I'm going to say thanks.

Go away and close the connection. And if you've got any comments, please let me know.

Thank you. Optimizely is the world's leading experimentation platform.

Our customers come to Optimizely, quite frankly, to grow their business.

They are able to test all of their assumptions and make more decisions based on insights and data.

We serve some of the largest enterprises in the world.

And those enterprises have quite high standards for the scalability and performance of the products that Optimizely is bringing to the market.

And we're really excited to be a part of that. We have a JavaScript snippet that goes on customers' websites that executes all the experiments that they have configured, all the changes that they have configured for any of the experiments.

That JavaScript takes time to download, to parse, and also to execute.

And so customers have become increasingly performance conscious. The reason we partnered with Cloudflare is to improve the performance aspects of some of our core experimentation products.

We needed a way to push this type of decision making and computation out to the edge.

And Workers ultimately surfaced as the no-brainer tool of choice there.

Once we started using workers, it was really fast to get up to speed.

It was like, oh, I can just go into this playground and write JavaScript, which I totally know how to do.

And then it just works. So that was pretty cool.

Our customers will be able to run 10x, 100x the number of experiments. And from our perspective, that ultimately means they'll get more value out of it.

And the business impact for our bottom line and our top line will also start to mirror that as well.

Workers has allowed us to accelerate our product velocity around performance innovation, which I'm very excited about.

But that's just the beginning.

There's a lot that Cloudflare is doing from a technology perspective that we're really excited to partner on so that we can bring our innovation to market faster.
