Cloudflare TV

Leveling up Web Performance with HTTP/3

Presented by Lucas Pardue, Barry Pollard, Dmitri Tikhonov
Originally aired on 

An HTTP header compression special

HTTP/2 uses HPACK as a method to save bytes when multiple requests in a connection send similar headers. However, the QUIC transport presented some unique challenges for HPACK, so a new method called QPACK has been designed especially for HTTP/3.

Find out more about how HPACK works under the hood, and what changes in QPACK.

Featuring Guests: Barry Pollard and Dmitri Tikhonov


Transcript (Beta)

The web, a digital frontier. I tried to picture bundles of HTTP requests as they flow through the Internet.

What do they look like? Compression tables? Weird Huffman thingy-ma-bobs?

I kept dreaming of visualizations I thought I might never see.

Then one day I read a book that made things more clear and then I saw some presentations from somebody who tried to change all of that.

Welcome to another episode of Leveling up Web Performance with HTTP/3, back to the regular programming slots.

This week I'm joined by guest Barry Pollard, who is the author of HTTP/2 in Action and does a lot of stuff around web performance, not just about HTTP/2 or 3 but a lot of different fields.

And I'm also joined by Dmitri Tikhonov, probably pronounced that wrong, I do apologize, who is the HTTP/3 tech lead for LiteSpeed Technologies and spent a lot of effort, say, as part of a design team within the working group to design a successor to HPACK, which is HTTP/2's header compression.

And so this episode is a special about this feature just because I very much glossed over all of that in preceding weeks.

You know, we've meandered through topics like the transport protocol, encryption, the different aspects of HTTP/3 like prioritization and push, but this is the one area where I've yet personally to implement this in our stack to the full extent of what these things can do.

So I thought it'd be great to pull in some people who have experience both in presenting this in an amenable manner to people and in implementing this in a highly optimized way.

So that's enough from me. Yeah, where do we start today?

I think, given Barry wrote the book on the topic, that must mean he's the expert.

So I'm going to throw Barry in the deep end and say, what is HPACK?

Why do we need it? Why does it help? And anything else you want to talk about?

Okay, well, I don't know if I'm an expert. I did write the book, but I am not involved in coming up with a spec like yourself.

So I'm sitting around from the outside and looking in there.

So everything I say is unofficial, non-IETF-authorized and the real truth.

But yeah, HPACK's an interesting part of HTTP/2 that I think is quite often overlooked.

To me, the aim of HTTP/2 was to try and remove some of the inefficiencies of HTTP/1.

So there's quite a number of ways it does that.

And one of the ways is in header compression. So for bodies, we've been compressing the bodies of HTTP responses for quite some time now.

So nothing is really delivered, at least in an optimal website, uncompressed.

So that might be images which are compressed in a special format, JPEG, WebP, PNG, whatever.

They're not sent as the raw bits and bytes.

They're sent in a much more optimal fashion. And even text, so HTML, JavaScript, CSS, that's usually gzipped or using some of the newer compression formats like Brotli and stuff like that.

So very little goes down the wire, actually, as raw, uncompressed ASCII text.

But for a long time, HTTP headers did.

So just a reminder, I'm sure most of the people listening to this know, but an HTTP request says, give me the web page, and the server will return both the web page, but also a load of metadata about it.

And even in the request, there'll be various metadata saying things like, I'm this browser, I support this type of format, I support this type of compression, I support X, Y, and Z.

And then the server can use that information to then deliver the optimal content to the browser.

So that metadata was traditionally, in the HTTP/1 world, given as a load of header field values that were literally just raw ASCII text.

And so you would sit there and say, get me this page, I'm user agent Chrome 56.

But the thing is, they were very verbose. So you don't say you're Chrome, sorry, not 56, I think we're up to 86 now, 56 is going back a few years.

So you don't just say you're Chrome 86, for various reasons.

And WebAIM has an interesting blog post on this about the history of the user agent.

But everyone says, I'm Chrome 86.

And by the way, I'm the same as Firefox, whatever, and Gecko this, and all through the history.

So it's actually quite a long user string. And that's coming along with every single request.

So you ask for the homepage in HTML, and it says, Hi, I'm Chrome 86.

It's the same as Gecko 49, which is the same as Safari, blah, blah, blah, blah.

And you get the HTML back, and you're, oh, I need some CSS.

Hi, I'm Chrome 86. It's the same as Gecko blah, blah, blah. And there's an awful lot of repetition there.

So that's the user agent, one common thing. But we sit there and say, what formats do you support?

What kind of compression do you support?

And every single request is pretty much sending the same things. I mean, they vary slightly depending on what you're actually asking for.

But there's a huge amount of repetition there.

And I think that's one of the things that HTTP/2 tried to address.

So they came up with this special format that's actually not part of the HTTP/2 spec.

There's a separate RFC for that one. RFC 7541. Sorry, I should have had that to hand.

Should have memorized that one, which is a special format for how you compress these things.

And there's a little bit of back story and history there.

So SPDY, which HTTP/2 is based upon, initially just used gzip compression effectively and compressed the headers.

That gives good compression, but there are serious security risks with that (the CRIME attack), where it's possible to work out sensitive data.

And you've got some of the sensitive metadata being sent, in particular, as cookies.

So if you can fish out what the cookie is by trying various combinations and quite quickly guess what that is, then that's not going to work very well.

So the question after those attacks was: do we go with no compression, or do we go with insecure compression, which is probably okay a large portion of the time?

But that small portion is quite important, particularly as e-commerce takes off and online banking, that sort of thing.

So those weren't just hypothetical attacks, right?

Those were demonstrated, actual things that people could do without humongous amounts of resources.

It's very easy.

I demonstrated in that book what you're talking about, hopefully in a way that's very easy to understand, how it's quite quick to get a cookie there.

It's not some, you know, you need a PhD or access to thousands of servers or anything like that.

You need the ability to inject something into the page, but with user-generated content, that's fairly simple.

So yeah, it is a real attack there. I'd say the vast majority of headers don't carry sensitive information.

Who cares what browser you're on?

There is a fingerprinting side to it in general, but mostly they don't.

But small bits of information do, and therefore you've got to protect that.

So they came up with this other spec, which takes a while to get your head around, to be quite honest.

Like the RFC is quite detailed and goes through quite a lot there and has some good examples to kind of guide you through.

But at a high level, basically what it does is it creates a table, two tables effectively, but a table of previous values.

So your very first request (that's not quite true, but in general) is pretty much similar to HTTP/1, but your second request benefits enormously because it goes: hi, user agent, use the same value as before.

What formats I like, use the same value as before.

Now there's a couple of intricacies about that, because it uses Huffman coding and various other techniques, and there's already a static table of, I think, 61 predefined values, common ones like the GET method, the /index.html path, common status codes and all those.

Those are predefined. So it isn't quite true.

The first request does benefit a little, but definitely the subsequent requests go back and say, yeah, see previous value, see previous value, see previous value.

And that enormously reduces down the amount of data being sent back and forth.
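
To make that concrete, here is a rough sketch in Python of the core idea (a toy illustration of my own, not the real RFC 7541 tables or wire format): a header that's already in a table can be sent as a small index, and only a brand-new header costs its full bytes once.

    # Toy illustration of HPACK-style indexing (encoder side only; not the
    # real RFC 7541 tables or wire format).
    STATIC_TABLE = [
        (":method", "GET"),
        (":path", "/index.html"),
        (":status", "200"),
    ]
    dynamic_table = []  # entries the decoder is assumed to have mirrored

    def encode(name, value):
        """Emit a tiny index if the pair is already in a table, else a literal."""
        table = STATIC_TABLE + dynamic_table
        if (name, value) in table:
            return ("indexed", table.index((name, value)))   # roughly one byte on the wire
        dynamic_table.append((name, value))                  # decoder adds it too
        return ("literal-with-insert", name, value)          # full header bytes, once

    # The first request pays for the big user-agent string; repeats cost one index.
    ua = "Mozilla/5.0 ... Chrome/86 ..."
    print(encode("user-agent", ua))   # ('literal-with-insert', 'user-agent', '...')
    print(encode("user-agent", ua))   # ('indexed', 3)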

Because this is both on the request coming in and also the response going back.

Because the response going back will have obviously the status code 200, 404, whatever, but also has a lot of other information and increasingly a large amount of information.

If you look at some of the new security headers, content security policy, feature policy, those sorts of things, they're quite large headers.

And there's more information there expected. And recently the Structured Headers RFC has been signed off, which sets the scene for more verbose and complicated headers coming in there.

And I think that's where it will really benefit.

And it sounds reasonably small. And yes, the headers in proportion to bodies coming back, large images, large videos, they are small.

I think on the request side, the headers are a good chunk of that.

And Cloudflare, I think a few years ago, had a good post on this called HPACK: the silent killer (feature) of HTTP/2.

And they did some stats there where they measured the stuff coming in.

I've lost actual stats now.

99% compression is jumping out at me, but I don't think that's quite true.

On average, we're seeing 76% compression for incoming headers.

Now, it's a lot smaller on the outgoing side because you've got larger body sizes, the larger images and media and that sort of thing, but it matters on the incoming side in particular.

And as I say, the whole point, back to the beginning point, is that HTTP/2 is supposed to make these requests cheaper and easier.

And we are living in a world where you no longer just ask for the page and a couple of images in there.

There's huge amounts of communication going back and forth between a web page and the server at all times and multiple servers.

So anything we can do there helps. I think one of the best things I like about HPACK is that unless you're a protocol geek and want to spend time writing about it, you don't need to care.

It just kind of works. If you're a browser maker or if you're a web server maker, like Dmitri is, then yeah, you need to understand all the nitty gritty details and be able to implement it and go through that.

But otherwise, it's the same as turning gzip on on the server.

It's a bit of pain setting up in the beginning, but once you're there and you've got it set to the right file type, it just happens.

It's seamless in the background.

I've never heard of any problems with HPACK, and I'm sure there are and people can get in touch if there are, but I keep an eye on the HTTP/2 tag on Stack Overflow and stuff like that.

And you get a few questions about people trying to understand how it works.

But in general, compared to some of the other things that we've done in the HTTP/2 world or HTTP/3 or other protocols, it doesn't tend to go wrong.

It either works or it doesn't. And when it does work, it's just free saving, free money and effectively free bandwidth that you're saving there.

Yeah, for sure. To give people an idea, in the ideal scenario, like you're saying, once you've inserted a value into a table, whatever the length of that is, the table sizes are controlled by the HTTP/2 settings that are negotiated by each side.

So you can constrain the amount of memory you need to commit to this thing, but in the ideal scenario, you can populate that table with a big header and then look it up with just one byte on the wire to transfer.

And so the compression ratio becomes as big as the header is, and that can be humongous or it could be small.
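
As a rough worked example (the numbers are illustrative, not measurements):

    # Illustrative numbers only: a typical desktop user-agent string is on the
    # order of 120 bytes; once it sits in the dynamic table, HPACK can
    # reference it with roughly one byte.
    literal_size = len("user-agent: ") + 120
    indexed_size = 1
    saving = 1 - indexed_size / literal_size
    print(f"~{saving:.0%} saved on that header for every repeat request")  # ~99%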

And then, like you say, in proportion to the body, we talk a lot about web performance here, but HTTP is used as a protocol for lots of other things like APIs or whatever.

But actually, in relation to those things, you can cut down on the chattiness of the protocol with respect to headers.

And there's no other way to safely do that. It's kind of the way I look at it.

Yeah, because there's a lot of talk about HTTP/2. Is it a replacement for some of the real-time APIs?

And it's not really because HTTP is quite chatty in nature and has a lot of verboseness around the thing, but this certainly helps.

Now, you do get, on the other side, people complaining that, well, HTTP1 is simple.

You can literally telnet to port 80, send GET, space, slash, return, and see stuff coming back.

But the reality is, for the vast majority of traffic, HTTP hasn't been ASCII for a long time.

It's been encrypted, whether it's with HTTPS or, as I mentioned earlier, images or gzip text or anything like that.

You need the tools to actually read this anyway.

You can't eavesdrop on a connection anymore, and that's a good thing.

And people have got over that, but then they see the format changing.

Don't get me wrong, it's quite complicated to understand that.

If you do want to get down to that nitty-gritty detail, or if you're using Wireshark or any network sniffers, you can't see it, whereas you can spot bits of HTTP1 messages going across.

If they are unencrypted, you go, oh, yeah, I understand what that is.

But the truth is, you need the tools for the vast majority of traffic here.

I mean, HTTPS is by far the norm now, certainly in volume of traffic and even on volume of sites.

And you can't eavesdrop on those things because it's all an encrypted binary format that you can't see.

So you need a tool to actually listen in and decrypt it for you to present it in the best ways possible.

And the tools are there that do it. Wireshark will decode the headers for you and actually show you what they are.

And say if you're a web developer, your browser will take care of all this under the hood.

And it's very, very rare that you need to get to the bits and bytes of the raw thing.

Certainly for most web developers, network engineers are probably a bit different, and they do need that sort of thing.

But in general, again, you'll be using a tool to actually see that.

So it's been a long time since HTTP has been a plain text format anyway. And I think that the headers were one part that was kind of living in the past there by staying as a plain text format way beyond when everything else had moved on.

Yeah. And we've talked on previous weeks about using tools like Wireshark.

Like we said, it's because of the stateful nature now of HTTP/2 connections.

You both need the session key for the encryption and the context about the dynamic table, how it's being populated.

And then Wireshark does this great job, provided you turn it on at the right point, of being able to capture all that information.

It can give you the different views into these bytes sent on the wire.

These are then dissected back into HPACK instructions, which are the things that either populate the table or read from it.

And then here's the list of human readable strings that is probably what you really wanted to see.

Yeah. I mean, lower level devices, there was a lot of worry that the complications of H2 and I guess HPACK, as part of that, you know, if you've got a low end device, will it be able to manage that?

But again, the reality of parsing, sticking just with the headers, is parsing an HTTP/1 header where you read along to the new line and then you try and find the colon.

And God forbid if someone sent two colons or two new lines in the middle of a header, then the whole thing falls apart.

There's so many edge cases and broken things.

And I don't think HPACK's totally got rid of it because at some point you do convert it back to the header we all know and love.

And quite often applications, be they JavaScript front end applications or server based applications, do try to read the values and parse them in some way and format often anyway.

But still the protocol level, it shouldn't have to work that hard.

So, I mean, a lot of people say HPACK is tiny, doesn't make that much difference.

I quite like it. I think it's quite nice.

And again, it's mainly because it just works. And let's say, certainly I've not seen many problems.

Dmitri may come on here and tell me that it's hellish to program and he's run into a million problems and couldn't go live with a server for ages.

But in general, these things are fairly easy tested and they're fairly easy to go.

And it is quite complicated because there's that table of values and they all move slowly down the table and get evicted as they get older.

There's this whole Huffman coding, which takes a while to get your head around unless you've got a maths PhD or something.

But again, those things, even though they're quite complicated for us humans, they're very simple to program.

And they're quite procedural to do.

Like, OK, it takes a little bit of effort to actually program it, but there's a clear set of steps for how to encode these, how to decode these, how to manage these things.
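
For instance, here is a little sketch of that procedural flavour for the Huffman part. The code table below is made up for illustration; the real table in RFC 7541 Appendix B covers all 256 byte values, but the encoding loop looks much the same.

    # A sketch of static Huffman encoding as HPACK uses it: every byte has a
    # fixed, pre-agreed code. These codes are invented for the example.
    CODES = {
        "a": "010", "e": "011", "g": "000", "t": "001",
        "/": "1100", ":": "1101", " ": "111",
    }

    def huffman_encode(text):
        bits = "".join(CODES[ch] for ch in text)
        bits += "1" * (-len(bits) % 8)          # pad the final byte with 1s, as HPACK does
        return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

    encoded = huffman_encode("get /a ")
    print(len("get /a "), "->", len(encoded), "bytes")   # 7 -> 3 bytes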

And again, I think the RFC is quite good in explaining that and I've not seen too many edge cases in that.

But one of the interesting things that actually springs to mind there is, I think there's a point in the RFC where it deliberately says that it is being kept simple.

The HPACK format is intentionally simple and inflexible. Both characteristics reduce the risk of interoperability or security issues due to implementation error.

No extensibility mechanisms are defined; changes to the format are only possible by defining a complete replacement.

And that I find is quite unusual because, like, the HTTP/2 spec was all built with extension in mind, and new formats, new frames can be added and, you know, new protocols can be done.

HPACK is very, very strict.

It's got that static table that's defined in the spec, and it says you're not allowed to redefine those headers.

That was done based on a scan of HTTP usage at the time.

And it said, oh, get is quite a common one and 200 is quite a common one and 400 is quite a common status.

And they came up with these 61 most common request headers or response headers and put those in.

And they're fixed, they're locked, they can never change unless you move to an HPACK 2 or a QPACK, as we'll get onto in a minute.

It's very, very rigid, which is quite unusual, but also, as it says, makes it more secure and less likely to error.

Whereas, you know, I'm sure you've talked previously in the past about how some of the protocols can get stuck and how we have to grease them and make sure that we don't let them ossify so they can't change.

But HPACK is a bit unusual, and it's the one that we actually don't want to change; we want to keep it stuck where it is.

Yeah, I've never really thought of it that way. And that's why I like having people on my show to help me look at things in new light.

Yeah, thanks for that.

As you were talking, I was reminded of some of the war stories that I heard about rolling out H2, in that the HPACK kind of works fine, right?

Like it did go through some iterations.

I wasn't involved at that stage in the lifetime of HTTP, but I remember some stories.

But then when it got to, it was done and people were trying to deploy it.

The main issues that they got, and like we talked about this last week on the show with trying to deploy new web protocols, when you've got an existing stack of other stuff behind them is the case insensitivity.

So if you read the HTTP specs, they say that header names are case insensitive, but then they're defined, you know, with capital letters in them.

And a lot of people kind of got this wrong over the years.

I can't remember if it was like PHP or what exactly.

And so HPACK, or some of the implementations of it, was so strict that if it saw these field names with capitals in them, then things would explode or just break outright.

So there was a lot of kind of people saying, what the hell?

What is going on here? Oh yeah, well, we disagree on what you're supposed to do.

The new thing's obviously wrong. We've been doing this for so long.

You need to fix that. But it was like, no, no, we need to go and upgrade to a new version of PHP or something that is going to fix the issue in the application servers so that we can get this end to end stuff working.

So I find that quite amusing.

That was quite, actually quite a bit more annoying than anything else.

I don't know if we needed to be as strict there as we did. You're right.

It was never case sensitive, and it's explicitly called out, but people didn't honour that, let's be honest.

And then for H2, we insisted that we honour it.

I don't know if that's specifically something to do with HPACK.

It's certainly good for the compression ratio, so that, like, the Huffman table can focus on efficient compression of lowercase characters.

And when you're inserting values into the table, although you can have duplicates in the table, that's absolutely fine.

If you're doing lookups for things, you can avoid some of the ambiguity about case insensitivity.
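
For illustration, the strictness being described amounts to a check like this on the receiving side. The function name and error handling are just a sketch; what is real is that RFC 7540 treats uppercase field names as malformed.

    # Roughly what a strict HTTP/2 implementation does with field names;
    # the helper itself is hypothetical.
    def check_field_name(name: str) -> str:
        if name != name.lower():
            # Strict stacks reject the whole message rather than fixing it up.
            raise ValueError(f"malformed header field name: {name!r}")
        return name

    check_field_name("content-type")      # fine
    try:
        check_field_name("Content-Type")  # what a legacy backend might emit
    except ValueError as exc:
        print(exc)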

Do you think that's why it was done?

I mean, I was doing some measurements previously, when I was doing QPACK, and the sizes of those header names are dwarfed by the sizes of the values.

So there's not much win if you optimise just for the header names.

This is the smallest part of the things to optimise for. And surely it comes in, like, if you've got an application that sends uppercase headers, you'd think it was going to do that for every single request after that.

So yeah, there's a stupid hit at the beginning because you can't use the predefined static ones, which are lowercase.

But after that, and yeah, it may be a bit of waste because you've only got certain space in the table.

But after that, I did see a lot of complaints about that, and I'm not sure if that was enforced.

Certainly, I don't think HPACK insists on it, and maybe you're right, maybe it was done there because people wanted greater efficiency.

But I think that was a bit of a miss in hindsight because you've got, you know, users suddenly just getting weird errors or, you know, protocol errors showing up in Chrome, and you're like, what the heck?

I don't know what this means. And by users, I mean end users as well as actual web developers because, again, there's different browsers out there.

Safari wasn't as strict about it, but Chrome was.

So it worked in Safari, but it doesn't work in Chrome.

And then that just leads to people complaining things. And of course, yes, we should all test in all common browsers and things like that, but the reality is, you know, things will get missed.

And again, I agree with Dmitri, I don't see the necessity there to do that.

I mean, you could find out. It would be an interesting dig because it has happened so recently.

All of these decisions, and the rationale behind them, are probably available on GitHub and the mailing lists.

You could actually go back and look. I'd be curious.

I'd be curious to learn why it was done this way. Yeah, go away and come back next week.

That's your homework, shall we say. No, and it is like this kind of, I call it standards archaeology, of going back and trying to understand, because quite often with the specifications, they don't tell you why.

They just tell you what, and that's good in the sense that you can come along as somebody who wants to implement something or try and understand it and not get lost in the noise and the prose of things.

But on the other hand, it can be really difficult to just say, well, I don't know why it's like that, and okay, the information is somewhere, but I don't understand the standards and how they worked at that time, and the people involved have left and they don't remember or they don't have the interest in it anymore.

And I think sometimes the hindsight is that, oh, when these people were doing things, they had an infinite amount of time and they made all of these good design decisions versus just like getting on with stuff and deciding, well, we can't agree on the perfect solution.

Let's just pick something and go with that.

Yeah, but I do think there is more like, it's great that things are on GitHub and open mailing lists and people can go in and dig into there and find that, but it's also an awful lot of effort.

And I agree the specs should be light and not do that, but sometimes a bit more rationale and flavor is useful to give people context there and digging through 6,000 GitHub issues just to find that one golden nugget of what you were looking for isn't the best way, I don't know.

And maybe that's where other people, like authors like myself, can step in and fill that gap, or maybe there should be a secondary explanation document that kind of gives some of the background behind some of the decisions.

Sorry, there, I don't know.

In that case, HTTP/3 would be a goldmine for you, Barry. I'm not doing it.

My wife says I'm not doing it, so no. Yeah, and it's just one of these things.

I think the community do a great job of coming in to fill the gaps and stuff, right?

We see this with MDN, the Mozilla Developer Network, which has kind of human-friendly ways of explaining basic, say, HTTP concepts.

If you Google a header and you land on one of those pages, chances are you'll be able to understand quite quickly, precisely what that thing means without getting bogged down in, oh, what's the ABNF of this thing and going through three different RFCs to understand all of the different definitions of token or whatever it might be.

So, yeah, there's definitely a space for people to add color to these things.

The specs can be dry, and the challenge for editors is always to make things clear and implementable.

Sometimes the human -friendly nature gets dropped. Sometimes specifying too much is a detriment to the specification because you get too prescriptive of how things could be done based on no implementation experience versus, like, for instance, something I'm working on is just trying to spell out to people what the problems might be.

So, things like security consideration and the ability for a client to, as I mentioned, like, commit the server to too much state are big concerns there, and those are the things that might not be obvious when you're just looking at an algorithm that you can deploy or implement.

So, I don't think there's a perfect answer here, and what you said is perfectly valid.

Sometimes we do these things as, like, that's common practice, but maybe going back, and that kind of input of why did we make the decision we did at that time, would be great input to the what-decision-should-we-make-next-time part of it, which, I don't know, conveniently leads me on to HTTP/3 and QPACK.

Yes, well, QPACK is the result of the work of a few competing proposals a couple of years ago.

As a matter of fact, of all the HTTP/3-related drafts, QPACK is one of the more stable ones, because we came to a basic understanding of what it was going to be back in January 2018, so it's almost two, three years ago. And here I prepared some slides.

So, some of the rationale behind some of the design decisions behind QPACK.

Next slide, please. So, HPACK is a very good mechanism for compressing headers for HTTP/2, and one of the things that you get by default with HTTP/2 is TCP, and therefore you get head-of-line blocking, meaning that everything is delivered in a single stream, and a single packet loss will prevent all of the following packets from being processed.

Now, here's a little picture on this slide, and each stripe here represents a stream in HTTP/2, or in this case represents a stream in gQUIC, which is where HTTP/3 originated, because gQUIC also uses HPACK.

So, first stripe is a header stream, and this is where the headers are delivered, and message bodies are delivered on their own streams.

So, for example, message A body and message B body.

So, you can see that in TCP, they'll be delivered on a single stream, but because QUIC is UDP, you know, for example, if header block A is delivered in packet one, and header block B is delivered in packet two, and packet one gets lost, you can no longer process header block B, and that means you can no longer process message B body, even though it may have arrived.

So, this situation persisted for a little while, when HTTP/3 was not even called HTTP/3, and then in the IETF, you know, design team work commenced on a new compression protocol, and came up with QPACK.

Next slide, please. So, QPACK's main goal was to solve this head-of-line blocking problem I just described, and it does it to an extent.

So, here we can see that, again, there are three streams.

One, instead of being called the header stream, is now called the QPACK encoder stream, and the encoder stream is mainly for delivering updates to the dynamic table, and this is the only stream on which dynamic table updates are delivered.

So, here we can have updates associated with stream A, and then header block A is delivered on its own stream, followed by message A body, and header block A may be referencing newly inserted entries in the dynamic table. And if header block B does not reference newly inserted entries in the dynamic table, then it doesn't matter what happens to the updates, it doesn't matter what happens to header block A or message A body; all those packets can get lost, and message B can still be processed and delivered to the application.

So, this is already better than HPACK, but let's go to the next slide.

It does not solve this head-of-line blocking for a class of scenarios, for example, this one.

So, let's say you send two messages, and both header block A and header block B use a newly inserted entry into the dynamic table.

QPACK, unlike HPACK, allows you to have so-called risked streams, or potentially blocked headers, meaning that you can insert an entry into the dynamic table and then reference it straight away, and when you receive a header block, you can see, okay, well, do I have the necessary state to process this header?

And if the necessary state has not arrived in the form of updates to the dynamic table, you have to wait.

So, for example, if updates associated with A get lost, they need to be retransmitted, and the receiver will wait for header block A to process it, right?

But if, now you can see this encoder stream contains updates both for A and B, right?

So, even if messages A and B are not related, because they both use the same stream to deliver updates in order (by definition, they're in order), if you lose the first packet, that means that header B and message B now depend on message A.
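
To sketch what that decoder-side waiting looks like: this is a simplification of the Required Insert Count mechanism from the QPACK spec, with no real wire parsing, and the helper names here are just illustrative.

    from collections import deque

    insert_count = 0            # how many dynamic-table inserts we've received
    blocked = deque()           # header blocks waiting on missing table state

    def on_encoder_stream_insert():
        global insert_count
        insert_count += 1
        while blocked and blocked[0][0] <= insert_count:
            _, block = blocked.popleft()
            process(block)      # now safe: all referenced entries exist

    def on_header_block(required_insert_count, block):
        if required_insert_count <= insert_count:
            process(block)      # independent of missing state: not blocked
        else:
            blocked.append((required_insert_count, block))  # wait, maybe for a retransmission

    def process(block):
        print("decoded", block)

    on_header_block(0, "header block B")   # processed immediately
    on_header_block(2, "header block A")   # parked until two inserts arrive
    on_encoder_stream_insert()
    on_encoder_stream_insert()             # now A is decoded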

And we don't know what percentage of applications out there on the web actually fit into this scenario.

I mean, QPACK's certainly better than HPACK for its use case with QUIC, but we don't know how much better.

And if the question is asked, well, is HTTP/3 going to perform better in regards to compression than HTTP/2, we'll say we don't know.

Certainly, the jury is still out, not the jury, but there's still much research to be done for QPACK, because only now have we had major browsers and major servers roll it out, and I'm sure people will do some measurements, and maybe we'll come up with some strategies to improve the performance of QPACK.

But surely the performance is the same; it's just, by performance you don't mean the compression levels, you just mean whether it gets blocked by dropped packets as much, is that what you mean?

Yes, that's what I mean.

Okay. The compression performance is similar to HPACK. Of course, QPACK contains many more parameters that one can play with.

With HPACK, you have to decide what to insert into the dynamic table.

With QPACK, you also have to decide what to insert.

You have to decide whether or not you can risk a blocked header, thereby improving compression performance, or whether, for example, you can't risk this message being held up by a retransmission and you want to sacrifice compression performance for the knowledge that this message is not going to be blocked on the receiver.

And you also have to decide when to evict dynamic table entries from the table.

So there are many more components. So if HPACK is difficult, QPACK is doubly so.

And as far as, yeah, go ahead.

It goes along with, if HTTP/2 is difficult, then, well, HTTP/3 and QUIC.

It's, I'd say, more than doubly so, 10 times so. Well, HTTP/3 itself is probably not as, yeah; the QUIC transport itself is pretty complicated.

HTTP/3 is not, I don't think, quite such a jump in complexity from HTTP/2.

Let us talk about these blocked headers. Let's go to the next slide.

All right.

So this graph is compression performance. This is not compression ratio.

This is compression savings. And it comes from the design team work when there were several competing proposals.

On the bottom there, you will see QMIN, QCRAM, and QPACK, which are the three proposals which eventually became QPACK.

The table, the y-axis goes from zero to 80.

There's a little number 80 in the top left, so it is not actually 100.

And where QMIN was different from the competing QPACK and QCRAM is that QMIN never allowed blocked or risked headers, meaning that it never used dynamic table entries that were not acknowledged by the receiver.

And in this case, QMIN is a good proxy for illustrating how the compression performance is affected by using or not using blocked headers.

Yeah, so QMIN.

Right here, if people can't see because the text is a bit small, QMIN is the yellow line.

So the area we're referencing is in the middle. Yes, QMIN is the yellow line.

And the purple line that is flat is not using the dynamic table at all. So Huffman encoding by itself saves you maybe 32 percent of the size.

And you can see that QMIN and, like, QCRAM and QPACK are almost identical, and QMIN lags in the beginning.

And then eventually at the top right, it actually catches up. So after 80 or 90 requests, it's caught up to QPACK and QCRAM.

So if you want to be safe and sacrifice some compression performance, you can look at this graph and say, okay, well, maybe I'll sacrifice a couple of kilobytes.

Well, depending on how large your headers are, and not risk a retransmission and head-of-line blocking.

So this is a trade-off that each server is going to have to make.

And one of the interesting things is that you don't know what is going to win.

Like for example, if you're a server and a client connects, you don't know what kind of connection you have between client and the server.

Is it better to optimize for compression, or is it better to optimize for, sort of... like, if you have some lossy path, you probably don't want to risk anything, because you don't want any lost packet from the encoder stream to affect delivery of independent messages.

So because we did not know, in the design team, and actually in the IETF now, which decision would be correct.

This is now a configuration parameter for QPACK.

And it is a setting. And you can say to the peer, you know, you can use blocked headers.

You can tell the peer not to use blocked headers.

And as the encoder on your side, you can choose to use blocked headers or not use blocked headers.

Perhaps one strategy would be to risk some headers at the beginning of the connection to maximize compression performance.

And then after a few dozen requests to stop risking, because you can see that at the end, it actually catches up in performance.
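
A sketch of what such an encoder-side policy could look like: the limit being consulted corresponds to the peer's blocked-streams setting in the QPACK spec, while the warm-up heuristic itself is just the idea from this discussion, not anything standardized.

    # Illustrative policy only: whether to reference a not-yet-acknowledged
    # dynamic-table entry (and so risk a blocked header on the receiver).
    def may_risk_blocking(peer_max_blocked_streams, currently_blocked_streams,
                          requests_sent, risky_warmup_requests=50):
        if peer_max_blocked_streams == 0:
            return False                      # peer forbids blocked headers entirely
        if currently_blocked_streams >= peer_max_blocked_streams:
            return False                      # out of blocked-streams budget
        # One possible strategy: risk early for compression, then stop once
        # the dynamic table has warmed up and the savings have converged.
        return requests_sent < risky_warmup_requests

    print(may_risk_blocking(peer_max_blocked_streams=16,
                            currently_blocked_streams=3, requests_sent=10))   # True
    print(may_risk_blocking(peer_max_blocked_streams=16,
                            currently_blocked_streams=3, requests_sent=200))  # False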

Is it not a concern with the more power you give someone?

And I'm thinking back to prioritization, where we gave them infinite power and nobody used it because it was too much, well, not too much effort.

That's probably harsh, but there was no right way of doing it. So everyone did it their own way, pretty much.

And kind of ignored that feature because it's just too complicated to get right.

Yes. Prioritization, everybody, I think, did their own thing. But in this case, I think one should be able to measure the impact of packet loss on QPACK.

It's just, I'm sure people will come up with ideas. There'll be people who are going to research this and publish articles about this in papers, and we'll know better.

And perhaps a year or two from now, we'll know just how to set those knobs in configuration.

I agree with Barry. There's a lot of similarities here. But I think it's easier to measure objective outcome of compression.

With a chart like this. With prioritization, it was also co-mingled with how much multiplexing was going on at the same time, which is still the case with QPACK.

But then you have to get into all of the metrics of Core Web Vitals or things like that to understand how it affected page loads.

Whereas this, I mean, I was aware of a lot of the work that Alan Frindell did.

He's the editor of QPACK. And even before we started on H3 stuff, he was running a lot of simulation tests just on HPACK and then trying to tune Facebook's implementation of it.

And so I think some of the stuff here is more similar to how people tune congestion control, say, in that it's very sender-driven.

And it doesn't rely so much on the signals coming from the client, like prioritization does.

If the client tells you that it can risk blocked streams, and a certain number of them, then the power is still in the encoder to do what it wants.

And that doesn't require any user intervention there. It's more like who's going to make that decision is like a browser could say, I know I'm on a really poor connection.

I'm getting loads of dropped packets. So I'll, say, prioritize less compression and less retransmission blocking.

That sounds like a fairly sensible and easy-to-do use case.

But will browsers do that? There's a lot of priorities, pardon the pun, and a lot of work to be done.

And we know lots of them don't have perfect HTTP/2 implementations as it is, never mind HTTP/3.

So this thing is interesting.

The more choice you give someone, it sounds like a good thing. And it is for maybe Facebooks who are looking at this and highly optimizing it.

But for the vast majority of people, certainly the average server operator or website developer is not going to be involved in this.

So it has to be a bit lower level there.

Is it the browsers that decide this? Is it the patches, the NGX, the light speeds that decide this?

And will they use this power that we've given them? Or is it, yeah, that would be great to look at.

And we'll have fascinating PhD research studies that you can read and all that.

But again, how many of these things will actually land in NGINX?

I don't know. Options are always good. But sometimes too many options just mean that a choice isn't made or the default is just stuck with.

Well, the defaults are not bad.

You know, certainly when we worked in the design team, we took a few HTTP sessions and worked with them.

So the defaults are... Well, that's not true.

I apologize. The defaults now are zeros. They used to... See, the dynamic table didn't used to be zero by default.

Now it's zero by default. We use the...

This is with a four kilobyte dynamic table with some Facebook... This is a Facebook HTTP session.

But four kilobytes, in my experience, looking at different websites, four kilobytes is plenty, plenty.

It may be different for... I know some web...

I think Chrome uses 16 kilobytes now for HTTP/3. Or maybe it was Firefox.

I was looking at 16. You know, that's a lot. That's because, you know, clients, browsers can afford it.

Maybe servers will be able to size the tables dynamically, depending on the load.

There's certainly several ways to do it. And if it's not a web application, you know, people who use HTTP/3 as a substrate for their protocol will tune their own.
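
For reference, these are the two knobs being talked about, as a decoder might advertise them in its SETTINGS. The setting names come from the QPACK spec; the values are just examples echoing the numbers mentioned here.

    # Example values only: a 4 KB table as in the Facebook session discussed,
    # and a non-zero blocked-streams allowance (0 would mean "never block me";
    # the spec defaults for both are zero).
    qpack_settings = {
        "SETTINGS_QPACK_MAX_TABLE_CAPACITY": 4096,
        "SETTINGS_QPACK_BLOCKED_STREAMS": 16,
    }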

Yeah, that's... I mean, that's right. We shouldn't just think, I'm always coming back to the web, but there are other use cases for it.

And telco networks, whatever. And there are big plans for QUIC and things.

So, yeah, potentially having more options might help with other types of limitations.

That's true. Yeah, I mean, some of the chat was, you know, is the static table with the 61 entries that we had for H2, that was taken in, I don't know, whatever, say 2015.

Was that still fit for what the web looked like now? Like new things come, we get new kinds of headers, old headers fall out of favour.

And so, I raised an issue on this, just to put the idea out there.

Was the static table still a good fit for the web of today?

Back three years ago or whatever. And so...

Well, on that, like HTTP2 was published in 2015. So imagine that table was decided in 2013 or something.

That's like, it's a good lot while ago and things do change.

And we posted the question to HTTP Archive, because, I was like, it does crawl the web.

Is there a way to capture some data that was a bit more open than the original genesis of the HPACK static table?

And I think it was Rick who stepped in and did some query wizardry to kind of distill the data that was there into something that was a bit more usable.

And then Mike Bishop, who is the editor of the HTTP/3 specification, went through that, and some of the people put in to say, well, yeah, these things are popular, but it's a proprietary header or it's a custom header that's used on this platform.

And that's why it dominates.

Again, it ranks highly in this table, because a lot of things talk to Google Analytics, as an example.

So let's take that one out. And then... Do you think that was the right choice?

Because I don't think it should be generalized.

But then again, that's a lot of traffic that goes to Google Analytics. So there's a lot of potential gains there now.

Google Analytics might fall out of favor, and you can get into the privacy issues and implications of one browser and analytics tracking.

But still, regardless, it is a lot of web traffic. So was it right to take that?

I'm using that as an example. I might well be wrong here. But I don't want to pick on too many people.

But the idea is that, you know, what is the best benefit of the static table?

Where can it be best used within the lifetime of an HTTP/3 connection?

And taking that as a way to not just look at headers coming in and out of a user agent, but to think a little bit more broadly to other applications.

And to say, well, what we want to do is optimize for that first flight, where there is no opportunity to use a dynamic table, because we've got an initial bunch of requests that we're sending.

And therefore, we want to optimize for something like the GET method, I'm asking for index.html, and a few other headers that a client might be likely to send.

And that's kind of part of the reason we've ended up with the ordering that we've got.

And because of the changes in some of the encoding, we've got a larger static table.

So some of the things in the past that didn't make the cut in the 63 headers, we're able to include now, just because of some of the mechanical changes that happened.

Yeah, there are a couple of proposals regarding the static table.

I remember somebody proposed like a 400-entry table, because it's virtually free.

So you can just make it as large as you want. And I think you, Lucas, had an idea to have switchable static tables, or was it somebody else?

Yeah, so it's another one of these: loads of options sound great in theory.

In practice, you know, at the time when I suggested that, I probably wasn't an implementer.

But I had a use case where, if you think of taking HTTP and using it for something very specific, then you probably have a very good understanding of precisely what headers you're going to send.

And then you could basically get rid of any of the dynamic table usage by having effectively, like, a customized immutable table, is kind of the term I think I was using.

So either you'd be able to communicate that on the wire, somehow up front in the connection, like encode it in the whole table as an object in some nice way, or just, you know, predefine this and put it in the ALPN as a different flavor of H3 or something like that.

There's a lot of options, you could probably still do that. But, you know, it's more complicated than just doing it.

For non-web use cases, certainly, it's just a non-web thing, right?

If it's just an application using H3 as a substrate, it maybe has its own set of headers that it uses.

You may well want to use a different static table.

Yeah, so I think if people wanted to do that, there's nothing like in the specs that would stop them.

They just need to kind of figure it out and do it.

And the question becomes, how interoperable do you want that to be? If you're in control of a client and server in your own deployment, like you could go and do that today.

And it also comes down to how much are we optimizing?

Like, let's say you've got your own application, doesn't use any of the web verbs and has a quack instead of get and all these things.

After the first request or two, and let's not say the first request or two aren't important, but after those requests, if you've got a proper dynamic table, then you've practically got all those values already there.

And is it worth the complexity of being able to redefine that static table whenever you get it after the first request?

Or you could send a preflight request to just fill in all those values or whatever type of thing.

Yeah, absolutely. You sound like somebody who should get more involved with the standards, Barry, because you can see the trade-offs up front.

Well, you know, the compression performance is actually pretty hard to get correct.

You can, you know, implement your compressor, HPACK or QPACK, to the specification and still have poor compression performance.

The biggest question is, what do you put into the dynamic table?

So, you know, we talked about the user agent, right? It is a humongous string.

It's always the same. So you could just say, well, if it's user agent, I'm just going to put it in the dynamic table.

But if your client is your own custom client for your custom application, that's not web, then you decide for whatever reason that user agent is going to change every time.

It is not going to get good compression performance because the compressor that's going to be using it is going to make this assumption.

Um, so how do you ensure the performance? It gives you more room for experimentation, but also more ways to get it wrong.

Then it's also how much of that is exposed, if you're using a standard library, libcurl or whatever, actually I don't know if it does it.

Anyway, one of the libraries, quiche or something like that, that gives you QUIC and HTTP/3 for free, is all that exposed up there?

And can you tune that? Or are you better understanding how it works and saying, you know what, user agent is a bad choice for this constantly changing value.

Let's have a new header. Headers are cheap. Create a new one and do that.

But yeah, without realizing that, or putting a bit of thought in, or realizing how it works behind the scenes, you might go ahead and change user agent every time and not realize the performance hit you're taking.

Yeah, that's a good question.

I don't know whether the HTTP/3 libraries would expose QPACK to you.

I would presume that they would probably do it internally and not let you do anything with QPACK.

Yeah, I think compared to H2, the thing that I've seen is that, like, with HPACK and H2, you would take in, say, a collection of name-value string pairs and you'd get out the binary blob, the header block fragment at the end, that you transmit in your HEADERS frame and go for it.

With H3, you now have to send messages on different streams and kind of synchronize between the two.

And trying to put too much of that burden onto a client to send the things at the right time and interacting with some quick library is quite complicated and easy to get wrong.

So, I would expect the H3 library to do that.

And quiche still has a little bit of work to do in terms of getting full support for the dynamic table in there.

I know, Dmitri, you've got ls-qpack, which is kind of your library for QPACK stuff.

I don't know, because I haven't looked at the API too much, if you put an input in, what do you get out?

You get, like, a bunch of different blocks to write on different streams?

And how does that work?

Yeah, you put in the headers, you know, our header representation, there's a struct in the C header file.

And you get back pretty much bytes, you know, encoded header blocks, and then the encoded instructions for the encoder stream to insert entries into the dynamic table.

You know, this library, ls-qpack, grew out of QMIN, which was my proposal for, you know, this next compression mechanism.

And with QMIN, because it does not use any blocking and never risks any headers, I needed to be able to get that yellow curve up, to improve performance, so I had to do some serious thinking.

And one of the biggest insights is that the compressor can actually figure out which entries to put in the dynamic table.

You don't have to have rules that say, okay, well, if I'm a server, I'm going to put this server header in the dynamic table.

If I'm a client, I'm going to put user agent. ls-qpack does something that I don't think any other QPACK encoder does: it keeps a history of headers that it has seen.

And it figures out whether or not, you know, depending on this history, it says, well, I keep on seeing this header, let me put it in.

And if I don't see this header anymore, I don't need to retain it in the dynamic table.

And it gets pretty good performance.
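
The gist of that heuristic, sketched very loosely (this is not ls-qpack's actual code or API; the window size and threshold are made up):

    from collections import Counter, deque

    HISTORY_LEN = 64      # how many recent headers to remember
    MIN_REPEATS = 2       # seen this often recently => worth inserting

    history = deque(maxlen=HISTORY_LEN)
    counts = Counter()

    def should_insert(name, value):
        key = (name, value)
        if len(history) == history.maxlen:      # forget the oldest observation
            counts[history[0]] -= 1
        history.append(key)
        counts[key] += 1
        return counts[key] >= MIN_REPEATS

    print(should_insert("user-agent", "Mozilla/5.0 ..."))  # False: only seen once
    print(should_insert("user-agent", "Mozilla/5.0 ..."))  # True: it keeps appearing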

Sort of agnostic, completely agnostic. Yeah, I stopped sharing then just because I realized we have a gallery view.

But when we're sharing screens, we can't see all the three of us nodding or disagreeing.

So, but I did notice we have a couple of slides left.

We've got about five minutes. So, I don't know if you want to be challenged to spend like a minute per slide, and then we can go back to wrap things up.

Okay, so the reason, you can put them back up, the slides, but let's go to the next one.

So, very quickly, the reason for this still having a single stream, sorry.

The reason for having a single encoder stream in QPACK, versus in HPACK, is that we have not solved the head-of-line blocking problem completely; it's actually very, very hard.

I don't even know if it's possible. Maybe people smarter than us can figure out someday.

But as a compromise, we said, okay, we're going to do it on a single stream, and it's going to be better than HPACK, but it's not perfect.

Originally, some of the designs had updates delivered on different streams.

And we couldn't, just couldn't make it work reliably.

And other designs, you know... so that led to these blocked headers.

First of all, these blocked streams lead to headers being not processable at some point in time.

So, they have to block these headers.

And if you, for example, leave references in the header that cannot be resolved, the receiver would have to keep on reading this data from this header, which could lead to memory exhaustion.

So, the way to sort of mitigate this problem is to mandate that the header specify what table state it needs.

And once you meet that state, you can process the header in a single shot. And if you can go to the next slide.

So, this header says, I need update X, and then it is followed by compressed data.

So, as soon as you read that blue block, you can stop reading if you don't have that table state.

So you leave all the data in the stream without having to exhaust the memory of the receiver.

And so, that's one attack. Another thing: there were some deadlocks in some designs that were discovered pretty late.

So, those are the reasons for a single encoder stream. Interesting. It's hard to pay attention to everything that happens in QUIC and HTTP/3 and every one of the five or six different attacks there.

So, sometimes I wouldn't say tune out.

Sometimes I let the experts listen while I apply my concentration in other areas.

So, it's always good to kind of hear the expert view. Sorry I put you on the spot to describe that so succinctly.

But I've had so much fun talking about HPACK and QPACK that the time has flown by.

I could easily spend way more time on this. Which we kind of do when we're having a water cooler chat at a meeting that we haven't been able to do because of all the conditions that have happened.

But before we get cut off, I'd like to take the opportunity to, let me stop sharing, to thank you both for making the time today to come and chat to me and keep me from being lonely.

So, I hope you've enjoyed it. You have maybe a minute or a minute and a half to make any closing statements.

Barry, do you want to go? No, I thought that was interesting.

Like I say, it's one of the lesser-known parts of the spec. So, I did worry if we were going to be able to fill the time.

But it's an interesting chat. So, thanks very much for having us on.

And yeah, I look forward to seeing what QPACK does in the future.

Great. And Dmitri, have you got any final words? Yes. Thank you, Lucas, for having me here.

And Barry, it was very interesting. The final word is that we want people who are watching this to know that there is still a lot of research to be done.

QPACK is very, very interesting. And if you can come up with something, please email the QUIC mailing list or contact us if you have great ideas.

I'll be happy to borrow them.

Yeah, and sure. And just in the 20 seconds I've got left, just to reiterate that: a lot of the QPACK work happened early on, and some of the early implementers got together and created a repo group called qpackers to help them flesh out their implementations and test that they worked.

And now people who are a bit later to the party, like me, again, benefit from that.

So, check it out. But thanks.

Bye. Bye. Thank you very much.
