Leveling up Web Performance with HTTP/3

Name: Leveling up Web Performance with HTTP/3
Uploaded: 2020-07-21T14:00:00.000Z
Duration: 1 h
Description: Join Lucas Pardue (QUIC Working Group Co-Chair and Cloudflare engineer) and special guest Yoav Weiss (Performance Engineer and Developer Advocate at Google) for a discussion about the roles of request prioritization and server push in HTTP/2.

Presented by: Lucas Pardue, Yoav Weiss

Originally aired on July 6, 2020 @ 1:00 PM - 2:00 PM EDT

English

Protocols

Performance

Transcript (Beta)

The web, a digital frontier. I tried to picture clusters of HTTP requests as they flow through the Internet. What do they look like? Tigers? Goats? I kept dreaming of analogies I thought I might never see. And then one day, I found Yoav.
Hello, everybody. Welcome to another episode of Leveling up Web Performance with HTTP/3. I'm Lucas Pardue. I'm an engineer at Cloudflare working on the protocols team on technologies like H2, TLS, QUIC, H3, those kinds of things. And today, I'm joined by a special guest, Yoav Weiss. Yoav is a developer advocate for Google. He's been working on mobile web performance for longer than he cares to admit, on the server side as well as in browsers. He now works as part of the Google Chrome developer relations team, helping to fix web performance once and for all. He takes image bloat on the web as a personal insult, which is why he joined the Responsive Images Community Group and implemented the various responsive images features in Blink and WebKit. That was his gateway drug into the wonderful, complex world of browsers and standards. And now, when he's not writing code, he's probably slapping his bass, mowing the lawn in the French countryside, or playing board games with his family. So, welcome to the show, Yoav. Have you got anything else you'd like to add?
I think that about covers it, yeah.
Would you rather talk about H3 today or the bass?
Yeah, let's go with H3.
Okay. So, if you watched any of the previous episodes leading up to this, we started with the first two going into a deep dive into H3 and QUIC, and then we had Robin Marx and Peter Wu on to give some demonstrations about the debuggability. On this show, I wanted to change things up again and look at some of the features of H3 and H2 that I've glossed over so far, and those are push and prioritization.
So, we'll get into that in a bit, but these are topics that sometimes people think they understand, and I know I always forget. So, I was going to just bring up some slides and walk us through some of the issues, hoping that Yoav will interrupt me when I haven't described something very well, or we'll just have a general discussion. I know Yoav from a few of the standards meetings, like the IETF meetings that I've been to, or related things where we're able to share different views. As the intro said, we sometimes represent the server side of things from past lives, or the browser side of things, but standards are about the importance of interoperability. Sometimes the IETF work focuses on the network end of things and the plumbing, but I know your focus has been more towards the performance angle. So, before I get into the boring slides, I don't know if you want to talk a bit more about your background there.
Yeah, sure. So, I've been working on mobile web performance for over two decades now, which makes me sad, because it's not yet a solved problem and we're still not there. There are a lot of things to improve at the network level, and this is where the HTTP/2 work came in, and the HTTP/3 work is hoping to improve things now. There are also a lot of things to improve at the content level, and in terms of processing on the client itself. So, I've been working for Google for the last 18 months or so, but I've been working as part of the Chromium project, the open source project, since 2012.
First, on my own, just as an evening pastime, then as a part of the Responsive Images Community Group, which was my gateway drug to this whole world of web standards, both at the W3C and the IETF. From there, I continued to work at Akamai, working on both server-side optimizations and the client-side features that are required for those server-side optimizations to work. And now I'm working on making the platform faster as part of the Chrome team.
And so, you're also doing some work on client hints as well, which is maybe not solely performance-related. I don't know, I'm not the expert here. They could maybe improve performance. I think the motivating use case was data, was it? Or am I completely wrong?
So, the motivating use case initially was around images. As part of the Responsive Images Community Group, we worked on srcset and picture for client-side selection. And then, we saw that there is a need for server-side selection as well. So, the client hints proposal came from that. And since then we realized that, generally, content negotiation in its traditional form is critical for performance, and it's critical for other types of adaptation. At the same time, it is privacy-negative by default. By default, the client just sends out all the information to all the servers that may or may not be interested in it, which results in some bloat in terms of network. But more importantly, it results in passive fingerprinting information, or potentially fingerprinting information, being sent to all servers, which makes it very hard to track which servers are actually using and abusing that information, and which servers are just accepting it and not doing anything with it.
So, client hints is, in my view, a critical piece of infrastructure to enable privacy-preserving content negotiation, both for performance purposes, so for images and for network adaptation, but also to replace the User-Agent string and other bits of current content negotiation that are sending too much information by default.
Yeah, all right. Certainly in some of the work I do, what you want is better security and better performance, and maybe some people think that's hard to get. But by carefully defining these standards and working across the ecosystem to understand the impacts, there's a big focus on the privacy angle, right, and on actually documenting the different kinds, active or passive. And since I started in HTTP working stuff, you've got things like GDPR coming in, and all of these factors that are pretty interesting. But client hints itself is pretty near the end of its standardization process, right?
Yes, on the IETF side of things. There is still a bunch of work to be done on the W3C side of things, in terms of defining and getting agreement on the specific processing model that browsers should apply to it. There's also the rather large subject of cross-browser adoption. We've been going back and forth with Mozilla folks and Apple folks about the proposal itself. I think we've reached a good ground where they are somewhere between not objecting to it and slightly happy with it, but I would still love to see implementation in non-Chromium browsers, and this hasn't happened yet.
Cool. Sorry to put you on the spot with that. I use this a lot as a way to get better insight into things that I might normally have asked you about over a coffee. But anyway, now you can take a break while I bring up my slides, so hopefully this works this week.
Yes, my speaker notes away. Can you see these, Yoav? Yeah?
Yeah.
Cool, so I always start with a brief recap, just in case anyone hasn't tuned in or they forgot. H3 is an application mapping on top of a new transport protocol called QUIC, which is secure, reliable, and mitigates head-of-line blocking. This is a transport protocol based on top of UDP, and it fixes some of the problems that we found as we worked through classical HTTP onto H2, finding issues in TCP and basically redesigning transport for the modern age, and designing something that is going to be reusable for a lot of applications. We're talking H3 on this show, but there are other things that could use QUIC in the future, which I don't talk about much, but there's a lot of stuff in the background. My role within the IETF is to see some of this stuff, so there are a lot of extensions coming in, or ideas on how to design modifications to QUIC that would work really well in a data center or maybe on a satellite link, with very different network characteristics. What's cool with this transport protocol is that it seems adaptable enough to support those, even if out of the gate the main use case for us to focus on is the web. And so, our favorite kind of layer cake diagram, if we're going to draw one up that compares H2 to H3: at the bottom we've got TCP compared to UDP, and we've got TLS to provide encryption. I put QUIC at the same layer, but that's a lie. These diagrams are always models and always wrong, but ultimately QUIC is an always-secure protocol that wraps in the TLS handshake, effectively the TLS 1.3 handshake, and provides packet protection rather than TLS records, but that's a detail we don't need to worry about today.
The important thing is this layering. In H2 we have streams that are part of the H2 mapping, and therefore any libraries that implement H2, or things like a browser, need to provide all this stream monitoring and accounting themselves, whereas with QUIC that's provided in the transport layer. H3 still needs some concept of streams, but it doesn't need to worry about the accounting of them, the flow control, stuff like that. In H2 we have this idea of all these different frames that could be sent on different streams: stuff related to request and response, like HEADERS frames and DATA, or PRIORITY, and push, which is what we're going to look into today on the next slides. And then there's all this stuff on the right-hand side about connection control, so things around exchanging settings with your peer to understand what you can do in that specific connection, or telling your peer that you're going to go away. That was a hard thing in earlier HTTP versions. You might just need to kill the entire connection, and that doesn't lead to a graceful state, so you end up in this kind of weird limbo a lot of the time, which sucks. Related to that is reset stream: if you decide you don't want something, you can reset it without having to tear down an entire connection or eat the cost of having to receive the whole object. H3 is intended to be effectively the same set of features as H2, albeit on top of QUIC. These slides are just copied from before, so this probably doesn't make much sense. Part of the process has been a bit of a rampage to take what the H2 frames were and see if they were a good fit. Some of this has involved just redesigning the frames with different numbers and different types, but at the end of the whole process what we've ended up with is that only a subset of the frames are required in H3.
Here, PRIORITY has been removed. We'll talk about that towards the second half of the show. We've also got this CONTINUATION frame, which is a way to send really big headers. We don't need that in H3 because the HEADERS frame itself can just be big. That changes some stuff, and it simplifies things. We've got two more frames related to push promise, which I'll explain shortly, and then this other stuff around ping and reset stream. Those features still exist, but they're at the QUIC layer, and so that means in some sense that implementations are simplified. What's really nice is that things like flow control aren't duplicated, there's a single view of it, and so an H3 implementation can be nominal.
So server push, what is it? At a high level, you can say it's a way to optimistically send resources to a client before the client asks for them. You might say that sounds great, what could go wrong? This is one of the interesting additional features of H2. As you know, one of the prime goals of H2 was to maintain compatibility with earlier versions of HTTP and not introduce any new features, and yet we have this new feature of push. So what is it? It's effectively a way for the client to make a request and the server to respond with multiple responses. In H2, it's enabled by default. A client has to explicitly disable it by sending a setting at the start of a connection. It can also constrain how many pushed responses might come back by setting the value of max concurrent streams. So by default, a client that doesn't send any value for these things will have push enabled, with a few hundred streams, I think, which means that a server that wants to push could push a whole load of stuff at you as soon as you make a single request for a web page, for instance. There's an ability for the client to respond to that and cancel pushes that are coming its way by resetting streams, which means it has to hold state for them.
The spec describes all the conditions around tearing things down. It's an important detail for the course of this discussion. In H2, you have stream IDs, which I'll show you in a minute, but effectively all odd-numbered streams are used for client-initiated streams, and those can only carry requests in H2. And all server-initiated streams are even, and those can only carry pushes. There might be ways to extend the protocol to do different things, but I'm not really aware of any extensions that have made it through to be widely deployed that do anything with those things. So it's reasonable to assume that if you're looking at the stream IDs in a tool like Wireshark, this is how they're being used. To give a visualization of this, we've got a client that's making a GET request for just slash. It's going to send a request on stream ID 1 to a server, on the right-hand side, that isn't clearly labeled, and the server's going to come back with this push promise frame. It's going to send that frame on stream 1 and promise a new stream on ID 2. And what it's going to do is provide information in this push promise that looks like a request. It tells the client, I'm going to effectively pretend that you made this GET request for pushed thing. At that point, the client might be able to reset the stream and say, I don't want it, or take other options. But assuming it doesn't do anything, like in this example, the server's then going to immediately follow up with the actual response headers for pushed thing. In this case, it's going to send a status and then a length and then a data frame of that length. And it'll proceed on to actually serving the initial thing that the client requested, so just slash in this case. That's one way to do it. There are a lot more decisions that a server actually needs to take. It might want to promise a few things and then send some data.
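The exchange described above can be sketched in a few lines of Python. This is only an illustrative model of the frame ordering, not a real HTTP/2 implementation: the `Frame` class, the `initiator` and `push_exchange` helpers, and the `/pushed-thing` path are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List


def initiator(stream_id: int) -> str:
    """HTTP/2 convention: odd stream IDs are client-initiated, even are server-initiated."""
    if stream_id <= 0:
        raise ValueError("stream IDs must be positive; 0 is reserved for connection control")
    return "client" if stream_id % 2 == 1 else "server"


@dataclass
class Frame:
    kind: str       # frame type, e.g. "HEADERS", "PUSH_PROMISE", "DATA"
    stream_id: int  # stream the frame is sent on
    detail: str     # human-readable summary of the payload


def push_exchange(path: str, pushed_path: str) -> List[Frame]:
    """Model the frame order for one request that triggers one server push."""
    return [
        # Client request on the first client-initiated stream.
        Frame("HEADERS", 1, f"GET {path}"),
        # Promise sent on the request stream; it reserves even stream 2
        # and carries headers that look like a request for the pushed resource.
        Frame("PUSH_PROMISE", 1, f"promised stream 2: GET {pushed_path}"),
        # Pushed response on the promised, server-initiated stream.
        Frame("HEADERS", 2, ":status 200, content-length set"),
        Frame("DATA", 2, f"body of {pushed_path}"),
        # Finally the response to what the client actually asked for.
        Frame("HEADERS", 1, ":status 200"),
        Frame("DATA", 1, f"body of {path}"),
    ]


if __name__ == "__main__":
    for frame in push_exchange("/", "/pushed-thing"):
        print(f"stream {frame.stream_id} ({initiator(frame.stream_id)}-initiated): "
              f"{frame.kind} {frame.detail}")
```

Note that the promise goes out on stream 1 before any response data, which is the ordering the discussion that follows treats as the safe one.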
There's a lot of text that explains, well, if you don't promise at the start, and you start to send back some of the response that maybe includes HTML with a link to pushed thing, then the client's going to read that and probably request it. So if you promise too late, you end up in a kind of racy condition. But there's some guidance that it's better to do this than not to do that. And that kind of text in a spec is sometimes a bit hard to enforce or even test. And I think that's one of my main comments with push, as we'll come on to some of the real-world practicalities of it. The tooling that we have available, because this is quite a low-level feature, doesn't necessarily help us see what's happening in the wild and reason about why a page is acting as it is. Sometimes it's there, but when you compare this to some of the other dev tools that are able to look at how much time is spent composing a page and stuff, it's just never quite there, in my opinion. I'm just looking to see if you're going to correct me, I think.
Yeah. So on that front, I think that the way to go here, and the way to ensure we avoid raciness, is by making sure that the push promise is being sent before any link headers or other references to the resources are hitting the client, which is what you actually want to do anyway. If I may, and I didn't prepare slides, but I have this diagram in the blog post I published back in 2016, there are two main benefits of push. One is being able to start sending things from the server and start to fill the bandwidth before the server-side processing that's required for HTML generation is done. That processing can be very short on very efficient servers. It can be extremely slow on less efficient ones. And being able to start filling up the bandwidth with useful content is a huge advantage of push in those cases.
And then secondly, it enables you to kick off the slow start process earlier and ramp up the initial congestion window with, again, useful content. So you want to send those push promises way before you're starting to send your HTML to the client. That's the ideal use case.
Yeah. Especially for things like a dynamically generated website, it might require some database lookups or whatever, and you've got some dead air time when that's happening. And you could fill it with maybe some statically cached asset. But from having joined Cloudflare and seen some of the challenges with that, you need some awareness of what that page is and what those resources might be. There are various ways you could do that. Some vendors offer smarter ways than others, but it's complicated and there's always a slight risk you're going to get it wrong. I'd say with performance, it's a trade-off between those two things. And running and measuring those things can be really, really difficult. I know in my own experience, it's really easy to demonstrate how it worked really well for this one kind of thing. But then as a holistic view over the Internet, getting those wide-scale measurements can be quite tricky.
Yeah. I don't disagree about the complexity. I worked on that product, and with the team that built this product, as part of my work at Akamai. So it's certainly achievable. At the same time, it is certainly a complex solution that requires a feedback loop of your CDN or your server studying the resources that this page has used in the past, figuring out the critical resources among them, and then attempting to push them. And beyond just that, there was a point, I don't remember at which IETF meeting, where both the Chrome team and Akamai presented data about what they see in the world when it comes to H2 push.
So Akamai presented data that was specific to optimized websites, first load only, in order to eliminate some problems that we'll talk about soon, and showed that it consistently managed to use H2 push to get benefits in the wild. At the same time, Chrome was measuring data about push in general, as in how push is being used in the wild beyond those sites that are being smartly optimized. And it turned out that more often than not, it's being abused in the wild, and the overall data was slightly negative.
Yeah. And I remember watching those two different sets of presentations, and it was really interesting to see how defining your population can influence the outcome of the discussion. And it's great to have different people with different perspectives to say, yeah, of course it's not going to work in general, because the implementations of H2 are fairly new. To implement the protocol is fairly straightforward. Like we saw, just send this frame at this time. But understanding how to surface that as an API to developers, I think, is generally the tricky part. And there were some things that different server implementations had. I think Apache had a push manifest, so you could say, when you request this path, then push these files. And I think it even had a runtime push dictionary, was it called? Within a single connection, it would only push those items once. It would keep a record of what it was doing. So there was some smart stuff. Let me go back to the slides. Bear with me. Here we go. So this TV segment is about H3. H2's old hat, man. Let's go back to the new stuff. In H3, push is disabled by default. It's just not there. It's migrated away from this setting into managing a flow control window of push IDs. That'll make more sense on the next slide. But effectively, the client still manages both concurrency and enablement via this MAX_PUSH_ID frame. So it starts off at zero.
The server can't do anything. It's only after the handshake completes and this frame comes in that the server can say, okay, I'm now able to push this many things until I get another one of them. And the client can use an explicit cancel push, or it could reset the streams. The benefit of cancel push is that you can effectively cancel the push, obviously, but before the server allocates any resources to it. So you're not creating a stream or any state for that stream. It's this notional, logical idea that the server would like to push you a thing with an ID, and you can say, even before it's ready, that that's going to go. And then the other good thing is that the server can also cancel. It could say, actually, I've changed my mind. Whether any server does that, I don't know, but it's all possible. And unlike in H2, where all server-initiated streams are pushes, in H3 we have this idea of stream types. So the server would create a new stream and send one byte on that stream that's going to tell you the type. And that will be the type for a push stream, followed by basically the contents of the delivered response. So it looks almost identical. So yeah, see. But again, look at that.
Okay, I can't hear you, Lucas.
Hello, can you hear me?
Yes.
Okay, thank you for holding the fort. Yeah, I wasn't sure what to do there. That was exactly on the half hour. I forgot to tell you ahead of this meeting that I actually had an appointment and that was the whole reason I was... no, no, I'm joking. So I don't want to spend forever just talking about push, so let me race through my slides. Yeah, it basically looks almost the same as the H2 example. And I think it's important to note that these things should work the same. And the question for us, as developers and infrastructure providers, is: should you push?
Is this the right thing to do? Are there any lessons we've learned, and are those lessons applicable to both versions? We can decide to implement the feature in both if we want to, but how do we apply that? Some of the context around what we learned from H2 is Jake Archibald's seminal piece, "HTTP/2 push is tougher than I thought", which is a great deep dive into the real-world practicalities of server push when looked at through the lens of caching. The push spec has a lot around how you're going to push stuff into the browser's cache, and the items need to be cacheable and so on. That's always the way I look at it. I know others don't. And so there's some weirdness that can happen where, for example, rather than an item being pushed straight into the browser's cache, it sits in this concept of a push cache that isn't formally defined anywhere, but that different browsers or user agents have. Then it might get promoted into the browser cache if the web page actually requests that resource. And it can be tied to the connection, so maybe if you got pushed some data and then you lost the connection, like I just did, that resource is gone and the bandwidth to send it to me was wasted. To be honest, I don't know if, three years on, all of those points are still true, but I remember at the time it was quite an eye-opener for a few people.
Yeah, so for me the main conclusion from that article is that there are a lot of client inconsistencies with regard to the handling, as well as just outright bugs with handling H2 pushes in some browsers. And for me, the conclusion was that H2 push wasn't really well defined when it comes to the browser's processing model. It was defined at the protocol level, but not defined as to how those pushes should be treated on the client side, and what their level of interaction with the browser's cache is.
So you're right that all browsers implemented some sort of an H2 push cache. But they all had subtle differences, because this is all just a result of the easiest way for them to implement it, rather than the way that it was specified, because it wasn't specified. That resulted in a bunch of issues like you mentioned. If we were to discuss these issues in a broader forum, maybe we would have concluded that resources should be pushed directly into the cache, or some other mechanism. We wouldn't necessarily have reached the same one that all the implementations ended up implementing. So yeah, lack of specification was the original sin here, as far as I'm concerned.
Yeah, and I think Jake says in that post, it's not pointing the finger at anyone. That's just how it is. You adapt the technologies available to the model that you have for your implementation, and pick something that may be a bit safe for a brand new feature, because you don't know how this is going to work in practice. It's been an interesting thing. I think for me, the main annoyance is the fact that server push wasn't exposed to the web platform. So, the concept of being able to generate events, like we can with the Fetch API, where an event fires when the actual request you made comes back. There wasn't anything similar for server push. And so much cool stuff could be done with that. We have all these other technologies for a kind of hybrid or bidirectional relationship between client and server. We've got WebSockets or WebRTC data channels. We've got this new thing called WebTransport that's being specced up in the IETF and W3C. But for me, none of those provide actual HTTP semantics.
I wanted to be able to have something pushed to me with metadata that is standardized, that describes the content type, and that I can actually reason about because somebody else has done all the hard thinking. In a previous life, I spent a lot of time thinking about this and wrote a moaning white paper to talk about a cool use case where we were using server push to deliver video. So if anyone's interested, you can read that. It's nothing to do with web performance per se, but sometimes if you're able to give people tools, they can use them to develop good experiences. And I think maybe some of the dislike of server push, of, oh, it doesn't do everything it could in terms of performance, would have been slightly offset if it was, well, don't use it for performance, here you can build this kind of application. And it's specifically an issue for the web platform, because if you have some H2 library, they typically have just a callback function you can register with a few lines of code that would allow you to handle those things, and people are building cool demos outside the browsers. But that's the world we live in. And so we've kind of already covered this point. How you command a server or CDN edge to push for you is something that also wasn't really standardized anywhere. There was this notion that we can reuse a rel=preload Link parameter or attribute. I always forget the term. So when the origin is responding to that initial request the client made, my edge server can see this parameter and then push that file. That doesn't give you the filling of the dead air time that we just talked about, because I still have to wait for the server to generate the header, which often means it wants to generate a full response and then send the headers and data out in one go. And so if you do things too simply...
Yes, it will work, but you end up pushing content the client might already have, wasting that bandwidth at the cost of other stuff that it might need. So some clever people came up with a way to mitigate the case where a dumb server would push stuff that the client already has, called cache digest. I don't know if you remember that one.
Yeah, I remember that. So essentially, like you pointed out, H2 push has a number of potential problems. One of them is over-pushing, so pushing resources that the client already has. And this is the problem that cache digest solves. Essentially, at the beginning of the connection the browser sends a condensed list of all the resources that it has in the cache for that particular origin, and then the server knows what not to push as a result of receiving that condensed list of resources. But that is not the only potential problem with H2 push. So there's the problem of over-pushing. There's the problem of H2 priorities, where we have definitely seen cases where, when the server receives a request for the HTML, it starts to push resources that are critical, but at the same time less critical than the HTML itself. Then when the HTML arrives from the origin server, the fact that buffering happens both in the server itself and in lower layers means that push, in those cases, is delaying the arrival of the HTML, which is the most important resource. So there was a problem with H2 priorities, and like you pointed out, there's also the problem of pushing the wrong thing. If the developer added the link rel=preload for a resource that wasn't really critical, that push was potentially delaying other, more critical resources. And at the same time, it was also always pushing those resources too late, because it was missing the point in time where push is most effective, which is before the HTML is even generated.
Yeah.
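The cache digest idea described above can be sketched as follows. This is a deliberately simplified model: it represents the "condensed list" as a plain set of truncated SHA-256 hashes, whereas the IETF draft used a more compact probabilistic encoding. The function names, the 16-bit truncation, and the example URLs are all assumptions made for illustration.

```python
import hashlib
from typing import FrozenSet, Iterable, List


def digest_entry(url: str, bits: int = 16) -> int:
    """Reduce a URL to a short integer by truncating its SHA-256 hash (illustrative only)."""
    full = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(full[:4], "big") >> (32 - bits)


def build_digest(cached_urls: Iterable[str], bits: int = 16) -> FrozenSet[int]:
    """Client side: condense the cached URLs for one origin into a set of short hashes."""
    return frozenset(digest_entry(url, bits) for url in cached_urls)


def select_pushes(candidates: Iterable[str], digest: FrozenSet[int], bits: int = 16) -> List[str]:
    """Server side: keep only the push candidates the client does not appear to have.

    Truncated hashes can collide, so a resource may occasionally be skipped
    even though the client lacks it; it is then fetched by a normal request.
    """
    return [url for url in candidates if digest_entry(url, bits) not in digest]


if __name__ == "__main__":
    client_cache = ["https://example.com/style.css"]
    digest = build_digest(client_cache)
    wanted = ["https://example.com/style.css", "https://example.com/app.js"]
    print(select_pushes(wanted, digest))
```

The trade-off in any such scheme is between digest size and false positives: a false positive only costs an ordinary request later, while a missing digest entry costs wasted push bandwidth.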
So for all those reasons, we didn't really know what the division is, because Chrome gathered data about push in the wild and saw that it's not very helpful, but no one actually put in the time and work to drill down into that data and figure out which cases fall into which bucket: which cases are over-pushing, which cases are pushing the wrong thing or at the wrong time, and in which cases it's just H2 priorities done wrong.
Yeah, and it's tricky to divide those pages up and find those experiments. But I think you also end up with competing options coming in, right? So, cache digest was one way. It was a draft in the IETF. Unfortunately, no one decided to implement it, so it's there as a matter of record. It's something where I believe you did get experimental data that it would help for the cases you just described, but unfortunately people wanted to go in a different direction and just leave that there. We also had this idea of RFC 8297, which was Early Hints. This is an informational status code, 103, which gives an ability to generate some headers related to the resource before the actual final response. And in this example, taken from that draft, there's an ability to tell the browser to preload objects. I guess in this case, those objects are related to the HTML that was being requested, and then the browser can decide if it wants to preload, or if it already has that cached, or whatnot. So in your opinion, did preload steal a bit of push's opportunity, or are they different things?
So, in my view, they are typically used for different things. Preload mainly shines in cases where you have discoverability delays, so the content doesn't necessarily lend itself to being easily processed by the browser.
So you have resources in JavaScript and in CSS that are potentially critical, and you want to make sure that the browser discovers them earlier, whereas push mostly shines in the case of pushing critical resources that are typically easily discoverable by the browser, but by pushing them earlier you're using that time, you're kickstarting the congestion window earlier. At the same time, people are now talking about Early Hints as an H2 push replacement, or something that can solve half of that use case. Because Early Hints, at the cost of an RTT, still enables us to reuse that server time, while the server performs all that computation, as the browser kicks off some requests and manages to fill in the bandwidth with them. So there is an extra RTT involved, so it's not as efficient as H2 push. At the same time, it sidesteps the over-pushing problem, because if a resource is in the browser's cache, it won't get fetched again. But it can still be abused. People can still use it for the wrong things and end up with slower pages, similar to preload in a way. Yeah, my memory might be wrong here, but I seem to recall that in the process of this standardization, you know, the idea of a non-final response code tripped up some implementations out there. It's not a silver bullet, it can be tricky. You know, maybe it was Python or something. But yeah, things are designed around the path that is most commonly deployed, and so you see something that is completely valid but unexpected, and implementations blow up. I think they can be fixed. It's just something to be mindful of. And again, if you're thinking of just a simple origin server and a client, sometimes these things are very easily deployable, but when you're considering running through a cloud edge somewhere, sometimes our systems aren't quite geared up for these things.
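To make the "two responses" point concrete, here is a minimal sketch of what a 103 Early Hints exchange looks like in HTTP/1.1 framing, and of a client loop that has to read past the interim response to reach the final one. The `Link` header shape follows the example in RFC 8297; the helper names `early_hints`, `final_response`, and `status_codes` are hypothetical, and the parser is deliberately naive (it only looks at status lines).

```python
def early_hints(preloads):
    """Serialize a 103 interim response; preloads is a list of (path, as_type) pairs."""
    lines = ["HTTP/1.1 103 Early Hints"]
    lines += [f"Link: <{path}>; rel=preload; as={as_type}" for path, as_type in preloads]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def final_response(body: bytes):
    """Serialize the final 200 response that follows once the server has the HTML."""
    head = "\r\n".join([
        "HTTP/1.1 200 OK",
        "Content-Type: text/html",
        f"Content-Length: {len(body)}",
    ])
    return head.encode("ascii") + b"\r\n\r\n" + body


def status_codes(wire: bytes):
    """Collect status codes of each response head, stopping at the first final (non-1xx) one."""
    codes, rest = [], wire
    while rest:
        head, _, rest = rest.partition(b"\r\n\r\n")
        status = int(head.split(b"\r\n", 1)[0].split(b" ")[1])
        codes.append(status)
        if status >= 200:
            break  # everything after a final response head is the body
    return codes


if __name__ == "__main__":
    wire = early_hints([("/style.css", "style"), ("/script.js", "script")])
    wire += final_response(b"<html></html>")
    print(status_codes(wire))  # [103, 200]
```

A client or intermediary that assumes exactly one response head per request would treat the 103 as the final answer, which is the kind of breakage discussed above.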
So yeah, the technologies exist, but actually getting widespread Internet deployment can take a bit of time to catch up, and it needs the data to prove that it's worthwhile to go to the effort of doing that. Um, yeah. So first of all, ossification is definitely an issue and definitely something I'm currently struggling with on a completely different front, where it turned out that using structured headers, structured fields, in requests is something that the Internet wasn't ready for. But let's not go on that tangent. On this front of Early Hints: for the longest while it was considered extremely complex to implement Early Hints, to implement preload for Early Hints, in the browser, because these requests would need to be browser-generated but then matched with the renderer, so different processes inside the Chromium architecture. But recent changes made it essentially easier, not easy, but easier than it used to be. So the Chrome loading team is now interested in trying out Early Hints. Essentially, not supporting the feature just yet, because that will require a lot of work, but just measuring the potential benefits. So try it out with Early Hints-enabled servers, get them to send down the Early Hints responses, and see, A, what is the time difference between the 103 response and the 200 response, because this is the time that we can potentially save here, and B, whether sending 103 is something that is web compatible and Internet compatible, or whether intermediaries or whatnot will blow up when they see two responses. So this is an experiment that we're currently interested in running. If anyone listening is interested in participating in that experiment, feel free to ping me and I'll make the right people aware. I learned something today. Well, not just one thing, lots of stuff. That's cool. Yeah, thanks. Unexpected highlight. Cool.
Oh, we're actually getting a bit short of time. So we mentioned prioritization. I've got way too many slides on that. I was just going to hammer through quickly, just to give people some context. I've talked about this before, but I think the things that we're talking about we can simplify down a lot, and just say: if you have a page in your browser that's got 10 things to load, so here we've got five green boxes and five yellow boxes, what that would do is generate 10 requests, you know, HEADERS frames like we already showed. Ultimately, the browser wants to maybe have different ways of getting those responses back. Maybe it wants them one by one. Maybe it wants all of them bit by bit in one go. Maybe it wants a mixed approach, where it wants the green ones first, in order, and then the other ones could all arrive bit by bit, and they could be incrementally or progressively used. And so this is what prioritization allows us to do, which is attaching, along with those headers, pieces of information that say, look, I want to make the later request depend on, well, in this case it's reversed, the earlier request depend on the later one. So in this case it would load back to front, which is probably not what you want, but H2 contains extra information.
It allows a client to at least express the order or the pattern in which it wants things to happen. And in H3, because we don't have guaranteed ordering between streams, which is a feature because it mitigates head-of-line blocking, you end up in this case where, well, the server receives request two, and it says it depends on request one and that the server should serve all of request one before two, but the server doesn't even know what request one is. And there were all these kinds of edge cases around trying to make it clear in the protocol. Like we mentioned earlier, implementations deciding to do it one way or the other leads to weird inconsistencies on the Internet, so you want to describe and cover those edge cases. And I wrote a blog about how we got here, the whole discussion involving many people from the community talking in different forums, like the HTTP Workshop just over a year ago, and how we're in this kind of situation where we wanted to maintain H2 priorities, maybe, but not lose them altogether. Like, how can we do this stuff? And so we're continuing that process. That blog post was written earlier in the year, and as a community we're discussing the different options. I think we're very happy that the model we've got, the scheme of urgency and incremental, makes sense. It's probably simple and just enough, without being too constrictive. But recently a question of reprioritization has come up. And I mean, this is a pretty bad diagram, but it's to say, well, look, if the browser window is smaller and it couldn't show all of those items, you would make a request for, like, six things initially, and say, I want them returned to me in some way. And then, while the server's doing all of that hard work, the user scrolls the page down and suddenly a new set of requests comes in, and the server's actually going to keep trying to serve the early green requests at a high priority, even though the client doesn't want them.
Strictly, maybe, as a use case, it wouldn't want to cancel them, because it's already got some of them in flight. But what you would like to do is be able to get the other ones first. And this was like a use case that we had where you could insert a request afterwards, say at a higher priority than an earlier one, and in H2 that wasn't well implemented by some servers. You know, Pat Meenan and Andy Davies created that use case and documented it and helped get some of them fixed, but not all. But reprioritization in this case would be, in my mind, more like trying to say, I want to change my mind after I've requested all of these things. And the reason why this is important to the standards work is that an "I've changed my mind" kind of signal is slightly harder to implement than "this is what I want from the start". And so, to make some progress on the specification, I asked the community for some input, and you have kindly designed an experiment to gather some data on whether reprioritization is actually useful. Between you and Pat, who are running the experiment, I believe there's some early data. So, talking about whether this stuff is working or not, here Pat says most metrics are neutral, but the Largest Contentful Paint degrades by roughly 6.8% on average and 12% at the 95th percentile, and the Speed Index degrades. So this was some early data. I don't know if, between you, you've gathered any more, but this kind of thing is very informative to help us reason about things. It's very easy to sit there with two different parties, one saying I need it and somebody else saying I don't need it. But yeah, I don't know if you could share any more about this experiment. Yeah, sure. So essentially browsers are already using reprioritization heavily when it comes to images. Browsers typically start requesting images before they know whether they are in the viewport or not.
So images are typically, at least in Chromium, requested with low priority, and then the ones that are in the viewport get upgraded to medium priority. Or, that's the browser's representation of priorities, but on the wire they get a higher weight. And because it's already implemented, it was relatively easy to say, okay, let's kill that feature and see what happens. So I implemented a Chromium patch. Basically, there's currently a Chromium flag that enables you to disable H2 reprioritization. And then Pat Meenan, of WebPageTest fame, ran a long list of servers that we know are well behaved in terms of their priorities implementation, because the big problem with anything related to H2 priorities is that if you're measuring it and you're just including the entire population, you're most likely to get more noise than not. But if you pick and choose well-behaved servers and then disable the feature, this is what we got as initial data. I believe that the experiment is still running, because there are a lot of servers and because it's running basically on the off time of WebPageTest, so it takes a while to run. So it's still an ongoing experiment, but the initial data is super encouraging, and it essentially shows what I would have thought it would show, which is very refreshing when it comes to data, which typically surprises you. In this case, the initial data indicated that reprioritization is beneficial. So yeah, in my view, we should definitely keep it in the protocol. Yeah, and as the editor of the spec, I'm pretty neutral on it. I could argue either way. You know, like I said before, sometimes it's about providing the capabilities for people to implement stuff. There might be some use cases where it's not useful, but if there are some where it is, then, you know, it's a trade-off between providing it and not making life too hard for people who don't want to use it.
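As a rough illustration of why a reprioritization signal changes what the server sends, here is a toy scheduler built on the urgency and incremental scheme mentioned earlier in the conversation (urgency from 0 to 7, lower meaning more important). The scheduling policy itself, the `Stream` class, and the `updates` parameter are assumptions made for this sketch, not something any server or specification prescribes.

```python
from dataclasses import dataclass, replace


@dataclass
class Stream:
    name: str
    chunks: int                # total chunks the response needs
    urgency: int = 3           # 0 (most urgent) .. 7 (least urgent)
    incremental: bool = False  # True: interleave with peers of the same urgency
    sent: int = 0


def send_order(streams, updates=None):
    """Return the order in which chunks go out.

    updates maps a step number to (stream name, new urgency) and stands in for
    a reprioritization signal arriving just before that chunk is scheduled.
    """
    streams = [replace(s) for s in streams]  # work on copies, leave the input untouched
    updates = updates or {}
    order = []
    while True:
        if len(order) in updates:
            name, urgency = updates[len(order)]
            for s in streams:
                if s.name == name:
                    s.urgency = urgency
        # Lowest urgency value wins; non-incremental streams go first in request
        # order; incremental streams of equal urgency take turns (least sent first).
        pending = [
            (s.urgency, s.incremental, s.sent if s.incremental else 0, i)
            for i, s in enumerate(streams)
            if s.sent < s.chunks
        ]
        if not pending:
            return order
        i = min(pending)[3]
        streams[i].sent += 1
        order.append(streams[i].name)


if __name__ == "__main__":
    images = [Stream("footer.jpg", 2, urgency=5), Stream("hero.jpg", 2, urgency=5)]
    print(send_order(images))                          # no reprioritization
    print(send_order(images, {1: ("hero.jpg", 2)}))    # hero.jpg found to be in the viewport
```

Without the update, the image that turns out to be in the viewport waits behind the one requested first. With the update, it jumps ahead as soon as the signal arrives, which mirrors the image upgrade behaviour described above.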
And these kinds of things all come into play. But seeing this data is very interesting because, you know, anything, even single-digit percentages, is sometimes an obvious net gain. So to see a 6%, 12% was maybe a bit more than I was anticipating, but it's going to be super cool to see the remaining data set. Yeah, exciting. Well, before we get cut off, I'd like to thank you for your time this week, Yoav. In the past few weeks I've missed the opportunity and it's been sad, but no, thanks so much for coming on and talking about some of these things. Hopefully my slides didn't put you off too much, because you're talking to somebody else's slides, but no, it's been interesting. Well, before we close out, was there anything else you wanted to add? We do have a minute or two. No, just thanks for having me. It was fun. Right. Oh, can I get you back in the future? Do you think there's anything in HTTP/3 performance that we haven't touched on yet? Um, so one thing I'm working on is web bundles, and delivery of web bundles in a way that can be subsetted, so that can be a future topic. Okay. Yeah. In a way, it's cache digest but reversed. We turn it on its head, and instead of sending what's in the cache, we're just sending a list of everything that is actually needed. So, an interesting future conversation. This is the first week I've finished on time. That's great. I might just sit here until they cut the stream.
