Cloudflare TV

Leveling up Web Performance with HTTP/3

Presented by Lucas Pardue, Junho Choi
Originally aired on 

Join Lucas Pardue (QUIC Working Group Co-Chair and Cloudflare engineer) for a session on how HTTP/3 is supercharging web performance.

Episode 12

English

Transcript (Beta)

The web, a digital frontier. I'd like to picture bundles of HTTP requests as they flow through the Internet.

What do they look like? Bar graphs, lots and lots of data plotted as bar graphs.

I kept dreaming of visualizations I thought I might never see.

And then one day my colleague Junho started to picture stuff and compare them and come up with some cool blog posts.

And here we are. Welcome everybody to another episode of Leveling up Web Performance with HTTP/3.

I should really remember the name of this show.

This is our 12th episode. So thanks for being part of that journey or welcome everybody who is a newcomer.

Today's show, as mentioned, I've got my colleague Junho on as special guest.

Junho didn't send me any intro to him so I'm gonna make it up on the spot.

But Junho is a systems architect at Cloudflare working at the San Jose office in the protocols team.

So both of us together are responsible for the components that terminate protocols like QUIC and HTTP/2 and HTTP/3 and so on and so forth, which is kind of the primary topic of this show.

So I've been bugging Junho for the last few weeks to come on and talk about an area of QUIC that we haven't really mentioned much and put any real focus on because I'm not the expert at all in that area.

And so I don't know if it was in an earlier episode or something I did, where we looked at the QUIC transport protocol as a kind of federation of different aspects: you have an application mapping, which is what HTTP/3 is.

But the transport protocol itself has things like loss detection, security in terms of a secure handshake and confidentiality and those aspects.

But then also this congestion control thing, which I've just hand waved and said, it's cool.

So Junho is gonna come on the show and talk a bit about some of the work he's been doing in that space for the last while.

And that's all I've got to say.

I'll hand it over to you, Junho, to take it away. Hey, yeah, thank you for having me today.

My name is Junho Choi and yeah, basically I'm in the protocol team in Cloudflare working with Lucas and other colleagues in the team.

And one of my tasks here is working on quiche, the Cloudflare HTTP/3 and QUIC library.

And one of the pieces of work I did was adding a pluggable congestion control layer and implementing additional congestion control algorithms on QUIC.

And one of the first targets was CUBIC. CUBIC is very popular; other OSes like Linux have used CUBIC as the default congestion control algorithm for over 10 years, I believe.

And in the last few years, Google came up with some other cool stuff like BBR and BBRv2.

But I think CUBIC is still running the Internet. I mean, it's the major default congestion control that most TCP connections use.

So I thought it was a good target to implement CUBIC on QUIC first.

And then based on that experience, we can add more, because implementing congestion control sounds easy, but if you really start working on it, there are a lot of things you need to think about.

So a few months back, I wrote a post on the Cloudflare blog about my experience of implementing CUBIC on QUIC.

So basically I'll give you a walkthrough of the post.

Yeah, that's good. I mean, just for anyone who's maybe a bit new to transport protocols or this relates to the both, sorry, the idea of congestion control relates to TCP and QUIC, it's like a common thing, but can you like describe in a sentence or two, just like a basic level, what it is and why people should care about it for performance reasons?

Yeah, so the TCP is doing a lot of work and among them, I believe congestion control is very critical for its performance.

And what TCP defines in terms of congestion control goes back, I think, more than 20 or 30 years, to TCP Tahoe and Reno, NewReno and so on. It's basically based on the concept that there is a congestion window and a slow start threshold, and basically two modes, meaning slow start and congestion avoidance.

And the QUIC is basically using the same concept.

And I believe that's basically to play nicely with the other congestion controls already existing on the Internet.

And the second reason was, when you define some new congestion control, I think in terms of RFCs it's recommended to start with Reno first and then add some other things later.

And I think Reno and CUBIC are, I think, the only two RFCs out there about congestion control on the Internet.

So yeah, there's BBR, and there are tons of other congestion control algorithms out there from academia or industry, but I think only two of them are the kind of official thing we can use.

So like, I guess part of this is you use congestion control to avoid like flooding the network with loads of traffic.

And basically, between you and me and other people's computers is a lot of network hardware, right?

And it's got physical constraints.

And so if you just send as fast as you can locally, somewhere along the network, things might get a bit messed up.

And the congestion control is about mitigating some of that and being sensible.

But what sensible means is dependent on an algorithm.

So there's lots of ways you could do this. And like you said, you mentioned some already like different approaches to this problem.

And with QUIC especially, it's a user space protocol.

Now people are free to experiment with kind of whatever they want.

But for the people who are not congestion control experts, there's a risk that the, what seems like a simple way of implementing things could really hurt the environment around them or end up in unintended consequences.

So the recommendation for New Reno was like, let's pick the simplest, one of the simplest we can do that we know is safe and has an RFC and say, you do nothing else, do this one.

But if you want good performance, you need to do something else.

And that's kind of like where you came in to say, yeah, let's try modularizing QUIC.

And we've done NewReno, tick that box.

Let's do better now. Yeah, yeah. So I think making a baseline is very important, meaning I think the QUIC recovery draft explains well what type of Reno congestion control to use.

I mean, it's pretty simple.

And also there is a lot of pseudocode at the end. So you can just use that for your implementation.

And yeah, that's basically a starting point.

And after you've written the congestion control algorithm, then you need a way to test it.

And you also need to make sure that the algorithm is implemented correctly.

And basically, when you hit the congestion or when you hit the packet loss, and you need to see how it reacts.

And just make sure that it works as the algorithm described. Yeah.

And I see, since the QUIC is based on UDP, and the people tend to think that the UDP doesn't have any type of control.

And the one QUIC added to the UDP was basically all the thing TCP is doing, like flow control, congestion control.

There is a window management and the recovery and the concept of the stream, things like that.

And I have seen some people doing transfers on top of UDP and making mistakes, actually sending too much traffic.

I heard that there are some online games based on UDP that tend to send too many packets to the client, or allow the client to send too many packets to their server, without any congestion control.

And that basically causes congestion collapse, which means that basically every queue in the network path is full and no one can do anything, because every client is already sending packets at its max rate, so no one can make progress.

And that's basically the congestion collapse. And congestion control is pretty important to prevent those things from happening.

Funny, in the UK, I mean, I love using analogies that are just bad.

That's kind of one of the highlights of this show.

So in the UK, we have what are called smart motorways where we have traffic congestion, where everyone is kind of trying to drive as fast as they can within the speed limit.

But at certain points of the day, when there's known to be rush hour and congestion, these boards come to life and tell you the speed to go.

And it's completely different in the way it works, but there isn't a governor or coordinator running on the Internet to tell everyone how to behave.

Like it's effectively endpoints operating with the knowledge that they have, which is fairly limited.

The network might be able to provide some kind of signaling but there's no like single source authority that everyone has to coordinate with, which is both a pro and a con.

The pro is that people can innovate and do things. And it's the whole over-the-top argument, why Ethernet and IP are good versus other proprietary protocols.

But at the same time, that comes with the risk that you're gonna upset the rest of the Internet if you do.

Or people might be interfering with your traffic flows and you don't know.

Right. Yeah, so when I started working on this transport layer stuff, I realized that there are a lot of RFCs out there.

There are a couple of RFCs explaining the guidelines for when you invent a new transport layer on the Internet, and what the nice way to behave is.

So that is pretty helpful, but I don't know how many people actually read the RFCs and try to follow the guidelines.

Yeah, quite often, even the RFCs that are clearer, where they have like normative requirements that you must do this or you must not do that.

And ultimately, sometimes it's in that implementation's interest or deployment to ignore that.

It's actually not enforceable in any way because practically, if you did that thing, like you could end up DoSing yourself or there's all sorts of weirdness.

So like you say that it's better to look at these things as good guidelines and then to basically you need informed engineers who really understand this stuff to even be able to analyze and measure what's happening and what the effects are and be able to explain them.

And you're gonna mention the CUBIC blog post here, but in some of the others that we've done, like Lohit's blog post from around Christmas, New Year time, talking about how we built a lab environment that lets us do this.

We've got local host testing, we've got lab environments with not just does it work but different varying conditions, but really important for us to be able to characterize different things.

And that sometimes the answer, there's different answers depending on different conditions.

It's not one single algorithm with one parameter that makes sense for everything.

But anyway, let me stop waffling. You can proceed.

Sure, okay, so let me share my screen.

Can you see?

I can see it. Okay, cool. Yeah, so this is my blog post regarding the CUBIC and HyStart++ implementation in quiche.

And yeah, this is a pretty long post, but I'll just skip to the points I want to explain.

So, as I explained, QUIC is based on UDP, and I think of UDP as basically a placeholder for any new protocol.

TCP is doing basically all the things necessary to not overrun the Internet, but if you want to implement something on top of UDP, you also need to think about how to do flow control or congestion control, or loss recovery if you want reliable transfer.

And QUIC wants to be reliable. Of course, there are other extensions like unreliable transfer, but a QUIC stream, at least, is basically reliable, in-order delivery.

And so we need to have some type of congestion control, and also the loss recovery, at the same layer.

And yeah. The QUIC congestion control and loss recovery is based on the TCP experience.

There are lots of TCP-related RFCs out there.

And also there are still a lot of Internet drafts, and probably more in the actual implementations or the patches to individual operating systems.

So my experience with the transport layer is, first I read a lot of RFCs regarding TCP, the very old ones, like what TCP is and the basic congestion control.

And there are other RFCs like NewReno or SACK and CUBIC and others.

And to understand why the QUIC congestion control is defined this way, you need to have some background knowledge of the related TCP RFCs.

So I think most of them are listed in the recovery draft.

And basically QUIC uses the same type of congestion control as TCP does, which means there is a congestion window.

It means how many bytes you can send to the network.

And there is a slow start threshold, ssthresh. It basically defines where slow start ends.

But if you read the related RFC, you know, the one thing I was confused was how they define the congestion window.

Meaning, you know, I think the old RFCs define the congestion window as a unit of packets.

So if you send one packet, regardless of its size, you just count it as one.

And there are other RFCs regarding the byte counting mechanism, meaning the congestion window is now in units of bytes.

So if you send a full-size packet, let's say 1500 bytes, then you will increase the congestion window by 1500, not one.

And that makes a difference in how you calculate the congestion window.

And QUIC defines the congestion window as a unit of bytes.

Meaning, you know, when you send a 1000-byte packet, then you will increase the congestion window by 1000 bytes, not one.
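As a toy illustration of the difference being described here, packet counting versus byte counting, a minimal sketch in Rust (not quiche code; the names are made up for the example):

```rust
// One newly acknowledged packet, described by its size in bytes.
struct Acked {
    size: usize,
}

// Packet counting: the congestion window is a number of packets.
fn grow_packet_counting(cwnd_packets: &mut u64, _acked: &Acked) {
    *cwnd_packets += 1; // one packet acked, window grows by one packet
}

// Byte counting (what the QUIC drafts use): the window is a number of bytes.
fn grow_byte_counting(cwnd_bytes: &mut usize, acked: &Acked) {
    *cwnd_bytes += acked.size; // window grows by the size of the acked packet
}
```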

So that makes a lot of confusion when you want to implement a new congestion control in QUIC, or even in TCP.

And the kernel implementations actually differ, meaning that if you look at the Linux kernel implementation of TCP, it's basically packet-based, the congestion window is defined in packets.

But, for example, BSD or macOS, they use byte counting.

And also a lot of academic papers regarding congestion control tend to define the congestion window as a packet count.

So the same thing happened when I was implementing this for QUIC, because if you read the original CUBIC paper, they define the congestion window in units of packets.

But if you read the CUBIC RFC, they define the congestion window as a unit of bytes.

So there are a few differences you need to be careful about, because sometimes you need to scale, you know, between a packet count and the actual size of the packets.

So, yeah. There's no fixed size of packets here, right?

And like, you have path MTU, so the maximum size of what you can send on any given path can vary, and it can vary by direction.

So the size that, you know, of a packet that comes into you might be different than you can send it.

So, yeah, the congestion control is sender-driven, right? It's not, I think, to do with reception. That's probably something worth mentioning.

That's probably something worth. Right. Right. And, yeah. And, you know, when you, you know, categorizing the congestion control, there is a few different types, but the one is, you know, the loss-based congestion control, such as the Reno and Qubic, is basically they respond to the packet loss.

And there is another one, such as the TCP Vegas and the BBR.

I think they are inspired by the TCP Vegas work.

They are delay-based, meaning that they just try to measure the queuing delay on the network path and tune the packet sending rate based on that.

So there are two different kinds.

And, you know, since the delay-based ones are still a minority, other than BBR, most of what's on the Internet is the loss-based kind.

And these two have very different characteristics.

So it's known that they don't work well together, meaning, you know, in the same network, if you have a lot of CUBIC flows and you have a lot of BBR flows or TCP Vegas flows, they are not sharing the bandwidth equally.

So I think that's one of the topics BBR version two is trying to solve.

And so basically when we try BBR version two, then we'll know how things work out.

I know you're gonna get into more detail shortly, but just to kind of loop back on one of the things I talked about last week is that when you mentioned academics and kind of the studies, like this is a really cool area for people to look into, but you could also like get caught out in not knowing some of the information.

So lots of people try to compare QUIC performance to TCP, for instance, and it's not that simple, like, cause these are optional or pluggable.

You can be comparing congestion control algorithms against each other, just as much as you're comparing the protocols.

So you wanna, depending exactly what you're looking at, you wanna kind of minimize the variables, pick your control point, like you wanna test the performance of BBR when implemented in a TCP stack and a QUIC stack, like no two implementations are quite the same.

So it really is important to hone down on those control variables.

Yeah, exactly. So, people want to try out different things and compare, or even with the same algorithm they want to change the parameters, so basically we provide a modular API for that.

And what it does is provide a few hooks: when a packet is ACKed, when we send a new packet, and when a loss event happens.

So you can define how to react to those events.
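As a rough sketch of what that kind of hook-based, pluggable design can look like, here is a hypothetical Rust trait for illustration, not the actual quiche API:

```rust
use std::time::{Duration, Instant};

// Minimal description of a sent packet, just enough for the hooks below.
pub struct SentPacket {
    pub size: usize,
    pub time_sent: Instant,
}

// A pluggable congestion controller reacts to three kinds of events.
pub trait CongestionControl {
    /// Called whenever a packet is handed to the network.
    fn on_packet_sent(&mut self, pkt: &SentPacket, bytes_in_flight: usize);

    /// Called whenever a packet is newly acknowledged.
    fn on_packet_acked(&mut self, pkt: &SentPacket, latest_rtt: Duration, now: Instant);

    /// Called when packets are declared lost (a congestion event).
    fn on_congestion_event(&mut self, time_of_last_lost: Instant, now: Instant);

    /// The current congestion window, in bytes.
    fn congestion_window(&self) -> usize;
}
```

Swapping Reno for CUBIC then means providing a different implementation of those hooks, without touching loss detection or the rest of the transport.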

And that basically constitutes a new congestion control algorithm. Yeah, and, you know, I keep talking about TCP because a lot of the QUIC congestion control and recovery is based on the experience of TCP.

And if you look at TCP, I think in the very beginning of TCP there is no, let's say, strict separation between congestion control and loss recovery algorithms.

So, for example, the actual difference between Reno and NewReno is how they do loss recovery, right?

How they calculate the congestion window is actually the same, but NewReno just tries to recover from loss earlier, and not to let multiple congestion events happen for the same loss episode.

But QUIC is trying to separate the loss detection and the loss recovery as well.

So that is considered a separate component from the congestion control algorithm.

So, you know, there are a few rules defined for how to detect loss and how to recover from loss.

So when you write a new congestion control algorithm on QUIC, then basically you don't need to consider how to detect or recover from loss. But, you know, loss detection and recovery is a whole different topic, a very interesting topic if you want to look into it.

Hey, if you want to come back next week, that's fine by me.

Yeah, yeah. So, you know, traditional TCP basically uses packet counting to detect packet loss, but recently there are new algorithms like RACK — the name comes from "recent ACK" — which has been in the Linux kernel for a few years, and I think FreeBSD got one very recently.

And RACK is, I think, still a draft, not an RFC yet, but, you know, it has proven to be very efficient.

So I think that's why the QUIC recovery draft is trying to combine packet-counting detection, like three duplicate ACKs, with an algorithm similar to RACK, which is basically a time-based threshold. Which means, you know, you send a packet and you expect an ACK to come back, and if the ACK is not coming back within some defined time threshold, then you basically consider this packet lost.
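As a rough sketch of that time-threshold idea (following the general shape of the QUIC recovery draft, applied to packets sent before the newest acknowledged one; the names and exact wiring here are illustrative only):

```rust
use std::time::{Duration, Instant};

// The recovery draft uses a threshold of 9/8 of the RTT estimate.
const TIME_THRESHOLD: f64 = 9.0 / 8.0;

/// Returns true if an as-yet-unacknowledged packet, sent at `time_sent`,
/// should be declared lost because it has been outstanding for longer
/// than the loss delay.
fn lost_by_time(
    time_sent: Instant,
    smoothed_rtt: Duration,
    latest_rtt: Duration,
    now: Instant,
) -> bool {
    // loss_delay = 9/8 * max(smoothed_rtt, latest_rtt)
    let loss_delay = smoothed_rtt.max(latest_rtt).mul_f64(TIME_THRESHOLD);

    // If the packet was sent more than loss_delay ago, treat it as lost
    // instead of waiting for three duplicate ACKs.
    time_sent + loss_delay <= now
}
```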

And, you know, TCP has SACK, selective acknowledgment, which is basically for the receiver to provide information to the sender about what the actual status of the receiver buffer is.

So which packets are lost and which are received. And, you know, QUIC also has the equivalent thing, named ACK ranges; it's based on packet numbers, so it is doing basically the same thing.

And that actually helps packet recovery and loss recovery a lot, because using this information the sender knows what the actual status of the receiver buffer is, so it can use this information to decide which packets need to be retransmitted.

Otherwise, the sender needs to guess what is lost and what needs to be retransmitted.

So basically, I think one of the topics in loss recovery is how to reduce spurious retransmits, right?

You know, the receiver received the packet, it sent the ACK, but the ACK is lost.

So the sender doesn't know whether this packet was actually delivered or not.

And then it just tries to send the packet again, but it's already, you know, actually received by the receiver.

So it's basically kind of a waste of bandwidth. So yeah, SACK and ACK ranges are very helpful information to avoid, you know, these spurious retransmits.

This has a funny thing. So when I'm not doing coding on Quiche and stuff, I'm also doing quick working group chairing stuff, helping things progress there.

So we have a weekly editors meeting where we talk about some of the active issues and how we're gonna resolve them.

And I generally just sit there and make notes and move things around on a Kanban board.

And yeah, like hearing Jana and Ian Swett and the others talk about all of these edge cases that are coming up, and finding windows and left edge and right edge.

And like, it's like, all of these like tricky little details.

You can read a sentence and go, that's fine. And then somebody says, oh, whatever, what about this?

Like, you wouldn't, you've got to like step through each possible option and then the variants of each one to do stuff.

So when I listen in on these at midnight on a weekday, and I don't follow along so well.

But the outcome is that the text gets further refined so that people don't need to, it's still needed to have an awareness, but it should be implementable.

But it's kind of just funny, amusing to me to hear you talking about the same things kind of independently of that.

Yeah, because if you look into how recovery is done in TCP or QUIC, there are lots of the edge cases you mentioned.

And there are lots of papers and RFCs and Internet drafts regarding some of the ideas and how to improve.

So, you know, there are TCP RFCs like SACK and DSACK, or, you know, F-RTO and forward ACK.

Yeah, there are tons of tricks and hacks.

I won't say it's a hack, but there are tons of improvements on how to handle all of the corner cases and reduce the erroneous retransmits.

So I think with the QUIC recovery draft, you know, I think the editors know the issues well.

So they're basically trying to summarize or combine all of the last 20 years of experience of TCP into this QUIC recovery spec.

But, you know, maybe we can find some improvements or try to fix some corner cases, but I think it never ends.

Yeah, a thankless task, maybe. But I make sure to thank them whenever I can.

Okay, so moving on, I think Reno is very simple.

As I explained, between Reno and NewReno the actual difference is mostly the loss recovery.

So in terms of how they handle the congestion window, it's pretty much the same. There is the concept of slow start and congestion avoidance.

And slow start means — I don't know why it's called slow start — but when you start the connection, you just keep sending data, and the congestion window basically doubles every RTT.

So, meaning, you know, the congestion window grows very aggressively until you hit a loss.

Slow start ends when the congestion window is bigger than the slow start threshold, but if you look at the recovery draft, the initial slow start threshold is infinite, meaning as long as the network allows, the congestion window can grow indefinitely.

And then you hit the loss and then you basically reduce the congestion window.

So, and basically you enter the recovery period until all the packet loss is resolved.

And you basically enter congestion avoidance, which tries to be more cautious about the congestion window, which basically means that it grows by one full packet per RTT.

So, it grows very slowly until you hit another loss. So, as a result, you will see this kind of, what is it, sawtooth pattern.

It's like, this is slow start.

So, you grow the congestion window very aggressively and hit the loss, and cut the congestion window by half, and enter recovery where you basically try to recover all the losses.

And then you enter the congestion avoidance here and slowly increasing the congestion window until you hit another loss.

And you hit another loss and you cut down the congestion window and keep doing the same thing.
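A toy Reno-style sender, as a sketch of the sawtooth being described here (illustrative Rust, not quiche's implementation; the packet size constant is assumed):

```rust
// Assumed maximum datagram size for the example.
const MAX_DATAGRAM_SIZE: usize = 1200;

struct Reno {
    cwnd: usize,     // congestion window, in bytes
    ssthresh: usize, // slow start threshold, initially "infinite" (usize::MAX)
}

impl Reno {
    // Called for every newly acknowledged packet.
    fn on_packet_acked(&mut self, acked_bytes: usize) {
        if self.cwnd < self.ssthresh {
            // Slow start: grow by the bytes acked, which doubles cwnd per RTT.
            self.cwnd += acked_bytes;
        } else {
            // Congestion avoidance: roughly one full packet per RTT.
            self.cwnd += MAX_DATAGRAM_SIZE * acked_bytes / self.cwnd;
        }
    }

    // Called when loss is detected (a congestion event).
    fn on_congestion_event(&mut self) {
        // Cut the window in half and remember where slow start should stop.
        self.ssthresh = self.cwnd / 2;
        self.cwnd = self.ssthresh;
    }
}
```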

Yeah. It's just like my analogy earlier. So, I'm on the motorway and I'm accelerating up to 70 miles an hour.

And then someone tells me I need to slow down. So, I slam on the brakes and slow down to 50, they tell me, but I kind of go a bit lower.

Yeah. I keep putting the throttle down and getting up to 50 and then easing off a bit because I don't have cruise control.

Right, right. Yeah, basically it's similar to a cruise control.

But the interesting property here is that it just keeps growing the congestion window until you hit congestion.

You just keep doing it because, I mean, the sender has no knowledge about what the actual capacity of this network path is.

Right? So, it's just trying to probe, it's trying to find what the actual capacity is, but the capacity is also changing over time.

So, it just doesn't assume that, when you, after five seconds of connection, oh, this network capacity looks like 100 megabit per second.

So, I keep sending at that rate.

And that doesn't work because there are other connections coming in, and the routing may change.

So, the network capacity is also changing dynamically.

So, the congestion control is trying to keep finding what is the current network capacity.

But that's why it doesn't try to have a fixed sending rate based on the first few events, the loss events or congestion events.

Yeah, it's like a lot of different feedback loops in nature, right? There's no fixed answer.

So, you just, you need a tight kind of feedback loop and able to get as close to the border line as possible with an ability to step up or step down, like very quickly if needed.

That's kind of how I think about these things.

Right. And one interesting thing — oh, by the way, this is the congestion window visualization chart.

I think you already had a session about this with Robin, but this is very helpful when you troubleshoot congestion control issues.

And, you know, below, there is RTT over time.

And if you look at this RTT, it's increasing when the congestion window increases.

So, this basically shows the queuing delay on this link.

And the increase in congestion window means that you are sending more data into the network and there is more chance of, you know, the queuing delay.

So, actually, RTT is also increasing if you're getting the link overloaded.

And many congestion controls use this RTT increase as a sign of congestion, or a sign of the queuing delay increasing, and try to react based on that before actual packet loss happens.

And so, that graph was, is that of the Reno algorithm?

Yeah, this is Reno. So, yeah, we will get to CUBIC later.

So, yeah, CUBIC is relatively new, you know, in the history of congestion control.

So, I think originally there was only a paper, but now it's an RFC.

And as I already mentioned, you know, there's a difference in how you count the congestion window, by packets or bytes.

And I think almost every OS, Linux, BSD and Windows, uses CUBIC by default.

And we also followed the RFC, RFC 8312.

And I think five or six years ago, Google found a bug in the Linux kernel.

And that was, yeah. I think Christian also included the same fix, so we don't have the same problem.

Yeah. So, interestingly, you know, CUBIC only changes the way congestion avoidance works.

It doesn't do anything about the slow start.

In Reno, during congestion avoidance, you know, it's increasing by one full packet per RTT.

So, the increase is pretty linear, like this.

But CUBIC, during congestion avoidance, tries to increase the congestion window using a cubic function, like this.

So, when it starts the congestion avoidance here, it's just trying to remember what was the last congestion window before the loss event happened.

And trying to increase the congestion window very quickly in the beginning.

But later, it tries to approach the previous congestion window very slowly.

Meaning, assuming that, you know, the network link capacity is the same.

So, I want to approach the previous congestion window cautiously, to check that this is still the current link capacity.

And if a loss happens, then, you know, we handle the loss event again. But let's say, you know, the capacity has actually increased, by some actual capacity change, or an existing other connection, you know, terminated.

So, we have more room to grow.

And then you start to grow it again. So, in the beginning it grows very slowly, but after it has some confidence that, oh, you know, it looks fine.

So, it will try to increase the congestion window very fast.

So, it's trying to increase the congestion window very fast.

Like slow start, until it hits another loss. And then the same thing happens for the next period.

So, this is basically the essence of CUBIC. And another big change is, when a loss event happens, Reno basically reduces the congestion window by 50%, but CUBIC only reduces it by 30%.

And this actually makes a lot of difference in terms of performance, because that means, you know, CUBIC grows back, recovers from the loss event, more quickly.

And, you know, it also has a TCP-friendly region, meaning that at least, you know, the congestion window grows as big as Reno's would.
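The growth curve being described is the cubic function from RFC 8312; here is a small standalone sketch of it (units are packets, as in the RFC's formula; this is just an illustration, not the quiche code):

```rust
// Constants from RFC 8312.
const C: f64 = 0.4;          // aggressiveness of the cubic curve
const BETA_CUBIC: f64 = 0.7; // multiplicative decrease: keep 70% on loss

/// Target congestion window `t` seconds after the last congestion event,
/// where `w_max` is the window (in packets) just before that event.
fn w_cubic(t: f64, w_max: f64) -> f64 {
    // K is the time at which the curve climbs back to w_max.
    let k = (w_max * (1.0 - BETA_CUBIC) / C).cbrt();

    // Fast growth far below w_max, a flat plateau around w_max,
    // then fast "probing" growth again beyond it.
    C * (t - k).powi(3) + w_max
}
```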

And another topic is HyStart, because as I said, CUBIC only changed the way congestion avoidance works.

And for slow start, there is another algorithm, named HyStart.

It's also from the same people as CUBIC.

And, you know, as I explained at the beginning, slow start terminates when the congestion window is bigger than the slow start threshold.

But the initial value of a slow start threshold is infinite.

So, basically, the congestion window grows very fast. And so, practically, slow start only exits on packet loss.

So, people tried to think about what the way is to exit slow start earlier than that.

Right? Before actual packet loss happens, if we are able to exit slow start and move to congestion avoidance, then probably we are not gonna see, you know, more packet loss.

So, HyStart is based on the idea of, you know, how to detect the congestion before the packet loss event.

And there are two ways of doing it. One is watching the RTT increase.

As I explained, when the network is congested by, you know, a bigger congestion window, RTT also increases.

And there is another way, detecting the ACK train.

It's basically measuring the interval between the arriving ACKs.

And if the ACKs are arriving more slowly, then we use that as a sign of congestion.

But in the real world, the ACK train is not that useful, because, you know, there are lots of proxies and middleboxes, or the link layer, doing ACK compression.

So, the RTT delay is a more reliable way to detect congestion before loss.
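As a sketch of that RTT-based check (this follows the shape of the HyStart++ draft's delay increase test, with its suggested 4–16 ms bounds; the exact constants differ between implementations):

```rust
use std::time::Duration;

/// Compare the minimum RTT observed in the current round trip against the
/// minimum of the previous round; if it grew by more than a threshold,
/// treat that as a sign of queuing and leave (standard) slow start.
fn delay_increase_detected(
    prev_round_min_rtt: Duration,
    curr_round_min_rtt: Duration,
) -> bool {
    // Threshold is an eighth of the previous round's min RTT,
    // clamped between 4 ms and 16 ms as in the HyStart++ draft.
    let thresh = (prev_round_min_rtt / 8)
        .clamp(Duration::from_millis(4), Duration::from_millis(16));

    curr_round_min_rtt >= prev_round_min_rtt + thresh
}
```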

And there is an improvement named HyStart++. This is also an Internet draft.

So, I think, interestingly, the original HyStart is not an Internet draft.

I don't know why. But HyStart is... It doesn't have to be one, but it helps. Yeah.

But HyStart++ is already an Internet draft. And I think it was actively reviewed very recently, and Microsoft implemented it.

And this is, yeah. And, you know, when I look at HyStart and HyStart++, I basically like HyStart++ better because the implementation is much simpler.

And I think the biggest difference is it adds the concept of limited slow start.

It's just in between slow start and congestion avoidance. So, it's kind of, you know, the link between those.
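A tiny sketch of that limited slow start idea (the growth divisor of 4 is the value suggested in the HyStart++ draft; this is illustrative, not the quiche implementation):

```rust
// Suggested value from the HyStart++ draft.
const CSS_GROWTH_DIVISOR: usize = 4;

/// Window growth while in limited (conservative) slow start: instead of
/// adding all acknowledged bytes (doubling per RTT), add a quarter of them,
/// so growth is gentler than slow start but still faster than congestion
/// avoidance.
fn on_ack_limited_slow_start(cwnd: &mut usize, acked_bytes: usize) {
    *cwnd += acked_bytes / CSS_GROWTH_DIVISOR;
}
```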

But this shows the importance of, you know, actually trying stuff out.

And I don't know much about this, but yeah, HyStart was a design on paper that looked pretty good.

Practically, it just had some of the issues you described here.

So, that's not the end of it. Don't throw it in the bin.

You take what you learned from that and keep on iterating. And it's done in such a way that meant it's applicable to not just the TCP use case that they had, but like other things could pick it up.

And then you were able to implement it and measure it yourself and see what the outcome was.

Right, right. And there are a lot of parameters, the initial parameters of this HyStart.

So, depending on the results, they may change.

And yeah, I'll just skip to the results part.

So, you know, as you say, using the same testing framework, basically trying to control the bandwidth and RTT and packet loss.

And we basically compared the performance of this new implementation.

So, mainly between quiche Reno and quiche CUBIC.

And also with HyStart turned on or off. So, you know, that Reno is slower than CUBIC is already known.

And I expected TCP CUBIC and quiche CUBIC to perform similarly.

And the one curious thing I found was, when there is more packet loss, like 4%, 6%, 8%, you know, quiche, QUIC actually performed better than TCP CUBIC.

So, this is a very interesting thing. So, I think this is mainly because the loss recovery in QUIC, in quiche, is somewhat more efficient than what TCP does.

But yeah, this is something that I want to look into in more details later.

Yeah, and what I like about this graph, just in case people can't crack it as good as me, the lower the better, right?

Because this is the total download time for an object of some size.

I can't remember exactly, but like your left-hand most case there with zero packet losses, like you just said, they kind of all perform the same.

So, you go, well, why would I bother? I'll just go with Reno and it'll be fine.

But it's in these conditions that are more like the real world that we might encounter, as you start to see an actual benefit.

Right, right.

Yeah, so in the real world, there are some packet losses, and I think the packet losses are higher in the mobile environment, I believe.

And then every transport protocol needs some resistance to those types of packet losses.

Okay, so yeah, I think I only have a few minutes.

So, one thing to say about HyStart is, HyStart is not trying to improve the actual performance.

It's trying to reduce the packet loss by exiting slow start earlier.

So, the goal of HyStart is reducing the packet loss while maintaining similar performance.

So, that is actually shown by this: whether we turn HyStart on or not, the actual performance is very similar.

But if you look at the number of lost packets measured, HyStart can actually reduce the packet loss.

And this is important for not unnecessarily overloading the Internet, and also for reducing some of the queuing delay by removing unnecessary packet transmissions.

Okay, so yeah, we already looked at the visualization, and I think I just want to show the last one, which is the example of CUBIC with HyStart.

This one, so this is a typical example of CUBIC with HyStart.

So, in this congestion avoidance phase, you can see the congestion window is growing — maybe hard to see, but in the end it's approaching very slowly.

And also, in the slow start phase, the slope is a little bit smaller than usual.

So, if you look in more detail at the first few seconds, you can see this part of the congestion window increase, the slow start.

And here, before congestion, before actual packet loss, it detects that this RTT is increasing and then it enters the limited slow start.

So, it still grows faster than congestion avoidance, but slower than slow start, until it hits the loss.

And then it enters the recovery.

So, this first part is actually a recovery period, and congestion avoidance starts here, when all the losses are recovered.

So, it is important to know that when you implement something, you should be able to verify it this way.

Otherwise, you don't know whether it actually works as you implemented it, or as you expected, or not.

Yeah. That's why it's cool to have some tools to help us.

But yeah, just to reiterate to anyone who's watching, like this is the blog post that we, oh, not we, Junho put up a couple of months ago.

So, like if any of that has piqued your interest, like do take a look as we've got like a description of the test conditions and probably some way for people to reproduce those results.

And I know you wanted to go into maybe like some of the details about the quiche API, but I took up all your time asking dumb questions.

So, maybe we could get you back at another point to show people, if they've got some ideas and they wanna try this stuff out, how they could be able to do it.

But yeah, before we run out of time, I'd like to thank you for coming on and just explaining stuff.

I think my takeaway from this is, like we focus quite a lot on the web aspect of stuff like page loading and things like that.

But by changing something at like a low level for people like me and just tweaking things lightly, like you can adjust aspects of transfer speed, which then have like knock on effect for performance.

And it's kind of being able to take a holistic view of all of these things is quite important.

There's no one answer, like a silver bullet of, oh, my page is slow.

If I swap out the congestion controller, it will all be fixed.

Because like you just shown, like maybe you're inducing packet loss at the start of the connection.

And then that has some weird interplay with the way that your website's constructed.

So like you really have to kind of use all the tools at your disposal and piece together the different aspects.

We might take like a webpage test trace and a Q log and try and visualize these things just to understand like what's going on.

Is kind of why I find all this work really interesting because it's continued unexplored territory.

But that's it. And while you're on the call, I also want to highlight, if congestion control isn't people's thing, but you stayed to the end:

Junho published a blog post today about some work he's been doing on HTTP/2 upload performance, which is nothing to do with congestion control.

But to do with flow control.

So I don't know, maybe you want to spend the last couple of minutes talking about that before we get cut off.

But it's a really cool blog post. There's more graphs there that show you things.

Like I enjoyed it. It just, we saw something weird and decided to look into it rather than just go, oh, we don't know.

Yeah.

Well, yeah, I think it's interesting to look at, because it's all about when we upload content to the HTTP/2 server.

And then what is important is, because HTTP/2 is based on TCP, we don't need to worry about the congestion control or the basic TCP flow control, but HTTP/2 has its own flow control, right?

And then we just need to make sure that there's not a bottleneck on the server side.

And people mainly think about how to make downloads faster, but they're less sensitive about the upload speed.
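For context on the flow control point, here is a toy sketch of HTTP/2-style receive-window management for uploads (illustrative only, not the code from the blog post): if the advertised window is small relative to the bandwidth-delay product, the uploader spends time stalled waiting for WINDOW_UPDATE frames.

```rust
// Receiver-side flow control accounting for one stream or connection.
struct RecvFlowControl {
    max_window: u64, // window size we advertise to the peer
    consumed: u64,   // bytes handed to the application since the last update
}

impl RecvFlowControl {
    /// Call as the application consumes received data. Returns
    /// Some(increment) when a WINDOW_UPDATE should be sent to the peer.
    fn on_data_consumed(&mut self, bytes: u64) -> Option<u64> {
        self.consumed += bytes;

        // A common strategy: top the window back up once half of it is used,
        // so the sender rarely runs completely dry.
        if self.consumed >= self.max_window / 2 {
            let increment = self.consumed;
            self.consumed = 0;
            Some(increment)
        } else {
            None
        }
    }
}
```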

So I think that, yeah, it's great to have that work and the blog post is so helpful and interesting.

Yeah, I enjoyed reading it.

And I tweeted about it earlier to get people having a look. I think to me what it demos is like, like you were saying in terms of Internet timescales, 2008 was not that long ago.

HTTP/2 has only been out for five, six years now and we're already replacing it, but we're still finding quirks of implementations or how different parameters cause things.

We've seen this with the priorities chat that we've had before and how, in implementation, we were able to find some ways to improve that.

But unless you measure, you don't know, like was there even a problem in the first place?

And now we're trying to replace that. And a big factor for me there is to prove like with interop testing that we've done this.

So like, I'm excited.

I don't think the job's done. I think we're gonna have a fruitful few years of continued tweaking and testing and evolving.

Do you agree?

Yeah. Are you looking forward to it as much as me? I think we're at the top of the hour and we're done now, but if not, I'm gonna take next week off because it's a bank holiday in the UK.

So tune in in a couple of weeks and I'll be back.

But for now, goodbye and thanks again, Junho. Yeah, me too. Yeah, thank you for having me.
