Cloudflare TV

Leveling up Web Performance with HTTP/3

Presented by Lucas Pardue, Daniel Stenberg
Originally aired on 

Detailed tips and tricks for analysing and measuring the new HTTP/3 and QUIC protocols. Featuring guest Daniel Stenberg.

Transcript (Beta)

The web, the digital frontier. I tried to picture clusters of HTTP requests as they flow through the Internet.

What do they look like? Lego pieces? Wooden blocks? I kept dreaming of visualizations I thought I might never see.

And then one day, Daniel Stenberg wrote some stuff on HTTP/2.

Hello, everybody. This is another episode of Leveling up Web Performance with HTTP/3.

This week, I've got a special guest, Daniel Stenberg, who is the core maintainer of the curl project.

He's like a big hacker who's been supporting lots of very diverse use cases with curl for a very long time and has been recognized for those achievements in a lot of places.

Daniel's also quite a prolific blogger and has written about lots of interesting cases where curl's been used, and also some very insightful posts that I still refer back to on things like HTTP/2 and related technologies.

Welcome to the show, Daniel.

Thank you very much for coming on board and giving us your time.

Hello. Good to be here. Fun.

I've had the pleasure of getting to spend some time with Daniel across the years at different kinds of meetups or Internet-related conferences.

One of them is the HTTP Workshop, which is a non-standards-body meeting that has been held in Europe the last few years.

But unfortunately, due to the current situation, we weren't able to hold that one this year.

And another time I got to meet you has been at the curl up conferences, which is your very own conference, which was also similarly affected.

So this show is an opportunity for me to have our semi-annual meetup, I'd say, and just chew the cud on protocol stuff, away from the specific standards issues that you might want to argue about on an email list, and talk more holistically about stuff.

It's good to have you here.

It's good to be here. Yes. It's been a weird year, right? So a lot of those physical meetups haven't happened.

So here we are. Here's my home in Sweden, all white background.

Any of your closer followers might be familiar with your live streams where you do some kind of live coding and hacking on curl.

Is this the same room?

It is the same room, maybe a different angle. But yes, I have a few different machines.

I picked this particular machine today to do the video conferencing on.

I actually have a dedicated machine for it to minimize the problems.

So yeah, for me, I've been working from home for a long time. So for me, it wasn't such a big transition, this Corona thing.

And nowadays, I work on curl full time.

So I'm here working on curl, just like any other day.

So as always, I dive right into what I know about things.

Maybe some of our viewers aren't so familiar with curl.

The purpose of this show is to talk specifically around H2, H3, and QUIC and stuff.

But maybe you can give a little intro into Curl and libcurl, and then how those protocol features integrate.

Yeah, so it started, everything started with curl, the command line tool, which I released under the name curl in the spring of 1998.

So I designed it to do Internet transfers based on URLs, hence the name, right?

C URL, or client for URLs, or whatever. I wanted it to be short, so you could type it easily, C URL.

So it started like that. And already, when I created curl, the command line tool, I knew that it would be fun to make a library to offer the same powers to applications.

And after a few years, I introduced libcurl then as a library to offer to, yeah, whatever application wants to do Internet transfers based on the same things.

So that's how we started.

And then it took off from there. A lot of applications and programming languages then adopted libcurl as their primary Internet transfer engine, sort of.

And then we added everything that we wanted it to do for URL transfers, which has always been HTTP and HTTPS focused.

And of course, when the standards change and improve, we follow that.

So we introduce new TLS versions, new HTTP versions.

And of course, with HTTP/2, we were really early on, and we want to be early with HTTP/3.

So we are here as well. And so the command line tool is used then by a lot of people to do debugging, testing, whatever you want to do.

And then we want the tool to be able to do basically everything you can do with a browser.

If you just figure out what the browser does, you can do the same thing with curl.

So that's always a mission for us: to make sure that all the HTTP magic a browser can do, we can do with curl as well.

And basically, you should be able to look like your browser with curl.

And that's what we try to do all the time. And also, for example, if you implement a server and you want to troubleshoot your server, you should be able to do that with single curl command lines instead of trying to reproduce it with your browser.

Yeah. And as somebody who's been working at kind of the HTTP level, adding features or debugging applications built on top of HTTP, I've been that person that's used curl.

And it is an invaluable one. As you were talking, I've just been reminded of a time that I might actually encounter an issue like this and then think, you know, oh, my web browser failed to load a page.

So I didn't plan to do this, but to give people a feel for how easy it is now to use curl to emulate this kind of use case.

If we say, for example, loading a web page, I'll use the Cloudflare QUIC web page loading in Firefox.

For instance, the fairly simple page, everything here worked okay. On the right-hand side, we can see the response headers that came back, a bunch of, you know, request headers that sometimes Firefox hides and does stuff with.

So we could, from here, be able to right click.

This might be a bit small. I don't know if I can increase the UI element size, but what I would do is just select a resource in this network view in the dev tools.

And there's all these options, like "Copy as cURL".

So I don't know what the difference between Windows and POSIX is here.

No, I'm actually not sure either. I think it's quotes or something on the command line.

So I think if you want to paste it in a Windows command line prompt, I think it's.

Yeah, I'll just put it in here and see what it comes out with. Oh, I think a line with a valid cookie.

Yes, and that can be a really extensive command line.

Yeah. So in here, you can see it's a fairly simple thing, right? We've just got curl, the command, and then different...

Zoom in a bit more. We can add individual headers, right?

So we've got our user agent string, which becomes part of trying to emulate a certain type of browser, for instance.

And this makes it, yes, quite easy to override, because you get things like servers doing user agent sniffing, right?

And they'll change their response based on who they think they're talking to.

Right. That's very common. And you can get like browser plugins and stuff that can override those things.

But sometimes if you're just trying to create a scriptable reproduction, so say Cloudflare customers who might have observed an issue, they'll be able to do this kind of thing and send in, "this is what I sent at the time."

And you can rerun it in a bash script and these things.

But this is just the tool. I think curl has an option to dump the libcurl code out of it.

Is that right?

Yeah. Yeah. So basically if you run this entire command line and you add --libcurl and the name of the file where you want the source to be saved, it'll do the entire thing and generate source code to do the same thing, which is a good bootstrap process for you to get a libcurl program.
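The workflow described here, replaying a browser request with its headers and then asking curl to write out equivalent libcurl source, can be sketched as a small script that assembles the command line. This is only an illustration: the URL, the header values, and the helper name build_curl_argv are placeholders invented for the sketch, while -H, --compressed, and --libcurl are existing curl options. The script only prints the command; it does not run it.

```python
import shlex

def build_curl_argv(url, headers, libcurl_out=None, compressed=False):
    """Build an argument list for the curl command line tool.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    argv = ["curl", url]
    for name, value in headers.items():
        # -H adds one request header, e.g. to mimic a browser's User-Agent
        argv += ["-H", f"{name}: {value}"]
    if compressed:
        # Ask for compressed responses; some curl builds disable this feature
        argv.append("--compressed")
    if libcurl_out:
        # --libcurl <file> makes curl write C source that repeats the transfer
        argv += ["--libcurl", libcurl_out]
    return argv

argv = build_curl_argv(
    "https://example.com/",  # placeholder URL
    {"User-Agent": "Mozilla/5.0 (placeholder)", "Accept": "text/html"},
    libcurl_out="test.c",
)
print(shlex.join(argv))
```

Leaving the cookie header out of the dictionary is deliberate in this sketch, since a copied browser command line carries the browser's real cookie.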

I'm suspicious here because I know Windows has done stuff in the past where they pretended to be curl.

And I don't know. I don't know. Not only, yeah, they have.

And then now they have this, there's a real curl in there as well. But they also have a sort of limited features that they have in that build.

So I'm not exactly sure.

Like that. Like that. Yeah. Okay. I can take that off. Fine. This is what happens when you go off script, not necessarily have the script.

So this happens because Microsoft, when they ship curl, they decided to disable that feature.

So they, so they, yeah, you wanted to remove the dashes there too. Sorry.

I just wanted to increase the font size a bit so anyone can see. So they disabled the automatic compression handling in curl.

So you can't do that with the built-in curl.

Oh, here we go. So that's the Cloudflare web page. Yeah. Yes. So what was the option you said to dump the libcurl?

The --libcurl and then the source, test.c, the name of the C file.

Like that. Bam. And when it's done now, we can check the test.c file.

I could do that in Windows. I don't know if I do. Nope.

You could probably do type, right? Type notepad. Yeah, there it is. I can invoke notepad from the command line.

So like that. Like about 30 lines of code. Yeah.

It's a very good bootstrapping technique I find. Yes, it is. It's a bit, I mean, since you did that copy as curl thing, it's important to notice that it's actually sending the exact cookie that the browser sent from the beginning, for example, which usually isn't what you want.

But anyway, it's a good start. Yeah. I'm not so worried in this case because it's just a very simple web page.

It doesn't require any user authentication to access anything.

But it's important to know if you're ever, not just with curl, but if you're saving HARs, or like we've had in the past, Peter to talk about Wireshark or Robin to talk about qlog.

If you're storing data that might contain sensitive details like that, it's always good to be aware and try and strip that information where you can.

I think as I mentioned, I'm reminded of some tool in Wireshark that can go through and try to strip stuff out.

So I might try and get Peter to remind me of that in the future. Right.

And of course, curl also supports both Wiresharking TLS connections and qlog, since you mentioned them.

And curl supports those two, of course. Yes. Not that version, maybe, that you're using here.

So yeah, let's go a bit into curl's support for...

Let's start with H2 because when I got into the world of HTTP, the H2 process was kind of nearing its end and just getting towards maybe probably the point we're at in QUIC right now, where there's still a few editorial things to sort out and maybe some final issues to resolve.

But more or less, there were a lot of implementations that were pretty...

Had mature support for a few different draft versions and were very well placed to kind of just spin it on as an RFC.

So how did from, say, like I presented in the past, you have HTTP semantics like request and response.

And from that, you kind of build in a piece of software that can do this and communicate using HTTP 1.1 and build a whole load of assumptions about that into the software, just because that's the only way to do things.

And then to transition to, say, like H2's binary format, how was that process for you?

Well, the transition to the binary format was pretty straightforward, I think, since we very early on decided that we wouldn't do the binary format ourselves, since there were other people doing...

Or competent people doing libraries for it, so we could use someone's library for it.

So then we decided, sure, curl knows all the HTTP stuff, but the actual wire format of H2 is done by a library; for H2 we use nghttp2.

So that was easy. But the harder part with H2 was rather streams, I mean, multiple streams over the same connection, and doing many transfers over fewer connections than when we were doing transfers in parallel.

And that was a pretty big transition internally.

And even for the APIs and everything, I think it's added a lot of things.

And we did that transition, I think, with some agony and headaches.

But I think we've sort of gone through that pretty good. And more or less by chance, I would say, maybe we were clever, but the API and everything handled that pretty good.

So we actually didn't have any really tight one-to-one connection between connections and requests in the API.

So we could just say, oh, by the way, nowadays you can do two transfers and over one connection, you just happen to use fewer connections.

And we hadn't done any of such promises in the API. So that was API and ABI-wise.

So once we've done that transition into code-wise, we could handle them separately.

That, of course, is complicated and we've had oceans of bugs because of that, but I think we've succeeded pretty good at resolving most of them.

We still have them, of course, every now and then. But so I think thanks to that transition going into H3 now, we have a lot of things already laid out in the proper order, sort of.

Now we just have to have ordered, I mean, have the other things that H3 is different from H2.

Like sockets.

I mean, we have to deal with them. We can't deal with them as sockets like we did before since there's no TCP anymore.

Now they're QUIC transports and we do H3 over that.

So one of the interesting things about curl compared to other implementations is that you are supporting multiple libraries that provide the QUIC functionality, right?

Yes.

One of those is Quiche, which is Cloudflare's library.

If you work on it long enough, you get a T-shirt. That's a lie. And another one is ngtcp2, which is kind of the successor to nghttp2, which is the library that you use for the H2 binary stuff.

Yes. And I mean, the story there is, for us, it's pretty simple, really.

We have a history of doing support for many different libraries for other parts of curl, right?

So we support a busload of different TLS libraries for TLS.

We support a few different ones for SSH. So when we started working on adding, well, it was only called QUIC back when we started supporting it.

So we said, oh, let's start fiddling with QUIC. And since I had this great track record with Tatsuhiro and the nghttp2 library, we figured, sort of, we took a quick glance at the libraries that existed and figured, yeah, this looks like maybe we want to go this route this time again.

So we started working on implementing support for that library.

And while we were doing that, it didn't really work.

Then Alessandro showed up and, hey, by the way, here's a patch to use Quiche.

And then, wow, that was a great way to actually get us kick-started to actually support QUIC.

I think it was actually HTTP/0.9 over QUIC, the first one he did.

Anyway, so that was sort of a, wow, we want to support that, of course, so that we could get some initial QUIC testing going.

So I just then decided, okay, so let's support both of them so we could try them both out and make sure that we get QUIC going and we get everything internally worked out before we got the ngtcp2 code working.

And then when we got both of them working, the backends are there and they both work.

So right now it's, I think, looking at how we've done TLS with many backends, and that has been really popular by users, I think it could be a good idea to just keep multiple backends for QUIC and h3 as well, as long as we manage to do the work.

Yeah, it's interesting. I think part of these issues, and even going back to h2, have been the dependencies on the TLS library to implement some kind of API service to allow it to be deployed.

And that was one of the problems with getting h2 deployed back in the day: first of all, you needed a version of OpenSSL, say, with Next Protocol Negotiation (NPN) in it and supported, and then that got changed to ALPN, and there was this horrible kind of transition phase.

And by having some agility to have different backends for TLS, you can allow people to make the trade-offs themselves.

I think it's very easy as a project to say, this is what we want to do, and it suits your needs. But I think for something that has so many deployment scenarios like curl does, even the language that something is implemented in can be really important for some people who are, say, space-constrained or RAM-constrained, those kinds of things, right?

Yeah, and we have a lot of users doing different kinds of embedded devices or whatever, and they want to build their own curl with very specific needs and desires.

So they are the ones that are usually going with those weird choices or maybe rare choices to the rest of us.

So yeah, it is a good idea for us to do this because we have users who like this kind of flexibility to go exactly where they please and where they think we should have feature-wise or license-wise or, as you said, footprint or performance or whatever.

And for us, it's not always easy to know which horse to bet on, right?

So which horse is the best one? I don't know.

If we go with all the horses at once, we at least pick the winner. Well, we pick the loser too, but...

If you can afford to lose, then you can also win, basically.

Right. So it's just a matter of if we can manage the work to handle multiple backends.

And I guess the worst thing that can happen is that we find ourselves in a position where we just realize that we can't deal with this many backends and we just have to sort of drop them or remove the support for one of them or something.

But I don't think that's... First, I don't think it's too likely to happen, but then if it happens, then we'll deal with it then.

Sure, it would be sad, but...

Even though I work on the Quiche project and I really want that to succeed or whatever, I do find...

You could call it competition, but it's not a competitive thing.

I think having diversity of implementation is important because it makes you look in on the API that Quiche provides, for instance, and to say, well, we might design the API for exactly what we want.

Our main focus is a server product because it's powering the Cloudflare edge, but the library supports clients and servers, and curl is a consumer of that client API.

So there's been cases where you've opened an issue on us to say, it would be really useful if you could do this.

And then we can say, yeah, but that's kind of hard. And then you can say, well, ngtcp2 does it.

But then we can say, okay. But also we can look at their implementation and say, okay, well, they've done it that way and they've made this kind of trade-off in order to provide that service.

We can provide something similar in a slightly different way, for instance, as an example.

Yeah. And also, that is of course also a reason why I want to be early on here to make sure that we can...

First, we can do that testing of your libraries and make sure that they actually perform the services that someone like curl would need and that it actually works that way.

So that it ends up a good library for whoever else wants to use it. And that is good.

And then also by doing that, we make sure that everything works and then we get a tool that is ready early on when people start trying to fiddle with their servers.

And I want to do all that debugging we mentioned earlier with curl, but now with the H3 servers that they're deploying now or later and everything.

So it goes hand in hand. And at the same time, of course, we verify that the protocol...

Before the protocol is done, we verify that everything works and we can be part of the process to get everything verified.

As we discussed this, I remembered...

So even before I joined Cloudflare, we would interact in mailing lists or whatever.

And that you quite early on set up a wiki page to talk about the requirements that curl would have for some theoretical library that might provide support.

And even those requirements are kind of useful, I think, for people to think.

It's very easy to build an integrated tool that does one thing, kind of where you started with, but to break that out into stuff like you don't want a library that has to take ownership of a socket, for instance, because that's something that you already do.

And this is the kind of processing model that would benefit and most likely other applications too.

Right. And handling everything non-blocking and so on like that.

Yeah. And some of that is, of course, really hard to figure out without actually getting your feet wet and actually diving into it and trying it out and being, oh, right, we need that little thing too.

I forgot about that. That little thing, for example, just, oh, how do we know when we've maxed out the number of streams on one connection?

And when you want to do another transfer, when do you actually create a new connection?

Because you're full on one connection.
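The question raised here, when to reuse a connection and when to open a new one because the existing one has no free streams, can be illustrated with a toy connection pool. This is a sketch under stated assumptions, not how curl or any QUIC library decides: the Connection and Pool classes are invented for the example, and it assumes the application can see a per-connection stream limit at all, which is exactly the information a library may or may not expose.

```python
from dataclasses import dataclass, field

@dataclass
class Connection:
    max_streams: int   # concurrency limit advertised by the peer (assumed visible)
    active: int = 0    # streams currently in flight on this connection

    def has_room(self):
        return self.active < self.max_streams

@dataclass
class Pool:
    max_streams: int
    connections: list = field(default_factory=list)

    def start_transfer(self):
        """Reuse a connection with spare stream capacity, else open a new one."""
        for conn in self.connections:
            if conn.has_room():
                conn.active += 1
                return conn
        # Every existing connection is full, so a new one is needed
        conn = Connection(self.max_streams, active=1)
        self.connections.append(conn)
        return conn

pool = Pool(max_streams=2)
for _ in range(5):
    pool.start_transfer()
print([c.active for c in pool.connections])  # [2, 2, 1]
```

With a limit of two streams, five transfers end up spread over three connections.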

Yeah. I like that problem scenario, because that's not very browser-like.

That's more curl-specific.

Yes.

And that's something kind of relevant to the things I work on to do with prioritization stuff.

But someone, it might have been Kazuho, actually, I can't quite remember, made the observation that because in H2 the streams are integrated into the H2 protocol, it seemed fairly reasonable to assume that an application using H2 would have access to those streams and understand the limits of H2, like flow control and stuff.

I don't know if curl cares about flow control or just leaves that up to the library, but other stuff like stream information is available, I believe.

But with QUIC, because it's the transport layer resource to manage, you don't necessarily find that the QUIC library implementation will provide that up to any application layers on the top.

They specify, maybe, say in the case of Quiche, you initialize it with an initial number of streams that you're willing to support, say 100, that you would take up to 100.

And after that point, you've exhausted... sorry, your peer's exhausted the credit, and you would send some more when you're ready.

And in Quiche's design, it will basically maintain that buffer.

As those streams are completed, whereas, like, curl would say... sorry, no, it's easier to use a server as the example.

Because the server completes the stream, sends a FIN bit, that stream's done, and then it would issue new credit.
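The credit mechanism described above can be modelled in a few lines. The sketch below is a toy, not Quiche's API: the StreamCredit class and its method names are invented for illustration. It relies on one detail of QUIC, namely that the stream limit is a cumulative count of streams that may be opened over the lifetime of the connection, so issuing new credit means raising that count.

```python
class StreamCredit:
    """Toy model of a receiver granting stream credit to its peer."""

    def __init__(self, initial_max):
        self.max_streams = initial_max  # cumulative limit advertised to the peer
        self.opened = 0                 # streams the peer has opened so far

    def peer_opens_stream(self):
        if self.opened >= self.max_streams:
            raise RuntimeError("peer exceeded its stream credit")
        self.opened += 1

    def stream_finished(self):
        """A stream completed (e.g. FIN sent): raise the limit by one."""
        self.max_streams += 1
        # The new value is what would go out in a credit update to the peer
        return self.max_streams

credit = StreamCredit(initial_max=100)
for _ in range(100):
    credit.peer_opens_stream()

# The peer is now out of credit; one stream completes and the limit grows.
new_limit = credit.stream_finished()
credit.peer_opens_stream()
print(new_limit, credit.opened)  # 101 101
```

In this model the layer above never sees the update go out, which matches the design point being made: the bookkeeping can live entirely in the transport.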

But you don't need to let the H3 layer know that thing happened, that the window update for the credits went out.

Because what can it do with that information that makes any sense?

It does help, because then you could test if you have enough remaining credit to take an action.

But it's kind of a philosophical view that if you start to surface too much information from the protocol layer, then the application might try to get involved in things it doesn't understand, or that you leak information, and you can't take that back.

And as you want to change the internal model, or adopt extensions, or different kinds of like flow controls, or congestion control, that suddenly you get kind of stuck on trying to maintain an API.

I think we're still trying to figure this out, and there's no one answer for implementations.

That's why it'd be quite interesting, because Quiche and ngtcp2 are kind of similar in a lot of things, but if some completely different model of doing QUIC wanted to write a backend for curl, how that might work.

Yeah, it would be interesting.

I think right now, I also appreciate that you both have decided to go with very different approaches for API for the client anyway.

So, implementing the, driving the libraries, that's still pretty different.

So, it makes sure that we actually get things done properly in curl, because we can't really attune to just one of your ways to do it, and sort of get glued into that universe.

But we have to make sure that we actually can handle both ways.

And so, I think by doing this, we actually make sure that our handling of H3 is decent.

And so far, I think it works pretty good.

We have the, so there's this pretty solid existing function now.

So, we even do Happy Eyeballs properly with QUIC, and we can do QUIC connections.

I don't have fallbacks working properly when the QUIC connections fail.

So, I have that to solve. Since we expect QUIC connections to fail pretty often, I need to fall back better, especially with alt services.

So, you know, when you switch over, and then it turns out it doesn't work, you have to go back and try the other alt services in some order, and then possibly go back to the origin and try that.
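The fallback order sketched in words above (try the advertised alternatives in some order, then go back to the origin) might look roughly like this. It is a minimal illustration, not curl's logic: the function name, the endpoint tuples, and the fake_connect stand-in are all hypothetical, and a real client would also deal with timeouts and racing attempts.

```python
def connect_with_fallback(origin, alt_services, try_connect):
    """Try each alternative in order, then fall back to the origin.

    Returns the endpoint that worked (or None) and the list of attempts.
    """
    attempts = []
    for candidate in list(alt_services) + [origin]:
        attempts.append(candidate)
        if try_connect(candidate):
            return candidate, attempts
    return None, attempts

# Hypothetical endpoints: (protocol, host, port)
origin = ("h2", "example.com", 443)
alts = [("h3", "example.com", 443), ("h3", "alt.example.com", 8443)]

def fake_connect(endpoint):
    # Stand-in for a real connection attempt: pretend every QUIC attempt fails
    return endpoint[0] != "h3"

chosen, attempts = connect_with_fallback(origin, alts, fake_connect)
print(chosen, len(attempts))  # ('h2', 'example.com', 443) 3
```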

Yeah. And that's the. So, like a couple of years ago, at a curl up in Stockholm, was it?

I presented like, yes, a 15 minute brain dump on, here's all the problems trying to implement alt service.

Like, the spec, the spec is here.

It is descriptive of how the thing works. So, this is an advert in a header that says, oh, you could also retrieve this website using a different protocol or a different IP address or whatever.
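To make the advertisement concrete: an Alt-Svc header value such as `h3=":443"; ma=3600` names a protocol, an authority (host and port), and a maximum age in seconds. The parser below is a deliberately simplified sketch, not a complete implementation of the header grammar: the function name is invented, it ignores parameters other than ma, and it assumes no commas or semicolons inside quoted strings. The 24-hour default for a missing ma follows the Alt-Svc specification.

```python
def parse_alt_svc(value):
    """Parse a simplified Alt-Svc header value into a list of dicts."""
    value = value.strip()
    if value == "clear":
        # "clear" tells the client to forget all alternatives for the origin
        return []
    entries = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        proto, _, authority = parts[0].partition("=")
        # The authority is a quoted "host:port"; an empty host means same host
        host, _, port = authority.strip('"').rpartition(":")
        entry = {"protocol": proto.strip(), "host": host,
                 "port": int(port), "ma": 86400}  # default freshness: 24 hours
        for param in parts[1:]:
            key, _, val = param.partition("=")
            if key.strip() == "ma":
                entry["ma"] = int(val.strip().strip('"'))
        entries.append(entry)
    return entries

for alt in parse_alt_svc('h3=":443"; ma=3600, h2="alt.example.com:8443"'):
    print(alt)
```

A client would then have to remember each entry for ma seconds, which is where the caching questions come from.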

Okay, that seems pretty easy, but there's a lot of challenges to actually, when implementing that at all, and then making it robust.

It was really easy for me just to go, here's a load of problems, like, throw a grenade, or to just illustrate, like, sorry, you're going to have all this trouble in the future.

In the meantime, both you and James Fuller have been working on actually, like, implementing something.

I've not, I'll be honest, I haven't kept a close track on that development, so.

Ah, it's still early days.

So, we support alt service now, so you can actually switch between the different versions and so on.

So, it works to some degree, but we don't have that fallback, so we don't have everything proper.

And I've also, of course, discussed this, and one of the interesting things with H3 is how we are going to support H3 going forward more.

How do we know when servers are on H3? Okay, we can do that with alt service, that's the way to do it according to the spec, but then we need to cache that information somewhere.

And imagine a case when you do the curl command line tool, where would it cache that information?

I don't want to have a global cache somewhere, so, but that'll be terrible, because if you're going to go alt service every time with curl, then we would have to go, you know, waste a round trip for every H3 request we do, basically, with curl.

Not ideal. So, of course, I'm looking forward to see where the HTTPS DNS record also goes, but that is also another problematic area, of course, for a tool like curl to, you know, get a new DNS record and parse that.

Yeah. So, for anyone who is maybe watching and not familiar with that record, it's basically a way to put the alt service advertisement in the DNS; in this case, the one we care about is H3.

So, you would be able to query the DNS and discover that, I don't know, something like cloudflare-quic.com is available on H3 and just connect immediately to it.

You wouldn't have to do this kind of upgrade hop and eat some round trips in order to switch your protocol.

It's really intended to avoid additional round trips, which is why the caching of the alt service is so important, but then you get into the whole problem of caching is hard.

Yeah. And you can imagine, except when you have a bunch of different applications, all speaking H3, should they all have a cache?

Or should they have a common cache? Yeah. Yeah.

And, you know, a few years ago, I was working with an old colleague of mine, Sam Hurst, and we were trying to puzzle through this, because we were using alt service for a weird multicast thing that still has a specification out there, but isn't related to, like, the core QUIC work that was happening.

We were trying to puzzle through this, and it's like, well, every time you're on the alternative, so, say, an H3 connection, it can keep sending you alt services that effectively refresh the lease or the advertisement on itself, or it could redirect you, and you have all these chains of things. And Sam kind of wrote a library to implement this stuff, which worked pretty well, but it seemed to me that this would be an ideal opportunity for someone to write a system daemon that would capture alt service information for any HTTP-based application.

That sounds great, but in practice, it's probably very hard to, again, define an API for that and to actually maintain and support the thing and convince others that they'll need it, and maybe back then, I was, like, my opinion was that this is, like, DNS, so you'd go to your system resolver, and it manages these things, and in the meantime, we've seen a rise in, like, DNS over HTTPS, and for some browsers, they implement their own DNS, right?

Yes. Yeah, but not even those do them for all platforms, so there's still that problem.

To request new DNS records is going to be problematic for everyone who doesn't do DNS over HTTPS, basically, and while that might be fine for a start, I still think it's a challenge how to do that in a good way.

Yeah, so I'm familiar with that technology. It's gone through a lot of different iterations and a lot of kind of design around is it just about alt service or there's other things that benefit from being in the DNS and learning about them before you proactively make an HTTP connection or any kind of connection, so I think there's multiple benefits that we could eventually get there.

Right, so I think we're going there, or rather, I hope so.

I'm an optimist, so I always think we're going everywhere, but sometimes we don't, so yeah, but still, it'll be interesting to implement support for that.

I have a bunch of contributors who are longing to support, well, it's called ECH now, right?

Encrypted Client Hello for TLS, and that is related then also.

Yeah, exactly. You answered the thing I forgot about, basically, so yeah, I think without trying to get on a HTTPS everywhere soapbox or anything, but there's some initial resistance sometimes to these foundational technologies of like, well, why do we really need that thing?

It's useful, but the use is minimal, but actually what they do is incrementally you can build up towards something that is good and secure and is private, but it relies on having built all those foundations first, so it can be hard to kind of just on spec look at something and justify the work required to implement that thing, but yeah, over time.

Yeah, the fun never ends, right?

And people tend to ask me if curl is ever done, but anyone who's involved in Internet and protocols knows that it changes and moves all the time.

Yes, so the topic of this segment is about web performance, and we're going to talk a bit about the implementation aspects of protocol stuff, but I don't know if people use curl for any means of assessing either like, is my site fast, or more is it like an ability to use curl to assess the performance of a server, because these two things always go in parallel, like how many requests per second could a server handle, or looking at metrics related to more connection-oriented aspects, like could you talk to any of those points?

Well, I think some people do various kinds of web load or performance testing with curl, and curl has some information you can read out, some metrics: for example, how fast the connection is set up, or where different things take time in a connection handshake, or the different phases in the connection setup, and so on. So that's commonly used.

But curl in itself is more focused on transfers, right? So it's not really optimized for, or set up to do, performance testing. It's not made for doing an intense load just to make sure that the server is okay; it's actually made to do lots of transfers, which might sometimes look like the same thing, but it's really not. So if you really want to load your server a lot, you'd probably use a more dedicated load-testing tool instead of curl.

But of course, we also have really, really large volume users of curl, who are very keen on making sure that curl can actually scale and perform well under extreme use. So we also make quite a big effort to make sure that we can handle, you know, thousands of parallel transfers over a number of connections on a typical server. I think our biggest volume users of curl are at around 1 million requests per second on average, so some of our users are really high volume users.
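The timing metrics mentioned here are exposed by the curl tool through its --write-out (-w) option, with variables such as time_namelookup, time_connect, time_appconnect, time_starttransfer, and time_total. The sketch below builds a format string for them and turns curl's cumulative timestamps into per-phase durations. The helper names and the sample numbers are made up for illustration, the phase labels are a rough interpretation, and what each timestamp means for a QUIC connection may differ from the TCP plus TLS case assumed here.

```python
FIELDS = ["time_namelookup", "time_connect", "time_appconnect",
          "time_starttransfer", "time_total"]

def write_out_format(fields=FIELDS):
    """Build a format string for curl's --write-out option, one field per line."""
    # Produces e.g. time_total=%{time_total} followed by a literal backslash-n,
    # which curl itself expands to a newline.
    return "".join(f"{name}=%{{{name}}}\\n" for name in fields)

def parse_timings(output):
    """Parse 'name=seconds' lines printed by curl into a dict of floats."""
    timings = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep and name in FIELDS:
            timings[name] = float(value)
    return timings

def phases(t):
    """Derive rough per-phase durations from curl's cumulative timestamps."""
    return {
        "dns": t["time_namelookup"],
        "connect": t["time_connect"] - t["time_namelookup"],
        "tls": t["time_appconnect"] - t["time_connect"],
        "first_byte": t["time_starttransfer"] - t["time_appconnect"],
        "download": t["time_total"] - t["time_starttransfer"],
    }

# Possible usage: curl -s -o /dev/null -w "<format>" https://example.com/
# The sample below is invented output, not a real measurement.
sample = ("time_namelookup=0.010\ntime_connect=0.030\ntime_appconnect=0.080\n"
          "time_starttransfer=0.120\ntime_total=0.150\n")
for phase, seconds in phases(parse_timings(sample)).items():
    print(f"{phase}: {seconds:.3f}s")
```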

Yeah, I don't know, but then of course, I don't know how many machines they have, so maybe they're just distributed on 20 million machines too, then it's not that high volume, but I don't know, I don't have any insights in how they actually do it.
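As an illustration of the connection timing metrics mentioned above, here is a minimal Python sketch that asks the curl command-line tool for its per-phase timers through `--write-out` and turns them into durations. The `time_*` names are curl's own write-out variables; the helper function names and the phase labels are assumptions made for this example, and the breakdown assumes an HTTPS transfer over TCP.

```python
# Timers that curl can report via --write-out. Each value is the number of
# seconds from the start of the transfer until that phase completed.
TIMING_VARS = [
    "time_namelookup",
    "time_connect",
    "time_appconnect",
    "time_starttransfer",
    "time_total",
]


def build_curl_command(url):
    """Return an argv list that makes curl print one 'name=value' line per timer."""
    # The literal backslash-n is expanded to a newline by curl itself.
    fmt = "".join(f"{name}=%{{{name}}}\\n" for name in TIMING_VARS)
    return ["curl", "--silent", "--output", "/dev/null", "--write-out", fmt, url]


def parse_timings(output):
    """Parse the 'name=value' lines printed by curl into a dict of floats."""
    timings = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep and name in TIMING_VARS:
            timings[name] = float(value)
    return timings


def phase_durations(timings):
    """Convert curl's cumulative timers into per-phase durations (seconds)."""
    return {
        "dns": timings["time_namelookup"],
        "tcp_connect": timings["time_connect"] - timings["time_namelookup"],
        "tls_handshake": timings["time_appconnect"] - timings["time_connect"],
        "wait_first_byte": timings["time_starttransfer"] - timings["time_appconnect"],
        "download": timings["time_total"] - timings["time_starttransfer"],
    }
```

To use it, run the command with something like `subprocess.run(build_curl_command(url), capture_output=True, text=True)` and pass the captured standard output to `parse_timings`. This measures one transfer at a time, which matches the point above that curl is a transfer tool rather than a load generator.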

One of the cool things with curl is, like, all of the configuration options that are provided for different stuff, right? Whereas if you're using a browser to test a website or for actions or something, you're basically beholden to how the browser wanted to implement the protocol. So on top of what the standard says, you end up with a meta requirement of "the browser does it this way": it might impose limits on the size of headers or the number of concurrent streams that it would actually use. So for instance, a server could advertise 256 concurrent streams, but the browser has implemented some performance-oriented heuristic so that it would only open at most eight streams in the connection at any one time. So for me, it's really cool that I know I have the capability to override that fairly straightforwardly.

Yeah, and I sometimes like to be part of all these discussions because there are a lot of browser people involved who tend to look at things from a browser-centric world, where they use these protocols to render web pages and stream video. In my case, of course, you can do that with curl too, but in many cases my users are different. Our users are rarely focused on optimizing for rendering web pages; some are, but most of them are not, they do other things. So then, of course, there's stuff like that: reaching the limits on the number of transfers on a connection and the number of connections. And yeah, it's also challenging sometimes, but I guess software is always tricky.

Challenging in what ways? Like, I don't know, people want so many concurrent streams?

Yeah, and I mean, just sort of the weird cases you can run into when you do it, really high volume things, and when people are really taking advantage of all that and really exercise everything to that degree, yeah, it's just hard to reproduce and figure out things and, you know, you do just thousands of streams and connections and every once in a while something happens that you didn't really understand.

It's complicated. Yeah, and yeah, getting reproductions of those things is hard if they've been testing against, like, an internal.

Yeah, like, I've been fighting with a bug just actually recently, which happens after exactly one hour with H2, and what do you mean one hour?

Yeah, and then, you know, they shut down connections if it's idle, so you need to keep it busy for an hour, and then the bug happens.

Then you have, you know, hundreds or thousands of requests, and then suddenly the bug happens after exactly one hour, and it's really, yeah, yeah, so I have to run the debugger for an hour, and then suddenly it happens, and then stuff like that.

This has reminded me of something, actually, and I'm gonna ask you, because I think I know the answer, but I'd be happy to be proven wrong, but say, like, looking at H2 and all the bugs that have ever been, like, reported on H2, have you got a sense of what is the main, like, the primary root cause of those bug reports?

Well, yeah, there's Nginx max requests config.

I don't remember the name of it, but HTTP2 max requests, I think it's a thousand by default in Nginx, and then it sends a GOAWAY, so yeah, and actually, we've had bugs on GOAWAY in curl.

I still think we have at least one bug related to GOAWAY still there, one that's been triggered with Nginx, so yes.

Yeah. Apparently, Apache has the same config, but I don't think it's a thousand by default, so Nginx is much more aggressive there.

And the interesting thing with this for me, because I've hit this myself, and it took me, even though I'm familiar with the protocol, again, this is a case where the implementation has chosen a rule, and so, if you're not that familiar with the implementations and these kind of little details that happen, then it can be easy to focus in the wrong areas, and this is where getting some PCAPs and looking at exactly what's happened at that.

If you're in a browser, what you'll generally see is, like, an error, or it used to be ERR_SPDY_PROTOCOL_ERROR or something, which is useless, but perhaps you're trying to figure out where it happened. But yeah, in HTTP/2, you have stream IDs, and you have a limit on the concurrency of how many streams can be active, but the only limit on the total number of streams in the connection is the stream ID, which is a hard limit when you run out, and I think it's a 31-bit number.

Yeah. That's a lot of streams you could send in a connection.

Yeah, it's rare to run out of that, actually, on a connection. Yeah, and in QUIC, I think it's, like, 62 bits, basically, and someone did a calculation, like, even if you sent streams really frequently, the universe would expire before you used up that stream ID space. But in the interest of making sure that connections don't get into, like, a runaway train state, where some client gets in a weird loop and just continues to hammer a server with requests, Nginx, by default, has a limit of 1,000 requests that it will handle, and after that, it will just tell the client to go away, and then the client can come back. But, you know, although the protocol defines all of these things, it is quite tricky to implement properly.
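To put rough numbers on the stream ID limits discussed here, the following Python sketch counts how many client-initiated request streams fit in the ID space of each protocol and how long it would take to use them up. It relies on two facts from the protocol specifications: HTTP/2 stream IDs are 31 bits with clients using the odd values, and QUIC stream IDs are 62 bits with the two low bits encoding the stream type. The function names and the request rate are assumptions chosen for illustration, and the result depends heavily on the rate you assume.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def http2_client_streams():
    """Client-initiated HTTP/2 streams use the odd IDs of a 31-bit space."""
    return 2 ** 31 // 2


def quic_client_bidi_streams():
    """QUIC stream IDs are 62 bits; the two low bits select the stream type,
    leaving 60 bits for client-initiated bidirectional streams."""
    return 2 ** 62 // 4


def years_to_exhaust(streams, requests_per_second):
    """Years needed to use every stream ID at a constant request rate."""
    return streams / requests_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = 1_000  # assumed requests per second on one connection
    for name, count in [
        ("HTTP/2", http2_client_streams()),
        ("QUIC", quic_client_bidi_streams()),
    ]:
        print(f"{name}: {count} streams, {years_to_exhaust(count, rate):.3g} years at {rate} req/s")
```

Either way, a server-side cap of 1,000 requests per connection is many orders of magnitude below what the stream ID space itself allows.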

Personally, I've seen issues in some browsers, too, and people just expect this to work.

Actually, I would say that there are actually more problems with that.

First, if you're from Nginx and looking at this, you should fix it and bump the default.

I've actually whined at them before, and they didn't have any good explanation for that 1,000.

They had a memory leak at some point, and some customer of theirs complained on it, and then they solved it like that, which was a bit, yes, sure, you did it, but, okay, fix it the proper way.

You could bump it a few orders of magnitudes, and everything would be better.

So, I think the problem is that the config is there, and a lot of users, they don't really, the documentation hasn't been great.

I don't know. I haven't checked it recently, either, but maybe it's much better now, but they didn't really explain it in the config, so users are there, and, you know, things aren't really going well with my server.

I better try to fiddle with this little thing, and in one of my first bug reports, they had lowered that value to lower than the maximum concurrency limit, so basically, you couldn't get up to the concurrency limit before it sent GOAWAY, and then I get a bug report.

Curl acts weirdly on my server, because, and, yes, and often with the browsers, they don't get up to that limit, depending on situations, of course, but, so, yes.
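The interaction between a per-connection request cap and the concurrency limit can be sketched with a small model. This is only an illustration under simple assumptions: the server sends GOAWAY after a fixed number of requests on a connection, and the client then opens a new connection. The function and parameter names are made up for this example and are not Nginx or curl identifiers.

```python
import math


def connections_needed(total_requests, max_requests_per_connection):
    """Connections a client has to open if each one is closed with GOAWAY
    after max_requests_per_connection requests."""
    return math.ceil(total_requests / max_requests_per_connection)


def peak_concurrency(max_concurrent_streams, max_requests_per_connection):
    """Upper bound on simultaneously open streams on one connection.

    A connection can never carry more streams in total than the request cap,
    so a cap below the advertised concurrency limit becomes the real limit.
    """
    return min(max_concurrent_streams, max_requests_per_connection)


if __name__ == "__main__":
    # Illustrative values: a cap of 1,000 requests and 128 concurrent streams.
    print(connections_needed(10_000, 1_000))
    print(peak_concurrency(128, 1_000))
    # The misconfiguration from the bug report: cap lowered below concurrency.
    print(peak_concurrency(128, 100))
```

In the misconfigured case the client is told to go away before it can ever reach the concurrency it was advertised, which is the kind of situation that exercises GOAWAY handling much more often than usual.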

I've seen it in some cases, so this is where I've hit it, actually, when doing some local testing of HTTP/3 using new web browsers like Firefox Nightly, which introduced support, and, you know, we're all working through this together when we're hitting different bugs and whatever, so I was testing, and, you know, I do a page load, and everything works fine.

We've got a weird test case that does load a lot of stuff on a web page.

It's a bit abnormal, and that worked okay, and then I did it again, because I forgot to open the dev tools, because I wanted to look in, so I'll just reload the page, and that worked okay, so, like, wrote down some numbers, and, like, any good experiment, you want to, you know, do the same test a couple of times to see, and then after, like, a couple of page loads, everything had broken, like Firefox could no longer connect to the Internet kind of levels, and it's because, yes, there was a max request that had been configured that told it to go away, and obviously, this has tripped up something.

And, actually, I didn't really know that.

I also learned this just the other day. They also had a bug in their GOAWAY treatment, which they also fixed just recently, so it wasn't always just the client side.

It was also a server-side problem with that GOAWAY handling, so, but, okay, I shouldn't blame them.

We all have bugs. It's just tricky, because, yes, a lot of the time, you optimise for the main use case, which is the page, and I think, like, Patrick McManus, who has done a lot of HTTP stuff, who used to work for Firefox, had a statistic that, like, a humongous per cent of connections in HTTP/1.1 at the time would just be, like, opened to make one request, and then closed, and you'd have to.

Yeah, I think the median, exactly, the median number of requests per connection is one with HTTP/1.1 on Firefox, which is fun.

The great hope was that H2 would, you know, bring all of these things together and allow multiplexing, and, yeah, I think the actual concurrent streams are still pretty low, actually, because once you factor in stuff like caching and all of these things.

Yes, and pre-connects and stuff like that, so, yes, you waste a lot of things, but as I remember, at least from the numbers Patrick mentioned years ago, the median number of requests per connection for H2 was eight compared to one, so it was at least eight times better, and I don't know, things might have changed since then, that was years ago, at another HTTP Workshop.

Yeah, this is, we usually get an annual check-in.

Exactly. Kind of an update on these metrics. We missed that this year, we're completely lost now.

Yeah, we'll just run on old statistics, but no, like, you mentioned the workshop, so, you know, Daniel's very good at taking the notes from those workshops.

We tend to focus on the discussion more than taking meeting minutes and stuff like this, so I constantly use those as a source to refer back to things, which is very useful.

The presentations from those sessions are online, they're all on a GitHub somewhere, but it helps to get, like, the one-paragraph summary of what was presented and what did the room think, and the last time we were there, we talked about prioritisation and all the problems and how we might solve it.

We're deep in that process right now.

Yeah. So, I couldn't remember if curl, like, allows, so this would be libcurl, control over stream prioritisation.

It does. Yeah, we have a user that wanted it badly, so we did it.

I don't know how much it's used by anyone else or if at all, so we rarely get any questions, we rarely get any bug reports.

I would imagine that very few are actually using it, so it's hard to tell.

That's one of our problems, right, when we do features or whatever, we don't really know when people are using it or if it works or if it's successful or not, just, we just throw it out there and ask once a year and see if our users say they're using it, and then I have this problem every year when we ask users in our survey, are you using these features, and a lot of users say, sure, we use it, and I'm always, do they really, and especially as I ask them about features, I know that people aren't really using, and a lot of people say they are using them, so I don't know.

Yeah, I've seen some of them, so this is the curl annual survey, right?

Yeah. If anyone's not familiar with them, the kind of, the annual report based on the survey is always quite amusing to me at least.

Yeah, and that sort of, yeah, that shows that people said they were using HTTP pipelining way more than I, they could have for a fact.

Yeah, I removed the feature, people are still saying they use it, so maybe they don't understand what they said yes to here.

I don't know. I also always get that feeling that people say yes just to make sure that I don't remove support for anything.

Do you use HTTP version 0.9?

The, right now, the QUIC working group do, so you mentioned that earlier, as one of the first application mappings on top of QUIC, this is, it's still a good one, because it's a very easy way to bootstrap the streaming behavior of a QUIC transport.

So, you know, you can create a stream and say to the other endpoint, the server, send me back this much data, and that's about all you need in order to test like a whole load of stuff around QUIC implementation behaviors.

You could create something else, but it's like, it's so easy, like, and this is the great allure of HTTP, it's like, it's so easy, you can just type it in, and then you end up 20 years later saying, oh no, why did they do it this way?

Yeah, and I get it why it's there, because it's also very easy to implement.

So, if you don't have an HTTP stack, you can just, well, send an HTTP 0.9 thing, because it's just easy to do.

For me, it was sort of, was more like, yeah, so I supported that one originally for a very brief moment, but then it was no point for me, since I then went with libraries that support the full approach, so it was much sort of better to just do it proper and adjust that instead.

Yeah, and towards the start of the call, we mentioned QLOG and, like, Wireshark support, so when we build with either QUIC backend now in curl, I don't know, yeah, you've got some nice instructions written up for how to do that, actually, just to build curl with HTTP/3 support.

I don't know if I can find the URL before, or maybe.

Yeah, it's in the HTTP3, it's in the docs folder, HTTP3.md.

It describes how to, because the HTTP/3 support is still experimental in curl, so we don't enable it by default, so if you want to build a curl with HTTP/3 support, you have to actually build it yourself and enable it manually in the build.

Yeah, but yeah, you've also basically added support for environment variables that will allow dumping of a QLOG or a PCAP.

Exactly, so you can do the QLOG dumping automatically by setting the QLOGDIR environment variable when you start curl, and then it'll save QLOGs for the connections that it creates in that command line invocation, and you can do the SSLKEYLOGFILE thing that you can do with Firefox and Chrome to run Wireshark on HTTPS, for example, or any TLS-based protocol, actually.
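A minimal Python sketch of that workflow follows: it prepares a curl invocation with the two environment variables set so that qlog files and TLS key material are written alongside the transfer. `QLOGDIR`, `SSLKEYLOGFILE` and the `--http3` option are taken from curl itself, but they only have an effect with a curl build that has HTTP/3 enabled and a TLS backend that supports key logging. The function name, the URL and the paths in the usage note are hypothetical.

```python
import os


def build_invocation(url, qlog_dir, keylog_path, base_env=None):
    """Return (argv, env) for a curl HTTP/3 transfer that also dumps
    qlog files and TLS secrets for later analysis."""
    # Start from the current environment unless the caller supplies one.
    env = dict(os.environ if base_env is None else base_env)
    env["QLOGDIR"] = qlog_dir            # directory where qlog files are written
    env["SSLKEYLOGFILE"] = keylog_path   # key log file that Wireshark can read
    argv = ["curl", "--http3", "--silent", "--output", os.devnull, url]
    return argv, env
```

A call such as `argv, env = build_invocation("https://example.com/", "/tmp/qlogs", "/tmp/tls-keys.log")` followed by `subprocess.run(argv, env=env, check=True)` would then perform the transfer. Pointing Wireshark at the same key log file lets it decrypt a packet capture taken during the run.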

I can't find right now the official curl instructions, but while I was looking, I was reminded of my colleague, Juno, who created a brew, you know, we call it recipe, so if you're on Mac, you can run this thing, and it should basically build a version of curl with the Quiche backend for you, make that available, and you can see some of the options here that we've talked about with alt service, and the alt service cache to a file, version 3 there, or if you want to ignore that, just kind of force HTTP/3.

Again, I think it's a really useful feature, because sometimes, at least with testing, you don't care about the transfer, you just want to test the connectivity, so by dumping out the PCAP at the same time, you can start to diagnose, say, issues with UDP, V4, V6, like, connectivity issues, or other stuff.

It's a very quick way of bootstrapping these things, if you're familiar with curl as well.

I think that's a benefit, like, everyone's building their bespoke tools on top of their library, but curl is really familiar for a lot of people, too, and all they need is, like, the HTTP/3 flag.

Anyway, before I run out of time, I'd just like to thank you, Daniel, for coming on the show.

I don't know if you have anything in the one or two minutes we have.

Fuzziness, if you'd like to say. Just that if you want to try out H3 with curl, you should, of course, just dive in and build it and try it, and then file bugs, if you find any, or submit pull requests if you fix any problems.

We have a bunch of things still not done proper, so H3 is there, and it works, at least for single or serial transfers, but there are more things to do, so if you're interested in this stuff, so just get involved and get things going.

Yeah, they're a friendly bunch of developers, but also Daniel's done a lot of great books and presentations on QUIC and HTTP/3, so if, say, this segment hasn't described it so well or in a way that helps you understand things, Daniel's presentations are a great resource to catch up or get a different kind of learning.
