The Long Road to HTTP/3
Presented by: David Belson, Lucas Pardue
Originally aired on July 14, 2023 @ 8:00 AM - 8:30 AM EDT
Join Cloudflare's David Belson and Lucas Pardue for a deep dive on the journey to HTTP/3, including the RFC process, Cloudflare support for HTTP/3, and usage trends covered in our recent blog post.
Transcript (Beta)
Good morning, good afternoon, and good evening, everyone. Thank you for joining us today on Cloudflare TV to talk about the long road to HTTP/3.
My name is David Belson.
I am part of the Cloudflare Radar team, head of Data Insight at Cloudflare. And I'm here with Lucas Pardue.
Lucas, can you introduce yourself? Hello, everybody. I am Lucas Pardue.
I am a senior software engineer on the Protocols team, based out of London in Cloudflare's offices opposite the Houses of Parliament, which are sometimes under scaffolding and sometimes not.
And apart from that, I also spend some time in the IETF as the QUIC Working Group co-chair.
Excellent, thank you. So that makes you eminently qualified for today's topic.
So I know that HTTP/3 was finally, after a long wait, released as an RFC last week.
So can you talk a little bit about HTTP/3?
What is it? How does it differ from the previous versions of HTTP? I think the last time I looked at an HTTP spec was probably 1.0, back when I was nerding around in the late 90s.
Yeah, I guess it's always good to kind of start with just a base level.
What are we talking about with HTTP? I would have hoped people who tuned into this segment maybe tuned in explicitly because they know something about the topic, but maybe not, maybe it's just come up.
So HTTP is an application-level protocol that's used across the whole web.
It's kind of the engine of the web, powering your interactions when you're using something like a web browser to load up pages, and also the more app-like or API-based interactions we're seeing these days that let you exchange information and present it in different ways.
Most of the time, you don't need to care about this protocol.
You focus on web technologies like HTML. Often the two get confused because they sound similar, but they're different.
HTTP is the protocol that clients, the thing on your device, use to speak to servers, like the Cloudflare edge or something like that.
This has been around for decades, right? It's like you said, there's a version 1.0, there's a version two, now there's a version three.
And so everything's iterative.
We started off with a fairly simple protocol, Tim Berners-Lee at CERN, effectively typing out commands into a terminal that would stream directly to a server that could interpret them and come up with a response: get me this thing.
Here's your response. Brilliant. Early versions were fairly simple. Yeah, yeah, exactly.
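To make that concrete, here's roughly what one of those early exchanges looked like on the wire. In the HTTP/0.9 style, the request was a single line, and the response was just the raw document bytes, after which the server closed the connection (the path here is a placeholder):

```
GET /page.html

<html>...the document bytes, then the connection closes...</html>
```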
And over time, we needed more capabilities, the richness, multimedia, these kinds of things, a semantic layer of content negotiation or compression, like those kinds of features.
But the underlying wire format was effectively textual still.
And that was okay. I mean, this is an application layer protocol.
That's one thing. In order to actually connect between a client and a server, we need a transport protocol underneath that.
So we have things like TCP/IP, which is a generic way to open and establish bidirectional communication between endpoints.
And that presents a logical, reliable, in-order byte stream between a client and a server.
And that's excellent for exchanging a textual byte stream.
It's like a perfect match between those two things. And that mapping, the application mapping between what HTTP was and TCP was very straightforward and made a lot of sense.
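That one-to-one mapping is easy to see in code. Here's a minimal sketch in Rust, using only the standard library, of writing an HTTP/1.1 request straight onto a TCP byte stream (example.com is a stand-in host, and error handling is kept minimal):

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    // One TCP connection: a reliable, in-order byte stream to the server.
    let mut stream = TcpStream::connect("example.com:80")?;

    // An HTTP/1.1 request is just ASCII text written onto that stream.
    stream.write_all(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")?;

    // The response comes back over the same byte stream.
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    println!("{response}");
    Ok(())
}
```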
Obviously, that was just a plain-text mapping. And this is the 0.9, 1.0, and 1.1 versions of HTTP.
Yeah, in the old days. The days when I was younger.
Over time, we realized plain-text communications on the Internet aren't great.
So we have this concept of TLS, transport layer security, which provides a means for a secure negotiation between those endpoints.
So it provides not just protection and encryption, but confidentiality of your information, and integrity, making sure that what you're sending arrives at the other end and hasn't been manipulated.
And authenticity, you know who you're actually speaking to.
So although it's different on the wire, you can't just look at it and say, oh, I can see these characters going across, the application layer above this TLS thing stays the same; the encryption and decryption happen at an abstracted layer.
But HTTPS, the HTTP exchanged above TLS, was still textual.
And that was fine, but pages grew in complexity.
We want to load lots of different resources for a page, lots of scripts, HTML, images, all at the same time, maybe from the website you're connecting to, maybe from additional websites because there's some third-party content or scripts or other kinds of stuff.
So this is mainly from the browser's perspective.
They did clever tricks to improve web page performance by opening up multiple TCP connections and multiple HTTP sessions over the top of them, all in order to do this kind of multiplex of grabbing multiple things at the same time.
And that was that.
It kind of worked okay, but it has costs, like there's just a cost to maintaining TCP connections.
There's an actual cost on a server: if every client can open 60 TCP connections to a single server just to load a web page, that costs memory.
And if each of those connections only sends one thing back and forth, it's kind of a waste.
Some studies would show that it helps, but it's wasteful too.
And with TCP, there's all this clever congestion control stuff that happens.
And they start off slow to be safe and kind of careful for the Internet.
And over time, they gain confidence and they can use more bandwidth so things can go faster.
So it'd be better off using a warm TCP connection than a cold one.
So you have these trade-offs between opening lots of connections, which costs latency, and reusing connections, which wins on bandwidth.
So that was some of the motivation for trying to define a vision of HTTP that included multiplexing on a single connection, which is what HTTP/2 was: take this common idea of request and response semantics and redefine a new way for those to be encoded that would allow each one to have an individual identifier.
And taking that identifier- Oh, on a given connection.
Because my recollection is that the earlier versions of HTTP were basically open a connection, make a request, get the response back, close the connection, repeat ad nauseam.
Sorry, sorry to speak over you.
The very early versions would just be one connection for a request and response.
Then 1.1 introduced keep-alive by default, I mean, connection keep-alive was introduced before this, but it became the default.
So you could reuse a connection, but it was sequential.
And that isn't great when you want to insert something that's more important, something you just found out about while loading your page.
And this is what HTTP/2 would allow: by assigning unique identifiers to each request-response exchange and coming up with a different way to encode them, you didn't have to read out a strict ordered sequence of things; they could be mixed and matched because you had clear identifiers.
These are carried in what are called HTTP/2 frames, which allow you to do this.
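For reference, those frames have a small fixed binary header defined in the HTTP/2 spec: a 24-bit payload length, an 8-bit type, 8 bits of flags, and a 31-bit stream identifier, which is exactly the per-exchange identifier being described. A sketch of parsing one in Rust:

```rust
/// The 9-byte HTTP/2 frame header (RFC 9113, Section 4.1).
struct FrameHeader {
    length: u32,    // 24-bit payload length
    frame_type: u8, // DATA, HEADERS, SETTINGS, ...
    flags: u8,
    stream_id: u32, // 31 bits; stream 0 is the connection itself
}

fn parse_frame_header(buf: &[u8; 9]) -> FrameHeader {
    FrameHeader {
        length: u32::from_be_bytes([0, buf[0], buf[1], buf[2]]),
        frame_type: buf[3],
        flags: buf[4],
        // The top bit is reserved, so mask it off to get the stream ID.
        stream_id: u32::from_be_bytes([buf[5], buf[6], buf[7], buf[8]]) & 0x7fff_ffff,
    }
}

fn main() {
    // A HEADERS frame (type 0x1) with END_STREAM|END_HEADERS flags on stream 1.
    let h = parse_frame_header(&[0, 0, 0, 0x1, 0x5, 0, 0, 0, 1]);
    assert_eq!((h.length, h.frame_type, h.flags, h.stream_id), (0, 1, 5, 1));
}
```

Because every frame carries its stream identifier, frames from different requests can be interleaved on one connection and reassembled by ID.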
And it works, it's been deployed. HTTP/2 has been deployed on Cloudflare for years and years now.
I don't want to talk about dates because I will just get them wrong; I'm not good with them. But it's been there and it solves a lot of the original goals that it set out to achieve.
But as always, like I said, it's an iterative process.
And so what we learned pretty soon on is that it solves one kind of problem, but it doesn't solve another one.
And that problem is called head-of-line blocking.
So TCP, as I mentioned, is this reliable byte stream abstraction.
We haven't got the time to go into the boring details, but this byte stream is kind of broken down into packets that get sent across the Internet.
And the Internet is a best effort system.
Things can get lost and get reordered, just thrown away.
You might have like your cat walks in front of your Wi-Fi, like just things can happen to disrupt the perfect ideal scenarios.
TCP is really good at accommodating that and kind of recovering from those things.
But it takes time, introduces latency while these processes detect the loss and try and like ask the other end to send the thing again and all of that.
And this is what head-of-line blocking is: if you have a whole stream of bytes, in your golden scenario you'd read it straight through and it would just work.
Great. But if you had that whole stream and you missed just a tiny bit of it, you couldn't read anything that had come after.
All of this data was just kind of stuck somewhere in your operating system.
Even if your computer had received it, down in the code that powers the networking side of things, the application, like your web browser, couldn't access it.
I thought it was kind of the best simple description I could give. So you have to wait to get those bytes again before you can kind of move on.
Yeah. And the net effect could be that in your page, it looks like it took a long time to load something where actually it was already there.
It just couldn't be used. And there was just no way to change HTTP and TCP to get away from that, because of the way TCP was designed.
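A toy model makes the blocking visible. This sketch is my own illustration, not real TCP code, and it assumes segments line up neatly on byte boundaries; it mimics TCP's rule that only contiguous bytes can be handed to the application:

```rust
use std::collections::BTreeMap;

// Toy in-order reassembler: data after a hole stays stuck in the buffer
// (head-of-line blocking) until the missing bytes arrive.
struct Reassembler {
    next_seq: u64,                    // next byte offset the app may read
    buffered: BTreeMap<u64, Vec<u8>>, // out-of-order segments, keyed by offset
}

impl Reassembler {
    fn on_segment(&mut self, seq: u64, data: Vec<u8>) -> Vec<u8> {
        self.buffered.insert(seq, data);
        let mut deliverable = Vec::new();
        // Hand over only the contiguous prefix starting at next_seq.
        while let Some(seg) = self.buffered.remove(&self.next_seq) {
            self.next_seq += seg.len() as u64;
            deliverable.extend(seg);
        }
        deliverable
    }
}

fn main() {
    let mut r = Reassembler { next_seq: 0, buffered: BTreeMap::new() };
    assert_eq!(r.on_segment(0, b"he".to_vec()), b"he".to_vec()); // in order: delivered
    assert_eq!(r.on_segment(4, b"o!".to_vec()), b"".to_vec());   // hole at 2..4: stuck
    assert_eq!(r.on_segment(2, b"ll".to_vec()), b"llo!".to_vec()); // gap filled: all flows
}
```

Fixing this, making a hole in one exchange not stall all the others, is exactly what motivated the next round of work.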
And so that was really the motivating factor for going again and saying, can we learn from these two things now that we've done HTTP 1 and HTTP 2?
And the folks at Google kind of realized this many years ago, like six, seven, eight years ago.
And it started as what was Google, well, what we now call Google QUIC, but it has since evolved and changed and been standardized over a number of years in the IETF, the Internet Engineering Task Force.
Which includes people from all over industry and academia working and collaborating to finalize standards.
So- And that's a great segue, actually, mentioning QUIC. So that was standardized, I think, what, last year as RFC 9000?
Correct. So the process that we like to follow in the IETF is running code.
So we, Cloudflare and many others, have deployed some version of QUIC out there before these standards were finalized.
Like HTTP/3 just came out the other day, woo-hoo. But we've been running that precise version over RFC 9000 QUIC since May 2021.
So last year, but in case people are watching this later, we shouldn't talk in relative terms; let's talk specifics: 2021.
So we've been running that version. So it's really cool that now we've got RFC 9114, which is what HTTP/3 is.
But it kind of doesn't change much for us because it's just business as usual.
And QUIC came out before then, but all that's been happening is that HTTP/3 has been stuck in this, not limbo, but a queue, waiting on dependent drafts and stuff like that.
It hasn't been blocked on anything critical that we needed to change.
It's just been stuck in a queue behind other documents that we depend on, which, you know, I'll come on and explain in a minute.
But QUIC itself was in a similar state. Before it was done, we went through 34 different draft documents, and several of those were implementation drafts.
So changes to the protocol that might be breaking. And we went through multiple of these to deploy them on the Internet, to let people try them out, see what parts of the design might work, might not, might need a bit of improvement.
So it's not a case of developing something in isolation, in a vacuum, then saying, this is ready, everyone, and only when people try it out finding out it's not.
There's been just so much traffic gone over QUIC and HTTP/3 over the entire time I've been at Cloudflare, effectively.
So back to 2018. Is that, the 34 drafts that you mentioned, is that normal?
It feels like a lot to an outside observer like me, but obviously, you know, I assume that they incorporate feedback between the folks who write clients, the folks who write servers, and, you know, finding interoperability issues and getting those addressed.
And, you know, somebody comes up with a slightly better way to compress something or save a couple of milliseconds or whatever.
I mean, it's hard to draw any patterns from anything because like sometimes working groups work differently.
Some people like to cut more drafts and iterate more quickly.
Others like to have fewer drafts with bigger changes.
I think, you know, ultimately, like I haven't quite got onto it, but QUIC is a brand new transport protocol.
So it needs to do everything that TCP could.
So all the things we've learned from TCP and add this new capability that I mentioned earlier, like these streams that carry requests and responses, to pull that down into the transport layer.
So we have new features, and features that were once extensions or nice-to-haves are now a core part of the protocol.
So like I mentioned, TLS adds security or protection on top of TCP; with QUIC, that is just a part of the protocol.
It's encrypted by default. So to get all of those things correct, it wasn't just one draft.
You know, we'd have a family of documents for QUIC, and the idea there is to make something that's going to be extensible and should take us through, you know, the next few decades as well.
So maybe some people think 34 is a lot, maybe some think it wasn't enough.
Like it's really hard to say how long is a piece of string.
So what I can say is that we track all of this work on a GitHub repo, like these standards are developed in the open and every kind of issue that was raised was marked as a design issue or an editorial issue.
The same for HTTP/3, which was developed at the same time. So we have pretty good traceability across the mailing lists and minutes of meetings.
Like there's a lot of energy that has gone into this, and it's great to see that it's actually being deployed, right?
It's live, not just us, but other people around the network.
Were QUIC and HTTP/3 developed in parallel, or was it more serial?
Like, okay, QUIC's done.
Now we've got it, you know, we'll work on HTTP/3 over the next year? Or were they nominally in parallel?
It was in parallel. Like I mentioned earlier, what we now call Google QUIC was effectively a single unit.
It said, this is a new transport protocol, and it can only carry HTTP.
And that's not a very good separation of concerns. The benefits of QUIC, this multi-streaming that can fix head-of-line blocking and provide encryption, are really attractive to many different use cases.
So the work that we did in the IETF was to split those things out more cleanly, but to develop them at the same time.
So as QUIC changes, HTTP/3 is effectively just a thin layer over the top that describes how to map logical HTTP semantics, like requests and responses, on top of QUIC.
So, okay. HTTP/3 is effectively HTTP over QUIC. Because I know I've seen in various places people talk about, you know, DNS over QUIC or, you know, pick your protocol over QUIC, and the various benefits that doing that may have. Okay.
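For anyone who wants to poke at this themselves: servers advertise HTTP/3 support to clients connecting over TCP with an Alt-Svc response header, and a curl built with HTTP/3 support can be asked to speak it directly. A quick illustration (cloudflare-quic.com is a Cloudflare QUIC test host):

```
# Ask curl to use HTTP/3 for the request (requires an HTTP/3-enabled build):
curl --http3 -v https://cloudflare-quic.com/

# Over HTTP/1.1 or HTTP/2, servers advertise HTTP/3 with a response header like:
#   alt-svc: h3=":443"; ma=86400
# meaning the same origin is reachable over HTTP/3 on UDP port 443.
```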
Yeah. And I think it's a very, it's a fun question to ask and naming things is like super hard, right?
But the document was called HTTP/2 over QUIC, I think, or HTTP over QUIC or something.
But over time we evolved to this H3 because HTTP in isolation is just logical and abstract.
It's these semantics I keep talking about.
And that's kind of that, seeing the continued iteration and genesis of protocols.
So, seeing that there's a 1.1 and a 2 and a 3, and that the HTTP Working Group are owners of those documents, that was kind of a kickstart for them. The definition of 1.1, if you go to that RFC at the time, I forget the actual number here, mingled things that were common to all versions of HTTP with things that were only specific to HTTP/1.1.
And that makes it really hard for implementers and people kind of not so involved to understand like what we're talking about.
And so there's folks in the HTTP Working Group, Mark Nottingham, Roy Fielding, and Julian Reschke, who have gone and effectively refactored the whole suite of HTTP documents to make those separations way more clear.
So now the HTTP/1.1 definition is just, I say just, but it's: here's how you take the concept of a request and turn it into a string of ASCII characters that gets sent onto the wire.
And when you see this character, this is what you do.
Similarly, H2 now says: here's how you take that request and turn it into this binary encoding.
And HTTP/3 is pretty similar. So that's effectively what HTTP/3 was blocked on.
This work was already in progress to rewrite those documents. So we updated all of our terminology to be consistent.
And that's why it's such a big occasion: not only has HTTP/3 come out last week or the week before, but all of HTTP has been updated and it's all consistent with each other.
Which to me is an amazing thing.
Right, we touched on that in the blog post. HTTP is now a whole series of RFCs, 9110 through 9114: the core semantics and caching documents, plus the HTTP/1.1, HTTP/2, and HTTP/3 mappings. Let's shift gears a little bit.
And so you touched on earlier that we at Cloudflare have been supporting QUIC and then HTTP/3 for quite some time.
And I know we did a blog post back in 2018 called The QUICening that I think touched on what we were initially doing with QUIC, and a blog post, I can't see it at the moment, but I believe you wrote it in 2019, titled HTTP/3: the past, the present, and the future.
So can you touch on, for the next couple of minutes, the challenges that we faced at Cloudflare in implementing something that was not formalized, was not finished?
I mean, I know we innovate quickly here and we get stuff out there, but how do we deploy an in-progress, incomplete protocol while still maintaining a level of functionality and making sure we're not breaking things for our customers?
I think that's an excellent question.
I just wanna start off with saying I can't claim credit for every blog post that ever comes out that relates to this topic.
There's a whole team of folks involved in the protocols team and the wider team that help us get all of these systems running.
So you kind of touched on it there, right? QUIC is a transport protocol, but it needs UDP underneath it to carry the packets back and forth between client and server.
And traditionally, the systems and infrastructure, the networking aspect of these things, what we like to call layer four, have been geared and scaled for operating TCP at scale.
So the first challenge is if we want to support QUIC, how do we make UDP work?
How do we make sure the packets are coming back and forth to the right places? If there are safeguards and mitigations in place that work for TCP, do they work similarly for UDP?
And a lot of that groundwork was done before my time, or at least before I got a good appreciation of the architecture, but it's also changed since.
So if we put that as one bracket, then you come to: how do we manage trialing new protocol-speaking software at Cloudflare?
And that's just something that we're good at; this is embracing change.
It's what the IETF needs: for us to have running code on the edge.
Obviously, we want to make sure we're not unintentionally breaking stuff. So quite often these features will be like an alpha or private beta and allow people to opt into them.
So it might just start with carrying some traffic for our own test websites to see how they behave.
Or, you know, even, you know, we have our own implementation of QUIC and HTTP/3 called Quiche, which is an open source Rust-based project.
So, you know, you don't even need to run this online. You can just develop these things and test them in lab environments.
And, you know, simulations and see how those things go.
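As a flavor of what that looks like with Quiche, here's a rough sketch of client connection setup. This is illustrative rather than copy-paste ready: quiche's exact function signatures have shifted across releases, so treat the calls below as approximations of the current API and check the crate's documentation; the host is a stand-in test site:

```rust
use std::net::UdpSocket;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // QUIC rides on UDP, so a client starts from a plain UDP socket.
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.connect("cloudflare-quic.com:443")?;

    // TLS 1.3 and ALPN are built into the transport, not bolted on top.
    let mut config = quiche::Config::new(quiche::PROTOCOL_VERSION)?;
    config.set_application_protos(&[b"h3"])?; // negotiate HTTP/3

    // Connection IDs should be random in real code; zeroes keep this short.
    let scid = quiche::ConnectionId::from_ref(&[0u8; 16]);
    let mut conn = quiche::connect(
        Some("cloudflare-quic.com"),
        &scid,
        socket.local_addr()?,
        socket.peer_addr()?,
        &mut config,
    )?;

    // Drive the handshake: conn.send() fills outgoing UDP datagrams until it
    // reports Done; a real client then loops, feeding received datagrams back
    // in with conn.recv(), and layers quiche::h3 on top for requests.
    let mut out = [0u8; 1350];
    while let Ok((written, _send_info)) = conn.send(&mut out) {
        socket.send(&out[..written])?;
    }
    Ok(())
}
```

The same library serves HTTP/3 at Cloudflare's edge, which is part of what makes testing it in simulations and lab environments meaningful.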
So that makes sense. We're not rolling it out at millions of sites.
We're not rolling it out at massive scale initially. We're dribbling it out and, you know, doing it in gradual phases to make sure that, you know, we've got it all right and we're not going to break anything significant.
Yeah, we do that, but you can't simulate everything.
Sometimes you find out by trying, and with protocols, the fun thing is interoperation.
It's not just your client speaking to your server. It's different people reading the same spec and interpreting it slightly differently.
Yes. Which might lead to edge cases that come up.
And so we want to make sure that we try to do the right thing in all cases.
But errors, you know, on the Internet are going to happen.
So you need a correct kind of strategy and design for being able to capture errors or not even errors, but subtleties and differences that might affect performance and behavior.
And so like monitoring and analysis like this is like our bread and butter, our daily thing.
But, you know, like you mentioned, this phasing.
Yeah. If you pick through the history of blog posts to do with QUIC, a lot of them chart that progression: it's a private beta.
Now it's an open beta. You can watch it progress.
Then it's GA for people, but still as an opt-in for customers who, you know, might just be more sensitive to enabling anything.
And that's fully understandable.
You know, their traffic is keeping their business alive.
They would like to, well, let things soak for another five years and then maybe turn it on.
They're not necessarily the early adopters, the cutting edge adopters.
That's correct too. Yes. Cool. Well, and on that note, in the last few minutes that we have, let's take a look at the blog post that we published last week.
It was last week, right? It provides some insights into real-world...
Let's see if I can find it. There we go.
It provides insights into real-world HTTP/3 usage on Cloudflare. Well, yeah, you know, we talked about the phasing of rollouts and people enabling things, different browsers enabling HTTP/3 by default, perhaps. So we thought it'd be a great opportunity to use the publishing of these drafts to look back at the last year at Cloudflare.
And you very, very kindly did a lot of data analysis for us.
Yeah, so I mean, so we started out looking back over the last year, so May to May.
And the first thing we did was look at request volume by HTTP version to see, you know, for HTTP/3, how it compared to HTTP/2 and HTTP/1.1.
You know, so on the graph within the blog post, we can see that HTTP/3 is in blue.
You know, it started out a year ago a little bit behind HTTP/1.1, but overtook it in early July and has really grown steadily since then, which I think is very encouraging.
HTTP/2 seems to be solidly in place as the leader. I think we're, you know, still seeing a significant amount of traffic, but I expect that will drop over time.
We also looked at HTTP/3 traffic by browser, by the leading user agents, and probably not surprisingly, we saw that Chrome far and away generates more HTTP/3 requests than the other browsers.
Interestingly, Edge and Firefox generate about the same amount.
They've been neck and neck pretty much over the last year.
And Safari right now has the lowest level of HTTP/3 traffic. Let's go look at this one.
So if we look at the volumes, we see that about 80% of HTTP/3 traffic on Cloudflare is generated by Chrome, then about 10% for Firefox and Edge, and then whatever little bit is left right now is Safari.
And then within the post, we also looked, by browser, at the percentage of requests across the different versions.
So right now we're seeing on Chrome about 35 to 40% of request volume coming in over HTTP/3, which is great.
That's almost double where it was just about nine months ago.
Firefox, we're currently seeing, I think about 30%.
Again, that grew aggressively, especially after they rolled out default support for H3 in mid-June of last year, I believe it was Firefox 88.
Edge, again, similar to Chrome, seeing about 40%, not surprising since Edge is Chromium-based.
And Safari today, seeing probably about five to 10% of requests over H3.
I think last check. We also looked into what the various leading search indexing bots are using.
Given that Google, like you mentioned, had generated or started Google QUIC, which eventually grew and evolved into HTTP/3.
We said, okay, what is Googlebot using?
We're finding that Googlebot is still mostly using HTTP/1.1 as it goes out and does its indexing.
Bing is using about 80% HTTP/2. And on the social media platforms, for Facebook it is mostly HTTP/1.1.
Twitter, it is mostly HTTP/2.
And for LinkedIn, it is almost exclusively HTTP/1, or HTTP/1.1, excuse me.
So I apologize for rushing through that.
I know we only have a few seconds left. So I do encourage you, if you haven't already, to go out and take a look at the blog post.
It is on blog.cloudflare.com.
In the last 15 seconds, Lucas, any closing thoughts? Yeah, sorry for eating up all the time.
No worries, it was good. And not letting you go into the data.
Yeah, take a look at the blog and let us know if you have any questions.
Thank you very much, everyone.