Leveling up Web Performance with HTTP/3
Presented by: Lucas Pardue, Peter Wu, Robin Marx
Originally aired on August 3, 2020 @ 8:00 PM - 9:00 PM EDT
Join Lucas, Peter and Robin for a friendly, head-to-head discussion on wireshark vs. qlog and explore the synergies between the two.
Episode 7
English
Protocols
Performance
Transcript (Beta)
Welcome to another episode of Leveling up Web Performance with HTTP/3. I'm joined today by returning guests Peter Wu and Robin Marx, who are going to talk about Wireshark, and qlog and qvis, respectively.
So if any of you tuned in a few weeks ago, I featured both Robin and Peter in separate segments to give us an overview of their respective tools.
And ultimately what happened is I basically kept interrupting them asking questions.
We slightly ran out of time, but also kind of fired shots at each other about the relative merits or particular strength areas of one tool over the other.
And I thought it was important to clarify that actually, as a community of protocol people, we do interact with each other; sometimes tools just have constraints, and we'd love every tool to do everything.
But actually, we're all kind of resource bound too.
And we do get along. So it's a collaboration.
So this is a chance to give Peter and Robin a platform to finish off what they were presenting, and an opportunity to cross-pollinate ideas like we might typically do in face-to-face settings, something like an IETF meeting, which is happening this week in terms of a hackathon and next week in terms of the proper meeting.
Just a chance to do some stuff. So I hope that's good.
If you're new to the show, I just wanted to remind you that this is a live stream and things can go wrong.
We try to keep our language clean. So remember that, please, you're both guests.
But if you wanted to get in touch to ask a question, or just to kind of cheer us along: we have an email address, you can email in and fire some questions at us.
Or you can tweet us directly on Twitter.
So I know last time around, both Peter and Robin had some really interesting questions about some of the tooling or details of things.
So we're happy to do that again.
Or maybe if you had a question last time that we didn't address, drop us another line.
The more interactive this is, the better, and the more we can respond to things.
But if not, you know, we're all pretty active in various online communities as well.
So like, tweet us whenever, I'm sure one of us will respond.
But anyway, less from me, and onto my guests. I've got the problem of choice here of who I go with first.
Because this is live, it's unscripted and unedited.
But my personal preference is to go with Peter first, who is going to pick up where we left off with Wireshark.
Over to you. Thank you, Lucas. I think last time with Wireshark, we were demonstrating how to capture stuff, how to record the data, the HTTP/3 data, coming from Firefox.
And at that time, Firefox somehow triggered a bug that crashed it.
At the time, you mentioned, oh, it would be interesting to dive into the issue, investigate what's going on.
Unfortunately, I haven't gotten an answer to that.
But it certainly did reveal an issue, and the Firefox people were aware of it.
What I'm going to show you now is: in case an issue occurs, how can you record the information and share it easily with others?
Because it's very common to run into an issue, and then you can rely on many smarter minds to help you out.
Let me check where the share screen button is.
So I guess you can see my screen now.
It's a bit small and from past experiences, the bigger the better.
So as big as you can comfortably work with is now my kind of rule of thumb.
All right. So last time, what I did was run this command: starting Firefox with the SSLKEYLOGFILE environment variable set, which will write the TLS secrets to a file.
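[Editor's note: the command in question is the standard SSLKEYLOGFILE approach. A minimal sketch, assuming a shell launch; the file path is a placeholder.]

    # Ask Firefox (via NSS) to append the TLS session secrets to a text file,
    # which Wireshark can later use for decryption.
    SSLKEYLOGFILE=/tmp/keys.txt firefox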
I'm not going to do that this time; instead, I'm going to show you how a previous capture, sorry, a previous trace can be combined with the keys.
So here we have a capture of a problem in an implementation.
So Lucas also mentioned the whole community: everyone working on QUIC, they're chatting with each other via Slack, communicating with the maintainers and so on.
And doing interoperability tests basically means they take their own implementation and run it against someone else's implementation.
And you would expect everything to work, but in practice there are so many little details that either were not clear in the specifications or that people implemented wrongly.
A nice thing about interoperability tests is that it may expose a whole range of combinations.
So in this particular case, there was one implementation, quic-go, which did work with itself, and usually with others, except for one very particular case: namely, when it was using a special cipher, ChaCha20-Poly1305, which would be chosen on mobile platforms that do not have AES CPU acceleration instructions.
So for this problem, I downloaded the files: the master secret file and, well, the 19-August-something capture.
Sorry, Peter, just to clarify: in this case, was this a suspected handshake issue, or how did the problem manifest?
Like, what were people seeing? Oh, right. So someone was investigating, was building a product, I guess, on quic-go.
They wanted to integrate their application with QUIC, and they used Wireshark to understand better what's going on, what kind of traffic goes through this system.
What they found was that Wireshark could not decrypt the trace, even after following the instructions, which is actually, well, I would say not common, but not rare either.
Due to the complexity of the QUIC protocol, there are a lot of rough edges and places where Wireshark makes assumptions.
In those cases, well, if people reach out to Wireshark developers, either via the bug tracker or via Twitter or whatever channel we're active in, they can supply a packet capture and ideally the secrets as well, so we can decrypt it and check it out.
So, because it can sometimes be tricky to reproduce the problems that people are seeing, just sending you the keylog and the PCAP file itself is sometimes enough for you to figure out what Wireshark isn't quite doing right, and you can kind of debug and fix it.
And actually you can put those packet captures into, like, the test suite of Wireshark, I think, sometimes, maybe, I don't know.
I'm mumbling. Yeah. Let's get back on track with your demo.
Yeah. Ultimately, the goal is probably to integrate a number of QUIC packet traces into the Wireshark test suite, but until that's finalized, we're kind of holding that off. Instead, what we have is the Wireshark bug tracker.
There's a page, bug 13881, which has a lot of sample captures that are used to validate the QUIC support in Wireshark.
Cool. And briefly, we just copied the keys.
So if you recall, in the beginning, I showed this SSLKEYLOGFILE thing.
It writes the secrets to a file, this master secret file; it will look like this.
So briefly, we just copied this stuff, and in Wireshark, we'd go to Statistics, Capture File Properties, and paste the keys right there.
But it still meant that you need a way to extract it and enable automatic decryption.
So that's why a new function was added where you can add the keys to the file.
I'm going to demonstrate how to do that.
So let's see... in this... okay, I'll just do it now.
Okay, so this is the inject-secrets Python script; I'll link to it in a minute.
You provide it with the keylog file, in this case the master secret file.
And secondly, the capture file, in this case the one called 19-August-something.
By default, this tool will automatically write the file.
Oh, no. In theory, that should work, except this trace doesn't... this implementation didn't use the standard QUIC version number.
I'll show it in a minute.
So you do udp.port... oh, it should be quic... oh, no. Okay, great. Anyway, I'll just open up the trace.
Okay. Anyway, what this script is supposed to do is take a file with a lot of secrets and embed only the secrets that are related to the connections in the capture file.
So imagine you set this SSLKEYLOGFILE variable and you open a lot of websites: you're going to Twitter, to email, to your bank, whatever.
Those secrets would all be added to this file.
And if you want to share a capture with someone, ideally you want the file to contain the secrets just for the trace you created, which is what this command is doing.
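[Editor's note: Peter's script filters the keylog down to the secrets for the connections actually present in the capture before embedding them. For the simpler "embed everything" case, Wireshark's own editcap has a built-in option; the file names below are placeholders.]

    # Embed a TLS keylog file into a capture as a pcapng Decryption Secrets Block.
    # Note: this injects ALL secrets from keys.txt, unfiltered.
    editcap --inject-secrets tls,keys.txt capture.pcapng capture-with-keys.pcapng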
That's a pretty important point, I think, because we want to share stuff with people, but we want to make sure that we're being safe as well.
So last time around, Peter showed how to kind of filter the capture, so you're only capturing traffic to the website that you're focused on debugging right now.
But it can sometimes be a bit too easy to just capture everything, and then also capture all of the keys that you're using, and send that off to somebody or put it onto a public archive.
So it's always good to remember exactly what you're sharing with people, same as with cookies.
But like you say, with those secrets, if you share the entire file, you're able to decrypt all of the traffic in a PCAP, which is probably not what you want to do, given how noisy things can be. This is why using a script is going to simplify your life and make it less likely for these things to happen.
So at the moment, is this the Python script that you get... This, this Python script. I think I linked it from a presentation I did earlier about TLS decryption and Wireshark, on my website, lekensteyn.nl; I link it somewhere in the presentation.
But this presentation also shows a case where it was very useful: there was logging present in the application, except it didn't work.
So that's why you'd use Wireshark, to look at the communication, to see what's going on.
So, shots fired, Robin: logging in applications doesn't work, that's the main issue.
You want to add something to that, Robin? He's completely right, especially when the logging implementers make errors, like some people sometimes do.
Isn't that right, Lucas? Sir, please continue. Okay, so yeah, in this presentation, I show you how to do a capture, and also show you how to get the secrets into a file for easier sharing.
So if you want to use this tool, just check this project; I'll link it out. It's also on the Wireshark wiki, wiki.wireshark.org/TLS, if that's easier to remember than my nickname.
And if you're interested in trying out Wireshark with QUIC, there's the website for the QUIC working group.
There's the base repository, and a wiki with a Tools page, and on that page Wireshark is listed with the supported draft versions. As I said before, QUIC is still an evolving draft.
So if you happen to have an older QUIC trace, you might need an older Wireshark as well, since support for older drafts may get dropped at some point.
And on this page, you can also find the link to the Wireshark issue tracker with sample captures for your convenience.
As well as some potential issues. So briefly, my tool said: oh, no connections were found.
That was due to this implementation using a non-standard version number, and to solve that, you can force Wireshark to decode a non-standard port.
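[Editor's note: in the Wireshark GUI this lives under Analyze > Decode As; tshark has an equivalent -d flag. A sketch, with the port number and file name as placeholders.]

    # Force packets on UDP port 8443 to be dissected as QUIC,
    # bypassing the version-number heuristics Peter describes below.
    tshark -r trace.pcapng -d udp.port==8443,quic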
To force that, you can configure Wireshark to decode a certain port as QUIC, even if the heuristics don't pick up the trace. Right now, the heuristics in Wireshark to detect QUIC are quite simple:
it tries to parse the version number out of the packet.
If it looks like a valid QUIC draft version, it will try to interpret it as QUIC. We can probably talk a bit more about how Wireshark actually processes packets later, but I think I'll hand over to Robin now.
Okay.
Can I just ask a question, Peter? Yeah, sure. Yeah. So just in case people missed the obvious trick there.
It's that, yeah, if you share a packet capture with somebody without the keys, even if you can open it in your own Wireshark, they'll just see it as encrypted data, basically, and they won't be able to see that kind of thing.
But yeah, so you want to share things with people and having more than one file becomes a bit of a pain.
Like I've used Wireshark in ways of like downloading somebody's keys to a temporary location and then changing the Wireshark config between every packet capture that I'm opening and it gets really fiddly and annoying.
The upshot of doing it this way is basically you end up with a single file.
So, so analysis and debugging becomes atomic, which I think for me is like a really important thing that anyone can suddenly open that file.
You don't lose two pieces of absolutely essential information.
Which is why it's a really cool thing. So, yeah, thanks for sharing that with us.
Okay, and just to add to Lucas's comment: yeah, if you don't have the keys.
So in QUIC, everything is encrypted by default, so you can't see anything right there; it's all gibberish.
But the initial communication is actually encrypted with a fixed key.
So that's why Wireshark, even when it doesn't have the decryption secret, can still recognize the initial client hello.
And in case you're wondering why you see client hello.
Client Hello is something from TLS. QUIC uses the TLS handshake to, well, derive the encryption keys, basically.
And if you look further: after the handshake... well, here they're still doing the handshake.
After the handshake, things cannot be decrypted.
Wireshark will tell you: oh, I cannot decrypt this, basically; the secrets are not available.
So now let's check packet nine.
If I load the capture with the secrets embedded, then I can go to packet nine and actually see there's a Certificate message in the encrypted payload.
Previously, I converted this file with the secrets inside to a normal file using this command, tshark.
I basically converted a pcapng file to a pcap file. Often, if you capture with tcpdump, you will end up with a pcap file, and the pcap file format is not able to contain the secrets necessary for decryption.
So that's why the pcapng file is used.
If you run this file command, it will tell you: oh, this file is actually a pcapng file, and this other one is a normal pcap file.
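[Editor's note: a sketch of the conversion and check Peter describes; file names are placeholders.]

    # Convert pcapng back to classic pcap. Any embedded secrets are lost,
    # because the pcap format has nowhere to store them.
    tshark -r capture-with-keys.pcapng -w plain.pcap -F pcap

    # Check which container format each file uses.
    file capture-with-keys.pcapng plain.pcap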
And if you're really wondering: wait, how can I know whether secrets are embedded?
If you open up a file in Wireshark, you can go to View and choose Reload as File Format/Capture.
In there, there's the whole list of packets, but also the Decryption Secrets Block.
And then if you right-click on the secrets data, you can already see it there.
You can also show the packet bytes, and you'll notice these are exactly the secrets that were stored in the keylog file.
Yeah. It's like a nice, a nice round trip of putting things in and getting it back out.
Yeah, usually the information is there, but it's just a question of how to find it.
I hope that it helps. It does for me. I'll be looking there in the future. Right, over to Robin.
I must say, Peter, I've been wondering how to embed keys in the pcapng before because I knew about the command line method.
And I'm a bit surprised to see that there is no UI option in Wireshark for that.
You can't just go into Wireshark and paste your SSLKEYLOGFILE contents to do that, or have something else, like: if the pcap has the same file name as a .txt or .keys file next to it, it automatically absorbs them into one.
That's one of the things I've always found a bit weird about the user experience for Wireshark with regards to that.
I wonder if there's a reason for that. Yes. So, let's come down to kind of how Wireshark works.
So, I'll give you a bit of background on how, whenever you pass a file to Wireshark, it will basically tell you that it is QUIC.
So, a pcap or pcapng file is basically a sequence of individual packets, where a packet is defined as a sequence of bytes, basically.
Wireshark has implemented a lot of dissectors.
How that works is it goes through every single packet and calls the individual protocol dissectors.
So, for example, at first it will very often call the Ethernet dissector, which will recognize there's an IP packet inside.
It will then continue to call the IP dissector, the UDP dissector, and then QUIC, for example.
And this very first time, it happens sequentially.
For some protocols, a single packet is enough; there are some very simple protocols like DNS, where by looking at just a single message, you can immediately tell what's going on.
There are also more complex protocols, such as TCP or QUIC, where based on a single packet, you might not be able to tell exactly how to interpret it.
For example, because you need some decryption keys.
So, when Wireshark goes through these individual packets, it will build up some state.
It will recognize: oh, there's a Client Hello; let me look up the secret that corresponds to this Client Hello.
That's the very first pass, where it can build up state and decrypt stuff on the fly.
Then on the second pass, it will go through all packets again. And at that point, it can calculate additional statistics.
For example: oh, I have this request, and the response is actually in a future packet.
And also, in the graphical user interface, when you select random packets, it will call the dissector again and try to make sense of them based on the state that was saved before.
And the additional complexity is, well, you might think: why can't I just process everything at once and keep everything in memory?
The reason for that is you don't have infinite memory.
I've noticed that too.
Most of my memory right now is consumed by Firefox, my browser, which is just there to browse things.
Yeah.
And then back to the secrets. You were wondering why the Wireshark interface doesn't have a way to add a secret.
Like I said, when these packets are processed sequentially, the secrets have to be available during the early handshake.
If they become available later, for example when you're already at the application data, at that point it's too late for Wireshark.
Wireshark hasn't been able to set up a decryption context, and the decryption is also dependent on earlier seen packets.
So, with Wireshark, you can configure it to read the secrets from a file using the keylog file preference; that allows you to do a live capture and get the secrets directly from the file.
Wireshark will read the file as the keys are written. But it does mean that your QUIC implementation, your client or your server, has to write these keys in time.
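[Editor's note: a minimal sketch of that live-capture setup, assuming a client that honors SSLKEYLOGFILE; the interface name and file path are placeholders.]

    # Point Wireshark's TLS keylog preference at the same file the client
    # appends to; Wireshark picks up new secrets as they are written.
    export SSLKEYLOGFILE=/tmp/keys.txt
    wireshark -o tls.keylog_file:/tmp/keys.txt -i eth0 -k &
    firefox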
And providing them through the user interface is typically not a user-friendly approach either.
So that's why, for now, we just decided that you can provide these keys to Wireshark via this file.
Another idea was to modify Wireshark whenever these keys are provided to automatically save the keys to the capture file.
That hasn't been implemented, because there may also be additional privacy considerations.
For example, usually if you save a packet capture file, it contains only the packet data that anyone, like your ISP, could see anyway.
But if you add a secret to it, you might accidentally reveal the data that was encrypted.
So when the pcapng specification was extended to add support for this kind of decryption secrets block, it specifically said, or at least one of the constraints was: well, if you implement it, make sure that it doesn't cause privacy issues.
Because things like TLS and QUIC are designed to improve privacy and security, and tools to understand them
understandably need to somewhat lower these requirements for debugging and testing, but it shouldn't surprise the user, basically.
I think that's a really important thing.
You see, not just here, but in so many places, there's just a tension between usability and security.
They're opposite sides in a tug-of-war. And if either one had their way, the other one wouldn't exist, fundamentally.
So, yeah, it's easy for me to whinge and say, oh, I wish there was a button.
But actually, now you've described it in that way, Peter:
yeah, it could be pretty bad if people were easily convinced to just open up Wireshark, hit record and stop, and suddenly they leak things without understanding.
And it's a change of behavior in a tool that they might already be familiar with as well.
And that's surprising or unexpected.
Yeah, cool. I love that answer. It's not the answer I was expecting.
But it's almost exactly the same as what we have with qlog as well.
Because the implementation chooses what to log, right? And they can also choose to expose much more privacy-sensitive data than they might intend to.
Like, in qlog, we have the definitions of how to log the fields.
And I had fields in there for most of the packets called the raw field: how to log the raw bytes.
And there were people saying, you know, we should maybe keep that, but not explicitly mention it in the drafts, or mention it in one place and not with every individual frame, to make it a bit more obscure, right?
Not to encourage people to actually start logging the raw bytes.
And I think that's kind of the same as what you're saying here, you know: you want to allow it, but you don't want to make it too easy for people to do it, or to do it by accident.
Which I think is a very, very important point to make around all this.
Oh, yeah. And with Wireshark, it's also good to remember: Wireshark is basically emulating whoever can listen in on the communication between two parties.
Applications do not have to be modified.
That also makes it very powerful: you can study issues between implementations where you do not control the source code or application. For your interoperability testing, you might not be able to get the qlogs or log files or other stuff from the other implementation, and you still need to understand why two implementations do not work together.
And that's basically what Wireshark provides you.
Qlog, on the other hand, needs specific support from the application.
Applications need to be modified to support it. But that also provides advantages, such as being able to log additional data which you'd normally not get.
Yeah, and what's cool is, you know, the people in the community are building tools to do interoperability, either running clients and servers from different vendors or running their own client against all the different servers that are on the Internet and capturing the logs of this and doing some tests and seeing what happens.
But alongside just, you know, the functional tests that they're doing, or maybe a bit of qualitative testing, they're capturing both Wireshark captures and qlog logs.
So we had a case reported to us last week by Tatsuhiro from the ngtcp2 project, I believe, which said: something weird is going on, the test fails, go and look at your logs.
Both of them were incredibly short. They were like about two lines each.
So we were basically failing at the handshake immediately, but both of them gave us enough insight into the problem. And what was cool is the person running the test didn't really need knowledge of anything.
They just configured stuff to be running. It does take up some space, but maybe that's something Robin can talk to as well: the size of logs.
Yes, and ways might exist to improve on that.
But yeah, we've spent half an hour on me and Wireshark.
So I think, Robin, you can take the stage now and dig into qlog. No, I do like to talk about those sizing issues.
Because queue log is based on JSON, which is quite verbose by itself, right?
It logs the key name and value for each field, so you have a lot of extra characters coming along.
It would be much more space efficient to use a binary format, right, something like protocol buffers or something like that instead.
And many people have asked about that, though I think sticking to JSON, even though it's bigger, allows us to be much more flexible.
JSON parsers don't need a fixed schema.
You can just give them any type of JSON and it will parse it. And that's actually been used a lot in practice as well.
But the thing is, it's still large, but it's not as large as Wireshark's PCAPs.
Shots fired. Which is something Wireshark can't do anything about.
That's the annoying thing about QUIC, from that perspective, is that it's end-to-end encrypted.
Right. That means at the moment of capture, you don't know which bytes of the packet exactly are going to contain the information you want to see.
That's different with TCP. With TCP, things like acknowledgement numbers or sequence numbers or flow control limits, they're just plain text visible on the wire in the TCP header.
So if you're capturing an HTTP2 connection and all you care about is the TCP or even the TLS layer to debug congestion control or something, you can discard all of the encrypted HTTP2 stuff.
You don't even need that.
You can just store, let's say, the first 100 bytes of each packet.
That contains everything you need. You can't do that with QUIC, because all that information is encrypted in the packets themselves.
You need to have the full packets, because the packet contents themselves are used as input to the decryption.
What's it called? Associated data? AEAD, right? So you need the full packet to be able to decrypt the packet as well.
That means that if you do a packet capture of a QUIC connection, you have to store the full connection, every packet you see in full in a file to later decrypt it.
That means that PCAPs for QUIC are always going to be massive, massive in size.
We did a comparison. If you download a 500 megabyte file, the PCAP is going to be, let's say, 560 megabytes, because there are also some added timestamps and stuff.
The default QLOG file will be about 250 megabytes, which is still beefy.
But QLOG, because it's text-based, it can be compressed.
So if you just gzip that, or turn it into CBOR, that's like the binary equivalent of JSON, which is still flexible, but it optimizes things into binary a little bit.
You get to much more manageable sizes of, say, I don't know what it was.
I think the optimum was, what, 14 megabytes or something. While with the binary capture file, you would always stay with those 500-plus megabytes, because you can't compress the encrypted data, or at least not to any useful extent.
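[Editor's note: the compression step is plain gzip; the file name is a placeholder, and the exact ratio depends on the trace.]

    # qlog is text (JSON), so generic compressors work well on it;
    # encrypted packet payloads in a PCAP barely compress at all.
    gzip -9 -k trace.qlog   # -k keeps the original next to trace.qlog.gz
    ls -lh trace.qlog trace.qlog.gz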
And that's one of the things that, you know, we argued in our last paper: it's something that might make qlog interesting to use in practice, after the debugging phase.
Because it's useful when debugging and running this on your computer, but maybe not if you're a Cloudflare and you're running this massively at scale at your CDN.
The problem is you also won't be able to do that with PCAPs, because they might be just too large.
And so QLOG might hit a sweet spot there, where you do have to make a trade-off, because it costs more processing as well to store, to create and store the QLOGs.
But it will take up less space, and it might be more privacy-friendly, because you don't need to store the decryption keys for everything.
The QLOGs can just contain whatever you want to expose. And there's some different trade-offs here.
Like you mentioned, Cloudflare, we're terminating the connections, right?
We are a valid endpoint in the communication here.
So it's kind of up to us what we log, albeit we need to be sure that we follow our own privacy and our own data capture and analysis guidelines.
So there's a whole load of trade-offs in all of this stuff and people take it very seriously.
Let alone from the side of logging everything everywhere: it's an incredibly large scalability issue.
You need to solve a lot of these different problems, but I believe that people are running QLOG in their production edges.
Not at Cloudflare, but other people are.
Facebook is running QLOG in production, which I find interesting.
Facebook has said that they absolutely refuse to add the SSLKEYLOGFILE export option to their QUIC implementation, mvfst.
Because they say: if we even have a code path for it, and someone is able to do remote code execution or something, they will be able to exfiltrate those keys.
We absolutely don't want it.
So they only have QLOG available as an option. The way they do that is they stream each individual JSON event to a collecting database server.
So each event is there in a relational database, and then they reconstruct the full qlog files per connection later on, based on the connection ID.
And they do have issues scaling that, to be honest.
There's still plenty of work to go in there to make that actually feasible at a huge scale.
But for now, I think they log about 10% of all their current QUIC connections, which is billions of events per day.
And that works fine. And that's a strategy I've seen different people who do Internet stuff take with things like tracing or OpenTracing, where they're taking spans across the system.
Not at the protocol level that we're focusing on here, but at layer seven, where you want to understand the lifetime of a request through the system: as it comes in on your ingress, what different rules it's maybe triggering, or how long it's taking to run a PHP script.
Where are your bottlenecks in the system?
And so all of these tools and techniques are very complementary, and there's not one that I would ever pick alone to figure out everything.
And I'll see that with the engineers internally, like different people analyzing things in completely different ways.
It's really, really interesting to me, at least. That's cool.
It is indeed. Robin raised a very good point. And you can understand why someone like Facebook would not like to add this keylog file option.
Because once these keys are exposed, that basically means that everything sent in that connection, in the past and in the future of that connection, should be considered compromised, as in: someone with these keys can read it.
And even though these keys may be securely stored, TLS and QUIC were designed to lose these keys after the connection is closed.
So if you store the keys, you basically break that requirement for maintaining the security of a TLS or QUIC connection.
Exactly, that's, that's exactly right.
Even though the keys are ephemeral, per connection, you do expose them.
And the thing is, it doesn't just break your current connection: if you want to do 0-RTT on a next connection, the attacker could also view the first client-side 0-RTT data sent, because those keys are derived from the earlier data as well, right?
If I understand correctly.
It's actually even worse with TLS 1.2 and before than with TLS 1.3 and QUIC.
With TLS 1.3 and QUIC, if you compromise the key, you still won't be able to read a resumed session in the future.
With TLS 1.2 session resumption, if the key gets compromised, all sessions resumed from that particular session are also compromised.
So when I do debugging, well, for me, I'm very comfortable with doing a packet capture and logging secrets and so on.
So whenever I have some issue to reproduce, that's typically my go-to method to get started.
Even if the application provides no logging that helps, a capture still works.
It's pretty much universal: there are so many applications that talk to each other using different programming languages, different libraries.
So something like a PCAP is pretty universal, especially if you have a way to get the secrets from applications.
Many applications will probably use BoringSSL or OpenSSL, or if they are built with Go, they'll probably use the Go crypto/tls library.
And Go also provides an option to configure this keylog file within an application (the KeyLogWriter field on its TLS config).
But things like Docker or Kubernetes, which are built using Go, don't expose this option; still, you can attach a debugger or something to get this data from memory, because the code path exists, like Robin said.
Very, very interesting. Leet hacker stuff.
That's true. Like, I've always been fascinated by how things like the resumption secret rotation work.
And the idea that, if you do connection resumption, the way I understand it, there is a single key to encrypt the resumption secrets.
That's shared across the entire CDN or the entire backend system.
And they have to rotate that at least once a day to make sure it doesn't get leaked.
And they do all this kind of funky stuff to keep it only in memory, and then to try to obfuscate where in memory they put it.
So that if people get into the systems, they won't even be able to extract it from memory.
So that's very cool, complex stuff. Yeah, that's especially relevant for session resumption indeed.
At Cloudflare, the keys are rotated much more frequently, like every hour or so.
I think it's written in a blog post somewhere how we do a key rotation.
Yeah, exactly. So, I might want to get back to... because I don't want this to seem like, you know, I think qlog is much better than Wireshark and all those things.
That's of course not the case.
You know, like Peter has also said, and as he highlighted in his last talk as well, there are certain trade-offs to be made between the two, and one will look better at some point than the other.
And we actually did a survey, right, asking QUIC implementers which tools they use, and most still use Wireshark as their main debugging tool, right?
There are not that many that really use our qvis stuff very in depth, or at least not the more complex tools, but they all use Wireshark.
It's very useful.
And then one of the things we've also done is: you can upload PCAPs, so Wireshark capture files, and then the secrets file, to qvis.
And internally, it will automatically transform that into a qlog file.
Those qlogs, of course, will lack some of the internal state, right?
They won't contain the congestion window and all that stuff, because it's not on the wire.
But all the rest, all the packet contents and what they do are in there.
And internally, of course, we use Wireshark for that, to be able to do that.
So Wireshark allows you to export not just a PCAP file, but also a JSON file, right, of the exact same data.
This JSON file is absolutely massive.
It's huge. Bigger than the PCAP that you have. Oh, yeah, way bigger.
I have a couple loaded here, several-hundred-megabyte JSON files.
Again, it's not a problem. But that's nice, the fact that we can do that.
We can ingest those JSON files, and then just map their schema to what we do in qlog.
And make it frictionless for developers that are used to using Wireshark to also be able to use our tools with that.
Just to interrupt, sorry, because I've never done that myself; we were talking about UI versus command line a little bit, like, 10 minutes ago.
To convert to JSON, am I doing that in the Wireshark tool itself?
Or am I doing it from a command line?
You can do both. So, qvis internally does it in an automated way, via the command line.
But you can also, I think it's File and then Save As, or Export As, or something like that.
And then you can just choose JSON.
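[Editor's note: the command-line equivalent, which is what an automated pipeline can script; file names are placeholders.]

    # Dump the full dissection of every packet as JSON. With the secrets
    # available (embedded, or via tls.keylog_file), decrypted fields are included.
    tshark -r capture-with-keys.pcapng -T json > capture.json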
And funny anecdote, if you do that, you do get like the decrypted data raw inside of the JSON.
A couple of years ago, I frustrated a lot of my students by saying, you know: oh, just send me the decrypted PCAP file, and I can look at it.
Because in my mind, it was very logical: if you had the PCAP and you added the keys, you would decrypt it,
and you would be able to store a decrypted PCAP file. Which makes no sense,
now that I know how it works, right? Because what would it do? Strip all the TLS records and replace them with the actual files or something? It wouldn't make sense.
But back then, it was like the most logical thing, you know: just store it decrypted-like and send it to me.
And the nice thing about the JSON format, that's what I was getting to, is that the JSON export from Wireshark does allow you to do that.
And so we can pick the things that we want from there for qlog, and really build extensive visualizations on that.
Cool. Yeah. Wireshark does actually have a, well, the feature has a horrible name.
I think it's called something like export PDUs or something like that. But basically, for simple protocols like HP over TLS, you can strip the TLS layer and still have a recognized request and response zone.
Obviously, you lose all contact about, oh, what SNI, what server name did you connect to?
Was it a resume session?
And so on. That's, I guess, also kind of like qlog, where the developer implementing qlog decides what kind of information to keep as a summary.
Yeah. And as somebody who's implemented this in our library, Cloudflare's quiche, which we use both for interop testing in simpler clients and also at the edge, where we're not running with qlog enabled, getting that right is the hard part.
Ideally, maybe we could come up with some profiles, but it all adds complexity.
And basically, it's me making a decision for everyone who might use quiche as a library.
And so for a lot of these things, you just pick an answer.
If you ask people, they don't have enough context to even understand the question that you're asking them.
So pick an answer.
Pick something that's straightforward to do and leave yourself scope to improve and iterate.
But if you don't hear anything back, then it's probably good enough.
And it actually makes a good project, like a good first task, for somebody who's maybe a student and says: oh, well, I'm investigating this.
I've tried one library and I wanted to try it here, but I noticed you don't log something.
And I can say: OK, well, yeah, you would add it here; go and give it a go.
So if there's any volunteers that want to enhance QLOG in Quiche, then let me know and I can help you.
I'm still waiting for a version two from Robin.
Well, I think a whole suite of improvements. Next week, right? What do you say?
Next week, it will be next week. Yeah, yeah, yeah. Of course. I'm a bit swamped right now.
But I hope to have it soon. It should also have some options to reduce the file size and that kind of stuff.
But it's a good point, Lucas. I think most of the projects implementing QLOG have open issues like that that are interesting for beginners.
Like Mozilla also has tons of those open. So people can definitely check that out if they want.
Hey, the QLOG library is shared across Mozilla's stack and our stack.
So if you're looking for a high-impact project to contribute to, here we go, QLOG.
Exactly. Yes. Yes. Well, actually, that's pretty cool.
Yeah, I'll get on to my podium now. So qlog, as Robin said, is JSON.
But that's the format of each file; what you also have is a schema that describes the format itself.
And so what I was interested in doing was kind of pushing that as far as I could.
So, taking the schema as it's written, in a language called TypeScript, and using it as an opportunity to learn a bit about Rust.
So, taking that schema and seeing how closely I could create an equivalent in the Rust language, and using something called serde, which I'd read a bit about, to help me bootstrap the serialization and deserialization.
So, to backtrack a bit: with some of the qlog implementations, what you see is that as people are, say, creating a QUIC packet or a QUIC frame, and they go to send that out on the wire, they will also maybe at that point hook in some code that prints text to a log file.
And at that point, they might just take the information to hand and use something like printf, almost, to write that out in a low-overhead way.
Which is definitely an option, and does scale and does work well.
But I was interested in kind of experimenting here.
So with my Rust object model, what I'm able to do is use some annotations on something like the structure of a QUIC frame and describe the fields.
And maybe annotate them slightly to say: oh, well, in Rust, it's spelt this way.
But in Robin's qlog model, it's spelt with a number zero at the front, and in Rust, you can't do that.
You're not allowed. So you can decorate things quite easily, and then have serde and the compiler generate all of the code that's needed to produce JSON from the actual Rust objects, which is cool.
But it doesn't just do JSON; it does whatever.
It kind of manages the intermediate format for whatever serialization target you want, which allowed me to try out something like CBOR, like Robin said, or something else.
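[Editor's note: a minimal Rust sketch of the technique Lucas describes, not the actual qlog crate's types; the type and field names are invented for illustration, and it assumes the serde (with its derive feature) and serde_json crates.]

    use serde::{Deserialize, Serialize};

    // qlog spells this packet type "0RTT", which is not a legal Rust
    // identifier, so serde's rename attribute bridges the two spellings.
    #[derive(Serialize, Deserialize, Debug)]
    enum PacketType {
        #[serde(rename = "initial")]
        Initial,
        #[serde(rename = "0RTT")]
        ZeroRtt,
        #[serde(rename = "1RTT")]
        OneRtt,
    }

    #[derive(Serialize, Deserialize, Debug)]
    struct PacketHeader {
        packet_type: PacketType,
        packet_number: u64,
        // Optional fields are simply omitted from the JSON when absent.
        #[serde(skip_serializing_if = "Option::is_none")]
        payload_length: Option<u64>,
    }

    fn main() -> Result<(), serde_json::Error> {
        let hdr = PacketHeader {
            packet_type: PacketType::ZeroRtt,
            packet_number: 0,
            payload_length: None,
        };
        // serde generates the serialization code at compile time.
        let json = serde_json::to_string(&hdr)?;
        println!("{}", json); // {"packet_type":"0RTT","packet_number":0}
        // ...and the reverse direction, for round-tripping.
        let back: PacketHeader = serde_json::from_str(&json)?;
        println!("{:?}", back);
        Ok(())
    }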
The interesting thing is when I tried to turn that around and do deserialization. Because I figured, I'd just written this thing that does both;
I'll round-trip one of my generated qlog files back into the tool and do something like count the number of packets, something really simple, just to do some analysis.
And then what we have is a library that's even independent of Quiche.
It could be used by anyone in the Rust community to build QLOG tooling around.
It all broke down quite tragically, because qlog has a lot of optional fields.
In a nutshell, the deserialization code takes bytes and tries to rebuild packets and frames from those bytes.
It was kind of doing fuzzy matching. So it's like: oh, these fields are the same,
OK, it must be this kind of frame. And I only noticed this because I was manually inspecting the output against the input and things like that.
So it kind of took me by surprise, because I didn't understand how all the stuff came together.
As soon as I had the compiler generate the code, not actually compiling it but looking at the deserialization code, it was thousands upon thousands of lines.
And someone said: oh, you could just write your own deserialize function to manage this.
And it's like: yeah, but with the nesting and things, it's basically nigh-on impossible for me to do that, unfortunately.
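[Editor's note: the "fuzzy matching" Lucas describes sounds like serde's untagged enum representation, where the first variant whose fields fit the input wins. A contrived sketch with invented frame types, showing how optional fields make that ambiguous; it assumes the serde and serde_json crates.]

    use serde::Deserialize;

    #[derive(Deserialize, Debug)]
    struct MaxStreamData {
        stream_id: Option<u64>, // optional, so it matches even when absent
        maximum: u64,
    }

    #[derive(Deserialize, Debug)]
    struct MaxData {
        maximum: u64,
    }

    // With no explicit tag, serde tries each variant in order and keeps
    // the first one that deserializes successfully.
    #[derive(Deserialize, Debug)]
    #[serde(untagged)]
    enum Frame {
        MaxStreamData(MaxStreamData),
        MaxData(MaxData),
    }

    fn main() {
        // Intended as a MaxData frame, but MaxStreamData also fits,
        // because its only distinguishing field is optional.
        let f: Frame = serde_json::from_str(r#"{"maximum": 10}"#).unwrap();
        println!("{:?}", f);
        // Prints: MaxStreamData(MaxStreamData { stream_id: None, maximum: 10 })
    }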
I'm sure somebody else who's better at these things probably could do it.
But, yeah. So I asked Robin, like, how did he even write any of the visualization tool?
How did you? Like, it just worked, didn't it, or something?
It's just... oh, yeah, no. First of all, sorry, Lucas.
That is because I tried to be too clever in pre-optimizing qlog, and that's why the deserialization stuff is more difficult.
And version 02 will address some of those things, because we also want to be able to give people the option of doing qlog in a more binary format.
But then it will be more restricted, like you said, and you have to choose which of the optional fields you do or do not do.
The way I did it, the nice thing, one of the reasons I chose JSON, is because I am mostly a web developer.
I am at home in JavaScript and that kind of stuff, and JSON evolved from JavaScript.
It's called JavaScript Object Notation, right? And there, it's that simple to handle all of those things, to iterate over these objects in a flexible way, and to handle whether you have a field or you don't.
So you have many more tools for that in a flexible scripting language like JavaScript or Python than you might have in Rust,
especially with one of those automatic libraries like serde to manage it.
So I kind of cheated. I admit it. But, you know, a lot of people have given pushback on that and I'm going to follow their advice to make it a bit more manageable.
I wouldn't call it cheating. I mean, ultimately, what you've been able to do is build some nice tools that generally help people and demonstrate the value of the thing and then get more feedback.
You know, this is the whole idea is to get some running code and see what pain points people have and stuff.
So I think. Yeah, exactly. The amount of response from the community has been amazing.
When I wrote the paper two years ago introducing qlog, I was like: OK, this sounds good.
But nobody's actually going to be interested in this. Nobody's going to actually follow this.
Who would be so stupid as to add JSON output to their high-performance C++ library, right?
Well, you know, very happy to say that that was not the case.
And many people are doing it. And now I think it's just fair that I, you know, take a few steps back towards them as well to help them in practical deployment issues for that so that we can all benefit in the long term.
But yeah, that has been surprising to me that people were actually interested in this approach and that it found this much uptake.
I've just been collecting some of the links of stuff we've talked about today.
I know sometimes they flash along the screen.
This is we've got a few minutes. We didn't get on to any of your visualizations.
I'm sorry, Robin, this time around.
I didn't want to interrupt you and Peter's flow of interesting stuff. But yeah, just to go over these links quickly.
The first one was the one Peter mentioned about the quick working group repository on GitHub where we keep all the specs.
We also have wiki pages with lots of additional stuff that's not, like, super related to the standards themselves, such as the interoperability testing that we've been talking about.
This is just the Tools page, but Wireshark is on that one.
I think there's the table you showed with all the different versions of Wireshark that support different versions of QUIC and everything, which is cool.
Yeah, I may add to that. There's also an implementation page which contains all kinds of test servers, test endpoints.
So if you're doing any research, you can just check that page.
I, for example, used that when I was trying to probe for supported QUIC versions.
It was really useful to have a list of all servers, plus the link to the source code.
And a lot of QUIC implementations are open source.
The sources are available, so you can check them out if you want to learn how it works.
Yeah, look at that. Lovely word wrapping.
It's all broken. I apologize. That's what you get for trying to be too clever.
Implementations. Implemen. Tations. Oh, yeah. Great. And then we've got a link to pcap2qlog, which is basically taken from a link on the qvis site, which I should paste in here as well.
I don't have one to hand. And then there's a link to the qlog crate and serde that I mentioned, too.
But before I miss the opportunity, I'd like to thank you very much, both of you, for coming on the show again and continuing the general kind of chat around this area and giving us some handy, like hands-on tips that actually people can use now.
In the two minutes we have.
I don't know if you want time to say anything specific. I'll hand it over to Peter first.
And you've got 30 seconds. There's still a T missing in the link for you, David.
Can't spell.
Thank you. Thank you. That's the best point you've made all call. I just wanted to really thank Peter and also Alexis.
Alexis La Goutte. I don't know how to pronounce his name.
The two guys from Wireshark who have been very active with QUIC from the start, and much more productive than me.
For me, it takes months to get a new qlog version out, but within days of a new QUIC draft appearing, they change Wireshark and make sure that everybody can keep on using their tools, which is great.
Truly great. Thank you for that. Yeah, I'd say thank you too, on behalf of me and a lot of people who use the tools without even realizing who the humans are behind them.
Thanks. Yeah, I think we'll wrap it up there.
Thank you again for your time, and I'll see everyone else, maybe, in a future episode of Leveling up Web Performance with HTTP/3.
Bye for now.