Cloudflare TV

⚡️ What Launched Today - Monday, June 19

Presented by Sam Marsh, Alex Krivit, Taylor Smith
Originally aired on 

Welcome to Cloudflare Speed Week 2023!

Speed Week 2023 is a week-long series of new product announcements and events, from June 19 to 23, dedicated to demonstrating the performance- and speed-related impact of our products and how they enhance customer experience.

Tune in all week for more news, announcements, and thought-provoking discussions!

Read the blog post:

Visit the Speed Week Hub for every announcement and CFTV episode — check back all week for more!

Speed Week

Transcript (Beta)

Hello everybody. Today is the first day of Cloudflare's Speed Week. Today we have announced six blog posts, talking about how we do ML inference at the edge, in microseconds, and how we scale ML.

One of our products called Orpheus has saved over 132 billion requests from failing since launching two years ago.

And finally, how the Cloudflare network continues to grow, now appearing with over 12,000 Internet networks in 300 cities.

There are also posts on low latency streaming and smart hints.

And I'm pleased to say I'm joined today on the Cloudflare TV show by authors of these posts to discuss them in more detail and answer some questions to hopefully shed some light on what they are.

So let's start by introducing Alex. Alex, can you introduce yourself, say what your role is at Cloudflare and tell us what it is you are announcing today?

Yeah, Sam. Hi, thanks. My name is Alex Krivit. I am a product manager for cache and some things here at Cloudflare.

And my post for today was about smart hints, which was an announcement that's been actually about a year in the making right now.

So people who have read and kept up with a lot of our performance announcements for the CDN and Cloudflare will remember that about two years ago, we started really talking about this product called Early Hints.

It was a feature that we had built that really helped improve performance. And so smart hints helps to build on top of that.

And I can dive into that, I think when introductions are maybe done here before we go too deep.

Yeah, definitely. So in terms of early hints and smart hints, can you give us like a really high level overview as to what the difference is there and kind of what was early hints, what is early hints and how does smart hints differ from that?

Yeah. And so early hints is a way that when a request goes into a web server, sometimes, especially if it's like a dynamic request, oftentimes like third party resources need to be gathered from many different places, both on the server side and then also on the browser side.

And so oftentimes there's this period after a request is received by a server called like the server think time.

It's not really thinking because like machines and thinking and that whole like generative AI aspect, we're not quite there yet.

But what happens oftentimes is that it's sort of pausing while it asks other places for data that's stored in other servers.

So it might need to go and talk to a database to get some information.

It might need to go and talk to another origin, another repository to get some style sheets or something else.

And so while it's doing those things, nothing is crossing the wire.

Nothing's going back to the browser.

It's just sort of there waiting for those responses so it can finish up the entire response and send that back through so the browser can unpackage it all and start loading everything for the browser.

And then oftentimes this is happening in milliseconds.

You don't even quite notice it from the perspective of the end user.

But what early hints does is it takes some of that information that the server has, the static information, this unchanging information, and it sends a hint to the browser saying, hey, for the final response, we think it's going to include these resources, or we think that you're going to need to grab some information from this third party, or something like that.

So it sends that sort of speculative hint a little earlier so that the browser can start doing things while the server is organizing the final response on the back end.

And so you have two machines working for you to try and build this faster response time for you.

And so that's early hints. And as you can imagine, a lot of that coordination has to be developed by the person who owns the website on the origin.

And so what is different about smart hints is that instead of having that coordination happen by the website owner on the origin server, it's going to happen at Cloudflare sort of in the middle.

So you won't need to set anything.

You won't need to say, hey, these are the origins that we're going to connect to.

These are the assets that I want to preload. Cloudflare will be able to do that for you based on some network heuristics that we have.
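As a rough illustration of the origin-side Early Hints exchange described above, the sketch below serializes the interim `103 Early Hints` response a server could send while it is still assembling the final page. The function name `build_early_hints` and the example URLs are made up for illustration; only the `103` status and the `Link` header format come from the HTTP specifications.

```python
def build_early_hints(links):
    """Serialize an HTTP 103 Early Hints interim response.

    `links` is a list of (url, rel, params) tuples; this helper and its
    argument shape are illustrative assumptions, not a Cloudflare API.
    """
    lines = ["HTTP/1.1 103 Early Hints"]
    for url, rel, params in links:
        value = f"<{url}>; rel={rel}"
        for key, val in params.items():
            value += f"; {key}={val}"
        lines.append(f"Link: {value}")
    # A blank line ends the interim response; the final response follows later.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical assets: a stylesheet to preload and a third party to preconnect to.
hints = build_early_hints([
    ("/static/site.css", "preload", {"as": "style"}),
    ("https://cdn.example.com", "preconnect", {}),
])
print(hints)
```

The browser can act on these `Link` headers right away, which is the "two machines working for you" effect: the browser fetches or connects while the server is still in its think time.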

Nice. And who's going to be the main people who would benefit from smart hints?

It sounds like early hints has been out for a year or so now.

Who's the main audience for this?

So I think that there's a few different people that I think will benefit greatly from smart hints.

One group is people who don't have access to their origin, for example when a SaaS provider does all of the magic origin configuration on the back end and you're just sort of in front of them, maybe a merchant store that's part of a larger sales platform.

If you want to load things very quickly, then you can turn this on in your Cloudflare dashboard and we'll start setting those hints and priorities on your different assets for you.

And so that will be a benefit, I think, to a lot of retail stores and other places like that, because what we've seen, especially when we launched early hints, was that for every 10% faster that a merchant site was able to load, they saw 7% more conversions in terms of revenue and sales.

So that's huge. I think another group of people that will benefit tremendously from smart hints will be people that maybe don't want to set them or don't know what to set and just want to just say, hey, I want to focus on building my app and I want you guys to take care of performance for me.

Those people, I think, will benefit greatly because they won't have to go through and do all this configuration and then, if their page changes, go back through and set the hints again.

We'll just do it for them on the fly. And so I think that those groups of people will benefit tremendously.

And a random question off the top of my head: I know that there's been support for this in Chrome for quite a while now.

We're starting to see more and more support for this in other browsers and other platforms.

Can you tell us a little bit about that and some of the news on that?

Yeah. So we saw that Chrome was the tip of the spear. They were our initial partner in this when we were building it out and testing it a couple of years ago.

We just saw recently during the Apple event that took place last week, a week and a half ago, something like that, that in Safari 17, they're now going to be supportive of early hints as well.

And so that means that a lot of the browsers that we see and a lot of user agents that we see across our logs will have early hint support, which will be huge, I think, in terms of reducing the wait time on the Internet tremendously.

Yeah, especially from a mobile perspective as well.

I think the predominance of Safari on mobile devices is huge, right?

So really excited to see the impact that that's going to have just across the whole Internet.

In terms of what Smart Hints provides, what it gives to people, can you explain to us at a higher level what RUM is and how Smart Hints is going to help basically people get better RUM, get better scores effectively for their websites?

Yeah, and so this sort of gets into how the sausage is made a little bit, like how we're going to be setting these early hints and these priorities on behalf of users here.

And so RUM is Real User Metrics, is what it stands for, it's an acronym, and it's a data pipeline that we are building here at Cloudflare.

We have a lot of pieces set up already and are using it in certain aspects of our sites and our features today.

What happens generally is that when your response goes through Cloudflare, a small little piece of JavaScript gets set there, and when it gets loaded in the browser, we get some information back about how well it was loaded, what sorts of things were fast, what sorts of things were slow.

And using that information, we can start building sort of the best way to prioritize and to hint on various information based on that data that we have.

And so over time, we'll be able to do things like, hey, these types, these trends of data of these websites have these sort of load data, and we can start building sort of these rules based on pure performance.

So we can really set the best performance for how Cloudflare and your browser negotiate for loading your website for your visitors, which will be sort of a great way for people to experience the true performance benefits of using Cloudflare.

Us being able to rewrite these websites on the fly so that they are fast and optimized for the end user is going to be a huge, huge win for users of Cloudflare and the Internet.
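To make the idea of turning RUM data into hinting rules more concrete, here is a minimal sketch. The heuristic is purely an assumption for illustration (the transcript does not describe Cloudflare's actual rules): a resource becomes a hint candidate if it appears in most page loads and is typically slow to load. The function name, thresholds, and sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def pick_hint_candidates(page_loads, min_share=0.9, min_median_ms=100):
    """Pick resources worth hinting from RUM-style samples.

    `page_loads` is a list of dicts, one per page view, mapping a resource
    URL to its observed load time in milliseconds. The thresholds are
    assumed values, not anything Cloudflare has published.
    """
    timings = defaultdict(list)
    for load in page_loads:
        for url, ms in load.items():
            timings[url].append(ms)

    total = len(page_loads)
    candidates = []
    for url, samples in timings.items():
        share = len(samples) / total  # fraction of page views using this resource
        if share >= min_share and median(samples) >= min_median_ms:
            candidates.append(url)
    return sorted(candidates)


# Hypothetical measurements from three page views.
samples = [
    {"/app.css": 180, "/hero.jpg": 420, "/promo-a.js": 90},
    {"/app.css": 210, "/hero.jpg": 390},
    {"/app.css": 170, "/hero.jpg": 450, "/promo-b.js": 300},
]
candidates = pick_hint_candidates(samples)
print(candidates)
```

Resources that only show up on some page views are skipped here, since hinting them would waste bandwidth for visitors who never need them.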

Yeah, definitely, definitely.

Switching gears now completely, we've spoken just a huge amount about early hints.

Can you talk to us about what role fetch priorities are going to play in this?

Talk about what that is and kind of how that fits into Smart Hints?

Yes. Our sort of entry point into sort of this world of doing this on the fly was early hints, and I think we sort of described that.

The other piece of this, which is sort of interesting, is that browsers oftentimes, when they get a full page back and they're reading through the HTML and loading it all, they do it sort of from top to bottom.

And they have to stop whenever they see some sort of render-blocking script. Again, like we were talking about with going and getting something from a third-party server somewhere, executing a script sometimes means discovering all these new assets and all these new things that the browser then has to go and get from somewhere else.

And so sometimes those scripts block the entire render of a page.

So you might want to prioritize them earlier so that the browser can do all that work ahead of time.

And that will help improve things like the perceived functionality and the way that the user experiences your website.

And so priorities are ways that you can say, hey, even though you're reading this in the browser world from like top to bottom, you should actually sort of look at this asset before you look at this asset in that default priority, because this one's going to be better for end user experience.

This one's going to have the primary content that's on the page.

So maybe an image that's above the fold, you should load first, because that's the first thing that the user is going to see, and that's going to be a better experience than trying to load an off-screen image for the end user.
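The above-the-fold example can be sketched with the standard HTML `fetchpriority` and `loading` attributes. The helper below is a hypothetical illustration of the kind of markup rewrite being described, not Cloudflare's implementation, and the image paths are invented.

```python
from html import escape


def image_tag(src, above_the_fold):
    """Return an <img> tag whose fetch priority reflects its position.

    Above-the-fold images are marked high priority; off-screen images are
    marked low priority and lazy-loaded. This helper is an illustrative
    assumption, not part of any Cloudflare product.
    """
    safe_src = escape(src, quote=True)
    if above_the_fold:
        return f'<img src="{safe_src}" fetchpriority="high">'
    return f'<img src="{safe_src}" loading="lazy" fetchpriority="low">'


# Hypothetical page with one hero image and one off-screen banner.
tags = [
    image_tag("/hero.jpg", above_the_fold=True),
    image_tag("/footer-banner.jpg", above_the_fold=False),
]
print("\n".join(tags))
```

Even though the browser still parses the HTML from top to bottom, these attributes tell it which fetches matter most for what the visitor sees first.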

Yeah, yeah, that makes sense.

And again, like you say, that all this optimization should and will result in a better user experience ultimately.

Yeah. Can you kind of talk us through, just coming to the end of your slot, can you talk us through how this is going to be priced?

Are we going to charge for this? Who's going to get access to this?

What are your thoughts there?

Yeah, so it's a good question. When we announced Early Hints, it was free.

And the reason that it was free is because you have to do a lot of work on your side.

You have to organize all of the assets at your origin in your HTML and make sure that it's working through Cloudflare.

And that's still going to be an option.

You can still set Early Hints on your origin for free.

But if you want Cloudflare to do it for you, then we're thinking that it's probably going to be included in a Business or Enterprise plan.

And so if that's something that you're interested in, and if that's something that you'd like for Cloudflare to do for you, then definitely look into the business or enterprise plan, because I think that you'll see a lot of value there, both in terms of performance and other features that you have available at those higher plan levels.

Perfect. And finally, how can people get their hands on this?

What's the timeline looking like for getting this into the dashboard?

That's a great question. So we're announcing sort of a closed beta cohort today.

And so you can sign up in the dashboard. And as we're sort of putting the finishing touches on the product over the next quarter or so, we're going to be opening that beta cohort to people that we think will give the best feedback initially, so that when we GA it, when we open it up to everybody in the world, that we can be sure that people are going to see great results.

And so if you want to be part of that closed beta cohort and want to give us good feedback and work with us, definitely sign up, and we'll be in touch shortly.

And we'll open this up to everybody without needing to wait or be part of betas when we get there.

Perfect. Perfect.

Thanks very much for taking the time to explain everything. I learned a lot, and hopefully everybody watching has.

Switching gears completely now. Taylor, talking about low-latency HLS for Stream.

Can you give us an introduction, what your role is at Cloudflare, and tell us what it is that you are announcing today?

Hi, my name is Taylor Smith.

I'm the new product manager for Stream, and today we are announcing closed beta support for low-latency HLS.

Nice. Perfect. And can you describe what that is for people like me and kind of how that differs from what's in place today?

Sure. So when Stream Live launched a little less than a year ago, we started with HLS, which is HTTP live streaming that allows servers to take video that's being broadcast and chunk it up to be able to be delivered live to any kind of browser or device for websites and web applications to add live streaming features to what they're building.

And HLS really prioritized compatibility, so that's different browsers, different mobile operating systems, so that it could be played on any device.

Resiliency, because if there's packets lost or there's network problems, you want to make sure that that live streaming video continues to play smoothly.

And quality and features like adaptive bit rate, multiple audio channels, captions, those kinds of things.

And when HLS was introduced, latency wasn't the highest-priority feature.

And so the way HLS is built, what we're calling glass-to-glass latency, which is the time between a broadcaster doing something on their end and a user seeing it on their screen, can be much longer, you know, 30 to 60 seconds, depending on your provider and a lot of settings along the way.

But protocols coming out like WebRTC, which we also have beta support for, and other protocols are making latency a really hot topic right now.

So low-latency HLS is an extension of HLS that builds upon it and makes it faster.

So it uses smaller chunks with hints that can be fetched more frequently by the end browser, which allows it to catch up and play video much closer to its real-time recording.
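The "smaller chunks with hints" map onto tags in the low-latency HLS playlist format: `EXT-X-PART` advertises a partial segment that is already available, and `EXT-X-PRELOAD-HINT` names the next part so the player can request it before it exists. The sketch below parses a trimmed, made-up playlist (it omits tags a real playlist would need, and the file names are hypothetical); it is not taken from Cloudflare Stream.

```python
import re

# A trimmed, hypothetical LL-HLS media playlist for illustration only.
PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-PART-INF:PART-TARGET=1.0
#EXTINF:4.0,
segment10.mp4
#EXT-X-PART:DURATION=1.0,URI="segment11.part0.mp4"
#EXT-X-PART:DURATION=1.0,URI="segment11.part1.mp4"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="segment11.part2.mp4"
"""


def parse_ll_hls(playlist):
    """Return (parts, hint): available partial segments and the hinted next part."""
    parts, hint = [], None
    for line in playlist.splitlines():
        uri = re.search(r'URI="([^"]+)"', line)
        if line.startswith("#EXT-X-PART:") and uri:
            duration = float(re.search(r"DURATION=([\d.]+)", line).group(1))
            parts.append((uri.group(1), duration))
        elif line.startswith("#EXT-X-PRELOAD-HINT:") and uri:
            hint = uri.group(1)
    return parts, hint


parts, hint = parse_ll_hls(PLAYLIST)
print(parts)
print(hint)
```

In this example the player can already play one-second parts of segment 11 instead of waiting for the full four-second segment, which is where the latency saving comes from.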

And it sounds like an obvious question, a simple question, but why does it, like you say, it's becoming more of a focus now on performance.

Why do you think that is? Why do you think people are looking to drive this latency down for these broadcasts?

I think it's partly, you know, competing technological solutions, different protocols that would deliver video faster if you switched to something else.

But the end use case is really if you've got, you know, someone who wants to interact with a presenter.

So like video game streamers, when you're watching somebody, you can either chat with them or you can like send them whatever on different services.

Or if you've got like an event that's being broadcast live, but also has a live audience, you want those two groups of people to have a similar experience.

And I think this came up this morning when I was reading the how to be on Cloudflare TV prep notes.

There's a note in there that says, hey, if you're going to be interacting with your audience, you should know that there's a 20 to 30 second delay between when you do something and when the audience sees it.

Be ready for that in any interactivity you have with your audience.

That's what we're talking about. So bringing that latency down from 30 seconds to less than 10.

Huge, huge. And so, talking about LL-HLS, low-latency HLS, in general, how does this help content creators specifically?

So the ones who are doing these broadcasts.

So one of our use cases is that a lot of our customers are building platforms that allow people to stream content like video games or sports or, you know, worship or other things where there's either a live audience nearby or the person presenting is trying to get input from the people who are watching.

And so there's, if the latency is really long, that creates kind of an awkward gap where you ask a question and then you kind of wait for the chat responses to come in.

Or if you're on the audience side, if you're trying to like get a broadcaster's attention, like you're watching them, you know, do something and you send them a chat message, you send them some kind of a reaction.

They might not get it for a long time.

And so it creates this weird, you know, lots of sort of awkward lags. Getting those things to happen a lot faster allows the event to feel more live and, you know, brings audiences and creators closer together.

Nice, nice. And how is this going to be priced?

So this is going to be included in Stream plans going forward.

So it's currently in a closed beta, but when it goes GA, the pricing will stay the same: it costs $5 per thousand minutes of video stored per month and $1 per thousand minutes of video viewed per month.

And that is Stream all in. So this new LL-HLS support will be included in that rate.

And, you know, the great part about the way Stream is priced is it gets us out of any kind of surprise fees.

There are no separate charges for ingress, which is broadcasting to us; compute, which would be encoding and adaptive bitrate variant generation; or egress, which is the data expended getting video out to your users, which is pretty enormous with live video streaming.
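As a quick worked example of the rates quoted above, the sketch below estimates a monthly bill. It assumes usage is billed linearly per minute at those rates; actual billing increments are not covered in this conversation, and the function name and usage figures are hypothetical.

```python
STORED_RATE_PER_1000_MIN = 5.00   # dollars per 1,000 minutes stored per month
VIEWED_RATE_PER_1000_MIN = 1.00   # dollars per 1,000 minutes viewed per month


def stream_monthly_cost(minutes_stored, minutes_viewed):
    """Estimate a monthly Stream bill from the two rates quoted in the talk.

    Assumes simple linear pricing with no minimums or rounding, which is an
    assumption made for this sketch.
    """
    stored = minutes_stored / 1000 * STORED_RATE_PER_1000_MIN
    viewed = minutes_viewed / 1000 * VIEWED_RATE_PER_1000_MIN
    return stored + viewed


# Hypothetical month: 2,000 minutes stored and 50,000 minutes viewed.
cost = stream_monthly_cost(2_000, 50_000)
print(f"${cost:.2f}")
```

Because ingress, compute, and egress are not billed separately, these two terms are the whole estimate under this assumption.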

Yeah, perfect. And how can people get their hands on this? I know it's part of Stream, or it's going to be part of Stream. If they want to try it today, when can they get hold of it?

So we're doing closed beta today.

So there's a blog post that goes over this and also some of the requirements for the things that were eligible for the test at this time.

And then there's a link to sign up, and it won't show up in the dashboard.

It's currently, you know, you got to go through that registration flow.

And then it does not change the experience in the dashboard, but there might be a minor code edit that you would need to make on the player side to make sure that the player uses the new playlist that supports LL-HLS.

Perfect, perfect. Thank you. Thank you very much. Looking like that's the end of our questions.

Thank you very much for walking us through that, Alex and Taylor.

That's really intuitive. And I think a lot of people will learn a lot from that, plus the accompanying blogs, which we're announcing today.

For everyone watching, join us again tomorrow. We have two sessions, one kind of like this, where we're walking through the day's announcements, and one where we'll be doing a deep dive on a to-be-announced, very interesting product set.

So make sure you tune in, make sure you read the blogs and make sure you keep on reading and keep on watching.

Thank you very much and see you tomorrow.

Thank you.
