Emerging standards in WebRTC live streaming
Presented by: Brendan Irvine-Broque, Renan Dincer, Dan Jenkins, Jonas Birme
Originally aired on January 19 @ 9:30 AM - 10:00 AM EST
Tune in to learn how new standards are making it easier to integrate real-time video into apps, traditional broadcasts and beyond.
Transcript (Beta)
All right. Hi, everybody. My name is Brendan Irvine-Broque, product manager for Cloudflare Stream.
And we're here today to talk a little bit about WebRTC and some exciting new standards for live broadcasting.
We have a couple of guests here today that I'll introduce in a second.
But for those watching, you might be familiar with WebRTC for things like video chat.
WebRTC is like the technology that powers services like Google Meet and others.
But another use case is broadcasting live video at ultra low latency.
And that's kind of the space that everybody here today, our guests, has been working to push forward.
So to kick things off, maybe go around and introduce ourselves, starting with Dan.
Hi, I'm Dan Jenkins.
I run a consultancy, a real-time consultancy in the UK called Nimble Ape Limited.
And I also run a service called Broadcaster VC that brings in remote talent in ultra low latency into professional AV workflows.
That's me. Jonas, go for it.
All right. Yeah. I'm Jonas Birme. I'm working at Eyevinn Technology.
We are independent specialists in video streaming. We are helping customers with tech strategy and software development.
And I've been involved in R&D on the WebRTC distribution side on the broadcast and doing some proof of concepts and helping out with this conversation there.
My name is Renan. I work with Brendan on Cloudflare Stream as well.
Super exciting to have this conversation. Great. So maybe Jonas, to kick us off, could you start out by explaining a little bit more about these new signaling protocols in WebRTC called WHIP and WHEP and, you know, why these can make broadcasting over WebRTC a little bit easier for people?
Yeah, that's a very good question.
Why do we need new standards and new protocols? I mean, WebRTC is a set of standards and protocols.
So it's not like you actually need any more.
But the thing is here, I mean, when you build, when you typically build a WebRTC based application for video conferencing, you have the same app for both sending and receiving, right?
Because we are all sitting in the same room in that sense.
So you can, you don't need to, you have basically the full control of setting up the room, discovering the participants and all those things.
When you take this into a more of a broadcast scenario, when you have like one-to-many distribution, then you have a set of producers or broadcasters where you, I mean, it could be at one arena or another producer, another arena.
And then in the middle, you have the actual distribution part.
And then on the viewer side, you have different services consuming and should be able to consume this and the platforms and the devices and all those things.
So what has been very successful sort of in the broadcast industry is that we've divided this into three parts.
So you have one part, which is on the sort of left-hand side, where you have the producer production side.
In the middle, you have the distribution side and on to the right, you have the consumption side.
And by having standardized interfaces between these parts, everyone can focus on their own part.
So the production side can still continue focusing on their side.
And they know if they build a product with a standard interface into a WebRTC based distribution platform, they know it will work with all distribution platforms.
The same goes on the playback side. If you build a player that works with a typical WebRTC based distribution with this standardized protocol, they know that it could work with ours as well.
And the same goes for when, if you would build a distribution platform, if you have a standardized in and standardized out, you know it will work with any type of production part or also on the consumption part.
And what is WHIP and what is WHEP? They are very similar in that sense.
It's an HTTP-based protocol. It's basically a standardization using HTTP, HTTP methods, very RESTful in many ways.
And it uses this just to transport the SDP packet, or the handshaking documents, that is needed for two peers to connect.
So on the WHIP side, which is the ingest side, you have a single-direction stream from the production to the distribution, which is the sort of receiver.
And then on the playback side, we have WHEP, which is similar in that sense, but it's receive-only, from the distribution network to the player.
So it's nothing more revolutionary than that, I would say. But the thing is that this is something that is standardized.
So if I would build a sender software or hardware, then if I would implement WHIP as a protocol, I can handshake with all the other WebRTC based distribution platforms out there.
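The HTTP handshake described above (WHIP for ingest, WHEP for playback) can be sketched roughly as follows. This is a minimal illustration, not any vendor's implementation: the function and type names are hypothetical, and the endpoint URL is a placeholder. What it assumes from the protocols themselves is the shape of the exchange: the client sends its SDP offer in an HTTP POST with content type application/sdp, and a successful response is 201 Created with the SDP answer in the body and a Location header naming the session resource.

```typescript
// Hypothetical helper names; only the HTTP shape follows WHIP/WHEP.
interface SignalingRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

interface SignalingSession {
  answerSdp: string;   // SDP answer returned by the server
  resourceUrl: string; // session resource; an HTTP DELETE here ends the session
}

// Build the POST that carries the SDP offer to a WHIP or WHEP endpoint.
function buildSignalingRequest(
  endpoint: string,
  offerSdp: string,
  bearerToken?: string
): SignalingRequest {
  const headers: Record<string, string> = { "Content-Type": "application/sdp" };
  if (bearerToken !== undefined) {
    headers["Authorization"] = `Bearer ${bearerToken}`;
  }
  return { url: endpoint, method: "POST", headers, body: offerSdp };
}

// Interpret the server's reply: expect 201 Created, an SDP answer in the
// body, and a Location header that may be relative to the endpoint.
function parseSignalingResponse(
  endpoint: string,
  status: number,
  locationHeader: string | null,
  body: string
): SignalingSession {
  if (status !== 201) {
    throw new Error(`expected 201 Created, got ${status}`);
  }
  if (!locationHeader) {
    throw new Error("missing Location header");
  }
  return {
    answerSdp: body,
    resourceUrl: new URL(locationHeader, endpoint).toString(),
  };
}
```

In a browser, the offer would typically come from RTCPeerConnection.createOffer(), the request would be sent with fetch, and the returned answer would be passed to setRemoteDescription(). Keeping the request and response handling as plain functions is only a choice made here so the sketch can run without a browser or a server.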
Yeah. That consistency is super important for kind of growing adoption.
Dan, I'm curious, you gave a whole talk actually this year at Demuxed, the big video engineering conference, about some of the pain points of working with WebRTC.
And so I'm curious for your perspective on the value of standards of clients being able to kind of talk to each other and speak a common language.
So standards are hugely important.
And anyone that tells you different doesn't really know the real world.
I mean, back when WebRTC kind of got formed, the big pro was, oh, there's no signaling standards.
We aren't all forced to do things with SIP or we aren't all forced to do things with this or that or XMPP or whatever was the crux of your niche.
And it was fantastic that we didn't have anything that held us back. But as time's gone on, many, many, many people, including the broadcast industry and many others, have all said, well, we're not all going to implement 10 different ways of talking to provider X and provider Y.
So standards are huge. Yeah, the talk I did at Demuxed was a little bit, I don't know, what's the word, in your face around W3C, IETF standards and kind of how much of a pain developing WebRTC applications can be.
And unfortunately, we're still in this position where if you actually want to go build an app using a browser, I mean, you probably build it in, well, you build the base of it in the grand scheme of things in hours, right?
But then you spend weeks and months actually kind of building all of the stuff around it that protects it from, oh, Safari does this weird thing, Firefox does this weird thing.
Oh, Safari doesn't even do this other thing.
So we can't do that in that browser. So we have to work around all of that.
So yeah, we're now what, 10 years into the WebRTC journey, basically.
And we're still kind of dealing with these oddities on a daily basis.
And I know that web developers have this with other APIs from the W3C. But because WebRTC is such a huge set of different things, there are so many different parts of it.
I feel as though WebRTC kind of gets a lot of the, we'll just implement this little bit of this API and we'll implement this other little bit of this API.
And then, oh, there's a bug, but 99% of people aren't going to really find that bug.
So we're not going to bother fixing that bug. And I mean, it's not just the W3C, there's lots and lots of other places as well.
Can I ask you a question, Dan?
Yeah, absolutely. Also, I just noticed Christoph from the Demuxed Slack room was pointing out the proof behind you that you gave a talk, actually.
But my question is, why, if there's all these problems with WebRTC, even if it existed for 10 years, why aren't we creating just another standard that's going to replace WebRTC for this use case?
Why are we using kind of an existing standard and building on top of it?
I mean, so I think in the semi to near future, we might have other things like, so WebTransport and WebCodecs and Media over QUIC, or just RTP over QUIC, or whatever you want to define this new thing that's coming down soon, air quotes soon.
Then that's really interesting, right? But the reason why we all put so much effort into this today is there are billions of devices that know how to do WebRTC straight away.
So you don't need to install an app. You don't need to do anything.
You can open Safari on any iPhone, pretty much, unless you haven't updated it in three, four years, or whatever it is now, or open up the default browser on an Android phone.
They all support WebRTC. So this is why we put so much effort into WebRTC still.
And ultimately, it's still an excellent tool. That's not to say another good tool is not coming soon, potentially, because it is, potentially.
Well, no, it is. In what form, that's still kind of up in the air. But this is why we put so much effort into it, because it's available.
And whether or not we complain about it an awful lot, we complain about, I don't know.
But it was still the thing that was available to X number of people. So we supported it, because that's important.
And also, I want to add, if you look at the, let's say, on the contribution side, those protocols that we have, RTMP, SRT, and those, I mean, none of them are actually sort of web-based.
I think WebRTC is a web protocol.
It's a W3C protocol. So it has the, I would say, the fundamental part that makes web so successful, or where we are today, is that there is a web standard.
So it is, as you say, it's available in basically all browsers out there, whatever type of device.
I mean, it's extremely hard, I would say, or I would say impossible to implement SRT in a web browser.
I don't think you can do it.
And even though both UDP-based, there's a lot of similarities, but you can't build a sender application just from your browser.
So you would always have to install an app that you would need to implement the SRT protocol and then start sending.
And the same goes for RTMP and those others. So that is one thing as well, I would say, why it's still worth investing the time in this, because maybe there will be something better around the corner, but I think we are a bit far from that corner anyway.
For sure. Yeah, it's a huge barrier for people, even just as end users trying to figure out how they're going to go live.
It requires, you know, them to learn OBS or other pieces of software, which are just more advanced tools than what I think we're all used to expecting, which is you go to a website, you have a video chat, it works.
That's kind of, that's what we expect to work in 2022.
I think we all take for granted that we can just go on an app store as well and just download an app, whereas data is more expensive in other parts of the world.
Bandwidth is more constrained in other parts of the world.
Google went on a big push around Chrome and PWAs and TWAs and everything like that around this very reason.
So it's really, really easy for us in the US or Europe or whatever to say, oh yeah, just go download an app.
And it takes, you know, 30 seconds to a minute to install an app.
But in other parts of the world, it really isn't that simple. And they do just have a browser and they just want to be able to connect to something using that browser.
Yeah, just kind of the adoption. Yeah, go for it. Yeah. And it's not also just publishing.
I mean, publishing an app on an app store, it's not something you do just for, yeah, I just want to try this whole idea and share with my friends.
It's a bit of a hassle. So, I mean, many web pages or services on the Internet, they have started from, yeah, I have an idea I want to try out and you could just try that out.
You don't have to fill in some forms and sort of how much revenue do you think you will make from this, blah, blah, blah, you know?
So those are also things, I mean, and since this is open standards, I mean, yeah, you could say there is a walled garden because you have these dominant browsers and vendors, but everyone could actually build their own browser implementing the protocol.
So you're not forced into that, even though it will be unlikely, but yeah. Totally, totally.
So turning things over a little bit, I'm curious, Renan, what's Cloudflare's role in all of this?
I think as Dan might've alluded to earlier, a lot of people think of WebRTC as this kind of peer-to-peer only thing.
If you read a bunch of the APIs on MDN, you're like, peer connection, peer connection, peer connection.
What are we doing in all this? Well, standards are worthless if nobody implements them, right?
And you have to have, what makes a standard useful is that the industry adopts them and then maybe as they're in the draft stage and then shows, okay, this can work and there's a business model under it so that more people can support it.
And I think what we're trying to do is promote, as Jonas said, we don't want another proprietary protocol or we want standards and we want everything to be intercompatible because of all the reasons we discussed.
And I think what Cloudflare is trying to do is kind of be an implementer in the networking space, server, how can we make an implementation that replaces the distribution side of what's on the web, right?
Jonas was saying, you have players, you have contribution side software in the production, we need distribution and we're trying to use these protocols that are coming up to make that happen.
And I think it's just important to kind of, especially for people that are new to WebRTC, as Brendan was saying, all the docs talk about peers, but it's really, really important to remember that a peer can be a server.
And I don't know what the figure is, but it's probably in the high, high percentages of all WebRTC sessions go via a server, go via an MCU or an SFU or something.
A very low percentage in the grand scheme of things probably are what we call peer-to-peer, which is one device behind a NAT and another device behind a NAT.
And WebRTC enables you to kind of connect those two together, even though they're kind of in their own private networks, but probably in the very, very high percentages, we are talking about a device talking to a server that has a public IP, right?
99%, a lot of the time.
Yeah, I think, you know, when we were talking right on, when we were first launching our WebRTC kind of solution, you know, we knew that WHIP and WHEP were early along, you know, they're early in their journey, but we wanted to make that like big bet to push those protocols forward and to really like, you know, make sure that people understood the value of these.
That's kind of why we wanted to get even this group together is like, get a wider set of people thinking about why this can be exciting.
I'm curious, like Jonas and Dan, you've both been spending a ton of time recently, it seems with people in the broadcast world who are coming from like, you know, TV production backgrounds and more traditional setups.
I'm curious, like, what kind of challenges do they face right now trying to bring WebRTC content into their broadcasts?
Jonas, would you like to go first? Yeah, I mean, I think it's still a bit of a learning curve.
When you bring up WebRTC on the table, there is still some skepticism.
You're thinking, well, peer to peer, no, no, no, I want to be more in control of the distribution here, which stands as a quite big misconception, because that's not how it normally works.
So I think that's one of the challenges is sort of understanding.
And that is what WHIP and WHEP have actually helped a lot in sort of drawing out this, okay, you have your sender production contribution, familiar, feeling familiar, then you have the distribution, which could be whatever, and then you take WHIP there, and then here are some media servers, SFUs, MCUs, whatever, mainly SFUs, I think.
And then you have the WHEP side, which is sort of like HTTP, HLS, MPEG-DASH, they know that.
And then you have sort of the player, which is, yeah, you can play it in everything.
The key part is, I think that the driver here is, of course, it's always about the latency and bringing down the latency down to real time, it opens up some new possibilities.
I would say, so there is some interest.
And I mean, a couple of years ago, no one would even listen if I would say something about WebRTC.
Today, there are some more listening, but I think we might come to that.
There are still some challenges that need to be solved for it to actually work.
But there is still some sort of, there is still some why question mark on all this.
Yeah. And Dan, I know you've been focused on this problem a ton with Broadcaster VC.
Yeah, so I guess I'm more focused on ingest than say egress, which is what Jonas was just kind of talking about mostly.
The pandemic kind of changed everything, right?
And we say that an awful lot now. But it did in this space, especially because before the pandemic, if you wanted to get someone on TV, you'd just, you know, send out a TV crew or whatever.
And then you might occasionally do something via, you know, Skype maybe or Zoom.
But pandemic happened and everyone wanted greater control and everything that comes with being able to kind of do all of the knobs and levers and everything.
So, and they're less worried about, say, DRM, et cetera, on ingest because they don't care.
Like they're just getting this media in and they want control of that and it's coming through to them.
And so there's less of a fight around that. There's still fights around like quality and protected bandwidth and everything else.
But ultimately, the broadcast industry in general does see the pros of what WebRTC can offer.
Yeah. One of the interesting use cases I heard about just the other day was a company who has now gone to everybody working remotely doing TV productions and they needed feeds of multiple cameras so that the people working remotely could, you know, make production decisions on the fly.
And they were kind of experimenting with using WHIP for ingest and then that playback happening over WHEP so they could get that ultra low latency while everybody's working in different locations around the world doing a live broadcast, which is just like, you know, it sounds normal to us post COVID, but like, that's a crazy idea to think about doing real time TV production from your, everybody's living room.
Oh yeah. And then add in like out-of-band audio so that they can all talk to one another and it's a logistical nightmare, but it's also incredibly fun.
These are the things that like we were trying to kind of do before the pandemic and, but no, there were proper ways of doing that.
There are proper ways of doing that.
So, but suddenly it's cheaper to do it this way. And yes, you might lose a little bit of quality because you're not in complete control of the network end to end maybe, but the cost to benefit to everything ratio is huge.
Totally.
So I'm curious kind of in this space, there's two angles here. Like, you know, what, what problems do you feel like are unsolved?
I know Jonas, you've been doing so much prototyping of kind of the translation layer going between different protocols, trying to figure out what the interoperability story is.
Curious if you could speak more to that.
Yeah. I mean, I mean, there's a lot on the egress side I would say we've been looking at because it's sort of, I'm trying to do web RTC based distribution and getting the latency down that way.
So obviously there are things that always come up and Dan mentioned one thing is the DRM, which is you think it wouldn't be that hard.
I mean, you have EME in the browsers, they all have a CDM.
They can all handle DRM. They can protect the screen recording, whatever, whatever.
So you have the technology in there, but it's fundamentally almost impossible to get that from an RTP stream into that.
So that's, that's one part, which I mean, there are workarounds here.
I would say there are ways to solve this or recon this, but the best way would be to actually get that in the browser.
I mean, get that connection between RTP and EME somehow.
I'm not sure exactly what will be the best way how to do it, but I would say that one, one is one part.
And then of course, I mean, codecs, that's also a lot of talk about, I mean, it's supporting HEVC, et cetera, et cetera, premium codecs and all those things.
And all of these things of course come up as a little bit of blockers when you're talking to broadcasters, broadcasting premium content.
But when you sort of lay that on the side saying, yeah, this needs to be solved.
I think it will be solved. If many people want to solve it, it will be solved.
And but until then, there are a lot of things you can explore anyway.
I mean, in that way, there are so many use cases that is interesting also on the sort of distribution side for it.
But yeah, there are challenges that, that needs to be overcome of course, before we are done.
Yeah.
I'm curious, like what kind of changes would we all love to see happen in the WebRTC community that would, you know, increase that adoption of WHIP and WHEP, whether that's like plugins for different pieces of open-source software, support in different streaming clients.
Like, I'm curious what, what, what your thoughts are.
Maybe even, maybe even, you know, support on smart TVs and, you know, you know, Apple TV or whatever, you know, there is.
So that's, and, and more standard, you know, implementations like that interop with each other that people test with, because, you know, people, developers coming to the space, they look and they're like, oh, okay.
There seems to be, you know, if they see, I don't know, like 200, you know, packages that like all implement the same thing and they all work, they're going to come and be like, okay, there's something for me.
Right. There's something that's going to work with my, I don't know, embedded device that does, you know, video streaming on my fridge.
Right. You know, or buy-in from like big players, like AVPlayer, ExoPlayer, things like that.
That's like the huge vote of confidence. We hear a lot from people building mobile apps where this is all possible.
I mean, like, you know, Dan, you've built tons of mobile apps using WebRTC before, but how do you make it so that somebody who doesn't know the video space can just jump in and be like, okay, I get it.
I mean, WHIP and WHEP by themselves will kind of allow that to happen, like just naturally.
Right. Like the amount of buzz that I've seen in the past, probably six months around WHIP and WHEP.
It's interesting. WHIP's been around for a little while now. WHEP has suddenly kind of caused this new kind of spur of activity around WebRTC.
It's really quite interesting.
And that's just because we're now seeing like GStreamer now have a WHIP and a WHEP plugin written in Rust.
OBS are talking about supporting it. There's just a ton.
And then there's been a number, Jonas was one of them, releasing a WHIP to SRT kind of thing that translates WHIP to SRT.
And just this renewed kind of interest then renews like documentation and like stuff that we've all just taken for granted over the past, you know, five years minimum.
Suddenly there are more and more people kind of getting interested and going, oh, how do I get started with this?
And Sean from the Pion project a couple, two years ago, I think made WebRTC for the Curious.
You can go read it on WebRTCforthecurious.com because he kind of joined this ecosystem and went, huh, it's not easy.
So he kind of spearheaded that and then like WHIP and WHEP seem to have kind of caught enough attention in people's imaginations to kind of reinforce that and keep the journey going.
So I'm really, really excited about what's coming next.
Yeah, I'm excited about the kind of hardware side of things too.
We've even been talking to a couple of hardware companies who are spending some time, you know, figuring out how they would support WHIP in their devices.
And, you know, when hardware companies make a bet on something, that's a big bet because, you know, those are physical boxes somewhere that are harder to update.
So that's a huge vote of confidence moving forward. Yeah, totally.
Like Sergio, one of the guys that wrote the WHIP draft said to me one day a while back now, oh, so-and-so, we're going to do it in hardware.
And I was like, that's amazing.
Like it was like the first big, like people are really taking this seriously because it's going to live around forever, right?
It's not software.
You can't just, it doesn't just get it. It does live around forever. I mean, a real confirmation is to say, Brendan, if let's say you would get something so easy in AVFoundation or any player that you can play WHEP, then it's a real standard, I would say.
So that's probably what's needed. And I think also what's also allowing you with WHIP and WHEP is that you, I mean, you don't have to do everything end to end if you're in this space.
You can focus on your part. It might be a simple translation library from one protocol to another.
It could be a bin in GStreamer, for example, or it could be, I mean, the small things.
And I think that adding all of these small tools and in many open source, not everyone, not everything has to be that, but I mean, a lot of open source tools is being made available and which for each small thing that's been added, it makes the adoption bigger.
It makes it easier to jump on board and come into it. And when you have all these tools that all this sort of pieces of the puzzles or Lego pieces, then things can be really interesting when the imagination starts to really pop in and you build some cool applications on top of all this.
Yeah, totally. Totally. There's a huge opportunity for anybody listening to contribute to so many open source projects.
And kind of on that note, feel free to reach out to any of us on the panel.
We're probably about at time today. So I want to just say thanks everybody for watching.
Thank you to our guests for spending time and joining from different time zones around the world.
I know it's dark. I see in both your backgrounds. Let's see.
So you can find folks on Twitter. Dan, what's your Twitter again? I'm dan underscore Jenkins on Twitter.
And Jonas? It's Jonas Birme. It's like a name without the accent.
Great. And Renan? R-R -N-N is my... No vowels, no vowels. I'm Brendan.
You can find me at Irvine Broke. There's no way you're going to spell my name.
But thanks everyone for joining. Hope you have a good rest of your day. Thank you for having us.