# Implementing a Live Streaming System: Intro
## Introduction
I chose live streaming as the first project to post about, because it can expand widely both today and in the future.
This article walks through the decision of which protocols to use on both the streamer and viewer sides.
- I define live streaming as a real-time (low-latency) video/audio delivery service.
- live streaming + low-latency bi-directional communication => a game streaming service
In this series, I want to cover the whole flow shown below, from scratch as much as possible.
```mermaid
flowchart LR
    p(Streamer)
    subgraph Service Provider
        i(Ingest)
        t(Transcoder)
        e(Edge)
    end
    s(Viewer)
    p --> i
    i --> t
    t --> e
    e --> s
```
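To make the diagram concrete, here is a minimal sketch of the pipeline stages as Go interfaces. The `Segment` type and the method signatures are my assumptions for illustration; the real interfaces will come out of the system design in the next post.

```go
package pipeline

// Segment is an opaque chunk of encoded media moving through the pipeline.
// (Hypothetical type, used only to illustrate the stage boundaries.)
type Segment []byte

// Ingest accepts the streamer's upload (e.g. an RTMP publish).
type Ingest interface {
	Receive() (Segment, error)
}

// Transcoder converts ingested media into the delivery formats/bitrates.
type Transcoder interface {
	Transcode(in Segment) ([]Segment, error)
}

// Edge serves the transcoded segments to viewers (e.g. over HLS).
type Edge interface {
	Serve(out Segment) error
}
```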
## Select Protocols
### TL;DR

I will use `RTMP` and `HLS` with `CMAF`.
First, think as each stakeholder (streamer, viewer, and service provider).
For the streamer side, the important points are ease of use and good documentation (for better user acquisition).
Let’s think about the end-users first.

### Service Provider => Viewer
The most important points to consider are compatibility, low latency, and operating cost.
These days, live video streaming services use `HLS`, `DASH`, or `CMAF`.
Some services use `WebRTC` to provide low latency, but it costs more than common CDN-based delivery. (WebRTC-based CDNs exist, but I don’t know of a production-ready global service.) It also requires implementing much more than the others. (This series implements from scratch, with as few dependencies as possible.)
Most legacy systems in production are based on `HLS`, due to iOS: `HLS` is the only format natively supported by the Safari browser. `CMAF` was made by Apple and Microsoft, so `CMAF` is natively supported on iOS.
The primary difference between them is the media container.
| | HLS | CMAF HLS | DASH | CMAF |
|---|---|---|---|---|
| Container | MPEG-2 TS | fMP4 | ISO_BMFF (MPEG-4) | CMAF (fMP4-compatible) |
In my view, the most valuable point of CMAF is that it requires less storage and computing power on the service provider side.
- Supporting both `MPEG-2 TS` and `fMP4` requires twice the storage of the original media.
- Then, what about `MPEG-2 TS` only? The `MPEG-2 TS` packet size is fixed at 188 bytes, so streams are fragmented with per-packet overhead (which means more storage).
- `ISO_BMFF` boxes are variable-length: the header carries a length value, so a NALU is not fragmented the way it is in `MPEG-2 TS`. (I haven’t read the full `ISO_BMFF` specification, so I’m not sure it is never fragmented.) A minimal parsing sketch follows this list.
- Related slide from BITMOVIN
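To make the container difference concrete, here is a minimal parsing sketch. The 188-byte packet size and the `0x47` sync byte come from the MPEG-2 TS specification; the box-header layout (32-bit size plus 4-byte type) is the basic ISO_BMFF form, with the 64-bit `largesize` variant omitted for brevity. The function names are mine.

```go
package container

import (
	"encoding/binary"
	"errors"
)

// An MPEG-2 TS packet is always 188 bytes and starts with the sync byte
// 0x47; payload that doesn't fill a packet gets padded, which is the
// per-packet overhead discussed above.
const tsPacketSize = 188

func checkTSPacket(pkt []byte) error {
	if len(pkt) != tsPacketSize {
		return errors.New("TS packets are fixed at 188 bytes")
	}
	if pkt[0] != 0x47 {
		return errors.New("missing TS sync byte 0x47")
	}
	return nil
}

// readBoxHeader reads an ISO_BMFF box header: a 32-bit big-endian size
// followed by a 4-byte type. Because the size is explicit, a box (and the
// NALUs inside it) can be any length, so no fixed-size fragmentation is
// needed.
func readBoxHeader(b []byte) (size uint32, boxType string, err error) {
	if len(b) < 8 {
		return 0, "", errors.New("box header is at least 8 bytes")
	}
	size = binary.BigEndian.Uint32(b[:4])
	boxType = string(b[4:8])
	return size, boxType, nil
}
```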
So, CMAF is best for VOD, since it requires less storage, computing power, and bandwidth. But we also need to check the streamer side, because this series is about live streaming.
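As a concrete example of what the viewer side would fetch, a minimal HLS media playlist with CMAF (fMP4) segments might look like the following. The `EXT-X-MAP` tag points at the fMP4 initialization segment; the file names and durations are placeholders.

```
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-TARGETDURATION:4
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-MAP:URI="init.mp4"
#EXTINF:4.000,
segment0.m4s
#EXTINF:4.000,
segment1.m4s
```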
### Streamer => Service Provider

#### TL;DR

Use `RTMP`.
There exist many real-time video/audio transport protocols, born from different requirements. The most widely mentioned in the live-streaming industry are `RTMP`, `WebRTC`, and `SRT`. So, let’s check them now.
| | RTMP | WebRTC | SRT |
|---|---|---|---|
| Protocol | RTMP | SRTP | SRT |
| Transport | TCP | UDP | UDP |
| Latency (approx., seconds) | 5 | <1 | 1 |
| Media container | FLV | (none) | MPEG-2 TS |
First, I will drop `SRT` from the list.

- According to the `SRT` specification, there is no way to determine a unique streamer id, as far as I understand (I read the specification partially and searched the user guides of services that support it).
  - At the handshake, the peers share a unique SRT Socket ID, which identifies the stream during SRT header parsing.
  - But the problem is: how do we identify the streamer?
    - RTMP: use the stream key (see the sketch after this list).
    - WebRTC: the developer can handle it.
    - IP-based? (commonly used in services) But then how do we support mobile environments? A Wi-Fi change means an IP change, so if the connection is lost by accident, the IP needs to be updated.
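To make the stream-key point concrete, here is a minimal sketch of an RTMP ingest identifying the streamer by the key in the publish path. The URL layout, the `resolveStreamer` function, and the example keys are assumptions for illustration, not any real service’s API.

```go
package ingest

import (
	"errors"
	"strings"
)

// streamers maps stream keys to streamer IDs. In a real service this
// would be a database lookup; these entries are hypothetical examples.
var streamers = map[string]string{
	"live_abc123": "streamer-1",
}

// resolveStreamer extracts the stream key from an RTMP publish path such
// as "/live/live_abc123" (i.e. rtmp://ingest.example.com/live/live_abc123)
// and looks up which streamer it belongs to. Because the key is sent on
// every new connection, a reconnect after an IP change still identifies
// the same streamer; an IP-based scheme cannot do that.
func resolveStreamer(publishPath string) (string, error) {
	parts := strings.Split(strings.Trim(publishPath, "/"), "/")
	if len(parts) != 2 {
		return "", errors.New("expected a path of the form /app/streamKey")
	}
	id, ok := streamers[parts[1]]
	if !ok {
		return "", errors.New("unknown stream key")
	}
	return id, nil
}
```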
Next, I will drop `WebRTC`.

- `WebRTC` is a good stack for providing a low-latency service.
- But it is not implemented in most streamer environments such as XSplit, OBS Studio, or OBS fork projects.
- It is implemented in web browsers, which are not compatible with broadcasting systems.
What about the other options?

- YouTube supports an `HLS` uplink stream.
  - It is based on HTTPS POST/PUT.
  - It is supported by OBS.
  - It has higher latency than `RTMP`.
So, I think `RTMP` is the best choice.

Let’s check the Service Provider => Viewer side again.
`RTMP` uses `FLV` as its media container (the `RTMP` and `FLV` specifications share the same structure).
If the ingest protocol used `MPEG-2 TS` like `SRT` does, then using the same container for `HLS` would be a good choice for code reusability.
Just streaming `FLV` to the viewer would give lower latency (less overhead). But it is not compatible: for example, how would it play on iOS Safari? And `FLV` only supports legacy codecs. (A minimal look at the `FLV` header follows below.)
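For reference, an FLV stream starts with a fixed 9-byte header: the signature `FLV`, a version byte, audio/video presence flags, and the offset to the first tag. Here is a minimal sketch of parsing it; the `FLVHeader` type and function name are mine.

```go
package container

import (
	"encoding/binary"
	"errors"
)

// FLVHeader holds the fields of the fixed 9-byte FLV file header.
type FLVHeader struct {
	Version    byte
	HasAudio   bool
	HasVideo   bool
	DataOffset uint32 // offset to the first tag, normally 9
}

func parseFLVHeader(b []byte) (FLVHeader, error) {
	if len(b) < 9 {
		return FLVHeader{}, errors.New("FLV header is 9 bytes")
	}
	if string(b[:3]) != "FLV" {
		return FLVHeader{}, errors.New("missing FLV signature")
	}
	return FLVHeader{
		Version:    b[3],
		HasAudio:   b[4]&0x04 != 0, // bit 2: audio tags present
		HasVideo:   b[4]&0x01 != 0, // bit 0: video tags present
		DataOffset: binary.BigEndian.Uint32(b[5:9]),
	}, nil
}
```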
### Results

- Streamers use `RTMP` to publish the stream.
- Viewers receive `HLS` with `CMAF` segments.
So, I need to implement `RTMP`, `FLV`, `ISO_BMFF`, `CMAF`, and `HLS`.
In the next post, I will sketch a minimal, rough system design.
## Discussion
- Reading WOWZA’s article, I have some questions:
  - “WebRTC is faster than RTSP/RTP.”
    - As far as I know, WebRTC uses SRTP, a secured version of RTP, as the transport protocol for its video/audio tracks.
    - RTP should have less overhead, since SRTP adds encryption processing.
    - So maybe the difference comes from different segment lengths?
  - “UDP-based protocols have low latency.”
    - Then how about RTMFP (UDP-based RTMP with security)? Why isn’t it used in production these days?
    - How about WebTransport?
      - WebRTC’s functionality is being separated into WebCodecs and WebTransport. (At first, WebTransport can be explained as a UDP-based WebSocket, though the P2P WebTransport draft is still a work in progress.)
      - So, using WebTransport or P2P WebTransport as the transport layer, could it be as fast as WebRTC?