I am attempting to stream H.264 video via RTMP to a rebroadcast service (specifically, twitch.tv) using C, libx264 and librtmp.
Part of my confusion comes from the fact that I'm streaming to a rebroadcast service: it presumably never misses packets, has definitely seen the start-of-stream info, and may also be repackaging my input stream for clients that come and go. So I'm not sure what I need to do to be a "correct" source versus what the broadcast service handles for me.
Anyway. Based on the Adobe FLV spec, there are three values of AVCPacketType used for an H.264 stream (my reading of the per-tag header layout is sketched after this list):
- type 0: "AVC sequence header", containing AVCDecoderConfigurationRecord
- type 1: "AVC NALU", containing one or more NALUs
- type 2: "AVC end of sequence", empty body
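For reference, here is my reading of the VideoTagHeader bytes that precede the payload in each of these tags. This is only a sketch: the buffer management and the librtmp packet plumbing are omitted, and the function name is mine.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of the FLV VideoTagHeader bytes that precede the payload of every
 * video tag, per the Adobe FLV spec. Returns the number of bytes written;
 * the payload (config record or length-prefixed NALUs) follows directly. */
static size_t write_avc_video_header(uint8_t *buf,
                                     int keyframe,        /* 1 = keyframe, 0 = inter frame */
                                     uint8_t packet_type, /* AVCPacketType: 0, 1 or 2      */
                                     int32_t cts)         /* composition time offset, ms   */
{
    size_t i = 0;
    /* FrameType (high nibble) | CodecID (low nibble, 7 = AVC) */
    buf[i++] = (uint8_t)(((keyframe ? 1 : 2) << 4) | 7);
    /* AVCPacketType: 0 = sequence header, 1 = NALU(s), 2 = end of sequence */
    buf[i++] = packet_type;
    /* CompositionTime as a signed 24-bit big-endian value */
    buf[i++] = (uint8_t)((cts >> 16) & 0xff);
    buf[i++] = (uint8_t)((cts >> 8) & 0xff);
    buf[i++] = (uint8_t)(cts & 0xff);
    return i;
}
```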
Packet type 2 is obvious (send it at the end of the stream). But I'm having trouble with how H.264's SPS and PPS NALUs map into the RTMP container, and the spec isn't explicit about this. Here are my questions:
- Do I send the type 0 packet only once, at the start of the stream? Or do I send it multiple times, perhaps before every keyframe? (Does the type 0 packet have the "seekable" flag set?) And if it is only sent once, how does a client that joins mid-stream cope?
- Do I include SPS and PPS NALUs within type 1 packets, along with IDR slices etc.? Or do they go only in the type 0 packet? Or both? (Some guides recommend setting x264's 'repeat_headers' param, while others do not.) Again, how does a client deal with joining mid-stream?
- More of an H.264 question than an RTMP one, but can the SPS or PPS change mid-stream? If so, how do I send the new version: with a new type 0 tag, by including the changed NALUs in a type 1 packet, or by closing and reopening the stream?
My understanding, based on the video systems I've written (I'm not an expert, so take it for what it's worth):
Packet type 0 is sent only once, at the beginning of the stream. If you are serving RTMP to playback clients yourself, you need to send a type 0 packet to each client when it connects with a playback request; otherwise it will not be able to decode the video frames.
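If it helps, here is roughly how the type 0 body can be built from libx264's output. This is a sketch, not production code: it assumes the encoder was opened with `param.b_annexb = 0`, so each NAL payload starts with a 4-byte length prefix that gets skipped, and the function name, buffer sizing and error handling are illustrative.

```c
#include <stdint.h>
#include <string.h>
#include <x264.h>

/* Sketch: build the body of a type-0 "AVC sequence header" tag, i.e. an
 * AVCDecoderConfigurationRecord (ISO/IEC 14496-15), from the SPS/PPS that
 * x264 returns via x264_encoder_headers(). `out` must be large enough. */
static int build_sequence_header(x264_t *enc, uint8_t *out, int *out_size)
{
    x264_nal_t *nal;
    int n, i;
    const uint8_t *sps = NULL, *pps = NULL;
    int sps_len = 0, pps_len = 0;

    if (x264_encoder_headers(enc, &nal, &n) < 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (nal[i].i_type == NAL_SPS) {
            sps = nal[i].p_payload + 4;      /* skip 4-byte length prefix */
            sps_len = nal[i].i_payload - 4;
        } else if (nal[i].i_type == NAL_PPS) {
            pps = nal[i].p_payload + 4;
            pps_len = nal[i].i_payload - 4;
        }
    }
    if (!sps || !pps)
        return -1;

    uint8_t *p = out;
    *p++ = 1;            /* configurationVersion */
    *p++ = sps[1];       /* AVCProfileIndication (from the SPS) */
    *p++ = sps[2];       /* profile_compatibility */
    *p++ = sps[3];       /* AVCLevelIndication */
    *p++ = 0xff;         /* 6 reserved bits + lengthSizeMinusOne = 3 */
    *p++ = 0xe1;         /* 3 reserved bits + numOfSequenceParameterSets = 1 */
    *p++ = (uint8_t)(sps_len >> 8);
    *p++ = (uint8_t)(sps_len & 0xff);
    memcpy(p, sps, sps_len);  p += sps_len;
    *p++ = 1;            /* numOfPictureParameterSets */
    *p++ = (uint8_t)(pps_len >> 8);
    *p++ = (uint8_t)(pps_len & 0xff);
    memcpy(p, pps, pps_len);  p += pps_len;

    *out_size = (int)(p - out);
    return 0;
}
```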
SPS and PPS do not get sent in type 1 packets; they are only carried in the type 0 packet (i.e. inside the AVCDecoderConfigurationRecord).
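In other words, when packaging an encoded frame into a type 1 tag, either open the encoder with `param.b_repeat_headers = 0` or skip any SPS/PPS NALs yourself, something along these lines (`append_nalu` is a hypothetical helper that copies one length-prefixed NAL into the tag body):

```c
#include <stdint.h>
#include <x264.h>

/* Hypothetical helper assumed to exist elsewhere: appends one
 * length-prefixed NAL unit to the FLV tag body being built. */
void append_nalu(uint8_t *body, int *body_size,
                 const uint8_t *payload, int payload_size);

/* Sketch: copy only the non-parameter-set NALs of one encoded frame into
 * the body of a type-1 "AVC NALU" tag. nal/num_nals are what
 * x264_encoder_encode() returned for this frame. */
static void fill_type1_body(const x264_nal_t *nal, int num_nals,
                            uint8_t *body, int *body_size)
{
    int i;
    for (i = 0; i < num_nals; i++) {
        if (nal[i].i_type == NAL_SPS || nal[i].i_type == NAL_PPS)
            continue;   /* carried by the type-0 packet instead */
        append_nalu(body, body_size, nal[i].p_payload, nal[i].i_payload);
    }
}
```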
My understanding is that the SPS does not change mid-stream, and most clients will ignore later SPS records that come in. I think an individual PPS can't change, but you can send multiple PPSs, and each frame can indicate which PPS it uses (though I'm not 100% sure of that). The research I've done suggests that most video streams use a single SPS and a single PPS (though I've found exceptions, such as WebRTC sending a new SPS and PPS with each IDR).