Flush & Latency Issue with Fragmented MP4 Creation in FFMPEG


I'm creating a fragmented mp4 for html5 streaming, using the following command:

-i rtsp://172.20.28.52:554/h264 -vcodec copy -an -f mp4 -reset_timestamps 1 -movflags empty_moov+default_base_moof+frag_keyframe -loglevel quiet -
  1. "-i rtsp://172.20.28.52:554/h264" because the source is h264 in rtp packets stream from an ip camera. For the sake of testing, the camera is set with GOP of 1 (i.e. all frames are key frames)
  2. "-vcodec copy" because I don't need transcoding, only remuxing to mp4.
  3. "-movflags empty_moov+default_base_moof+frag_keyframe" to create a fragmented mp4 according to the media source extensions spec.
  4. "-" at the end in order to output the mp4 to stdout. I'm grabbing the output and sending it to the web client through web sockets.
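
For anyone wanting to reproduce the measurement, here is a minimal sketch of the capture side described in item 4. It assumes Python on the receiving end (the question does not say which language reads stdout), and `log_chunks` and `run` are hypothetical names. `log_chunks` accepts any binary stream, so it can be tried without a camera.

```python
import subprocess
import sys
from datetime import datetime

# Arguments copied from the question, with the ffmpeg executable prepended.
FFMPEG_CMD = [
    "ffmpeg",
    "-i", "rtsp://172.20.28.52:554/h264",
    "-vcodec", "copy", "-an",
    "-f", "mp4",
    "-reset_timestamps", "1",
    "-movflags", "empty_moov+default_base_moof+frag_keyframe",
    "-loglevel", "quiet",
    "-",
]


def log_chunks(stream, max_read=65536, out=sys.stderr):
    """Read a binary stream until EOF, logging the arrival time and size of each chunk."""
    sizes = []
    while True:
        chunk = stream.read(max_read)
        if not chunk:
            break
        stamp = datetime.now().strftime("%d/%m/%Y %H:%M:%S.%f")[:-3]
        print(f"{stamp} got data size = {len(chunk)}", file=out)
        sizes.append(len(chunk))
    return sizes


def run(cmd=FFMPEG_CMD):
    # bufsize=0 gives an unbuffered pipe object on the Python side, so each
    # read() returns as soon as the child has written something.
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=0) as proc:
        return log_chunks(proc.stdout)
```

Calling `run()` needs ffmpeg on the PATH and a reachable camera. In a real relay, each chunk would be forwarded to the web socket instead of only being logged.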

Everything is working well, except for a latency issue which I'm trying to solve. If I log every chunk of data that comes in from stdout, together with its arrival timestamp, I get this output:

16/06/2015 15:40:45.239 got data size = 24

16/06/2015 15:40:45.240 got data size = 7197

16/06/2015 15:40:45.241 got data size = 32768

16/06/2015 15:40:45.241 got data size = 4941

16/06/2015 15:40:45.241 got data size = 12606

16/06/2015 15:40:45.241 got data size = 6345

16/06/2015 15:40:45.241 got data size = 6339

16/06/2015 15:40:45.242 got data size = 6336

16/06/2015 15:40:45.242 got data size = 6361

16/06/2015 15:40:45.242 got data size = 6337

16/06/2015 15:40:45.242 got data size = 6331

16/06/2015 15:40:45.242 got data size = 6359

16/06/2015 15:40:45.243 got data size = 6346

16/06/2015 15:40:45.243 got data size = 6336

16/06/2015 15:40:45.243 got data size = 6338

16/06/2015 15:40:45.243 got data size = 6357

16/06/2015 15:40:45.243 got data size = 6357

16/06/2015 15:40:45.243 got data size = 6322

16/06/2015 15:40:45.243 got data size = 6359

16/06/2015 15:40:45.244 got data size = 6349

16/06/2015 15:40:45.244 got data size = 6353

16/06/2015 15:40:45.244 got data size = 6382

16/06/2015 15:40:45.244 got data size = 6403

16/06/2015 15:40:45.304 got data size = 6393

16/06/2015 15:40:45.371 got data size = 6372

16/06/2015 15:40:45.437 got data size = 6345

16/06/2015 15:40:45.504 got data size = 6352

16/06/2015 15:40:45.571 got data size = 6340

16/06/2015 15:40:45.637 got data size = 6331

16/06/2015 15:40:45.704 got data size = 6326

16/06/2015 15:40:45.771 got data size = 6360

16/06/2015 15:40:45.838 got data size = 6294

16/06/2015 15:40:45.904 got data size = 6328

16/06/2015 15:40:45.971 got data size = 6326

16/06/2015 15:40:46.038 got data size = 6326

16/06/2015 15:40:46.105 got data size = 6340

16/06/2015 15:40:46.171 got data size = 6341

16/06/2015 15:40:46.238 got data size = 6332

As you can see, the first 23 lines (which contain about 1.5 seconds of video) arrive almost instantly, and after that the delay between consecutive lines is about 67 ms, which makes sense because the video is 15 frames per second. This behavior introduces a latency of about 1.5 seconds.
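
The arithmetic behind those numbers can be checked with a short sketch. It treats every log line as one frame, as the question does, and `gaps_ms` is a hypothetical helper; the sample lines are copied from the log above.

```python
from datetime import datetime

FPS = 15          # frame rate stated in the question
BURST_LINES = 23  # log lines that arrived almost at once

frame_interval_ms = 1000 / FPS
burst_seconds = BURST_LINES / FPS


def gaps_ms(lines):
    """Return the gaps in milliseconds between consecutive log timestamps."""
    fmt = "%d/%m/%Y %H:%M:%S.%f"
    times = [datetime.strptime(line.split(" got data")[0], fmt) for line in lines]
    return [round((b - a).total_seconds() * 1000) for a, b in zip(times, times[1:])]


sample = [
    "16/06/2015 15:40:45.244 got data size = 6403",
    "16/06/2015 15:40:45.304 got data size = 6393",
    "16/06/2015 15:40:45.371 got data size = 6372",
    "16/06/2015 15:40:45.437 got data size = 6345",
]

print(round(frame_interval_ms, 1))  # 66.7
print(round(burst_seconds, 2))      # 1.53
print(gaps_ms(sample))              # [60, 67, 66]
```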

It looks like a flushing issue, because I don't see any reason why ffmpeg would need to hold the first 23 frames in memory, especially since each frame is a fragment of its own inside the mp4. I couldn't, however, find any method that would cause ffmpeg to flush this data faster.
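
One way to check the "each frame is its own fragment" assumption is to look at the top-level boxes in the bytes received. In the ISO base media file format, every box starts with a 4-byte big-endian size followed by a 4-byte type, and a fragment is a `moof` box followed by an `mdat` box. The sketch below is a minimal parser under those assumptions; `iter_boxes` is a hypothetical name and the sample bytes are synthetic, not taken from the camera.

```python
import struct


def iter_boxes(data):
    """Yield (type, size) for each complete top-level box in a byte string."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
        if size < 8 or offset + size > len(data):
            break  # size 0 ("runs to end of file") or a truncated box: stop
        yield box_type.decode("ascii", "replace"), size
        offset += size


# Synthetic example: a 16-byte "moof" followed by a 12-byte "mdat".
sample = (
    struct.pack(">I4s", 16, b"moof") + b"\x00" * 8
    + struct.pack(">I4s", 12, b"mdat") + b"abcd"
)
print(list(iter_boxes(sample)))  # [('moof', 16), ('mdat', 12)]
```

Since a pipe read can split a box, a real check would accumulate the chunks into a buffer before parsing.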

Has anyone got a suggestion?

I'd like to note that this is a follow up question to this one: Live streaming dash content using mp4box

4

There are 4 answers

1
galbarm On BEST ANSWER

The key to removing the delay is to use the -probesize argument:

probesize integer (input)

Set probing size in bytes, i.e. the size of the data to analyze to get stream information. A higher value will enable detecting more information in case it is dispersed into the stream, but will increase latency. Must be an integer not lesser than 32. It is 5000000 by default.

By default the value is 5,000,000 bytes, which in my case was equivalent to ~1.5 sec of video. I was able to almost completely eliminate the delay by reducing the value to 200,000.
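
As a sketch of where the option goes: the documentation excerpt marks `probesize` as an input option, so it has to appear before the `-i` it applies to. The helper below is a hypothetical `build_cmd` that rebuilds the question's argument list with the value from this answer; the best value for another camera is something to measure rather than assume.

```python
def build_cmd(url, probesize=200_000):
    """Return the question's ffmpeg arguments with -probesize added as an input option."""
    return [
        "ffmpeg",
        "-probesize", str(probesize),  # input option: placed before -i
        "-i", url,
        "-vcodec", "copy", "-an",
        "-f", "mp4",
        "-reset_timestamps", "1",
        "-movflags", "empty_moov+default_base_moof+frag_keyframe",
        "-loglevel", "quiet",
        "-",
    ]


cmd = build_cmd("rtsp://172.20.28.52:554/h264")
print(" ".join(cmd))
```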

0
curtiss On

Usually the buffering for stdout is disabled in case of console output. If you run ffmpeg from code, the buffering is enabled, so you will get your data only when the buffer is full or the command ends.

You have to eliminate the stdout buffering of your OS. On Windows I believe that is impossible, but on Ubuntu, for example, there is stdbuf: http://manpages.ubuntu.com/manpages/maverick/man1/stdbuf.1.html
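
A minimal sketch of that suggestion when ffmpeg is launched from Python (an assumption, since the question does not name the host language). `with_stdbuf` is a hypothetical helper that prefixes the command with `stdbuf -o0` when the GNU coreutils tool is installed. Note that `stdbuf` only influences programs that write through C stdio and do not set their own buffering, so whether it changes ffmpeg's behavior here is something to verify rather than take for granted.

```python
import shutil


def with_stdbuf(cmd):
    """Prefix a command with `stdbuf -o0` if stdbuf is installed, else return it unchanged."""
    if shutil.which("stdbuf"):
        return ["stdbuf", "-o0", *cmd]
    return list(cmd)


print(with_stdbuf(["ffmpeg", "-i", "rtsp://172.20.28.52:554/h264", "-f", "mp4", "-"]))
```

On the reading side, passing `bufsize=0` to `subprocess.Popen` keeps Python from adding a buffer of its own.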

0
Mikael Finstad On

As some already pointed out, one way is to transcode the video using ffmpeg and choose a small GOP size, e.g. -g 1. This works for me, and gives a latency of a few hundred ms delay from the IP camera to the html5 <video> element when using MediaSource. My guess is that when you set your GOP to 1 on the camera, it doesn't actually give you every frame as a keyframe, but rather 1 keyframe per second.

However, for my use case transcoding is not an option, so the key to reducing latency to a few hundred ms was adding the ffmpeg option -frag_duration 100. This makes ffmpeg create very small MP4 fragments and emit a quick, steady stream of packets to stdout instead of batching them into 1-2 second chunks.
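
A sketch of how this could be wired into the question's command. One thing to keep in mind: the FFmpeg mp4 muxer documentation gives the `-frag_duration` value in microseconds, so the 100 quoted above is a very short duration. The sketch makes the unit explicit; `frag_duration_arg`, `build_cmd` and the `fragment_ms` parameter are hypothetical names, and the 100 ms default is only an illustration.

```python
def frag_duration_arg(milliseconds):
    """Convert milliseconds to the microsecond string expected by -frag_duration."""
    return str(int(milliseconds * 1000))


def build_cmd(url, fragment_ms=100):
    """Return a remux-only ffmpeg command that cuts fragments by duration."""
    return [
        "ffmpeg",
        "-i", url,
        "-vcodec", "copy", "-an",
        "-f", "mp4",
        "-movflags", "empty_moov+default_base_moof+frag_keyframe",
        "-frag_duration", frag_duration_arg(fragment_ms),  # output option, before "-"
        "-loglevel", "quiet",
        "-",
    ]


print(" ".join(build_cmd("rtsp://172.20.28.52:554/h264")))
```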

1
Brian Adams On

I solved the latency issue by using the -g option to set the number of frames in the group. In my case I used -g 2. I suspect that if you don't make it explicit, the fragment either waits for the source to provide the keyframe or uses a really large default value to generate the keyframe before closing off the fragment and dumping it to stdout.
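
Worth noting for anyone trying this: `-g` is an encoder option, so it only has an effect when ffmpeg re-encodes the video; with `-vcodec copy` the keyframe spacing is whatever the camera sends. Below is a sketch of a transcoding variant, assuming an ffmpeg build that includes the `libx264` encoder (not stated in this answer). `build_transcode_cmd` is a hypothetical name.

```python
def build_transcode_cmd(url, gop=2):
    """Return an ffmpeg command that re-encodes with a keyframe every `gop` frames."""
    return [
        "ffmpeg",
        "-i", url,
        "-c:v", "libx264",  # assumption: this encoder is available in the build
        "-g", str(gop),     # GOP size; only meaningful when encoding
        "-an",
        "-f", "mp4",
        "-movflags", "empty_moov+default_base_moof+frag_keyframe",
        "-loglevel", "quiet",
        "-",
    ]


print(" ".join(build_transcode_cmd("rtsp://172.20.28.52:554/h264")))
```

With `frag_keyframe` set, a shorter GOP means fragments are closed more often, at the cost of CPU time for encoding and a higher bitrate.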