I am writing an Android app that sends and receives a video stream to and from a desktop PC. For the app to work properly we need as little latency as possible, sacrificing video quality if necessary. We are using GStreamer 1.4.5 on both ends, but with the current pipeline we have at least 0.5 s of delay on a Galaxy Note S2, and that's when both devices are on the same network (later this should also work over a VPN).
The sender pipeline:
appsrc name=vs_src format=time do-timestamp=true
caps="video/x-raw, format=(string)RGB, width=(int)640, height=(int)480, framerate=(fraction)15/1"
! videoconvert
! x264enc speed-preset=ultrafast tune=zerolatency byte-stream=true threads=1 key-int-max=15 intra-refresh=true ! h264parse ! rtph264pay pt=96
! queue ! udpsink name=vs_sink host=%s port=%d async=false
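For reference, a self-contained test version of this sender can be run from a desktop shell with gst-launch-1.0; here videotestsrc stands in for the app's appsrc, and the host and port values are placeholders:

```shell
gst-launch-1.0 -v videotestsrc is-live=true \
  ! video/x-raw,format=RGB,width=640,height=480,framerate=15/1 \
  ! videoconvert \
  ! x264enc speed-preset=ultrafast tune=zerolatency byte-stream=true \
      threads=1 key-int-max=15 intra-refresh=true \
  ! h264parse ! rtph264pay pt=96 \
  ! queue ! udpsink host=192.168.0.2 port=5000 async=false
```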
The receiver pipeline:
udpsrc name=vr_src
caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, payload=(int)96, encoding-name=(string)H264"
! rtpjitterbuffer
! rtph264depay ! h264parse ! avdec_h264
! videorate ! videoconvert
! glimagesink name=vr_sink async=false
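Similarly, a desktop test receiver can be launched like this (autovideosink substituted for the Android glimagesink; the port is a placeholder):

```shell
gst-launch-1.0 -v udpsrc port=5000 \
    caps="application/x-rtp,media=video,clock-rate=90000,payload=96,encoding-name=H264" \
  ! rtpjitterbuffer \
  ! rtph264depay ! h264parse ! avdec_h264 \
  ! videorate ! videoconvert \
  ! autovideosink async=false
```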
Setting threads=2 or higher emits a GStreamer warning that it was compiled without multithreading support. I know that some devices offer hardware decoders, but the only way to access them reliably seems to be via encodebin/decodebin. I already tried using a decodebin, but for some reason it complains that it can't find the required plugins (e.g. No decoder to handle media type 'video/x-h264').
I am no expert by any means when it comes to streaming and video encoding/decoding, and getting the app to a working point was already a nightmare :/ If H.264 is inappropriate here, we can switch to any other codec supported by GStreamer. Can anybody help me?
We do live video streaming from desktop PCs to Raspberry Pis, and we spent an enormous amount of time tweaking both the encoding and decoding portions of our system. Unfortunately, most libraries and tools have their out-of-the-box settings geared towards transcoding or general video playback, not live streaming. We ended up writing our own GStreamer element to do the encoding (using VAAPI) and our own Raspberry Pi program to do the decoding (using OMX).
I can offer a few thoughts for you, but nothing specific for the Android decoding scenario, unfortunately.
If you're encoding on a powerful desktop, like an i3 to i7, make sure you add queues around any significant operation: colorspace conversion, scaling, encoding, etc. So in your pipeline, make sure there is a queue between videoconvert and x264enc so they run on separate threads.
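Applied to the sender pipeline from the question, that would look roughly like this (only the added queue elements matter; everything else is unchanged):

```
appsrc name=vs_src format=time do-timestamp=true caps="video/x-raw, ..."
  ! queue ! videoconvert
  ! queue ! x264enc speed-preset=ultrafast tune=zerolatency byte-stream=true
      threads=1 key-int-max=15 intra-refresh=true
  ! h264parse ! rtph264pay pt=96
  ! queue ! udpsink name=vs_sink host=%s port=%d async=false
```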
As Ralf mentioned, you probably want to use only P frames, not B frames, and your x264enc settings likely already do this.
We usually favor dropping frames and showing garbage over using a large jitter buffer. We also adjust the QP (the quality of the encode) on the fly to stay within our network's means. So I'd suggest adding sync=false to your receiving program: you want to render a frame as soon as you have it. This potentially makes your video less smooth, but with a large jitter buffer you're always going to be delayed. Better to adjust the stream to the network and get rid of the buffer. x264enc has "qp-min" and "qp-max" properties you can try.
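As a sketch, QP bounds on the encoder would look like this (18 and 40 are purely illustrative values, not a recommendation):

```
... ! x264enc speed-preset=ultrafast tune=zerolatency qp-min=18 qp-max=40 ! ...
```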
Try adjusting the "latency" and "drop-on-latency" properties of your rtpjitterbuffer, or try getting rid of it altogether.
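Combining these receiving-side suggestions might look like this (latency=50 is only an illustration; tune it for your network, or drop the jitter buffer entirely):

```
udpsrc name=vr_src caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, payload=(int)96, encoding-name=(string)H264"
  ! rtpjitterbuffer latency=50 drop-on-latency=true
  ! rtph264depay ! h264parse ! avdec_h264
  ! videorate ! videoconvert
  ! glimagesink name=vr_sink sync=false async=false
```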
One very nasty thing we discovered is that the Raspberry Pi decoder seemed to always have some sort of built-in latency, no matter how live-optimized our stream was. It turns out the H.264 stream can carry something called VUI (Video Usability Information) that tells the decoder what type of stream to expect, and when we supplied it the decoder reacted very differently.
For reference: https://www.raspberrypi.org/forums/viewtopic.php?t=41053
So in the above VUI settings, I tell the decoder that it needs to buffer at most one P frame. It's crazy how much this helped. Of course, we also had to make sure our encoder only sent the one P frame. I'm not sure this is possible to do with x264enc.
This stuff can get pretty scary. Hopefully someone else has the Android video chops to give you a simpler answer!
EDIT: Regarding queues, I don't parameterize them at all, and in a live streaming situation if your queues fill up you need to scale back (resolution, quality, whatever) anyway. In GStreamer the queue element causes GStreamer to launch a new thread to handle the following portion of the pipeline. You just want to make sure your encode/scaling/colorspace conversion elements work in isolation.
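For example, a sender sketch with four queues (videotestsrc and the particular elements are placeholders; the point is where the queues go):

```shell
gst-launch-1.0 videotestsrc is-live=true \
  ! queue ! videoscale ! video/x-raw,width=640,height=480 \
  ! queue ! videoconvert \
  ! queue ! x264enc speed-preset=ultrafast tune=zerolatency \
  ! queue ! rtph264pay pt=96 ! udpsink host=127.0.0.1 port=5000
```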
A pipeline with four queue elements like that will run in five threads.
If you get nothing here, my recommendation is to hit up an Android video API subforum or mailing list to see if anyone else has live video going and if so what tweaks they made to their stream and decoder.
--- Addendum 1-5-18
We've also noticed that some streams can fill up the kernel socket buffer and cause packet drops, particularly on large keyframes. So if you have a larger stream, I recommend checking the kernel buffer size using sysctl: append

net.core.rmem_max = <new max, in bytes>

to /etc/sysctl.conf on your receiving device, and set the buffer-size property of udpsrc to this new max value. You can tell whether you're still seeing drops by watching the UDP receive-error counters (e.g. netstat -su) while streaming.
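As a concrete sketch (the buffer size and port are placeholder values):

```shell
# Raise the kernel's max receive buffer for this session
# (persist it via net.core.rmem_max in /etc/sysctl.conf).
sudo sysctl -w net.core.rmem_max=26214400

# Ask udpsrc for the same size via its buffer-size property.
gst-launch-1.0 udpsrc port=5000 buffer-size=26214400 \
    caps="application/x-rtp,media=video,clock-rate=90000,payload=96,encoding-name=H264" \
  ! rtpjitterbuffer ! rtph264depay ! h264parse ! avdec_h264 \
  ! videoconvert ! autovideosink

# Watch the UDP error counters for drops while streaming.
netstat -su
```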
--- Addendum 1-11-19
If you have the means to navigate the patent situation, the openh264 library from Cisco works very nicely. It's very much tuned for live streaming.
https://github.com/cisco/openh264
https://www.openh264.org/BINARY_LICENSE.txt
There is a GStreamer plugin for it in gst-plugins-bad.