pyav / ffmpeg / libav access side data without decoding the video


Right now I am accessing the motion vectors in the following way:

import av

container = av.open(
    rtsp_url, 'r',
    options={
        'rtsp_transport': 'tcp',
        'stimeout': '5000000',   # socket I/O timeout in microseconds
        'max_delay': '5000000',  # maximum demuxing delay in microseconds
    }
)
stream = container.streams.video[0]
codec_context = stream.codec_context
# Ask the decoder to attach motion vectors to each frame as side data.
codec_context.export_mvs = True

for packet in container.demux(video=0):
    for video_frame in packet.decode():
        motion_vectors_raw = video_frame.side_data.get('MOTION_VECTORS')

It seems to me that this does decode the video_frame. Is there a way to obtain the motion vectors without having to decode the entire frame? My goal is to reduce the CPU utilization.

1 answer

Best answer, by e.d.n.a

First, I only found these comments on the matter:

no solution available - Reddit

complicated, but not impossible - Stackoverflow

idea to implement this in ffmpeg back in 2016 - Mailing list

Most, if not all, working methods are based on FFmpeg, which has to decode a frame before the motion-vector side data becomes available.

Another approach could be to take the motion vectors from a hardware-accelerated encoder, e.g. on a Raspberry Pi:

Motion vectors via MMAL encoder

Googling "h264 motion vectors compressed domain" turns up quite a few research papers on the topic, though, in which motion vectors were extracted from the compressed stream to improve the performance of specific analysis tasks, e.g.:

2009, Szczerba, "Fast compressed domain motion detection in H.264 video streams for video surveillance applications"

2009, Solana-Cipres, "Real-time moving object segmentation in H.264 compressed domain based on approximate reasoning"

2014, Patel, "Motion Detection and Segmentation in H.264 Compressed Domain for Video Surveillance Application"

2004, Babu, "Video Object Segmentation - A Compressed Domain Approach"

2014, Babu, "A survey on compressed domain video analysis techniques"

These papers contain no ready-made solutions or hints on how to parse the stream, only concepts and evaluations.

So if you want to try to get the motion vectors by parsing a compressed H.264 video yourself in Python, you could start with:
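To give a feel for what the papers above do once they have the vectors, here is a toy sketch of motion detection by thresholding vector magnitudes. It is not the method of any particular paper; the (x, y, dx, dy) tuple layout, the function names and the default thresholds are all made up for illustration, and it assumes the motion vectors have already been extracted somehow.

```python
from math import hypot


def moving_blocks(vectors, min_magnitude=1.0):
    """Return the vectors whose displacement is at least min_magnitude pixels.

    `vectors` is an iterable of (x, y, dx, dy) tuples: block position plus
    displacement. This layout is a hypothetical one chosen for the example.
    """
    return [v for v in vectors if hypot(v[2], v[3]) >= min_magnitude]


def has_motion(vectors, min_magnitude=1.0, min_blocks=3):
    """Flag a frame as moving when enough blocks exceed the threshold."""
    return len(moving_blocks(vectors, min_magnitude)) >= min_blocks
```

Requiring several blocks rather than a single one is just a crude way to ignore isolated noisy vectors; real systems in the literature use considerably more elaborate filtering and segmentation.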

H264 - Spec

H.264 and MPEG4 Video Compression - Book

https://github.com/beardypig/pymp4

https://github.com/alastairmccormack/pymp4parse

https://github.com/halochou/py-h264-decoder

https://github.com/slhck/h26x-extractor

https://code.google.com/archive/p/py264/
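As a very first step in that direction, the sketch below splits a raw H.264 Annex B byte stream into NAL units and reads the one-byte NAL header. It only locates the units; the motion vectors themselves sit inside the coded slice data (e.g. type 1, non-IDR slices), which is entropy-coded with CAVLC or CABAC and is the hard part that this code does not touch. The function names are my own, and the code assumes an Annex B stream with start codes, not the length-prefixed layout used inside MP4 files.

```python
START_CODE = b"\x00\x00\x01"

# A few common nal_unit_type values defined by the H.264 spec.
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


def split_nal_units(data):
    """Return the NAL units of an Annex B byte stream, start codes removed."""
    units = []
    pos = data.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = data.find(START_CODE, begin)
        end = len(data) if nxt == -1 else nxt
        # Zero bytes right before the next start code are either padding or
        # the leading byte of a 4-byte start code, never part of the unit.
        unit = data[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


def describe_nal_unit(unit):
    """Decode the one-byte NAL header: 1 forbidden bit, 2 bits ref_idc, 5 bits type."""
    header = unit[0]
    nal_type = header & 0x1F
    return {
        "ref_idc": (header >> 5) & 0x03,
        "type": nal_type,
        "name": NAL_TYPE_NAMES.get(nal_type, "other"),
        "size": len(unit),
    }
```

Searching for the 3-byte pattern is enough because the encoder inserts emulation prevention bytes so that 00 00 01 never appears inside a NAL unit. Calling `[describe_nal_unit(u) for u in split_nal_units(raw_bytes)]` on a dumped elementary stream shows which units are slices worth looking at further.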

I don't think there is a ready-to-use solution out there (yet) anyways!