pyav / ffmpeg / libav access side data without decoding the video

Question

Asked by user1315621 on 03 June 2021 at 20:09

Right now I am accessing the motion vectors in the following way:

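import av

# rtsp_url is the camera's RTSP address, defined elsewhere.
container = av.open(
    rtsp_url, 'r',
    options={
        'rtsp_transport': 'tcp',   # use TCP instead of UDP for the RTP data
        'stimeout': '5000000',     # socket timeout, in microseconds
        'max_delay': '5000000',    # maximum demuxing delay, in microseconds
    }
)
stream = container.streams.video[0]
codec_context = stream.codec_context
# Ask the decoder to export motion vectors as frame side data.
codec_context.export_mvs = True

for packet in container.demux(video=0):
    # packet.decode() runs the full decoder; the side data exists only afterwards.
    for video_frame in packet.decode():
        motion_vectors_raw = video_frame.side_data.get('MOTION_VECTORS')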

It seems to me that this does decode the video_frame. Is there a way to obtain the motion vectors without having to decode the entire frame? My goal is to reduce the CPU utilization.

There is 1 answer

**e.d.n.a** · Accepted Answer · 2021-06-19T05:50:47+00:00

First, I only found these comments on the matter:

no solution available - Reddit

complicated, but not impossible - Stack Overflow

idea to implement this in ffmpeg back in 2016 - Mailing list

Most, if not all, working methods are based on FFmpeg, which exposes motion vectors as frame side data. That side data only becomes available after the frame has been decoded.
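Since decoding cannot be avoided on this route, it helps to keep the work done on the exported vectors cheap. The following is only a minimal sketch of that idea. It assumes each exported vector exposes the field names of FFmpeg's AVMotionVector (motion_x, motion_y, motion_scale), where the displacement in pixels is motion_x / motion_scale. The MotionVector namedtuple and the mean_displacement function are hypothetical names used for illustration, not part of PyAV or FFmpeg.

```python
import math
from collections import namedtuple

# Hypothetical stand-in that mirrors a few field names of FFmpeg's AVMotionVector.
MotionVector = namedtuple(
    "MotionVector", "dst_x dst_y motion_x motion_y motion_scale"
)


def mean_displacement(vectors):
    """Return the average motion vector length in pixels (0.0 if there are none).

    Assumes every element has motion_x, motion_y and a non-zero motion_scale,
    following src_x = dst_x + motion_x / motion_scale from FFmpeg's header.
    """
    total = 0.0
    count = 0
    for mv in vectors:
        dx = mv.motion_x / mv.motion_scale
        dy = mv.motion_y / mv.motion_scale
        total += math.hypot(dx, dy)
        count += 1
    return total / count if count else 0.0
```

A single number per frame like this could serve as a rough motion score, so that heavier analysis only runs on frames that exceed a threshold you choose.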

Another approach could be to obtain the motion vectors from a hardware-accelerated encoder while it encodes the video, e.g. on a Raspberry Pi:

Motion vectors via MMAL encoder

Googling "h264 motion vectors compressed domain" turns up quite a few research papers on the topic, in which motion vectors were extracted from the compressed stream to speed up specific analysis tasks, e.g.:

2009, Szczerba, "Fast compressed domain motion detection in H.264 video streams for video surveillance applications"

2009, Solana-Cipres, "Real-time moving object segmentation in H.264 compressed domain based on approximate reasoning"

2014, Patel, "Motion Detection and Segmentation in H.264 Compressed Domain for Video Surveillance Application"

2004, Babu, "Video Object Segmentation - A Compressed Domain Approach"

2014, Babu, "A survey on compressed domain video analysis techniques"

These papers offer no ready-made solutions or hints on how to do the decoding, only concepts and evaluations.

So if you want to try extracting the motion vectors by parsing a compressed H.264 video yourself in Python, you could start with:

H.264 - Spec

H.264 and MPEG4 Video Compression - Book

https://github.com/beardypig/pymp4

https://github.com/alastairmccormack/pymp4parse

https://github.com/halochou/py-h264-decoder
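
To give an idea of what the first parsing steps look like, here is a rough sketch, not a working motion vector extractor. It assumes the video is available as a raw Annex B byte stream (NAL units separated by 00 00 01 start codes), which is not how RTSP or MP4 deliver the data directly. It splits the stream into NAL units, removes the emulation prevention bytes, and reads the first two Exp-Golomb coded fields of each slice header to find the slice type. Only P and B slices carry motion vectors. Reaching the vectors themselves would additionally require parsing the parameter sets and entropy decoding (CAVLC or CABAC) of the macroblock data, which is not attempted here. All function and class names are made up for this example.

```python
from typing import Iterator, Tuple

NAL_TYPE_NAMES = {6: "SEI", 7: "SPS", 8: "PPS", 9: "AUD"}
SLICE_TYPE_NAMES = {0: "P", 1: "B", 2: "I", 3: "SP", 4: "SI"}
START_CODE = b"\x00\x00\x01"


def split_nal_units(stream: bytes) -> Iterator[bytes]:
    """Yield NAL units (without start codes) from an Annex B byte stream."""
    pos = stream.find(START_CODE)
    while pos != -1:
        start = pos + len(START_CODE)
        nxt = stream.find(START_CODE, start)
        end = len(stream) if nxt == -1 else nxt
        # A 4-byte start code leaves a zero byte behind the previous unit;
        # a NAL unit never ends in 0x00, so trailing zeros can be stripped.
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            yield nal
        pos = nxt


def unescape_rbsp(payload: bytes) -> bytes:
    """Remove emulation prevention bytes (00 00 03 becomes 00 00)."""
    out = bytearray()
    zeros = 0
    for byte in payload:
        if zeros >= 2 and byte == 3:
            zeros = 0
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


class BitReader:
    """Read single bits and unsigned Exp-Golomb codes, MSB first."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position in bits

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_ue(self) -> int:
        """Decode one ue(v) value: N zero bits, a one bit, then N info bits."""
        leading_zeros = 0
        while self.read_bit() == 0:
            leading_zeros += 1
        value = 0
        for _ in range(leading_zeros):
            value = (value << 1) | self.read_bit()
        return (1 << leading_zeros) - 1 + value


def describe_nal(nal: bytes) -> Tuple[int, str]:
    """Return (nal_unit_type, label); for slices the label is the slice type."""
    nal_type = nal[0] & 0x1F  # low 5 bits of the one-byte NAL header
    if nal_type in (1, 5):  # coded slice of a non-IDR / IDR picture
        reader = BitReader(unescape_rbsp(nal[1:]))
        reader.read_ue()  # first_mb_in_slice, not needed here
        slice_type = reader.read_ue() % 5  # values 5..9 repeat 0..4
        return nal_type, SLICE_TYPE_NAMES[slice_type]
    return nal_type, NAL_TYPE_NAMES.get(nal_type, "other")
```

Calling describe_nal on every unit from split_nal_units tells you which parts of the stream are P or B slices and therefore could contain motion vectors at all, before you invest in the much harder macroblock-level parsing.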