Play video frame by frame performance issues


I want to play a video (mostly .mov files with Motion JPEG) frame by frame at a varying frame rate. I have a function that gives me a frame number, and I then have to jump to that frame. Playback mostly moves in one direction but can skip a few frames from time to time, and the velocity is not constant. So I have a timer that asks for a new frame number every 40 ms and sets the new position. My first approach uses DirectShow.Net (Interop.QuartzTypeLib): I render the file and pause the graph so that it draws the picture.

    FilgraphManagerClass media = new FilgraphManagerClass();
    media.RenderFile(FileName);
    media.Pause();

Now I just set a new position:

    media.CurrentPosition = framenumber * media.AvgTimePerFrame;

Since the video is paused, it draws every newly requested position (frame). This works, but it is really slow: the video keeps stuttering and lagging, and the video source is not to blame, since enough frames were recorded to play a smooth video. Some performance tests showed that the LAV codec is the bottleneck. It is not part of my project directly: because this is a DirectShow player, decoding goes through the codec pack installed on my PC.

Ideas:

  • Using the LAV codec directly from C# myself. I searched, but everyone seems to use DirectShow and build their own filters rather than using existing ones directly in the project.
  • Instead of seeking or setting the time, can I get single frames just by frame number and simply draw them?
  • Is there a completely different way to achieve what I want to do?

Background:

This project is meant to be a train simulator. We recorded real-time videos from inside the cockpit of moving trains and know which frame corresponds to which position. My C# program calculates the train's position as a function of time and acceleration, returns the matching frame number, and draws that frame.
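To make the position-to-frame step concrete, here is a minimal sketch of how such a lookup could work. It assumes the recorded positions are available as a strictly increasing array with one entry per frame; `TrackFrameIndex`, `Advance` and `FrameAt` are hypothetical names for illustration, not part of my actual project.

```csharp
using System;

// Hypothetical helper: maps a simulated train position to a frame number.
public static class TrackFrameIndex
{
    // Constant-acceleration step: returns the new position after dt seconds
    // and updates the velocity in place.
    public static double Advance(double position, ref double velocity,
                                 double acceleration, double dt)
    {
        double next = position + velocity * dt + 0.5 * acceleration * dt * dt;
        velocity += acceleration * dt;
        return next;
    }

    // Returns the index of the last frame whose recorded position is <= position.
    // Assumes framePositions is sorted in strictly increasing order.
    public static int FrameAt(double[] framePositions, double position)
    {
        if (framePositions == null || framePositions.Length == 0)
            throw new ArgumentException("At least one frame position is required.");

        int i = Array.BinarySearch(framePositions, position);
        if (i < 0) i = ~i - 1; // ~i is the index of the first larger element

        // Clamp to the valid frame range (before the first / after the last frame).
        return Math.Max(0, Math.Min(i, framePositions.Length - 1));
    }
}
```

The timer callback would call `Advance` with the elapsed time and then `FrameAt` to get the frame number that is passed on to the player.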


Additional Information:

There is another project (not written by me) in C/C++ that uses DirectShow and the avcodec-LAV directly, in a way similar to mine, and it works fine! That is why I had the idea of using a codec/filter like avcodec-LAV myself. But I can't find an interop or interface to work with it from C#.


Thanks everyone for reading this and trying to help! :)


There are 2 answers

4
Roman Ryltsov

Obtaining a specific frame by seeking the filter graph (the entire pipeline) is pretty slow, since every seek operation involves the following behind the scenes: flushing everything, possibly re-creating worker threads, seeking to the first key frame/splice point/clean point/I-frame before the requested time, and then decoding from that position while skipping frames until the originally requested time is reached.

Overall, the method works well when you scrub paused video or retrieve specific still frames. However, when you try to play this back as smooth video, a significant part of the effort ends up wasted on seeking within the video stream.

Solutions here are:

  • re-encode the video to remove or reduce temporal compression (e.g. Motion JPEG AVI/MOV/MP4 files)
  • whenever possible, prefer skipping frames and/or re-timestamping them according to your algorithm instead of seeking
  • keep a cache of decoded video frames and pick from there, populating it as necessary in a worker thread
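The "skip instead of seek" idea from the list above can be sketched as a small decision function. This is only an illustration under assumptions: `FrameAction`, `SeekPolicy` and the `maxForwardSkip` threshold are hypothetical names, and the actual decoding or seeking would still have to happen in the filter or decoder you control.

```csharp
// Hypothetical policy: decide how to reach the target frame from the current one.
public enum FrameAction
{
    Hold,          // target is already displayed, do nothing
    DecodeForward, // keep decoding and drop the frames in between
    Seek           // target is behind or too far ahead, a real seek is needed
}

public static class SeekPolicy
{
    public static FrameAction Decide(int currentFrame, int targetFrame, int maxForwardSkip)
    {
        if (targetFrame == currentFrame) return FrameAction.Hold;

        int delta = targetFrame - currentFrame;

        // Small forward jumps are cheaper to decode through than to seek to.
        if (delta > 0 && delta <= maxForwardSkip) return FrameAction.DecodeForward;

        return FrameAction.Seek;
    }
}
```

A sensible value for `maxForwardSkip` depends on the codec and decoder speed, so it would have to be measured rather than guessed.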

The latter two are unfortunately hard to achieve without advanced filter development (where continuous decoding, uninterrupted by seek operations, is the key to decent performance). With basic DirectShow.Net you only have basic control over streaming, which leaves the first item from the list above.
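For the cache idea, a minimal sketch of what such a structure might look like is a small, thread-safe LRU cache keyed by frame number. `FrameCache<TFrame>` is a hypothetical class, not a DirectShow or DirectShow.Net type; the worker thread that decodes frames and calls `Put` is assumed and not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical LRU cache of decoded frames, safe to share between
// a decoding worker thread (Put) and the UI timer (TryGet).
public sealed class FrameCache<TFrame>
{
    private readonly int capacity;
    private readonly object gate = new object();
    private readonly Dictionary<int, LinkedListNode<KeyValuePair<int, TFrame>>> map =
        new Dictionary<int, LinkedListNode<KeyValuePair<int, TFrame>>>();
    // Most recently used entry first, least recently used entry last.
    private readonly LinkedList<KeyValuePair<int, TFrame>> order =
        new LinkedList<KeyValuePair<int, TFrame>>();

    public FrameCache(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    public bool TryGet(int frameNumber, out TFrame frame)
    {
        lock (gate)
        {
            if (map.TryGetValue(frameNumber, out var node))
            {
                // Mark as most recently used.
                order.Remove(node);
                order.AddFirst(node);
                frame = node.Value.Value;
                return true;
            }
            frame = default(TFrame);
            return false;
        }
    }

    public void Put(int frameNumber, TFrame frame)
    {
        lock (gate)
        {
            if (map.TryGetValue(frameNumber, out var existing))
            {
                order.Remove(existing);
                map.Remove(frameNumber);
            }
            map[frameNumber] = order.AddFirst(new KeyValuePair<int, TFrame>(frameNumber, frame));

            // Evict the least recently used frame when over capacity.
            if (map.Count > capacity)
            {
                var oldest = order.Last;
                order.RemoveLast();
                map.Remove(oldest.Value.Key);
            }
        }
    }
}
```

Since playback mostly moves in one direction, the worker would typically decode a few frames ahead of the last requested frame number and let older frames fall out of the cache.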

1
Null511

I wanted to post a comment instead of an answer, but I don't have the reputation. I think you're heading in the wrong direction with DirectShow. I've been working with Motion JPEG for a few years now across C# and Android, and have gotten great performance with built-in .NET code (for converting a byte array to a JPEG frame) and a bit of multi-threading. I can easily achieve over 30 fps from multiple devices, with each device running in its own thread.

Below is an older version of the Motion JPEG parser from my C# app 'OmniView'. To use it, pass the network stream to Init, call Process repeatedly (for example in a loop on a worker thread), and handle the OnImageReceived event. Then you can easily save the frames to the hard drive for later use (perhaps with the file name set to the timestamp for easy lookup). For better performance, though, you will want to save all of the images to one file.

using OmniView.Framework.Helpers;
using System;
using System.IO;
using System.Text;
using System.Windows.Media.Imaging;

namespace OmniView.Framework.Devices.MJpeg
{
    public class MJpegStream : IDisposable
    {
        private const int BUFFER_SIZE = 4096;
        private const string tag_length = "Content-Length:";
        private const string stamp_format = "yyyyMMddHHmmssfff";

        public delegate void ImageReceivedEvent(BitmapImage img);
        public delegate void FrameCountEvent(long frames, long failed);
        public event ImageReceivedEvent OnImageReceived;
        public event FrameCountEvent OnFrameCount;

        private bool isHead, isSetup;
        private byte[] buffer, newline, newline_src;
        private int imgBufferStart;

        private Stream data_stream;
        private MemoryStream imgStreamA, imgStreamB;
        private int headStart, headStop;
        private long imgSize, imgSizeTgt;
        private bool useStreamB;

        public volatile bool EnableRecording, EnableSnapshot;
        public string RecordPath, SnapshotFilename;

        private string boundary_tag;
        private bool tagReadStarted;
        private bool enableBoundary;

        public volatile bool OututFrameCount;
        private long FrameCount, FailedCount;


        public MJpegStream() {
            isSetup = false;
            imgStreamA = new MemoryStream();
            imgStreamB = new MemoryStream();
            buffer = new byte[BUFFER_SIZE];
            newline_src = new byte[] {13, 10};
        }

        public void Init(Stream stream) {
            this.data_stream = stream;
            FrameCount = FailedCount = 0;
            startHeader(0);
        }

        public void Dispose() {
            if (data_stream != null) data_stream.Dispose();
            if (imgStreamA != null) imgStreamA.Dispose();
            if (imgStreamB != null) imgStreamB.Dispose();
        }

        //=============================

        public void Process() {
            if (isHead) processHeader();
            else {
                if (enableBoundary) processImageBoundary();
                else processImage();
            }
        }

        public void Snapshot(string filename) {
            SnapshotFilename = filename;
            EnableSnapshot = true;
        }

        //-----------------------------
        // Header

        private void startHeader(int remaining_bytes) {
            isHead = true;
            headStart = 0;
            headStop = remaining_bytes;
            imgSizeTgt = 0;
            tagReadStarted = false;
        }

        private void processHeader() {
            int t = BUFFER_SIZE - headStop;
            headStop += data_stream.Read(buffer, headStop, t);
            int nl;
            //
            if (!isSetup) {
                byte[] new_newline;
                if ((nl = findNewline(headStart, headStop, out new_newline)) >= 0) {
                    string tag = Encoding.UTF8.GetString(buffer, headStart, nl - headStart);
                    if (tag.StartsWith("--")) boundary_tag = tag;
                    headStart = nl+new_newline.Length;
                    newline = new_newline;
                    isSetup = true;
                    return;
                }
            } else {
                while ((nl = findData(newline, headStart, headStop)) >= 0) {
                    string tag = Encoding.UTF8.GetString(buffer, headStart, nl - headStart);
                    if (!tagReadStarted && tag.Length > 0) tagReadStarted = true;
                    headStart = nl+newline.Length;
                    //
                    if (!processHeaderData(tag, nl)) return;
                }
            }
            //
            if (headStop >= BUFFER_SIZE) {
                // Include the unparsed bytes in the message to help diagnose bad headers.
                string data = Encoding.UTF8.GetString(buffer, headStart, headStop - headStart);
                throw new Exception("Invalid Header! Buffered data: " + data);
            }
        }

        private bool processHeaderData(string tag, int index) {
            if (tag.StartsWith(tag_length)) {
                string val = tag.Substring(tag_length.Length);
                imgSizeTgt = long.Parse(val);
            }
            //
            if (tag.Length == 0 && tagReadStarted) {
                if (imgSizeTgt > 0) {
                    finishHeader(false);
                    return false;
                }
                if (boundary_tag != null) {
                    finishHeader(true);
                    return false;
                }
            }
            //
            return true;
        }

        private void finishHeader(bool enable_boundary) {
            int s = shiftBytes(headStart, headStop);
            enableBoundary = enable_boundary;
            startImage(s);
        }

        //-----------------------------
        // Image

        private void startImage(int remaining_bytes) {
            isHead = false;
            imgBufferStart = remaining_bytes;
            Stream imgStream = getStream();
            imgStream.Seek(0, SeekOrigin.Begin);
            imgStream.SetLength(imgSizeTgt);
            imgSize = 0;
        }

        private void processImage() {
            long img_r = (imgSizeTgt - imgSize - imgBufferStart);
            int bfr_r = Math.Max(BUFFER_SIZE - imgBufferStart, 0);
            int t = (int)Math.Min(img_r, bfr_r);
            int s = data_stream.Read(buffer, imgBufferStart, t);
            int x = imgBufferStart + s;
            appendImageData(0, x);
            imgBufferStart = 0;
            //
            if (imgSize >= imgSizeTgt) processImageData(0);
        }

        private void processImageBoundary() {
            int t = Math.Max(BUFFER_SIZE - imgBufferStart, 0);
            int s = data_stream.Read(buffer, imgBufferStart, t);
            //
            int nl, start = 0;
            int end = imgBufferStart + s;
            while ((nl = findData(newline, start, end)) >= 0) {
                int tag_length = boundary_tag.Length;
                if (nl+newline.Length+tag_length > BUFFER_SIZE) {
                    appendImageData(start, nl+newline.Length - start);
                    start = nl+newline.Length;
                    continue;
                }
                //
                string v = Encoding.UTF8.GetString(buffer, nl+newline.Length, tag_length);
                if (v == boundary_tag) {
                    appendImageData(start, nl - start);
                    int xstart = nl+newline.Length + tag_length;
                    int xsize = shiftBytes(xstart, end);
                    processImageData(xsize);
                    return;
                } else {
                    appendImageData(start, nl+newline.Length - start);
                }
                start = nl+newline.Length;
            }
            //
            if (start < end) {
                int end_x = end - newline.Length;
                if (start < end_x) {
                    appendImageData(start, end_x - start);
                }
                //
                shiftBytes(end - newline.Length, end);
                imgBufferStart = newline.Length;
            }
        }

        private void processImageData(int remaining_bytes) {
            if (EnableSnapshot) {
                EnableSnapshot = false;
                saveSnapshot();
            }
            //
            try {
                BitmapImage img = createImage();
                if (EnableRecording) recordFrame();
                if (OnImageReceived != null) OnImageReceived.Invoke(img);
                FrameCount++;
            }
            catch (Exception) {
                // output frame error ?!
                FailedCount++;
            }
            //
            if (OututFrameCount && OnFrameCount != null) OnFrameCount.Invoke(FrameCount, FailedCount);
            //
            useStreamB = !useStreamB;
            startHeader(remaining_bytes);
        }

        private void appendImageData(int index, int length) {
            Stream imgStream = getStream();
            imgStream.Write(buffer, index, length);
            // 'length' is already a byte count (see Write above), not an end index.
            imgSize += length;
        }

        //-----------------------------

        private void recordFrame() {
            string stamp = DateTime.Now.ToString(stamp_format);
            string filename = Path.Combine(RecordPath, stamp + ".jpg");
            //
            ImageHelper.Save(getStream(), filename);
        }

        private void saveSnapshot() {
            Stream imgStream = getStream();
            //
            imgStream.Position = 0;
            using (Stream file = File.Open(SnapshotFilename, FileMode.Create, FileAccess.Write)) {
                imgStream.CopyTo(file);
            }
        }

        private BitmapImage createImage() {
            Stream imgStream = getStream();
            imgStream.Position = 0;
            return ImageHelper.LoadStream(imgStream);
        }

        //-----------------------------

        private Stream getStream() {return useStreamB ? imgStreamB : imgStreamA;}

        private int findNewline(int start, int stop, out byte[] data) {
            for (int i = start; i < stop; i++) {
                if (i < stop-1 && buffer[i] == newline_src[0] && buffer[i+1] == newline_src[1]) {
                    data = newline_src;
                    return i;
                } else if (buffer[i] == newline_src[1]) {
                    data = new byte[] {newline_src[1]};
                    return i;
                }
            }
            data = null;
            return -1;
        }

        private int findData(byte[] data, int start, int stop) {
            int data_size = data.Length;
            for (int i = start; i < stop-data_size; i++) {
                if (findInnerData(data, i)) return i;
            }
            return -1;
        }

        private bool findInnerData(byte[] data, int buffer_index) {
            int count = data.Length;
            for (int i = 0; i < count; i++) {
                if (data[i] != buffer[buffer_index+i]) return false;
            }
            return true;
        }

        private int shiftBytes(int start, int end) {
            int c = end - start;
            for (int i = 0; i < c; i++) {
                buffer[i] = buffer[end-c+i];
            }
            return c;
        }
    }
}
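The suggestion above to save all of the images to one file could be sketched roughly as follows. This is not part of OmniView; `FramePack` and `FramePackReader` are hypothetical names, and the length-prefixed layout is just one simple assumption. Each JPEG is stored as a 4-byte length followed by its bytes, and the reader scans the file once to build an offset table so that any frame number can be fetched with a single seek.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical single-file frame store: [int32 length][jpeg bytes] repeated.
public static class FramePack
{
    // Appends one encoded frame at the stream's current position.
    public static void Append(Stream output, byte[] jpeg)
    {
        using (var writer = new BinaryWriter(output, Encoding.UTF8, true)) // leave stream open
        {
            writer.Write(jpeg.Length);
            writer.Write(jpeg);
        }
    }
}

public sealed class FramePackReader
{
    private readonly Stream input; // must be seekable
    private readonly List<long> offsets = new List<long>();
    private readonly List<int> lengths = new List<int>();

    public FramePackReader(Stream input)
    {
        this.input = input;
        input.Position = 0;
        using (var reader = new BinaryReader(input, Encoding.UTF8, true))
        {
            // One pass over the file to record where each frame starts.
            while (input.Position + 4 <= input.Length)
            {
                int length = reader.ReadInt32();
                if (length < 0 || input.Position + length > input.Length) break; // truncated entry
                offsets.Add(input.Position);
                lengths.Add(length);
                input.Seek(length, SeekOrigin.Current);
            }
        }
    }

    public int Count { get { return offsets.Count; } }

    // Returns the encoded bytes of the requested frame (0-based).
    public byte[] ReadFrame(int frameNumber)
    {
        var data = new byte[lengths[frameNumber]];
        input.Position = offsets[frameNumber];
        int read = 0;
        while (read < data.Length)
        {
            int n = input.Read(data, read, data.Length - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        return data;
    }
}
```

With a layout like this, jumping to an arbitrary frame number only costs one seek and one JPEG decode, since Motion JPEG frames do not depend on each other.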