Android MediaCodec: How long does it take to decode and display one video frame?


I couldn't find any information on this topic; maybe one of you can help. I'm using the Android MediaCodec to decode H264 frames. The MediaCodec is used in synchronous mode. I want to measure the time from queueing a single frame to the decoder to the point where it is actually visible on the screen.

So at some point in my code I call

codec.getInputBuffer(inIndex);

And afterwards:

int outIndex = codec.dequeueOutputBuffer(bufferInfo, BUFFER_TIMEOUT);

if (outIndex >= 0) {
    // Hand the decoded frame to the output surface for rendering.
    codec.releaseOutputBuffer(outIndex, true);
    if (PMVR.calculateLatency && validIteration) {
        PMVR.calculateLatency = false;
        PMVR.pingEnded = System.nanoTime();
    }
}

So question one: Can I assume that the frame I previously queued to the input buffers is the one I get back decoded from the call to dequeueOutputBuffer() (note: synchronous mode)? I couldn't find an option to actually set a picture ID...

And question two: I call releaseOutputBuffer() with render=true. How long does it actually take to display the decoded frame?

Thanks for your help,

Christoph

Answer by mstorsjo:

First and foremost, keep in mind that the decoder works in an asynchronous fashion. Whether you use the MediaCodec API in synchronous or asynchronous mode doesn't change that; that only changes whether you need to poll the decoder for inputs/outputs, or if you get them via a callback.

In general, you won't get near the decoder's full performance if you just pass in one packet for decoding and await the decoded frame before you proceed to the next one. In many cases, you won't even be able to decode video in realtime if you do things this way. Some decoders won't even return a single frame output until you've input a few packets (or signalled end of stream).

You can't reliably assume that the frame you get out is the one corresponding to the packet you passed in for decoding. If the stream uses frame reordering (like in H264 with B-frames), the output frames won't be in the same order as they were input. If the packets were corrupted, the decoder might skip returning some frames.

To identify individual frames as they pass through the decoder, you can use the presentationTimeUs parameter of queueInputBuffer, which is passed through to the field of the same name in MediaCodec.BufferInfo. As long as this value is unique across all input packets, you should be able to use it to match input packets to output frames.
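As a minimal sketch of that timestamp matching, the helper below records a clock reading per presentationTimeUs when a packet is queued and looks it up again when the frame comes out. The class FrameLatencyTracker and its method names are hypothetical (not part of the MediaCodec API); the clock values are passed in by the caller, typically from System.nanoTime().

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: matches decoder outputs to inputs by presentation timestamp. */
class FrameLatencyTracker {
    // Maps presentationTimeUs -> clock reading (ns) captured when the packet was queued.
    private final Map<Long, Long> queuedAtNs = new HashMap<>();

    /** Call right before queueInputBuffer, with the same presentationTimeUs you pass to it. */
    void onQueued(long presentationTimeUs, long nowNs) {
        queuedAtNs.put(presentationTimeUs, nowNs);
    }

    /**
     * Call after dequeueOutputBuffer returns an index >= 0, passing
     * bufferInfo.presentationTimeUs. Returns the elapsed time in nanoseconds,
     * or -1 if this timestamp was never queued or was already matched.
     */
    long onDequeued(long presentationTimeUs, long nowNs) {
        Long start = queuedAtNs.remove(presentationTimeUs);
        return (start == null) ? -1L : nowNs - start;
    }

    /** Number of packets queued whose frames have not yet been seen on the output side. */
    int pending() {
        return queuedAtNs.size();
    }
}
```

Because the lookup is keyed by timestamp rather than by arrival order, it still works when the decoder reorders or drops frames. Note that this measures time until the frame is dequeued, not until it is visible on screen.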

When you call releaseOutputBuffer with render=true, the frame will be shown as soon as possible. I don't think it is defined anywhere exactly how soon that is, but you can probably assume it happens within one or two screen refreshes. Since API level 21, there is also an overload that takes a long renderTimestampNs parameter instead, which lets you specify more exactly when the frame should be shown.
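To put a number on "one or two screen refreshes", here is a small arithmetic sketch. The class RenderTiming and its methods are hypothetical helpers, and the 60 Hz refresh rate used in the note below is an assumption; the real rate depends on the device. The second method assumes that renderTimestampNs is expressed on the System.nanoTime() timebase.

```java
/** Hypothetical helper for reasoning about display timing. */
final class RenderTiming {
    private RenderTiming() {}

    /** Duration of one screen refresh in nanoseconds for a refresh rate given in Hz. */
    static long refreshPeriodNs(double refreshRateHz) {
        return Math.round(1_000_000_000.0 / refreshRateHz);
    }

    /**
     * Target time for a frame that should be shown delayNs after nowNs.
     * Assumption: nowNs comes from System.nanoTime(), the timebase that
     * renderTimestampNs is expected to be close to.
     */
    static long renderTimestampNs(long nowNs, long delayNs) {
        return nowNs + delayNs;
    }
}
```

At an assumed 60 Hz, one refresh lasts about 16.7 ms, so two refreshes add roughly 33 ms on top of the decode time. A call such as codec.releaseOutputBuffer(outIndex, RenderTiming.renderTimestampNs(System.nanoTime(), 2 * RenderTiming.refreshPeriodNs(60.0))) would then request display about two refreshes from now.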