HTML5 Video: How to get the index of current frame

I have a web application that processes mp4 video frame-by-frame using this WebCodecs library and stores the presentation timestamp and duration of every VideoFrame.

Then I want to play the video and match the currently playing frame with the processed frames. For this I use requestVideoFrameCallback. Since the video can have a variable frame rate, I cannot just compute the index as currentTime * FPS or even VideoFrameCallbackMetadata.mediaTime * FPS. Instead I try to find the VideoFrame that has Timestamp <= VideoFrameCallbackMetadata.mediaTime && Timestamp + Duration >= VideoFrameCallbackMetadata.mediaTime. But even this is not consistent, because in some videos the first frame has timestamp > 0, yet the HTML5 video element displays this frame at the start of the video, when currentTime = 0 and even mediaTime = 0.
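The interval lookup described above can be sketched as a binary search. This is only a minimal sketch under a few assumptions: the stored frames are sorted by timestamp, timestamps and durations are in microseconds (as WebCodecs VideoFrame reports them) while mediaTime is in seconds, and a frame is treated as displayed until the next frame's timestamp is reached, which avoids the ambiguity of the closed interval at frame boundaries. The names findFrameIndex and storedFrames are hypothetical.

```javascript
// frames: array of { timestamp, duration } in microseconds, sorted by timestamp.
// mediaTimeSeconds: VideoFrameCallbackMetadata.mediaTime, which is in seconds.
function findFrameIndex(frames, mediaTimeSeconds) {
  const t = Math.round(mediaTimeSeconds * 1e6); // convert seconds to microseconds
  let lo = 0;
  let hi = frames.length - 1;
  let best = -1;
  // Binary search for the last frame whose timestamp is <= t.
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (frames[mid].timestamp <= t) {
      best = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return best; // -1 when t precedes the first stored frame
}

// Browser-only wiring; skipped when no DOM is available.
if (typeof document !== "undefined") {
  const video = document.querySelector("video");
  const storedFrames = []; // hypothetical: filled during the WebCodecs pass
  if (video) {
    const onFrame = (now, metadata) => {
      const index = findFrameIndex(storedFrames, metadata.mediaTime);
      console.log("displayed frame index:", index);
      video.requestVideoFrameCallback(onFrame); // re-register for the next frame
    };
    video.requestVideoFrameCallback(onFrame);
  }
}
```

A return value of -1 at mediaTime = 0 is exactly the symptom described in the question: the first stored timestamp is greater than 0 while the element already shows that frame.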

Is there a way to match the VideoFrames to the frame that is displayed in the html video element? I thought the mediaTime should be consistent with the VideoFrame timestamp but it is not.

EDIT: I noticed that the first processed frame sometimes has timestamp > 0, but running FFProbe shows that the first frame should have timestamp == 0. Also, the number of frames that are processed is sometimes lower than info.VideoTracks.nb_samples. So I think this is probably an error in the library.

EDIT2: The bug in the library that caused not all frames to be processed is now fixed. However, the timestamp of the first frame still differs from the timestamp extracted using FFProbe: ffprobe file.mp4 -select_streams v -show_entries frame=coded_picture_number,pkt_pts_time -of csv=p=0:nk=1 -v 0. I'm going to compare the FFProbe output to the MP4Box command line output to see where the difference in timestamps occurs. Another culprit could be the video element itself rendering the frames with different timestamps.
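One way to carry out that comparison is sketched below. It assumes both tools' timestamps have already been parsed into arrays of seconds in presentation order; the function name compareTimestamps and the tolerance value are hypothetical. It reports the offset between the first frames, the first index where the lists disagree once that offset is removed, and any difference in frame count.

```javascript
// a, b: arrays of presentation timestamps in seconds, in presentation order
// (for example, one list parsed from FFProbe output and one from MP4Box).
function compareTimestamps(a, b, toleranceSeconds = 1e-6) {
  const count = Math.min(a.length, b.length);
  const offset = count > 0 ? b[0] - a[0] : 0; // offset between the first frames
  let firstMismatch = -1;
  for (let i = 0; i < count; i++) {
    // After removing the initial offset, the two lists should agree.
    if (Math.abs(b[i] - a[i] - offset) > toleranceSeconds) {
      firstMismatch = i;
      break;
    }
  }
  return { offset, firstMismatch, lengthDifference: b.length - a.length };
}
```

A non-zero offset with firstMismatch equal to -1 would suggest a constant shift of the whole track rather than per-frame disagreement.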

EDIT3: Even when using the frame timestamps from FFProbe, the frame times are still not in sync with the video element.

There are 2 answers

pedrobroese On

From my experience, requestVideoFrameCallback is not precise to the frame level. I believe the reason is that you have no control over how the video element drives the underlying codecs. To work precisely at the frame level, I drove the codecs myself; however, my goal was different (I built a video editor). In your case, before going to the codecs themselves, you can try the MediaStreamTrackProcessor API: https://developer.mozilla.org/en-US/docs/Web/API/MediaStreamTrackProcessor. In my case it was more accurate than requestVideoFrameCallback, but still not accurate enough. In your case it might do the trick:

const video = document.querySelector("video"); // original video element
// MediaStreamTrackProcessor needs a MediaStreamTrack, so capture the element's output as a stream.
const videoTrack = video.captureStream().getVideoTracks()[0]; // input video track
const trackProcessor = new MediaStreamTrackProcessor({ track: videoTrack });
const trackGenerator = new MediaStreamTrackGenerator({ kind: "video" }); // output video track

const transformer = new TransformStream({
  async transform(videoFrame, controller) {
    // myFunction and processedFrame are placeholders: build a new VideoFrame with the
    // processed frame overlaid on, or placed side by side with, the original frame.
    const newFrame = myFunction(videoFrame, processedFrame);
    videoFrame.close(); // release the original frame once it is no longer needed
    controller.enqueue(newFrame);
  },
});

// trackProcessor.readable is the stream of VideoFrames from the input track.
trackProcessor.readable
  .pipeThrough(transformer)
  .pipeTo(trackGenerator.writable);

After that, you'll have to attach the output track to a different video element, which will play the new frames (for example, by setting its srcObject to a MediaStream that wraps the output track).

KarelPrdel On

In the end I solved it by shifting all of the frame timestamps so that the first frame starts at 0:00 (mp4box sometimes outputs a timestamp > 0 for the first frame; ffmpeg does not). This was close, but some frames were still rendering one frame behind. So when the reported time is very close to the next frame's timestamp (within about 5 microseconds), I render the next frame instead.

Although cumbersome, this seems to work pretty well.
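The two adjustments described in this answer can be sketched roughly as follows. This is not the answerer's actual code, only an illustration under stated assumptions: timestamps are in microseconds and in presentation order, mediaTime is in seconds, and the names normalizeTimestamps, frameIndexAt and SNAP_EPSILON_US are hypothetical.

```javascript
const SNAP_EPSILON_US = 5; // "about 5 microseconds" from the answer; may need tuning

// Shift all timestamps so that the first frame starts at 0.
function normalizeTimestamps(timestampsUs) {
  if (timestampsUs.length === 0) return [];
  const first = timestampsUs[0];
  return timestampsUs.map((t) => t - first);
}

// Return the index of the frame to render for the given mediaTime (seconds).
function frameIndexAt(normalizedUs, mediaTimeSeconds, epsilonUs = SNAP_EPSILON_US) {
  const t = mediaTimeSeconds * 1e6; // convert seconds to microseconds
  let index = -1;
  for (let i = 0; i < normalizedUs.length; i++) {
    // A frame counts as reached when t is at, or within epsilon before, its timestamp.
    if (normalizedUs[i] - epsilonUs <= t) index = i;
    else break;
  }
  return index;
}
```

The linear scan keeps the sketch short; for long videos a binary search over the same condition would avoid walking the whole list on every callback.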