I'm currently working on a project where I need to retrieve frames from a capture device, process them, and display them with minimal latency and compression. Initially, my goal is to maintain the video stream as close to the source signal as possible, ensuring no noticeable compression or latency. However, as the project progresses, I also want to adjust framerate and apply image compression.
I have experimented with FFmpeg, since that was the first thing that came to mind for capturing and processing video frames.
However, I am not satisfied yet, since I am experiencing a delay in the stream. (Not a huge delay, but definitely noticeable.) The command that has worked best for me so far:
ffmpeg -rtbufsize 512M -f dshow -i video="Blackmagic WDM Capture (4)" -vf format=yuv420p -c:v libx264 -preset ultrafast -qp 0 -an -tune zerolatency -f h264 - | ffplay -fflags nobuffer -flags low_delay -probesize 32 -sync ext -
I also used OBS to capture the video stream from the capture device, and there was no noticeable delay in the preview. I then tried to replicate the same settings using FFmpeg:
ffmpeg -rtbufsize 512M -f dshow -i video="Blackmagic WDM Capture (4)" -vf format=yuv420p -r 60 -c:v libx264 -preset veryfast -b:v 2500K -an -tune zerolatency -f h264 - | ffplay -fflags nobuffer -flags low_delay -probesize 32 -sync ext -
But the delay was similar to that of the command above. I know that OBS probably has a lot more going on under the hood (hardware optimization etc.), but at least this tells me that it is possible to display the stream from the capture device without any noticeable latency on my setup.
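For comparison, both commands above push every frame through an H.264 encode and decode before it is displayed. A variant that skips that round trip would have FFmpeg write raw frames to a pipe and read them from Python. The following is only a minimal sketch: the helper names are hypothetical, and it assumes the device can deliver 1920x1080 at 60 fps through the dshow `-video_size` and `-framerate` input options, which may not hold for every capture card.

```python
import subprocess


def build_ffmpeg_rawvideo_cmd(device, width, height, fps):
    """Build an ffmpeg command that writes uncompressed BGR frames to stdout.

    Hypothetical helper; the requested size and framerate are assumptions
    about what the capture device supports.
    """
    return [
        "ffmpeg",
        "-rtbufsize", "512M",
        "-f", "dshow",
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", f"video={device}",
        "-an",                 # no audio
        "-pix_fmt", "bgr24",   # 3 bytes per pixel, OpenCV's channel order
        "-f", "rawvideo",
        "-",                   # write to stdout
    ]


def frame_size_bytes(width, height, channels=3):
    """Size of one uncompressed 8-bit frame in bytes."""
    return width * height * channels


def read_frames(cmd, width, height):
    """Yield one raw frame (bytes) at a time from the ffmpeg process."""
    size = frame_size_bytes(width, height)
    # Default buffering gives a BufferedReader, so read(size) blocks until
    # a full frame has arrived or the stream ends.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
    try:
        while True:
            buf = proc.stdout.read(size)
            if len(buf) < size:
                break  # stream ended or ffmpeg exited
            yield buf
    finally:
        proc.terminate()
        proc.wait()
```

Each yielded buffer could be turned into an image with `numpy.frombuffer(buf, numpy.uint8).reshape(height, width, 3)` and passed to `cv2.imshow`. Whether this is actually faster than the libx264 pipeline would have to be measured, since raw 1080p60 in bgr24 is a lot of data to move through a pipe.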
The approach that has worked best so far in terms of delay was to use Python and OpenCV to read frames from the capture device and display them. I also implemented my own framerate limiting (not perfect, I know), but when it comes to compression I am rather limited compared to FFmpeg, and the frame processing becomes too slow at framerates of about 20 fps and above.
import sys
import time

import cv2

# Set desired parameters
FRAME_RATE = 15           # Target framerate in frames per second
COMPRESSION_QUALITY = 25  # Compression quality for JPEG format (0-100)
COMPRESSION_FLAG = True   # Enable / disable compression

# Open the capture device (replace 4 with the index of your capture card)
cap = cv2.VideoCapture(4, cv2.CAP_DSHOW)

# Check if the capture device was opened successfully
if not cap.isOpened():
    print("Error: Could not open capture device")
    sys.exit(1)

# Create an OpenCV window
# TODO: The window is scaled to fullscreen here (the source video is 1920x1080, the display is 1920x1200).
# I don't know the scaling algorithm behind this, but it seems to be a simple stretch / nearest neighbor.
cv2.namedWindow('Frame', cv2.WINDOW_NORMAL)
cv2.setWindowProperty('Frame', cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)

# Time budget per frame at the target framerate
available_time = 1.0 / FRAME_RATE

# Loop to capture and display frames
while True:
    # Start timer for each frame processing cycle
    start_time = time.perf_counter()

    # Capture frame-by-frame
    ret, frame = cap.read()

    # If the frame was read correctly, proceed
    if ret:
        if COMPRESSION_FLAG:
            # Compress to JPEG in memory, then decode again for display
            ok, compressed_frame = cv2.imencode('.jpg', frame, [int(cv2.IMWRITE_JPEG_QUALITY), COMPRESSION_QUALITY])
            if ok:
                frame = cv2.imdecode(compressed_frame, cv2.IMREAD_COLOR)
        # Display the frame
        cv2.imshow('Frame', frame)

    # Time spent on this frame processing cycle
    elapsed_time = time.perf_counter() - start_time

    # Sleep for the rest of the frame budget to maintain a consistent frame rate
    sleep_time = available_time - elapsed_time
    if sleep_time > 0:
        time.sleep(sleep_time)
    else:
        print("Warning: Frame processing time exceeds available time.")

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the capture object and close the display window
cap.release()
cv2.destroyAllWindows()
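One thing that may contribute to delay in a loop like this is that `cap.read()` and the sleep run on the same thread, so frames delivered by the driver while the loop sleeps can queue up and be displayed late. A common pattern is to read on a background thread and keep only the newest frame. The class below is a minimal sketch of that idea; `LatestFrameReader` is a hypothetical name, and it only assumes an object with a `read()` method that returns `(ok, frame)`, as `cv2.VideoCapture` does.

```python
import threading


class LatestFrameReader:
    """Read frames on a background thread and keep only the newest one."""

    def __init__(self, capture):
        # `capture` is anything with read() -> (ok, frame),
        # for example a cv2.VideoCapture instance.
        self._capture = capture
        self._lock = threading.Lock()
        self._frame = None
        self._new_frame = threading.Event()
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _run(self):
        while not self._stopped.is_set():
            ok, frame = self._capture.read()
            if not ok:
                break  # device closed or end of stream
            with self._lock:
                # Overwrite the previous frame: stale frames are dropped
                self._frame = frame
                self._new_frame.set()
        self._stopped.set()

    def read(self, timeout=1.0):
        """Return the newest unseen frame, or None if none arrives in time."""
        if not self._new_frame.wait(timeout):
            return None
        with self._lock:
            self._new_frame.clear()
            return self._frame

    def stop(self):
        self._stopped.set()
        self._thread.join(timeout=1.0)
```

In the loop above, this would be used as `reader = LatestFrameReader(cap).start()` followed by `frame = reader.read()` in place of `cap.read()`, with `reader.stop()` before `cap.release()`. The trade-off is that frames are skipped rather than delayed whenever processing cannot keep up.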
I also thought about getting the SDK of the capture device in order to improve performance. But since I am used to scripting languages rather than low-level programming, I thought I would reach out to the StackOverflow community first and see if anybody has hints on better approaches or tips on how I could improve performance.
Any help is appreciated!