Fully GPU-accelerated (decoding, deinterlacing, scaling, encoding) HLS variable stream with ffmpeg


I'm trying to create a multi-bitrate (MBR) HLS live stream with ffmpeg that is fully accelerated on the GPU. That means hardware-accelerated decoding, deinterlacing, scaling, and encoding. Here is my broken example:

ffmpeg -loglevel debug -hwaccel cuvid -c:v h264_cuvid -hwaccel_output_format cuda -vsync 0 -i "udp://@239.250.4.152:1234?fifo_size=1000000&overrun_nonfatal=1" \
-filter_complex "[0:v]yadif_cuda=0:-1:0,split=3[v1][v2][v3],[v1]copy[v1out],[v2]scale_npp=1280:720[v2out],[v3]scale_npp=720:405[v3out]" \
-map [v1out] -c:v:0 hevc_nvenc -b:v:0 4000k -g 48 \
-map [v2out] -c:v:1 hevc_nvenc -b:v:1 3000k -g 48 \
-map [v3out] -c:v:2 hevc_nvenc -b:v:2 2000k -g 48 \
-map a:0 -c:a:0 aac -b:a:0 128k -ac 2 \
-map a:0 -c:a:1 aac -b:a:1 96k -ac 2 \
-map a:0 -c:a:2 aac -b:a:2 64k -ac 2 \
-f hls \
-hls_playlist_type event \
-hls_segment_type mpegts \
-hls_time $seglen \
-hls_list_size $numsegs \
-hls_flags delete_segments+independent_segments \
-hls_segment_filename "$dst/stream_%v/$segments" \
-hls_base_url "$url" \
-master_pl_name "$dst/$index" \
-var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2" \
"$dst/$index"

Note: my graphics card can handle more than two concurrent encoding sessions. I get the classic error "Impossible to convert between the formats supported by the filter 'Parsed_split_1' and the filter 'auto_scaler_0'".

Is my goal realistic? If not, what is the proper way to use the GPU as efficiently as possible in this scenario? Thanks for the help.

Stream mapping:
   Stream #0:3 (h264_cuvid) -> yadif_cuda (graph 0)
   copy (graph 0) -> Stream #0:0 (h264_nvenc)
   scale_npp (graph 0) -> Stream #0:1 (h264_nvenc)
   scale_npp (graph 0) -> Stream #0:2 (h264_nvenc)
   Stream #0:4 -> #0:3 (ac3 (native) -> aac (native))
   Stream #0:4 -> #0:4 (ac3 (native) -> aac (native))
   Stream #0:4 -> #0:5 (ac3 (native) -> aac (native))
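
The commands in this thread rely on shell variables ($dst, $index, $segments, $seglen, $numsegs, $url) whose definitions are never shown. Below is a minimal sketch of how they might be set up. Every value is a hypothetical placeholder, not taken from the original post.

```shell
#!/bin/sh
# Hypothetical values for illustration only; none appear in the original post.
dst="${TMPDIR:-/tmp}/hls_demo"     # output directory
index="master.m3u8"                # master playlist name
segments="seg_%05d.ts"             # segment file name pattern
seglen=4                           # target segment duration in seconds
numsegs=6                          # number of segments kept in each playlist
url="http://example.com/hls/"      # prefix passed to -hls_base_url

# Create one sub-directory per variant up front as a precaution.
# With three entries in -var_stream_map, %v expands to 0, 1 and 2.
for v in 0 1 2; do
    mkdir -p "$dst/stream_$v"
done

# Each variant playlist then covers about seglen * numsegs seconds.
window=$((seglen * numsegs))
echo "playlist window: ${window}s"
```

With these placeholder values, each variant playlist covers about 24 seconds of video.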

There is 1 answer

Milan Čížek

I have a working solution. Unfortunately, I haven't yet been able to avoid copying frames between VRAM and system RAM: the frames are downloaded before the split filter, which duplicates the video stream, and uploaded again for the two scaled variants.

ffmpeg -vsync 0 -hwaccel cuvid -c:v h264_cuvid -hwaccel_output_format cuda -i "udp://@239.250.4.152:1234?fifo_size=1000000&overrun_nonfatal=1" \
-filter_complex "[0:v]yadif_cuda=0:-1:0,hwdownload,format=nv12,split=3[v1][v2][v3]; [v1]copy[v1out]; [v2]hwupload,scale_npp=1280:720[v2out]; [v3]hwupload,scale_npp=720:405[v3out]" \
-map [v1out] -c:v:0 h264_nvenc -b:v:0 4000k -g 48 \
-map [v2out] -c:v:1 h264_nvenc -b:v:1 3000k -g 48 \
-map [v3out] -c:v:2 h264_nvenc -b:v:2 2000k -g 48 \
-map a:0 -c:a:0 aac -b:a:0 128k -ac 2 \
-map a:0 -c:a:1 aac -b:a:1 96k -ac 2 \
-map a:0 -c:a:2 aac -b:a:2 64k -ac 2 \
-f hls \
-hls_segment_type mpegts \
-hls_playlist_type vod \
-hls_time $seglen \
-hls_list_size $numsegs \
-hls_flags delete_segments+independent_segments \
-hls_segment_filename "$dst/stream_%v/$segments" \
-master_pl_name "$index" \
-var_stream_map "v:0,a:0 v:1,a:1 v:2,a:2" \
"$dst/stream_%v/mystream.m3u8"