FFmpeg delay and mix audio streams while keeping overall volume constant


I have about 100 audio streams, all with the same intro music/sound, and in some of them the intro is delayed by a few seconds. I want to align and mix all the audio streams such that all the intros play at the same time and the output remains pretty much the same volume throughout. I know in advance how much each stream needs to be delayed by.

The alignment I'm after looks like this in Audacity: each audio stream is aligned to the intro, and the duration before the intro is arbitrary. (This doesn't solve the volume problem, though.)

What I have so far uses adelay and amix. It looks something like this but with more audio streams.

ffmpeg -i 00.oga \
       -i 01.oga \
       -i 02.oga \
       -i 03.oga -filter_complex \
"[0]adelay=delays=     123S:all=1[a0]; \
 [1]adelay=delays=    2718S:all=1[a1]; \
 [2]adelay=delays= 6283185S:all=1[a2]; \
 [3]adelay=delays=11235813S:all=1[a3]; \
 [a0][a1][a2][a3]amix=inputs=4" output.oga

In this example the first stream is delayed by 123 samples, the second by 2 718, the third by 6 283 185, and the fourth by 11 235 813.
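Since writing out 100 adelay lines by hand is error-prone, the filtergraph can be generated instead. Below is a minimal sketch, assuming the delays are known in samples and passed in the same order as the inputs; `build_filtergraph` is a hypothetical helper name, not part of FFmpeg. It reproduces the same adelay/amix graph as above, so it has the same quiet-start behaviour.

```shell
# Hypothetical helper: print an adelay/amix filtergraph for any number of inputs.
# Each argument is the delay, in samples, of the input with the same index.
build_filtergraph() {
  graph=""
  labels=""
  i=0
  for delay in "$@"; do
    # One adelay per input; the S suffix means the value is in samples.
    graph="${graph}[$i]adelay=delays=${delay}S:all=1[a$i];"
    labels="${labels}[a$i]"
    i=$((i + 1))
  done
  # Feed every delayed stream into a single amix.
  printf '%s%samix=inputs=%d\n' "$graph" "$labels" "$i"
}

build_filtergraph 123 2718 6283185 11235813
```

The printed string could then be passed along as `-filter_complex "$(build_filtergraph 123 2718 6283185 11235813)"`, with the `-i` arguments listed in the matching order.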

This works, except at the beginning of the output it's very quiet. When fed n streams, amix makes each stream 1/nth its original volume, which is a good thing in principle. In this case it's not an entirely good thing, because at the beginning of the output 3 of the 4 audio streams are silent (adelay fills delayed streams with silence), meaning the only audible stream is 1/4 = 25% of its original volume. When the second stream becomes audible, the overall volume is 2/4, with three audible streams 3/4, and with all four streams audible it's 4/4 = 100%.

Instead, I want the first stream to be at 100% volume when it's the only audible one, 50% volume each when there are two audible streams, etc.
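To make the arithmetic concrete, here is a small sketch that tabulates both behaviours for the four example delays. It assumes the delays are passed in ascending order and that all streams are equally loud, so the overall level with m of n streams audible is simply m/n. `volume_table` is a hypothetical helper, and the percentages use integer division, so 1/3 shows up as 33%.

```shell
# Hypothetical helper: for delays given in ascending order, print how many
# streams are audible from each onset, the overall level amix gives (m/n),
# and the per-stream level I actually want (1/m), as integer percentages.
volume_table() {
  n=$#
  m=0
  for delay in "$@"; do
    m=$((m + 1))  # one more stream becomes audible at this onset
    printf 'from sample %s: %d of %d audible, amix overall %d%%, wanted per-stream %d%%\n' \
      "$delay" "$m" "$n" "$((100 * m / n))" "$((100 / m))"
  done
}

volume_table 123 2718 6283185 11235813
```

With the wanted per-stream level of 1/m, the overall level stays at m × 1/m = 100% no matter how many streams have started, which is the constant volume I'm after.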

Is there a way to make it so that, when there are n audio streams but only m non-silent ones, the volume of each stream is 1/m rather than 1/n? amix already does this when streams end: if one stream ends, it changes the volume of the others from 1/n to 1/(n-1) over a period of time (dropout_transition: https://ffmpeg.org/ffmpeg-filters.html#amix).

I found a similar question where someone wanted to do something like this but only with 2 audio streams. The answer was to split, trim, and change the volume manually. This would be incredibly complicated with 100 audio streams or more, like in my situation.

Is there any easy way to achieve this, even without FFmpeg?
