I have an audio editor in the browser using ffmpeg (WebAssembly), and I want to insert new audio into the existing audio without having to re-encode everything. Re-encoding everything takes a long time, especially in the browser, so I would like to only re-encode the inserted file, match it to the original one and concatenate them using the copy command.
The ffmpeg concat documentation says:
All files must have the same streams (same codecs, same time base, etc.)
But it is not clear what is meant by time base. So far I have observed I need to match:
- codec
- bit rate
- sample rate
- channels (mono, stereo)
Is there anything else I need to match so that the resulting audio is not corrupt/broken when concatenating?
I have observed that MP3, for example, can be encoded as VBR, CBR, or ABR. If the original audio reports a bit rate of 128 kb/s, I assume it is CBR, so I match it with:
ffmpeg -i original.mp3
# > Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s
ffmpeg -i input.mp3 -b:a 128k -ar 44100 -ac 2 re_encoded.mp3
# then merge
# concat_list.txt contains the original audio and the re_encoded.mp3
# -safe is an option of the concat demuxer, so it must come before -i
ffmpeg -f concat -safe 0 -i concat_list.txt -c copy merged.mp3
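For reference, the concat demuxer reads a plain-text list with one `file` directive per line. A minimal sketch for generating the concat_list.txt mentioned above; the `make_concat_list` helper is hypothetical, and it assumes the file names contain no single quotes (it does no escaping):

```shell
# Hypothetical helper: print one concat-demuxer "file" directive per argument.
# Assumes file names contain no single quotes.
make_concat_list() {
  for f in "$@"; do
    printf "file '%s'\n" "$f"
  done
}

# Write the list for the two segments used in this example.
make_concat_list original.mp3 re_encoded.mp3 > concat_list.txt
```

The segments are joined in the order they are listed, so inserting audio in the middle would mean listing the first part of the original, the new segment, and then the rest of the original.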
That works fine for the standard CBR rates (8, 16, 24, 32, 40, 48, 64, 80, 96, 112, 128, 160, 192, 224, 256, or 320 kb/s, per the docs), as far as I have tested.
The issue arises when original.mp3 is VBR (variable bit rate) or ABR (average bit rate), with a reported bit rate such as 150 kb/s.
If I try to match it like below:
ffmpeg -i input.mp3 -b:a 150k -ar 44100 -ac 2 re_encoded.mp3
ffmpeg -i re_encoded.mp3
# Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 160 kb/s
The resulting bitrate is rounded to the nearest CBR which is 160.
I can solve this with mp3 by using -abr 1:
ffmpeg -i input.mp3 -abr 1 -b:a 150k -ar 44100 -ac 2 re_encoded.mp3
ffmpeg -i re_encoded.mp3
# Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 150 kb/s
Now the bitrate matches the original audio. However, I am not sure this is correct, since I am encoding the new audio as ABR and concatenating it with a VBR file. I am not even sure how to check with ffmpeg whether the audio is VBR, CBR, or ABR, or whether that even matters when concatenating.
A similar issue happens with AAC files: I can't match the original audio's bitrate at all.
ffmpeg -i input.mp3 -b:a 128k -ar 44100 -ac 2 re_encoded.aac
ffmpeg -i re_encoded.aac
# Stream #0:0: Audio: aac (LC), 44100 Hz, stereo, fltp, 135 kb/s
The resulting bitrate always seems to be variable (135 in this case), and hence I can't match it to the original one.
So my question is: what conditions need to be met when concatenating audio files with different streams, and how can I re-encode only one file so that it matches the other? If there is a package that can do this, that would be of great help.
You need to match codec, channel count, and sample rate. You do not need to match bitrate. The decoder will handle the change in bitrate across the join as it would in any other VBR stream, because each frame can indicate its own size. For CBR, all the frames just happen to be the same size.
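As a rough sketch of that check, assuming `ffprobe` is available alongside ffmpeg (it may not be in a WebAssembly build), the three parameters can be read from each file and compared before attempting a stream copy. The helper names `stream_params` and `can_stream_copy` are made up for illustration:

```shell
# Print codec, sample rate, and channel count of the first audio stream,
# one key=value pair per line.
stream_params() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name,sample_rate,channels \
    -of default=noprint_wrappers=1 "$1"
}

# Hypothetical helper: succeed only if both files report identical parameters.
can_stream_copy() {
  [ "$(stream_params "$1")" = "$(stream_params "$2")" ]
}

# Example use (commented out so the snippet only defines the helpers):
# can_stream_copy original.mp3 re_encoded.mp3 && echo "parameters match"
```

Bitrate is deliberately left out of the comparison, in line with the point above that it does not need to match.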
Realistically though, you're not going to want to bother with this. You're going to want to decode everything to raw PCM and re-encode. While this does result in a generation of loss, the upsides are clear: