pydub: how to retain headroom across `export` and `from_file`?


What are the proper ffmpeg flags to pass through `pydub.AudioSegment.export` and/or `pydub.AudioSegment.from_file` to ensure that the audio retains the expected headroom?

I'm trying to write some unit tests that assert that audio normalization is taking place as expected, but I'm finding that when I re-open an exported file, its `max_dBFS` differs from what it was just before exporting:

In [1]: from pydub import AudioSegment, effects, utils
   ...: audio = AudioSegment.from_file("test_input.mp3")
   ...: print(f"Max dB before normalizing: {audio.max_dBFS}")
   ...: audio = effects.normalize(audio, headroom=5)
   ...: print(f"Max dB after normalizing: {audio.max_dBFS}")
   ...: audio.export("test_output.mp3")
   ...: audio = AudioSegment.from_file("test_output.mp3")
   ...: print(f"Max dB of re-opened file: {audio.max_dBFS}")
Max dB before normalizing: 0.0
Max dB after normalizing: -4.99990598233068
Max dB of re-opened file: -2.778919494135808

My first thought was that this has to do with exporting at a fixed bitrate of "320k", but even when I control for the bitrate (which helps a lot), the values still don't match closely enough:

In [2]: audio = AudioSegment.from_file("test_input.mp3")
   ...: print(f"Max dB before normalizing: {audio.max_dBFS}")
   ...: audio = effects.normalize(audio, headroom=5)
   ...: print(f"Max dB after normalizing: {audio.max_dBFS}")
   ...: audio.export("test_output.mp3", bitrate=utils.mediainfo("test_input.mp3")['bit_rate'])
   ...: audio = AudioSegment.from_file("test_output.mp3")
   ...: print(f"Max dB of re-opened file: {audio.max_dBFS}")
Max dB before normalizing: 0.0
Max dB after normalizing: -4.99990598233068
Max dB of re-opened file: -4.425849083800531

Here's an example that uses a generator, so you don't need an actual input file to reproduce it:

In [3]: from pydub import AudioSegment, effects, generators, utils
   ...: audio = AudioSegment.silent(duration=1000).overlay(
   ...:     generators.WhiteNoise().to_audio_segment(duration=1000).apply_gain(-2)
   ...: )
   ...: print(f"Max dB before normalizing: {audio.max_dBFS}")
   ...: audio = effects.normalize(audio, headroom=5)
   ...: print(f"Max dB after normalizing: {audio.max_dBFS}")
   ...: audio.export("test_output.mp3")
   ...: audio = AudioSegment.from_file("test_output.mp3")
   ...: print(f"Max dB of re-opened file: {audio.max_dBFS}")
Max dB before normalizing: -2.0005164576636596
Max dB after normalizing: -4.99990598233068
Max dB of re-opened file: 0.0
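
For context, here's a minimal sketch of how the peak numbers above relate to raw sample values, and of the kind of tolerance check a unit test could use. It assumes 16-bit samples and that `max_dBFS` is 20·log10(peak / full scale). The helpers `peak_dbfs` and `within_headroom` and the 0.01 dB default tolerance are hypothetical names and values chosen for illustration, not part of pydub.

```python
import math


def peak_dbfs(peak_sample: int, bit_depth: int = 16) -> float:
    """Peak level in dBFS for an absolute peak sample value.

    Hypothetical helper; assumes full scale is 2**(bit_depth - 1).
    """
    full_scale = 2 ** (bit_depth - 1)  # 32768 for 16-bit audio
    if peak_sample == 0:
        return -math.inf  # digital silence has no finite peak level
    return 20 * math.log10(peak_sample / full_scale)


def within_headroom(measured_dbfs: float, headroom_db: float,
                    tol_db: float = 0.01) -> bool:
    """True if the measured peak sits at -headroom_db, within tol_db.

    Hypothetical helper; the default tolerance is an arbitrary choice.
    """
    return abs(measured_dbfs - (-headroom_db)) <= tol_db


# A peak sample of 18427 lands just above -5 dBFS (about -4.9999),
# because samples are integers and cannot hit -5 dB exactly.
print(peak_dbfs(18427))

# A peak at full scale reports 0.0 dBFS.
print(peak_dbfs(32768))

# The in-memory value passes the check; the re-opened value does not.
print(within_headroom(peak_dbfs(18427), 5))
print(within_headroom(-4.425849083800531, 5))
```

Under these assumptions, the "after normalizing" value is as close to -5 dBFS as integer samples allow, so the drift only appears once the audio goes through the MP3 export and re-import.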