I need to convert MP3 files to WAV files that will then be used by a Python script. (The script analyzes the WAV file data as numpy arrays, one per channel.) That all works fine. To get the WAV, I first used Audacity by hand: I opened the MP3, then used "Export as WAV". All good, but I now need to automate this and avoid using Audacity, so I modified my Python script to first run ffmpeg to do the conversion, using:
subprocess.call(["ffmpeg -i", mp3filename, wavfilename])
Looking at the data in the resulting WAV file, I find it is totally different from what is in the Audacity-produced file. I verified that both approaches produce 16-bit signed little-endian PCM at the same sample rate, etc.
Now, if I take that ffmpeg-produced WAV file, open it in Audacity, and "Export as WAV" again, it produces a third WAV that works just fine and looks identical to the one I created with Audacity in the first place. By identical, I mean that I compare the numpy arrays element by element and they line right up, whereas the ffmpeg data does not correlate to the Audacity data at all.
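To illustrate what I mean by comparing element by element, here is a minimal sketch of the kind of check I run (using scipy.io.wavfile as the loader; the filenames are placeholders):

    import numpy as np
    from scipy.io import wavfile

    # Placeholder filenames for the two exports being compared
    rate_a, data_a = wavfile.read("audacity_export.wav")
    rate_f, data_f = wavfile.read("ffmpeg_output.wav")

    print("audacity:", rate_a, data_a.shape, data_a.dtype)
    print("ffmpeg:  ", rate_f, data_f.shape, data_f.dtype)

    # Reshape mono data to (n, 1) so both cases can be indexed per channel
    data_a = data_a.reshape(len(data_a), -1)
    data_f = data_f.reshape(len(data_f), -1)

    # Compare element by element over the overlapping part of each channel
    n = min(len(data_a), len(data_f))
    for ch in range(min(data_a.shape[1], data_f.shape[1])):
        same = np.array_equal(data_a[:n, ch], data_f[:n, ch])
        print(f"channel {ch}: identical over first {n} samples: {same}")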
I am using the latest ffmpeg (version N-99557-g6bdfea8d4b, Lavf58.62.100); Audacity is 2.3.2. I have also tried pydub, but that gives the same results as ffmpeg.
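The pydub attempt looked roughly like this (paths are placeholders; since pydub calls out to ffmpeg under the hood, the matching result is perhaps not surprising):

    from pydub import AudioSegment

    # Placeholder paths; the exported WAV matched the direct ffmpeg output,
    # not the Audacity export.
    audio = AudioSegment.from_mp3("input.mp3")
    audio.export("output_pydub.wav", format="wav")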
Clearly, Audacity is doing something different when it exports the WAV. And even though the ffmpeg data looks very different in Python, Audacity plays the ffmpeg file just fine, so Audacity is evidently able to account for whatever ffmpeg did when it made the WAV file. I would simply like to get ffmpeg to emulate what Audacity does. Any advice or insight is greatly appreciated.