Getting soundfile.LibsndfileError: Error opening 'speech.wav': Format not recognized when giving 2D numpy array to soundfile

Question

Asked by Jacob Mukiti on 16 December 2022 at 08:30 (12.4k views)

I was trying to generate audio from the tensors produced by an NVIDIA NeMo TTS model when I ran into the error in the title.

Here is the code:

import soundfile as sf

from nemo.collections.tts.models import FastPitchModel
from nemo.collections.tts.models import HifiGanModel

spec_generator = FastPitchModel.from_pretrained("tts_en_fastpitch")
vocoder = HifiGanModel.from_pretrained(model_name="tts_hifigan")

text = "Just keep being true to yourself, if you're passionate about something go for it. Don't sacrifice anything, just have fun."
parsed = spec_generator.parse(text)
spectrogram = spec_generator.generate_spectrogram(tokens=parsed)
audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)
audio = audio.to('cpu').detach().numpy()

sf.write("speech.wav", audio, 22050)

I expected to get an audio file, speech.wav.

There are 3 answers

wleong On 26 March 2023 at 22:08

In case you really need the audio in stereo (like I did), transpose the array so that it has shape (samples, channels). Per the soundfile documentation, that is the layout sf.write expects for multichannel data.
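A minimal sketch of that transpose, not taken from the original answer. The helper name `to_frames_by_channels` is hypothetical, and it assumes the shorter axis of a 2D array is the channel axis, which is a heuristic for typical clips rather than anything soundfile enforces.

```python
import numpy as np


def to_frames_by_channels(audio: np.ndarray) -> np.ndarray:
    """Return audio laid out as (frames, channels), or 1D for mono.

    Assumption: in a 2D array the shorter axis holds the channels.
    """
    audio = np.asarray(audio)
    if audio.ndim == 1:
        return audio  # mono data is accepted as-is
    if audio.ndim != 2:
        raise ValueError(f"expected a 1D or 2D array, got shape {audio.shape}")
    rows, cols = audio.shape
    # (channels, frames) has fewer rows than columns, so transpose it.
    return audio.T if rows < cols else audio


# Stand-in for one second of stereo audio stored as (channels, frames).
stereo = np.zeros((2, 22050), dtype=np.float32)
fixed = to_frames_by_channels(stereo)
print(fixed.shape)  # (22050, 2)
```

With soundfile imported as sf, the result can then be passed on, for example sf.write("stereo.wav", fixed, 22050).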

Ayodeji Babalola On 07 August 2023 at 14:40

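import librosa as lib  # assumption: `lib` in this snippet refers to librosa
import soundfile as sf

x, _ = lib.load(path, sr=None, mono=True)  # mono=True returns a 1D array; sr=None keeps the file's native rate
sf.write('new-file.wav', x, 4000)  # for a file we want to write with a 4 kHz sample rate

Check that mono=True, so that a stereo file is downmixed to a single channel when it is loaded.

The above code solves the problem. You need to check that the channel layout of the loaded array matches the one you are trying to write.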

jlamperez (Accepted Answer) On 06 January 2023 at 11:54

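Looking at your example, I see that your audio has shape (1, 173056). soundfile reads a 2D array as (frames, channels), so this is treated as a single frame with 173056 channels, which does not appear to be a layout the WAV format can hold.

Based on https://github.com/bastibe/python-soundfile/issues/309 I converted the audio to a 1D array of size 173056, and it worked fine.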

Used code:

>>> import numpy as np
>>> sample_rate = 22050  # same rate as in the question
>>> sf.write("speech.wav", np.ravel(audio), sample_rate)
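Note that np.ravel flattens whatever it is given, so on a genuine multichannel array it would silently join the channels end to end. A slightly stricter sketch, assuming the vocoder output has shape (1, n_samples) as in the question; the helper name `drop_batch_axis` is hypothetical and not part of soundfile or NeMo.

```python
import numpy as np


def drop_batch_axis(audio: np.ndarray) -> np.ndarray:
    """Turn a (1, n_samples) batch into a 1D mono signal.

    Raises instead of flattening when the leading axis is not of size 1,
    so multichannel data is never merged into one channel by accident.
    """
    audio = np.asarray(audio)
    if audio.ndim == 1:
        return audio  # already mono
    if audio.ndim == 2 and audio.shape[0] == 1:
        return audio[0]  # drop the batch axis, keep the samples
    raise ValueError(f"expected shape (n,) or (1, n), got {audio.shape}")


# Stand-in for the vocoder output in the question.
audio = np.zeros((1, 173056), dtype=np.float32)
mono = drop_batch_axis(audio)
print(mono.shape)  # (173056,)
```

With soundfile imported as sf, sf.write("speech.wav", mono, 22050) should then produce a mono file.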

Regards,
