Using SeamlessM4Tv2Model: how can I slow down the speech rate of the audio output?

import soundfile as sf

# processor, model, device and language are assumed to be defined earlier
text_inputs = processor(text="I have a daughter 2 years old, I wanted her name to be Hương Ly", src_lang="eng", return_tensors="pt").to(device)
audio_array = model.generate(**text_inputs, tgt_lang=language)[0].cpu().numpy().squeeze()
file_path = 'audio_from_text.wav'
sf.write(file_path, audio_array, 16000)  # written at a 16 kHz sample rate

Reference: documentation example.

This returns about 3 seconds of audio.

I tried adding speech_temperature=0.2 or speech_do_sample=True to generate(), but nothing changed: it still returns 3 seconds of audio. I want to change the rate of speech so that, for example, the same sentence lasts 5 seconds. Any ideas?
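A possible workaround, sketched under assumptions: as far as I understand, speech_temperature and speech_do_sample affect how units are sampled rather than how long they last, so the waveform can instead be stretched after generation. The helper below, time_stretch_ola, is a hypothetical name and not part of transformers; it is a naive overlap-add stretch written with NumPy only, and it will produce audible artifacts. A phase-vocoder implementation such as librosa.effects.time_stretch(y, rate=...) should sound better. To go from 3 seconds to 5 seconds the rate is 3 / 5 = 0.6.

```python
import numpy as np


def time_stretch_ola(audio, rate, frame_len=1024):
    """Naive overlap-add time stretch (hypothetical helper, not a library API).

    rate < 1 slows the audio down, rate > 1 speeds it up.
    Pitch is roughly preserved, but expect some artifacts.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    audio = np.asarray(audio, dtype=np.float32)

    synth_hop = frame_len // 4                      # frame spacing in the output
    ana_hop = max(1, int(round(synth_hop * rate)))  # frame spacing in the input
    window = np.hanning(frame_len).astype(np.float32)

    n_frames = max(1, (len(audio) - frame_len) // ana_hop + 1)
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = np.zeros(out_len, dtype=np.float32)
    norm = np.zeros(out_len, dtype=np.float32)

    for i in range(n_frames):
        start_in = i * ana_hop
        frame = audio[start_in:start_in + frame_len]
        if len(frame) < frame_len:
            # Only happens when the input is shorter than one frame
            frame = np.pad(frame, (0, frame_len - len(frame)))
        start_out = i * synth_hop
        out[start_out:start_out + frame_len] += frame * window
        norm[start_out:start_out + frame_len] += window

    # Undo the gain introduced by the overlapping windows
    out /= np.maximum(norm, 1e-8)
    return out


# Stand-in for the model output: a 3 second tone at 16 kHz
sample_rate = 16000
t = np.arange(3 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 220.0 * t).astype(np.float32)

slow = time_stretch_ola(tone, rate=3.0 / 5.0)
print(len(tone) / sample_rate, len(slow) / sample_rate)

# With the real output from the question this would look like:
# slow = time_stretch_ola(audio_array, rate=3.0 / 5.0)
# sf.write('audio_slow.wav', slow, 16000)  # 'audio_slow.wav' is a made-up file name
```

The output length is approximately the input length divided by rate, so the result is close to, not exactly, the target duration.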
