Using SeamlessM4Tv2Model, how can I slow down the speech rate of the audio output?

Asked by lam vu Nguyen on 15 March 2024 at 08:17 (20 views)

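# Assumes `processor`, `model`, `device`, and `language` are already defined (loaded from a SeamlessM4T v2 checkpoint)
import soundfile as sf

text_inputs = processor(text="I have a daughter 2 years old, I wanted her name to be Hương Ly", src_lang="eng", return_tensors="pt").to(device)
# The first element returned by generate() is the waveform; convert it to a 1-D NumPy array on the CPU
audio_array = model.generate(**text_inputs, tgt_lang=language)[0].cpu().numpy().squeeze()
file_path = 'audio_from_text.wav'
sf.write(file_path, audio_array, 16000)  # written at a 16 kHz sample rate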


This returns a 3-second audio clip.

I tried adding speech_temperature=0.2 or speech_do_sample=True to generate(), but nothing changed: it still returns a 3-second clip. I want to change the rate of speech so that, for example, the same sentence takes 5 seconds. Any ideas?
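One possible workaround, offered as a sketch rather than a confirmed SeamlessM4T feature, is to leave generate() alone and stretch the waveform afterwards. The block below is a naive overlap-add (OLA) time stretch written with NumPy only. The function name `time_stretch_ola` and its parameters are hypothetical, and a synthetic sine tone stands in for the real `audio_array`. A rate of 3/5 should turn a 3-second clip into one of roughly 5 seconds.

```python
import numpy as np


def time_stretch_ola(audio, rate, frame_len=1024, hop_out=256):
    """Naive overlap-add time stretch (hypothetical helper, not part of transformers).

    rate < 1.0 slows the audio down (longer output); rate > 1.0 speeds it up.
    Frames are read every `hop_out * rate` samples and written every `hop_out`
    samples, so pitch is roughly preserved, with some audible artifacts.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    audio = np.asarray(audio, dtype=np.float32)
    if len(audio) < frame_len:
        # Pad very short clips so at least one full frame exists
        audio = np.pad(audio, (0, frame_len - len(audio)))

    hop_in = hop_out * rate  # analysis hop, may be fractional
    n_frames = int((len(audio) - frame_len) / hop_in) + 1
    window = np.hanning(frame_len).astype(np.float32)

    out_len = (n_frames - 1) * hop_out + frame_len
    out = np.zeros(out_len, dtype=np.float32)
    norm = np.zeros(out_len, dtype=np.float32)

    for i in range(n_frames):
        start_in = int(round(i * hop_in))
        start_out = i * hop_out
        frame = audio[start_in:start_in + frame_len]
        out[start_out:start_out + frame_len] += frame * window
        norm[start_out:start_out + frame_len] += window

    # Divide by the summed windows to keep the overall amplitude stable
    return out / np.maximum(norm, 1e-8)


# Stand-in for the model output: 3 seconds of a 220 Hz tone at 16 kHz
sr = 16000
t = np.arange(3 * sr) / sr
audio_array = 0.1 * np.sin(2 * np.pi * 220 * t)

slow = time_stretch_ola(audio_array, rate=3 / 5)
print(f"original: {len(audio_array) / sr:.2f} s, stretched: {len(slow) / sr:.2f} s")
```

With the real model output, `slow` could be passed to sf.write(file_path, slow, 16000) in place of `audio_array`. Plain OLA tends to sound slightly rough on speech. If librosa is available, librosa.effects.time_stretch(audio_array, rate=0.6) is a phase-vocoder alternative that may sound cleaner.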
