How to tune the quality of Festival TTS?


I am using the 2.1 release of Festival. I was able to install and use the 172M voice with

(voice_cmu_us_slt_arctic_clunits)

The quality has improved significantly but is still far from what I want. I believe synthesis still relies on many default settings. Is it possible to tune it further (e.g. close to the quality of the qwiki.com engine)? I understand that I need a proper combination of

  • Synthesis method
  • Intonation/duration settings
  • Audio output parameters
  • xx ?

but it is very difficult to find all the details, so progress is quite slow.
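For reference, the kind of settings meant above can be sketched with Festival's global parameters. This is only a minimal sketch: `Duration_Stretch` and `Audio_Required_Rate` are standard Festival parameters, but the values are arbitrary illustrations, and it is an assumption that they change anything audible for a clunits (unit selection) voice, where timing and pitch come largely from the recorded units.

```scheme
;; Select the voice first, since loading a voice may reset some parameters.
(voice_cmu_us_slt_arctic_clunits)

;; Duration: values above 1.0 slow speech down, values below 1.0 speed it up.
;; 1.1 is an arbitrary illustrative value; the effect on a clunits voice
;; may be limited (assumption, not verified for this voice).
(Parameter.set 'Duration_Stretch 1.1)

;; Audio output: request a specific sample rate for playback.
;; 16000 is an illustrative value, not a recommendation.
(Parameter.set 'Audio_Required_Rate 16000)

;; Speak a test sentence with the settings above.
(SayText "This is a short test sentence.")
```

These lines can be typed at the `festival` prompt or placed in a `.festivalrc` file so they apply to every session.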

Any tips, links to tutorials/docs (even for an old version, as long as it provides some theory overview), or Scheme snippets are appreciated.

PS

Please note that for now I am not interested in tuning the algorithms themselves (e.g. training the voice model with sphinx).

To generate speech I use commands like

(SayText "This is a short introduction ...")

and

./text2wave -eval '(voice_cmu_us_slt_arctic_clunits)' TEXT > output.wav
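The same file output can also be produced from the Festival prompt, which makes it easier to experiment with the waveform before saving. The following is a rough sketch, assuming the standard utterance functions `utt.synth`, `utt.wave.rescale`, and `utt.save.wave`; the variable name `utt`, the test sentence, and the gain value are placeholders.

```scheme
;; Select the voice used throughout this question.
(voice_cmu_us_slt_arctic_clunits)

;; Build and synthesize an utterance without playing it.
(set! utt (utt.synth (Utterance Text "This is a short test sentence.")))

;; Normalize the waveform, then scale it by 0.9 to leave some headroom.
;; 0.9 is an arbitrary illustrative gain.
(utt.wave.rescale utt 0.9 t)

;; Write the result as a RIFF (.wav) file in the current directory.
(utt.save.wave utt "output.wav" 'riff)
```

Keeping the utterance in a variable means the same text can be re-saved after each parameter change and the files compared by ear.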
