IBM Cloud - How to adjust speaking rate in Watson TTS using curl POST?

Question

508 views Asked by Bloggy At 03 October 2020 at 15:38

I'm having trouble adjusting the prosody speaking rate in IBM Watson's TTS service using curl. The code below does synthesize audio, but it completely ignores the --header "prosody rate: +50%" ^ line I inserted. That was to be expected, since I wasn't sure how to do this and just improvised that line. Does anyone know how I could get it to work as intended? I want to speed the speech up by 50%, but I can't find anything in the docs that covers this request format.

Thanks!

curl -X POST -u "apikey:apikey" ^
--header "Content-Type: application/json" ^
--header "Accept: audio/wav" ^
--header "prosody rate: +50%" ^
--data "{\"text\":\"Adult capybaras are one meter long.\"}" ^
--output hello_world.wav ^
"URL/v1/synthesize?voice=en-US_HenryV3Voice"

There are 2 answers

**chughts** · Answer 1 · 2020-10-05T10:10:00+00:00

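prosody is an SSML element, not an HTTP header, so I would expect it to be used as tags around the text that you are synthesising. The attribute value needs single quotes (or JSON-escaped double quotes) so that it does not end the JSON string early:

--data "{\"text\":\"<prosody rate='fast'>Adult capybaras are one meter long.</prosody>\"}"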
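
As a rough sketch of the same idea in a POSIX shell (bash, Git Bash, WSL), a small helper can wrap the text in the prosody tags and print the JSON body, which avoids nesting the quotes by hand. The function name ssml_payload is made up for this example. It does no escaping, so it assumes the rate and the text contain no double quotes, backslashes, or XML special characters.

```shell
#!/bin/sh
# Hypothetical helper (not part of curl or the Watson API): wraps text in
# <prosody> tags and prints a JSON body suitable for /v1/synthesize.
# Assumption: $1 (rate) and $2 (text) contain no ", \, <, > or & characters,
# because nothing here escapes them.
ssml_payload() {
  rate=$1
  text=$2
  # \\" in the printf format emits \" , i.e. a JSON-escaped double quote
  # around the rate attribute value.
  printf '{"text":"<prosody rate=\\"%s\\">%s</prosody>"}\n' "$rate" "$text"
}

ssml_payload "+50%" "Adult capybaras are one meter long."
# prints: {"text":"<prosody rate=\"+50%\">Adult capybaras are one meter long.</prosody>"}
```

The output can then be piped into curl with --data @-, which makes curl read the request body from standard input instead of the command line.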

**Vidyasagar Machupalli** · Answer 2 · 2020-10-05T10:19:09+00:00

Here's a working example of the POST call:

curl -X POST -u "apikey:{API_KEY}" \
--header "Accept: audio/wav" \
--header "Content-Type: application/json" \
--data '{"text": "<p><s><prosody rate=\"+50%\">This is the first sentence of the paragraph.</prosody></s><s>Here is another sentence.</s><s>Finally, this is the last sentence.</s></p>"}' \
--output result.wav \
"{URL}/v1/synthesize" -v

On a Windows command prompt (cmd), create a JSON file named input.json with the command below:

echo {"text": "<p><s><prosody rate='+50%'>This is the first sentence of the paragraph.</prosody></s><s>Here is another sentence.</s><s>Finally, this is the last sentence.</s></p>"} > input.json

Then run cURL to produce the result.wav file:

curl -X POST -u "apikey:{API_KEY}" ^
--header "Accept: audio/wav" ^
--header "Content-Type: application/json" ^
--data @input.json ^
--output result.wav ^
"{URL}/v1/synthesize" -v

For the sentence in your question, replace the JSON above with yours:

{"text":"<prosody rate='fast'>Adult capybaras are one meter long.</prosody>"}
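
Putting the pieces together for a bash-style shell, the following is a minimal sketch that writes that JSON to input.json with a here-document and sends it with the voice from the question. API_KEY and URL are placeholder environment variables assumed for this example, not names the service defines, and the request is skipped when they are unset. It has not been verified against the live service.

```shell
#!/bin/sh
# Write the request body to a file. The quoted 'EOF' stops the shell from
# expanding or reinterpreting anything inside the here-document, so the
# SSML tags and the % sign are stored exactly as typed.
cat > input.json <<'EOF'
{"text":"<prosody rate='+50%'>Adult capybaras are one meter long.</prosody>"}
EOF

# API_KEY and URL are assumed placeholders for your service credentials.
# The request is only sent when both are set.
if [ -n "${API_KEY:-}" ] && [ -n "${URL:-}" ]; then
  curl -X POST -u "apikey:${API_KEY}" \
    --header "Content-Type: application/json" \
    --header "Accept: audio/wav" \
    --data @input.json \
    --output hello_world.wav \
    "${URL}/v1/synthesize?voice=en-US_HenryV3Voice"
else
  echo "Set API_KEY and URL to send the request; input.json was written." >&2
fi
```

Keeping the body in a file means the shell never sees the < and > characters on the command line, which is the same reason the cmd example above uses input.json.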

Here are some useful links I followed to create this code sample; they will help you understand the SSML attributes. Also, check the limitations of <prosody> in the links below
