I would like to know how a VoiceXML document is rendered by the text-to-speech (TTS) engine of a speech server. The VXML document contains the text that is supposed to be converted into audio. If the TTS server understands MRCP, what is the VXML document converted into so that the speech server can understand it, and how?
What is the workflow between VoiceXML and Speech Synthesis?
Asked by Abhishek
The TTS engine does not parse the VoiceXML document as a whole. Instead, the VoiceXML browser extracts each prompt, together with any Speech Synthesis Markup Language (SSML) markup it contains, and passes just that content to the TTS engine via MRCP, typically as the body of a SPEAK request. The synthesized audio comes back to the browser as a media stream (RTP), not as a converted document.
You can find more information on SSML in the W3C SSML 1.0 Specification.
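To make the extraction step concrete, here is a minimal sketch in Python of what a VoiceXML browser does conceptually before it talks to the TTS resource. It is not how any particular browser is implemented. The sample document, the function names `extract_ssml_prompts` and `_to_ssml_namespace`, and the `en-US` fallback language are all assumptions made for illustration. The sketch only handles static prompt text and inline markup; a real browser also evaluates things like `<value>` expressions and `<audio>` elements, and wraps the result in an MRCP message, which is omitted here.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Serialize SSML elements without an "ns0:" prefix (global ElementTree setting).
ET.register_namespace("", SSML_NS)

# Hypothetical VoiceXML page used only for this illustration.
SAMPLE_VXML = """<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form>
    <block>
      <prompt>Welcome. <emphasis>Please</emphasis> hold<break time="500ms"/> while we connect you.</prompt>
    </block>
  </form>
</vxml>"""


def _to_ssml_namespace(elem):
    """Copy an element and its children, moving each tag into the SSML namespace."""
    local_name = elem.tag.rsplit("}", 1)[-1]
    clone = ET.Element(f"{{{SSML_NS}}}{local_name}", dict(elem.attrib))
    clone.text, clone.tail = elem.text, elem.tail
    for child in elem:
        clone.append(_to_ssml_namespace(child))
    return clone


def extract_ssml_prompts(vxml_text):
    """Return one standalone SSML document (as a string) per <prompt> element."""
    root = ET.fromstring(vxml_text)
    lang = root.get(XML_LANG, "en-US")  # assumed fallback when xml:lang is absent
    documents = []
    for prompt in root.iter(f"{{{VXML_NS}}}prompt"):
        # The prompt's content becomes the content of an SSML <speak> root.
        speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.0", XML_LANG: lang})
        speak.text = prompt.text
        for child in prompt:
            speak.append(_to_ssml_namespace(child))
        documents.append(ET.tostring(speak, encoding="unicode"))
    return documents


if __name__ == "__main__":
    for body in extract_ssml_prompts(SAMPLE_VXML):
        # A browser would send this as the SPEAK body with
        # Content-Type: application/ssml+xml
        print(body)
```

Each string returned by `extract_ssml_prompts` is a self-contained SSML document, which is the only thing the TTS engine ever sees. The rest of the VoiceXML page (forms, grammars, control flow) stays inside the browser.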