I am calling the Azure Speech API to synthesize speech from SSML, working up from simple to more complex markup. A simple request without any phoneme or pitch adjustment works fine, but as soon as I add a style, the request fails. The SSML I send matches the example on the official website, yet the response is a 400 error. I can't see what is wrong with my format.
This is my request body:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" lang="zh-CN" version="1.0">
<voice name="zh-CN-YunxiNeural">
<mstts:express-as style="sad">快走吧, 路上一定要注意安全,早去早回。</mstts:express-as>
</voice>
</speak>
This is the error response:
Response{protocol=http/1.1, code=400, message=Synthesis failed. StatusCode: FailedPrecondition, Details: SSML parsing error: 0x80045003 - The caller has specified an unsupported format.., url=https://eastus.tts.speech.microsoft.com/cognitiveservices/v1}
This is working now. I changed the lang attribute on the speak element to xml:lang, and that did the job. This is my working XML:
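For reference, here is a sketch of what the corrected request body would look like. It is a reconstruction, assuming the only change is renaming lang to xml:lang on the speak element as described above. Everything else is copied unchanged from the request in the question.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Assumption: only the language attribute differs from the failing request (lang became xml:lang) -->
<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="zh-CN" version="1.0">
<voice name="zh-CN-YunxiNeural">
<mstts:express-as style="sad">快走吧, 路上一定要注意安全,早去早回。</mstts:express-as>
</voice>
</speak>
```

The xml: prefix is reserved by the XML specification, so it needs no xmlns declaration of its own.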