SpeechSynthesis API on iOS 17 failed with certain text


// Chosen zh-CN voice, if one is available.
let voice = null;
function updateVoice() {
  voice = speechSynthesis.getVoices().find(v => v.lang === 'zh-CN');
  document.getElementById('voice_name').textContent = voice?.name ?? '(No Voice)';
}
// The voice list may load asynchronously, so refresh whenever it changes.
speechSynthesis.addEventListener('voiceschanged', updateVoice);
updateVoice();
const speak = document.getElementById('speak');
const text = document.getElementById('text');
speak.addEventListener('click', () => {
  const ssu = new SpeechSynthesisUtterance(text.value);
  ssu.voice = voice;
  ssu.lang = 'zh-CN';
  ssu.addEventListener('start', e => console.log('start', e));
  ssu.addEventListener('end', e => console.log('end', e));
  ssu.addEventListener('error', e => console.log('error', e));
  speechSynthesis.speak(ssu);
});

<textarea id="text">“你好”</textarea>
<div id="voice_name"></div>
<button type="button" id="speak">
Speak
</button>

My PWA uses the Speech Synthesis API. It worked on Safari on iOS 16.5, but it stopped working after I upgraded my device to iOS 17. The code is shown above. On my iPhone SE2 running iOS 17, trying to speak “你好” (你好 wrapped in CJK quotation marks “”) does nothing. However, if I remove the quotes, leave only 你好 in the textarea, and click the Speak button again, it works as expected. I'm not sure how to get rid of this problem. Could anyone help me find a workaround?


There is 1 answer

tsh On

The text to speak is supplied by the user and is out of my control, so I cannot avoid quotes in it. After some trial and error, the best workaround I have found so far is:

  • Append “”。 to every text passed to the SpeechSynthesisUtterance

It is a bit hacky, but it works.

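// Workaround suffix appended to every utterance on iOS 17.
const extraSuffix = '“”。';

// Chosen zh-CN voice, if one is available.
let voice = null;
function updateVoice() {
  voice = speechSynthesis.getVoices().find(v => v.lang === 'zh-CN');
  document.getElementById('voice_name').textContent = voice?.name ?? '(No Voice)';
}
// The voice list may load asynchronously, so refresh whenever it changes.
speechSynthesis.addEventListener('voiceschanged', updateVoice);
updateVoice();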
const speak = document.getElementById('speak');
const text = document.getElementById('text');
speak.addEventListener('click', () => {
  const ssu = new SpeechSynthesisUtterance(text.value + extraSuffix);
  ssu.voice = voice;
  ssu.lang = 'zh-CN';
  ssu.addEventListener('start', e => console.log('start', e));
  ssu.addEventListener('end', e => console.log('end', e));
  ssu.addEventListener('error', e => console.log('error', e));
  speechSynthesis.speak(ssu);
});

<textarea id="text">“你好”</textarea>
<div id="voice_name"></div>
<button type="button" id="speak">
Speak
</button>

I found that appending a 。 to the end of the text makes speech synthesis work again. However, 怎么? and 怎么?。 are spoken with a different intonation, which is not what I want, whereas 怎么?“”。 sounds fine. That is why the suffix looks so strange.
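
If the same text might pass through the workaround more than once (for example when a queued utterance is retried), the suffix logic could be pulled into a small helper. This is only a sketch: `withWorkaroundSuffix` and `EXTRA_SUFFIX` are hypothetical names that are not in the snippet above, and the guard against double-appending is an assumption about what is useful, not something I have tested on iOS 17.

```javascript
// Hypothetical helper; the name and the double-append guard are assumptions.
const EXTRA_SUFFIX = '“”。';

function withWorkaroundSuffix(text) {
  // Skip the suffix if the text already ends with it, so that
  // re-processing the same string does not keep growing it.
  return text.endsWith(EXTRA_SUFFIX) ? text : text + EXTRA_SUFFIX;
}
```

It would then be used as `new SpeechSynthesisUtterance(withWorkaroundSuffix(text.value))` in the click handler.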