I’m trying to use pyttsx3 in a game where speech is said in response to events within the game, however I can’t figure out how to execute speech commands without pausing the game loop until the audio finishes. Using runAndWait() is no good for this reason, and I can’t get the example in the docs using iterate() to work as I need it to either. See the code below:
import pyttsx3
import pygame as pg
def onStart(name):
print('start')
def onWord(name, location, length):
print('word', name, location, length)
def onEnd(name, completed):
print('end')
def speak(*args, **kwargs):
engine.connect('started-utterance', onStart)
engine.connect('started-word', onWord)
engine.connect('finished-utterance', onEnd)
engine.say(*args, **kwargs)
def main():
engine.startLoop(False)
n = 0
while True:
n += 1
print(n)
if n == 3:
speak('The first utterance.')
elif n == 6:
speak('Another utterance.')
engine.iterate()
clock.tick(1)
if __name__ == '__main__':
pg.init()
engine = pyttsx3.init()
clock = pg.time.Clock()
main()
pg.quit()
sys.exit()
In this example, the first statement is triggered at the right time, but it seems like pyttsx3 stops processing at that point - no further say commands produce any sound, and only the first started-utterance event fires - the started-word and finished-utterance commands never fire. I’ve tried this every way I can think of and still can’t get it to work. Any ideas?
I'm not familiar with pyttx3, but I've created an example that incorporates a pygame event loop. The colour changes with every mouse click event and the name of the colour is spoken.
If you run this you'll see a repetition of the problem you described, the game loop pausing, indicated by the frame count in the title bar. I used a longer sentence rather than just the colour name to exacerbate the issue. Perhaps someone with a better understanding of pyttx3 will understand where we've both gone wrong. In any case we can get around this by running the Text to Speech engine in a different thread.
In this case, we're using a queue to send the text to the thread which will perform the text-to-speech, so the frame counter does not pause. Unfortunately we can't interrupt the speech, but that might not be a problem for your application.
Let me know if you need any further explanation of what's going on, I've tried to keep it simple.
EDIT: Looking at this issue recorded on github it seems that speech interruption on Windows is not working, the work-around in that discussion is to run the text-to-speech engine in a separate process which is terminated. I think that might incur an unacceptable start-up cost for process and engine re-initialisation every time an interruption is desired. It may work for your usage.