Is it possible to infer how long a phrase will take to speak in pyttsx?

I have a list of phrases, and I want to know how long each phrase will take at a given rate, dynamically, so I can spread them out evenly over some time period.

I am currently using the started-utterance and finished-utterance events to time it: I speak each phrase and record how long it takes. The downside, of course, is that I have to speak every phrase on application load, or do this before load and save the results to disk, and those saved results would be invalidated if the list of phrases changed.

import pyttsx
from datetime import datetime
import time

phrases = ['long sentence', 'medium sentence', 'short sentence', 'word']
# Maps each phrase to its measured speaking duration (a timedelta once spoken).
phrase_lengths = {phrase: 0 for phrase in phrases}

start_time = None
end_time = None

def onStart(name):
    # Fired when the engine begins speaking an utterance.
    global start_time
    start_time = datetime.now()

def onEnd(name, completed):
    # Fired when the utterance ends; name is the label passed to engine.say().
    global end_time
    end_time = datetime.now()
    print("%s took %s" % (name, end_time - start_time))
    phrase_lengths[name] = end_time - start_time

engine = pyttsx.init()
engine.connect('started-utterance', onStart)
engine.connect('finished-utterance', onEnd)

def speak(engine, rate, phrase):
    engine.setProperty('rate', rate)
    # Use the phrase itself as the utterance name so the callbacks can identify it.
    engine.say(phrase, phrase)
    engine.runAndWait()

for phrase in phrases:
    speak(engine, 120, phrase)
    time.sleep(3.0)

1 answer

SashaZd

There's nothing specific in the API documentation to answer you directly. However, here's a test that should work.

Pyttsx lets you set the speech rate of the engine in words per minute. Additionally, you can easily write a short Python script to see how many syllables a word/phrase contains.
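A syllable counter along those lines could look like the sketch below. It is only a heuristic and not part of pyttsx: the function name count_syllables is made up for illustration, and it assumes English text, counting one syllable per run of consecutive vowels and discounting most trailing silent "e"s. It will miscount some words ("medium" comes out as 2, for example), so treat the result as approximate.

```python
import re

def count_syllables(phrase):
    """Rough syllable count: one syllable per run of consecutive vowels in each word."""
    total = 0
    for word in re.findall(r"[a-z]+", phrase.lower()):
        count = len(re.findall(r"[aeiouy]+", word))
        # A trailing silent "e" (as in "sentence") usually adds no syllable,
        # except in endings such as "-le" ("apple") or "-ee" ("tree").
        if count > 1 and word.endswith("e") and not word.endswith(("le", "ee")):
            count -= 1
        # Every word is assumed to have at least one syllable.
        total += max(count, 1)
    return total
```

For the phrases in the question, count_syllables('word') gives 1 and count_syllables('long sentence') gives 3. A pronunciation dictionary would be more accurate if one is available.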

I suggest you set the speech rate, and then see whether a constant number of syllables takes the same amount of time irrespective of what phrase is given to the engine to speak. Once you do that, you'll be able to "estimate" how long a particular phrase may take to say if you know how many syllables it has.
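One way to run that test is to time a few phrases once, fit a straight line through the (syllables, seconds) pairs, and use the line to estimate the rest. The sketch below assumes the relationship is roughly linear at a fixed rate and voice, which is exactly what the experiment needs to confirm. The function names and the sample numbers are hypothetical; real samples would come from the started-utterance/finished-utterance timings (a timedelta can be converted with total_seconds()).

```python
def fit_duration_model(samples):
    """Least-squares fit of seconds = overhead + per_syllable * syllables.

    samples: list of (syllable_count, measured_seconds) pairs, all
    recorded at the same engine rate and voice.
    """
    n = float(len(samples))
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        raise ValueError("need samples with at least two different syllable counts")
    per_syllable = sum((x - mean_x) * (y - mean_y) for x, y in samples) / var_x
    overhead = mean_y - per_syllable * mean_x
    return overhead, per_syllable

def estimate_seconds(model, syllables):
    """Estimate the speaking time for a phrase with the given syllable count."""
    overhead, per_syllable = model
    return overhead + per_syllable * syllables

# Made-up measurements for illustration only, not real pyttsx timings.
samples = [(1, 0.5), (3, 1.1), (5, 1.7)]
model = fit_duration_model(samples)
estimated = estimate_seconds(model, 4)
```

If the measured points fall far from the fitted line, syllable count alone is not a good predictor for that voice, and the fit has to be redone whenever the rate changes.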