How can I generate a transcript with timestamps from my audio using pocketsphinx in Jupyter?

#!/usr/bin/env python
import os

import pocketsphinx as ps

DATADIR = 'deps/pocketsphinx/test/data'

# Create a decoder with certain model
config = ps.Decoder.default_config()
decoder = ps.Decoder(config)

# Decode streaming data.
decoder.start_utt()
stream = open(os.path.join(DATADIR, 'hello_world.wav'), 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
print('Best hypothesis segments:', [(seg.word, seg.start_frame, seg.end_frame) for seg in decoder.seg()])

Is there a way to get a transcript like this using pocketsphinx?

[{"start_time":"0.5","end_time":"2.5", "confidence":"1.0","text":"Hello"}, {"start_time":"3.0","end_time":"4.5", "confidence":"1.0","text":"world"}]
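For reference, here is a minimal sketch of the conversion I have in mind, not a confirmed solution. It assumes the decoder runs at pocketsphinx's default rate of 100 frames per second, and that a confidence value between 0 and 1 is already available for each segment. The helper name `segments_to_transcript` and the tuple layout are hypothetical, made up for illustration.

```python
import json

# Assumption: pocketsphinx's default frame rate of 100 frames per second.
FRAME_RATE = 100


def segments_to_transcript(segments, frame_rate=FRAME_RATE):
    """Turn (word, start_frame, end_frame, confidence) tuples into
    a list of dicts with times in seconds, formatted as strings."""
    transcript = []
    for word, start_frame, end_frame, confidence in segments:
        # Skip silence and filler markers such as <s>, </s>, <sil>, [NOISE].
        if word.startswith('<') or word.startswith('['):
            continue
        transcript.append({
            "start_time": str(round(start_frame / frame_rate, 2)),
            "end_time": str(round(end_frame / frame_rate, 2)),
            "confidence": str(round(confidence, 2)),
            "text": word,
        })
    return transcript


# Hand-written example segments (not real decoder output).
# With a real decoder, the tuples might be built like this (assumption,
# untested): (seg.word, seg.start_frame, seg.end_frame,
#             decoder.get_logmath().exp(seg.prob)) for seg in decoder.seg()
example = [("<s>", 0, 49, 1.0), ("Hello", 50, 250, 1.0), ("world", 300, 450, 1.0)]
print(json.dumps(segments_to_transcript(example)))
```

The helper takes plain tuples rather than the decoder itself, so the frame-to-seconds arithmetic can be checked in a notebook cell without loading a model.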
