AWS SDK transcribe streaming not working for OPUS


I have a telephony system using 3CX and BroadWorks. When a call is initiated, an INVITE packet is sent to a port. That INVITE contains the port number on which the RTP data can be captured. My Python code listens on that port to capture the RTP data and passes it to the Transcribe streaming function. This works fine with microphone audio, but it doesn't work for audio captured this way.

SIP packet details:

m=audio 21706 RTP/AVP 0 101 8 18 9 127 107
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:127 opus/48000/2
a=fmtp:127 useinbandfec=1
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=rtpmap:107 telephone-event/48000
a=fmtp:107 0-15
a=ptime:20
a=maxptime:20
a=sendonly
a=label:2

These are the media (SDP) details from the SIP INVITE packet.
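For reference, a minimal sketch of reading the capture port and the payload-type table out of attributes like the ones above. It assumes the SDP body is available as a plain string; `parse_rtpmap`, `parse_audio_port` and `SDP_EXAMPLE` are hypothetical names used only for illustration.

```python
import re

# Shortened copy of the attributes shown above, used as sample input.
SDP_EXAMPLE = """\
m=audio 21706 RTP/AVP 0 101 8 18 9 127 107
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:127 opus/48000/2
a=ptime:20
"""

# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
RTPMAP_RE = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?\s*$")


def parse_rtpmap(sdp):
    """Map each RTP payload type to (encoding name, clock rate, channels)."""
    codecs = {}
    for line in sdp.splitlines():
        match = RTPMAP_RE.match(line.strip())
        if match:
            payload_type, name, rate, channels = match.groups()
            # The channel count is optional in SDP and defaults to 1.
            codecs[int(payload_type)] = (name, int(rate), int(channels or 1))
    return codecs


def parse_audio_port(sdp):
    """Return the port from the first m=audio line, or None if absent."""
    for line in sdp.splitlines():
        parts = line.split()
        if line.startswith("m=audio") and len(parts) >= 2:
            return int(parts[1])
    return None
```

The payload type in each received RTP header can then be looked up in this table to see which codec a given packet actually carries.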

The implementation of AWS Transcribe streaming in my Python code looks like this:

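from amazon_transcribe.handlers import TranscriptResultStreamHandler
from amazon_transcribe.model import TranscriptEvent, TranscriptResultStream

class MyEventHandler(TranscriptResultStreamHandler):
    def __init__(self, transcript_result_stream: TranscriptResultStream, log):
        super().__init__(transcript_result_stream)
        self.log = log

    async def handle_transcript_event(self, transcript_event: TranscriptEvent):
        # Log the raw results, then every alternative transcript they contain.
        results = transcript_event.transcript.results
        self.log.info(results)
        for result in results:
            for alt in result.alternatives:
                self.log.info(alt.transcript)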

class RTPCapturing:
    def __init__(self):
        # some declaration
        pass

    # Must be a coroutine because it awaits send_audio_event().
    async def write_audio_to_raw_file(self):
        while True:
            try:
                if not self.isClosing:
                    self.sock.settimeout(10)
                    data, addr = self.sock.recvfrom(65565)
                    # data = self.sock.recv(1024)
                    await self.stream.input_stream.send_audio_event(audio_chunk=data)
                else:
                    return
            except Exception as ex:
                self.log.error(base.exception())
                print('RTP data not found')

These are snippets of my code. The code works fine when I pass audio from the microphone, and the results are accurate.

Note: AWS Transcribe streaming supports signed 16-bit little-endian linear PCM (16000 Hz in my case), as well as OGG-Opus and FLAC.
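Since the SDP advertises a=ptime:20, each RTP packet should carry 20 ms of audio, and the expected frame sizes follow from the sample rate. A small sketch of that arithmetic (the helper names are hypothetical):

```python
def samples_per_frame(sample_rate_hz, ptime_ms):
    """Samples per channel contained in one packet of ptime_ms milliseconds."""
    return sample_rate_hz * ptime_ms // 1000


def pcm_bytes_per_frame(sample_rate_hz, ptime_ms, channels=1, bytes_per_sample=2):
    """Size in bytes of one frame of linear PCM (16-bit samples by default)."""
    return samples_per_frame(sample_rate_hz, ptime_ms) * channels * bytes_per_sample


# 20 ms frames at the clock rates that appear in the SDP and in the PCM target.
FRAME_8K = samples_per_frame(8000, 20)
FRAME_16K = samples_per_frame(16000, 20)
FRAME_48K = samples_per_frame(48000, 20)
```

At 16000 Hz a 20 ms frame is 320 samples per channel, which is 640 bytes of 16-bit mono PCM, or 1280 bytes if the decoder produces two interleaved channels.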

I have tried changing the SIP attributes to

a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000

but AWS does not support either of these encodings. Another attribute I tried is:

a=rtpmap:127 opus/48000/2

but then I get a corrupted-data error. Is there any conversion or decoding I must do?
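For the PCMA/PCMU case, one possible conversion is to expand the G.711 bytes to 16-bit linear PCM before sending them. The following is a minimal sketch using the standard G.711 expansion formulas in pure Python. The function names are hypothetical, the input must already be the bare RTP payload (no header), and the output is still 8000 Hz, so the stream would have to be opened with that sample rate (assuming the service accepts 8000 Hz PCM, which I believe it does).

```python
import struct


def ulaw_byte_to_linear(u):
    """Expand one G.711 mu-law (PCMU) byte to a signed 16-bit sample."""
    u = ~u & 0xFF                      # mu-law bytes are stored inverted
    magnitude = (((u & 0x0F) << 3) + 0x84) << ((u >> 4) & 0x07)
    magnitude -= 0x84                  # remove the encoder bias
    return -magnitude if u & 0x80 else magnitude


def alaw_byte_to_linear(a):
    """Expand one G.711 A-law (PCMA) byte to a signed 16-bit sample."""
    a ^= 0x55                          # A-law toggles every other bit
    magnitude = (a & 0x0F) << 4
    segment = (a >> 4) & 0x07
    if segment == 0:
        magnitude += 8
    elif segment == 1:
        magnitude += 0x108
    else:
        magnitude = (magnitude + 0x108) << (segment - 1)
    return magnitude if a & 0x80 else -magnitude


def ulaw_to_pcm16(payload):
    """Convert a PCMU payload to signed 16-bit little-endian PCM bytes."""
    samples = [ulaw_byte_to_linear(b) for b in payload]
    return struct.pack("<%dh" % len(samples), *samples)


def alaw_to_pcm16(payload):
    """Convert a PCMA payload to signed 16-bit little-endian PCM bytes."""
    samples = [alaw_byte_to_linear(b) for b in payload]
    return struct.pack("<%dh" % len(samples), *samples)
```

Each input byte becomes two output bytes, so a 160-byte PCMU packet (20 ms at 8000 Hz) turns into 320 bytes of PCM.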

data, addr = self.sock.recvfrom(65565) 
data = self.sock.recv(1024)
opus_decoder = opuslib.Decoder(16000, 2)
opus_data  = opus_decoder.decode(data, 320)
await self.stream.input_stream.send_audio_event(audio_chunk=opus_data)

I also tried decoding like this, but it didn't work. Can anyone help?
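One thing that may matter here: every UDP datagram read from the socket is a complete RTP packet, so its first 12 or more bytes are an RTP header rather than audio. Below is a minimal sketch of separating header and payload, assuming RTP version 2 as laid out in RFC 3550. `RtpPacket` and `parse_rtp` are hypothetical names, not part of any library.

```python
import struct
from typing import NamedTuple


class RtpPacket(NamedTuple):
    payload_type: int
    sequence: int
    timestamp: int
    payload: bytes


def parse_rtp(packet):
    """Split one RTP datagram into its header fields and codec payload."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second, sequence, timestamp, _ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    has_padding = bool(first & 0x20)
    has_extension = bool(first & 0x10)
    csrc_count = first & 0x0F
    # The fixed header is followed by csrc_count 32-bit CSRC identifiers.
    offset = 12 + 4 * csrc_count
    if has_extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated RTP header extension")
        # The extension length is counted in 32-bit words after its 4-byte header.
        (ext_words,) = struct.unpack("!H", packet[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(packet)
    if has_padding and end > offset:
        end -= packet[-1]              # last byte holds the padding length
    if end < offset:
        raise ValueError("malformed RTP packet")
    # The low 7 bits of the second byte are the payload type (top bit is the marker).
    return RtpPacket(second & 0x7F, sequence, timestamp, packet[offset:end])
```

With something like this, only `parse_rtp(data).payload` would be handed to the decoder, for example `opus_decoder.decode(parse_rtp(data).payload, 320)`, and only for packets whose payload type is 127. Two further things to check in the snippet above: the second `recv` call replaces the packet that `recvfrom` just read, and a decoder created with 2 channels returns interleaved stereo, which may need downmixing if the Transcribe stream is configured for a single channel.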
