I have a telephony system using 3CX and BroadWorks. When a call is initiated, an INVITE packet is sent to a port. That INVITE contains the port number on which the RTP data can be captured. My Python code listens on that port to capture the RTP data and passes it to the Transcribe streaming function. This works fine with microphone audio but doesn't work for audio captured this way.
SIP packet details:
m=audio 21706 RTP/AVP 0 101 8 18 9 127 107
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:127 opus/48000/2
a=fmtp:127 useinbandfec=1
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=rtpmap:107 telephone-event/48000
a=fmtp:107 0-15
a=ptime:20
a=maxptime:20
a=sendonly
a=label:2
This is the SDP body of the INVITE; the m=audio line carries the RTP port (21706 here) and the offered payload types.
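For reference, a minimal sketch of how the RTP port and the payload-type-to-codec map could be pulled out of an SDP body like the one above. The function name `parse_sdp_audio` and the shortened `SDP` sample are only illustrative, and the parsing assumes a single `m=audio` line.

```python
def parse_sdp_audio(sdp: str) -> dict:
    """Extract the audio port, offered payload types and rtpmap entries."""
    port = None
    payload_types = []
    codecs = {}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m=audio"):
            # m=audio <port> <proto> <payload type> <payload type> ...
            parts = line.split()
            port = int(parts[1])
            payload_types = [int(pt) for pt in parts[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding>/<clock rate>[/<channels>]
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            codecs[int(pt)] = encoding
    return {"port": port, "payload_types": payload_types, "codecs": codecs}


# Shortened copy of the SDP attributes from the INVITE.
SDP = """m=audio 21706 RTP/AVP 0 101 8 18 9 127 107
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:127 opus/48000/2
a=rtpmap:101 telephone-event/8000
"""

info = parse_sdp_audio(SDP)
# The first payload type listed on the m= line is the preferred codec.
print(info["port"], info["codecs"][info["payload_types"][0]])  # 21706 PCMU/8000
```

The payload type found this way is what identifies the codec of each incoming RTP packet, so it determines which decoder (if any) the audio needs before it is sent on.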
The implementation of AWS Transcribe streaming in my Python code looks like this:
    # Imports assume the amazon-transcribe streaming SDK for Python.
    from amazon_transcribe.handlers import TranscriptResultStreamHandler
    from amazon_transcribe.model import TranscriptEvent, TranscriptResultStream


    class MyEventHandler(TranscriptResultStreamHandler):
        def __init__(self, transcript_result_stream: TranscriptResultStream, log):
            super().__init__(transcript_result_stream)
            self.log = log

        async def handle_transcript_event(self, transcript_event: TranscriptEvent):
            # Log every alternative of every result in the event.
            results = transcript_event.transcript.results
            self.log.info(results)
            for result in results:
                for alt in result.alternatives:
                    self.log.info(alt.transcript)


    class RTPCapturing:
        def __init__(self):
            # some declaration
            pass

        async def write_audio_to_raw_file(self):
            while True:
                try:
                    if not self.isClosing:
                        self.sock.settimeout(10)
                        # Each UDP datagram is forwarded to Transcribe exactly as received.
                        data, addr = self.sock.recvfrom(65565)
                        # data = self.sock.recv(1024)
                        await self.stream.input_stream.send_audio_event(audio_chunk=data)
                    else:
                        return
                except Exception as ex:
                    self.log.error(base.exception())
                    print('RTP data not found')
These are snippets of my code. The code works fine when I pass audio from a microphone, and the results are accurate.
Note: AWS Transcribe streaming accepts signed 16-bit little-endian linear PCM (I use 16000 Hz), as well as OGG-Opus and FLAC.
I have tried changing the SDP attributes to:
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
AWS does not support either of these encodings. Another attribute I tried is:
a=rtpmap:127 opus/48000/2
but I run into a data corruption issue. Is there any conversion or decoding I must do?
    data, addr = self.sock.recvfrom(65565)
    # data = self.sock.recv(1024)
    opus_decoder = opuslib.Decoder(16000, 2)
    opus_data = opus_decoder.decode(data, 320)
    await self.stream.input_stream.send_audio_event(audio_chunk=opus_data)
I also tried decoding like this, but it didn't work. Any help would be appreciated.
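One thing worth checking: each datagram returned by `recvfrom` is a whole RTP packet, so the codec payload only starts after the RTP header (12 bytes, plus any CSRC entries and header extension). Passing the full datagram to a decoder or to Transcribe may be one reason the audio looks corrupted. Below is a minimal sketch, not a definitive implementation, that strips the header and converts a PCMU (payload type 0) payload to signed 16-bit little-endian PCM. The helper names `split_rtp_packet` and `pcmu_to_pcm16le` and the sample packet are hypothetical, and the sketch assumes the Transcribe stream would be started as PCM at 8000 Hz to match the G.711 clock rate.

```python
import struct


def split_rtp_packet(packet: bytes):
    """Return (payload_type, sequence_number, payload) for one RTP datagram."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second, seq = struct.unpack("!BBH", packet[:4])
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    has_padding = bool(first & 0x20)
    has_extension = bool(first & 0x10)
    csrc_count = first & 0x0F
    payload_type = second & 0x7F
    # Fixed header is 12 bytes, followed by 4 bytes per CSRC identifier.
    offset = 12 + 4 * csrc_count
    if has_extension:
        # Extension header: 16-bit profile id, then 16-bit length in 32-bit words.
        (ext_words,) = struct.unpack("!H", packet[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(packet)
    if has_padding:
        # The last byte holds the number of padding bytes, itself included.
        end -= packet[-1]
    return payload_type, seq, packet[offset:end]


def _ulaw_to_linear(u: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit sample."""
    u = ~u & 0xFF
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


# Precomputed lookup table: one decoded sample per possible mu-law byte.
_ULAW_TABLE = [_ulaw_to_linear(b) for b in range(256)]


def pcmu_to_pcm16le(payload: bytes) -> bytes:
    """Convert a PCMU payload to signed 16-bit little-endian PCM bytes."""
    samples = [_ULAW_TABLE[b] for b in payload]
    return struct.pack("<%dh" % len(samples), *samples)


# Hypothetical datagram: 12-byte RTP header (version 2, payload type 0) + 3 PCMU bytes.
packet = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234) + bytes([0xFF, 0x00, 0x80])
payload_type, seq, payload = split_rtp_packet(packet)
pcm = pcmu_to_pcm16le(payload)
print(payload_type, seq, struct.unpack("<3h", pcm))  # 0 1 (0, -32124, 32124)
```

The same header stripping would presumably be needed before handing an Opus payload (payload type 127 in this SDP) to `opuslib`, since the decoder expects a bare Opus packet rather than the RTP datagram.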