How to convert byte array to audio file?

Question


2.3k views Asked by Adrian Costin At 21 August 2019 at 17:27

I have written a program that gets SIP packets in real time from the network and I want to use the SDP information embedded in the packets to capture the audio conversation from two VOIP soft phones.

Once I retrieve the binary data from the RTP packets, how should I go about converting it into a sound file?

C++ preferred.


There are 2 answers

mail2subhajit On 27 August 2019 at 10:24

If you only need a recording of the audio (a .wav file, with G.711 A-law or µ-law as the codec used in the call), you can do this without writing any code:

Use Wireshark to capture the network packets into a pcap file.

In Wireshark, select an RTP packet and open Telephony -> RTP -> Stream Analysis.

In the Stream Analysis window, choose Save and pick the forward or reverse stream audio from the drop-down menu.

Save it in .raw file format.

Open the .raw file in Audacity (import it as raw data with the A-law or µ-law encoding, mono, 8000 Hz) and export it as a .wav file.

I hope it helps you.

**tomrtc** · Accepted Answer · 2019-08-23T07:38:41+00:00

Hi Adrian and welcome,

You are right: you cannot simply concatenate the RTP payloads one after another into a file and then read that file as an audio file such as a ".wav".

The missing part you are looking for is a piece of code that reassembles, decodes, and plays out the flow of RTP packets as voice samples. For the sake of simplicity, consider the well-known G.711 (PCM) codec, because all SIP phones support it. You need to implement a play-out buffer (logically an infinite buffer, but a ring buffer with wrap-around is OK).
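To make the play-out buffer idea more concrete, here is a minimal sketch of a ring buffer indexed by RTP timestamp. The class name `PlayoutBuffer` and its methods are hypothetical, made up for illustration. It assumes G.711, where the RTP timestamp advances by one per sample, a capacity greater than zero, and that the first packet seen is also the earliest one. A real implementation would need to handle packets that arrive before that first one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical play-out buffer: a ring of 16-bit samples indexed by RTP timestamp.
// Slots that never receive a packet stay at 0, i.e. lost packets play as silence.
class PlayoutBuffer {
public:
    // capacity is the ring size in samples and must be greater than zero.
    explicit PlayoutBuffer(std::size_t capacity) : samples_(capacity, 0) {}

    // Store decoded samples at the position derived from the packet's RTP timestamp.
    // Assumes a clock of one timestamp tick per sample (true for G.711 at 8000 Hz).
    void write(std::uint32_t rtpTimestamp, const std::vector<std::int16_t>& decoded) {
        if (!haveBase_) {
            base_ = rtpTimestamp;  // the first packet seen defines index 0
            haveBase_ = true;
        }
        // Unsigned subtraction stays correct when the 32-bit timestamp wraps.
        const std::uint32_t offset = rtpTimestamp - base_;
        for (std::size_t i = 0; i < decoded.size(); ++i) {
            samples_[(offset + i) % samples_.size()] = decoded[i];
        }
    }

    // Read the sample at a play-out index (wraps around the ring).
    std::int16_t at(std::size_t index) const {
        return samples_[index % samples_.size()];
    }

private:
    std::vector<std::int16_t> samples_;
    std::uint32_t base_ = 0;
    bool haveBase_ = false;
};
```

Because each packet is placed by its timestamp rather than by arrival order, a late or reordered packet still lands in the right slot, as long as it arrives before that slot has been played out or overwritten.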

Each packet carries a small audio payload, typically 20 ms long. Each chunk of audio data is preceded by an RTP header, which indicates the type of encoding (this is related to the SDP information, and you already have a good understanding of that part).
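As an illustration of what that header holds, here is a hedged sketch that parses the fixed RTP header defined in RFC 3550 out of a raw UDP payload. The `RtpPacket` struct and the `parseRtp` function are names invented for this example, and the code needs C++17 for `std::optional`. For the static payload types, 0 means G.711 µ-law (PCMU) and 8 means G.711 A-law (PCMA); dynamic types are mapped by the SDP.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical view of one parsed RTP packet; payload points into the input buffer.
struct RtpPacket {
    std::uint8_t payloadType;   // 0 = PCMU, 8 = PCMA for the static G.711 types
    bool marker;
    std::uint16_t sequence;
    std::uint32_t timestamp;
    std::uint32_t ssrc;
    const std::uint8_t* payload;
    std::size_t payloadSize;
};

// Parse the RTP header (RFC 3550, section 5.1). Returns nullopt on malformed input.
std::optional<RtpPacket> parseRtp(const std::uint8_t* data, std::size_t size) {
    if (size < 12) return std::nullopt;               // fixed header is 12 bytes
    if ((data[0] >> 6) != 2) return std::nullopt;     // RTP version must be 2

    const bool hasPadding = (data[0] & 0x20) != 0;
    const bool hasExtension = (data[0] & 0x10) != 0;
    const std::size_t csrcCount = data[0] & 0x0F;

    std::size_t offset = 12 + 4 * csrcCount;          // skip the CSRC list
    if (hasExtension) {
        if (size < offset + 4) return std::nullopt;
        // Extension length is given in 32-bit words, not counting its 4-byte header.
        const std::size_t words =
            (static_cast<std::size_t>(data[offset + 2]) << 8) | data[offset + 3];
        offset += 4 + 4 * words;
    }
    if (size < offset) return std::nullopt;

    std::size_t end = size;
    if (hasPadding) {
        if (end == offset) return std::nullopt;
        const std::size_t pad = data[size - 1];       // last byte holds the padding count
        if (pad == 0 || pad > end - offset) return std::nullopt;
        end -= pad;
    }

    RtpPacket pkt{};
    pkt.payloadType = data[1] & 0x7F;
    pkt.marker = (data[1] & 0x80) != 0;
    pkt.sequence = static_cast<std::uint16_t>((data[2] << 8) | data[3]);
    pkt.timestamp = (static_cast<std::uint32_t>(data[4]) << 24) |
                    (static_cast<std::uint32_t>(data[5]) << 16) |
                    (static_cast<std::uint32_t>(data[6]) << 8) |
                    static_cast<std::uint32_t>(data[7]);
    pkt.ssrc = (static_cast<std::uint32_t>(data[8]) << 24) |
               (static_cast<std::uint32_t>(data[9]) << 16) |
               (static_cast<std::uint32_t>(data[10]) << 8) |
               static_cast<std::uint32_t>(data[11]);
    pkt.payload = data + offset;
    pkt.payloadSize = end - offset;
    return pkt;
}
```

The sequence number lets you detect loss and reordering, while the timestamp tells you where the payload belongs in the play-out buffer.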

For each packet:

Decode the 8-bit values into 16-bit samples at the right rate, usually 8,000 samples per second for G.711;
Compute the play-out point from the RTP header: it is the index into the play-out buffer array. Take jitter and reordering into account, based on the RTP timestamp;
Write the samples to a .wav file or play them on an audio device.
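The first and last of these steps can be sketched as follows, assuming the payload is G.711 µ-law (RTP payload type 0) and the output is 16-bit mono PCM. The function names `decodeMuLaw` and `writeWav` are hypothetical; the decoding follows the standard G.711 µ-law expansion, and A-law (payload type 8) would need its own decoder, which is not shown here.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Expand one G.711 mu-law byte into a 16-bit linear PCM sample.
std::int16_t decodeMuLaw(std::uint8_t code) {
    code = static_cast<std::uint8_t>(~code);          // mu-law bytes are stored inverted
    int t = ((code & 0x0F) << 3) + 0x84;              // mantissa plus bias
    t <<= (code & 0x70) >> 4;                         // apply the segment (exponent)
    return static_cast<std::int16_t>((code & 0x80) ? (0x84 - t) : (t - 0x84));
}

// Write `bytes` low-order bytes of value in little-endian order, as WAV requires.
static void putLE(std::ofstream& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.put(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Write 16-bit mono PCM samples to a canonical 44-byte-header WAV file.
bool writeWav(const std::string& path,
              const std::vector<std::int16_t>& samples,
              std::uint32_t sampleRate = 8000) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;

    const std::uint32_t dataBytes = static_cast<std::uint32_t>(samples.size() * 2);
    out.write("RIFF", 4);
    putLE(out, 36 + dataBytes, 4);     // size of everything after this field
    out.write("WAVE", 4);
    out.write("fmt ", 4);
    putLE(out, 16, 4);                 // fmt chunk size for plain PCM
    putLE(out, 1, 2);                  // format tag 1 = linear PCM
    putLE(out, 1, 2);                  // one channel (mono)
    putLE(out, sampleRate, 4);
    putLE(out, sampleRate * 2, 4);     // byte rate = rate * channels * 2 bytes
    putLE(out, 2, 2);                  // block align = channels * 2 bytes
    putLE(out, 16, 2);                 // bits per sample
    out.write("data", 4);
    putLE(out, dataBytes, 4);
    for (std::int16_t s : samples) {
        putLE(out, static_cast<std::uint16_t>(s), 2);
    }
    return static_cast<bool>(out);
}
```

In use, you would run every payload byte through `decodeMuLaw`, place the resulting samples at their play-out index, and pass the assembled sample vector to `writeWav` once the call has ended. A two-party call has one RTP stream per direction, so you would either write two files or mix the two streams sample by sample.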

From a pragmatic point of view, you can do this in several ways:

Collect all the UDP/RTP packets in a capture file and let Wireshark do the hard work;
Use an existing tool, like playSIP, a command-line SIP session recorder;
Grab a library or write your own code for that purpose, but that is not an easy task: you also have to think about handling packet loss, for instance.
