Here are the steps to reproduce my issue:

In Twilio Video Rooms, I start a call between two participants (a host and a guest). I mute the guest's audio using track.disable(). I start recording the call using Twilio's recording API. After 40 seconds I unmute the guest's audio. After a further 20 seconds I stop recording.

Twilio generates two .mka audio files, one for each side of the call. Ideally I would like these to be the same length, with the guest recording starting with 40 seconds of silence. But the recordings are of different lengths: the host recording is about 40 seconds longer than the guest recording, which seems to start only when I enabled the guest's audio.

How do I find out the exact difference between the start times of the two recordings? Alternatively, can I get Twilio to start the audio recordings for the two participants simultaneously, even though one of them has its audio track disabled?

(Context: I want to do this because I want to use AWS Transcribe to generate transcripts for the two recordings, and then combine the two transcripts into a unified transcript. For the entries in the combined transcript to be in the correct order, I need to know the difference between the start times of the two recordings.)
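To make the goal concrete, here is a minimal sketch of the merge step I have in mind. It assumes each transcript has already been reduced to a list of segments with a start time in seconds relative to the start of its own recording. The `Segment` class, the `merge_transcripts` function, and the sample data are hypothetical illustrations, not the AWS Transcribe output format. The `guest_delay` value is exactly the number I don't know how to obtain.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str
    start: float  # seconds from the start of that speaker's own recording
    text: str


def merge_transcripts(host, guest, guest_delay):
    """Shift guest segments by guest_delay seconds, then order everything by time."""
    shifted = [Segment(s.speaker, s.start + guest_delay, s.text) for s in guest]
    return sorted(host + shifted, key=lambda s: s.start)


# Made-up sample data: the guest recording starts 40 seconds after the host's.
host = [
    Segment("host", 1.0, "Hello?"),
    Segment("host", 42.5, "I can hear you now."),
]
guest = [Segment("guest", 1.5, "Can you hear me?")]

merged = merge_transcripts(host, guest, guest_delay=40.0)
for segment in merged:
    print(f"{segment.start:7.1f}s  {segment.speaker}: {segment.text}")
```

If `guest_delay` is off by even a second or two, short replies can end up on the wrong side of the utterance they respond to, which is why I need the difference to better than one-second precision.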

According to https://www.twilio.com/docs/video/api/recordings-resource, a recording's offset property is: "The time in milliseconds elapsed between an arbitrary point in time, common to all group rooms, and the moment when the source room of this track started. This information provides a synchronization mechanism for recordings belonging to the same room." However, the offset property is the same for my two recordings - is this a Twilio bug?
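For reference, this is roughly how I expected to use `offset`, as a sketch only. It assumes each recording is represented as a plain dict with `sid` and `offset` fields (in milliseconds); the `relative_start_ms` helper, the SIDs, and the offset values are made up for illustration, and whether `offset` is actually meant to differ per track is the thing I am unsure about.

```python
def relative_start_ms(recordings):
    """Return each recording's start delay in ms relative to the earliest one."""
    earliest = min(r["offset"] for r in recordings)
    return {r["sid"]: r["offset"] - earliest for r in recordings}


# Hypothetical values showing what I hoped to see: a 40-second gap.
expected = [
    {"sid": "RT_host_example", "offset": 1_000_000},
    {"sid": "RT_guest_example", "offset": 1_040_000},
]
delays = relative_start_ms(expected)
print(delays)

# What I actually observe: identical offsets, so the computed delay is zero.
observed = [
    {"sid": "RT_host_example", "offset": 1_000_000},
    {"sid": "RT_guest_example", "offset": 1_000_000},
]
observed_delays = relative_start_ms(observed)
print(observed_delays)
```

With identical offsets the computed delay is zero for both recordings, which does not match the roughly 40-second difference in their lengths.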

There's also the date_created property of Twilio recordings, but it's given to the nearest second rather than the nearest millisecond, and I don't know how precisely it reflects the actual start time of the recording.
