Edit: Updated code based on suggestions, fixing the ASBD and making another attempt at getting PTS right. It still doesn't play any audio, but there are no errors anymore at least.
I'm working on an iOS project where I'm receiving packets of Opus audio data and attempting to play them using AVSampleBufferAudioRenderer. Right now I'm using Opus's own decoder, so ultimately I just need to get the decoded PCM packets to play. The whole process from top to bottom isn't suuuper well documented, but I think I'm getting close. Here's the code I'm working with so far (edited down, and with some hardcoded values for simplicity).
static AVSampleBufferAudioRenderer* audioRenderer;
static AVSampleBufferRenderSynchronizer* renderSynchronizer;
static OpusMSDecoder* opusDecoder;
static void* decodedPacketBuffer;
int samplesPerFrame = 240;
int channelCount = 2;
int sampleRate = 48000;
int streams = 1;
int coupledStreams = 1;
char mapping[8] = {'\0', '\x01', '\0', '\0', '\0', '\0', '\0', '\0'};
CMTime startPTS;
// called when the stream is about to start
void AudioInit()
{
    renderSynchronizer = [[AVSampleBufferRenderSynchronizer alloc] init];
    audioRenderer = [[AVSampleBufferAudioRenderer alloc] init];
    [renderSynchronizer addRenderer:audioRenderer];

    int decodedPacketSize = samplesPerFrame * sizeof(short) * channelCount; // 240 samples per frame * 2 channels
    decodedPacketBuffer = SDL_malloc(decodedPacketSize);

    int err;
    opusDecoder = opus_multistream_decoder_create(sampleRate,      // 48000
                                                  channelCount,    // 2
                                                  streams,         // 1
                                                  coupledStreams,  // 1
                                                  mapping,
                                                  &err);

    [renderSynchronizer setRate:1.0 time:kCMTimeZero atHostTime:CMClockGetTime(CMClockGetHostTimeClock())];
    startPTS = CMClockGetTime(CMClockGetHostTimeClock());
}
// called every X milliseconds with a new packet of audio data to play, IF there's audio. (while testing, X = 5)
void AudioDecodeAndPlaySample(char* sampleData, int sampleLength)
{
    // decode the packet from Opus to (I think??) Linear PCM
    int numSamples = opus_multistream_decode(opusDecoder,
                                             (unsigned char *)sampleData,
                                             sampleLength,
                                             (short*)decodedPacketBuffer,
                                             samplesPerFrame, // 240
                                             0);              // decode_fec = 0
    int bufferSize = sizeof(short) * numSamples * channelCount; // numSamples frames * 2 channels * 2 bytes each
    CMTime currentPTS = CMTimeSubtract(CMClockGetTime(CMClockGetHostTimeClock()), startPTS);
    // LPCM stream description
    AudioStreamBasicDescription asbd = {
        .mFormatID = kAudioFormatLinearPCM,
        .mFormatFlags = kLinearPCMFormatFlagIsSignedInteger,
        .mBytesPerPacket = sizeof(short) * channelCount,
        .mFramesPerPacket = 1,
        .mBytesPerFrame = sizeof(short) * channelCount,
        .mChannelsPerFrame = channelCount, // 2
        .mBitsPerChannel = 16,
        .mSampleRate = sampleRate,         // 48000
        .mReserved = 0
    };
    // audio format description wrapper around asbd
    CMAudioFormatDescriptionRef audioFormatDesc;
    OSStatus status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
                                                     &asbd,
                                                     0,     // layoutSize
                                                     NULL,  // layout
                                                     0,     // magicCookieSize
                                                     NULL,  // magicCookie
                                                     NULL,  // extensions
                                                     &audioFormatDesc);
    // data block to store decoded packet into
    CMBlockBufferRef blockBuffer;
    status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                decodedPacketBuffer,
                                                bufferSize,
                                                kCFAllocatorNull, // don't free the buffer, we own it
                                                NULL,             // customBlockSource
                                                0,                // offsetToData
                                                bufferSize,
                                                0,                // flags
                                                &blockBuffer);
    // data block converted into a sample buffer
    CMSampleBufferRef sampleBuffer;
    status = CMAudioSampleBufferCreateReadyWithPacketDescriptions(kCFAllocatorDefault,
                                                                  blockBuffer,
                                                                  audioFormatDesc,
                                                                  numSamples,
                                                                  currentPTS,
                                                                  NULL, // packetDescriptions (not needed for LPCM)
                                                                  &sampleBuffer);
    // queueing sample buffer onto audio renderer
    [audioRenderer enqueueSampleBuffer:sampleBuffer];
}
The AudioDecodeAndPlaySample function comes from the library I'm working with, and as the comment says, it's called with a packet of about 5 ms worth of samples at a time (240 frames at 48 kHz works out to 5 ms) and, importantly, does not get called at all if there's silence.
There are plenty of places here where I could be wrong. I think I'm right that the Opus decoder (docs here) outputs interleaved linear PCM, and I hope I'm building the AudioStreamBasicDescription correctly. What I'm least sure about is the PTS (presentation timestamp) passed to CMAudioSampleBufferCreateReadyWithPacketDescriptions: I'm computing it as the current host time minus the host time captured in AudioInit, but I have no idea whether that's a valid approach.
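In case it helps, here's a rough sketch of the alternative I've been wondering about: deriving the PTS from a running count of decoded frames instead of the host clock. The totalFramesEnqueued counter is illustrative only (it isn't in my real code), and this ignores the gaps where the function isn't called during silence:

// Hypothetical alternative inside AudioDecodeAndPlaySample: place each buffer
// immediately after the previous one on a 48 kHz timeline.
static int64_t totalFramesEnqueued = 0;
CMTime samplePTS = CMTimeMake(totalFramesEnqueued, sampleRate); // frame count over a 48000 timescale
totalFramesEnqueued += numSamples; // numSamples = frames per channel returned by opus_multistream_decode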
Most code examples I've seen of enqueueSampleBuffer have it wrapped in requestMediaDataWhenReady with a dispatch queue, which I have also tried to no avail. (I suspect it's more good practice than essential to functioning, so I'm just trying to get the simplest case working first; but if it is essential I can drop it back in.)
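For reference, here's roughly what that attempt looked like (simplified; the queue label and the DequeuePendingSampleBuffer helper are placeholders, not my actual code):

// Simplified sketch of the requestMediaDataWhenReady variant: decoded sample
// buffers get parked in a pending queue elsewhere and drained here whenever
// the renderer signals that it wants more data.
dispatch_queue_t enqueueQueue = dispatch_queue_create("audio.enqueue", DISPATCH_QUEUE_SERIAL);
[audioRenderer requestMediaDataWhenReadyOnQueue:enqueueQueue usingBlock:^{
    while (audioRenderer.isReadyForMoreMediaData) {
        CMSampleBufferRef next = DequeuePendingSampleBuffer(); // placeholder helper
        if (next == NULL) {
            break; // nothing pending; the block will run again later
        }
        [audioRenderer enqueueSampleBuffer:next];
        CFRelease(next);
    }
}];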
Feel free to respond in Swift if you're more comfortable with it; I can work with either. (I'm stuck with Objective-C here, like it or not.)
It sounds like you're on the right track. Decoding the Opus packets to interleaved PCM and feeding them to an AVSampleBufferAudioRenderer is a sound approach, but there are a few potential issues in your code worth walking through.
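Before getting into specifics, it's worth ruling out silent failures. You're already capturing the OSStatus results, and the renderer exposes status and error properties, so a couple of checks at the end of AudioDecodeAndPlaySample (illustrative only, not required for playback) will tell you quickly whether something is being rejected:

// Illustrative diagnostics for the end of AudioDecodeAndPlaySample.
if (status != noErr) {
    NSLog(@"Sample buffer creation failed with OSStatus %d", (int)status);
}
if (audioRenderer.status == AVQueuedSampleBufferRenderingStatusFailed) {
    NSLog(@"Renderer entered failed state: %@", audioRenderer.error);
}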