I'm trying to integrate the Oboe library into my application so I can do low-latency audio playback. I can already perform panning, playback manipulation, sound scaling, etc. I've asked a few questions about this topic because I'm completely new to the audio world.
Now I can do the basic things that built-in Android audio classes such as SoundPool provide: I can play multiple sounds simultaneously without noticeable delays.
But now I've run into another problem. I made a very simple application as an example: there is a button on the screen, and when the user taps it, it plays a simple piano sound. However fast the user taps this button, the app must be able to mix those same piano sounds, just like SoundPool does.
My code does this very well, until I tap the button too many times and there are many audio queues to be mixed.
class OggPlayer;

// One PlayerQueue represents a single playing instance of a sound.
class PlayerQueue {
private:
    OggPlayer* player;
    void renderStereo(float* audioData, int32_t numFrames);
    void renderMono(float* audioData, int32_t numFrames);
public:
    int offset = 0;
    float pan;
    float pitch;
    int playScale;
    bool queueEnded = false;

    PlayerQueue(float pan, float pitch, int playScale, OggPlayer* player) {
        this->pan = pan;
        this->playScale = playScale;
        this->player = player;
        this->pitch = pitch;
        // clamp pan into [-1.0, 1.0]
        if(this->pan < -1.0)
            this->pan = -1.0;
        else if(this->pan > 1.0)
            this->pan = 1.0;
    }

    void renderAudio(float* audioData, int32_t numFrames, bool isStreamStereo);
};
// One OggPlayer per decoded audio file; it owns the PCM data and the queues.
class OggPlayer {
private:
    std::vector<PlayerQueue> queues;
public:
    int offset = 0;
    bool isStereo;
    float defaultPitch = 1.0;
    std::vector<float> data; // decoded PCM samples, interleaved if stereo

    OggPlayer(std::vector<float> data, bool isStereo, int fileSampleRate, int deviceSampleRate) {
        this->data = data;
        this->isStereo = isStereo;
        // resampling ratio between the file's rate and the device's rate
        defaultPitch = (float) (fileSampleRate) / (float) (deviceSampleRate);
    }

    void renderAudio(float* audioData, int32_t numFrames, bool reset, bool isStreamStereo);
    static void smoothAudio(float* audioData, int32_t numFrames, bool isStreamStereo);
    static void resetAudioData(float* audioData, int32_t numFrames, bool isStreamStereo);

    void addQueue(float pan, float pitch, int playerScale) {
        queues.push_back(PlayerQueue(pan, defaultPitch * pitch, playerScale, this));
    }
};
OggPlayer holds decoded PCM data, with a defaultPitch value to sync the speaker's sample rate with the audio file's sample rate. Each OggPlayer holds its own PCM data (that is, one audio file's data) and its own vector of PlayerQueue. PlayerQueue is the class that actually renders audio data; OggPlayer is the PCM data provider for PlayerQueue instances. Each PlayerQueue has its own pitch, pan, and audio scale values. Since an AudioStream provides only a limited-size array in each callback, I added offset so a PlayerQueue can continue rendering in the next callback without losing its state.
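For illustration, loading a sound and triggering one playing instance with these classes looks roughly like this (decodeOggFile stands in for the actual decoding step, and the parameter values are arbitrary):

// one OggPlayer per decoded file, one PlayerQueue per playing instance
std::vector<float> decoded = decodeOggFile("piano.ogg"); // placeholder decoder
OggPlayer piano(decoded, /*isStereo=*/true, /*fileSampleRate=*/44100, /*deviceSampleRate=*/48000);

// each tap adds one queue: centered pan, original pitch, unity scale
piano.addQueue(/*pan=*/0.0f, /*pitch=*/1.0f, /*playerScale=*/1);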
void OggPlayer::renderAudio(float *audioData, int32_t numFrames, bool reset, bool isStreamStereo) {
    if(reset) {
        // zero the buffer once per callback, before the first player mixes into it
        resetAudioData(audioData, numFrames, isStreamStereo);
    }
    for(auto & queue : queues) {
        if(!queue.queueEnded) {
            queue.renderAudio(audioData, numFrames, isStreamStereo);
        }
    }
    smoothAudio(audioData, numFrames, isStreamStereo);
    // drop the queues that finished playing (needs <algorithm>)
    queues.erase(std::remove_if(queues.begin(), queues.end(),
                                [](const PlayerQueue& p) { return p.queueEnded; }),
                 queues.end());
}
This is how I render audio data currently: I go through each OggPlayer's PlayerQueue vector and have every queue that hasn't yet reached the end of its PCM data render into the array passed by pointer. After that I smooth the audio data to prevent clipping and other artifacts, and finally I remove the queues that have completely finished rendering from the vector.
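smoothAudio is omitted here for brevity; a minimal version of it would simply hard-clip the mixed samples into [-1.0, 1.0], something like this:

void OggPlayer::smoothAudio(float *audioData, int32_t numFrames, bool isStreamStereo) {
    // a stereo stream carries two interleaved samples per frame
    int32_t numSamples = isStreamStereo ? numFrames * 2 : numFrames;
    for(int32_t i = 0; i < numSamples; i++) {
        if(audioData[i] > 1.0f)
            audioData[i] = 1.0f;
        else if(audioData[i] < -1.0f)
            audioData[i] = -1.0f;
    }
}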
void PlayerQueue::renderAudio(float * audioData, int32_t numFrames, bool isStreamStereo) {
    if(isStreamStereo) {
        renderStereo(audioData, numFrames);
    } else {
        renderMono(audioData, numFrames);
    }
}
void PlayerQueue::renderStereo(float *audioData, int32_t numFrames) {
    for(int i = 0; i < numFrames; i++) {
        // map each output frame to a source frame, applying the pitch ratio
        if(player->isStereo) {
            if((int) ((float) (offset + i) * pitch) * 2 + 1 < player->data.size()) {
                float left = player->data.at((int)((float) (offset + i) * pitch) * 2);
                float right = player->data.at((int)((float) (offset + i) * pitch) * 2 + 1);
                if(pan < 0) {
                    // pan left: feed part of the right channel into the left
                    audioData[i * 2] += (left + right * (float) sin(abs(pan) * M_PI / 2.0)) * (float) playScale;
                    audioData[i * 2 + 1] += right * (float) cos(abs(pan) * M_PI / 2.0) * (float) playScale;
                } else {
                    // pan right: feed part of the left channel into the right
                    audioData[i * 2] += left * (float) cos(pan * M_PI / 2.0) * (float) playScale;
                    audioData[i * 2 + 1] += (right + left * (float) sin(pan * M_PI / 2.0)) * (float) playScale;
                }
            } else {
                break; // reached the end of the PCM data
            }
        } else {
            // mono source into a stereo stream: duplicate the sample into both channels
            if((int) ((float) (offset + i) * pitch) < player->data.size()) {
                float sample = player->data.at((int) ((float) (offset + i) * pitch));
                if(pan < 0) {
                    audioData[i * 2] += sample * (1 + (float) sin(abs(pan) * M_PI / 2.0)) * (float) playScale;
                    audioData[i * 2 + 1] += sample * (float) cos(abs(pan) * M_PI / 2.0) * (float) playScale;
                } else {
                    audioData[i * 2] += sample * (float) cos(pan * M_PI / 2.0) * (float) playScale;
                    audioData[i * 2 + 1] += sample * (1 + (float) sin(pan * M_PI / 2.0)) * (float) playScale;
                }
            } else {
                break;
            }
        }
    }
    offset += numFrames;
    // mark the queue as finished once the read position passes the end of the data
    if((float) offset * pitch >= player->data.size()) {
        offset = 0;
        queueEnded = true;
    }
}
void PlayerQueue::renderMono(float *audioData, int32_t numFrames) {
    for(int i = 0; i < numFrames; i++) {
        if(player->isStereo) {
            // stereo source into a mono stream: average the two channels
            if((int) ((float) (offset + i) * pitch) * 2 + 1 < player->data.size()) {
                audioData[i] += (player->data.at((int) ((float) (offset + i) * pitch) * 2)
                               + player->data.at((int) ((float) (offset + i) * pitch) * 2 + 1)) / 2 * (float) playScale;
            } else {
                break;
            }
        } else {
            if((int) ((float) (offset + i) * pitch) < player->data.size()) {
                audioData[i] += player->data.at((int) ((float) (offset + i) * pitch)) * (float) playScale;
            } else {
                break;
            }
        }
        // hard-clip the mono mix
        if(audioData[i] > 1.0)
            audioData[i] = 1.0;
        else if(audioData[i] < -1.0)
            audioData[i] = -1.0;
    }
    offset += numFrames;
    if((float) offset * pitch >= player->data.size()) {
        queueEnded = true;
        offset = 0;
    }
}
I render everything the queue has (panning, playback, scaling) in one pass, taking into account whether the stream and the audio file are each mono or stereo.
using namespace oboe;

class OggPianoEngine : public AudioStreamCallback {
public:
    void initialize();
    void start(bool isStereo);
    void closeStream();
    void reopenStream();
    void release();
    bool isStreamOpened = false;
    bool isStreamStereo;
    int deviceSampleRate = 0;
    DataCallbackResult onAudioReady(AudioStream *audioStream, void *audioData, int32_t numFrames) override;
    void onErrorAfterClose(AudioStream *audioStream, Result result) override;
    AudioStream* stream;
    std::vector<OggPlayer>* players;
    int addPlayer(std::vector<float> data, bool isStereo, int sampleRate) const;
    void addQueue(int id, float pan, float pitch, int playerScale) const;
};
Finally, in OggPianoEngine, I keep a vector of OggPlayer so my app can hold multiple sounds in memory, letting users add sounds and then play them anywhere, anytime.
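addPlayer and addQueue are not shown above; they essentially just forward to the vector and the chosen player, roughly like this (they can stay const because players is a pointer member):

int OggPianoEngine::addPlayer(std::vector<float> data, bool isStereo, int sampleRate) const {
    players->push_back(OggPlayer(data, isStereo, sampleRate, deviceSampleRate));
    return (int) players->size() - 1; // the index is used as the sound's id
}

void OggPianoEngine::addQueue(int id, float pan, float pitch, int playerScale) const {
    players->at(id).addQueue(pan, pitch, playerScale);
}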
DataCallbackResult OggPianoEngine::onAudioReady(AudioStream *audioStream, void *audioData, int32_t numFrames) {
    for(int i = 0; i < players->size(); i++) {
        // the first player resets (zeroes) the buffer; the rest mix on top of it
        players->at(i).renderAudio(static_cast<float*>(audioData), numFrames, i == 0,
                                   audioStream->getChannelCount() != 1);
    }
    return DataCallbackResult::Continue;
}
Rendering audio in the engine is quite simple: as you may expect, I just go through the vector of OggPlayer and call each renderAudio method. The code below shows how I initialize the AudioStream.
void OggPianoEngine::start(bool isStereo) {
    AudioStreamBuilder builder;
    builder.setFormat(AudioFormat::Float);
    builder.setDirection(Direction::Output);
    builder.setChannelCount(isStereo ? ChannelCount::Stereo : ChannelCount::Mono);
    builder.setPerformanceMode(PerformanceMode::LowLatency);
    builder.setSharingMode(SharingMode::Exclusive);
    builder.setCallback(this);
    builder.openStream(&stream);
    // double buffering: buffer size = 2 bursts, as recommended for low latency
    stream->setBufferSizeInFrames(stream->getFramesPerBurst() * 2);
    stream->requestStart();
    deviceSampleRate = stream->getSampleRate();
    isStreamOpened = true;
    isStreamStereo = isStereo;
}
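(Note: I currently ignore the Result values returned by openStream and requestStart. Checking at least the first one would look like this, using convertToText from the Oboe API and the NDK logging header:)

// requires #include <android/log.h>
Result result = builder.openStream(&stream);
if(result != Result::OK) {
    __android_log_print(ANDROID_LOG_ERROR, "OggPianoEngine",
                        "Failed to open stream: %s", convertToText(result));
    return;
}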
I watched basic Oboe video guides like this or this, so I tried to configure the basic settings for LowLatency mode (for example, setting the buffer size to twice the burst size). But audio stops rendering when there are too many queues. At first the sound starts to stutter, as if some rendering sessions were being skipped, and then it stops rendering completely if I keep tapping the play button. It starts rendering again after I wait a while (5~10 seconds, enough time for the queues to be emptied). So I have several questions:
- Does Oboe stop rendering audio if it takes too much time to render, as in the situation above?
- Have I reached the limit of audio rendering, meaning that limiting the number of queues is the only solution? Or are there ways to get better performance?
This code is part of my Flutter plugin, so you can get the full source from this GitHub link.
Yes. If you block onAudioReady for longer than the time represented by numFrames, you will get an audio glitch. I bet if you ran

systrace.py --time=5 -o trace.html -a your.app.packagename audio sched freq

you'd see that you're spending too much time inside that method.

Looks like it. The problem is you're trying to do too much work inside the audio callback. Things I'd try immediately:
- Different compiler optimisation levels: -O2, -O3 and -Ofast.
- The per-frame calls to sin and cos. There may be faster versions of these functions, and since pan never changes after a PlayerQueue is constructed, they only need to be evaluated once per queue (see the sketch below).

I spoke about some of these debugging/optimisation techniques in this talk.
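To expand on the sin/cos point, here's a sketch of hoisting those calls out of the render loop by precomputing the two pan factors in the constructor. It reuses the field names from the question, but it's my adaptation rather than tested code:

// Inside PlayerQueue: pan is clamped once in the constructor and never
// changes afterwards, so the trig functions can be evaluated there too.
float sinPan;
float cosPan;

PlayerQueue(float pan, float pitch, int playScale, OggPlayer* player) {
    // ... same member assignments and pan clamping as before ...
    sinPan = (float) sin(abs(this->pan) * M_PI / 2.0);
    cosPan = (float) cos(abs(this->pan) * M_PI / 2.0);
}

Then the per-frame work reduces to multiplications, for example in the mono-source branch of renderStereo:

if(pan < 0) {
    audioData[i * 2]     += sample * (1 + sinPan) * (float) playScale;
    audioData[i * 2 + 1] += sample * cosPan * (float) playScale;
} else {
    audioData[i * 2]     += sample * cosPan * (float) playScale;
    audioData[i * 2 + 1] += sample * (1 + sinPan) * (float) playScale;
}

The same idea applies to the index arithmetic: (int) ((float) (offset + i) * pitch) is computed several times per frame and can be stored in a local variable once.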
One other quick tip: try to avoid raw pointers unless you really have no choice. For example, AudioStream* stream; would be better as std::shared_ptr<AudioStream>, and std::vector<OggPlayer>* players; can be refactored to std::vector<OggPlayer> players;
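For example, with a recent Oboe version the stream can be opened straight into a std::shared_ptr (this sketch abbreviates the parts of the engine that stay the same):

#include <memory>

class OggPianoEngine : public AudioStreamCallback {
public:
    // ... same methods as before ...
    std::shared_ptr<AudioStream> stream;  // shared ownership instead of a raw pointer
    std::vector<OggPlayer> players;       // owned by value, no manual delete needed
};

void OggPianoEngine::start(bool isStereo) {
    AudioStreamBuilder builder;
    // ... same builder configuration as before ...
    builder.openStream(stream);  // overload that fills a std::shared_ptr
    stream->setBufferSizeInFrames(stream->getFramesPerBurst() * 2);
    stream->requestStart();
}

With players held by value, addPlayer and addQueue can no longer be const, since they now mutate the member directly.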