Splitting an AAC stream, priming / padding samples problems (gapless playback)

1.6k views Asked by At

I am encoding raw audio to AAC with the MediaCodec API of Android. The problem: I need to send to a server the AAC stream in chunks of one second. So I need to split the stream. Right now, since an AAC frame is 1024 samples, I take round(SAMPLE_RATE/1024) AAC frames for each chunk. However, because of "priming samples" this simple cutting of the AAC stream does not work. More details follow. After sending a chunk to the server, a client receives it in the web browser Chrome and using Web Audio API plays all received chunks. The playback is done in such a way to be gapless: a large audiobuffer is initially allocated, the received chunks are decoded and copied in the audiobuffer, the audiobuffer is played. Now, this does not work with AAC (it works with Ogg/Vorbis though). With AAC I have artifacts in the generated sound. At end of each second the start of the next second is zero, then, gradually, the waveform grows until it has a normal size. This lasts for 10, 20 milliseconds. I believe the problem is caused by missing "priming samples". Maybe the Web Audio API is expecting "priming samples" at the start of each AAC chunk, it does not find them and thus modifies the actual audio.

The question is: how can I split the original AAC stream and send "good" AAC chunks of one second? From what I have understood, I should include at the start of each chunk the previous two frames (last two frames of the previous chunk). However, this number should vary and there is not much documentation. Some expert advice is appreciated.

1

There are 1 answers

2
Gibezynu Nu On

I am using the following method. I am not an expert of AAC so I may be missing something, but experimentally it is working. Assuming that the Chrome decoder is expecting priming samples at the start of each chunk I do the following: before sending a chunk to the server, I add at its beginning the last 4 AAC frames of the previous chunk (if it is the first chunk I do not do this). Client-side, I retrieve a chunk, I decode it and the remove the first 4*1024 samples (1024 = samples in one AAC frame). This is working.