Audio File Conversion by Encoding and Decoding using MediaCodec in Android


I am trying to convert an arbitrary audio file to AAC with a 128 kbps bit rate and a 32000 Hz sample rate.

I expected a converted audio file with the specifications mentioned above.

I got an output file, but it contains garbage audio. Playing it produces many "Audio sink error" errors in ExoPlayer and many "query -- param skipped" warnings from Codec2Client.

This is my code:

Note: outputFile and sourceUri are of type AtomicReference<File> and AtomicReference<Uri>, respectively.

Note: I am assuming that sourceUri is the URI of a single audio file.

I tried only encoding first:

startEncoding.setOnClickListener(v ->
        {
            try {
                outputFile.set(File.createTempFile("temp", null));

                MediaExtractor extractor = new MediaExtractor();
                extractor.setDataSource(this, sourceUri.get(), null);
                extractor.selectTrack(0);

                MediaFormat outputFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, 32000, 2);
                outputFormat.setInteger(MediaFormat.KEY_BIT_RATE, 128000);


                MediaMuxer muxer = new MediaMuxer(outputFile.get().getPath(), MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
                int trackIndex = muxer.addTrack(outputFormat);

                MediaCodec.Callback encoderCallbacks = new MediaCodec.Callback() {
                    int sampleSize;
                    long sampleTime;
                    boolean EOSFlag = false;
                    boolean muxerStarted = false;
                    @Override
                    public void onInputBufferAvailable(@NonNull MediaCodec encoder, int index) {
                        Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: encoding...");
                        if(EOSFlag){
                            encoder.queueInputBuffer(index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                            return;
                        }
                        ByteBuffer inputBuffer = encoder.getInputBuffer(index);
                        sampleSize = extractor.readSampleData(inputBuffer, 0);
                        if(sampleSize >= 0){
                            sampleTime = extractor.getSampleTime();
                            encoder.queueInputBuffer(index, 0, sampleSize, sampleTime, 0);
                            Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: decoding...");
                        } else {
                            EOSFlag = true;
                            encoder.queueInputBuffer(index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                            extractor.release();
                            Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: decoding completed by extractor.getSampleSize() < 0");
                            return;
                        }
                        if(!extractor.advance()){
                            EOSFlag = true;
                            Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: decoding completed by extractor.advance() == false");
                            extractor.release();
                        }
                    }

                    @Override
                    public void onOutputBufferAvailable(@NonNull MediaCodec encoder, int index, @NonNull MediaCodec.BufferInfo info) {
                        if(!muxerStarted){
                            muxer.start();
                            muxerStarted = true;
                        }
                        if (info.flags != MediaCodec.BUFFER_FLAG_END_OF_STREAM) {
                            muxer.writeSampleData(trackIndex, encoder.getOutputBuffer(index), info);
                            Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: writing to file...");
                            encoder.releaseOutputBuffer(index, false);
                        } else {
                            Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: Writing completed.");
                            encoder.stop();
                            encoder.release();
                            muxer.stop();
                            muxer.release();
                            muxerStarted = false;
                            Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: Resources released");
                            Log.d(Tag.DEBUG.toString(),
                                    "Source file size: " + new File(sourceUri.get().getPath()).length() +
                                            "\nTarget file size: " + outputFile.get().length());
                        }
                    }

                    @Override
                    public void onError(@NonNull MediaCodec encoder, @NonNull MediaCodec.CodecException e) {
                        encoder.stop();
                        encoder.release();
                        muxer.stop();
                        muxer.release();
                        muxerStarted = false;
                        extractor.release();
                        Log.e(Tag.DEBUG.toString(), "onError: ", e);
                    }

                    @Override
                    public void onOutputFormatChanged(@NonNull MediaCodec encoder, @NonNull MediaFormat format) {

                    }
                };
                MediaCodec encoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC);
                encoder.configure(outputFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
                encoder.setCallback(encoderCallbacks);
                encoder.start();
            } catch (Exception e) {
                Log.e(Tag.DEBUG.toString(), "onCreate: ", e);
            }
        });

Then I tried decoding before encoding:

(I am aware that this code cannot handle large files, because byteBuffersWithInfo would grow without bound, so I am testing it with small audio files only.)

startDecodingAndEncoding.setOnClickListener(
                v -> {
                    try {
                        outputFile.set(File.createTempFile("temp", null));
                        LinkedList<HashMap<String, Object>> byteBuffersWithInfo = new LinkedList<>();

                        MediaExtractor extractor = new MediaExtractor();
                        extractor.setDataSource(this, sourceUri.get(), null);
                        extractor.selectTrack(0);

                        MediaFormat outputFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, 32000, 2);
                        outputFormat.setInteger(MediaFormat.KEY_BIT_RATE, 128000);

                        MediaMuxer muxer = new MediaMuxer(outputFile.get().getPath(), MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
                        int trackIndex = muxer.addTrack(outputFormat);

                        MediaCodec.Callback encoderCallbacks = new MediaCodec.Callback() {
                            boolean muxerStarted = false;
                            @Override
                            public void onInputBufferAvailable(@NonNull MediaCodec encoder, int index) {
                                Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: encoding...");
                                HashMap<String, Object> byteBufferWithInfo = byteBuffersWithInfo.poll();
                                if(byteBufferWithInfo != null) {
                                    ByteBuffer buffer = (ByteBuffer) byteBufferWithInfo.get("buffer");
                                    encoder.getInputBuffer(index).put(buffer);
                                    MediaCodec.BufferInfo info = (MediaCodec.BufferInfo) byteBufferWithInfo.get("info");
                                    encoder.queueInputBuffer(index, info.offset, info.size, info.presentationTimeUs, info.flags);
                                } else {
                                    Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: encoded.");
                                }
                            }

                            @Override
                            public void onOutputBufferAvailable(@NonNull MediaCodec encoder, int index, @NonNull MediaCodec.BufferInfo info) {
                                if(!muxerStarted){
                                    muxer.start();
                                    muxerStarted = true;
                                }
                                if (info.flags != MediaCodec.BUFFER_FLAG_END_OF_STREAM) {
                                    muxer.writeSampleData(trackIndex, encoder.getOutputBuffer(index), info);
                                    Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: writing to file...");
                                    encoder.releaseOutputBuffer(index, false);
                                } else {
                                    Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: Writing completed.");
                                    encoder.stop();
                                    encoder.release();
                                    muxer.stop();
                                    muxer.release();
                                    muxerStarted = false;
                                    Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: Resources released");
                                    Log.d(Tag.DEBUG.toString(),
                                            "Source file size: " + new File(sourceUri.get().getPath()).length() +
                                            "\nTarget file size: " + outputFile.get().length());
                                }
                            }

                            @Override
                            public void onError(@NonNull MediaCodec encoder, @NonNull MediaCodec.CodecException e) {
                                encoder.stop();
                                encoder.release();
                                muxer.stop();
                                muxer.release();
                                muxerStarted = false;
                                Log.e(Tag.DEBUG.toString(), "onError: ", e);
                            }

                            @Override
                            public void onOutputFormatChanged(@NonNull MediaCodec encoder, @NonNull MediaFormat format) {

                            }
                        };
                        MediaCodec encoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC);
                        encoder.configure(outputFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
                        encoder.setCallback(encoderCallbacks);

                        MediaCodec.Callback decoderCallbacks = new MediaCodec.Callback() {
                            int sampleSize;
                            long sampleTime;
                            boolean EOSFlag = false;
                            boolean encoderStarted = false;
                            @Override
                            public void onInputBufferAvailable(@NonNull MediaCodec decoder, int index) {
                                if(EOSFlag){
                                    decoder.queueInputBuffer(index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                                    return;
                                }
                                ByteBuffer inputBuffer = decoder.getInputBuffer(index);
                                sampleSize = extractor.readSampleData(inputBuffer, 0);
                                if(sampleSize >= 0){
                                    sampleTime = extractor.getSampleTime();
                                    decoder.queueInputBuffer(index, 0, sampleSize, sampleTime, 0);
                                    Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: decoding...");
                                } else {
                                    EOSFlag = true;
                                    decoder.queueInputBuffer(index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                                    extractor.release();
                                    Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: decoding completed by extractor.getSampleSize() < 0");
                                    return;
                                }
                                if(!extractor.advance()){
                                    EOSFlag = true;
                                    Log.d(Tag.DEBUG.toString(), "onInputBufferAvailable: decoding completed by extractor.advance() == false");
                                    extractor.release();
                                }
                            }

                            @Override
                            public void onOutputBufferAvailable(@NonNull MediaCodec decoder, int index, @NonNull MediaCodec.BufferInfo info) {
                                Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: filling middle buffer...");
                                HashMap<String, Object> byteBufferWithInfo = new HashMap<>();
                                byteBufferWithInfo.put("buffer", decoder.getOutputBuffer(index));
                                byteBufferWithInfo.put("info", info);
                                byteBuffersWithInfo.offer(byteBufferWithInfo);
                                decoder.releaseOutputBuffer(index, false);
                                if(info.flags == MediaCodec.BUFFER_FLAG_END_OF_STREAM){
                                    Log.d(Tag.DEBUG.toString(), "onOutputBufferAvailable: middle buffer filled.");
                                    if(!encoderStarted){
                                        encoder.start();
                                        encoderStarted = true;
                                        decoder.stop();
                                    }
                                }
                            }

                            @Override
                            public void onError(@NonNull MediaCodec decoder, @NonNull MediaCodec.CodecException e) {
                                decoder.stop();
                                decoder.release();
                                extractor.release();
                                encoderStarted = false;
                                Log.e(Tag.DEBUG.toString(), "onError: ", e);
                            }

                            @Override
                            public void onOutputFormatChanged(@NonNull MediaCodec decoder, @NonNull MediaFormat format) {

                            }
                        };

                        MediaCodec decoder = MediaCodec.createDecoderByType(extractor.getTrackFormat(0).getString(MediaFormat.KEY_MIME));
                        decoder.configure(extractor.getTrackFormat(0), null, null, 0);
                        decoder.setCallback(decoderCallbacks);
                        decoder.start();
                    } catch (Exception e) {
                        Log.e(Tag.DEBUG.toString(), "onCreate: ", e);
                    }
                }
        );

But when I try to play the output of either version with Media3 ExoPlayer, I get:

  1. Garbage audio

  2. This error in ExoPlayer

    Audio sink error
    androidx.media3.exoplayer.audio.AudioSink$UnexpectedDiscontinuityException: Unexpected audio track timestamp discontinuity: expected 1000000480000, got 1000000689343
        at androidx.media3.exoplayer.audio.DefaultAudioSink.handleBuffer(DefaultAudioSink.java:956)
        at androidx.media3.exoplayer.audio.MediaCodecAudioRenderer.processOutputBuffer(MediaCodecAudioRenderer.java:739)
        at androidx.media3.exoplayer.mediacodec.MediaCodecRenderer.drainOutputBuffer(MediaCodecRenderer.java:1998)
        at androidx.media3.exoplayer.mediacodec.MediaCodecRenderer.render(MediaCodecRenderer.java:827)
        at androidx.media3.exoplayer.ExoPlayerImplInternal.doSomeWork(ExoPlayerImplInternal.java:1079)
        at androidx.media3.exoplayer.ExoPlayerImplInternal.handleMessage(ExoPlayerImplInternal.java:529)
        at android.os.Handler.dispatchMessage(Handler.java:102)
        at android.os.Looper.loopOnce(Looper.java:230)
        at android.os.Looper.loop(Looper.java:319)
        at android.os.HandlerThread.run(HandlerThread.java:67)
    
  3. Lots of this warning in logs:

    Codec2Client: query -- param skipped: index = 1342179345.
    

Please tell me what I am missing.

This is my first post to the community, so please pardon any misconduct.

1 Answer

Answered by dev.bmax

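First of all, yes, you must decode the file before feeding it into the encoder. The encoder expects raw PCM audio as input, not the compressed samples that MediaExtractor returns.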

Second, the MediaCodec API doesn't handle sample-rate conversion (e.g. from 48,000 Hz to 32,000 Hz). You need to use a third-party library for this or write your own code.
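
If you do write your own code, a minimal sketch of sample-rate conversion could look like the following. It assumes the decoder delivers interleaved 16-bit PCM (check the decoder's output format before relying on this), and it uses plain linear interpolation with no low-pass filter, so it can introduce aliasing when downsampling. The class name LinearResampler and its method are hypothetical and only for illustration; a dedicated resampling library will generally sound better.

```java
public final class LinearResampler {

    private LinearResampler() {}

    /**
     * Resamples interleaved 16-bit PCM from srcRate to dstRate using
     * linear interpolation between neighbouring frames.
     *
     * @param input    interleaved samples (frame 0 ch 0, frame 0 ch 1, ...)
     * @param channels number of channels per frame
     * @param srcRate  source sample rate in Hz, e.g. 48000
     * @param dstRate  target sample rate in Hz, e.g. 32000
     */
    public static short[] resample(short[] input, int channels, int srcRate, int dstRate) {
        int inFrames = input.length / channels;
        int outFrames = (int) ((long) inFrames * dstRate / srcRate);
        short[] output = new short[outFrames * channels];

        for (int i = 0; i < outFrames; i++) {
            // Position of output frame i on the source time axis.
            double srcPos = (double) i * srcRate / dstRate;
            int i0 = (int) srcPos;
            int i1 = Math.min(i0 + 1, inFrames - 1); // clamp at the last frame
            double frac = srcPos - i0;

            for (int c = 0; c < channels; c++) {
                double a = input[i0 * channels + c];
                double b = input[i1 * channels + c];
                output[i * channels + c] = (short) Math.round(a + (b - a) * frac);
            }
        }
        return output;
    }
}
```

Because the number of frames changes, the presentation timestamps passed to the encoder would also need to be recomputed from the output frame count and the new sample rate, rather than copied from the decoder.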

Regarding your code: when you get an output buffer from the decoder, you save a reference to it in a linked list, and a few lines later you release it back to the decoder. This is a bug.

The decoder manages a queue of reusable buffers. Once an output buffer is released to the codec, it MUST NOT be used. By the time you take the buffer from the list, it has probably been overwritten with other data. One way to solve this issue is to make a copy of the buffer before releasing it.
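
Copying the buffer can be sketched roughly as below. The helper uses only java.nio, so it can be tried outside Android. The class name BufferCopy and the method copyOf are hypothetical; in your decoder callback you would call it with decoder.getOutputBuffer(index), info.offset and info.size before calling releaseOutputBuffer, and store the copy in the list instead of the codec's buffer.

```java
import java.nio.ByteBuffer;

public final class BufferCopy {

    private BufferCopy() {}

    /**
     * Returns an independent copy of the `size` bytes that start at `offset`
     * in `source`. The position and limit of `source` are left untouched.
     */
    public static ByteBuffer copyOf(ByteBuffer source, int offset, int size) {
        // duplicate() shares the bytes but has its own position and limit.
        ByteBuffer view = source.duplicate();
        view.limit(offset + size);
        view.position(offset);

        ByteBuffer copy = ByteBuffer.allocate(size);
        copy.put(view);
        copy.flip(); // make the copy readable from its first byte
        return copy;
    }

    public static void main(String[] args) {
        // Stand-in for a codec-owned output buffer.
        ByteBuffer codecBuffer = ByteBuffer.allocate(4);
        codecBuffer.put(new byte[] {1, 2, 3, 4});

        ByteBuffer saved = copyOf(codecBuffer, 0, 4);

        // Simulate the codec reusing its buffer after release.
        codecBuffer.clear();
        codecBuffer.put(new byte[] {9, 9, 9, 9});

        System.out.println(saved.get(0)); // prints 1: the copy is unaffected
    }
}
```

Since the copy starts at position 0, the offset queued to the encoder for it would be 0 rather than the decoder's info.offset. Storing the size, timestamp and flags as plain values next to the copy, instead of holding on to the BufferInfo object, is a similarly cautious choice.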