FFMPEG - AVFrame to per channel array conversion


I am looking to copy an AVFrame into an array where pixels are stored one channel at a time in a row-major order.

Details:

I am using FFMPEG's api to read frames from a video. I have used avcodec_decode_video2 to fetch each frame as an AVFrame as follows:

AVFormatContext* fmt_ctx = NULL;
avformat_open_input(&fmt_ctx, filepath, NULL, NULL);
...
int video_stream_idx;  // stores the stream index for the video
...
AVFrame* vid_frame = NULL;
vid_frame = av_frame_alloc();
AVPacket vid_pckt;
int frame_finish;
...
while (av_read_frame(fmt_ctx, &vid_pckt) >= 0) {
    if (vid_pckt.stream_index == video_stream_idx) {
        avcodec_decode_video2(cdc_ctx, vid_frame, &frame_finish, &vid_pckt);
        if (frame_finish) {
            /* perform conversion */
        }
    }
}

The destination array looks like this:

unsigned char* frame_arr = new unsigned char [cdc_ctx->width * cdc_ctx->height * 3];

I need to copy all of vid_frame into frame_arr, where the range of pixel values should be [0, 255]. The problem is that the array needs to store the frame in row major order, one channel at a time, i.e. R11, R12, ... R21, R22, ... G11, G12, ... G21, G22, ... B11, B12, ... B21, B22, ... (I have used the notation [color channel][row index][column index], i.e. G21 is the green channel value of pixel at row 2, column 1). I have had a look at sws_scale, but I don't understand it enough to figure out whether that function is capable of doing such a conversion. Can somebody help!! :)

Answer by halfelf (best answer):

The layout you call "one channel at a time" is known as planar. The opposite layout, where the channels of each pixel are stored next to each other, is called packed. Almost every pixel format stores its rows in row-major order.
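To make the two layouts concrete, here is a minimal sketch of how a byte offset is computed in each. The helper names packed_index and planar_index are made up for illustration and are not part of FFmpeg. Both assume one byte per channel and a tightly packed buffer with no row padding.

```cpp
#include <cstddef>

// Packed layout: the channels of one pixel sit next to each other,
// e.g. R11 G11 B11 R12 G12 B12 ...
inline std::size_t packed_index(int row, int col, int c, int width, int channels) {
    return (static_cast<std::size_t>(row) * width + col) * channels + c;
}

// Planar layout: each channel is stored as a full width*height image,
// and the planes follow one another, e.g. R11 R12 ... G11 G12 ... B11 B12 ...
inline std::size_t planar_index(int row, int col, int c, int width, int height) {
    return static_cast<std::size_t>(c) * width * height
         + static_cast<std::size_t>(row) * width + col;
}
```

The destination array in the question uses the second form, with c = 0, 1, 2 standing for R, G, B.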

The problem here is that the input format may vary, and every input has to be converted to one target format. That is what sws_scale() does.

However, the ffmpeg libs do not yet have a planar RGB format whose planes are in R, G, B order. To get one directly, you would have to add your own pixel format description to libavutil/pixdesc.c in the ffmpeg source and rebuild the libs.

Or you can just convert the frame to AV_PIX_FMT_GBRP, which is the closest existing format to what you want. AV_PIX_FMT_GBRP is planar, but the green plane comes first, then blue, then red. You can then rearrange the planes yourself.

// Create a SwsContext first (same size in and out, so only the pixel format changes):
SwsContext* sws_ctx = sws_getContext(cdc_ctx->width, cdc_ctx->height, cdc_ctx->pix_fmt,
                                     cdc_ctx->width, cdc_ctx->height, AV_PIX_FMT_GBRP,
                                     SWS_BILINEAR, NULL, NULL, NULL);
// Allocate a frame to hold the converted image
AVFrame* gbr_frame = av_frame_alloc();
gbr_frame->format = AV_PIX_FMT_GBRP;
gbr_frame->width  = cdc_ctx->width;
gbr_frame->height = cdc_ctx->height;
av_frame_get_buffer(gbr_frame, 32);  // 32-byte alignment; linesize may be larger than width
....

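while (av_read_frame(fmt_ctx, &vid_pckt) >= 0) {
    // Only feed packets from the video stream to the video decoder
    if (vid_pckt.stream_index != video_stream_idx) {
        av_packet_unref(&vid_pckt);
        continue;
    }
    int ret = avcodec_send_packet(cdc_ctx, &vid_pckt);
    av_packet_unref(&vid_pckt);  // the decoder keeps its own reference to the data
    // In particular, we don't expect AVERROR(EAGAIN) here, because we read all
    // decoded frames with avcodec_receive_frame() until done.
    if (ret < 0)
        break;

    // Drain every frame the decoder has ready after this packet
    while ((ret = avcodec_receive_frame(cdc_ctx, vid_frame)) >= 0) {
        // convert image from native format to planar GBR
        sws_scale(sws_ctx, vid_frame->data,
                  vid_frame->linesize, 0, vid_frame->height,
                  gbr_frame->data, gbr_frame->linesize);

        // rearrange gbr channels in gbr_frame as you like
        // g channel is gbr_frame->data[0]
        // b channel is gbr_frame->data[1]
        // r channel is gbr_frame->data[2]
        // ......
    }
    // EAGAIN (needs more input) and EOF are expected; anything else is an error
    if (ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
        break;
}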

av_frame_free(&gbr_frame);       // av_frame_free() takes AVFrame**
av_frame_free(&vid_frame);
sws_freeContext(sws_ctx);
avformat_close_input(&fmt_ctx);  // pairs with avformat_open_input()
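The "rearrange" step left open in the loop could look roughly like the sketch below. The function gbrp_to_planar_rgb is a hypothetical helper, not an FFmpeg call. It assumes 8-bit planes in G, B, R order (as in AV_PIX_FMT_GBRP), positive linesizes, and a destination buffer of width * height * 3 bytes such as frame_arr from the question. It copies row by row because linesize can be larger than width when the frame buffer is aligned.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy three 8-bit planes stored in G, B, R order into one
// contiguous buffer laid out as R plane, then G plane, then B plane.
// planes/linesize mirror AVFrame::data and AVFrame::linesize.
void gbrp_to_planar_rgb(const std::uint8_t* const planes[3], const int linesize[3],
                        int width, int height, std::uint8_t* dst) {
    // Destination channel c (0 = R, 1 = G, 2 = B) comes from this source plane.
    static const int src_plane_for[3] = {2, 0, 1};
    for (int c = 0; c < 3; ++c) {
        const std::uint8_t* src = planes[src_plane_for[c]];
        const int stride = linesize[src_plane_for[c]];
        std::uint8_t* out = dst + static_cast<std::size_t>(c) * width * height;
        for (int y = 0; y < height; ++y) {
            // Copy only the visible pixels of the row and skip any padding.
            std::memcpy(out + static_cast<std::size_t>(y) * width,
                        src + static_cast<std::size_t>(y) * stride,
                        static_cast<std::size_t>(width));
        }
    }
}
```

Inside the decode loop, after sws_scale(), the call would then be something like gbrp_to_planar_rgb(gbr_frame->data, gbr_frame->linesize, gbr_frame->width, gbr_frame->height, frame_arr).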