Corruption of unknown origin in audio (libopus) and video (NVENC HEVC) bitstreams sent via the WebTransport API and decoded with WebCodecs


I'm using Windows 11, with Chrome as the web client. I have a Go program that runs two C++ programs as subprocesses. The first uses the NVIDIA Video Codec SDK to set up an HEVC encoder:

    NV_ENC_INITIALIZE_PARAMS IP = {};
    NV_ENC_CONFIG C = {};
    IP.encodeConfig = &C;
    IP.encodeConfig->version = NV_ENC_CONFIG_VER;
    IP.version = NV_ENC_INITIALIZE_PARAMS_VER;

    IP.encodeGUID = NV_ENC_CODEC_HEVC_GUID;
    IP.presetGUID = NV_ENC_PRESET_P7_GUID;
    IP.tuningInfo = NV_ENC_TUNING_INFO_LOW_LATENCY;
    IP.encodeWidth = 1920;
    IP.encodeHeight = 1080;
    IP.frameRateNum = 60;
    IP.frameRateDen = 1;
    IP.enablePTD = 1;
    IP.enableEncodeAsync = false;

    NV_ENC_PRESET_CONFIG presetConfig = { NV_ENC_PRESET_CONFIG_VER, { NV_ENC_CONFIG_VER } };
    NvEncFunctions.nvEncGetEncodePresetConfigEx(pEncoder, NV_ENC_CODEC_HEVC_GUID, NV_ENC_PRESET_P7_GUID, NV_ENC_TUNING_INFO_LOW_LATENCY, &presetConfig);
    memcpy(IP.encodeConfig, &presetConfig.presetCfg, sizeof(NV_ENC_CONFIG));

    IP.encodeConfig->frameIntervalP = 1;
    IP.encodeConfig->gopLength = NVENC_INFINITE_GOPLENGTH;
    IP.encodeConfig->rcParams.rateControlMode = NV_ENC_PARAMS_RC_CBR;
    IP.encodeConfig->rcParams.averageBitRate = 800000;
    IP.encodeConfig->rcParams.enableAQ = 1;
    IP.encodeConfig->rcParams.zeroReorderDelay = 1;

    res = NvEncFunctions.nvEncInitializeEncoder(pEncoder, &IP);
    if (res != NV_ENC_SUCCESS) {
        cerr << "nvEncInitializeEncoder " << res;
        return 1;
    }

The second process uses opus.lib to set up an audio encoder:

    // Define the Opus encoder parameters
    OpusEncoder* pEncoder = opus_encoder_create(48000, 2, OPUS_APPLICATION_AUDIO, (int*)&res);
    if (res != OPUS_OK) {
        cerr << "opus_encoder_create " << res;
        return 1;
    }

    // Define the Opus encoder bitrate
    opus_encoder_ctl(pEncoder, OPUS_SET_BITRATE(40000));
    opus_encoder_ctl(pEncoder, OPUS_SET_COMPLEXITY(10));
    opus_encoder_ctl(pEncoder, OPUS_SET_VBR_CONSTRAINT(0));
    opus_encoder_ctl(pEncoder, OPUS_SET_SIGNAL(OPUS_SIGNAL_MUSIC));
    opus_encoder_ctl(pEncoder, OPUS_SET_APPLICATION(OPUS_APPLICATION_AUDIO));
    opus_encoder_ctl(pEncoder, OPUS_SET_BANDWIDTH(OPUS_BANDWIDTH_FULLBAND));

    //opus_encoder_ctl(pEncoder, OPUS_SET_INBAND_FEC(1));
    //opus_encoder_ctl(pEncoder, OPUS_SET_PACKET_LOSS_PERC(100));

The bitstreams output by these encoders are sent via UDP to the loopback interface (127.0.0.1) and received by the Go host process, which promptly forwards them to a remote web client via WebTransport (the webtransport-go package). Only the audio path is shown below:

    var audioStream webtransport.SendStream
    //audioStreamOnline := false
    go func() {
        audioPipeAddr, err := net.ResolveUDPAddr("udp", "127.0.0.1:10050")
        if err != nil {
            panic(err)
        }
        audioPipe, err := net.ListenUDP("udp", audioPipeAddr)
        if err != nil {
            panic(err)
        }
        err = audioPipe.SetReadBuffer(100)
        if err != nil {
            panic(err)
        }

        for {
            buffer := make([]byte, 100)
            len, _, err := audioPipe.ReadFromUDP(buffer)
            if err != nil {
                panic(err)
            }

            audioStream.Write(buffer[0:len])

            //fmt.Printf("Received %d bytes from audio pipe: %s\n", len, string(buffer[:len]))
        }
    }()

On the client side, the received bitstream is fed directly into the WebCodecs Opus and HEVC decoders:

    let ts = 0

    while (true) {
        const {done, value} = await reader.read()
        if (done) return

        lastVideoPacket = performance.now()

        videoDecoder.decode(new EncodedVideoChunk({
            type: "delta",
            data: value.slice(5),
            timestamp: ts,
            duration: 16000
        }))

        ts += 16000
    }

    let ts = 0

    while (true) {
        const {done, value} = await reader.read()
        if (done) return

        lastAudioPacket = performance.now()

        audioDecoder.decode(new EncodedAudioChunk({
            type: "key",
            data: value,
            timestamp: ts,
            duration: 10000
        }))

        ts += 10000
    }

The decoders are configured as follows:

    videoDecoder.configure({
        codec: "hev1.2.4.L120.B0",
        codedWidth: 1920,
        codedHeight: 1080,
        hardwareAcceleration: "prefer-hardware",
        optimizeForLatency: true
    })

    audioDecoder.configure({
        codec: "opus",
        sampleRate: 48000,
        numberOfChannels: 2
    })

However, the decoded audio and video show clear corruption, as shown in this video: https://youtu.be/wAY5w4zlku4. It may seem that the audio corruption is caused by me moving the windows, but I can confirm that it happens all the time. The video is actually on the less glitchy side of what I have experienced, and if I left it running, the decoders would eventually hit a fatal error and close.

This was one of the first problems I ran into when I started this project, and I have tried very hard to fix it. The bitstreams were initially written to and read from stdout instead of being sent through the loopback interface, and switching made no difference. Now I am out of ideas, and I hope those experienced with encoded AV bitstreams can look at the code above to see whether something is wrong, or whether the corruption in the video looks familiar.

Thanks in advance!

Edit: I had the subprocesses write the bitstreams to files once, and the HEVC file played perfectly with ffplay. I haven't tested playing the audio file, but I'm sure there is nothing wrong with the subprocesses themselves.


There is 1 answer

Tiger Yang

SOLVED! It turns out there's some weird bug in the webtransport-go package that corrupts the data in some way when I send it as a stream. (I'm pretty sure it's something more subtle than the system swapping line endings, as that makes the bitstream completely unplayable in my experience.) I now send the bitstreams as datagrams using SendMessage(msg []byte), with my own packet fragmentation system for the larger video packets, and it works perfectly!