Reading an LZ4 compressed text file (mozlz4) in WebExtensions (JavaScript, Firefox)

Question

Asked by CanisLupus on 09 September 2017 at 11:09

I'm porting a Firefox Add-on SDK extension to WebExtensions. Previously I could access the browser's search engines, but now I can't, so a helpful user suggested I try reading the search.json.mozlz4 file, which lists every installed engine. However, this file is JSON compressed with LZ4, and it's in Mozilla's own LZ4 format, with a custom magic number, 'mozLz40\0'.

Before, one could use this to read a text file that uses LZ4 compression, including a mozlz4 file:

// OS.File.read returns a Promise, so await it inside an async function (or use .then).
let bytes = await OS.File.read(path, { compression: "lz4" });
let content = new TextDecoder().decode(bytes);

(although I couldn't find documentation about the "compression" field, it works)

Now, using WebExtensions, the best I could come up with to read a file is

var reader = new FileReader();
// Attach the handler before starting the read.
reader.onload = function(ev) {
    let content = ev.target.result;
};
reader.readAsText(file);

This does not handle compression in any way. This library handles LZ4~~, but it is for node.js so I can't use that.~~ [edit: it works standalone too]. However, even if I remove the custom magic number processing I can't get it to decompress the file, while this Python code, in comparison, works as expected:

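import lz4.block  # the block API lives in its own submodule

with open("search.json.mozlz4", "rb") as file_obj:
    # Mozilla's container starts with an 8-byte magic number.
    if file_obj.read(8) != b"mozLz40\0":
        raise ValueError("Invalid magic number")
    print(lz4.block.decompress(file_obj.read()))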

How can I do this in JS?

**CanisLupus** · Accepted Answer · 2017-09-23T16:58:14+00:00

After much trial and error, I was finally able to read and decode the search.json.mozlz4 file in a WebExtension. You can use the node-lz4 library, though you only need one function from it: uncompress (aliased as decodeBlock for external access). I renamed it to decodeLz4Block and included it here with slight changes:

// This method's code was taken from node-lz4 by Pierre Curto. MIT license.
// CHANGES: Added ; to all lines. Reformatted one-liners. Removed n = eIdx. Fixed eIdx skipping end bytes if sIdx != 0.
function decodeLz4Block(input, output, sIdx, eIdx)
{
    sIdx = sIdx || 0;
    eIdx = eIdx || input.length;

    // Process each sequence in the incoming data
    for (var i = sIdx, j = 0; i < eIdx;)
    {
        var token = input[i++];

        // Literals
        var literals_length = (token >> 4);
        if (literals_length > 0) {
            // length of literals
            var l = literals_length + 240;
            while (l === 255) {
                l = input[i++];
                literals_length += l;
            }

            // Copy the literals
            var end = i + literals_length;
            while (i < end) {
                output[j++] = input[i++];
            }

            // End of buffer?
            if (i === eIdx) {
                return j;
            }
        }

        // Match copy
        // 2 bytes offset (little endian)
        var offset = input[i++] | (input[i++] << 8);

        // 0 is an invalid offset value
        if (offset === 0 || offset > j) {
            return -(i-2);
        }

        // length of match copy
        var match_length = (token & 0xf);
        var l = match_length + 240;
        while (l === 255) {
            l = input[i++];
            match_length += l;
        }

        // Copy the match
        var pos = j - offset; // position of the match copy in the current output
        var end = j + match_length + 4; // minmatch = 4
        while (j < end) {
            output[j++] = output[pos++];
        }
    }

    return j;
}
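
The `+ 240` and `while (l === 255)` lines in the function above are terse, so here is how I read them: a length nibble of 15 means extra length bytes follow, each extra byte is added to the total, and the first byte below 255 ends the run. The sketch below spells that rule out in a helper I've called readExtendedLength. That name and its return shape are my own, not part of node-lz4, and it is only meant as an illustration of the rule:

```javascript
// Hypothetical helper, not part of node-lz4: spells out the length rule that
// decodeLz4Block implements with "l = length + 240; while (l === 255)".
function readExtendedLength(input, index, nibble) {
    let length = nibble;
    if (nibble === 15) {
        // A nibble of 15 means extra length bytes follow. Each one is added
        // to the total, and a byte below 255 ends the run.
        let extra;
        do {
            extra = input[index++];
            length += extra;
        } while (extra === 255);
    }
    // "next" is the index of the first byte after the length bytes.
    return { length: length, next: index };
}

// Token 0xF0 has a literal-length nibble of 15, followed here by 255 and 10.
const sample = new Uint8Array([0xF0, 255, 10]);
const result = readExtendedLength(sample, 1, sample[0] >> 4);
console.log(result.length, result.next); // 280 3
```

The same rule applies to the match length (the low nibble of the token), except that the decoder then adds the minimum match length of 4.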

Then declare this function that receives a File object (not a path) and callbacks for success/error:

function readMozlz4File(file, onRead, onError)
{
    let reader = new FileReader();

    reader.onload = function() {
        let input = new Uint8Array(reader.result);
        let output;
        let uncompressedSize = input.length*3;  // size estimate for uncompressed data!

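        // Decode whole file.
        do {
            output = new Uint8Array(uncompressedSize);
            uncompressedSize = decodeLz4Block(input, output, 8+4);  // skip 8 byte magic number + 4 byte data size field
            // if there's more data than our output estimate, create a bigger output array and retry (at most one retry)
        } while (uncompressedSize > output.length);

        // A negative result means the decoder hit an invalid match offset (corrupt or non-LZ4 data).
        if (uncompressedSize < 0) {
            if (onError) {
                onError(new Error("Invalid LZ4 data near input offset " + (-uncompressedSize)));
            }
            return;
        }

        output = output.slice(0, uncompressedSize); // remove excess bytes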

        let decodedText = new TextDecoder().decode(output);
        onRead(decodedText);
    };

    if (onError) {
        reader.onerror = onError;
    }

    reader.readAsArrayBuffer(file); // read as bytes
};
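
The function above skips the 12 header bytes without looking at them and guesses the output size. A possible refinement is to validate the magic number and read the size field instead of guessing. The sketch below assumes the 4-byte field holds the uncompressed size as a little-endian unsigned integer. I haven't seen that documented, but it would be consistent with the Python snippet in the question handing everything after the magic number to lz4.block.decompress. The helper name parseMozlz4Header and its return shape are my own invention:

```javascript
// Hypothetical helper: validates the mozlz4 header and returns the size field.
// ASSUMPTION: bytes 8..11 hold the uncompressed size as a little-endian uint32.
const MOZLZ4_MAGIC = [0x6d, 0x6f, 0x7a, 0x4c, 0x7a, 0x34, 0x30, 0x00]; // "mozLz40\0"

function parseMozlz4Header(input) {
    if (input.length < 12) {
        throw new Error("File too short to be mozlz4");
    }
    for (let k = 0; k < MOZLZ4_MAGIC.length; k++) {
        if (input[k] !== MOZLZ4_MAGIC[k]) {
            throw new Error("Invalid magic number");
        }
    }
    // Respect byteOffset in case "input" is a view into a larger buffer.
    const view = new DataView(input.buffer, input.byteOffset, input.byteLength);
    return {
        uncompressedSize: view.getUint32(8, true), // true = little-endian
        dataStart: 12                              // 8 byte magic + 4 byte size
    };
}
```

If the assumption holds, readMozlz4File could allocate `new Uint8Array(header.uncompressedSize)` once and pass `header.dataStart` as sIdx to decodeLz4Block. Keeping the retry loop as a fallback seems prudent in case the field turns out to mean something else.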

Then you can add a file input to your add-on settings page that lets the user browse to and select search.json.mozlz4 (in WebExtensions you can't simply open any file in the filesystem without user intervention):

<input name="selectMozlz4FileButton" type="file" accept=".json.mozlz4">

To respond to the user selecting the file, use something like this, which calls the method we previously declared (here I don't use the error callback, but you can):

let button = document.getElementsByName("selectMozlz4FileButton")[0];
button.onchange = function onButtonPress(ev) {
    let file = ev.target.files[0];
    readMozlz4File(file, function(text){
        console.log(text);
    });
};
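
The text that arrives in the callback above is plain JSON, so getting at the installed engines is a matter of JSON.parse. The sketch below assumes the file has a top-level "engines" array whose entries carry the display name in a "_name" field. That is what I'd expect from this file, but the layout may differ between Firefox versions, so check your own file first. The function name listEngineNames is made up for this example:

```javascript
// ASSUMPTION: the decoded JSON has a top-level "engines" array and each entry
// stores its display name in "_name". Verify this against your own file.
function listEngineNames(decodedText) {
    const data = JSON.parse(decodedText);
    const engines = Array.isArray(data.engines) ? data.engines : [];
    return engines
        .map(function (engine) { return engine && engine._name; })
        .filter(function (name) { return typeof name === "string"; });
}

// Made-up sample input, only to show the call shape.
const sampleText = '{"engines":[{"_name":"Example A"},{"_name":"Example B"}]}';
console.log(listEngineNames(sampleText));
```

In the onchange handler you would call listEngineNames(text) in place of console.log(text).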

I hope this helps someone. I sure spent a lot of time working this simple thing out. :)
