Using zstandard to compress a file in Python


So I'm using the zstandard Python library, and I've written a helper class and function that use context managers to decompress files.

import io

import zstandard as zstd

class ZstdReader:
    def __init__(self, filename):
        self.filename = filename

    def __enter__(self):
        self.f = open(self.filename, 'rb')
        dctx = zstd.ZstdDecompressor()
        reader = dctx.stream_reader(self.f)
        return io.TextIOWrapper(reader, encoding='utf-8')

    def __exit__(self, *a):
        self.f.close()
        return False

def openZstd(filename, mode='rb'):
    if 'w' in mode:
        return ZstdWriter(filename)
    return ZstdReader(filename)

This works really well: I can write with openZstd('filename.zst', 'rb') as f: and then use the file f for JSON loading. However, I'm having trouble generalizing this to writing. I followed the documentation the same way I did for reading, but something is not working. Here's what I've tried:

class ZstdWriter:
    def __init__(self, filename):
        self.filename = filename

    def __enter__(self):
        self.f = open(self.filename, 'wb')
        ctx = zstd.ZstdCompressor()
        writer = ctx.stream_writer(self.f)
        return io.TextIOWrapper(writer, encoding='utf-8')

    def __exit__(self, *a):
        self.f.close()
        return False

When I open a file using this class and call json.dump([], f), the file ends up empty. I guess one of the steps is swallowing my input, but I have no idea which one.

1 answer

John Do (best answer)

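As suggested by jasonharper in the comments, you have to flush both the io.TextIOWrapper and the zstd writer itself before closing the file. Closing only the underlying file leaves data sitting in both of those buffers, which is why the output ended up empty: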

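import io
import json
import zstandard as zstd

f = open('filename.zst', 'wb')
writer = zstd.ZstdCompressor().stream_writer(f)

s = json.dumps({})
iw = io.TextIOWrapper(writer, encoding="utf-8")
iw.write(s)

iw.flush()                      # push buffered text into the zstd writer
writer.flush(zstd.FLUSH_FRAME)  # end the zstd frame and write it out to f
f.close()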

This results in the data being in the file, and the file containing a complete zstd frame.