Zstandard - compressing with a function that reuses the context


I'm trying to figure out when to use the Zstandard function that, according to the documentation, reuses the context.

Please explain what is meant in this case:

`ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx(void);`

`ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel);`

Compression context

When compressing many times, it is recommended to allocate a context just once, and re-use it for each successive compression operation. This will make workload friendlier for system's memory.

Note : re-using context is just a speed / resource optimization. It doesn't change the compression ratio, which remains identical.

Note 2 : In multi-threaded environments, use one different context per thread for parallel execution.

What does "When compressing many times" mean?

What exactly is compressed many times? The same string of data, or something different?


There is 1 answer

Redu On

ZSTD is a sophisticated compression algorithm. It keeps its working state in a compression context (ZSTD_CCtx): the match-finder tables, internal buffers, and compression parameters. You can also supply a dictionary, either trained in advance from samples that resemble your data or simply a buffer of representative content, so that the compressor starts with useful history before it sees any actual data.

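When you use the simple API, ZSTD_compress(), a fresh context is set up and torn down inside every call. With the more sophisticated ZSTD_compressCCtx() you create the context once with ZSTD_createCCtx() and pass it to every call, so the same memory is reused instead of being allocated again. That is what "compressing many times" refers to: many separate compression calls, whether on the same data or on different data. Each call still produces an independent frame, and nothing learned from one input carries over to the next, which is why the documentation says the compression ratio remains identical.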

Reuse is of course most beneficial when you compress many small inputs, such as chunks of a collection of documents, because the setup cost is then a large share of the total work. If those inputs also share the same or a similar structure, a dictionary is what improves the ratio; the reused context only saves time and memory. The ZSTD_compressStream() and ZSTD_compressStream2() functions also take the context as the first parameter for the same reason. They are very handy tools for creating ZSTD_TransformStreams.