LZF may compress with different algorithms


I am using libLZF for compression in my application. In the documentation, there is a comment that concerns me:

lzf_compress might use different algorithms on different systems and
even different runs, thus might result in different compressed strings
depending on the phase of the moon or similar factors.

I plan to compare compressed data to know whether the input was identical. Obviously, if different algorithms were used, the compressed data would differ even for identical input. Is there a solution to this problem? Is there a way to force the same algorithm each time? Or is this comment simply not true in practice? After all, "phase of the moon or similar factors" sounds a little strange.
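For concreteness, this is roughly the comparison I have in mind (a minimal sketch using lzf_compress from lzf.h; the buffer sizes and helper name are just for illustration):

#include <string.h>
#include <lzf.h>   /* lzf_compress from liblzf */

/* Sketch of my plan: compress both inputs and treat identical compressed
 * output as "identical input".  This is exactly what breaks if
 * lzf_compress is not deterministic across runs. */
static int same_input(const void *a, unsigned int alen,
                      const void *b, unsigned int blen)
{
    unsigned char ca[65536], cb[65536];   /* arbitrary bounds for the example */
    unsigned int la = lzf_compress(a, alen, ca, sizeof ca);
    unsigned int lb = lzf_compress(b, blen, cb, sizeof cb);

    /* lzf_compress returns 0 if the output did not fit; not handled here */
    return la != 0 && la == lb && memcmp(ca, cb, la) == 0;
}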


There are 2 answers

atzz (best answer)

The reason for the "moon phase dependency" is that they omit initialization of some data structures to squeeze out a little bit of performance (only where it does not affect decompression correctness, of course). That is not an uncommon trick, as compression libraries go. So if you put your compression code in a separate, one-shot process, and your OS zeroes memory before handing it to a process (all "big" OSes do, but some smaller ones may not), then you'll always get the same compression result.

Also, take note of the following, from lzfP.h:

/*
 * You may choose to pre-set the hash table (might be faster on some
 * modern cpus and large (>>64k) blocks, and also makes compression
 * deterministic/repeatable when the configuration otherwise is the same).
 */
#ifndef INIT_HTAB
# define INIT_HTAB 0
#endif

So I think you only need to #define INIT_HTAB 1 when compiling libLZF to make it deterministic, though I wouldn't bet on it too much without further analysis.
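If that is right, the simplest route is probably to pass the define when building liblzf (e.g. -DINIT_HTAB=1 on the compiler command line), since lzfP.h only sets it when it is not already defined. As a sketch, assuming the stock liblzf sources, a small wrapper translation unit would also work:

/* Minimal sketch: compile this file instead of lzf_c.c directly.
 * lzf_c.c includes lzfP.h, and lzfP.h only defines INIT_HTAB when it is
 * not already defined, so this forces the hash table to be pre-set. */
#define INIT_HTAB 1
#include "lzf_c.c"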

NPE

Decompress on the fly, then compare.

libLZF's web site states that "decompression [...] is basically at (unoptimized) memcpy-speed".
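As a minimal sketch of that approach (assuming lzf_decompress from lzf.h, and that the original uncompressed length was stored alongside the compressed blob):

#include <string.h>
#include <lzf.h>   /* lzf_decompress from liblzf */

/* Sketch: decompress the stored blob and compare plaintext instead of
 * comparing compressed bytes, so compressor nondeterminism no longer matters. */
static int matches_input(const void *compressed, unsigned int clen,
                         const void *input, unsigned int ilen,
                         void *scratch)   /* scratch must hold at least ilen bytes */
{
    /* lzf_decompress returns the decompressed length, or 0 on error */
    unsigned int n = lzf_decompress(compressed, clen, scratch, ilen);
    return n == ilen && memcmp(scratch, input, ilen) == 0;
}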