I'm trying to write code that packs a few big files (from tens to hundreds of gigabytes) into one archive. The compression methods supported by the tarfile module are a bit slow for that much data, so I would like to use an external compression module such as lz4 to achieve better compression speed. Unfortunately, I can't find a way to create the tar file and compress it with lz4 on the fly, without creating a temporary tar file. The tarfile documentation says that an uncompressed stream can be opened for writing with the 'w|' mode. Is that the way to stream a tar file directly to the lz4 module? If so, what is the proper way to use it? Thank you very much.
Python: how to create tar file and compress it on the fly with external module, using different compression methods not available in tarfile module?
2.3k views Asked by Trevor_Numbers At
There are 2 answers
You can pipe the output of the tar command directly to the lz4 utility. This avoids the need for any intermediate file. Here is an example (assuming you have both tar and lz4 installed on your system):
tar cvf - * | lz4 > mypack.tar.lz4
The - here tells tar to write its output to stdout. Of course, you can replace the * with whichever target you want to tar.
The reverse operation is also possible:
lz4 -d mypack.tar.lz4 | tar xv
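Since the question is about Python, the same pipeline can also be driven from the tarfile module instead of the tar command. The following is only a minimal sketch: it assumes the lz4 command-line tool is on your PATH and that `lz4 -c` reads stdin and writes stdout, and the function name `tar_through_filter` and its parameters are hypothetical, chosen for illustration.

```python
import subprocess
import tarfile


def tar_through_filter(paths, archive_path, filter_cmd=("lz4", "-c")):
    """Stream an uncompressed tar of `paths` through an external command.

    Hypothetical helper: `filter_cmd` is assumed to read raw bytes on
    stdin and write the compressed bytes to stdout.
    """
    with open(archive_path, "wb") as out:
        proc = subprocess.Popen(
            list(filter_cmd), stdin=subprocess.PIPE, stdout=out
        )
        try:
            # 'w|' produces a forward-only stream, so the pipe is never
            # asked to seek and no temporary tar file is created.
            with tarfile.open(fileobj=proc.stdin, mode="w|") as tar:
                for path in paths:
                    tar.add(path)
        finally:
            # tarfile does not close a file object it did not open,
            # so signal end-of-input to the filter explicitly.
            proc.stdin.close()
            returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, list(filter_cmd))
```

Calling `tar_through_filter(["bigfile1", "bigfile2"], "mypack.tar.lz4")` should then be roughly equivalent to the shell pipeline above, with the file names being placeholders.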
Per our conversation above.
From there you can do the usual tar.addfile. FYI, as I stated in the conversation: GNU tar can auto-detect gz and bz2 but not lz4, so you have to run
lz4 -c -d stdin.lz4 | tar xf -
to extract files. If you simply ran tar xf, it would fail.
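A minimal sketch of the streaming approach the question asks about, staying entirely in Python. It assumes the third-party lz4 package (`pip install lz4`) and its `lz4.frame.open` file interface; the helper names `write_tar_stream`, `read_tar_stream`, `pack_lz4` and `unpack_lz4` are hypothetical. The stream modes 'w|' and 'r|' only need an object with `write()` or `read()`, so any compressor that offers a file-like wrapper could be substituted.

```python
import tarfile


def write_tar_stream(paths, sink):
    """Write an uncompressed tar stream of `paths` to any object with write()."""
    # 'w|' never seeks, so `sink` can be a compressor's file wrapper.
    with tarfile.open(fileobj=sink, mode="w|") as tar:
        for path in paths:
            tar.add(path)


def read_tar_stream(source, dest_dir):
    """Extract a tar stream from any object with read() into `dest_dir`."""
    with tarfile.open(fileobj=source, mode="r|") as tar:
        tar.extractall(dest_dir)


def pack_lz4(paths, archive_path):
    # Assumption: the third-party `lz4` package is installed.
    import lz4.frame

    with lz4.frame.open(archive_path, mode="wb") as sink:
        write_tar_stream(paths, sink)


def unpack_lz4(archive_path, dest_dir):
    # Mirrors `lz4 -c -d archive | tar xf -`, since tar itself
    # does not detect lz4.
    import lz4.frame

    with lz4.frame.open(archive_path, mode="rb") as source:
        read_tar_stream(source, dest_dir)
```

Because the lz4 import is deferred into the two wrapper functions, the generic stream helpers can be tried with any other file-like sink, for example an `io.BytesIO` buffer.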