Find invalid bz2 files, preferably using C/C++

I have around 200 thousand bz2 files, of which only one is valid. Each file is smaller than 200 bytes. I need to find the valid one, but the command-line bzip2 utility takes too much time.

Is there a minimal check on the file's bytes that identifies an invalid bz2 file, so I can skip further processing of it? I want to do this in C/C++, as I expect it to be much faster than a shell script.

There is 1 answer

Shashwat Kumar On BEST ANSWER

Found a solution. Per the bz2 format, the first 3 bytes of a file must be 'BZh'. This check filtered out all but 19 files.
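A minimal sketch of that check in C++ might look like the following. The function names `looks_like_bz2` and `file_looks_like_bz2` are made up for illustration. The test on the fourth byte is an addition beyond the 'BZh' check described above: it assumes the usual bzip2 header layout, in which 'BZh' is followed by a block-size digit from '1' to '9'.

```cpp
#include <cstddef>
#include <cstdio>

// Returns true if the buffer starts with a plausible bzip2 stream header:
// the bytes 'B', 'Z', 'h', then a block-size digit '1'..'9'.
// Passing this check does NOT prove the file is valid; it only rules out
// files that cannot be bzip2 streams.
bool looks_like_bz2(const unsigned char* data, std::size_t len) {
    if (len < 4) return false;  // too short to hold the header
    return data[0] == 'B' && data[1] == 'Z' && data[2] == 'h' &&
           data[3] >= '1' && data[3] <= '9';
}

// Reads at most the first 4 bytes of the file at `path` and applies the
// header check. Returns false if the file cannot be opened.
bool file_looks_like_bz2(const char* path) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    unsigned char head[4];
    std::size_t n = std::fread(head, 1, sizeof head, f);
    std::fclose(f);
    return looks_like_bz2(head, n);
}
```

Calling `file_looks_like_bz2` on each path only reads four bytes per file, so it should be cheap even for 200 thousand files. The few candidates that survive still need a real integrity test, for example with `bzip2 -t` or by decompressing them with libbz2.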