File Organization in a tar.gz archive


I have observed that the files inside a folder are usually listed sequentially in a tar.gz archive, but in one exceptional case I found them listed in what looks like random order. For example, say there are three folders a, b, and c, and each contains files 1, 2, and 3. In the usual case the archive entries are listed as a/1, a/2, a/3, b/1, b/2, b/3, c/1, c/2, c/3, but in this case the order is something like b/2, a/1, b/3, ... Why could this happen? I rely on the first ordering to read a .tar.gz archive and process the data inside it folder by folder. Is there any way to get the entries sorted by folder inline in such cases, without traversing the whole archive each time and building the parent/child structure myself? Sample code below:

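   // Assumes org.apache.commons.compress.archivers.tar.* and java.io.FileInputStream are imported
   TarArchiveInputStream tis = new TarArchiveInputStream(new FileInputStream("a.tar"));
   TarArchiveEntry entry;
   while ((entry = tis.getNextTarEntry()) != null)
    System.out.println(entry.getName());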

I could not find any API that would give me such a sorted list inline. Any help would be appreciated; I'm stuck on this case.
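For context on why the order can differ: a tar archive is a sequential stream of header-plus-data records with no central index, and entries appear in whatever order the creating tool wrote them. Because a .tar.gz compresses that whole stream, a reader cannot seek to a folder either, so a sorted view seems to require at least one full pass over the entry names. The following is only a minimal sketch of that grouping step, using the standard library alone. The class `TarEntryGrouper` and its methods `parentOf` and `groupByFolder` are hypothetical names introduced for illustration, and the sample names stand in for values that would be collected from `entry.getName()` in the loop above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups tar entry names by parent folder after one pass.
final class TarEntryGrouper {

    // Returns the parent folder of an entry name, or "" for top-level entries.
    static String parentOf(String entryName) {
        int slash = entryName.lastIndexOf('/');
        return slash < 0 ? "" : entryName.substring(0, slash);
    }

    // Groups entry names by parent folder. The TreeMap keeps folders in sorted
    // order, and each folder's file list is sorted before returning.
    static Map<String, List<String>> groupByFolder(List<String> entryNames) {
        Map<String, List<String>> byFolder = new TreeMap<>();
        for (String name : entryNames) {
            if (name.endsWith("/")) {
                continue; // skip directory entries such as "a/"
            }
            byFolder.computeIfAbsent(parentOf(name), k -> new ArrayList<>()).add(name);
        }
        for (List<String> files : byFolder.values()) {
            Collections.sort(files);
        }
        return byFolder;
    }

    public static void main(String[] args) {
        // Made-up names in a shuffled order like the one described in the question.
        List<String> names = Arrays.asList(
                "b/2", "a/1", "b/3", "c/1", "a/3", "b/1", "a/2", "c/3", "c/2");
        groupByFolder(names).forEach(
                (folder, files) -> System.out.println(folder + " -> " + files));
    }
}
```

This sorts names as strings, so "a/10" would come before "a/2"; a numeric comparator would be needed if that matters. It also only buffers names, not file contents, so processing the data folder by folder would still need either a second pass over the archive or extracting it to disk first.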
