Zip file encryption readable by some Zip clients, not others


I'm investigating a bug report on my open source UnzipKit project. Basically, when writing files encrypted with a password to a Zip file, the resulting archives are readable by some Zip clients, and not others.

UnzipKit writes the password as a UTF-8 string, using the MiniZip wrapper around zlib, which supports only "Traditional PKWARE Encryption", not AES. It uses the zipOpenNewFileInZip3() MiniZip function to open the file for writing.

The resulting archive is readable by BetterZip and UnzipKit on the Mac, as well as 7-Zip on Windows. However, WinZip (Mac and Windows) and the Mac's unzip command-line tool complain about an incorrect password.

For testing purposes, I'm encrypting the files using 111111 as the password, as indicated in the bug report. I tried changing the text encoding to ASCII, and Latin 1 (CP-1252), but that didn't seem to make a difference.

I'm working on getting familiar with the way Zip files work, but this still seems mysterious to me. What could I be doing wrong to cause it to work in some clients and not in others? I would expect it to work or be broken across the board.

Here is the hex dump of a Zip file that fails to unzip:

50 4B 03 04 14 00 01 00 08 00 B7 54 D1 46 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 37 52 6F F1 31 B6 6E 3D 76 CD 3A 67 0E FF 08 42 C9 4D 61 74 C1 27 DF CB BE 24 41 46 56 60 89 C2 07 97 56 C9 2A 50 80 86 15 E2 62 66 90 77 20 50 4B 01 02 00 00 14 00 01 00 08 00 B7 54 D1 46 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 50 4B 05 06 00 00 00 00 01 00 01 00 3D 00 00 00 5C 00 00 00 00 00

This is what I get on the command line, with a return code of 82:

$ unzip -P 111111 PasswordProtected.zip
Archive:  PasswordProtected.zip
   skipping: Test File A.txt         incorrect password

Update

I created an archive of the same file with WinZip for Mac, with the same password on the file. This is its hex dump:

50 4B 03 04 14 00 03 00 08 00 27 BA 76 44 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 1C 68 5F 1E FF CA 3A 6C D5 B6 01 28 0F 72 83 D9 01 9B BA 87 51 50 1F 66 61 83 43 E8 64 58 B6 ED A6 F0 9B 3B 87 89 70 F2 4F D9 AB 21 6A 6A 06 50 4B 01 02 14 03 14 00 03 00 08 00 27 BA 76 44 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 00 00 00 00 01 00 00 00 80 81 00 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 50 4B 05 06 00 00 00 00 01 00 01 00 3D 00 00 00 5C 00 00 00 00 00

The biggest difference is that the file data is completely different, which suggests it was encrypted with a different key (although the random bytes at the start of the encrypted data would make the ciphertext differ even with the same password). Also, the general purpose bit flag indicates that WinZip used Maximum, rather than Normal, compression. In case the rest offers any clues, here is a summary of the differences, annotated with the field names provided by the spec.

  • Local file header

    Field Name                UnzipKit Bytes  WinZip Bytes
    general purpose bit flag  01 00           03 00
    last mod file time        B7 54           27 BA
    last mod file date        D1 46           76 44
    
  • File data

    UnzipKit
    37 52 6F F1 31 B6 6E 3D 76 CD 3A 67 0E FF 08 42 C9 4D 61 74 C1 27 DF CB BE 24 41 46 56 60 89 C2 07 97 56 C9 2A 50 80 86 15 E2 62 66 90 77 20
    
    WinZip
    1C 68 5F 1E FF CA 3A 6C D5 B6 01 28 0F 72 83 D9 01 9B BA 87 51 50 1F 66 61 83 43 E8 64 58 B6 ED A6 F0 9B 3B 87 89 70 F2 4F D9 AB 21 6A 6A 06
    
  • Central directory structure

    Field Name                UnzipKit Bytes  WinZip Bytes
    version made by           00 00           14 03
    general purpose bit flag  01 00           03 00
    last mod file time        B7 54           27 BA
    last mod file date        D1 46           76 44
    external file attributes  00 00 00 00     00 00 80 81
    

All of the following fields match 100%:

  • Local file header
    • version needed to extract
    • compression method
    • crc-32
    • compressed size
    • uncompressed size
    • file name length
    • extra field length
    • file name
  • Central directory structure
    • version needed to extract
    • compression method
    • crc-32
    • compressed size
    • uncompressed size
    • file name length
    • extra field length
    • file comment length
    • disk number start
    • internal file attributes
    • relative offset of local header
    • file name
  • The entire "End of central directory record"
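
For anyone who wants to reproduce the comparison, here is a minimal Python sketch that unpacks the fixed 30-byte part of the local file header from the UnzipKit dump and decodes the general purpose bit flag. The names parse_local_header and describe_flags are made up for this illustration; the field layout and the meaning of bits 0 through 3 are taken from the Zip specification for Deflate entries.

```python
import struct

# Fixed 30-byte portion of the local file header, copied from the UnzipKit dump.
UNZIPKIT_HEADER = bytes.fromhex(
    "50 4B 03 04 14 00 01 00 08 00 B7 54 D1 46 1B B6 2D 32 "
    "2F 00 00 00 21 00 00 00 0F 00 00 00"
)

FIELDS = (
    "signature", "version_needed", "flags", "method", "mod_time", "mod_date",
    "crc32", "compressed_size", "uncompressed_size", "name_length", "extra_length",
)

# Meaning of bits 2..1 of the flag when the compression method is Deflate (8).
DEFLATE_OPTIONS = ("normal", "maximum", "fast", "super fast")


def parse_local_header(raw):
    """Unpack the little-endian fixed fields of a local file header."""
    values = struct.unpack("<4sHHHHHIIIHH", raw[:30])
    return dict(zip(FIELDS, values))


def describe_flags(flags):
    """Decode the low bits of the general purpose bit flag."""
    return {
        "encrypted": bool(flags & 0x0001),
        "deflate_option": DEFLATE_OPTIONS[(flags >> 1) & 0x3],
        "data_descriptor": bool(flags & 0x0008),
    }


header = parse_local_header(UNZIPKIT_HEADER)
print(hex(header["crc32"]), header["compressed_size"], header["uncompressed_size"])
print(describe_flags(header["flags"]))  # UnzipKit wrote 01 00
print(describe_flags(0x0003))           # WinZip wrote 03 00
```

Both archives have the encrypted bit set and neither uses a data descriptor, so the flag difference comes down to the compression option bits.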

Answer by Mark Adler (best answer)

I'm guessing that the last parameter in your zipOpenNewFileInZip3() call (crcForCrypting) is zero. It is supposed to be the CRC-32 of the uncompressed file data.
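
The CRC has to be known before the entry is opened, which usually means one extra pass over the input before anything is written. A rough sketch of that pass in Python, where zlib.crc32 stands in for zlib's C crc32() function and crc32_of_chunks is a hypothetical helper name; the result is the value that would be handed to the last parameter of zipOpenNewFileInZip3():

```python
import zlib


def crc32_of_chunks(chunks):
    """Compute the CRC-32 of data supplied as an iterable of byte chunks."""
    crc = 0
    for chunk in chunks:
        # Passing the running value lets the CRC be built up incrementally.
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


# Chunked and one-shot computation agree, so a large file can be streamed.
streamed = crc32_of_chunks([b"hello ", b"world"])
one_shot = zlib.crc32(b"hello world") & 0xFFFFFFFF
print(streamed == one_shot)  # True
```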

When a compressed entry is encrypted, it is preceded by a 12-byte encryption header. That header is composed of 10 or 11 random bytes followed by the 2 or 1 high-order bytes of that file's CRC. (Whether it is 1 or 2 bytes is determined by the version of the zip format.) The header is then encrypted using the password, and encryption continues from there on the compressed data. This allows the unzipper to check the password by comparing the end of the decrypted encryption header with the CRC stored in the local header that precedes it.
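
To make the check concrete, here is a small Python sketch of the traditional PKWARE cipher as described in the Zip specification, with a header built the way a writer would build it and the test a strict reader performs. The class and function names are invented for this example, it assumes the 1-byte variant of the check (last header byte equals the top byte of the CRC), and it is an illustration rather than the actual code of MiniZip or UnZip.

```python
import os


def _make_crc_table():
    """Standard CRC-32 lookup table (reflected polynomial 0xEDB88320)."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


def crc_step(crc, byte):
    """Raw single-byte CRC update, without the usual pre/post inversion."""
    return CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)


class ZipCrypto:
    """Traditional PKWARE stream cipher state, keyed by the password bytes."""

    def __init__(self, password):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    def _update(self, plain_byte):
        # The keys are always advanced with the plaintext byte.
        self.k0 = crc_step(self.k0, plain_byte)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = crc_step(self.k2, self.k1 >> 24)

    def _stream_byte(self):
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data):
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)
        return bytes(out)

    def decrypt(self, data):
        out = bytearray()
        for c in data:
            p = c ^ self._stream_byte()
            self._update(p)
            out.append(p)
        return bytes(out)


def build_encryption_header(crc, password):
    """Writer side: 11 random bytes plus the high byte of the CRC, encrypted."""
    plain = os.urandom(11) + bytes([(crc >> 24) & 0xFF])
    return ZipCrypto(password).encrypt(plain)


def check_byte_matches(header, crc, password):
    """Strict reader side: decrypt the header and compare its last byte."""
    plain = ZipCrypto(password).decrypt(header[:12])
    return plain[11] == ((crc >> 24) & 0xFF)


crc = 0x322DB61B  # CRC-32 stored in the local header of the question's archive
good = build_encryption_header(crc, b"111111")
bad = build_encryption_header(0, b"111111")  # writer was handed a zero CRC

print(check_byte_matches(good, crc, b"111111"))  # True
print(check_byte_matches(bad, crc, b"111111"))   # False
```

The same check_byte_matches call could be pointed at the first 12 bytes of the file data in either dump to test the diagnosis against a real archive.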

BetterZip and 7-Zip are simply not checking the end of the encryption header. As a result, they do not notice that the encryption header does not comply with the specification, and they go on to decrypt the compressed data correctly. UnZip and WinZip, on the other hand, detect the bug.