I'm investigating a bug report on my open source UnzipKit project. Basically, when writing files encrypted with a password to a Zip file, the resulting archives are readable by some Zip clients, and not others.
UnzipKit writes the password as a UTF-8 string, using the MiniZip wrapper around zlib, which only supports "Traditional PKWARE Encryption", not AES. It's using the zipOpenNewFileInZip3() MiniZip function to open the file for writing.
It's readable by BetterZip and UnzipKit on the Mac, as well as 7zip on Windows. However, WinZip (Mac and Windows) and the Mac's unzip command-line app complain about an incorrect password.
For testing purposes, I'm encrypting the files using 111111 as the password, as indicated in the bug report. I tried changing the text encoding to ASCII, and Latin 1 (CP-1252), but that didn't seem to make a difference.
I'm working on getting familiar with the way Zip files work, but this still seems mysterious to me. What could I be doing wrong to cause it to work in some clients and not in others? I would expect it to work or be broken across the board.
Here is the hex dump of a zip file that fails to unzip:
50 4B 03 04 14 00 01 00 08 00 B7 54 D1 46 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 37 52 6F F1 31 B6 6E 3D 76 CD 3A 67 0E FF 08 42 C9 4D 61 74 C1 27 DF CB BE 24 41 46 56 60 89 C2 07 97 56 C9 2A 50 80 86 15 E2 62 66 90 77 20 50 4B 01 02 00 00 14 00 01 00 08 00 B7 54 D1 46 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 50 4B 05 06 00 00 00 00 01 00 01 00 3D 00 00 00 5C 00 00 00 00 00
This is what I get on the command line, with a return code of 82:
$ unzip -P 111111 PasswordProtected.zip
Archive: PasswordProtected.zip
skipping: Test File A.txt incorrect password
Update
I created an archive of the same file with WinZip for Mac, with the same password on the file. This is its hex dump:
50 4B 03 04 14 00 03 00 08 00 27 BA 76 44 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 1C 68 5F 1E FF CA 3A 6C D5 B6 01 28 0F 72 83 D9 01 9B BA 87 51 50 1F 66 61 83 43 E8 64 58 B6 ED A6 F0 9B 3B 87 89 70 F2 4F D9 AB 21 6A 6A 06 50 4B 01 02 14 03 14 00 03 00 08 00 27 BA 76 44 1B B6 2D 32 2F 00 00 00 21 00 00 00 0F 00 00 00 00 00 00 00 01 00 00 00 80 81 00 00 00 00 54 65 73 74 20 46 69 6C 65 20 41 2E 74 78 74 50 4B 05 06 00 00 00 00 01 00 01 00 3D 00 00 00 5C 00 00 00 00 00
The biggest difference is that the file data is completely different, meaning it was encrypted with a different key. Also, the general purpose bit flag indicates it used Maximum, rather than Normal, compression. Just in case the rest offers any clues, this is a summary of the differences, annotated with the field names provided by the spec.
Local file header
| Field Name | UnzipKit Bytes | WinZip Bytes |
|---|---|---|
| general purpose bit flag | 01 00 | 03 00 |
| last mod file time | B7 54 | 27 BA |
| last mod file date | D1 46 | 76 44 |
File data
UnzipKit: 37 52 6F F1 31 B6 6E 3D 76 CD 3A 67 0E FF 08 42 C9 4D 61 74 C1 27 DF CB BE 24 41 46 56 60 89 C2 07 97 56 C9 2A 50 80 86 15 E2 62 66 90 77 20
WinZip: 1C 68 5F 1E FF CA 3A 6C D5 B6 01 28 0F 72 83 D9 01 9B BA 87 51 50 1F 66 61 83 43 E8 64 58 B6 ED A6 F0 9B 3B 87 89 70 F2 4F D9 AB 21 6A 6A 06
Central directory structure
| Field Name | UnzipKit Bytes | WinZip Bytes |
|---|---|---|
| version made by | 00 00 | 14 03 |
| general purpose bit flag | 01 00 | 03 00 |
| last mod file time | B7 54 | 27 BA |
| last mod file date | D1 46 | 76 44 |
| external file attributes | 00 00 00 00 | 00 00 80 81 |
All of the following fields match 100%:
- Local file header
  - version needed to extract
  - compression method
  - crc-32
  - compressed size
  - uncompressed size
  - file name length
  - extra field length
  - file name
- Central directory structure
  - version needed to extract
  - compression method
  - crc-32
  - compressed size
  - uncompressed size
  - file name length
  - extra field length
  - file comment length
  - disk number start
  - internal file attributes
  - relative offset of local header
  - file name
- The entire "End of central directory record"
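The field comparison above can be reproduced with a small parser. This is only a sketch: the struct and function names (local_header, parse_local_header, rd16, rd32) are made up for illustration, and the two byte arrays are simply the first 30 bytes of each hex dump shown earlier. It reads the fixed-size part of a local file header as little-endian fields.

```c
#include <stdint.h>
#include <stddef.h>

/* Fixed-size part of a Zip local file header (field names follow the spec). */
typedef struct {
    uint16_t version_needed, flags, method, mod_time, mod_date;
    uint32_t crc32, compressed_size, uncompressed_size;
    uint16_t name_len, extra_len;
} local_header;

/* Zip stores all multi-byte fields little-endian. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the signature is wrong. */
static int parse_local_header(const uint8_t *buf, size_t len, local_header *h) {
    if (len < 30 || rd32(buf) != 0x04034B50u) return -1;  /* "PK\3\4" */
    h->version_needed    = rd16(buf + 4);
    h->flags             = rd16(buf + 6);   /* bit 0 set means "encrypted" */
    h->method            = rd16(buf + 8);   /* 8 = deflate */
    h->mod_time          = rd16(buf + 10);
    h->mod_date          = rd16(buf + 12);
    h->crc32             = rd32(buf + 14);
    h->compressed_size   = rd32(buf + 18);
    h->uncompressed_size = rd32(buf + 22);
    h->name_len          = rd16(buf + 26);
    h->extra_len         = rd16(buf + 28);
    return 0;
}

/* First 30 bytes of the two hex dumps in this post. */
static const uint8_t unzipkit_local[30] = {
    0x50,0x4B,0x03,0x04,0x14,0x00,0x01,0x00,0x08,0x00,0xB7,0x54,0xD1,0x46,0x1B,
    0xB6,0x2D,0x32,0x2F,0x00,0x00,0x00,0x21,0x00,0x00,0x00,0x0F,0x00,0x00,0x00
};
static const uint8_t winzip_local[30] = {
    0x50,0x4B,0x03,0x04,0x14,0x00,0x03,0x00,0x08,0x00,0x27,0xBA,0x76,0x44,0x1B,
    0xB6,0x2D,0x32,0x2F,0x00,0x00,0x00,0x21,0x00,0x00,0x00,0x0F,0x00,0x00,0x00
};
```

Parsing both arrays gives the same crc-32 (0x322DB61B) and the same sizes, with only the flags and timestamps differing. The compressed size of 47 bytes counts the encrypted data exactly as listed under "File data", which for a traditionally encrypted entry includes the 12-byte encryption header.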
I'm guessing that the last parameter in your zipOpenNewFileInZip3() call is zero. It is supposed to be the CRC of the file.

When a compressed entry is encrypted, it is preceded by a 12-byte encryption header. That header is composed of 10 or 11 random bytes followed by 2 or 1 bytes of the high end of that file's CRC. (1 vs. 2 bytes is determined by the version of the zip format.) The header is then encrypted using the password, and encryption continues from there on the compressed data. This allows the unzipper to check the password by comparing the end of the decrypted encryption header with the CRC stored in the local header that precedes it.
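To make the header mechanics concrete, here is a rough sketch of the traditional PKWARE key schedule and the 12-byte header, written from the description in the Zip spec rather than copied from MiniZip. All names (zc_keys, zc_make_header, zc_header_check_byte, and so on) are invented for this illustration, it uses the 2-byte CRC variant, and the caller supplies the 10 "random" bytes so the behaviour is repeatable.

```c
#include <stdint.h>
#include <stddef.h>

/* One byte of the standard CRC-32 update (polynomial 0xEDB88320), done bitwise. */
static uint32_t zc_crc32_step(uint32_t crc, uint8_t b) {
    crc ^= b;
    for (int i = 0; i < 8; i++)
        crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    return crc;
}

typedef struct { uint32_t k0, k1, k2; } zc_keys;

/* Mix one plaintext byte into the three keys. */
static void zc_update(zc_keys *k, uint8_t plain) {
    k->k0 = zc_crc32_step(k->k0, plain);
    k->k1 = (k->k1 + (k->k0 & 0xFFu)) * 134775813u + 1u;
    k->k2 = zc_crc32_step(k->k2, (uint8_t)(k->k1 >> 24));
}

/* Keys start from fixed constants and absorb the password bytes. */
static void zc_init(zc_keys *k, const char *password) {
    k->k0 = 0x12345678u; k->k1 = 0x23456789u; k->k2 = 0x34567890u;
    for (const char *p = password; *p; p++) zc_update(k, (uint8_t)*p);
}

/* Next keystream byte, derived from the low 16 bits of key 2. */
static uint8_t zc_stream_byte(const zc_keys *k) {
    uint32_t t = (k->k2 & 0xFFFFu) | 2u;
    return (uint8_t)((t * (t ^ 1u)) >> 8);
}

/* Build and encrypt a header: 10 caller-supplied bytes + 2 high CRC bytes. */
static void zc_make_header(const char *password, uint32_t crc_for_crypting,
                           const uint8_t random10[10], uint8_t out[12]) {
    uint8_t plain[12];
    zc_keys k;
    for (int i = 0; i < 10; i++) plain[i] = random10[i];
    plain[10] = (uint8_t)(crc_for_crypting >> 16);
    plain[11] = (uint8_t)(crc_for_crypting >> 24);
    zc_init(&k, password);
    for (int i = 0; i < 12; i++) {
        out[i] = (uint8_t)(plain[i] ^ zc_stream_byte(&k));
        zc_update(&k, plain[i]);
    }
}

/* Decrypt a header and return its last byte; a strict unzipper compares
   this with the high byte of the CRC stored in the local header. */
static uint8_t zc_header_check_byte(const char *password, const uint8_t enc[12]) {
    zc_keys k;
    uint8_t plain = 0;
    zc_init(&k, password);
    for (int i = 0; i < 12; i++) {
        plain = (uint8_t)(enc[i] ^ zc_stream_byte(&k));
        zc_update(&k, plain);
    }
    return plain;
}
```

With the CRC from the archives in the question (0x322DB61B), a header built from the real CRC decrypts to a check byte of 0x32, while one built with a CRC argument of zero decrypts to 0x00. Calling zc_header_check_byte with "111111" on the first 12 bytes of each archive's file data would be one way to test the guess above.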
What is happening with BetterZip and 7Zip is that they are simply not checking the end of the encryption header. They then do not notice that the encryption header does not comply with the specification, and continue to correctly decrypt the compressed data. UnZip and WinZip on the other hand detect the bug.
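If the guess is right, the fix is to compute the CRC-32 of the uncompressed file contents before opening the entry, and pass it as that last argument. zlib's own crc32() function can do this; the sketch below uses a self-contained bitwise version instead so it compiles without zlib. The name crc32_buffer and the variables in the commented call site are placeholders, not UnzipKit code.

```c
#include <stdint.h>
#include <stddef.h>

/* Standard CRC-32 as used by Zip: initial value and final XOR of 0xFFFFFFFF,
   reflected polynomial 0xEDB88320. Slow but dependency-free. */
static uint32_t crc32_buffer(const uint8_t *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical call site (zf, name, info, data, len, password are placeholders):
 *
 *   uLong crc = crc32_buffer(data, len);   // CRC of the UNCOMPRESSED data
 *   zipOpenNewFileInZip3(zf, name, &info,
 *                        NULL, 0, NULL, 0, NULL,
 *                        Z_DEFLATED, Z_DEFAULT_COMPRESSION, 0,
 *                        -MAX_WBITS, DEF_MEM_LEVEL, Z_DEFAULT_STRATEGY,
 *                        password, crc);   // last argument: crcForCrypting
 */
```

This means the whole file has to be read (or at least checksummed) before the encrypted entry is opened, since the CRC goes into the header that precedes the data.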