[File] [PATCH] of Magdir/zip for Zip archive data; version + method
Jörg Jenderek
joerg.jen.der.ek at gmx.net
Tue Mar 3 00:27:27 UTC 2020
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
some days ago I handled many ZIP archives. When running the file
command on them I sometimes got strange results. For comparison I
downloaded well-known ZIP archives from the Info-ZIP FTP server. When
running file version 5.38 with the -k option on these archives I get
output like:
acorn\zcr22xr.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v?[0xd16],
extract using at least v1.0,
last modified Sun Oct 30 19:09:42 1988,
method=store
amiga\zcr21xa.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v?[0x115],
extract using at least v2.0,
last modified Thu May 14 22:42:53 1987,
uncompressed size 12337,
method=deflate
BeOS-FormatFloppy.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v?[0x1017],
extract using at least v1.0,
last modified Thu Aug 29 09:18:49 1996,
method=store
C64CPMSRC.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v?[0],
extract using at least v2.0,
last modified Mon Oct 08 22:42:55 2007,
method=store
cbasic2.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v1.0,
extract using at least v1.0,
last modified Thu Jan 01 00:00:00 1970,
uncompressed size 20992,
method=[0x6]
freehep-vectorgraphics-svg-2.1.1b-src-diff.zip: Zip archive data,
at least v?[0x314] to extract
Zip archive data, made by v?[0x33f],
extract using at least v?[0x314],
last modified Mon Jul 25 07:46:41 2005,
method=store
lh2_222.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v?[0x613],
extract using at least v2.0,
last modified Wed Jan 11 19:31:18 1984,
uncompressed size 56743,
method=deflate
MacZip10_JLEE.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v?[0x70a],
extract using at least v1.0,
last modified Thu Sep 07 01:06:42 1989,
uncompressed size 750,
method=[0x1]
mvs\unz532xm-docs.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v?[0xb16],
extract using at least v2.0,
last modified Sun May 28 09:20:05 1989,
uncompressed size 4513,
method=deflate
tandem\zip23xk.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v?[0x1117],
extract using at least v2.0,
last modified Sun Jan 06 00:41:37 1991,
uncompressed size 608256,
method=deflate
vmcms\unz532cms-doc.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v?[0x416],
extract using at least v2.0,
last modified Sun May 28 09:20:05 1989,
uncompressed size 4513,
method=deflate
vms\touchx.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v?[0x213],
extract using at least v2.0,
last modified Wed Feb 16 04:44:21 1983,
uncompressed size 17920,
method=deflate
For most of the inspected samples the file command was not able to
recognise the version stored inside the ZIP archive. Instead it
displays the version value in hexadecimal, starting with the phrase
"v?[".
A good starting point for information is the ZIP file format article
on Wikipedia. More information can be found in the APPNOTE from PKWARE.
So I added corresponding comment lines to Magdir/zip:
# URL: https://en.wikipedia.org/wiki/Zip_(file_format)
# reference: pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.6.TXT
For example, for MacZip10_JLEE.zip 0x70a is shown instead of the
correct version 1.0. According to the APPNOTE the version information
is stored as a 2-byte little-endian integer, where the lower byte
indicates the ZIP specification version supported by the software
that created the file. So one way to correct this is to change the
line inside the zipversion sub-function from
>0 leshort 0x0a v1.0
into a line like
>0 leshort&0x00FF 0x0a v1.0
The same must be done for every version besides 1.0.
Luckily there is a more efficient way to display the version
information. According to the documentation this byte value divided
by 10 gives the major version number, and the value mod 10 is the
minor version number. So the version is now shown by replacing the
old 17 lines with just 2 lines:
>0 ubyte/10 x v%u
>0 ubyte%10 x \b.%u
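As a rough cross-check of this arithmetic, here is a minimal Python
sketch. The function name decode_zip_version is made up for this
illustration and is not part of file or libmagic; the input values are
the raw "made by" fields from the listing above.

```python
def decode_zip_version(field: int) -> str:
    """Render the low byte of a 2-byte ZIP version field as 'vMAJOR.MINOR'.

    Hypothetical helper for illustration only; it mirrors the two magic
    lines 'ubyte/10' and 'ubyte%10'.
    """
    low = field & 0xFF  # lower byte holds the version multiplied by 10
    return f"v{low // 10}.{low % 10}"


# Raw "made by" values taken from the file -k output shown earlier
for raw in (0x070A, 0x0D16, 0x033F):
    print(hex(raw), decode_zip_version(raw))
```

For 0x70a the low byte is 0x0a = 10, which gives v1.0 regardless of
the host byte 0x07 in the upper half.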
The upper byte indicates the compatibility of the file attribute
information. If the external file attributes are compatible with
MS-DOS and can be read by PKZIP for DOS version 2.04g, this value
is zero. That is true for many ZIP archives, so for those the error
in the zipversion function was not a problem.
If the attributes are not compatible, this value identifies the host
system with which the attributes are compatible. Software can use
this information to determine, for example, the line record format
of text files. So I show this information with an additional function
ziphost, starting with the lines
0 name ziphost
#>1 ubyte 0 DOS
>1 ubyte 1 Amiga
>1 ubyte 2 OpenVMS
Unfortunately I found examples for only 11 host cases. The 9
remaining cases, for which I found no examples, start with the line
>1 ubyte 5 Atari ST
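The host lookup can be sketched in Python as a plain table keyed by
the upper byte. The names are copied from the ziphost lines in the
attached patch; the names ZIP_HOSTS and zip_host are hypothetical and
exist only in this illustration. Like the patch, the sketch prints
nothing for DOS (0) and for values above 19.

```python
# Host names as used by the ziphost lines of the attached patch
ZIP_HOSTS = {
    1: "Amiga", 2: "OpenVMS", 3: "UNIX", 4: "VM/CMS", 5: "Atari ST",
    6: "OS/2", 7: "Macintosh", 8: "Z-System", 9: "CP/M",
    10: "Windows NTFS", 11: "MVS", 12: "VSE", 13: "Acorn Risc",
    14: "VFAT", 15: "alternate MVS", 16: "BeOS", 17: "Tandem",
    18: "OS/400", 19: "OS X",
}


def zip_host(field: int) -> str:
    """Return the host name for the upper byte of a 'made by' field.

    Hypothetical helper; returns an empty string for DOS (0) and for
    unassigned codes, because the patch leaves those lines commented out.
    """
    code = (field >> 8) & 0xFF
    return ZIP_HOSTS.get(code, "")


print(zip_host(0x0D16))  # upper byte 0x0d = 13
```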
So the host name is now shown after the "made by" ZIP version text
by an additional call of ziphost inside the Zip central directory
record function, which now looks like
0 name zipcd
>0 string PK\001\002 Zip archive data
>>4 leshort x \b, made by
>>4 use zipversion
>>4 use ziphost
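The offsets 4, 6 and 10 used by zipcd can be illustrated with a short
Python sketch. The 12-byte sample is hand-built for this illustration
(it imitates the "made by" value 0x70a of MacZip10_JLEE.zip and is not
read from a real file), and read_cd_fields is a hypothetical helper
name.

```python
import struct


def read_cd_fields(record: bytes) -> dict:
    """Extract the 16-bit fields that zipcd inspects.

    Hypothetical helper; offsets 4, 6 and 10 correspond to the
    '>>4', '>>6' and '>>10' lines of the magic.
    """
    if record[:4] != b"PK\x01\x02":
        raise ValueError("not a central directory file header")
    (made_by,) = struct.unpack_from("<H", record, 4)   # version made by
    (needed,) = struct.unpack_from("<H", record, 6)    # version needed to extract
    (method,) = struct.unpack_from("<H", record, 10)   # compression method
    return {"made_by": made_by, "needed": needed, "method": method}


# Assumed sample: signature, made by 0x070a, needed 0x000a, flags 0, method 1
sample = b"PK\x01\x02" + struct.pack("<HHHH", 0x070A, 0x000A, 0, 1)
fields = read_cd_fields(sample)
print(fields)
```

Both zipversion and ziphost are called at offset 4 because they look
at the two bytes of the same little-endian field: the byte at offset 4
is the version and the byte at offset 5 is the host.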
A similar problem appears for the compression method. For many ZIP
archives it is reported in text form, as in the example lh2_222.zip,
where the compression method is shown by the phrase
"method=deflate". In some cases, however, the compression method is
only shown as a hexadecimal value starting with the phrase "method=[0x".
The current magic lines look like
>>10 leshort x \b, method=
>>10 use zipcompression
#
0 name zipcompression
>0 leshort 0 \bstore
...
>0 leshort 99 \bAES Encrypted
>0 default x
>>0 leshort x \b[%#x]
According to the APPNOTE I added 7 additional methods to
zipcompression with lines like
>0 leshort 1 \bShrinking
>0 leshort 6 \bImploding
Unfortunately I found examples only for the methods Shrinking and
Imploding.
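The method lookup with its hexadecimal fallback can be sketched in
Python as follows. The table contains only the methods visible in this
mail and in the attached patch, so it is not a complete list; the names
ZIP_METHODS and zip_method are hypothetical.

```python
# Methods visible in this mail and the attached patch (not exhaustive)
ZIP_METHODS = {
    0: "store", 1: "Shrinking", 6: "Imploding", 7: "Tokenizing",
    8: "deflate", 9: "deflate64", 10: "Library imploding",
    12: "bzip2", 14: "lzma", 16: "CMPSC Compression",
    18: "IBM TERSE", 19: "IBM LZ77",
    94: "MP3", 95: "xz", 96: "Jpeg", 99: "AES Encrypted",
}


def zip_method(method: int) -> str:
    """Return the method name, or a '[0x..]' fallback like the default branch.

    Hypothetical helper for illustration only.
    """
    return ZIP_METHODS.get(method, f"[{method:#x}]")


for m in (1, 6, 8, 2):
    print(f"method={zip_method(m)}")
```

Methods 1 and 6 fell through to the fallback before the patch, which
is where the "method=[0x1]" and "method=[0x6]" outputs came from.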
After applying the above-mentioned modifications with the patch
file-5.38-zip-version.diff I get output with the correct ZIP
version and a more human-readable compression method text, like:
acorn\zcr22xr.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v2.2 Acorn Risc,
extract using at least v1.0,
last modified Sun Oct 30 19:09:42 1988,
method=store
amiga\zcr21xa.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v2.1 Amiga,
extract using at least v2.0,
last modified Thu May 14 22:42:53 1987,
uncompressed size 12337,
method=deflate
BeOS-FormatFloppy.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v2.3 BeOS,
extract using at least v1.0,
last modified Thu Aug 29 09:18:49 1996,
method=store
C64CPMSRC.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v0.0,
extract using at least v2.0,
last modified Mon Oct 08 22:42:55 2007,
method=store
cbasic2.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v1.0,
extract using at least v1.0,
last modified Thu Jan 01 00:00:00 1970,
uncompressed size 20992,
method=Imploding
freehep-vectorgraphics-svg-2.1.1b-src-diff.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v6.3 UNIX,
extract using at least v2.0,
last modified Mon Jul 25 07:46:41 2005,
method=store
lh2_222.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v1.9 OS/2,
extract using at least v2.0,
last modified Wed Jan 11 19:31:18 1984,
uncompressed size 56743,
method=deflate
MacZip10_JLEE.zip: Zip archive data,
at least v1.0 to extract
Zip archive data, made by v1.0 Macintosh,
extract using at least v1.0,
last modified Thu Sep 07 01:06:42 1989,
uncompressed size 750,
method=Shrinking
mvs\unz532xm-docs.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v2.2 MVS,
extract using at least v2.0,
last modified Sun May 28 09:20:05 1989,
uncompressed size 4513,
method=deflate
tandem\zip23xk.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v2.3 Tandem,
extract using at least v2.0,
last modified Sun Jan 06 00:41:37 1991,
uncompressed size 608256,
method=deflate
vmcms\unz532cms-doc.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v2.2 VM/CMS,
extract using at least v2.0,
last modified Sun May 28 09:20:05 1989,
uncompressed size 4513,
method=deflate
vms\touchx.zip: Zip archive data,
at least v2.0 to extract
Zip archive data, made by v1.9 OpenVMS,
extract using at least v2.0,
last modified Wed Feb 16 04:44:21 1983,
uncompressed size 17920,
method=deflate
I hope my diff file can be applied in a future version of the
file utility.
With best wishes
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCXl2kSgAKCRCv8rHJQhrU
1p/NAJ9Hbghro9tkSDV7tO/f2g6+ikY17wCgqr9LpAjKJGyKXApjhJil8KM0mDw=
=TG6c
-----END PGP SIGNATURE-----
-------------- next part --------------
--- file-5.38/magic/Magdir/zip.old 2019-07-16 12:04:50 +0000
+++ file-5.38/magic/Magdir/zip 2020-03-02 21:50:16 +0000
@@ -4,11 +4,12 @@
# Note the version of magic in archive is currently stronger, this is
# just an example until negative offsets are supported better
-# Zip Central Cirectory record
+# Zip Central Directory record
0 name zipcd
>0 string PK\001\002 Zip archive data
>>4 leshort x \b, made by
>>4 use zipversion
+>>4 use ziphost
>>6 leshort x \b, extract using at least
>>6 use zipversion
>>12 ledate x \b, last modified %s
@@ -16,13 +17,27 @@
>>10 leshort x \b, method=
>>10 use zipcompression
+# URL: https://en.wikipedia.org/wiki/Zip_(file_format)
+# reference: https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.6.TXT
# Zip known compressions
0 name zipcompression
>0 leshort 0 \bstore
+>0 leshort 1 \bShrinking
+>0 leshort 6 \bImploding
+>0 leshort 7 \bTokenizing
>0 leshort 8 \bdeflate
>0 leshort 9 \bdeflate64
+>0 leshort 10 \bLibrary imploding
+#>0 leshort 11 \bReserved by PKWARE
>0 leshort 12 \bbzip2
+#>0 leshort 13 \bReserved by PKWARE
>0 leshort 14 \blzma
+#>0 leshort 15 \bReserved by PKWARE
+>0 leshort 16 \bCMPSC Compression
+#>0 leshort 17 \bReserved by PKWARE
+>0 leshort 18 \bIBM TERSE
+>0 leshort 19 \bIBM LZ77
+# https://support.winzip.com/hc/en-us/articles/115012122828-Compression-method-used-for-this-file-is-94
>0 leshort 94 \bMP3
>0 leshort 95 \bxz
>0 leshort 96 \bJpeg
@@ -34,23 +49,55 @@
# Zip known versions
0 name zipversion
->0 leshort 0x09 v0.9
->0 leshort 0x0a v1.0
->0 leshort 0x0b v1.1
->0 leshort 0x14 v2.0
->0 leshort 0x15 v2.1
->0 leshort 0x19 v2.5
->0 leshort 0x1b v2.7
->0 leshort 0x2d v4.5
->0 leshort 0x2e v4.6
->0 leshort 0x32 v5.0
->0 leshort 0x33 v5.1
->0 leshort 0x34 v5.2
->0 leshort 0x3d v6.1
->0 leshort 0x3e v6.2
->0 leshort 0x3f v6.3
->0 default x
->>0 leshort x v?[%#x]
+# The lower byte indicates the ZIP version of this file. The value/10 indicates
+# the major version number, and the value mod 10 is the minor version number.
+>0 ubyte/10 x v%u
+>0 ubyte%10 x \b.%u
+# >0 leshort 0x09 v0.9
+# >0 leshort 0x0a v1.0
+# >0 leshort 0x0b v1.1
+# >0 leshort 0x14 v2.0
+# >0 leshort 0x15 v2.1
+# >0 leshort 0x19 v2.5
+# >0 leshort 0x1b v2.7
+# >0 leshort 0x2d v4.5
+# >0 leshort 0x2e v4.6
+# >0 leshort 0x32 v5.0
+# >0 leshort 0x33 v5.1
+# >0 leshort 0x34 v5.2
+# >0 leshort 0x3d v6.1
+# >0 leshort 0x3e v6.2
+# >0 leshort 0x3f v6.3
+# >0 default x
+# >>0 leshort x v?[%#x]
+
+# display compatible host system name of ZIP archive
+0 name ziphost
+# The upper byte indicates the compatibility of the file attribute information.
+# If the file is compatible with MS-DOS (v 2.04g) then this value will be zero.
+#>1 ubyte 0 DOS
+>1 ubyte 1 Amiga
+>1 ubyte 2 OpenVMS
+>1 ubyte 3 UNIX
+>1 ubyte 4 VM/CMS
+>1 ubyte 6 OS/2
+>1 ubyte 7 Macintosh
+>1 ubyte 11 MVS
+>1 ubyte 13 Acorn Risc
+>1 ubyte 16 BeOS
+>1 ubyte 17 Tandem
+# 9 untested
+>1 ubyte 5 Atari ST
+>1 ubyte 8 Z-System
+>1 ubyte 9 CP/M
+>1 ubyte 10 Windows NTFS
+>1 ubyte 12 VSE
+>1 ubyte 14 VFAT
+>1 ubyte 15 alternate MVS
+>1 ubyte 18 OS/400
+>1 ubyte 19 OS X
+# unused
+#>1 ubyte >19 unused 0x%x
# Zip End Of Central Directory record
-22 string PK\005\006
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.38-zip-version.diff.sig
Type: application/octet-stream
Size: 95 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20200303/8f0dc5e7/attachment.obj>