[File] [PATCH] of Magdir/zip for Zip archive data; version + method

Jörg Jenderek joerg.jen.der.ek at gmx.net
Tue Mar 3 00:27:27 UTC 2020


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Some days ago I handled many ZIP archives. When running the file command
I sometimes got strange results. For comparison I downloaded
well-known ZIP archives from the Info-ZIP FTP server. Running the file
command version 5.38 with the -k option on such ZIP archives gives
output like:

acorn\zcr22xr.zip:                              Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v?[0xd16],
	extract using at least v1.0,
	last modified Sun Oct 30 19:09:42 1988,
	method=store
amiga\zcr21xa.zip:                              Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v?[0x115],
	extract using at least v2.0,
	last modified Thu May 14 22:42:53 1987,
	uncompressed size 12337,
	method=deflate
BeOS-FormatFloppy.zip:                          Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v?[0x1017],
	extract using at least v1.0,
	last modified Thu Aug 29 09:18:49 1996,
	method=store
C64CPMSRC.zip:                                  Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v?[0],
	extract using at least v2.0,
	last modified Mon Oct 08 22:42:55 2007,
	method=store
cbasic2.zip:                                    Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v1.0,
	extract using at least v1.0,
	last modified Thu Jan 01 00:00:00 1970,
	uncompressed size 20992,
	method=[0x6]
freehep-vectorgraphics-svg-2.1.1b-src-diff.zip: Zip archive data,
	at least v?[0x314] to extract
	Zip archive data, made by v?[0x33f],
	extract using at least v?[0x314],
	last modified Mon Jul 25 07:46:41 2005,
	method=store
lh2_222.zip:                                    Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v?[0x613],
	extract using at least v2.0,
	last modified Wed Jan 11 19:31:18 1984,
	uncompressed size 56743,
	method=deflate
MacZip10_JLEE.zip:                              Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v?[0x70a],
	extract using at least v1.0,
	last modified Thu Sep 07 01:06:42 1989,
	uncompressed size 750,
	method=[0x1]
mvs\unz532xm-docs.zip:                          Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v?[0xb16],
	extract using at least v2.0,
	last modified Sun May 28 09:20:05 1989,
	uncompressed size 4513,
	method=deflate
tandem\zip23xk.zip:                             Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v?[0x1117],
	extract using at least v2.0,
	last modified Sun Jan 06 00:41:37 1991,
	uncompressed size 608256,
	method=deflate
vmcms\unz532cms-doc.zip:                        Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v?[0x416],
	extract using at least v2.0,
	last modified Sun May 28 09:20:05 1989,
	uncompressed size 4513,
	method=deflate
vms\touchx.zip:                                 Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v?[0x213],
	extract using at least v2.0,
	last modified Wed Feb 16 04:44:21 1983,
	uncompressed size 17920,
	method=deflate

For most of the inspected samples the file command was not able to
recognise the version stored inside the ZIP archive. Instead it displays
the version value as a hexadecimal number starting with the phrase "v?[".

A good starting point for information is the Zip file format article on
Wikipedia. More information can be found in the APPNOTE from PKWARE.
So I added corresponding comment lines to Magdir/zip, like
 # URL:       https://en.wikipedia.org/wiki/Zip_(file_format)
 # reference: https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.6.TXT

For example, for MacZip10_JLEE.zip the value 0x70a is shown instead of
the correct version 1.0. According to the APPNOTE the version
information is stored as a 2-byte little-endian integer, where the
lower byte indicates the ZIP specification version supported by the
encoding software. So one way to correct this is to change the line
inside the zipversion sub function from
 >0	leshort			0x0a		v1.0
into a line like
 >0	leshort&0x00FF		0x0a		v1.0
This would also have to be done for every version besides 1.0.
Luckily there is a more efficient way to display the version
information. According to the documentation this byte value divided by
10 indicates the major version number, and the value mod 10 is the
minor version number. So the version information is now shown by
replacing the old 17 lines with just 2 lines like:
 >0	ubyte/10	x		v%u
 >0	ubyte%10	x		\b.%u
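
As a rough illustration of the arithmetic above (it is not part of the
patch), the following Python sketch decodes the lower byte in the same
way as the two magic lines. The function name zip_version is made up for
this example; the sample values are taken from the listing above.

```python
def zip_version(field: int) -> str:
    """Format the low byte of a 2-byte ZIP version field as 'vMAJOR.MINOR'."""
    low = field & 0xFF  # the upper byte (host system) is ignored here
    # value / 10 is the major number, value mod 10 is the minor number
    return f"v{low // 10}.{low % 10}"


# "made by" values reported by file 5.38 in the listing above
for name, field in [("MacZip10_JLEE.zip", 0x070A), ("zcr22xr.zip", 0x0D16)]:
    print(name, zip_version(field))
```

Masking with 0xFF plays the role of reading a ubyte at offset 0 of the
little-endian field, so the host byte no longer disturbs the result.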

The upper byte indicates the compatibility of the file attribute
information. If the external file attributes are compatible with
MS-DOS and can be read by PKZIP for DOS version 2.04g then this value
will be zero. That is true for many ZIP archives, so for them the error
in the zipversion function was not a problem.
If these attributes are not compatible, then this value identifies
the host system on which the attributes are compatible. Software can
use this information to determine, for example, the line record format
for text files. So this information is now shown by an additional
function ziphost starting with lines
 0	name		ziphost
 #>1	ubyte		0		DOS
 >1	ubyte		1 		Amiga
 >1	ubyte		2		OpenVMS
Unfortunately I only found examples for 11 host cases. The 9
remaining cases, for which I found no examples, start with lines
 >1	ubyte		5		Atari ST
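
The host lookup can be sketched in Python as follows. This is only an
illustration under assumptions: the names ZIP_HOSTS and zip_host are
invented here, the table repeats the host names used in the attached
patch, and unlike the patch (which leaves DOS commented out and prints
nothing for it) the sketch returns "DOS" for the value 0.

```python
# Host system codes (upper byte of "version made by"), names as in the patch.
ZIP_HOSTS = {
    0: "DOS", 1: "Amiga", 2: "OpenVMS", 3: "UNIX", 4: "VM/CMS",
    5: "Atari ST", 6: "OS/2", 7: "Macintosh", 8: "Z-System", 9: "CP/M",
    10: "Windows NTFS", 11: "MVS", 12: "VSE", 13: "Acorn Risc",
    14: "VFAT", 15: "alternate MVS", 16: "BeOS", 17: "Tandem",
    18: "OS/400", 19: "OS X",
}


def zip_host(field: int) -> str:
    """Return the host system name stored in the upper byte of the field."""
    code = (field >> 8) & 0xFF
    # Codes above 19 are marked as unused in the patch
    return ZIP_HOSTS.get(code, f"unused 0x{code:x}")


print(zip_host(0x0D16))  # value shown for acorn\zcr22xr.zip above
```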

The host name is now shown after the "made by" zip version text by
additionally calling ziphost inside the Zip central directory record
function. This now looks like
 0	name		zipcd
 >0	string		PK\001\002	Zip archive data
 >>4	leshort		x		\b, made by
 >>4	use		zipversion
 >>4	use		ziphost
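
For readers who want to check the offsets used by zipcd outside of
file, here is a minimal Python sketch that reads the same fields with
the struct module. The function name parse_zipcd is hypothetical, and
the sample bytes are built by hand: they reuse the "made by" value,
"extract" version and method reported for MacZip10_JLEE.zip, while the
flags value 0 is an assumption.

```python
import struct


def parse_zipcd(record: bytes):
    """Return (made_by, needed, method) from the start of a central directory record."""
    # offset 0: signature, 4: version made by, 6: version needed,
    # offset 8: general purpose flags, 10: compression method
    sig, made_by, needed, _flags, method = struct.unpack_from("<4sHHHH", record, 0)
    if sig != b"PK\x01\x02":
        raise ValueError("not a ZIP central directory record")
    return made_by, needed, method


# Hand-built 12-byte prefix; the flags value 0 is assumed, not from the archive
sample = b"PK\x01\x02" + struct.pack("<HHHH", 0x070A, 0x000A, 0, 1)
made_by, needed, method = parse_zipcd(sample)
print(hex(made_by), hex(needed), method)
```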

A similar problem appears for the compression method. For many ZIP
archives it is reported in text form, as in the example lh2_222.zip,
where the compression method is shown by the phrase
"method=deflate". But in some cases the compression method is only
shown as a hexadecimal value starting with the phrase "method=[0x".

The current magic lines look like
 >>10	leshort		x		\b, method=
 >>10	use		zipcompression
 #
 0	name		zipcompression
 >0	leshort		0		\bstore
 ...
 >0	leshort		99		\bAES Encrypted
 >0	default		x
 >>0	leshort		x		\b[%#x]

According to the APPNOTE I added 7 additional methods inside
zipcompression with lines like
 >0	leshort		1		\bShrinking
 >0	leshort		6		\bImploding
Unfortunately I found examples only for the methods Shrinking and
Imploding.
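
The lookup with its hexadecimal fallback can be pictured with this small
Python sketch. The names ZIP_METHODS and zip_method are invented for the
illustration, and the table only contains the methods that are visible
in this mail and in the attached diff, so it is not the complete list
from Magdir/zip.

```python
# Method codes visible in this mail and the attached diff (not exhaustive).
ZIP_METHODS = {
    0: "store", 1: "Shrinking", 6: "Imploding", 7: "Tokenizing",
    8: "deflate", 9: "deflate64", 10: "Library imploding", 12: "bzip2",
    14: "lzma", 16: "CMPSC Compression", 18: "IBM TERSE", 19: "IBM LZ77",
    94: "MP3", 95: "xz", 96: "Jpeg", 99: "AES Encrypted",
}


def zip_method(code: int) -> str:
    """Return the method name, or a hex placeholder for unknown codes."""
    # Unknown codes fall back to a bracketed hex value, similar to "[%#x]"
    return ZIP_METHODS.get(code, f"[{code:#x}]")


print("method=" + zip_method(6))  # cbasic2.zip used method 6
print("method=" + zip_method(1))  # MacZip10_JLEE.zip used method 1
```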

After applying the above-mentioned modifications with the patch
file-5.38-zip-version.diff I get output with the correct zip
version and a more human-readable compression method text like:
acorn\zcr22xr.zip:                              Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v2.2 Acorn Risc,
	extract using at least v1.0,
	last modified Sun Oct 30 19:09:42 1988,
	method=store
amiga\zcr21xa.zip:                              Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v2.1 Amiga,
	extract using at least v2.0,
	last modified Thu May 14 22:42:53 1987,
	uncompressed size 12337,
	method=deflate
BeOS-FormatFloppy.zip:                          Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v2.3 BeOS,
	extract using at least v1.0,
	last modified Thu Aug 29 09:18:49 1996,
	method=store
C64CPMSRC.zip:                                  Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v0.0,
	extract using at least v2.0,
	last modified Mon Oct 08 22:42:55 2007,
	method=store
cbasic2.zip:                                    Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v1.0,
	extract using at least v1.0,
	last modified Thu Jan 01 00:00:00 1970,
	uncompressed size 20992,
	method=Imploding
freehep-vectorgraphics-svg-2.1.1b-src-diff.zip: Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v6.3 UNIX,
	extract using at least v2.0,
	last modified Mon Jul 25 07:46:41 2005,
	method=store
lh2_222.zip:                                    Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v1.9 OS/2,
	extract using at least v2.0,
	last modified Wed Jan 11 19:31:18 1984,
	uncompressed size 56743,
	method=deflate
MacZip10_JLEE.zip:                              Zip archive data,
	at least v1.0 to extract
	Zip archive data, made by v1.0 Macintosh,
	extract using at least v1.0,
	last modified Thu Sep 07 01:06:42 1989,
	uncompressed size 750,
	method=Shrinking
mvs\unz532xm-docs.zip:                          Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v2.2 MVS,
	extract using at least v2.0,
	last modified Sun May 28 09:20:05 1989,
	uncompressed size 4513,
	method=deflate
tandem\zip23xk.zip:                             Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v2.3 Tandem,
	extract using at least v2.0,
	last modified Sun Jan 06 00:41:37 1991,
	uncompressed size 608256,
	method=deflate
vmcms\unz532cms-doc.zip:                        Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v2.2 VM/CMS,
	extract using at least v2.0,
	last modified Sun May 28 09:20:05 1989,
	uncompressed size 4513,
	method=deflate
vms\touchx.zip:                                 Zip archive data,
	at least v2.0 to extract
	Zip archive data, made by v1.9 OpenVMS,
	extract using at least v2.0,
	last modified Wed Feb 16 04:44:21 1983,
	uncompressed size 17920,
	method=deflate

I hope my diff file can be applied in a future version of the
file utility.

With best wishes
Jörg Jenderek
- --
Jörg Jenderek




-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCXl2kSgAKCRCv8rHJQhrU
1p/NAJ9Hbghro9tkSDV7tO/f2g6+ikY17wCgqr9LpAjKJGyKXApjhJil8KM0mDw=
=TG6c
-----END PGP SIGNATURE-----
-------------- next part --------------
--- file-5.38/magic/Magdir/zip.old	2019-07-16 12:04:50 +0000
+++ file-5.38/magic/Magdir/zip	2020-03-02 21:50:16 +0000
@@ -4,11 +4,12 @@
 # Note the version of magic in archive is currently stronger, this is
 # just an example until negative offsets are supported better
 
-# Zip Central Cirectory record
+# Zip Central Directory record
 0	name		zipcd
 >0	string		PK\001\002	Zip archive data
 >>4	leshort		x		\b, made by
 >>4	use		zipversion
+>>4	use		ziphost
 >>6	leshort		x		\b, extract using at least
 >>6	use		zipversion
 >>12	ledate		x		\b, last modified %s
@@ -16,13 +17,27 @@
 >>10	leshort		x		\b, method=
 >>10	use		zipcompression
 
+# URL:		https://en.wikipedia.org/wiki/Zip_(file_format)
+# reference:	https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.6.TXT
 # Zip known compressions
 0	name		zipcompression
 >0	leshort		0		\bstore
+>0	leshort		1		\bShrinking
+>0	leshort		6		\bImploding
+>0	leshort		7		\bTokenizing
 >0	leshort		8		\bdeflate
 >0	leshort		9		\bdeflate64
+>0	leshort		10		\bLibrary imploding
+#>0	leshort		11 		\bReserved by PKWARE
 >0	leshort		12		\bbzip2
+#>0	leshort		13 		\bReserved by PKWARE
 >0	leshort		14		\blzma
+#>0	leshort		15 		\bReserved by PKWARE
+>0	leshort		16		\bCMPSC Compression
+#>0	leshort		17 		\bReserved by PKWARE
+>0	leshort		18		\bIBM TERSE
+>0	leshort		19		\bIBM LZ77
+# https://support.winzip.com/hc/en-us/articles/115012122828-Compression-method-used-for-this-file-is-94
 >0	leshort		94		\bMP3
 >0	leshort		95		\bxz
 >0	leshort		96		\bJpeg
@@ -34,23 +49,55 @@
 
 # Zip known versions
 0	name		zipversion
->0	leshort		0x09		v0.9
->0	leshort		0x0a		v1.0
->0	leshort		0x0b		v1.1
->0	leshort		0x14		v2.0
->0	leshort		0x15		v2.1
->0	leshort		0x19		v2.5
->0	leshort		0x1b		v2.7
->0	leshort		0x2d		v4.5
->0	leshort		0x2e		v4.6
->0	leshort		0x32		v5.0
->0	leshort		0x33		v5.1
->0	leshort		0x34		v5.2
->0	leshort		0x3d		v6.1
->0	leshort		0x3e		v6.2
->0	leshort		0x3f		v6.3
->0	default		x
->>0	leshort		x		v?[%#x]
+# The lower byte indicates the ZIP version of this file. The value/10 indicates
+# the major version number, and the value mod 10 is the minor version number.
+>0	ubyte/10	x		v%u
+>0	ubyte%10	x		\b.%u
+# >0	leshort		0x09		v0.9
+# >0	leshort		0x0a		v1.0
+# >0	leshort		0x0b		v1.1
+# >0	leshort		0x14		v2.0
+# >0	leshort		0x15		v2.1
+# >0	leshort		0x19		v2.5
+# >0	leshort		0x1b		v2.7
+# >0	leshort		0x2d		v4.5
+# >0	leshort		0x2e		v4.6
+# >0	leshort		0x32		v5.0
+# >0	leshort		0x33		v5.1
+# >0	leshort		0x34		v5.2
+# >0	leshort		0x3d		v6.1
+# >0	leshort		0x3e		v6.2
+# >0	leshort		0x3f		v6.3
+# >0	default		x
+# >>0	leshort		x		v?[%#x]
+
+#	display compatible host system name of ZIP archive
+0	name		ziphost
+# The upper byte indicates the compatibility of the file attribute information.
+# If the file is compatible with MS-DOS (v 2.04g) then this value will be zero.
+#>1	ubyte		0		DOS
+>1	ubyte		1 		Amiga
+>1	ubyte		2		OpenVMS
+>1	ubyte		3		UNIX
+>1	ubyte		4		VM/CMS
+>1	ubyte		6		OS/2
+>1	ubyte		7		Macintosh
+>1	ubyte		11		MVS
+>1	ubyte		13		Acorn Risc
+>1	ubyte		16		BeOS
+>1	ubyte		17		Tandem
+# 9 untested
+>1	ubyte		5		Atari ST
+>1	ubyte		8		Z-System
+>1	ubyte		9		CP/M
+>1	ubyte		10		Windows NTFS
+>1	ubyte		12		VSE
+>1	ubyte		14		VFAT
+>1	ubyte		15		alternate MVS
+>1	ubyte		18		OS/400
+>1	ubyte		19		OS X
+# unused
+#>1	ubyte		>19		unused 0x%x
 
 # Zip End Of Central Directory record
 -22	string		PK\005\006