[File] [PATCH] of Magdir/archive for xar archive; update+extensions *.xar *.xip *.pkg

Christos Zoulas christos at zoulas.com
Sat Mar 30 02:08:22 UTC 2019


Committed, thanks!

christos

> On Mar 29, 2019, at 6:21 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
> 
> Hello,
> 
> A few days ago I ran the file command, version 5.36, on eXtensible ARchives.
> I got output like:
> 
> dslocal-backup.xar:      xar archive version 1, SHA-1 checksum
> FullBundleUpdate.pkg:    xar archive version 1, SHA-1 checksum
> none-none-bzip2.xar:     xar archive version 1, no checksum
> sha1-sha1-bzip28.xar:    xar archive version 1, SHA-1 checksum
> sha1-sha1-gzip5.xar:     xar archive version 1, SHA-1 checksum
> sha1-sha1-none.xar:      xar archive version 1, SHA-1 checksum
> sha256-sha256-gzip7.xar: xar archive version 1,
> sha256-sha512-bzip2.xar: xar archive version 1,
> sha512-sha512-gzip.xar:  xar archive version 1,
> xar-1.1.xar:             xar archive version 1, SHA-1 checksum
> xar-1.5.2.xar:           xar archive version 1, SHA-1 checksum
> Xcode_10.2_beta_4.xip:   xar archive version 1, SHA-1 checksum
> 
> What is wrong? Apparently the newer checksum algorithms are not recognized;
> only older methods like "SHA-1" are. Furthermore, with the
> --extension option only ??? is displayed.
> 
> So I started to change Magdir/archive. Most of the information is found on
> the new Wikipedia page about the xar archiver, so I added a comment line like
> # URL: https://en.wikipedia.org/wiki/Xar_(archiver)
> 
> Furthermore, all fields are stored in big-endian order, and many are
> unsigned. So the lines about the table of contents (TOC) must look like:
>> 8	ubequad	x		compressed TOC: %llu,
>> 16	ubequad	x		uncompressed TOC: %llu,
> 
> The header size of the archive is stored at offset 4. In older versions
> this was always 28, so there was no need to show this standard value,
> and it was mentioned only in a comment line:
> #>4	beshort	x		header size %d
> According to Kyle J. McKay's wiki, in newer versions padding bytes
> (as in the example xar-1.1.xar) or a checksum algorithm name can appear
> after that position in the header. This is probably interesting for users,
> because older versions of the xar library do not correctly handle a
> header size value other than 28. So this information is now displayed by
> a line like
>> 4	ubeshort >28		\b, header size %u
> 
> Apparently the file name extension "xar" is used for eXtensible ARchives.
> The xar format is also used by some Mac OS X installer packages, which
> then have the file name extension "pkg", as in the example
> FullBundleUpdate.pkg. Apple introduced a variant with an
> additional signature. Such signed archives, like Xcode_10.2_beta_4.xip,
> have the "xip" extension. So these 3 extensions are shown by an additional line:
> !:ext	xar/pkg/xip
> 
> The version of the xar format is stored at offset 6. This value is 1 at the
> moment and has not changed in recent years. So do not bother users
> with uninteresting information; display the version only when it is
> not the standard value, by the line:
>> 6	ubeshort >1		version %u,
> 
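The header fields described so far can be decoded outside of file(1) as well.
What follows is only a minimal sketch in Python, not part of the patch. The
offsets and the big-endian unsigned types come from the mail; the 4-byte magic
"xar!" at offset 0 and the helper names parse_xar_header and describe are
assumptions made for illustration. The output string only loosely imitates
what the magic lines print.

```python
import struct

# Big-endian, unsigned, no padding: magic (4 bytes), header size (u16 at 4),
# version (u16 at 6), compressed TOC length (u64 at 8),
# uncompressed TOC length (u64 at 16).
# Assumption: the magic bytes are b"xar!"; the mail does not state them.
XAR_HEADER = struct.Struct(">4sHHQQ")


def parse_xar_header(data: bytes) -> dict:
    """Decode the leading fixed fields of a xar header (hypothetical helper)."""
    if len(data) < XAR_HEADER.size:
        raise ValueError("need at least 24 bytes of header")
    magic, header_size, version, toc_comp, toc_uncomp = XAR_HEADER.unpack_from(data)
    if magic != b"xar!":
        raise ValueError("not a xar archive")
    return {
        "header_size": header_size,
        "version": version,
        "toc_compressed": toc_comp,
        "toc_uncompressed": toc_uncomp,
    }


def describe(fields: dict) -> str:
    """Mention header size and version only when they differ from 28 and 1."""
    parts = ["xar archive"]
    if fields["header_size"] > 28:
        parts.append(f"header size {fields['header_size']}")
    if fields["version"] > 1:
        parts.append(f"version {fields['version']}")
    parts.append(f"compressed TOC: {fields['toc_compressed']}")
    return ", ".join(parts)


if __name__ == "__main__":
    # Synthetic header resembling the xar-1.1.xar example (header size 32).
    sample = struct.pack(">4sHHQQ", b"xar!", 32, 1, 5726, 20000)
    print(describe(parse_xar_header(sample)))
```

Showing the header size and version only when they deviate from 28 and 1
mirrors the ">28" and ">1" tests in the magic lines above.
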
> The checksum algorithm identifier is stored at offset 24.
> According to the Wikipedia page, the values 3 and 4 are used for newer
> checksum methods. This is now expressed by the additional lines
>> 24	belong	3		SHA-256 checksum
>> 24	belong	4		SHA-512 checksum
> So examples like sha256-sha512-bzip2.xar and sha512-sha512-gzip.xar
> are described more precisely. To also recognize possible other checksum
> algorithms, add the additional line:
>> 24	belong	>4		unknown 0x%x checksum
> 
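The checksum handling amounts to a lookup on the 32-bit value at offset 24
with a fallback for unknown values. A rough Python sketch of the same logic
follows. The values 3 and 4 are taken from the mail and 0 and 1 from the
sample output; the value 2 for MD5 is an assumption, and checksum_label is a
hypothetical name.

```python
import struct

# Checksum algorithm identifiers at offset 24 (big-endian 32-bit).
# 0, 1, 3 and 4 follow the mail above; 2 = MD5 is an assumption.
CHECKSUM_NAMES = {
    0: "no checksum",
    1: "SHA-1 checksum",
    2: "MD5 checksum",
    3: "SHA-256 checksum",
    4: "SHA-512 checksum",
}


def checksum_label(header: bytes) -> str:
    """Return a label for the checksum field of a xar header (hypothetical helper)."""
    if len(header) < 28:
        raise ValueError("need at least 28 bytes of header")
    (alg,) = struct.unpack_from(">I", header, 24)
    # Anything not in the table is reported in hex, like the ">4" magic line.
    return CHECKSUM_NAMES.get(alg, f"unknown 0x{alg:x} checksum")


if __name__ == "__main__":
    header = bytes(24) + struct.pack(">I", 4)
    print(checksum_label(header))
```
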
> Some interesting information can be obtained from the bytes inside the heap
> section. By jumping over the header and the TOC (table of contents) we arrive
> at the heap section and can inspect the data by calling file again, with lines like:
>>>> &(4.S)	ubyte	x
>>>>> &(8.Q)	ubyte	x
>>>>>> &-1	indirect x	\b, contains
> When we look at the XML TOC we see that the checksum is stored at the
> beginning of the heap. So by jumping further forward, depending on the
> checksum type, the pointer arrives at the data. For SHA-1 these are
> an additional 20 bytes. So for SHA-1 this looks like (18 = 20 - 1 (.S
> expression) - 1 (.Q expression)):
>> 24	belong	1
>>> 18		ubyte	x
> Now the indirect expression reports the compression method, such as bzip2,
> which was selected with the xar --compression option when building the
> archive. In older versions of the xar utility only gzip and bzip2 are
> supported, which I have tested. According to the man page xar(1), in newer
> Apple macOS versions like Mojave lzma can also be used. According to
> the open source fork 1.6.1, xz is also supported.
> 
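The pointer arithmetic behind the heap inspection can be written out
explicitly. The sketch below is an illustration in Python under the same
assumptions as the mail: the heap starts right after header and compressed
TOC, the TOC checksum sits at the start of the heap, and the first stored
data follows it directly. The digest sizes are the standard ones for each
hash; the MD5 entry, the function names, and the simple signature checks
(instead of calling file again) are assumptions for illustration.

```python
import struct

# Digest length in bytes per checksum identifier (2 = MD5 is an assumption).
DIGEST_SIZES = {0: 0, 1: 20, 2: 16, 3: 32, 4: 64}


def first_data_offset(header: bytes) -> int:
    """Offset of the first heap data after header, TOC and TOC checksum."""
    (header_size,) = struct.unpack_from(">H", header, 4)
    (toc_compressed,) = struct.unpack_from(">Q", header, 8)
    (alg,) = struct.unpack_from(">I", header, 24)
    if alg not in DIGEST_SIZES:
        raise ValueError(f"unknown checksum algorithm 0x{alg:x}")
    return header_size + toc_compressed + DIGEST_SIZES[alg]


def sniff_compression(blob: bytes) -> str:
    """Very small stand-in for the 'indirect' magic call: check two signatures."""
    if blob.startswith(b"BZh"):
        return "bzip2"
    # zlib stream: low nibble of the first byte is 8 (deflate) and the first
    # two bytes, read as a big-endian number, are a multiple of 31.
    if len(blob) >= 2 and blob[0] & 0x0F == 8 and ((blob[0] << 8) | blob[1]) % 31 == 0:
        return "zlib"
    return "unknown"


if __name__ == "__main__":
    # Synthetic header: size 28, version 1, compressed TOC 476, SHA-1.
    hdr = struct.pack(">4sHHQQI", b"xar!", 28, 1, 476, 1000, 1)
    print(first_data_offset(hdr))  # 28 + 476 + 20
```

In the magic file the same SHA-1 jump appears as 18 rather than 20 because
each of the two preceding ubyte reads already advanced the pointer by one.
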
> For pkg and xip the indirect expression pointer points at the
> signature. So if magic lines for something like X509Certificate existed,
> it should be possible to distinguish such file name extensions from
> plain xar variants. This is a TODO.
> 
> After applying the above-mentioned modifications with the patch
> file-5.36-archive-xar.diff, all inspected examples are now
> described more precisely, like:
> 
> dslocal-backup.xar:      xar archive
> 	compressed TOC: 20986, SHA-1 checksum
> 	, contains zlib compressed data
> FullBundleUpdate.pkg:    xar archive
> 	compressed TOC: 3188, SHA-1 checksum
> none-none-bzip2.xar:     xar archive
> 	compressed TOC: 385, no checksum
> 	, contains bzip2 compressed data, block size = 900k
> sha1-sha1-bzip28.xar:    xar archive
> 	compressed TOC: 498, SHA-1 checksum
> 	, contains bzip2 compressed data, block size = 800k
> sha1-sha1-gzip5.xar:     xar archive
> 	compressed TOC: 499, SHA-1 checksum
> 	, contains zlib compressed data
> sha1-sha1-none.xar:      xar archive
> 	compressed TOC: 476, SHA-1 checksum
> sha256-sha256-gzip7.xar: xar archive
> 	compressed TOC: 538, SHA-256 checksum
> 	, contains zlib compressed data
> sha256-sha512-bzip2.xar: xar archive
> 	compressed TOC: 620, SHA-256 checksum
> 	, contains bzip2 compressed data, block size = 900k
> sha512-sha512-gzip.xar:  xar archive
> 	compressed TOC: 614, SHA-512 checksum
> 	, contains zlib compressed data
> xar-1.1.xar:             xar archive, header size 32
> 	compressed TOC: 5726, SHA-1 checksum
> 	, contains zlib compressed data
> xar-1.5.2.xar:           xar archive
> 	compressed TOC: 6932, SHA-1 checksum
> 	, contains zlib compressed data
> Xcode_10.2_beta_4.xip:   xar archive
> 	compressed TOC: 2948, SHA-1 checksum
> 
> I hope my diff file can be applied in a future version of the file utility.
> 
> With best wishes
> Jörg Jenderek
> --
> Jörg Jenderek
> 
> <file-5_36-archive-xar_diff.DEFANGED-4431>
> -- 
> File mailing list
> File at astron.com
> https://mailman.astron.com/mailman/listinfo/file