[File] [PATCH] of Magdir/archive for xar archive; update+extensions *.xar *.xip *.pkg
Christos Zoulas
christos at zoulas.com
Sat Mar 30 02:08:22 UTC 2019
Committed, thanks!
christos
> On Mar 29, 2019, at 6:21 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> Hello,
>
> Some days ago I ran the file command, version 5.36, on eXtensible ARchives (xar).
> I got output like:
>
> dslocal-backup.xar: xar archive version 1, SHA-1 checksum
> FullBundleUpdate.pkg: xar archive version 1, SHA-1 checksum
> none-none-bzip2.xar: xar archive version 1, no checksum
> sha1-sha1-bzip28.xar: xar archive version 1, SHA-1 checksum
> sha1-sha1-gzip5.xar: xar archive version 1, SHA-1 checksum
> sha1-sha1-none.xar: xar archive version 1, SHA-1 checksum
> sha256-sha256-gzip7.xar: xar archive version 1,
> sha256-sha512-bzip2.xar: xar archive version 1,
> sha512-sha512-gzip.xar: xar archive version 1,
> xar-1.1.xar: xar archive version 1, SHA-1 checksum
> xar-1.5.2.xar: xar archive version 1, SHA-1 checksum
> Xcode_10.2_beta_4.xip: xar archive version 1, SHA-1 checksum
>
> What is wrong? Apparently the newer checksum algorithms are not recognized;
> only older methods like "SHA-1" are. Furthermore, with the
> --extension option only ??? is displayed.
>
> So I started to change Magdir/archive. Most of the information comes from
> the new Wikipedia page about the xar archiver, so I added a comment line like
> # URL: https://en.wikipedia.org/wiki/Xar_(archiver)
>
> Furthermore, all fields are stored in big-endian order, and many are
> unsigned. So the lines about the table of contents (TOC) must look like:
> >8 ubequad x compressed TOC: %llu,
> >16 ubequad x uncompressed TOC: %llu,
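
To illustrate why the unsigned big-endian types matter, here is a small Python sketch. It is not part of the patch, and the sample bytes are made up rather than taken from a real archive. It decodes the same 8 bytes as a signed and as an unsigned 64-bit big-endian integer, which is roughly the difference between bequad and ubequad, and once more in little-endian order for comparison:

```python
import struct

# Hypothetical 8-byte field with the high bit set (not from a real archive).
raw = bytes.fromhex("8000000000000001")

# Signed big-endian read, comparable to bequad printed with %lld.
signed_value = struct.unpack(">q", raw)[0]

# Unsigned big-endian read, comparable to ubequad printed with %llu.
unsigned_value = struct.unpack(">Q", raw)[0]

# The same bytes read in little-endian order give a different number.
little_endian = struct.unpack("<Q", raw)[0]

print(signed_value, unsigned_value, little_endian)
```

For realistic TOC lengths the high bit is never set, so the signed and unsigned readings agree; the unsigned type simply states the intent of the format.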
>
> The header size of the archive is stored at position 4. In older versions
> this was always 28, so there was no need to show this standard value,
> and it was mentioned only in a comment line:
> #>4 beshort x header size %d
> According to Kyle J. McKay's wiki, in newer versions padding bytes
> (as in the example xar-1.1.xar) or a checksum algorithm name can appear
> after that position in the header. This is probably interesting for users,
> because older versions of the xar library do not correctly handle a
> header size other than 28. So this information is now displayed by a
> line like
> >4 ubeshort >28 \b, header size %u
>
> Apparently the file name extension "xar" is used for eXtensible ARchives.
> The xar format is also used by some Mac OS X installers for packages, which
> then use the file name extension "pkg", as in the example
> FullBundleUpdate.pkg. Apple also introduced a variant with an
> additional signature. Such signed archives, like Xcode_10.2_beta_4.xip,
> have the "xip" extension. So these 3 extensions are shown by the additional line:
> !:ext xar/pkg/xip
>
> The version of the xar format is stored at position 6. This value is currently 1
> and has not changed in recent years. So, to avoid bothering users
> with uninteresting information, it is now displayed only when
> it is not the standard value, by the line:
> >6 ubeshort >1 version %u,
>
> The checksum algorithm identifier is stored at position 24.
> According to the Wikipedia page, values 3 and 4 are used for the newer checksum
> methods. This is now expressed by the additional lines
> >24 belong 3 SHA-256 checksum
> >24 belong 4 SHA-512 checksum
> So examples like sha256-sha512-bzip2.xar and sha512-sha512-gzip.xar
> are described more precisely. To also recognize other possible checksum
> algorithms, add the additional line:
> >24 belong >4 unknown 0x%x checksum
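
The header fields discussed above (positions 4, 6, 8, 16 and 24) can be read outside of file(1) as well. The following Python sketch is not the patch itself. It assumes the fixed 28-byte header layout, the well-known magic "xar!" at position 0 (not spelled out in this mail), and that checksum id 2 means MD5, which comes from the Wikipedia page rather than from the examples here. The helper name describe_xar_header is made up for illustration:

```python
import struct

# Fixed 28-byte xar header, all fields big-endian and unsigned:
# magic (4s), header size (H), version (H),
# compressed TOC length (Q), uncompressed TOC length (Q), checksum id (I).
XAR_HEADER = struct.Struct(">4sHHQQI")

# Ids 0, 1, 3 and 4 appear in the mail; 2 = MD5 is an assumption
# taken from the Wikipedia page.
CHECKSUM_NAMES = {0: "no", 1: "SHA-1", 2: "MD5", 3: "SHA-256", 4: "SHA-512"}


def describe_xar_header(data: bytes) -> str:
    """Return a file(1)-like description of a xar header (illustrative only)."""
    if len(data) < XAR_HEADER.size:
        raise ValueError("too short for a xar header")
    magic, header_size, version, toc_comp, _toc_uncomp, cksum = (
        XAR_HEADER.unpack_from(data)
    )
    if magic != b"xar!":
        raise ValueError("not a xar archive")
    text = "xar archive"
    if header_size > 28:      # only report non-standard header sizes
        text += f", header size {header_size}"
    if version > 1:           # only report non-standard versions
        text += f" version {version},"
    text += f" compressed TOC: {toc_comp},"
    name = CHECKSUM_NAMES.get(cksum)
    if name is not None:
        text += f" {name} checksum"
    else:
        text += f" unknown 0x{cksum:x} checksum"
    return text


# Synthetic header: size 28, version 1, TOC 538/2000 bytes, checksum id 3.
sample = XAR_HEADER.pack(b"xar!", 28, 1, 538, 2000, 3)
print(describe_xar_header(sample))
```

The two "only when not standard" tests mirror the >28 and >1 comparisons in the magic lines, so an ordinary archive yields a short description.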
>
> Some interesting information can be obtained from the bytes inside the heap
> section. By jumping over the header and the TOC (table of contents) we arrive at
> the heap section and can inspect the data there by calling file again, with lines like:
> >>>&(4.S) ubyte x
> >>>>&(8.Q) ubyte x
> >>>>>&-1 indirect x \b, contains
> Looking at the XML TOC, we see that the checksum is stored at the
> beginning of the heap. So by jumping further forward, by a number of bytes
> that depends on the checksum type, the pointer arrives at the data. For
> SHA-1 this is an additional 20 bytes. So for SHA-1 (18 = 20 - 1 (.S
> expression) - 1 (.Q expression)) this looks like
> >24 belong 1
> >>18 ubyte x
> Now the indirect expression reports the compression method, such as bzip2,
> that was chosen with the xar --compression option when building the archive.
> Older versions of the xar utility support only gzip and bzip2,
> which I have tested. According to the xar(1) man page, in newer Apple
> macOS versions like Mojave lzma can also be used. According to
> the open source fork 1.6.1, xz is also supported.
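
The offset arithmetic described above can be sketched in Python. This is a rough illustration under several assumptions, not what file(1) does internally. It assumes the heap starts at header size plus compressed TOC length, that the TOC checksum sits first in the heap, and that the first file's data follows directly (which, as noted for pkg and xip, is not always true). Only the 20 bytes for SHA-1 come from the mail; the other digest sizes are the standard lengths, and id 2 = MD5 is again an assumption. The helper names and the one-byte zlib test are simplifications of my own:

```python
import bz2
import struct
import zlib

# Digest length in bytes per checksum id (see assumptions in the lead-in).
DIGEST_SIZES = {0: 0, 1: 20, 2: 16, 3: 32, 4: 64}

# Leading bytes of some compressed streams; the single 0x78 byte for zlib
# is only a heuristic and is therefore tested last.
SIGNATURES = [
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x1f\x8b", "gzip"),
    (b"\x78", "zlib"),
]


def sniff_first_heap_entry(archive: bytes) -> str:
    """Guess the compression of the data right after the TOC checksum."""
    (header_size,) = struct.unpack_from(">H", archive, 4)
    (toc_compressed,) = struct.unpack_from(">Q", archive, 8)
    (cksum_id,) = struct.unpack_from(">I", archive, 24)
    heap_start = header_size + toc_compressed      # skip header and TOC
    data = archive[heap_start + DIGEST_SIZES[cksum_id]:]  # skip checksum
    for prefix, name in SIGNATURES:
        if data.startswith(prefix):
            return name
    return "unknown"


def build_sample(payload: bytes, cksum_id: int = 1) -> bytes:
    """Build a tiny synthetic archive: header, dummy TOC, zeroed checksum, payload."""
    toc = zlib.compress(b"<xar/>")
    header = struct.pack(">4sHHQQI", b"xar!", 28, 1, len(toc), 6, cksum_id)
    return header + toc + bytes(DIGEST_SIZES[cksum_id]) + payload


print(sniff_first_heap_entry(build_sample(bz2.compress(b"hello"))))
print(sniff_first_heap_entry(build_sample(zlib.compress(b"hello"), cksum_id=3)))
```

In the magic lines the SHA-1 skip is written as 18 rather than 20 because each of the two ubyte tests already advances the relative offset by one byte; the sketch uses absolute offsets and so skips the full 20.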
>
> For pkg and xip the pointer of the indirect expression lands on the
> signature. So if magic lines for something like X509Certificate existed,
> it should be possible to distinguish these file name extensions from
> plain xar variants. This is a TODO.
>
> After applying the above-mentioned modifications with the patch
> file-5.36-archive-xar.diff, all inspected examples are now
> described more precisely, like:
>
> dslocal-backup.xar: xar archive
> compressed TOC: 20986, SHA-1 checksum
> , contains zlib compressed data
> FullBundleUpdate.pkg: xar archive
> compressed TOC: 3188, SHA-1 checksum
> none-none-bzip2.xar: xar archive
> compressed TOC: 385, no checksum
> , contains bzip2 compressed data, block size = 900k
> sha1-sha1-bzip28.xar: xar archive
> compressed TOC: 498, SHA-1 checksum
> , contains bzip2 compressed data, block size = 800k
> sha1-sha1-gzip5.xar: xar archive
> compressed TOC: 499, SHA-1 checksum
> , contains zlib compressed data
> sha1-sha1-none.xar: xar archive
> compressed TOC: 476, SHA-1 checksum
> sha256-sha256-gzip7.xar: xar archive
> compressed TOC: 538, SHA-256 checksum
> , contains zlib compressed data
> sha256-sha512-bzip2.xar: xar archive
> compressed TOC: 620, SHA-256 checksum
> , contains bzip2 compressed data, block size = 900k
> sha512-sha512-gzip.xar: xar archive
> compressed TOC: 614, SHA-512 checksum
> , contains zlib compressed data
> xar-1.1.xar: xar archive, header size 32
> compressed TOC: 5726, SHA-1 checksum
> , contains zlib compressed data
> xar-1.5.2.xar: xar archive
> compressed TOC: 6932, SHA-1 checksum
> , contains zlib compressed data
> Xcode_10.2_beta_4.xip: xar archive
> compressed TOC: 2948, SHA-1 checksum
>
> I hope my diff file can be applied in a future version of the file utility.
>
> With best wishes
> Jörg Jenderek
> --
> Jörg Jenderek
>
> <file-5_36-archive-xar_diff.DEFANGED-4431>
> --
> File mailing list
> File at astron.com
> https://mailman.astron.com/mailman/listinfo/file