[File] [PATCH] Magdir/apple HFS/HFS+ resource fork misidentifies 1 Microsoft Event Trace Log *.ETL
Christos Zoulas
christos at zoulas.com
Sun Oct 2 12:54:39 UTC 2022
Committed, thanks!
christos
> On Sep 26, 2022, at 8:47 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> A few days ago I ran the cleaning tool czkawka, found at
> https://qarmin.github.io/czkawka/. One of its menu items concerns bad
> extensions. After running the tool I looked through the saved file
> list results_bad_extensions.txt for examples of bad extensions.
> One of the listed extensions is ETL.
>
> These files are Microsoft Event Trace Logs. When I run the file
> command version 5.43 on ETL examples and on related true-positive
> resource samples, I get output like:
>
> Alkaios.dfont: Mac OSX datafork font, TrueType
> CPack.OSXScriptLauncher.rsrc.in: Apple HFS/HFS+ resource fork
> Empty.rsrc.rsr: data
> HelveticaNeue.dfont: data
> Icon_.icns: Mac OS X icon, 2296 bytes, "ics#"
> Icon_.rsrc: Apple HFS/HFS+ resource fork
> NPETraceSession.etl: Apple HFS/HFS+ resource fork
> OpenSans-CondBold.dfont: data
> Panorama-12.icns: Mac OS X icon, 2296 bytes, "ICN#"
> Panorama.rsr: data
> Read me.txt.rsrc: Apple HFS/HFS+ resource fork
> TCDB 2003-10 demo.pan.icns: Mac OS X icon, 5849 bytes, "ics#"
> TCDB 2003-10 demo.pan.rsrc: Apple HFS/HFS+ resource fork
> XLISP.RSR: Apple HFS/HFS+ resource fork
> XLISPTIN.RSR: Apple HFS/HFS+ resource fork
> droplet.rsrc: Apple HFS/HFS+ resource fork
> empty.rsr: Apple HFS/HFS+ resource fork
> nastro.pi1: DEGAS Elite bitmap 320 x 200 x 16,
> color palette 0000 0000 0000
> 0000 0000 ...
>
> Furthermore, for the resource samples only the generic
> application/octet-stream MIME type is shown with the -i option. With
> the --extension option the 3-byte sequence ??? is shown.
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It describes
> the one Microsoft Event Trace Log, NPETraceSession.etl, correctly as
> "Window tracing/diagnostic binary log" by etl.trid.xml, but it does
> not recognize the resource samples. Most DFONT examples are
> described as "Macintosh OS X Data Fork Font" by dfont.trid.xml,
> whereas the file command fails to recognize many of them (see the
> appended trid-v-etl_rsr.txt.gz). The ETL sample is misidentified as
> "Apple HFS/HFS+ resource fork" by the file command.
>
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies only the ICNS images, as "Apple Icon Image Format" by PUID
> x-fmt/311.
>
> Luckily I found some specifications on the net. That information
> is now recorded in comment lines inside Magdir/apple like:
>
> # URL: http://fileformats.archiveteam.org/wiki/Macintosh_resource_file
> # https://en.wikipedia.org/wiki/Resource_fork
> # Reference: https://github.com/kreativekorp/ksfl/wiki/Macintosh-Resource-File-Format
> # http://developer.apple.com/legacy/mac/library/documentation/mac/pdf/MoreMacintoshToolbox.pdf
> # https://formats.kaitai.io/resource_fork/
>
> According to its documentation the software deark can handle such
> resources. This is often true and can be verified by commands like:
> deark -m macrsrc -l Panorama.rsr
> deark -m macrsrc Icon_.rsrc
>
> For the undetected Panorama.rsr we get Mac OS X ICNS icons, so we
> know this is a valid resource.
>
> In the current Magdir/apple the description is done by lines like:
> 0 string \000\000\001\000
>> 4 leshort 0
>>> 16 lelong 0 Apple HFS/HFS+ resource fork
>
> At first glance this looks good, because 10 bytes are used for
> recognition, but in reality only 1 byte is non-zero. So apparently
> these magic lines are too weak: the ETL sample is misidentified, and
> recognition is evidently not exact in all cases.
>
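The weakness of the old test can be illustrated with a minimal Python sketch. The function name and the two sample buffers are made up for illustration; only the byte offsets and the Panorama.rsr map offset 0x1325b come from this mail.

```python
def old_rule_matches(buf: bytes) -> bool:
    """Imitate the three tests of the old Magdir/apple entry."""
    return (
        len(buf) >= 20
        and buf[0:4] == b"\x00\x00\x01\x00"      # 0    string \000\000\001\000
        and buf[4:6] == b"\x00\x00"              # >4   leshort 0
        and buf[16:20] == b"\x00\x00\x00\x00"    # >>16 lelong 0
    )

# Hypothetical non-resource data: 0x100 followed by nothing but zero bytes.
mostly_zero = b"\x00\x00\x01\x00" + bytes(28)

# Hypothetical header with the map offset 0x1325b reported for Panorama.rsr.
panorama_like = b"\x00\x00\x01\x00" + b"\x00\x01\x32\x5b" + bytes(24)

print(old_rule_matches(mostly_zero))     # accepted although it is no resource
print(old_rule_matches(panorama_like))   # rejected although the offset is valid
```

Any buffer that starts with 00 00 01 00 and is otherwise zero in the tested positions passes, while a valid file with a map offset above 0xFFFF fails the second test.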
> So I put the displaying part inside the subroutine apple-rsr. Later I
> can then adjust the test lines to exclude misidentified samples. The
> subroutine starts like:
> 0 name apple-rsr
>> 0 ubelong x Apple HFS/HFS+ resource fork
> !:mime application/x-apple-rsr
> !:ext rsrc/rsr/in
>> 0 ubelong !0x100 \b, data offset %#x
>> 4 ubelong x \b, map offset %#x
>> 12 ubelong x \b, map length %#x
>> 8 ubelong x \b, data length %#x
>> 16 ubelong !0 \b, at 16 %#x
>
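As a rough cross-check outside of file, the display lines of the subroutine can be imitated in Python. This is only a sketch: describe_header and c_hex are hypothetical helpers, and the header values are the ones reported for empty.rsr and Empty.rsrc.rsr in the sample output of this mail.

```python
import struct

def c_hex(value: int) -> str:
    """Format like C's %#x, which prints a plain 0 for a zero value."""
    return "0" if value == 0 else hex(value)

def describe_header(buf: bytes) -> str:
    """Hypothetical helper imitating the display lines of apple-rsr."""
    data_off, map_off, data_len, map_len, at16 = struct.unpack(">5I", buf[:20])
    parts = ["Apple HFS/HFS+ resource fork"]
    if data_off != 0x100:                     # only shown when unusual
        parts.append(f"data offset {c_hex(data_off)}")
    parts.append(f"map offset {c_hex(map_off)}")
    parts.append(f"map length {c_hex(map_len)}")
    parts.append(f"data length {c_hex(data_len)}")
    if at16 != 0:                             # reserved area, normally nil
        parts.append(f"at 16 {c_hex(at16)}")
    return ", ".join(parts)

# Header fields in file order: data offset, map offset, data length,
# map length, first 4 bytes of the reserved area.
empty_rsr = struct.pack(">5I", 0x100, 0x100, 0, 0x1E, 0)
empty_rsrc_rsr = struct.pack(">5I", 0x100, 0x12D, 0x2D, 0x46, 0x8FD20000)

print(describe_header(empty_rsr))
print(describe_header(empty_rsrc_rsr))
```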
> Instead of the generic application/octet-stream I chose a
> user-defined MIME type. The standard file suffixes are rsrc or rsr.
> According to the documentation other extensions also exist. On my
> systems I also found an example with the suffix rsrc.in. Apparently
> this suffix means the file is used as a template for real resources.
> Such samples are found as part of compiler suites: on Windows as part
> of Visual Studio, and on Linux as part of the cmake tools.
>
> At the beginning the data offset is stored as a 4-byte big-endian
> value. This is always 0x100. It was used as the first test by a line
> like:
> 0 string \000\000\001\000
> So this line is always true and can be kept.
>
> Next the map offset is stored as a 4-byte big-endian value. For
> most samples these values are "low" (32K). Other offset values are
> 16-bit only, so it sounds logical that 16-bit values are used here
> as well. So maybe the original magic author misunderstood this and
> believed that map offsets are also always "low", or in other words
> that the 2 upper bytes of the offset value are nil. This was
> expressed by a second test line like:
>> 4 leshort 0
> Apparently this is not always true. For the sample Panorama.rsr I get
> map offset 0x1325b. For the misidentified NPETraceSession.etl I get a
> nil value here. When the data section starts at offset 0x100, the map
> section must start at or after it, so the map offset must be greater
> than or equal to 0x100. In the worst case, like the mentioned
> empty.rsr without content, the data length is 0, so the map that
> follows starts at offset 0x100. So the second test line becomes:
>> 4 ubelong >0xFF
>
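The reasoning for the new second test can be sketched in Python as follows. The helper name is hypothetical and the buffers are synthetic headers built from the offsets quoted above, not the real sample files.

```python
import struct

def map_offset_plausible(buf: bytes) -> bool:
    """First and second test: data offset 0x100, map offset at or after it."""
    data_offset, map_offset = struct.unpack(">II", buf[:8])
    return data_offset == 0x100 and map_offset > 0xFF

# Synthetic 16-byte headers (data offset, map offset, data length, map length).
panorama_like = struct.pack(">4I", 0x100, 0x1325B, 0x1315B, 0x66D)
empty_like = struct.pack(">4I", 0x100, 0x100, 0, 0x1E)
etl_like = struct.pack(">4I", 0x100, 0, 0, 0)   # nil map offset as seen for the ETL

for name, header in [("Panorama.rsr", panorama_like),
                     ("empty.rsr", empty_like),
                     ("NPETraceSession.etl", etl_like)]:
    print(name, map_offset_plausible(header))
```

The Panorama.rsr and empty.rsr values pass, while a nil map offset is rejected.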
> Next the data and map lengths are stored as 4-byte big-endian
> values. According to the documentation the use of 16-bit signed
> integers imposes a 32K limitation on the size of the resource map, so
> this can be used as an additional test. For the misidentified
> NPETraceSession.etl I also get nil values here. So I use this as the
> new third test by a line like:
>>> 12 ubelong <0x8001
> This way a few Atari DEGAS Elite bitmaps (eil2.pi1 nastro.pi1) with
> invalid "high" (0x6550766 0x7510763) map lengths are skipped.
>
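A small Python sketch of the map length test, using only values quoted in this mail. The function name and the constant name are assumptions made for illustration.

```python
# 32K limit on the resource map size, as stated in the documentation cited above.
MAP_LENGTH_LIMIT = 0x8000

def map_length_plausible(map_length: int) -> bool:
    """Equivalent of the magic test '>>12 ubelong <0x8001'."""
    return map_length <= MAP_LENGTH_LIMIT

samples = {
    "Panorama.rsr": 0x66D,       # valid resource, from the sample output
    "empty.rsr": 0x1E,           # valid resource without content
    "eil2.pi1": 0x6550766,       # DEGAS Elite bitmap, invalid "high" value
    "nastro.pi1": 0x7510763,     # DEGAS Elite bitmap, invalid "high" value
}

for name, length in samples.items():
    print(f"{name}: {hex(length)} -> {map_length_plausible(length)}")

# Note: a nil map length also satisfies this comparison, so a sample with
# a nil map offset is excluded by the map offset test, not by this one.
print(map_length_plausible(0))
```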
> At offset 16, 112 reserved bytes are stored. According to the
> documentation this field can generally be ignored when reading and
> writing. It is used by the Classic Mac OS Finder as temporary
> storage space and usually contains parts of the file metadata (name,
> type/creator code, etc.). In resource files written by Mac OS X this
> field is set to all zero bytes. This was used as the old third test
> by a line like:
>>> 16 lelong 0 Apple HFS/HFS+ resource fork
> For most examples this field is nil, but for two of my dozen
> inspected resources I get non-nil values. One is Empty.rsrc.rsr
> (0x8fd20000), which is part of the Panorama ProVUE development suite.
> The other is OpenSans-CondBold.dfont (0x00768c2b). So I am not sure
> whether this is just an accident. I could therefore create one branch
> suited for most resources and another branch for such "exotic"
> resources. This would become like:
>>>> 16 lelong =0
>>>>> 0 use apple-rsr
>>>> 16 lelong !0
>>>>> 0 use apple-rsr
>
> According to the documentation the resource map contains a copy of
> the resource header. For debugging purposes I show this information
> inside the subroutine by lines like:
>> (4.L) ubelong x \b, DATA offset %#x
>> (4.L+4) ubelong x \b, MAP offset %#x
>> (4.L+8) ubelong x \b, DATA length %#x
>> (4.L+12) ubelong x \b, MAP length %#x
> If this were always true I could use this information to test for a
> second 0x100 value. As my wording suggests, it is not. It took me a
> day to find out what was wrong. Later, on the page about Macintosh
> font formats on fontforge.org, I found an explanation. It mentions a
> distinction between resource fork and data fork. For a resource
> fork the mentioned facts about the copied header apply. For a data
> fork these 16 bytes are zeroed. That is what I see for 2 examples
> (XLISP.RSR XLISPTIN.RSR). So using this area as a test becomes more
> complicated. And things become worse.
>
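The two cases described above (header copied into the map, or the first 16 map bytes zeroed) can be told apart with a short Python sketch. The helper names and the synthetic files are assumptions; the real samples are not reproduced here.

```python
import struct

# Synthetic header: data offset 0x100, map offset 0x100, data length 0,
# map length 0x1e (the values reported for empty.rsr).
HEADER = struct.pack(">4I", 0x100, 0x100, 0, 0x1E)

def map_start_state(buf: bytes) -> str:
    """Classify the first 16 bytes of the resource map."""
    (map_offset,) = struct.unpack_from(">I", buf, 4)
    copy = buf[map_offset:map_offset + 16]
    if len(copy) < 16:
        return "out of range"      # map offset points past the available data
    if copy == buf[:16]:
        return "header copy"       # what the documentation describes
    if copy == bytes(16):
        return "zeroed"            # seen for XLISP.RSR and XLISPTIN.RSR
    return "other"

def build(map_prefix: bytes) -> bytes:
    """Build a synthetic file: header, padding up to 0x100, then the map."""
    return HEADER + bytes(0x100 - 16) + map_prefix + bytes(0x1E - 16)

print(map_start_state(build(HEADER)))
print(map_start_state(build(bytes(16))))
```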
> According to the documentation DFONT files are also resource forks,
> so I checked such samples. For a few (4/24) of these fonts the above
> part of the subroutine is never executed. It took another day to find
> out what was wrong. For the bad examples the output contains parts
> like:
> Times.dfont map offset 0x1866b8, map length 0x1b1
> Courier.dfont map offset 0x191dfe, map length 0x157
> Helvetica.dfont map offset 0x27194e, map length 0x1ec
> HelveticaNeue.dfont map offset 0x6ab0f4, map length 0x1e3
> For the recognised samples I got parts like:
> Geneva.dfont: map offset 0xb3f0e, map length 0x124
> Monaco.dfont: map offset 0x899ca, map length 0xed
> Bank Gothic.dfont: map offset 0x54aa4, map length 0x95
> GohuFont.dfont: map offset 0x1873b, map length 0x93
> Alkaios.dfont: map offset 0x7884a, map length 0x72
> So this was triggered by internal limits of the file command. In
> src/file.h the constant FILE_BYTES_MAX is defined as 1 MiB
> (1024*1024 = 0x100000). When raising this limit above 0x6ab0f4
> (6.66 MiB for HelveticaNeue.dfont), most samples work as expected.
> One example, OpenSans-CondBold.dfont, is still not recognised. This
> seems to be an accident, because the reported map offset 0x110100
> is above the file size 264372 (=0x408B4). So I use the information
> at the start of the map as new tests by lines like:
>>>> (4.L) ubelong 0x100
>>>>> 0 use apple-rsr
>>>> (4.L) ubelong 0
>>>>> 0 use apple-rsr
>
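The effect of the 1 MiB read limit on the indirect (4.L) offsets can be checked with simple arithmetic. The sketch below uses only the map offsets and sizes quoted above; the variable names are made up for illustration.

```python
FILE_BYTES_MAX = 1024 * 1024   # 0x100000, the value quoted from src/file.h

map_offsets = {
    "Times.dfont": 0x1866B8,
    "Courier.dfont": 0x191DFE,
    "Helvetica.dfont": 0x27194E,
    "HelveticaNeue.dfont": 0x6AB0F4,
    "Geneva.dfont": 0xB3F0E,
    "Monaco.dfont": 0x899CA,
    "Bank Gothic.dfont": 0x54AA4,
    "GohuFont.dfont": 0x1873B,
    "Alkaios.dfont": 0x7884A,
}

# An indirect test at (4.L) needs 4 bytes at the map offset to lie inside
# the part of the file that was read.
beyond_limit = sorted(name for name, offset in map_offsets.items()
                      if offset + 4 > FILE_BYTES_MAX)
print(beyond_limit)

# Largest map offset expressed in MiB.
print(round(max(map_offsets.values()) / FILE_BYTES_MAX, 2))

# OpenSans-CondBold.dfont: the map offset lies beyond the end of the file.
print(0x110100 > 264372)
```

Exactly the four fonts listed as bad examples fall beyond the limit.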
> But things are still complicated, because DFONT files are often
> already described by Magdir/macintosh via lines like:
> 0 belong 0x100
>> (0x4.L+24) beshort x
>>> &4 belong 0x73666e74 Mac OSX datafork font, TrueType
>>> &4 belong 0x464f4e54 Mac OSX datafork font, 'FONT'
>>> &4 belong 0x4e464e54 Mac OSX datafork font, 'NFNT'
>>> &4 belong 0x504f5354 Mac OSX datafork font, PostScript
>
> The second line becomes, for resources:
>> (4.L+24) ubeshort x \b, list offset %#x
> So the MAX limit problem also occurs here. The next lines inspect
> the resource type list array. The first resource type is shown by
> lines like:
>> (4.L+30) ubelong x \b, type %#x
>> (4.L+30) string x "%-.4s"
>
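How the first resource type is reached from the header can be sketched in Python. This assumes the layout used by the magic lines above (type list offset at map+24, first type tag 2 bytes after the start of the type list, which gives map+30 for the usual list offset 0x1c). The helper name and the synthetic file are hypothetical.

```python
import struct

def first_resource_type(buf: bytes) -> bytes:
    """Return the 4-byte tag of the first entry in the resource type list."""
    (map_offset,) = struct.unpack_from(">I", buf, 4)
    (list_offset,) = struct.unpack_from(">H", buf, map_offset + 24)
    # The type list begins with a 2-byte count field; the first tag follows.
    start = map_offset + list_offset + 2
    return buf[start:start + 4]

# Synthetic file modelled on the values shown for Icon_.rsrc style output:
# list offset 0x1c, name offset 0x32, one type 'icns', resource offset 0xa.
header = struct.pack(">4I", 0x100, 0x100, 0, 0x32)
resource_map = (
    bytes(24)                          # header copy area, handle, file ref, attributes
    + struct.pack(">HH", 0x1C, 0x32)   # type list offset, name list offset
    + struct.pack(">H", 0)             # count field of the type list
    + b"icns"                          # first resource type tag
    + struct.pack(">HH", 0, 0x0A)      # per-type count field, resource offset
)
sample = header + bytes(0x100 - len(header)) + resource_map

print(first_resource_type(sample))
```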
> Unfortunately half (13/26) of the DFONT samples are still not
> recognised. This took another day. But luckily I found a page about
> RSRC tags on the exiftool.org web site. According to that page, for my
> undetected DFONT samples I must add two more lines to
> Magdir/macintosh like:
>>> &4 belong 0x464f4e44 Mac OSX datafork font, 'FOND'
>>> &4 belong 0x76657273 Mac OSX datafork font, 'vers'
> Maybe there exist still more font tags.
>
> But now DFONT samples would be described twice with the -k option. So
> I moved the DFONT part from Magdir/macintosh and merged it into the
> resource part inside Magdir/apple. The display for DFONT is done by
> the subroutine apple-dfont. This looks like:
> 0 name apple-dfont
>> (4.L+30) ubelong x Mac OSX datafork font,
> !:mime application/x-dfont
> !:ext dfont
>> (4.L+30) ubelong 0x73666e74 TrueType
>> (4.L+30) ubelong 0x464f4e54 'FONT'
>> (4.L+30) ubelong 0x4e464e54 'NFNT'
>> (4.L+30) ubelong 0x504f5354 PostScript
>> (4.L+30) ubelong 0x464f4e44 'FOND'
>> (4.L+30) ubelong 0x76657273 'vers'
>
> Now at the beginning of the subroutine apple-rsr I must look for
> known RSRC tags. If I find one, I call the subroutine apple-dfont. If
> none is found, it is a normal resource handled by the default case.
> So this now starts like:
> 0 name apple-rsr
>> (4.L+30) ubelong 0x73666e74
>>> 0 use apple-dfont
>> (4.L+30) ubelong 0x464f4e54
>>> 0 use apple-dfont
> ...
>> (4.L+30) default x Apple HFS/HFS+ resource fork
> !:mime application/x-apple-rsr
> !:ext rsrc/rsr
>
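The dispatch between the font case and the default case can be mirrored in Python. The tag values are the ones listed in the apple-dfont lines above; the function and table names are hypothetical.

```python
# Tag values from the apple-dfont subroutine, mapped to the text it prints.
FONT_TAGS = {
    0x73666E74: "TrueType",     # 'sfnt'
    0x464F4E54: "'FONT'",
    0x4E464E54: "'NFNT'",
    0x504F5354: "PostScript",   # 'POST'
    0x464F4E44: "'FOND'",
    0x76657273: "'vers'",
}

def describe_by_first_type(tag: bytes) -> str:
    """Choose the description from the first resource type tag."""
    value = int.from_bytes(tag, "big")
    if value in FONT_TAGS:
        return "Mac OSX datafork font, " + FONT_TAGS[value]
    return "Apple HFS/HFS+ resource fork"   # default case

for tag in (b"sfnt", b"FOND", b"icns", b"CODE"):
    print(tag.decode("ascii"), "->", describe_by_first_type(tag))
```

Tags that are not in the table, such as 'icns' or 'CODE', fall through to the generic resource fork description.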
> After applying the above mentioned modifications by the patches
> file-5.43-apple-rsr.diff, file-5.43-macintosh-dfont.diff
> and file-5.43-file.h-big.diff, all my Apple resources are
> recognised and described in more detail. The misidentifications
> also vanish. With the -m Magdir/apple option the output now
> looks like:
>
> Alkaios.dfont: Mac OSX datafork font, TrueType,
> map offset 0x7884a, map length 0x72,
> data length 0x7874a,
> list offset 0x1c, name offset 0x6a,
> 2 types, 0x73666e74 'sfnt' * 4
> resource offset 0x12
> CPack.OSXScriptLauncher.rsrc.in: Apple HFS/HFS+ resource fork,
> map offset 0x124, map length 0x46,
> data length 0x24,
> nextResourceMap 0xf4577b00,
> fileRef 0xe801,
> list offset 0x1c, name offset 0x46,
> 2 types, 0x7363737a 'scsz' * 1
> resource offset 0x12
> Empty.rsrc.rsr: Apple HFS/HFS+ resource fork,
> map offset 0x12d, map length 0x46,
> data length 0x2d, at 16 0x8fd20000,
> nextResourceMap 0x9bb43a0,
> fileRef 0x23a0,
> list offset 0x1c, name offset 0x46,
> 2 types, 0x4b415358 'KASX' * 1
> resource offset 0x12
> HelveticaNeue.dfont: Mac OSX datafork font, 'FOND',
> map offset 0x6ab0f4, map length 0x1e3,
> data length 0x6aaff4,
> nextResourceMap 0x9000000,
> fileRef 0x1900,
> list offset 0x1c, name offset 0x13e,
> 3 types, 0x464f4e44 'FOND' * 7
> resource offset 0x1a
> Icon_.icns: data
> Icon_.rsrc: Apple HFS/HFS+ resource fork,
> map offset 0x9fc, map length 0x32,
> data length 0x8fc,
> nextResourceMap 0x43d08c,
> fileRef 0x2fc,
> list offset 0x1c, name offset 0x32,
> 1 type, 0x69636e73 'icns' * 1
> resource offset 0xa
> NPETraceSession.etl: data
> OpenSans-CondBold.dfont: data
> Panorama-12.icns: data
> Panorama.rsr: Apple HFS/HFS+ resource fork,
> map offset 0x1325b, map length 0x66d,
> data length 0x1315b,
> nextResourceMap 0x534f5254,
> fileRef 0x116, attributes 0x80,
> list offset 0x1c, name offset 0x5a6,
> 27 types, 0x414c5254 'ALRT' * 3
> resource offset 0xda
> Read me.txt.rsrc: Apple HFS/HFS+ resource fork,
> map offset 0x568, map length 0x46,
> data length 0x468,
> nextResourceMap 0xe8084,
> fileRef 0x63,
> list offset 0x1c, name offset 0x46,
> 2 types, 0x4d505352 'MPSR' * 1
> resource offset 0x12
> TCDB 2003-10 demo.pan.icns: data
> TCDB 2003-10 demo.pan.rsrc: Apple HFS/HFS+ resource fork,
> map offset 0x17dd, map length 0x32,
> data length 0x16dd,
> nextResourceMap 0x43d08c,
> fileRef 0x28e,
> list offset 0x1c, name offset 0x32,
> 1 type, 0x69636e73 'icns' * 1
> resource offset 0xa
> XLISP.RSR: Apple HFS/HFS+ resource fork,
> map offset 0xbc70, map length 0x56,
> data length 0xbb70,
> list offset 0x1c, name offset 0x56,
> 1 type, 0x434f4445 'CODE' * 4
> resource offset 0xa
> XLISPTIN.RSR: Apple HFS/HFS+ resource fork,
> map offset 0xab44, map length 0x56,
> data length 0xaa44,
> list offset 0x1c, name offset 0x56,
> 1 type, 0x434f4445 'CODE' * 4
> resource offset 0xa
> droplet.rsrc: Apple HFS/HFS+ resource fork,
> map offset 0x11e, map length 0x32,
> data length 0x1e,
> nextResourceMap 0xba5d4,
> fileRef 0x5c,
> list offset 0x1c, name offset 0x32,
> 1 type, 0x7363737a 'scsz' * 1
> resource offset 0xa
> empty.rsr: Apple HFS/HFS+ resource fork,
> map offset 0x100, map length 0x1e,
> data length 0,
> list offset 0x1c, name offset 0x1e
> nastro.pi1: data
>
>
>
> I hope my diff files can be applied in a future version of the file
> utility.
>
> Unfortunately a description of ETL itself is still missing. I will
> try to add one in a future session.
>
> With best wishes,
> Jörg Jenderek
> - --
> Jörg Jenderek
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYzJIMgAKCRCv8rHJQhrU
> 1kZpAJ0UCpiifBiajYRyYgvGnhB7Gnp6LQCcCiD1v6NBVd4uaZ7LIRuAYHU5TgY=
> =dNV0
> -----END PGP SIGNATURE-----
> Attachments:
> <trid-v-etl_rsr.txt.gz>
> <file-5_43-macintosh-dfont_diff.DEFANGED-30023>
> <file-5_43-macintosh-dfont_diff_sig.DEFANGED-30024>
> <file-5_43-file_h-big_diff.DEFANGED-30025>
> <file-5_43-file_h-big_diff_sig.DEFANGED-30026>
> <file-5_43-apple-rsr_diff_sig.DEFANGED-30027>
> <file-5_43-apple-rsr_diff.DEFANGED-30028>