[File] [PATCH] Magdir/apple HFS/HFS+ resource fork misidentifies 1 Microsoft Event Trace Log *.ETL

Christos Zoulas christos at zoulas.com
Sun Oct 2 12:54:39 UTC 2022


Committed, thanks!

christos

> On Sep 26, 2022, at 8:47 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hello,
> 
> A few days ago I ran the cleaning tool czkawka, found at
> https://qarmin.github.io/czkawka/. One menu item concerns bad
> extensions. After running the tool I looked through the saved file
> list results_bad_extensions.txt for examples of bad extensions.
> One of the listed extensions is ETL.
> 
> These files are Microsoft Event Trace Logs. When running file
> command version 5.43 on some ETL examples and related genuine
> resource samples, I get output like:
> 
> Alkaios.dfont:                   Mac OSX datafork font, TrueType
> CPack.OSXScriptLauncher.rsrc.in: Apple HFS/HFS+ resource fork
> Empty.rsrc.rsr:                  data
> HelveticaNeue.dfont:             data
> Icon_.icns:                      Mac OS X icon, 2296 bytes, "ics#"
> Icon_.rsrc:                      Apple HFS/HFS+ resource fork
> NPETraceSession.etl:             Apple HFS/HFS+ resource fork
> OpenSans-CondBold.dfont:         data
> Panorama-12.icns:                Mac OS X icon, 2296 bytes, "ICN#"
> Panorama.rsr:                    data
> Read me.txt.rsrc:                Apple HFS/HFS+ resource fork
> TCDB 2003-10 demo.pan.icns:      Mac OS X icon, 5849 bytes, "ics#"
> TCDB 2003-10 demo.pan.rsrc:      Apple HFS/HFS+ resource fork
> XLISP.RSR:                       Apple HFS/HFS+ resource fork
> XLISPTIN.RSR:                    Apple HFS/HFS+ resource fork
> droplet.rsrc:                    Apple HFS/HFS+ resource fork
> empty.rsr:                       Apple HFS/HFS+ resource fork
> nastro.pi1:                      DEGAS Elite bitmap 320 x 200 x 16,
> 				 color palette 0000 0000 0000
> 				 0000 0000 ...
> 
> Furthermore, for the resource samples only the generic
> application/octet-stream mime type is shown with the -i option. With
> option --extension only the 3 byte sequence ??? is shown.
> 
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It describes
> the one Microsoft Event Trace Log, NPETraceSession.etl, correctly as
> "Window tracing/diagnostic binary log" by etl.trid.xml, but it does
> not recognize the resource samples. Most DFONT examples are
> described as "Macintosh OS X Data Fork Font" by dfont.trid.xml,
> whereas the file command does not recognize many of them (see the
> appended trid-v-etl_rsr.txt.gz). The ETL sample is misidentified as
> "Apple HFS/HFS+ resource fork" by the file command.
> 
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies only the ICNS images, as "Apple Icon Image Format" by PUID
> x-fmt/311.
> 
> Luckily I found some specifications on the net. That information
> is now expressed by comment lines inside Magdir/apple like:
> 
> # URL:		http://fileformats.archiveteam.org/
> #		wiki/Macintosh_resource_file
> #		https://en.wikipedia.org/wiki/Resource_fork
> # Reference:	https://github.com/
> #		kreativekorp/ksfl/wiki/Macintosh-Resource-File-Format
> #		http://developer.apple.com
> #		/legacy/mac/library/documentation/mac/pdf/
> #		MoreMacintoshToolbox.pdf
> #		https://formats.kaitai.io/resource_fork/
> 
> According to its documentation the software deark can handle such
> resources. Often this is true and can be verified by commands like:
> 	deark -m macrsrc -l Panorama.rsr
> 	deark -m macrsrc Icon_.rsrc
> 
> For the undetected Panorama.rsr we get Mac OS X icons (ICNS). So we
> know this is a valid resource.
> 
> In the current Magdir/apple the description is done by lines like:
> 0	string  \000\000\001\000
>> 4	leshort 0
>>> 16	lelong  0			Apple HFS/HFS+ resource fork
> 
> At first glance this looks good, because 10 bytes are used for
> recognition, but in reality only 1 byte is not nil. So apparently
> these magic lines are too weak. That is why the ETL sample is
> misidentified, and apparently recognition is not exact in all cases.
> 
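The weakness of the old tests can be illustrated with a minimal Python
sketch. It only mimics the three quoted magic lines; the function name
and the 20-byte sample buffer are assumptions made for illustration and
are not part of file(1) or of the patch.

```python
import struct

def old_magic_matches(buf: bytes) -> bool:
    """Mimic the three old Magdir/apple tests (illustrative helper only)."""
    if len(buf) < 20:
        return False
    if buf[0:4] != b"\x00\x00\x01\x00":               # 0    string \000\000\001\000
        return False
    if struct.unpack_from("<H", buf, 4)[0] != 0:      # >4   leshort 0
        return False
    return struct.unpack_from("<I", buf, 16)[0] == 0  # >>16 lelong  0

# Hypothetical buffer: 20 bytes, all nil except the single byte at offset 2.
almost_empty = bytearray(20)
almost_empty[2] = 0x01

print(old_magic_matches(bytes(almost_empty)))  # True
```

Any file that starts with 00 00 01 00 and is otherwise nil in the tested
bytes passes, which fits the observation that only 1 of the 10 tested
bytes is not nil.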
> So I put the displaying part inside a subroutine apple-rsr. Then later
> I can adjust the test lines to exclude misidentified samples. This
> subroutine starts like:
> 0	name		apple-rsr
>> 0	ubelong	x			Apple HFS/HFS+ resource fork
> !:mime	application/x-apple-rsr
> !:ext	rsrc/rsr/in
>> 0	ubelong	!0x100			\b, data offset %#x
>> 4	ubelong	x			\b, map offset %#x
>> 12	ubelong	x			\b, map length %#x
>> 8	ubelong	x			\b, data length %#x
>> 16	ubelong	!0			\b, at 16 %#x
> 
> Instead of the generic application/octet-stream I chose a
> user-defined mime type. The standard file suffixes are rsrc or rsr.
> According to documentation other extensions also exist. On my systems
> I also found an example with suffix rsrc.in. Apparently the suffix
> here means that the file is used as a template for real resources.
> Such samples are found as part of compiler suites. On Windows such
> an example is part of Visual Studio, and on Linux I found such a
> sample as part of the cmake tools.
> 
> At the beginning the data offset is stored as a 4 byte big endian
> value. This is always 100h. This was used as first test by a line like:
> 0	string  \000\000\001\000
> So this line is always true and can be kept.
> 
> Afterwards the map offset is stored as a 4 byte big endian value. For
> most samples these values are "low" (32K). Other offset values are
> 16-bit only, so it sounds logical that 16-bit values are used here
> as well. So maybe the original magic writer misunderstood that fact
> and believed that map offsets are also "low", or in other words that
> the 2 upper bytes of that offset value are nil. This was expressed by
> the second test line like:
>> 4	leshort 0
> Apparently this is not always true. For sample Panorama.rsr I get
> map offset 0x1325b. For the misidentified NPETraceSession.etl I get a
> nil value here. When the data section starts at offset 0x100 then the
> map section must start later, so the map offset must be greater than
> or equal to 0x100. In the worst case, like the mentioned empty.rsr
> without content, the data length is 0, so the map that follows
> starts at offset 0x100. So the second test line becomes like:
>> 4	ubelong	>0xFF
> 
> Afterwards the data and map lengths are stored as 4 byte big endian
> values. According to documentation the use of 16-bit signed integers
> imposes a 32K limitation on the size of the resource map. So this can
> be used as an additional test. For the misidentified
> NPETraceSession.etl I also get nil values here. So I use this as new
> third test by a line like:
>>> 12	ubelong	<0x8001
> So a few Atari DEGAS Elite bitmaps (eil2.pi1 nastro.pi1) with invalid
> "high" (0x6550766 0x7510763) map lengths are skipped.
> 
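The three header tests discussed so far can be sketched in Python as
follows. This is only an illustration of the described logic, not the
patch itself. The helper names are made up, the headers are rebuilt
from the numbers quoted in this mail, and the data length of the
ETL-like header and the map offset of the DEGAS-like header are
assumptions.

```python
import struct

def parse_header(buf: bytes):
    """Return (data_offset, map_offset, data_length, map_length), big endian."""
    return struct.unpack_from(">IIII", buf, 0)

def new_tests_pass(buf: bytes) -> bool:
    """Apply the first three tests described in the mail (illustrative)."""
    if len(buf) < 16:
        return False
    data_off, map_off, _data_len, map_len = parse_header(buf)
    return (data_off == 0x100       # 0    string  \000\000\001\000
            and map_off > 0xFF      # >4   ubelong >0xFF
            and map_len < 0x8001)   # >>12 ubelong <0x8001

def make_header(map_off: int, data_len: int, map_len: int) -> bytes:
    """Build a 16 byte header with the fixed data offset 0x100."""
    return struct.pack(">IIII", 0x100, map_off, data_len, map_len)

panorama = make_header(0x1325B, 0x1315B, 0x66D)   # values reported for Panorama.rsr
empty = make_header(0x100, 0, 0x1E)               # values reported for empty.rsr
etl_like = make_header(0, 0, 0)                   # nil map offset, as seen for the ETL sample
degas_like = make_header(0x200, 0, 0x6550766)     # map offset 0x200 is an assumption

for name, hdr in [("panorama", panorama), ("empty", empty),
                  ("etl_like", etl_like), ("degas_like", degas_like)]:
    print(name, new_tests_pass(hdr))
```

The ETL-like header fails on the map offset test, while the DEGAS-like
header fails only on the map length test.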
> At offset 16, 112 reserved bytes are stored. According to
> documentation this field can generally be ignored when reading and
> writing. This field is used by the Classic Mac OS Finder as temporary
> storage space. It usually contains parts of the file metadata (name,
> type/creator code, etc.). In resource files written by Mac OS X this
> field is set to all zero bytes. This was used as third test by a line
> like:
>>> 16	lelong  0			Apple HFS/HFS+ resource fork
> For most examples this field is nil, but for two of my dozen
> inspected resources I get non-nil values. One is Empty.rsrc.rsr
> (0x8fd20000), which is part of the Panorama ProVUE Development suite.
> The other is OpenSans-CondBold.dfont (0x00768c2b). So I am not sure
> if this is just an accident. So I could create one branch suited for
> most resources and another branch for such "exotic" resources. So
> this would become like:
>>>> 16	lelong  =0
>>>>> 0	use	apple-rsr
>>>> 16	lelong  !0
>>>>> 0	use	apple-rsr
> 
> According to documentation the resource map contains a copy of the
> resource header. For debugging purposes I show this information
> inside the subroutine by lines like:
>> (4.L)	ubelong	x			\b, DATA offset %#x
>> (4.L+4) ubelong x			\b, MAP offset %#x
>> (4.L+8) ubelong x			\b, DATA length %#x
>> (4.L+12) ubelong x			\b, MAP length %#x
> If this were true I could use this information to test for a second
> 0x100 value. As my wording suggests, it is not. It took me a day
> to find out what is wrong. Later, on the page about Macintosh font
> formats on fontforge.org, I found an explanation. It mentions a
> distinction between resource fork and data fork. For a resource
> fork the mentioned facts about the copied header apply. For a data
> fork these 16 bytes are zeroed. That is what I see for 2 examples
> (XLISP.RSR XLISPTIN.RSR). So using this area as a test becomes more
> complicated. And things become worse.
> 
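The two cases for the first 16 bytes of the resource map (a copy of the
header, or all zero) can be told apart with a short Python sketch. The
function names and the synthetic sample layout (header, padding up to
0x100, then a 0x1e byte map as reported for empty.rsr) are assumptions
for illustration only.

```python
import struct

def map_header_copy(buf: bytes):
    """Return the 4 big endian longs at the start of the map, or None."""
    map_off = struct.unpack_from(">I", buf, 4)[0]
    if map_off + 16 > len(buf):
        return None                      # map start is not inside the buffer
    return struct.unpack_from(">IIII", buf, map_off)

def classify_map_start(buf: bytes) -> str:
    header = struct.unpack_from(">IIII", buf, 0)
    copy = map_header_copy(buf)
    if copy is None:
        return "unreachable"
    if copy == header:
        return "copied header"           # corresponds to (4.L) ubelong 0x100
    if copy == (0, 0, 0, 0):
        return "zeroed"                  # corresponds to (4.L) ubelong 0
    return "other"

def make_sample(copy_header: bool) -> bytes:
    """Synthetic file: header, padding to 0x100, then a 0x1e byte map."""
    header = struct.pack(">IIII", 0x100, 0x100, 0, 0x1E)
    map_start = header if copy_header else bytes(16)
    return header + bytes(0x100 - 16) + map_start + bytes(0x1E - 16)

print(classify_map_start(make_sample(True)))   # copied header
print(classify_map_start(make_sample(False)))  # zeroed
```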
> According to documentation DFONT files are also resource forks. So I
> checked such samples. For a few (4/24) such fonts the above part of
> the subroutine is never executed. It took another day to find out
> what is wrong. For the bad examples the output contains parts like:
> Times.dfont		map offset 0x1866b8, map length 0x1b1
> Courier.dfont		map offset 0x191dfe, map length 0x157
> Helvetica.dfont		map offset 0x27194e, map length 0x1ec
> HelveticaNeue.dfont	map offset 0x6ab0f4, map length 0x1e3
> For recognised samples I get parts like:
> Geneva.dfont:           map offset 0xb3f0e, map length 0x124
> Monaco.dfont:           map offset 0x899ca, map length 0xed
> Bank Gothic.dfont:      map offset 0x54aa4, map length 0x95
> GohuFont.dfont:		map offset 0x1873b, map length 0x93
> Alkaios.dfont:  	map offset 0x7884a, map length 0x72
> So this was triggered by internal limits of the file command. In
> src/file.h the constant FILE_BYTES_MAX is defined as 1 MiB (1024*1024
> =0x100000). When raising this limit above 0x6ab0f4 (about 6.67 MiB,
> for HelveticaNeue.dfont) most things work as expected. One example,
> OpenSans-CondBold.dfont, is still not recognised. This seems to be an
> accident, because the reported map offset 0x110100 is above the file
> size 264372 (=0x408B4). So I use the information at the start of the
> map as new tests by lines like:
>>>> (4.L)	ubelong	0x100
>>>>> 0	use	apple-rsr
>>>> (4.L)	ubelong	0
>>>>> 0	use	apple-rsr
> 
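The effect of the byte limit on the DFONT samples can be checked with
simple arithmetic. The sketch below assumes that only the first
FILE_BYTES_MAX bytes of a file are examined and that an indirect test
such as (4.L) needs the 4 bytes at the map offset to lie inside that
range. The map offsets are the ones listed in this mail; the helper
name is made up.

```python
FILE_BYTES_MAX = 1024 * 1024  # 0x100000, the value quoted from src/file.h

# Map offsets as reported in the mail for the DFONT samples.
SAMPLES = {
    "Times.dfont": 0x1866B8,
    "Courier.dfont": 0x191DFE,
    "Helvetica.dfont": 0x27194E,
    "HelveticaNeue.dfont": 0x6AB0F4,
    "Geneva.dfont": 0xB3F0E,
    "Monaco.dfont": 0x899CA,
    "Bank Gothic.dfont": 0x54AA4,
    "GohuFont.dfont": 0x1873B,
    "Alkaios.dfont": 0x7884A,
}

def map_reachable(map_offset: int, bytes_examined: int = FILE_BYTES_MAX) -> bool:
    """True if the 4 bytes at map_offset lie within the examined range."""
    return map_offset + 4 <= bytes_examined

for name, offset in SAMPLES.items():
    status = "ok" if map_reachable(offset) else "beyond limit"
    print(f"{name:22} {offset:#10x} {status}")
```

With the 1 MiB default exactly the four unrecognised fonts fall beyond
the limit. With a limit of 7 MiB all nine listed offsets are reachable.
The limit can also be raised at run time with the -P bytes=... option
of the file command instead of changing the constant.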
> But things are still complicated, because DFONT files are often
> already described by Magdir/macintosh via lines like:
> 0	belong	0x100
>> (0x4.L+24)	beshort	x
>>> &4	belong	0x73666e74	Mac OSX datafork font, TrueType
>>> &4	belong	0x464f4e54	Mac OSX datafork font, 'FONT'
>>> &4	belong	0x4e464e54	Mac OSX datafork font, 'NFNT'
>>> &4	belong	0x504f5354	Mac OSX datafork font, PostScript
> 
> The second line, adapted for resources, becomes like:
>> (4.L+24) ubeshort x			\b, list offset %#x
> So the FILE_BYTES_MAX limit also applies here. The next lines inspect
> the resource type list array. The first resource type is shown by
> lines like:
>> (4.L+30) ubelong x			\b, type %#x
>> (4.L+30) string x			"%-.4s"
> 
> Unfortunately some (13/26) DFONT samples are still not recognised.
> This took another day. But luckily I found a page about RSRC Tags on
> the exiftool.org web site. According to that page, for my undetected
> DFONT samples I must add two more lines to Magdir/macintosh like:
>>> &4	belong	0x464f4e44	Mac OSX datafork font, 'FOND'
>>> &4	belong	0x76657273	Mac OSX datafork font, 'vers'
> Maybe more font tags exist.
> 
> But now DFONT samples would be described twice with the -k option. So
> I moved the DFONT part from Magdir/macintosh and merged it into the
> resource part inside Magdir/apple. The display for DFONT is done by
> subroutine apple-dfont. This looks like:
> 0	name		apple-dfont
>> (4.L+30)	ubelong x		Mac OSX datafork font,
> !:mime	application/x-dfont
> !:ext	dfont
>> (4.L+30)	ubelong	0x73666e74	TrueType
>> (4.L+30)	ubelong	0x464f4e54	'FONT'
>> (4.L+30)	ubelong	0x4e464e54	'NFNT'
>> (4.L+30)	ubelong	0x504f5354	PostScript
>> (4.L+30)	ubelong	0x464f4e44	'FOND'
>> (4.L+30)	ubelong	0x76657273	'vers'
> 
> Now at the beginning of subroutine apple-rsr I must look for known
> RSRC Tags. If I find one, I call the subroutine apple-dfont. If none
> is found, it is a normal resource handled by the default case. So
> this now starts like:
> 0	name		apple-rsr
>> (4.L+30)	ubelong	0x73666e74
>>> 0	use	apple-dfont
>> (4.L+30)	ubelong	0x464f4e54
>>> 0	use	apple-dfont
> ...
>> (4.L+30)	default	x		Apple HFS/HFS+ resource fork
> !:mime	application/x-apple-rsr
> !:ext	rsrc/rsr
> 
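The dispatch on the first resource type can be sketched in Python as
follows. The tag table is taken from the magic lines quoted in this
mail; the function names and the synthetic sample file are assumptions
for illustration. Like the magic lines, the sketch reads the first type
at map offset + 30, which presumes the usual type list offset of 0x1c.

```python
import struct

# First resource type tag -> description, as in the quoted magic lines.
FONT_TAGS = {
    b"sfnt": "TrueType",     # 0x73666e74
    b"FONT": "'FONT'",       # 0x464f4e54
    b"NFNT": "'NFNT'",       # 0x4e464e54
    b"POST": "PostScript",   # 0x504f5354
    b"FOND": "'FOND'",       # 0x464f4e44
    b"vers": "'vers'",       # 0x76657273
}

def first_type_tag(buf: bytes):
    """Return the 4 byte tag at (map offset + 30), or None if out of range."""
    map_off = struct.unpack_from(">I", buf, 4)[0]
    pos = map_off + 30
    if pos + 4 > len(buf):
        return None
    return buf[pos:pos + 4]

def describe(buf: bytes) -> str:
    tag = first_type_tag(buf)
    if tag in FONT_TAGS:                           # like "use apple-dfont"
        return "Mac OSX datafork font, " + FONT_TAGS[tag]
    return "Apple HFS/HFS+ resource fork"          # like the default case

def make_sample(tag: bytes) -> bytes:
    """Synthetic file: header, padding to 0x100, then a 0x32 byte map."""
    header = struct.pack(">IIII", 0x100, 0x100, 0, 0x32)
    resource_map = bytes(30) + tag + bytes(16)     # 30 + 4 + 16 = 0x32 bytes
    return header + bytes(0x100 - 16) + resource_map

print(describe(make_sample(b"sfnt")))  # Mac OSX datafork font, TrueType
print(describe(make_sample(b"icns")))  # Apple HFS/HFS+ resource fork
```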
> After applying the above mentioned modifications by patches
> file-5.43-apple-rsr.diff, file-5.43-macintosh-dfont.diff
> and file-5.43-file.h-big.diff, all my Apple resources are
> recognised and are described in more detail. The
> misidentification also vanishes. With the -m Magdir/apple option
> this now looks like:
> 
> Alkaios.dfont:                   Mac OSX datafork font, TrueType,
> 				 map offset 0x7884a, map length 0x72,
> 				 data length 0x7874a,
> 				 list offset 0x1c, name offset 0x6a,
> 				 2 types, 0x73666e74 'sfnt' * 4
> 				 resource offset 0x12
> CPack.OSXScriptLauncher.rsrc.in: Apple HFS/HFS+ resource fork,
> 				 map offset 0x124, map length 0x46,
> 				 data length 0x24,
> 				 nextResourceMap 0xf4577b00,
> 				 fileRef 0xe801,
> 				 list offset 0x1c, name offset 0x46,
> 				 2 types, 0x7363737a 'scsz' * 1
> 				 resource offset 0x12
> Empty.rsrc.rsr:                  Apple HFS/HFS+ resource fork,
> 				 map offset 0x12d, map length 0x46,
> 				 data length 0x2d, at 16 0x8fd20000,
> 				 nextResourceMap 0x9bb43a0,
> 				 fileRef 0x23a0,
> 				 list offset 0x1c, name offset 0x46,
> 				 2 types, 0x4b415358 'KASX' * 1
> 				 resource offset 0x12
> HelveticaNeue.dfont:             Mac OSX datafork font, 'FOND',
> 				 map offset 0x6ab0f4, map length 0x1e3,
> 				 data length 0x6aaff4,
> 				 nextResourceMap 0x9000000,
> 				 fileRef 0x1900,
> 				 list offset 0x1c, name offset 0x13e,
> 				 3 types, 0x464f4e44 'FOND' * 7
> 				 resource offset 0x1a
> Icon_.icns:                      data
> Icon_.rsrc:                      Apple HFS/HFS+ resource fork,
> 				 map offset 0x9fc, map length 0x32,
> 				 data length 0x8fc,
> 				 nextResourceMap 0x43d08c,
> 				 fileRef 0x2fc,
> 				 list offset 0x1c, name offset 0x32,
> 				 1 type, 0x69636e73 'icns' * 1
> 				 resource offset 0xa
> NPETraceSession.etl:             data
> OpenSans-CondBold.dfont:         data
> Panorama-12.icns:                data
> Panorama.rsr:                    Apple HFS/HFS+ resource fork,
> 				 map offset 0x1325b, map length 0x66d,
> 				 data length 0x1315b,
> 				 nextResourceMap 0x534f5254,
> 				 fileRef 0x116, attributes 0x80,
> 				 list offset 0x1c, name offset 0x5a6,
> 				 27 types, 0x414c5254 'ALRT' * 3
> 				 resource offset 0xda
> Read me.txt.rsrc:                Apple HFS/HFS+ resource fork,
> 				 map offset 0x568, map length 0x46,
> 				 data length 0x468,
> 				 nextResourceMap 0xe8084,
> 				 fileRef 0x63,
> 				 list offset 0x1c, name offset 0x46,
> 				 2 types, 0x4d505352 'MPSR' * 1
> 				 resource offset 0x12
> TCDB 2003-10 demo.pan.icns:      data
> TCDB 2003-10 demo.pan.rsrc:      Apple HFS/HFS+ resource fork,
> 				 map offset 0x17dd, map length 0x32,
> 				 data length 0x16dd,
> 				 nextResourceMap 0x43d08c,
> 				 fileRef 0x28e,
> 				 list offset 0x1c, name offset 0x32,
> 				 1 type, 0x69636e73 'icns' * 1
> 				 resource offset 0xa
> XLISP.RSR:                       Apple HFS/HFS+ resource fork,
> 				 map offset 0xbc70, map length 0x56,
> 				 data length 0xbb70,
> 				 list offset 0x1c, name offset 0x56,
> 				 1 type, 0x434f4445 'CODE' * 4
> 				 resource offset 0xa
> XLISPTIN.RSR:                    Apple HFS/HFS+ resource fork,
> 				 map offset 0xab44, map length 0x56,
> 				 data length 0xaa44,
> 				 list offset 0x1c, name offset 0x56,
> 				 1 type, 0x434f4445 'CODE' * 4
> 				 resource offset 0xa
> droplet.rsrc:                    Apple HFS/HFS+ resource fork,
> 				 map offset 0x11e, map length 0x32,
> 				 data length 0x1e,
> 				 nextResourceMap 0xba5d4,
> 				 fileRef 0x5c,
> 				 list offset 0x1c, name offset 0x32,
> 				 1 type, 0x7363737a 'scsz' * 1
> 				 resource offset 0xa
> empty.rsr:                       Apple HFS/HFS+ resource fork,
> 				 map offset 0x100, map length 0x1e,
> 				 data length 0,
> 				 list offset 0x1c, name offset 0x1e
> nastro.pi1:                      data
> 
> 
> 
> I hope my diff files can be applied in a future version of the file
> utility.
> 
> Unfortunately a description of ETL itself is missing. I will try to
> do this in a future session.
> 
> With best wishes,
> Jörg Jenderek
> - --
> Jörg Jenderek
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
> 
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYzJIMgAKCRCv8rHJQhrU
> 1kZpAJ0UCpiifBiajYRyYgvGnhB7Gnp6LQCcCiD1v6NBVd4uaZ7LIRuAYHU5TgY=
> =dNV0
> -----END PGP SIGNATURE-----
> <trid-v-etl_rsr.txt.gz><file-5_43-macintosh-dfont_diff.DEFANGED-30023><file-5_43-macintosh-dfont_diff_sig.DEFANGED-30024><file-5_43-file_h-big_diff.DEFANGED-30025><file-5_43-file_h-big_diff_sig.DEFANGED-30026><file-5_43-apple-rsr_diff_sig.DEFANGED-30027><file-5_43-apple-rsr_diff.DEFANGED-30028>--
