[File] [PATCH] of Magdir/msdos for MZ executables; negative relocation address *.ICL

Christos Zoulas christos at zoulas.com
Fri Nov 18 16:15:45 UTC 2022

Wow, this magic is truly complicated.... Committed.


> On Nov 13, 2022, at 8:14 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
> Hello,
> Some months ago I inspected files on my EFI partition. For files
> starting with the 2-byte MZ magic I got unexpected recognitions.
> There is a chain of errors, so after a month of work I will split the
> fixes up and try to start with what seems to be the beginning.
> When running file command version 5.43 on such examples and other
> related files I get output like:
> BGISRV.DRV:    MS-DOS executable
> EXE64.exe:     MS-DOS executable PE32+ executable (GUI)
> 	       x86-64, for MS Windows
> MACCNV55.EXE:  MS-DOS executable
> PCISCAN.EXE:   MS-DOS executable, MZ for MS-DOS
> WORD60.ICL:    MS-DOS executable
> stinger64.exe: MS-DOS executable PE32+ executable (GUI)
> 	       x86-64 (stripped to external PDB), for MS Windows,
> 	       MZ for MS-DOS
> With the --extension option the wrong file name extensions are displayed.
> This looks like:
> BGISRV.DRV:    exe/com/vlm
> EXE64.exe:     exe/com/vlm
> MACCNV55.EXE:  exe/com/vlm
> PCISCAN.EXE:   exe/com/vlm
> WORD60.ICL:    exe/com/vlm
> stinger64.exe: exe/com/vlm
> Furthermore, with the -i option only the generic DOS executable
> mime type application/x-dosexec is shown for all samples.
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It lists the file
> name extensions in use and, with the -v option, often the related URL
> pointing to file format information (see the appended
> trid-v-e_lfarlc.txt.gz).
> Furthermore I looked in the TrID database (triddefs_xml.7z) files for
> similar MZ executables. These are expressed by XML constructs like:
> 	<Pos>0</Pos>
> Then I looked for such MZ executables and inserted the right lines into
> Magdir/msdos. If I did not find an MZ sample mentioned by TrID, I
> added it as a TODO comment with lines like:
> #	TODO
> # FLT:	Syntrillium CoolEdit Filter
> #	https://en.wikipedia.org/wiki/Adobe_Audition
> # FMX64:FileMaker Pro 64-bit plug-in
> #	https://en.wikipedia.org/wiki/FileMaker
> # FMX:	FileMaker Pro 32-bit plug-in
> # ...
> # ZAP:	ZoneLabs Zone Alarm data
> #	http://www.zonelabs.com
> The first error is that some EFI files like ext4_x64_signed.efi and
> Shell_Full.efi are also identified as MS-DOS executables. The same
> error occurs for all 16-bit Windows Icons Libraries like WORD60.ICL,
> for all Microsoft compiled help format 2.0 files like
> WINWORD.DEV.HXS, and for Michal Mutl's EXE Explorer EXE64.exe.
> Inside Magdir/msdos the first test looks for e_magic at the beginning
> with a line like
> 0	string/b	MZ
> Afterwards, for debugging, I inserted some lines like:
> #>0x18		uleshort	x	\b, e_lfarlc=0x%x
> #>(0x3c.l)	string		x	\b, at 0x3c %.2s
> e_lfarlc is the address of the relocation table. That value is later
> used for sub-classification. For some examples I get unexpected values
> here. The results are summarised in the following table:
> # http://www.mitec.cz/Downloads/EXE.zip/EXE64.exe	0x8ead
> # some EFI apps Shell_Full.efi ext4_x64_signed.efi	0
> # Icon library WORD60.ICL				0
> # Microsoft compiled help format 2.0 WINWORD.DEV.HXS	0
> The 4-byte value at offset 3Ch points to the next exe header, which
> starts with its own magic. This magic is used in later tests. I myself
> found samples with values like:
> 	PE NE LE LX W3 W4
> And according to the documentation the following strings can also occur:
> 	ZM DL MP P2 P3
> The second test looks for a "low" relocation table value with lines like
> >0x18	leshort <0x40 MS-DOS executable
> !:mime	application/x-dosexec
> !:ext	exe/com
> The accompanying comment says:
> # All non-DOS EXE extensions have the relocation table more than
> # 0x40 bytes into the file.
> This now becomes:
> # Most non-DOS MZ-executable extensions have the relocation table
> # more than 0x40 bytes into the file.
> For Michal Mutl's (http://www.mitec.cz/) EXE Explorer EXE64.exe I see
> a value of 0x8ead. The current magic line interprets this as a
> negative value, so this sample is handled by the above test branch
> and considered an "MS-DOS executable" with the wrong mime type
> application/x-dosexec.
> But according to the documentation this value is an unsigned integer.
> So I changed all test lines concerning e_lfarlc to unsigned. The
> test line now becomes:
> >0x18	uleshort <0x40
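
To illustrate the signed versus unsigned point, here is a minimal Python sketch (not part of the patch) that decodes the two bytes of e_lfarlc for the EXE64.exe value 0x8ead both ways and applies the "<0x40" comparison. The variable names are only illustrative.

```python
import struct

# The two bytes stored at offset 0x18 for a little-endian value of 0x8ead.
raw = bytes([0xAD, 0x8E])

# "leshort" style: signed 16-bit little-endian.
signed_value = struct.unpack("<h", raw)[0]
# "uleshort" style: unsigned 16-bit little-endian.
unsigned_value = struct.unpack("<H", raw)[0]

# The signed reading is negative, so it passes the "<0x40" test by accident.
print(signed_value, signed_value < 0x40)
# The unsigned reading is far above 0x40, so the branch is not taken.
print(hex(unsigned_value), unsigned_value < 0x40)
```

Read as signed, the value falls below 0x40 and the sample lands in the DOS branch; read as unsigned, it does not.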
> Now additional tests are needed in that branch. So I look for a
> possible new header via offset 3Ch. If it is neither a portable
> executable (PE) nor a new executable (NE) nor an LX executable, then
> it is really a DOS executable (like MACCNV55.EXE). That is now
> expressed by lines
> >>(0x3c.l)	default	x	MS-DOS executable
> !:mime	application/x-dosexec
> !:ext	exe/com
> If it is a portable executable (PE), then do nothing, because PE files
> are inspected later in another branch. This is now done by a line like
> >>(0x3c.l)	string	PE
> So samples ext4_x64_signed.efi, Shell_Full.efi and WINWORD.DEV.HXS
> are not misidentified any more as DOS executables.
> Some OS/2 executables like PCISCAN.EXE are now handled by a branch
> that looks like:
> >>(0x3c.l)	string	LX
> >>>(0x3c.l)	use		lx-executable
> Then I also checked for new executables (NE) with a low e_lfarlc. In
> that branch I only found 16-bit Windows Icons Libraries. So these are
> matched by lines like:
> >>(0x3c.l)	string	NE	Windows Icons Library 16-bit
> !:mime	image/x-ms-icl
> !:ext	icl
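
The low-e_lfarlc branch described above can be sketched in Python as follows. This is only an approximation of the magic logic, not the file(1) implementation: the function names classify_low_reloc and make_mz are hypothetical, the offset at 0x3c is read as an unsigned 32-bit value, and the returned strings are shortened labels.

```python
import struct
from typing import Optional


def classify_low_reloc(data: bytes) -> Optional[str]:
    """Mimic the branch taken when e_lfarlc (offset 0x18) is below 0x40.

    Returns a label, or None when this branch does not apply or prints
    nothing (PE is handled by a later branch).
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    (e_lfarlc,) = struct.unpack_from("<H", data, 0x18)  # unsigned, as in the patch
    if e_lfarlc >= 0x40:
        return None
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of the next header
    magic = data[e_lfanew:e_lfanew + 2]
    if magic == b"PE":
        return None  # inspected later in the PE branch
    if magic == b"LX":
        return "LX executable"
    if magic == b"NE":
        return "Windows Icons Library 16-bit"
    return "MS-DOS executable"  # the "default" case


def make_mz(e_lfarlc: int, e_lfanew: int = 0x40, magic: bytes = b"") -> bytes:
    """Build a synthetic MZ header for experiments (hypothetical helper)."""
    header = bytearray(0x40)
    header[0:2] = b"MZ"
    struct.pack_into("<H", header, 0x18, e_lfarlc)
    struct.pack_into("<I", header, 0x3C, e_lfanew)
    return bytes(header) + b"\0" * (e_lfanew - 0x40) + magic


print(classify_low_reloc(make_mz(0, 0x40, b"NE")))
print(classify_low_reloc(make_mz(0x1C)))
```

The synthetic headers stand in for samples such as WORD60.ICL and MACCNV55.EXE; real files would simply be read with open(path, "rb").read().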
> For many samples like xcopy32.exe, stinger64.exe and WimUtil.exe I
> get, after identification as PE32 executable, the additional message
> text "MZ for MS-DOS". This message was triggered by lines like:
> >>>>&(2.s-514)	string	!LE
> >>>>>&-2	string	!BW \b, MZ for MS-DOS
> !:mime	application/x-dosexec
> Unfortunately I was not able to understand the preceding magic test
> lines, which look like spaghetti. So I skip such Portable Executables
> here by additionally looking for the PE magic. If I find neither this
> magic nor LX, then it should be a real DOS executable. This now
> becomes:
> >>>>&(2.s-514)	string	!LE
> >>>>>&-2	string	!BW
> >>>>>(0x3c.l)	string	!PE	\b, MZ for MS-DOS
> >>>>>>(0x3c.l)	string	!LX
> >>>>>>>(0x3c.l)	string	!PE	\b, MZ for MS-DOS_
> !:mime	application/x-dosexec
> Unfortunately I myself found no such DOS executable, but now the
> irritating DOS message text has vanished.
> The display part for portable executables (PE) starts with lines like:
> >(0x3c.l)	string		PE\0\0	PE
> !:mime	application/x-dosexec
> But according to the documentation PE has its own mime type. So this
> now becomes
> >(0x3c.l)	string		PE\0\0	PE
> !:mime	application/vnd.microsoft.portable-executable
> For debugging purposes the DLL Characteristics value and the Windows
> Subsystem can be shown by lines like
> #>>(0x3c.l+22)	leshort		x	\b, CHARACTERISTICS 0x%x
> #>>(0x3c.l+92)	leshort		x	\b, SUBSYSTEM %u
> At the end of the PE display part I also show the number of sections
> if there is more than one. This looks like:
> >>0x30	string		Inno \b, InnoSetup self-extracting archive
> >>(0x3c.l+6)	leshort			>1	\b, %u sections
> Normal Windows DLL libraries have a few sections, for example for
> code, data and resources. Sometimes the PE format is only used as a
> container, as for Microsoft compiled help format 2.0 (*.hxs) or
> Windows Icons Library (*.icl). Such PE containers have fewer sections.
> So I can use this additional information to distinguish PE samples in
> more detail.
> After applying the above-mentioned modifications with the patch
> file-5.43-msdos-e_lfarlc.diff I get more correct output like:
> BGISRV.DRV:    MS-DOS executable, MZ for MS-DOS
> CMD8086.COM:   MS-DOS executable, MZ for MS-DOS
> EXE64.exe:     PE32+ executable (GUI)
> 	       x86-64, for MS Windows
> 	       , 10 sections
> MACCNV55.EXE:  MS-DOS executable, MZ for MS-DOS
> PCISCAN.EXE:   LX executable for OS/2 (program) (console) i80386
> WORD60.ICL:    Windows Icons Library 16-bit
> stinger64.exe: PE32+ executable (GUI)
> 	       x86-64 (stripped to external PDB), for MS Windows
> 	       , 3 sections
> Now with the --extension option the correct file name extensions are
> shown for the inspected samples:
> BGISRV.DRV:    exe/com/vlm/drv
> CMD8086.COM:   exe/com/vlm/drv
> EXE64.exe:     exe/scr
> MACCNV55.EXE:  exe/com/vlm/drv
> PCISCAN.EXE:   exe
> WORD60.ICL:    icl
> stinger64.exe: exe/scr
> I hope my diff file can be applied in a future version of the
> file utility. There are many more errors for MZ executables.
> I will try to handle these in a future session.
> With best wishes
> Jörg Jenderek
> <trid-v-e_lfarlc.txt.gz><file-5_43-msdos-e_lfarlc_diff.DEFANGED-1783><file-5_43-msdos-e_lfarlc_diff_sig.DEFANGED-1784>--
