[File] [PATCH] of Magdir/msdos for MZ executables; negative relocation address *.ICL

Jörg Jenderek joerg.jen.der.ek at gmx.net
Mon Nov 14 01:14:08 UTC 2022


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Some months ago I inspected the files on my EFI partition. For files
starting with the 2-byte MZ magic I got unexpected identifications.
There is a chain of errors, so after a month of work I will split the
fixes into parts and start with what seems to be the beginning.

When running the file command version 5.43 on such examples and other
related files, I get output like:

BGISRV.DRV:    MS-DOS executable
EXE64.exe:     MS-DOS executable PE32+ executable (GUI)
	       x86-64, for MS Windows
MACCNV55.EXE:  MS-DOS executable
PCISCAN.EXE:   MS-DOS executable, MZ for MS-DOS
WORD60.ICL:    MS-DOS executable
stinger64.exe: MS-DOS executable PE32+ executable (GUI)
	       x86-64 (stripped to external PDB), for MS Windows,
	       MZ for MS-DOS

With the --extension option the wrong file name extensions are
displayed. This looks like:

BGISRV.DRV:    exe/com/vlm
EXE64.exe:     exe/com/vlm
MACCNV55.EXE:  exe/com/vlm
PCISCAN.EXE:   exe/com/vlm
WORD60.ICL:    exe/com/vlm
stinger64.exe: exe/com/vlm

Furthermore, with the -i option only the generic DOS executable mime
type application/x-dosexec is shown for all samples.

For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). It lists the file name
extensions in use and, with the -v option, often the related URL
pointing to file format information (see the appended
trid-v-e_lfarlc.txt.gz).

Furthermore, I looked in the TrID database (triddefs_xml.7z) for
similar MZ executables. These are expressed by XML constructs like:

	<ASCII> M Z</ASCII>
	<Pos>0</Pos>

Then I looked for such MZ executables and inserted the right lines in
Magdir/msdos. If I did not find an MZ sample mentioned by TrID, I
added it as a TODO comment with lines like:

#	TODO
# FLT:	Syntrillium CoolEdit Filter
#	https://en.wikipedia.org/wiki/Adobe_Audition
# FMX64:FileMaker Pro 64-bit plug-in
#	https://en.wikipedia.org/wiki/FileMaker
# FMX:	FileMaker Pro 32-bit plug-in
# ...
# ZAP:	ZoneLabs Zone Alarm data
#	http://www.zonelabs.com

The first error is that some EFI files like ext4_x64_signed.efi and
Shell_Full.efi are also identified as MS-DOS executables. The same
error occurs for all 16-bit Windows Icons Library files like
WORD60.ICL, for all Microsoft compiled help format 2.0 files like
WINWORD.DEV.HXS, and for Michal Mutl's EXE Explorer EXE64.exe.

Inside Magdir/msdos the first test looks for e_magic at the beginning
with a line like
 0	string/b	MZ

Afterwards, for debugging, I inserted some lines like:
 #>0x18		uleshort	x	\b, e_lfarlc=0x%x
 #>(0x3c.l)	string		x	\b, at 0x3c %.2s


e_lfarlc is the address of the relocation table. That value is later
used for sub-classification. For some examples I get unexpected values
here. The results are summarised in the following table:
 # http://www.mitec.cz/Downloads/EXE.zip/EXE64.exe	0x8ead
 # OS/2 ECS\INSTALL\DETECTEI\PCISCAN.EXE		0x1c
 # some EFI apps Shell_Full.efi ext4_x64_signed.efi	0
 # Icon library WORD60.ICL				0
 # Microsoft compiled help format 2.0 WINWORD.DEV.HXS	0
Offset 3Ch holds the file address of the new exe header (e_lfanew).
The magic found at that address is used in later tests. I myself
found samples with values like:
	PE NE LE LX W3 W4
According to the documentation the following strings can also occur:
	ZM DL MP P2 P3

As the second test a "low" relocation table value is checked by lines like
 >0x18	leshort <0x40 MS-DOS executable
 !:mime	application/x-dosexec
 !:ext	exe/com

The accompanying comment says:
 # All non-DOS EXE extensions have the relocation table more than 0x40 bytes into the file.

This now becomes:
 # Most non-DOS MZ-executable extensions have the relocation table more than 0x40 bytes into the file.

For Michal Mutl's (http://www.mitec.cz/) EXE Explorer EXE64.exe I see
a value of 0x8ead. The current magic line interprets this as a
negative value, so this sample is handled by the test branch above and
is reported as "MS-DOS executable" with the wrong mime type
application/x-dosexec.

But according to the documentation this value is an unsigned integer.
So I changed all test lines concerning e_lfarlc to unsigned. This
test line now becomes:
 >0x18	uleshort <0x40
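
The effect of the signed comparison can be reproduced with Python's
struct module. This is an illustration only, using the value 0x8ead
seen in EXE64.exe; it does not show how file evaluates the test
internally.

```python
import struct

raw = struct.pack("<H", 0x8EAD)          # the two bytes stored at offset 0x18

signed = struct.unpack("<h", raw)[0]     # like leshort
unsigned = struct.unpack("<H", raw)[0]   # like uleshort

print(signed, signed < 0x40)      # -29011 True  -> wrongly takes the DOS branch
print(unsigned, unsigned < 0x40)  # 36525 False  -> skips the DOS branch
```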

Now additional tests are needed in that branch. So I look for a
possible new header via offset 3Ch. If it is neither a Portable
Executable (PE) nor a New Executable (NE) nor an LX executable, then
it really is a DOS executable (like MACCNV55.EXE). That is now
expressed by lines
 >>(0x3c.l)	default	x	MS-DOS executable
 !:mime	application/x-dosexec
 !:ext	exe/com

If it is a Portable Executable (PE), nothing is done here, because PE
files are inspected later in another branch. This is now done by a
line like
 >>(0x3c.l)	string	PE
So the samples ext4_x64_signed.efi, Shell_Full.efi and WINWORD.DEV.HXS
are no longer misidentified as DOS executables.

Some OS/2 executables like PCISCAN.EXE are now handled by a branch
that looks like:
 >>(0x3c.l)	string	LX
 >>>(0x3c.l)	use		lx-executable

Then I also check for New Executables (NE) with low e_lfarlc. In that
branch I only found 16-bit Windows Icons Library files, so these are
matched by lines like:
 >>(0x3c.l)	string	NE	Windows Icons Library 16-bit
 !:mime	image/x-ms-icl
 !:ext	icl
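
The decision order inside the low e_lfarlc branch can be summarised as
the following Python sketch. It only mirrors the magic lines quoted
above. The function name and the returned labels are hypothetical, and
the sketch takes the two-byte second magic as input instead of reading
a file.

```python
def classify_low_e_lfarlc(second_magic: bytes) -> str:
    """Mirror the sub-tests under '>0x18 uleshort <0x40' (illustrative only)."""
    if second_magic.startswith(b"NE"):
        return "Windows Icons Library 16-bit"   # like WORD60.ICL
    if second_magic.startswith(b"LX"):
        return "lx-executable"                  # like PCISCAN.EXE
    if second_magic.startswith(b"PE"):
        return ""                               # nothing here, PE is handled later
    return "MS-DOS executable"                  # default, like MACCNV55.EXE

for magic in (b"NE", b"LX", b"PE", b"\x00\x00"):
    print(magic, "->", repr(classify_low_e_lfarlc(magic)))
```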

For many samples like xcopy32.exe, stinger64.exe and WimUtil.exe I
get, after the identification as PE32 executable, the additional
message text "MZ for MS-DOS". This message was triggered by lines like:

 >>>>&(2.s-514)	string	!LE
 >>>>>&-2	string	!BW \b, MZ for MS-DOS
 !:mime	application/x-dosexec

Unfortunately I was not able to understand the preceding magic test
lines, which look like spaghetti. So I skip such Portable Executables
here by additionally looking for the PE magic. If I find neither this
magic nor LX, then it should be a real DOS executable. This now
becomes:
 >>>>&(2.s-514)	string	!LE
 >>>>>&-2	string	!BW
 >>>>>>(0x3c.l)	string	!LX
 >>>>>>>(0x3c.l)	string	!PE	\b, MZ for MS-DOS
 !:mime	application/x-dosexec
Unfortunately I myself found no such DOS executable, but the
irritating DOS message text has now vanished.

The displaying part for Portable Executables (PE) starts with lines
like:
 >(0x3c.l)	string		PE\0\0	PE
 !:mime	application/x-dosexec

But according to the documentation PE has its own mime type. So this
now becomes
 >(0x3c.l)	string		PE\0\0	PE
 !:mime	application/vnd.microsoft.portable-executable

For debugging purposes the Characteristics value of the COFF file
header and the Windows Subsystem can be shown by lines like
 #>>(0x3c.l+22)	leshort		x	\b, CHARACTERISTICS 0x%x
 #>>(0x3c.l+92)	leshort		x	\b, SUBSYSTEM %u

At the end of the PE displaying part I also show the number of
sections if there is more than one. This looks like:
 >>0x30	string		Inno \b, InnoSetup self-extracting archive
 >>(0x3c.l+6)	leshort			>1	\b, %u sections

Normal Windows DLLs have a few sections, for example for code, data
and resources. Sometimes the PE format is only used as a container,
as for Microsoft compiled help format 2.0 (*.hxs) or Windows Icons
Library (*.icl). Such PE containers have fewer sections. So I can use
this additional information to distinguish PE samples in more detail.
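
To check the section count of a sample independently of file, the same
field can be read with a short Python sketch. The function name
pe_section_count is hypothetical. The offset e_lfanew+6 is the one used
in the magic line above: NumberOfSections follows the 4-byte PE
signature and the 2-byte Machine field.

```python
import struct

def pe_section_count(data: bytes):
    """Return NumberOfSections of a PE image, or None if it is not a PE."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None
    if e_lfanew + 8 > len(data):
        return None
    (sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    return sections

# Synthetic image: MZ header, PE signature at 0x80, three sections.
img = bytearray(0x100)
img[0:2] = b"MZ"
struct.pack_into("<I", img, 0x3C, 0x80)
img[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", img, 0x86, 3)
print(pe_section_count(bytes(img)))   # 3
```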

After applying the above-mentioned modifications with the patch
file-5.43-msdos-e_lfarlc.diff, I get more correct output like:

BGISRV.DRV:    MS-DOS executable, MZ for MS-DOS
CMD8086.COM:   MS-DOS executable, MZ for MS-DOS
EXE64.exe:     PE32+ executable (GUI)
	       x86-64, for MS Windows
	       , 10 sections
MACCNV55.EXE:  MS-DOS executable, MZ for MS-DOS
PCISCAN.EXE:   LX executable for OS/2 (program) (console) i80386
WORD60.ICL:    Windows Icons Library 16-bit
stinger64.exe: PE32+ executable (GUI)
	       x86-64 (stripped to external PDB), for MS Windows
	       , 3 sections

Now with the --extension option the correct file name extensions are
shown for the inspected samples:
BGISRV.DRV:    exe/com/vlm/drv
CMD8086.COM:   exe/com/vlm/drv
EXE64.exe:     exe/scr
MACCNV55.EXE:  exe/com/vlm/drv
PCISCAN.EXE:   exe
WORD60.ICL:    icl
stinger64.exe: exe/scr

I hope my diff file can be applied in a future version of the file
utility. There are many more errors for MZ executables. I will try to
handle these in a future session.

With best wishes
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY3GWYAAKCRCv8rHJQhrU
1o/dAKC96lCzWDYROmwNer5ByHxqnvGUfQCfcybnRvXnZBjG6UMq/HfqlUaYRik=
=SxEi
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-e_lfarlc.txt.gz
Type: application/x-gzip
Size: 1059 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221114/ee86a8eb/attachment.bin>
-------------- next part --------------
--- file-5.43/magic/Magdir/msdos.old	2022-09-13 20:05:40.000000000 +0200
+++ file-5.43/magic/Magdir/msdos	2022-11-14 02:09:36.928229900 +0100
@@ -50,13 +50,82 @@
 # Many of the compressed formats were extracted from IDARC 1.23 source code.
 #
+# e_magic
 0	string/b	MZ
-# All non-DOS EXE extensions have the relocation table more than 0x40 bytes into the file.
->0x18	leshort <0x40 MS-DOS executable
+#	TODO
+# FLT:	Syntrillium CoolEdit Filter		https://en.wikipedia.org/wiki/Adobe_Audition
+# FMX64:FileMaker Pro 64-bit plug-in		https://en.wikipedia.org/wiki/FileMaker
+# FMX:	FileMaker Pro 32-bit plug-in		https://en.wikipedia.org/wiki/FileMaker
+# FOD:	WIFE Font Driver
+# GAU:	MS Flight Simulator Gauge
+# IFS:	OS/2 Installable File System		https://en.wikipedia.org/wiki/OS/2
+# MEXW32:MATLAB Windows 32bit compiled function	https://en.wikipedia.org/wiki/MATLAB
+# MEXW64:MATLAB Windows 64bit compiled function	https://en.wikipedia.org/wiki/MATLAB
+# MLL:	Maya plug-in (generic)	       		http://en.wikipedia.org/wiki/Autodesk_Maya
+# PFL:	PhotoFilter plugin			http://photofiltre.free.fr
+# 8*:	PhotoShop plug-in (generic)		http://www.adobe.com/products/photoshop/main.html
+# PLG:	Aston Shell plugin			http://www.astonshell.com/
+# QLB:	Microsoft Basic Quick library		https://en.wikipedia.org/wiki/QuickBASIC
+# SKL:	WinLIFT skin				http://www.zapsolution.com/winlift/index.htm
+# TBK:	Asymetrix ToolBook application		http://www.toolbook.com
+# TBP:	The Bat! plugin	   			http://www.ritlabs.com
+# UPC:	Ultimate Paint Graphics Editor plugin	http://ultimatepaint.j-t-l.com
+# XFM:	Syntrillium Cool Edit Transform Effect	bad http://www.cooledit.com
+# XPL:	X-Plane plugin	      			http://www.xsquawkbox.net/xpsdk/
+# ZAP:	ZoneLabs Zone Alarm data		http://www.zonelabs.com
+#
+# NEXT LINES FOR DEBUGGING!
+# e_cblp; bytes on last page of file
+# e_cp; pages in file
+#>4		uleshort	x	\b, e_cp 0x%x
+# e_lfanew; file address of new exe header
+#>0x3c		ulelong		x	\b, e_lfanew 0x%x
+# e_lfarlc; address of relocation table
+#>0x18		uleshort	x	\b, e_lfarlc=0x%x
+# e_ovno; overlay number. If zero, this is the main executable foo
+#>0x1a		uleshort	!0	\b, e_ovno 0x%x
+#>0x1C		ubequad		!0	\b, e_res 0x%16.16llx
+# e_oemid; often 0
+#>0x24		uleshort	!0	\b, e_oemid 0x%x
+# e_oeminfo; typically zeroes, but 13Dh (WORDSTAR.CNV WPFT5.CNV) 143h (WRITWIN.CNV)
+# 1A3h (DBASE.CNV LOTUS123.CNV RFTDCA.CNV WORDDOS.CNV WORDMAC.CNV WORDWIN1.CNVXLBIFF.CNV)
+#>0x26		uleshort	!0	\b, e_oeminfo 0x%x
+#  e_res2; typically zeroes, but 000006006F082D2Ah SCSICFG.EXE 00009A0300007C03h de.exe
+# 0000CA0000000002h country.exe dosxmgr.exe 421E0A00421EA823h QMC.EXE
+#>0x28		ubequad		!0	\b, e_res2 0x%16.16llx
+# https://web.archive.org/web/20171116024937/http://www.ctyme.com/intr/rb-2939.htm#table1593
+# new exe header magic like: PE NE LE LX W3 W4
+# no examples found for ZM DL MP P2 P3
+#>(0x3c.l)	string		x	\b, at 0x3c %.2s
+#
+# Most non-DOS MZ-executable extensions have the relocation table more than 0x40 bytes into the file.
+# http://www.mitec.cz/Downloads/EXE.zip/EXE64.exe	e_lfarlc=0x8ead
+# OS/2 ECS\INSTALL\DETECTEI\PCISCAN.EXE			e_lfarlc=0x1c
+# some EFI apps Shell_Full.efi ext4_x64_signed.efi	e_lfarlc=0
+# Icon library WORD60.ICL				e_lfarlc=0
+# Microsoft compiled help format 2.0 WINWORD.DEV.HXS	e_lfarlc=0
+>0x18	uleshort <0x40
+# check magic of new second header
+# NE executable with low e_lfarlc like: WORD60.ICL
+# ICL:	Icons Library 16-bit			http://fileformats.archiveteam.org/wiki/Icon_library
+>>(0x3c.l)	string	NE	Windows Icons Library 16-bit
+!:mime		image/x-ms-icl
+!:ext		icl
+# handle LX executable with low e_lfarlc like: PCISCAN.EXE
+>>(0x3c.l)	string	LX
+>>>(0x3c.l)	use		lx-executable
+# skip Portable Executable (PE) with low e_lfarlc here, because handled later
+# like: ext4_x64_signed.efi Shell_Full.efi WINWORD.DEV.HXS
+>>(0x3c.l)	string	PE
+# not New Executable (NE) and not PE with low e_lfarlc like:
+# MACCNV55.EXE WORK_RTF.EXE TELE200.EXE NDD.EXE iflash.exe
+>>(0x3c.l)	default	x	MS-DOS executable, MZ for MS-DOS
 !:mime	application/x-dosexec
 # Windows and later versions of DOS will allow .EXEs to be named with a .COM
 # extension, mostly for compatibility's sake.
+# like: EDIT.COM 4DOS.COM CMD8086.COM CMD-FR.COM SYSLINUX.COM
 # URL:		https://en.wikipedia.org/wiki/Personal_NetWare#VLM
 # Reference:	https://mark0.net/download/triddefs_xml.7z/defs/e/exe-vlm-msg.trid.xml
-!:ext	exe/com/vlm
+# also like: BGISRV.DRV
+!:ext	exe/com/vlm/drv
 # These traditional tests usually work but not always.  When test quality support is
 # implemented these can be turned on.
@@ -65,6 +134,15 @@
 
 # Maybe it's a PE?
+# URL:		http://fileformats.archiveteam.org/wiki/Portable_Executable
+# Reference:	https://docs.microsoft.com/de-de/windows/win32/debug/pe-format
 >(0x3c.l)	string		PE\0\0	PE
-!:mime	application/x-dosexec
+!:mime	application/vnd.microsoft.portable-executable
+# https://docs.microsoft.com/de-de/windows/win32/debug/pe-format#characteristics
+# DLL Characteristics
+#>>(0x3c.l+22)	leshort		x	\b, CHARACTERISTICS 0x%x
+# 0x0200~IMAGE_FILE_DEBUG_STRIPPED Debugging information is removed from the image file
+# 0x1000~IMAGE_FILE_SYSTEM The image file is a system file, not a user program. 
+# 0x2000~IMAGE_FILE_DLL The image file is a dynamic-link library (DLL)
+#>>(0x3c.l+92)	leshort		x	\b, SUBSYSTEM %u
 >>(0x3c.l+24)	leshort		0x010b	\b32 executable
 >>(0x3c.l+24)	leshort		0x020b	\b32+ executable
@@ -177,8 +255,11 @@
 >>&(0x3c.l+0xf8)	search/0x100	SharedD \b, Microsoft Installer self-extracting archive
 >>0x30			string		Inno \b, InnoSetup self-extracting archive
+# NumberOfSections; Normal Dynamic Link libraries have a few sections for code, data and resource etc.
+# PE used as container have less sections
+>>(0x3c.l+6)	leshort			>1	\b, %u sections
 
 # If the relocation table is 0x40 or more bytes into the file, it's definitely
 # not a DOS EXE.
->0x18  leshort >0x3f
+>0x18	uleshort	>0x3f
 
 # Hmm, not a PE but the relocation table is too high for a traditional DOS exe,
@@ -269,9 +350,17 @@
 # header data too small for extended executable
 >2		long	!0
->>0x18		leshort <0x40
+>>0x18		uleshort <0x40
 >>>(4.s*512)	leshort !0x014c
 
 >>>>&(2.s-514)	string	!LE
->>>>>&-2	string	!BW \b, MZ for MS-DOS
+>>>>>&-2	string	!BW
+#>>>>>>(0x3c.l)	string		x	\b, 2ND MAGIC %.2s
+# but some LX executable appear here also like: PCISCAN.EXE
+>>>>>>(0x3c.l)	string	!LX
+# because Portable Executable (PE) already done skip many here like:
+# xcopy32.exe stinger64.exe WimUtil.exe
+# NO such DOS examples found and
+# DOS examples seems to be already handled by e_lfarlc <0x40 like: CMD8086.COM CMD-FR.COM
+>>>>>>>(0x3c.l)	string	!PE	\b, MZ for MS-DOS
 !:mime	application/x-dosexec
 >>>>&(2.s-514)	string	LE \b, LE
@@ -387,4 +476,5 @@
 #!:mime	application/x-msdownload
 !:mime	application/x-lx-executable
+!:ext	exe
 # byte order: 00h~little-endian non-zero=1~big-endian
 #>0x02	ubyte			=0		(little-endian)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.43-msdos-e_lfarlc.diff.sig
Type: application/octet-stream
Size: 3277 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221114/ee86a8eb/attachment.obj>

