[File] [PATCH] of Magdir/msdos for MZ executables; negative relocation address *.ICL
Jörg Jenderek
joerg.jen.der.ek at gmx.net
Mon Nov 14 01:14:08 UTC 2022
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
some months ago I inspected the files on my EFI partition. For files
starting with the 2-byte MZ magic I get unexpected identifications.
There is a whole chain of errors, so after months of work I will split
the fixes into parts. I will start with what seems to be the beginning.
When running the file command version 5.43 on such examples and other
related files I get an output like:
BGISRV.DRV: MS-DOS executable
EXE64.exe: MS-DOS executable PE32+ executable (GUI) x86-64, for MS Windows
MACCNV55.EXE: MS-DOS executable
PCISCAN.EXE: MS-DOS executable, MZ for MS-DOS
WORD60.ICL: MS-DOS executable
stinger64.exe: MS-DOS executable PE32+ executable (GUI) x86-64 (stripped to external PDB), for MS Windows, MZ for MS-DOS
With the --extension option the wrong file name extensions are
displayed. This looks like:
BGISRV.DRV: exe/com/vlm
EXE64.exe: exe/com/vlm
MACCNV55.EXE: exe/com/vlm
PCISCAN.EXE: exe/com/vlm
WORD60.ICL: exe/com/vlm
stinger64.exe: exe/com/vlm
Furthermore, with the -i option only the generic DOS executable MIME
type application/x-dosexec is shown for all samples.
For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). It lists the file name
extensions in use and, with the -v option, often the related URL
pointing to file format information (see the appended
trid-v-e_lfarlc.txt.gz).
Furthermore I looked in the TrID database (triddefs_xml.7z) for
similar MZ executables. These are expressed by XML constructs like:
<ASCII> M Z</ASCII>
<Pos>0</Pos>
Then I looked for such MZ executables and inserted the corresponding
lines into Magdir/msdos. If I did not find an MZ sample mentioned by
TrID, I added it as a TODO comment with lines like:
# TODO
# FLT: Syntrillium CoolEdit Filter
# https://en.wikipedia.org/wiki/Adobe_Audition
# FMX64:FileMaker Pro 64-bit plug-in
# https://en.wikipedia.org/wiki/FileMaker
# FMX: FileMaker Pro 32-bit plug-in
# ...
# ZAP: ZoneLabs Zone Alarm data
# http://www.zonelabs.com
The first error is that some EFI files like ext4_x64_signed.efi and
Shell_Full.efi are also identified as MS-DOS executable. The same
error occurs for all 16-bit Windows Icons Libraries like WORD60.ICL,
for all Microsoft compiled help format 2.0 files like WINWORD.DEV.HXS,
and for Michal Mutl's EXE Explorer EXE64.exe.
Inside Magdir/msdos the first test looks for e_magic at the beginning
of the file with a line like
0 string/b MZ
Afterwards, for debugging, I inserted some lines like:
#>0x18 uleshort x \b, e_lfarlc=0x%x
#>(0x3c.l) string x \b, at 0x3c %.2s
e_lfarlc is the file address of the relocation table. That value is
later used for sub-classification. For some examples I get unexpected
values here. The results are summarised in the following table:
# http://www.mitec.cz/Downloads/EXE.zip/EXE64.exe 0x8ead
# OS/2 ECS\INSTALL\DETECTEI\PCISCAN.EXE 0x1c
# some EFI apps Shell_Full.efi ext4_x64_signed.efi 0
# Icon library WORD60.ICL 0
# Microsoft compiled help format 2.0 WINWORD.DEV.HXS 0
Offset 3Ch holds e_lfanew, the file address of the new exe header. The
magic found at that address is used in later tests. I myself found
samples with magic values like:
PE NE LE LX W3 W4
And according to documentation the following strings can also occur:
ZM DL MP P2 P3
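As an illustration only, the two header fields discussed above can be read outside of file(1) with a few lines of Python. This is a minimal sketch, not part of the patch: the function name read_mz_fields and the synthetic header are hypothetical, and only the offsets 0x18 (e_lfarlc) and 0x3c (e_lfanew) are taken from the text.

```python
import struct


def read_mz_fields(data: bytes):
    """Return (e_lfarlc, e_lfanew, second_magic) for an MZ image, else None."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfarlc: unsigned 16-bit little-endian at offset 0x18
    (e_lfarlc,) = struct.unpack_from("<H", data, 0x18)
    # e_lfanew: 32-bit little-endian file address of the new exe header
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # first two bytes at that address, e.g. PE NE LE LX W3 W4
    # (empty if the address points past the end of the data)
    second_magic = data[e_lfanew:e_lfanew + 2]
    return e_lfarlc, e_lfanew, second_magic


# Hypothetical synthetic header, not one of the samples from this mail.
header = bytearray(0x80)
header[0:2] = b"MZ"
struct.pack_into("<H", header, 0x18, 0x40)
struct.pack_into("<I", header, 0x3C, 0x40)
header[0x40:0x44] = b"PE\0\0"

print(read_mz_fields(bytes(header)))
```

Running it on real samples would mean passing the first bytes of the file instead of the synthetic header.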
The second test looks for a "low" relocation table address with lines like
>0x18 leshort <0x40 MS-DOS executable
!:mime application/x-dosexec
!:ext exe/com
The accompanying comment says:
# All non-DOS EXE extensions have the relocation table more than 0x40 bytes into the file.
This now becomes:
# Most non-DOS MZ-executable extensions have the relocation table more than 0x40 bytes into the file.
For Michal Mutl's (http://www.mitec.cz/) EXE Explorer EXE64.exe I see
a value of 0x8ead. The current magic line interprets this as a
negative value, so this sample is handled by the above test branch
and is considered an "MS-DOS executable" with the wrong MIME type
application/x-dosexec.
But according to documentation this value is an unsigned integer.
So I changed all test lines concerning e_lfarlc to unsigned. The
test line now becomes:
>0x18 uleshort <0x40
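The effect of the signed versus unsigned comparison can be reproduced with a short Python sketch. It only mirrors the arithmetic; the two bytes are the little-endian encoding of the value 0x8ead reported for EXE64.exe, and the variable names are made up for this example.

```python
import struct

# Little-endian bytes of e_lfarlc = 0x8ead as reported for EXE64.exe
raw = bytes([0xAD, 0x8E])

# leshort: signed 16-bit interpretation
(signed_value,) = struct.unpack("<h", raw)
# uleshort: unsigned 16-bit interpretation
(unsigned_value,) = struct.unpack("<H", raw)

print(signed_value, signed_value < 0x40)      # negative, so "<0x40" matches
print(unsigned_value, unsigned_value < 0x40)  # 0x8ead, so "<0x40" fails
```

With the signed type the high bit makes the value negative, so the "<0x40" test succeeds by accident; with the unsigned type the same bytes compare as a large address.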
Now additional tests are needed in that branch. So I look for a
possible new header at the address stored at offset 3Ch. If it is
neither a Portable Executable (PE) nor a New Executable (NE) nor a
Linear Executable (LX), then it is really a DOS executable (like
MACCNV55.EXE). That is now expressed by the lines
>>(0x3c.l) default x MS-DOS executable
!:mime application/x-dosexec
!:ext exe/com
If it is a Portable Executable (PE), then nothing is done here, because
PE files are inspected later in another branch. This is now done by a
line like
>>(0x3c.l) string PE
So the samples ext4_x64_signed.efi, Shell_Full.efi and WINWORD.DEV.HXS
are no longer misidentified as DOS executables.
Some OS/2 executables like PCISCAN.EXE are now handled by a branch that
looks like:
>>(0x3c.l) string LX
>>>(0x3c.l) use lx-executable
Then I also check for New Executables (NE) with a low e_lfarlc. In that
branch I only found 16-bit Windows Icons Libraries. So these are
matched by lines like:
>>(0x3c.l) string NE Windows Icons Library 16-bit
!:mime image/x-ms-icl
!:ext icl
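The decision made inside the low e_lfarlc branch can be summarised in a small Python sketch. It is only a restatement of the magic lines above under the assumption that the two-byte magic at e_lfanew has already been read; the function name and the returned strings are illustrative, not output of file(1).

```python
def classify_low_e_lfarlc(second_magic: bytes) -> str:
    """Mirror the sub-tests of the '>0x18 uleshort <0x40' branch."""
    if second_magic.startswith(b"NE"):
        # only icon libraries such as WORD60.ICL were found here
        return "Windows Icons Library 16-bit"
    if second_magic.startswith(b"LX"):
        # delegated to the lx-executable sub-routine, e.g. PCISCAN.EXE
        return "LX executable"
    if second_magic.startswith(b"PE"):
        # nothing printed here; PE is handled later in its own branch
        return ""
    # default: neither NE, LX nor PE, e.g. MACCNV55.EXE
    return "MS-DOS executable"


for magic in (b"NE", b"LX", b"PE", b"\0\0"):
    print(magic, "->", repr(classify_low_e_lfarlc(magic)))
```

The empty string for PE corresponds to the magic line without a message text: the test matches, so the default line is not reached, but nothing is displayed.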
For many samples like xcopy32.exe, stinger64.exe and WimUtil.exe I get,
after the identification as PE32 executable, the additional message
text "MZ for MS-DOS". This message was triggered by lines like:
>>>>&(2.s-514) string !LE
>>>>>&-2 string !BW \b, MZ for MS-DOS
!:mime application/x-dosexec
Unfortunately I was not able to understand the preceding magic test
lines, which look like spaghetti. So I skip such Portable Executables
here by additionally looking for the PE magic. If I find neither this
magic nor LX, then it should be a real DOS executable. This now becomes:
>>>>&(2.s-514) string !LE
>>>>>&-2 string !BW
>>>>>>(0x3c.l) string !LX
>>>>>>>(0x3c.l) string !PE \b, MZ for MS-DOS
!:mime application/x-dosexec
Unfortunately I myself found no such DOS executable, but the
irritating DOS message text has now vanished.
The displaying part for Portable Executables (PE) starts with lines like:
>(0x3c.l) string PE\0\0 PE
!:mime application/x-dosexec
But according to documentation PE has its own MIME type. So this
now becomes
>(0x3c.l) string PE\0\0 PE
!:mime application/vnd.microsoft.portable-executable
For debugging purposes the Characteristics value (from the COFF file
header) and the Windows Subsystem can be shown by lines like
#>>(0x3c.l+22) leshort x \b, CHARACTERISTICS 0x%x
#>>(0x3c.l+92) leshort x \b, SUBSYSTEM %u
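For readers who want to check these two debugging values by hand, here is a hedged Python sketch. The offsets e_lfanew+22 and e_lfanew+92 come from the lines above, and the three flag names are the ones listed in the comments of the attached diff; the function name and the synthetic image are assumptions made only for this example.

```python
import struct

# Flag names as listed in the comments of the attached diff
FLAGS = {
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x1000: "IMAGE_FILE_SYSTEM",
    0x2000: "IMAGE_FILE_DLL",
}


def pe_debug_fields(data: bytes):
    """Return (characteristics, flag_names, subsystem) or None if not PE."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None
    (characteristics,) = struct.unpack_from("<H", data, e_lfanew + 22)
    (subsystem,) = struct.unpack_from("<H", data, e_lfanew + 92)
    names = [name for bit, name in FLAGS.items() if characteristics & bit]
    return characteristics, names, subsystem


# Hypothetical synthetic image: a DLL with stripped debug info, subsystem 2
image = bytearray(0x40 + 96)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x40)
image[0x40:0x44] = b"PE\0\0"
struct.pack_into("<H", image, 0x40 + 22, 0x2200)
struct.pack_into("<H", image, 0x40 + 92, 2)

print(pe_debug_fields(bytes(image)))
```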
At the end of the PE displaying part I also show the number of
sections if there is more than one. This looks like:
>>0x30 string Inno \b, InnoSetup self-extracting archive
>>(0x3c.l+6) leshort >1 \b, %u sections
Normal Windows DLLs have a few sections, for example for code, data
and resources. Sometimes the PE format is only used as a container, as
for Microsoft compiled help format 2.0 (*.hxs) or Windows Icons
Library (*.icl). Such PE containers have fewer sections.
So I can use this additional information to distinguish PE samples in
more detail.
After applying the above mentioned modifications with the patch
file-5.43-msdos-e_lfarlc.diff I get a more correct output like:
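A rough Python equivalent of the section count test may make the offset clearer. It is a sketch under the assumption that the 16-bit value at e_lfanew+6 is the number of sections, as used by the magic line above; the helper names and the synthetic images are hypothetical.

```python
import struct


def pe_section_count(data: bytes):
    """Return the 16-bit value at e_lfanew+6, or None if there is no PE magic."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None
    (count,) = struct.unpack_from("<H", data, e_lfanew + 6)
    return count


def sections_suffix(count: int) -> str:
    """Mimic 'leshort >1 \\b, %u sections': print only for more than one."""
    return f", {count} sections" if count > 1 else ""


def make_image(count: int) -> bytes:
    """Build a hypothetical minimal image with the given section count."""
    image = bytearray(0x60)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x40)
    image[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", image, 0x40 + 6, count)
    return bytes(image)


for n in (1, 3, 10):
    print(n, repr(sections_suffix(pe_section_count(make_image(n)))))
```

A container-like file with a single section prints nothing extra, while the counts 3 and 10 would give the same ", 3 sections" and ", 10 sections" suffixes that the patched magic shows for stinger64.exe and EXE64.exe.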
BGISRV.DRV: MS-DOS executable, MZ for MS-DOS
CMD8086.COM: MS-DOS executable, MZ for MS-DOS
EXE64.exe: PE32+ executable (GUI) x86-64, for MS Windows, 10 sections
MACCNV55.EXE: MS-DOS executable, MZ for MS-DOS
PCISCAN.EXE: LX executable for OS/2 (program) (console) i80386
WORD60.ICL: Windows Icons Library 16-bit
stinger64.exe: PE32+ executable (GUI) x86-64 (stripped to external PDB), for MS Windows, 3 sections
Now with the --extension option the correct file name extensions are
shown for the inspected samples:
BGISRV.DRV: exe/com/vlm/drv
CMD8086.COM: exe/com/vlm/drv
EXE64.exe: exe/scr
MACCNV55.EXE: exe/com/vlm/drv
PCISCAN.EXE: exe
WORD60.ICL: icl
stinger64.exe: exe/scr
I hope my diff file can be applied in a future version of the
file utility. There exist many more errors for MZ executables.
I will try to handle these in a future session.
With best wishes
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY3GWYAAKCRCv8rHJQhrU
1o/dAKC96lCzWDYROmwNer5ByHxqnvGUfQCfcybnRvXnZBjG6UMq/HfqlUaYRik=
=SxEi
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-e_lfarlc.txt.gz
Type: application/x-gzip
Size: 1059 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221114/ee86a8eb/attachment.bin>
-------------- next part --------------
--- file-5.43/magic/Magdir/msdos.old 2022-09-13 20:05:40.000000000 +0200
+++ file-5.43/magic/Magdir/msdos 2022-11-14 02:09:36.928229900 +0100
@@ -50,13 +50,82 @@
# Many of the compressed formats were extracted from IDARC 1.23 source code.
#
+# e_magic
0 string/b MZ
-# All non-DOS EXE extensions have the relocation table more than 0x40 bytes into the file.
->0x18 leshort <0x40 MS-DOS executable
+# TODO
+# FLT: Syntrillium CoolEdit Filter https://en.wikipedia.org/wiki/Adobe_Audition
+# FMX64:FileMaker Pro 64-bit plug-in https://en.wikipedia.org/wiki/FileMaker
+# FMX: FileMaker Pro 32-bit plug-in https://en.wikipedia.org/wiki/FileMaker
+# FOD: WIFE Font Driver
+# GAU: MS Flight Simulator Gauge
+# IFS: OS/2 Installable File System https://en.wikipedia.org/wiki/OS/2
+# MEXW32:MATLAB Windows 32bit compiled function https://en.wikipedia.org/wiki/MATLAB
+# MEXW64:MATLAB Windows 64bit compiled function https://en.wikipedia.org/wiki/MATLAB
+# MLL: Maya plug-in (generic) http://en.wikipedia.org/wiki/Autodesk_Maya
+# PFL: PhotoFilter plugin http://photofiltre.free.fr
+# 8*: PhotoShop plug-in (generic) http://www.adobe.com/products/photoshop/main.html
+# PLG: Aston Shell plugin http://www.astonshell.com/
+# QLB: Microsoft Basic Quick library https://en.wikipedia.org/wiki/QuickBASIC
+# SKL: WinLIFT skin http://www.zapsolution.com/winlift/index.htm
+# TBK: Asymetrix ToolBook application http://www.toolbook.com
+# TBP: The Bat! plugin http://www.ritlabs.com
+# UPC: Ultimate Paint Graphics Editor plugin http://ultimatepaint.j-t-l.com
+# XFM: Syntrillium Cool Edit Transform Effect bad http://www.cooledit.com
+# XPL: X-Plane plugin http://www.xsquawkbox.net/xpsdk/
+# ZAP: ZoneLabs Zone Alarm data http://www.zonelabs.com
+#
+# NEXT LINES FOR DEBUGGING!
+# e_cblp; bytes on last page of file
+# e_cp; pages in file
+#>4 uleshort x \b, e_cp 0x%x
+# e_lfanew; file address of new exe header
+#>0x3c ulelong x \b, e_lfanew 0x%x
+# e_lfarlc; address of relocation table
+#>0x18 uleshort x \b, e_lfarlc=0x%x
+# e_ovno; overlay number. If zero, this is the main executable foo
+#>0x1a uleshort !0 \b, e_ovno 0x%x
+#>0x1C ubequad !0 \b, e_res 0x%16.16llx
+# e_oemid; often 0
+#>0x24 uleshort !0 \b, e_oemid 0x%x
+# e_oeminfo; typically zeroes, but 13Dh (WORDSTAR.CNV WPFT5.CNV) 143h (WRITWIN.CNV)
+# 1A3h (DBASE.CNV LOTUS123.CNV RFTDCA.CNV WORDDOS.CNV WORDMAC.CNV WORDWIN1.CNVXLBIFF.CNV)
+#>0x26 uleshort !0 \b, e_oeminfo 0x%x
+# e_res2; typically zeroes, but 000006006F082D2Ah SCSICFG.EXE 00009A0300007C03h de.exe
+# 0000CA0000000002h country.exe dosxmgr.exe 421E0A00421EA823h QMC.EXE
+#>0x28 ubequad !0 \b, e_res2 0x%16.16llx
+# https://web.archive.org/web/20171116024937/http://www.ctyme.com/intr/rb-2939.htm#table1593
+# new exe header magic like: PE NE LE LX W3 W4
+# no examples found for ZM DL MP P2 P3
+#>(0x3c.l) string x \b, at 0x3c %.2s
+#
+# Most non-DOS MZ-executable extensions have the relocation table more than 0x40 bytes into the file.
+# http://www.mitec.cz/Downloads/EXE.zip/EXE64.exe e_lfarlc=0x8ead
+# OS/2 ECS\INSTALL\DETECTEI\PCISCAN.EXE e_lfarlc=0x1c
+# some EFI apps Shell_Full.efi ext4_x64_signed.efi e_lfarlc=0
+# Icon library WORD60.ICL e_lfarlc=0
+# Microsoft compiled help format 2.0 WINWORD.DEV.HXS e_lfarlc=0
+>0x18 uleshort <0x40
+# check magic of new second header
+# NE executable with low e_lfarlc like: WORD60.ICL
+# ICL: Icons Library 16-bit http://fileformats.archiveteam.org/wiki/Icon_library
+>>(0x3c.l) string NE Windows Icons Library 16-bit
+!:mime image/x-ms-icl
+!:ext icl
+# handle LX executable with low e_lfarlc like: PCISCAN.EXE
+>>(0x3c.l) string LX
+>>>(0x3c.l) use lx-executable
+# skip Portable Executable (PE) with low e_lfarlc here, because handled later
+# like: ext4_x64_signed.efi Shell_Full.efi WINWORD.DEV.HXS
+>>(0x3c.l) string PE
+# not New Executable (NE) and not PE with low e_lfarlc like:
+# MACCNV55.EXE WORK_RTF.EXE TELE200.EXE NDD.EXE iflash.exe
+>>(0x3c.l) default x MS-DOS executable, MZ for MS-DOS
!:mime application/x-dosexec
# Windows and later versions of DOS will allow .EXEs to be named with a .COM
# extension, mostly for compatibility's sake.
+# like: EDIT.COM 4DOS.COM CMD8086.COM CMD-FR.COM SYSLINUX.COM
# URL: https://en.wikipedia.org/wiki/Personal_NetWare#VLM
# Reference: https://mark0.net/download/triddefs_xml.7z/defs/e/exe-vlm-msg.trid.xml
-!:ext exe/com/vlm
+# also like: BGISRV.DRV
+!:ext exe/com/vlm/drv
# These traditional tests usually work but not always. When test quality support is
# implemented these can be turned on.
@@ -65,6 +134,15 @@
# Maybe it's a PE?
+# URL: http://fileformats.archiveteam.org/wiki/Portable_Executable
+# Reference: https://docs.microsoft.com/de-de/windows/win32/debug/pe-format
>(0x3c.l) string PE\0\0 PE
-!:mime application/x-dosexec
+!:mime application/vnd.microsoft.portable-executable
+# https://docs.microsoft.com/de-de/windows/win32/debug/pe-format#characteristics
+# DLL Characteristics
+#>>(0x3c.l+22) leshort x \b, CHARACTERISTICS 0x%x
+# 0x0200~IMAGE_FILE_DEBUG_STRIPPED Debugging information is removed from the image file
+# 0x1000~IMAGE_FILE_SYSTEM The image file is a system file, not a user program.
+# 0x2000~IMAGE_FILE_DLL The image file is a dynamic-link library (DLL)
+#>>(0x3c.l+92) leshort x \b, SUBSYSTEM %u
>>(0x3c.l+24) leshort 0x010b \b32 executable
>>(0x3c.l+24) leshort 0x020b \b32+ executable
@@ -177,8 +255,11 @@
>>&(0x3c.l+0xf8) search/0x100 SharedD \b, Microsoft Installer self-extracting archive
>>0x30 string Inno \b, InnoSetup self-extracting archive
+# NumberOfSections; Normal Dynamic Link libraries have a few sections for code, data and resource etc.
+# PE used as container have less sections
+>>(0x3c.l+6) leshort >1 \b, %u sections
# If the relocation table is 0x40 or more bytes into the file, it's definitely
# not a DOS EXE.
->0x18 leshort >0x3f
+>0x18 uleshort >0x3f
# Hmm, not a PE but the relocation table is too high for a traditional DOS exe,
@@ -269,9 +350,17 @@
# header data too small for extended executable
>2 long !0
->>0x18 leshort <0x40
+>>0x18 uleshort <0x40
>>>(4.s*512) leshort !0x014c
>>>>&(2.s-514) string !LE
->>>>>&-2 string !BW \b, MZ for MS-DOS
+>>>>>&-2 string !BW
+#>>>>>>(0x3c.l) string x \b, 2ND MAGIC %.2s
+# but some LX executable appear here also like: PCISCAN.EXE
+>>>>>>(0x3c.l) string !LX
+# because Portable Executable (PE) already done skip many here like:
+# xcopy32.exe stinger64.exe WimUtil.exe
+# NO such DOS examples found and
+# DOS examples seems to be already handled by e_lfarlc <0x40 like: CMD8086.COM CMD-FR.COM
+>>>>>>>(0x3c.l) string !PE \b, MZ for MS-DOS
!:mime application/x-dosexec
>>>>&(2.s-514) string LE \b, LE
@@ -387,4 +476,5 @@
#!:mime application/x-msdownload
!:mime application/x-lx-executable
+!:ext exe
# byte order: 00h~little-endian non-zero=1~big-endian
#>0x02 ubyte =0 (little-endian)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.43-msdos-e_lfarlc.diff.sig
Type: application/octet-stream
Size: 3277 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221114/ee86a8eb/attachment.obj>