[File] [PATCH] doc/magic.man ; in 8 MZ examples e_lfarlc pointers must be unsigned
Christos Zoulas
christos at zoulas.com
Mon Oct 9 13:42:41 UTC 2023
Committed, thanks!
christos
> On Oct 5, 2023, at 4:31 PM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
>
> Hello,
>
> some months ago I sent the patch file-5.43-msdos-e_lfarlc.diff for
> Magdir/msdos to correct the recognition of MZ DOS/Windows executables.
> I looked inside Magdir/msdos of file command version 5.45 and see that
> my patches were accepted.
>
> Unfortunately the sections concerned are also used as examples in the
> man page magic.man (v 1.103), so the old and wrong expressions are
> still listed there.
>
> I will recapitulate the lines inside Magdir/msdos that start like:
> 0 string/b MZ
> #>0x18 uleshort x \b, e_lfarlc=0x%x
> >0x18 uleshort <0x40
>
> After matching e_magic MZ, the relocation table pointer e_lfarlc is
> used for sub-classification. Most non-DOS MZ executable extensions
> (that is, the Windows-like ones) have the relocation table 0x40 bytes or
> more into the file, whereas for DOS-like ones it is the opposite. For
> the MiTeC Portable Executable Reader EXE64.exe found in the archive
> http://www.mitec.cz/Downloads/EXE.zip I get the "high" value
> e_lfarlc=0x8ead. In the old expressions the test of e_lfarlc against the
> limit 0x40 was done as signed, so the value 0x8ead was handled as a
> negative number and considered to be below the 0x40 limit. As a result,
> the old expressions sent EXE64.exe down the branch for pure DOS
> executables.
>
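The signedness problem described above can be reproduced outside of file(1). The following is a minimal Python sketch using only the standard struct module; the two bytes are the little-endian encoding of the e_lfarlc value 0x8ead quoted in the mail, and the variable names are illustrative only.

```python
import struct

# Little-endian bytes of the e_lfarlc value 0x8ead reported for EXE64.exe.
raw = bytes([0xAD, 0x8E])

# Signed 16-bit read, corresponding to a "leshort" test.
signed_value = struct.unpack("<h", raw)[0]
# Unsigned 16-bit read, corresponding to a "uleshort" test.
unsigned_value = struct.unpack("<H", raw)[0]

# The signed value is negative, so a "<0x40" test matches by mistake.
print(signed_value, signed_value < 0x40)
# The unsigned value is far above 0x40, so the test does not match.
print(hex(unsigned_value), unsigned_value < 0x40)
```

With the signed read the value falls below 0x40 and the DOS branch is taken; with the unsigned read it does not.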
> So all tests must be done as unsigned via the "uleshort" type. In other
> words, the test via "leshort" is wrong. Looking at doc/magic.man I found
> 8 places where the old "leshort" expression is used, like:
>
> 0 string MZ
> >0x18 leshort <0x40 MS-DOS executable
> >0x18 leshort >0x3f extended PC executable (e.g., MS Windows)
>
> # MS Windows executables are also valid MS-DOS executables
> 0 string MZ
> >0x18 leshort <0x40 MZ executable (MS-DOS)
> # skip the whole block below if it is not an extended executable
> >0x18 leshort >0x3f
> >>(0x3c.l) string PE\0\0 PE executable (MS-Windows)
> >>(0x3c.l) string LX\0\0 LX executable (OS/2)
>
> # MS Windows executables are also valid MS-DOS executables
> 0 string MZ
> # sometimes, the value at 0x18 is less that 0x40 but there's still an
> # extended executable, simply appended to the file
> >0x18 leshort <0x40
> >>(4.s*512) leshort 0x014c COFF executable (MS-DOS, DJGPP)
> >>(4.s*512) leshort !0x014c MZ executable (MS-DOS)
>
> 0 string MZ
> >0x18 leshort >0x3f
> >>(0x3c.l) string PE\0\0 PE executable (MS-Windows)
> # immediately following the PE signature is the CPU type
> >>>&0 leshort 0x14c for Intel 80386
> >>>&0 leshort 0x184 for DEC Alpha
>
> 0 string MZ
> >0x18 leshort <0x40
> >>(4.s*512) leshort !0x014c MZ executable (MS-DOS)
> # if it's not COFF, go back 512 bytes and add the offset taken
> # from byte 2/3, which is yet another way of finding the start
> # of the extended executable
> >>>&(2.s-514) string LE LE executable (MS Windows VxD driver)
>
> 0 string MZ
> >0x18 leshort >0x3f
> >>(0x3c.l) string LE\0\0 LE executable (MS-Windows)
> # at offset 0x80 (-4, since relative offsets start at the end
> # of the up-level match) inside the LE header, we find the absolute
> # offset to the code area, where we look for a specific signature
> >>>(&0x7c.l+0x26) string UPX \b, UPX compressed
>
> 0 string MZ
> >0x18 leshort >0x3f
> >>(0x3c.l) string LE\0\0 LE executable (MS-Windows)
> # at offset 0x58 inside the LE header, we find the relative offset
> # to a data area where we look for a specific signature
> >>>&(&0x54.l-3) string UNACE \b, ACE self-extracting archive
>
> 0 string MZ
> >0x18 leshort >0x3f
> >>(0x3c.l) string PE\0\0 PE executable (MS-Windows)
> # search for the PE section called ".idata"...
> >>>&0xf4 search/0x140 .idata
> # ...and go to the end of it, calculated from start+length;
> # these are located 14 and 10 bytes after the section name
> >>>>(&0xe.l+(-4)) string PK\3\4 \b, ZIP self-extracting archive
>
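As a rough illustration of what the quoted examples do once the e_lfarlc test is unsigned, here is a small Python sketch. It is an assumption-laden analogue, not the logic of file(1): classify_mz and make_header are hypothetical helpers, the header is synthetic, and only the PE branch is followed (the DJGPP, LE and LX cases are left out). The machine value 0x8664 is the PE code for x86-64; 0x14c and 0x184 are taken from the example above.

```python
import struct

# PE machine codes: 0x14c and 0x184 appear in the man page example,
# 0x8664 is the x86-64 value.
MACHINES = {0x014C: "Intel 80386", 0x0184: "DEC Alpha", 0x8664: "x86-64"}


def classify_mz(data: bytes) -> str:
    """Hypothetical helper: rough analogue of the examples, not file(1) itself."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not an MZ executable"
    # >0x18 uleshort: e_lfarlc read as an unsigned little-endian 16-bit value.
    (e_lfarlc,) = struct.unpack_from("<H", data, 0x18)
    if e_lfarlc < 0x40:
        return "MZ executable (MS-DOS)"
    # >>(0x3c.l): indirect offset, a little-endian 32-bit value stored at 0x3c.
    (ext_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[ext_offset:ext_offset + 4] != b"PE\0\0":
        return "extended PC executable"
    # >>>&0 leshort: the CPU type immediately follows the PE signature.
    machine_field = data[ext_offset + 4:ext_offset + 6]
    if len(machine_field) < 2:
        return "PE executable (MS-Windows)"
    (machine,) = struct.unpack("<H", machine_field)
    name = MACHINES.get(machine, "unknown CPU 0x%x" % machine)
    return "PE executable (MS-Windows) for " + name


def make_header(e_lfarlc: int, machine: int) -> bytes:
    """Build a synthetic 0x100-byte header for demonstration only."""
    buf = bytearray(0x100)
    buf[0:2] = b"MZ"
    struct.pack_into("<H", buf, 0x18, e_lfarlc)
    struct.pack_into("<I", buf, 0x3C, 0x80)  # extended header at 0x80
    buf[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<H", buf, 0x84, machine)
    return bytes(buf)


print(classify_mz(make_header(0x8EAD, 0x8664)))
print(classify_mz(make_header(0x001C, 0x8664)))
```

Reading the field with "<h" instead of "<H" in classify_mz would send the first synthetic header down the MS-DOS branch, which is the misclassification the patch removes from the man page examples.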
> So I replaced leshort by uleshort. I also inserted spaces where needed
> to get columns with the same indentation. In the section with DEC Alpha
> I also inserted a line for the x86-64 architecture, which is more
> commonly used nowadays.
>
> Because my brain is too small to remember the correct command to get
> the formatted manual page for checking, like
> "groff -Tlatin1 -m man doc/magic.man", I put it as a compilation
> instruction for the Emacs editor inside the man page source. I also
> want the computer to work for me and not vice versa, so I instruct the
> Emacs editor to automatically update the second line with the current
> man page date, which looked like:
> .Dd Arpil 18, 2023
> Apparently all people handling this man page are blind, because the
> month name Arpil is wrong! The correct name is April.
>
> So I put suitable nroff comment lines at the end of the man page, like:
> .\"
> .\" For emacs editor
> .\" Local Variables:
> .\" eval: (add-hook 'before-save-hook 'time-stamp)
> .\" time-stamp-start: ".Dd "
> .\" time-stamp-end: "$"
> .\" time-stamp-format: "%:B %02d, %:Y"
> .\" time-stamp-time-zone: "UTC0"
> .\" system-time-locale: "C"
> .\" eval:(setq compile-command (concat "groff -Tlatin1 -m man " (buffer-file-name)) )
> .\" End:
>
> This works for me, and after applying the above-mentioned modifications
> via the patch file-5.45-magic.man.diff the correct tests for the
> e_lfarlc pointer are also shown in the mentioned examples in the man
> page.
>
> I hope my diff file can be applied in a future version of the file
> utility.
>
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek