[File] [PATCH] Magdir/sun, bsdi, hp, , ibm6000, coff, digital executables -duplicates+extension
Jörg Jenderek (GMX)
joerg.jen.der.ek at gmx.net
Mon Mar 25 15:19:34 UTC 2024
Hello,
a few days ago I looked at the contents of an exotic CD-ROM. It also
contains executables for several UNIX-like operating systems.
One program is called HYPERHLP. The corresponding URLs are:
https://en.wikipedia.org/wiki/SoftPC
https://archive.org/details/softwin2-unix
When running the file command, version 5.45, with the -k option on such
executables, I get output like:
CALCSIZE: PA-RISC1.1 shared executable - not stripped
HYPERHLP: COFF format alpha demand paged
executable dynamically linked stripped - version 3.11-2
HYPERHLP-hp: PA-RISC1.0 shared executable
HYPERHLP-risc: executable (RISC System/6000 V3.1) or obj module
HYPERHLP-sunos: SPARC demand paged
dynamically linked executable
a.out SunOS SPARC demand paged
dynamically linked executable
With the --extension option only the 3-byte sequence ??? is shown, and
with the -i option only the generic application/octet-stream is shown.
For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). It also recognizes the
PA-RISC executables, but shows no MIME type and no file name suffix.
The sample HYPERHLP-hp is described as "PA-RISC 1.0 object code
(generic)" by pa-risc-10.trid.xml. The sample CALCSIZE is described as
"PA-RISC 1.1 object code (generic)" by pa-risc-11.trid.xml (see
appended trid-v-exec.txt.gz).
For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). It either does not
recognize the samples or describes them wrongly. The sample
HYPERHLP-risc is described as "MPEG 1/2 Audio Layer 3" by PUID fmt/134.
For the sample HYPERHLP-sunos I get "duplicate" messages. The first is
produced by lines inside Magdir/bsdi. These look like:
0 belong&077777777 0600413 SPARC demand paged
>0 byte &0x80
>>20 belong <4096 shared library
>>20 belong =4096 dynamically linked executable
>>20 belong >4096 dynamically linked executable
>0 byte ^0x80 executable
>16 belong >0 not stripped
>36 belong 0xb4100001 (uses shared libs)
The second message is produced by lines inside Magdir/sun. These look like:
0 belong&077777777 0600413 a.out SunOS SPARC demand paged
>0 byte &0x80
>>20 belong <4096 shared library
>>20 belong =4096 dynamically linked executable
>>20 belong >4096 dynamically linked executable
>0 byte ^0x80 executable
>16 belong >0 not stripped
So the same file format is described twice. The difference is that the
second message puts the phrase "a.out SunOS" before the phrase "SPARC
demand paged". The Magdir/bsdi variant additionally sub-classifies
executables that use shared libs by its last line. So I commented out
the lines in Magdir/sun. On Unix-like systems executables typically have
no file name suffix. So the lines inside Magdir/bsdi now become:
0 belong&077777777 0600413 SPARC demand paged
>0 byte &0x80
>>20 belong <4096 shared library
>>20 belong =4096 dynamically linked executable
>>20 belong >4096 dynamically linked executable
!:ext /
>0 byte ^0x80 executable
>16 belong >0 not stripped
>36 belong 0xb4100001 (uses shared libs)
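The tests in this rule can be illustrated outside of file(1). The
following is only a minimal Python sketch under stated assumptions, not
how file(1) itself evaluates the rule: the function name
describe_sparc_aout is hypothetical, the "(uses shared libs)" test is
left out, and the bytes in the usage lines are constructed by hand
rather than taken from HYPERHLP-sunos.

```python
import struct
from typing import Optional

MASK = 0o77777777               # keeps the low 24 bits of the first word
SPARC_DEMAND_PAGED = 0o600413   # value the masked word is compared with


def describe_sparc_aout(data: bytes) -> Optional[str]:
    """Mimic the belong/byte tests of the rule on the start of a file."""
    if len(data) < 24:
        return None
    # 0 belong&077777777 0600413
    (word0,) = struct.unpack_from(">I", data, 0)
    if word0 & MASK != SPARC_DEMAND_PAGED:
        return None
    parts = ["SPARC demand paged"]
    # >0 byte &0x80 selects the dynamic variants, >0 byte ^0x80 the static one
    if data[0] & 0x80:
        (value20,) = struct.unpack_from(">I", data, 20)
        if value20 < 4096:
            parts.append("shared library")
        else:
            parts.append("dynamically linked executable")
    else:
        parts.append("executable")
    # >16 belong >0
    (value16,) = struct.unpack_from(">I", data, 16)
    if value16 > 0:
        parts.append("not stripped")
    return " ".join(parts)


# Hand-made header: high bit set in byte 0, value 8192 at offset 20.
sample = struct.pack(">IIIIII", 0x8003010B, 0, 0, 0, 0, 8192)
print(describe_sparc_aout(sample))
```

Because the mask drops the top byte, the flag bit 0x80 in byte 0 does not
disturb the magic comparison, which is why the same first line matches
both the static and the dynamic variants.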
With the help of other tools I found a page about the PA-RISC
architecture. This information is recorded by comment lines inside
Magdir/hp like:
# URL: http://www.openpa.net/arch.html
# Reference: http://mark0.net/download/triddefs_xml.7z
# defs/p/pa-risc-11.trid.xml
# defs/p/pa-risc-10.trid.xml
PA-RISC executables are described inside Magdir/hp. The version 1.0
executable is described by lines like:
0 belong 0x020b0108 PA-RISC1.0 shared executable
>168 belong&0x4 0x4 dynamically linked
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
On Unix-like systems executables typically have no file name suffix. So
the lines now become:
0 belong 0x020b0108 PA-RISC1.0 shared executable
!:ext /
>168 belong&0x4 0x4 dynamically linked
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
The version 1.1 executable is described by lines like:
0 belong 0x02100108 PA-RISC1.1 shared executable
>168 belong&0x4 0x4 dynamically linked
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
On Unix-like systems executables typically have no file name suffix. So
the lines now become:
0 belong 0x02100108 PA-RISC1.1 shared executable
!:ext /
>168 belong&0x4 0x4 dynamically linked
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
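The two PA-RISC rules differ only in the 4-byte magic at offset 0. As a
rough illustration, here is a minimal Python sketch that assumes only the
magic and the "not stripped" test matter. The two "dynamically linked"
tests are deliberately left out, the function name describe_pa_risc is
hypothetical, and the bytes in the usage lines are constructed by hand.

```python
import struct
from typing import Optional

# Big-endian magic values taken from the two rules in Magdir/hp.
PA_RISC_MAGIC = {
    0x020B0108: "PA-RISC1.0 shared executable",
    0x02100108: "PA-RISC1.1 shared executable",
}


def describe_pa_risc(data: bytes) -> Optional[str]:
    """Mimic the top-level belong test and the test at offset 96."""
    if len(data) < 4:
        return None
    (magic,) = struct.unpack_from(">I", data, 0)
    description = PA_RISC_MAGIC.get(magic)
    if description is None:
        return None
    # >96 belong >0 - not stripped
    if len(data) >= 100 and struct.unpack_from(">I", data, 96)[0] > 0:
        description += " - not stripped"
    return description


# Hand-made sample: 1.1 magic and a non-zero value at offset 96.
sample = struct.pack(">I", 0x02100108) + bytes(92) + struct.pack(">I", 5)
print(describe_pa_risc(sample))
```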
The sample HYPERHLP-risc is described by lines inside Magdir/ibm6000.
These look like:
0 beshort 0x01df executable (RISC System/6000 V3.1) or obj module
>12 belong >0 not stripped
The display part can instead be done by the subroutine from Magdir/coff.
The advantage is that additional tests are done before anything is
displayed, and nearly all COFF samples are then described in the same
way. That also means a sub-classification as executable or object is
done. Furthermore some more details like the time stamp are shown. So
the above lines now become:
0 beshort 0x01df
>0 use \^display-coff
The subroutine that displays name, flags and more of the Common Object
File Format (COFF, 32-bit) inside Magdir/coff, which I implemented some
time ago, starts like:
0 name display-coff
>18 uleshort&0x8E80 0
>>2 uleshort >0
>>>2 uleshort <4207
>>>>0 clear x
>>>>0 uleshort 0x014C Intel 80386
The first test checks that the unused flag bits (0x8000, 0x0800, 0x0400,
0x0200, 0x0080 in f_flags) are zero. This knowledge is mainly based on
documentation about the Intel x86 architecture. Apparently some flag
bits (0x0800, 0x0400, 0x0200) are used on RISC System/6000, so this test
must be relaxed. The next 2 tests check the number of sections
(f_nscns). This is typically in the range of dozens, so misidentified
samples of other file formats with 0 or thousands of sections are
skipped. The display part starts with Intel 80386 by looking for the
specific start magic at offset 0. So with the relaxed first test and the
activated check for the magic (f_magic=0x01DF) of RISC System/6000 the
subroutine now starts as:
0 name display-coff
>18 uleshort&0x8080 0
>>2 uleshort >0
>>>2 uleshort <4207
>>>>0 clear x
>>>>0 uleshort 0x014C Intel 80386
>>>>0 uleshort 0x01DF RISC System/6000 V3.1
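The effect of relaxing the flag mask can be tried out with a small
Python sketch. It is only an approximation of the first tests of the
subroutine: the function name coff_machine and the big_endian parameter
are hypothetical, only two machine types are listed, and the f_flags
value 0x0E02 in the usage lines is made up to exercise the three bits in
question, not read from HYPERHLP-risc.

```python
import struct
from typing import Optional

MACHINES = {0x014C: "Intel 80386", 0x01DF: "RISC System/6000 V3.1"}
OLD_UNUSED_BITS = 0x8E80   # mask before the change
NEW_UNUSED_BITS = 0x8080   # relaxed mask: 0x0800, 0x0400, 0x0200 are allowed


def coff_machine(header: bytes, big_endian: bool = False,
                 unused_bits: int = NEW_UNUSED_BITS) -> Optional[str]:
    """Apply the flag test and the two f_nscns tests, then look up f_magic."""
    if len(header) < 20:
        return None
    order = ">" if big_endian else "<"
    f_magic, f_nscns = struct.unpack_from(order + "HH", header, 0)
    (f_flags,) = struct.unpack_from(order + "H", header, 18)
    if f_flags & unused_bits:          # >18 uleshort&MASK 0
        return None
    if not 0 < f_nscns < 4207:         # >>2 uleshort >0 and >>>2 uleshort <4207
        return None
    return MACHINES.get(f_magic)


# Made-up big-endian header: 7 sections, f_opthdr 72, flags 0x0E02.
header = struct.pack(">HHIIIHH", 0x01DF, 7, 0, 0, 0, 72, 0x0E02)
print(coff_machine(header, big_endian=True))
print(coff_machine(header, big_endian=True, unused_bits=OLD_UNUSED_BITS))
```

With the old mask the same header is rejected before f_magic is ever
inspected, which is the situation the relaxed test is meant to avoid.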
The samples are later described as executable if the F_EXEC flag bit is
set. This is done by a line like:
>>>>18 leshort &0x0002 executable
Afterwards a user-defined MIME type is now shown. On Unix-like systems
executables typically have no file name suffix. Both are expressed by
additional lines like:
!:mime application/x-coff-executable
!:ext /
The time stamp f_timdat seems to be correct for the other-endian variant
as well. It is shown by a line like:
>>>>4 ledate >0 \b, created %s
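Why the byte order matters for f_timdat can be shown with a short Python
sketch. It assumes that f_timdat is a 32-bit count of seconds since the
Unix epoch stored at offset 4. The function name coff_created and the
big_endian parameter are hypothetical, and the header bytes are
constructed by hand.

```python
import struct
from datetime import datetime, timezone
from typing import Optional


def coff_created(header: bytes, big_endian: bool = False) -> Optional[datetime]:
    """Decode f_timdat (offset 4) as a UTC time stamp, or None if it is 0."""
    if len(header) < 8:
        return None
    order = ">" if big_endian else "<"
    (f_timdat,) = struct.unpack_from(order + "I", header, 4)
    if f_timdat == 0:          # mirrors the ">0" condition of the rule
        return None
    return datetime.fromtimestamp(f_timdat, tz=timezone.utc)


# Hand-made big-endian header whose time stamp is one day after the epoch.
header = struct.pack(">HHI", 0x01DF, 7, 86400) + bytes(12)
print(coff_created(header, big_endian=True))   # read in the stored byte order
print(coff_created(header))                    # read with the wrong byte order
```

Reading the same four bytes in the wrong order yields an unrelated date,
so a plausible creation time is a useful hint that the byte order was
chosen correctly.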
If a sample contains no optional header, then the first section header
comes directly after the file header. It starts with the section name
(s_name[8] like .text .data .debug$S .drectve .testseg). This is shown
by lines like:
>>>>16 uleshort =0
>>>>>20 string x \b, 1st section name "%.8s"
For samples with an optional header, that header comes after the file
header and is then followed by the first section header. The current
magic file only shows a non-zero optional header size (f_opthdr) by a
line like:
>>>>16 uleshort >0 \b, optional header size %u
Now I also show the first section name for samples (like IBM\HH\HYPERHLP)
with an optional header (f_opthdr=72) by a line like:
>>>>(16.s+20) string x \b, 1st section name "%.8s"
Luckily that offset expression is also correct for samples without an
optional header, because f_opthdr is 0 there and the offset evaluates to
20. Now I can show, if wanted, more variables of the first section by
lines like:
# physical address s_paddr like: 0
#>>>>(16.s+28) lelong !0 \b, s_paddr %#8.8x
# virtual address s_vaddr like: 0
#>>>>(16.s+32) lelong !0 \b, s_vaddr %#8.8x
# section size s_size
#>>>>(16.s+36) lelong x \b, s_size %#8.8x
# file ptr to raw data for section s_scnpt
#>>>>(16.s+40) lelong x \b, s_scnpt %#8.8x
# file ptr to relocation s_relptr like: 0
#>>>>(16.s+44) lelong !0 \b, s_relptr %#8.8x
# file ptr to gp histogram s_lnnoptr like: 0
#>>>>(16.s+48) lelong !0 \b, s_lnnoptr %#8.8x
# number of relocation entries s_nreloc like:
# 0 1 2 5 6 8 19h 26h 27h 38h 50h 5Fh 89h Dh 1Ch 69h A9h 1DCh 651h
#>>>>(16.s+52) uleshort x \b, s_nreloc %#4.4x
# number of gp histogram entries s_nlnno like: 0
#>>>>(16.s+54) uleshort !0 \b, s_nlnno %#4.4x
# flags s_flags
#>>>>(16.s+56) lelong x \b, s_flags %#8.8x
If a sample contains more than 1 section, then the second section header
follows. Again it starts with the section name (like .bss .data
.debug$S .rsrc$01). So I show this second section name by lines like:
>>>>2 uleshort >1
>>>>>(16.s+60) string x \b, 2nd section name "%.8s"
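The indirect offsets (16.s+20) and (16.s+60) amount to "file header size
plus f_opthdr" for the first section header and 40 bytes more for the
second one. The following Python sketch illustrates that arithmetic. The
function name section_names and the big_endian parameter are
hypothetical, and the sample bytes are constructed by hand so that they
merely resemble the values reported for HYPERHLP-risc (7 sections,
f_opthdr=72).

```python
import struct
from typing import List

FILHSZ = 20   # size of the COFF file header
SCNHSZ = 40   # size of one section header (offset +60 minus offset +20)


def section_names(data: bytes, big_endian: bool = False,
                  count: int = 2) -> List[str]:
    """Return up to `count` leading section names (s_name[8])."""
    order = ">" if big_endian else "<"
    (f_nscns,) = struct.unpack_from(order + "H", data, 2)
    (f_opthdr,) = struct.unpack_from(order + "H", data, 16)
    names = []
    for index in range(min(count, f_nscns)):
        start = FILHSZ + f_opthdr + index * SCNHSZ
        raw = data[start:start + 8]
        # s_name is NUL padded when shorter than 8 bytes
        names.append(raw.split(b"\0", 1)[0].decode("ascii", "replace"))
    return names


# Hand-made sample: file header, 72-byte optional header, two section headers.
file_header = struct.pack(">HHIIIHH", 0x01DF, 7, 0, 0, 0, 72, 0x0002)
sections = b".pad".ljust(SCNHSZ, b"\0") + b".text".ljust(SCNHSZ, b"\0")
print(section_names(file_header + bytes(72) + sections, big_endian=True))
```

With f_opthdr equal to 0 the first name is read from offset 20, which is
the position the older fixed-offset rule used.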
Most section names start with a point character, except in samples
created by "exotic" compilers, but unfortunately I do not remember and
can not find such samples any more. If the magic test lines for COFF
samples are still too weak, then tests for that point character can be
used to avoid collisions.
Samples like HYPERHLP are described by lines inside Magdir/digital.
These look like:
>24 leshort 0413 COFF format alpha demand paged
>>22 leshort&030000 !020000 executable
>>22 leshort&020000 !0 dynamically linked
>>16 lelong !0 not stripped
>>16 lelong 0 stripped
>>27 byte x - version %d
>>26 byte x \b.%d
>>28 byte x \b-%d
Unfortunately the referenced COFF documentation does not apply here.
The mentioned documentation is mainly based on the Intel x86
architecture. I assume that the inspected Alpha architecture is 64-bit
based with different header structures. So I can not use the subroutine
display-coff. On Unix-like systems executables typically have no file
name suffix. So I keep the lines and add one line like:
!:ext /
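The version part of the Magdir/digital output is assembled from three
single bytes in the order 27, 26, 28. A minimal Python sketch of just
that step follows. It assumes the little-endian magic 0603 at offset 0
(taken from the surrounding context in Magdir/digital) and the value
0413 at offset 24. The function name alpha_coff_version is hypothetical,
the bytes are treated as unsigned for simplicity, and the sample bytes
are constructed by hand.

```python
import struct
from typing import Optional


def alpha_coff_version(data: bytes) -> Optional[str]:
    """Return 'major.minor-patch' as printed by the three byte tests."""
    if len(data) < 29:
        return None
    (magic,) = struct.unpack_from("<H", data, 0)
    (paging,) = struct.unpack_from("<H", data, 24)
    if magic != 0o603 or paging != 0o413:
        return None
    # >>27 byte x - version %d, >>26 byte x \b.%d, >>28 byte x \b-%d
    return f"{data[27]}.{data[26]}-{data[28]}"


# Hand-made sample reproducing the reported "version 3.11-2".
sample = (struct.pack("<H", 0o603) + bytes(22)
          + struct.pack("<H", 0o413) + bytes([11, 3, 2]))
print(alpha_coff_version(sample))
```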
After applying the above-mentioned modifications by the patches
file-5.45-hp-pa-risc.diff file-5.45-sun-hyperhlp.diff
file-5.45-bsdi-hyperhlp.diff file-5.45-ibm6000-hyperhlp.diff
file-5.45-coff-hyperhlp.diff file-5.45-digital-hyperhlp.diff
all my inspected executables are still described. But now I get some
more details (which can be used to avoid collisions that may be
triggered by too short patterns). Furthermore the duplicate messages
have vanished. With the -k option the output now looks like:
CALCSIZE: PA-RISC1.1 shared executable - not stripped
HYPERHLP: COFF format alpha demand paged
executable dynamically linked stripped - version 3.11-2
HYPERHLP-hp: PA-RISC1.0 shared executable
HYPERHLP-risc: RISC System/6000 V3.1 COFF executable
, no relocation info, no line number info, stripped
, 7 sections, optional header size 72
, created Thu Jan 12 19:35:47 1995
, 1st section name ".pad", 2nd section name ".text"
HYPERHLP-sunos: SPARC demand paged
dynamically linked executable
When running with the --extension option the output now looks like:
CALCSIZE: /
HYPERHLP: /
HYPERHLP-hp: /
HYPERHLP-risc: /
HYPERHLP-sunos: /
I hope my diff files can be applied in a future version of the file
utility.
With best wishes,
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-exec.txt.gz
Type: application/x-gzip
Size: 367 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20240325/2e31348b/attachment-0001.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/sun.old 2021-02-23 01:49:24.000000000 +0100
+++ file-5.45/magic/Magdir/sun 2024-03-22 23:32:40.074908000 +0100
@@ -9,13 +9,14 @@
# are in aout, as they're indistinguishable from other big-endian
# 32-bit a.out files.
#
-0 belong&077777777 0600413 a.out SunOS SPARC demand paged
->0 byte &0x80
->>20 belong <4096 shared library
->>20 belong =4096 dynamically linked executable
->>20 belong >4096 dynamically linked executable
->0 byte ^0x80 executable
->16 belong >0 not stripped
+# Note: already handled as "SPARC demand paged" by ./bsdi
+#0 belong&077777777 0600413 a.out SunOS SPARC demand paged
+#>0 byte &0x80
+#>>20 belong <4096 shared library
+#>>20 belong =4096 dynamically linked executable
+#>>20 belong >4096 dynamically linked executable
+#>0 byte ^0x80 executable
+#>16 belong >0 not stripped
0 belong&077777777 0600410 a.out SunOS SPARC pure
>0 byte &0x80 dynamically linked executable
-------------- next part --------------
--- file-5.45/magic/Magdir/bsdi.old 2021-02-23 01:49:24.000000000 +0100
+++ file-5.45/magic/Magdir/bsdi 2024-03-23 19:04:50.229670500 +0100
@@ -10,12 +10,17 @@
>16 lelong >0 not stripped
>32 byte 0x6a (uses shared libs)
+# Update: Joerg Jenderek
# same as in SunOS 4.x, except for static shared libraries
+# Note: was also called "a.out SunOS SPARC demand paged" by ./sun v 1.28
0 belong&077777777 0600413 SPARC demand paged
>0 byte &0x80
>>20 belong <4096 shared library
>>20 belong =4096 dynamically linked executable
>>20 belong >4096 dynamically linked executable
+#!:mime application/x-foo-executable
+# typically no file name suffix for executables
+!:ext /
>0 byte ^0x80 executable
>16 belong >0 not stripped
>36 belong 0xb4100001 (uses shared libs)
-------------- next part --------------
--- file-5.45/magic/Magdir/hp.old 2021-02-23 01:49:24.000000000 +0100
+++ file-5.45/magic/Magdir/hp 2024-03-23 19:13:36.228874300 +0100
@@ -56,7 +56,14 @@
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
+# Update: Joerg Jenderek
+# URL: http://www.openpa.net/arch.html
+# Reference: http://mark0.net/download/triddefs_xml.7z/defs/p/pa-risc-11.trid.xml
+# Note: called "PA-RISC 1.1 object code (generic)" by TrID
0 belong 0x02100108 PA-RISC1.1 shared executable
+#!:mime application/x-foo-executable
+# typically no file name suffix for executables
+!:ext /
>168 belong&0x4 0x4 dynamically linked
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
@@ -104,7 +111,14 @@
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
+# Update: Joerg Jenderek
+# URL: http://www.openpa.net/arch.html
+# Reference: http://mark0.net/download/triddefs_xml.7z/defs/p/pa-risc-10.trid.xml
+# Note: called "PA-RISC 1.0 object code (generic)" by TrID
0 belong 0x020b0108 PA-RISC1.0 shared executable
+#!:mime application/x-foo-executable
+# typically no file name suffix for executables
+!:ext /
>168 belong&0x4 0x4 dynamically linked
>(144) belong 0x054ef630 dynamically linked
>96 belong >0 - not stripped
-------------- next part --------------
--- file-5.45/magic/Magdir/digital.old 2021-07-05 11:33:09.000000000 +0200
+++ file-5.45/magic/Magdir/digital 2024-03-25 00:22:37.094790600 +0100
@@ -10,7 +10,12 @@
0 leshort 0603
>24 leshort 0410 COFF format alpha pure
>24 leshort 0413 COFF format alpha demand paged
+# TODO: use other subroutine (./coff) to display name+flags+variables for common object formatted files
+#>0 use display-coff-foo
>>22 leshort&030000 !020000 executable
+#!:mime application/x-foo-executable
+# typically no file name suffix for executables like \DEC\HH\HYPERHLP
+!:ext /
>>22 leshort&020000 !0 dynamically linked
>>16 lelong !0 not stripped
>>16 lelong 0 stripped
-------------- next part --------------
--- file-5.45/magic/Magdir/ibm6000.old 2021-07-05 11:33:09.000000000 +0200
+++ file-5.45/magic/Magdir/ibm6000 2024-03-25 15:38:31.232840800 +0100
@@ -3,8 +3,11 @@
# $File: ibm6000,v 1.15 2021/07/03 14:01:46 christos Exp $
# ibm6000: file(1) magic for RS/6000 and the RT PC.
#
-0 beshort 0x01df executable (RISC System/6000 V3.1) or obj module
->12 belong >0 not stripped
+# Update: Joerg Jenderek
+#0 beshort 0x01df executable (RISC System/6000 V3.1) or obj module
+0 beshort 0x01df
+# use subroutine (./coff) to display name+flags+variables for common object formatted files
+>0 use \^display-coff
# Breaks sun4 statically linked execs.
#0 beshort 0x0103 executable (RT Version 2) or obj module
#>2 byte 0x50 pure
-------------- next part --------------
--- file-5.45/magic/Magdir/coff.old 2022-12-01 19:44:07.000000000 +0100
+++ file-5.45/magic/Magdir/coff 2024-03-25 15:31:43.722181700 +0100
@@ -5,7 +5,7 @@
#
# COFF
#
-# by Joerg Jenderek at Oct 2015, Feb 2021
+# by Joerg Jenderek at Oct 2015, Feb 2021, Mar 2024
# https://en.wikipedia.org/wiki/COFF
# https://de.wikipedia.org/wiki/Common_Object_File_Format
# http://www.delorie.com/djgpp/doc/coff/filhdr.html
@@ -14,8 +14,9 @@
# Maybe used also in adi,att3b,clipper,hitachi-sh,hp,ibm6000,intel,
# mips,motorola,msdos,osf1,sharc,varied.out,vax
0 name display-coff
-# test for unused flag bits (0x8000,0x0800,0x0400,0x0200,x0080) in f_flags
->18 uleshort&0x8E80 0
+# test for unused flag bits (0x8000,x0080) in f_flags
+# flag bits (0x0800,0x0400,0x0200) now seems to be used in RISC System/6000 V3.1
+>18 uleshort&0x8080 0
# skip DOCTOR.DAILY READER.NDA REDBOX.ROOT by looking for positive number of sections
>>2 uleshort >0
# skip ega80woa.fnt svgafix.fnt HP3FNTS1.DAT HP3FNTS2.DAT INTRO.ACT LEARN.PIF by looking for low number of sections
@@ -28,8 +29,8 @@
>>>>0 uleshort 0x0500 Hitachi SH big-endian
# Hitachi SH little-endian COFF (./hitachi-sh)
>>>>0 uleshort 0x0550 Hitachi SH little-endian
-# executable (RISC System/6000 V3.1) or obj module (./ibm6000)
-#>>>>0 uleshort 0x01DF
+# executable (RISC System/6000 V3.1) or obj module (./ibm6000 v 1.15)
+>>>>0 uleshort 0x01DF RISC System/6000 V3.1
# MS Windows COFF Intel Itanium, AMD64
# https://msdn.microsoft.com/en-us/library/windows/desktop/ms680313(v=vs.85).aspx
>>>>0 uleshort 0x0200 Intel ia64
@@ -53,6 +54,9 @@
#!:ext cof/o/obj/lib
>>>>18 leshort &0x0002 executable
#!:mime application/x-coffexec
+!:mime application/x-coff-executable
+# typically no file name suffix for executables
+!:ext /
# F_RELFLG flag bit,static object
>>>>18 leshort &0x0001 \b, no relocation info
# F_LNNO flag bit
@@ -79,16 +83,39 @@
# like: 0 2 7 9 10 11 20 35 41 63 71 80 105 146 153 158 170 208 294 572 831 1546
>>>>12 ulelong >0 \b, %d symbols
# f_opthdr - optional header size. An object file should have a value of 0
+# like: 72 (IBM\HH\HYPERHLP)
>>>>16 uleshort >0 \b, optional header size %u
-# f_timdat - file time & date stamp only for little endian
+# f_timdat - file time & date stamp
>>>>4 ledate >0 \b, created %s
# at offset 20 can be optional header, extra bytes FILHSZ-20 because
# do not rely on sizeof(FILHDR) to give the correct size for header.
# or first section header
# additional variables for other COFF files
>>>>16 uleshort =0
-# first section name s_name[8] like: .text .data .debug$S .drectve .testseg
->>>>>20 string x \b, 1st section name "%.8s"
+# most section names start with point character except samples created by "exotic" compilers
+# first section name s_name[8] like: .text .data .debug$S .drectve .testseg .rsrc .rsrc$01 .pad
+>>>>(16.s+20) string x \b, 1st section name "%.8s"
+# physical address s_paddr like: 0
+#>>>>(16.s+28) lelong !0 \b, s_paddr %#8.8x
+# virtual address s_vaddr like: 0
+#>>>>(16.s+32) lelong !0 \b, s_vaddr %#8.8x
+# section size s_size
+#>>>>(16.s+36) lelong x \b, s_size %#8.8x
+# file ptr to raw data for section s_scnpt
+#>>>>(16.s+40) lelong x \b, s_scnpt %#8.8x
+# file ptr to relocation s_relptr like: 0
+#>>>>(16.s+44) lelong !0 \b, s_relptr %#8.8x
+# file ptr to gp histogram s_lnnoptr like: 0
+#>>>>(16.s+48) lelong !0 \b, s_lnnoptr %#8.8x
+# number of relocation entries s_nreloc like: 0 1 2 5 6 8 19h 26h 27h 38h 50h 5Fh 89h Dh 1Ch 69h A9h 1DCh 651h
+#>>>>(16.s+52) uleshort x \b, s_nreloc %#4.4x
+# number of gp histogram entries s_nlnno like: 0
+#>>>>(16.s+54) uleshort !0 \b, s_nlnno %#4.4x
+# flags s_flags
+#>>>>(16.s+56) lelong x \b, s_flags %#8.8x
+# second section name s_name[8] like: .bss .data .debug$S .rsrc$01
+>>>>2 uleshort >1
+>>>>>(16.s+60) string x \b, 2nd section name "%.8s"
# >20 beshort 0407 (impure)
# >20 beshort 0410 (pure)
# >20 beshort 0413 (demand paged)