[File] [PATCH] of Magdir/os2, msdos for OS/2 help message *.msg+ *.hlp *.inf *.ini *.dos
Christos Zoulas
christos at zoulas.com
Sun Aug 30 16:23:26 UTC 2020
Committed, thanks!
christos
> On Aug 29, 2020, at 5:13 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
> some days ago I handled some OS/2 disks. When running the file command
> version 5.39 with the -k option on such OS/2 files and some similar
> files, I get output like:
>
> echo.sys: DOS executable (block device driver)
> ELNK.DOS: data
> EPABKBKS.HLP: OS/2 HLP (Help)
> EPW.INI: DOS executable (block device driver)
> OS/2 INI
> IBMMPC.DOS: data
> IBMTOK.DOS: DOS executable (character device driver,
> control strings-support)
> LSIH.MSG: data
> NWREQOS2.MSG: data
> OS2PING.INF: OS/2 INF (OS2PING Help File)
> PNG.MSG: data
> REX.MSG: data
> VPD.INI: DOS executable (block device driver)
> OS/2 INI
> XDF.MSG: data
> XI1.MSG: data
>
> With the --extension option, in most cases only ??? is displayed.
> Furthermore, with the -i option only the generic
> application/octet-stream is shown for many samples.
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It lists the
> file name extension used and, with the -v option, often a related URL
> pointing to information about the file format.
>
> Luckily, TrID identifies msg files as "OS/2 help Message" and
> displays a related URL. This is now recorded inside Magdir/os2 by an
> additional comment line like
> # URL: http://fileformats.archiveteam.org/wiki/MSG_(OS/2)
> More information about that file format can be found in the header file
> of an MKMSGF clone. This is now recorded by an additional comment line
> like:
> # github.com/OS2World/UTIL-SYSTEM-MKMSGF/blob/master/mkmsgf.h
> This software is just a clone of the original IBM mkmsgf tool, so
> some fields and their meaning are not explained, especially for old
> versions and for message text pointer handling. Or I am too stupid to
> understand the sources.
>
> According to the reference, such MSG files start with a characteristic
> 8-byte magic. That is expressed by magic lines like
> 0 string \xffMKMSGF\0 OS/2 help message
> !:mime application/x-os2-msg
> !:ext msg
> Afterwards comes a 3-byte identifier string like DOS, NET, REX, SYS, etc.
> That is shown by the following line
>> 8 string x '%.3s'
>
> To keep the output short, I show values only for rare or exotic
> cases.
> The file format version is stored as a 2-byte value. Only two
> values should occur, where 0 means "old" version and 2 means "new"
> version. Most examples, especially nowadays, are new. So show this
> information only for old versions by a line like
>> 16 uleshort !2 \b, version %u
>
> The byte field offset16bit records whether the message index table uses
> 16-bit pointers (1) or 32-bit pointers (0). Most message examples are
> small (<64K), and for these 16-bit pointers are used. But for some
> large examples like NWREQOS2.MSG 32-bit pointers are used. So
> show this information by a line like
>> 15 ubyte =0 \b, 32-bit
>
> The short variable indextaboffset stores the offset of the message
> index table. For "new" examples I only found the value 1Fh, which
> means the index table comes directly after the header. For the "old"
> variant I only found the value 0, which seems to mean "use default
> value"; here I also found the table at offset 1Fh. So show a possibly
> unusual table offset by lines
>> 18 uleshort >0
>>> 18 uleshort !0x1f \b, at 0x%x index
> So test in one branch for 32-bit pointers, then display the offset of
> the message block and the first message text by lines like:
>>> 15 ubyte =0
>>>> (18.s) ulelong x \b, at 0x%x
>>>>> (&-4.l) ubyte x %c-type
>>>>>> &0 string x %s
> According to os2-1.0-ptk-tools-1988.pdf the string starts with 1 ASCII
> character that describes the type of the message: E means Error, H
> means Help, I means Information and P means Prompt. ? seems to mean
> unused or empty. After that character comes the actual message text.
> Then I follow a similar procedure for the 16-bit variant and for "old"
> examples with zero indextaboffset.
>
> The last fields in the header before the padding zero bytes are
> countryinfo and next countryinfo. For version 0 these fields are zero.
> So show only non-zero values by lines
>> 20 uleshort !0 \b, at 0x%x countryinfo
>>> 22 uleshort >0 \b, at 0x%x next
>
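
As a cross-check of the header layout described above, here is a minimal Python sketch that reads the same fields at the same offsets as the quoted magic lines. The function name parse_msg_header and the dictionary keys are hypothetical, chosen only for illustration. Bytes 11 to 14 are not covered by the magic lines quoted in this mail, so the sketch skips them.

```python
import struct
from typing import Optional

MSG_MAGIC = b"\xffMKMSGF\x00"  # 8-byte magic at offset 0


def parse_msg_header(data: bytes) -> Optional[dict]:
    """Return the header fields tested by the magic lines, or None."""
    if len(data) < 24 or not data.startswith(MSG_MAGIC):
        return None
    # Offset 8: 3-byte identifier such as DOS, NET, REX, SYS.
    identifier = data[8:11].decode("ascii", errors="replace")
    # Offset 15: offset16bit flag; per the mail, 0 selects 32-bit pointers.
    offset16bit = data[15]
    # Offsets 16, 18, 20, 22: four little-endian 16-bit values.
    version, index_off, country_off, next_country_off = struct.unpack_from(
        "<4H", data, 16
    )
    return {
        "identifier": identifier,
        "uses_32bit_pointers": offset16bit == 0,
        "version": version,
        "index_table_offset": index_off,
        "country_info_offset": country_off,
        "next_country_info_offset": next_country_off,
    }
```

Calling parse_msg_header(open(path, "rb").read(24)) on a sample should give values comparable to what the magic lines print, assuming the offsets above are right.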
> Because the country block contains some interesting information, I
> jump to this offset and inspect the block with the subroutine
> os2-msg-info. This looks like:
>>> (20.s) use os2-msg-info
> 0 name os2-msg-info
>
> A non-zero language id of the message file is shown in that
> subroutine by lines like
>> 3 uleshort >0 \b, language %u
>>> 5 uleshort x \b_%u
> So for example in LSIH.MSG the value 7_1 means German_Germany, and the
> value 12_3 means Canadian French.
>
> After the language part comes the code page part. First comes the
> number of code pages used (at most 16), followed by the DOS code page
> numbers. This is expressed by lines like:
>> 7 uleshort x \b, %u code page
>> 7 uleshort >1 \bs
>> 7 uleshort <17
>>> 9 uleshort >0 %u
>>>> 7 uleshort >1
>>>>> 11 uleshort x %u
> Many examples like NWREQOS2.MSG contain only 1 code page, which is 437
> in most cases. But a few examples like XDF.MSG contain 2 code pages,
> often 437 and 850.
> After the code page part, the file name (dbaseos2.msg, xdfh.msg,
> dde4c01e.msg, os2ldr.mgr and so on) is stored. So show this
> information by a line like
>> 41 string x \b, %s
>
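
The country block layout used by os2-msg-info can be sketched in Python as follows. This is only an illustration built from the offsets in the quoted magic lines (language at 3 and 5, code page count at 7, code pages from 9, file name at 41); the function name parse_country_info is hypothetical, and bytes 0 to 2 of the block are not described in this mail.

```python
import struct


def parse_country_info(block: bytes) -> dict:
    """Decode the fields that the os2-msg-info magic lines display."""
    if len(block) < 41:
        raise ValueError("country block too short")
    # Offsets 3, 5, 7: language id, sub-language id, number of code pages.
    language, sublanguage, count = struct.unpack_from("<3H", block, 3)
    # At most 16 code pages are expected; ignore implausible counts.
    code_pages = []
    if count <= 16:
        code_pages = list(struct.unpack_from("<%dH" % count, block, 9))
    # Offset 41 (= 9 + 16 * 2): NUL-terminated file name.
    filename = block[41:].split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return {
        "language": language,
        "sublanguage": sublanguage,
        "code_pages": code_pages,
        "filename": filename,
    }
```

The file name offset 41 is consistent with 16 two-byte code page slots starting at offset 9, which supports the stated maximum of 16 code pages.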
> To show a user-defined MIME type and file name extension for HLP
> files, I also add more lines inside Magdir/os2 after the magic line
> 0 string HSP\x10\x9b\x00 OS/2 HLP
> So now for samples like EPABKBKS.HLP I show that information by 2 lines
> !:mime application/x-os2-hlp
> !:ext hlp
> I do the same for OS/2 INF and OS/2 INI. The latter is
> identified by the magic line
> 0 string \xff\xff\xff\xff\x14\0\0\0 OS/2 INI
>
> This looks similar to DOS device drivers, which are identified by a
> magic line inside Magdir/msdos like
> 0 ulequad&0x07a0ffffffff 0xffffffff
> So OS/2 INI files like EPW.INI and VPD.INI are misidentified as DOS
> device drivers by Magdir/msdos. So I add an additional test to skip
> OS/2 INI files. This now becomes
> 0 ulequad&0x07a0ffffffff 0xffffffff
>> 4 ubelong !0x14000000
>>> 0 use msdos-driver
>
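
To see why the OS/2 INI magic collides with the DOS driver test, the two checks can be replayed in a few lines of Python. This is a sketch of the logic in the quoted magic lines, not code from the patch; the function name looks_like_dos_driver and its skip_os2_ini parameter are hypothetical.

```python
import struct

DRIVER_MASK = 0x07A0FFFFFFFF  # mask from the Magdir/msdos line
OS2_INI_WORD = 0x14000000     # big-endian long at offset 4 in OS/2 INI files


def looks_like_dos_driver(data: bytes, skip_os2_ini: bool = True) -> bool:
    """Replay the masked ulequad test, optionally skipping OS/2 INI files."""
    if len(data) < 8:
        return False
    # "0 ulequad&0x07a0ffffffff 0xffffffff": little-endian 64-bit value.
    (quad,) = struct.unpack_from("<Q", data, 0)
    if quad & DRIVER_MASK != 0xFFFFFFFF:
        return False
    if skip_os2_ini:
        # ">4 ubelong !0x14000000": reject the OS/2 INI signature.
        (word,) = struct.unpack_from(">I", data, 4)
        if word == OS2_INI_WORD:
            return False
    return True
```

For the INI magic bytes ff ff ff ff 14 00 00 00, the masked value is 0xffffffff because 0x14 has no bits in common with 0x07a0, so only the extra test at offset 4 separates the two formats.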
> The URL pointing to information about DOS device drivers probably does
> not exist any more. So I looked for similar sites on the net. This is
> now expressed by additional lines like:
> # URL: http://fileformats.archiveteam.org/wiki/DOS_device_driver
> # Reference: http://www.delorie.com/djgpp/doc/rbinter/it/46/16.html
>
> At the beginning a 4-byte pointer to the next driver is stored.
> For most DOS device drivers (about 94% = 98/104 of my inspected
> samples) this value is 0xffffffff. These are matched by the above
> construction. Unfortunately this is not a strict condition. Some
> examples like Uwe Sieber's echo.sys, found in archive cfg_echo.zip, are
> recognized by explicitly looking for characteristic byte sequences at
> the beginning and then calling the display subroutine by lines like
> 0 ulequad 0x001600000000ffff
>> 0 use msdos-driver
> So such an unusual pointer value is now shown at the end of that
> subroutine by an additional line like
>> 0 ulelong !0xffffffff with pointer 0x%x
>
> This was useful for me, when comparing identification
> success/failure of file command with other tools like TrID.
>
> According to the updated reference, DOS is also used as a file name
> extension. On OS/2 discs I found samples like the IBM Token-Ring
> adapter driver IBMTOK.DOS. So the file name extension line now becomes
> !:ext sys/dev/bin/dos
>
> I found no explanation of why and when the DOS file name extension is
> used instead of SYS. Maybe to explicitly distinguish such drivers from
> drivers or executables for the OS/2 system like IBMTOK.OS2.
>
> Furthermore, I found DOS driver examples inside archive DLSNETDR.ZIP
> on an OS/2 CD-ROM which are not detected, because bits that are
> declared as reserved in old documentation are used. But these 2
> examples use the expected starting pointer value. So I add
> additional lines for those 2 exceptions like:
> 0 ulequad 0x027ac0c0ffffffff
>> 0 use msdos-driver
> 0 ulequad 0x00228880ffffffff
>> 0 use msdos-driver
> Maybe it is possible to merge some DOS driver branches.
>
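
The explicit exceptions and the new pointer display can be illustrated the same way. The Python sketch below uses only the three ulequad values quoted above and the "with pointer" test; the names EXPLICIT_QUADS, matches_explicit_driver and unusual_pointer are hypothetical.

```python
import struct
from typing import Optional

# The three explicit ulequad values from the quoted magic lines.
EXPLICIT_QUADS = {
    0x001600000000FFFF,  # echo.sys style header with pointer 0xffff
    0x027AC0C0FFFFFFFF,  # DLSNETDR.ZIP sample using reserved bits
    0x00228880FFFFFFFF,  # DLSNETDR.ZIP sample using reserved bits
}


def matches_explicit_driver(data: bytes) -> bool:
    """True if the first 8 bytes equal one of the explicit values."""
    if len(data) < 8:
        return False
    (quad,) = struct.unpack_from("<Q", data, 0)
    return quad in EXPLICIT_QUADS


def unusual_pointer(data: bytes) -> Optional[int]:
    """Return the next-driver pointer unless it is the usual 0xffffffff."""
    (pointer,) = struct.unpack_from("<I", data, 0)
    return None if pointer == 0xFFFFFFFF else pointer
```

For the echo.sys value the low 32 bits are 0x0000ffff, which corresponds to the "with pointer 0xffff" text in the new output, while the two DLSNETDR.ZIP values keep the usual pointer and fail only the masked test.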
> After applying the above modifications with the patches
> file-5.39-os2.diff and file-5.39-msdos-os2.diff, the misidentifications
> vanish and I get more precise output like:
>
> echo.sys: DOS executable (block device driver)
> with pointer 0xffff
> ELNK.DOS: DOS executable (character device driver,
> IOCTL-,control strings-support)
> EPABKBKS.HLP: OS/2 HLP (Help)
> EPW.INI: OS/2 INI
> IBMMPC.DOS: DOS executable (character device driver,
> close media-support)
> IBMTOK.DOS: DOS executable (character device driver,
> control strings-support)
> LSIH.MSG: OS/2 help message 'LSI', 113 messages,
> number 559
> at 0x230 H-type Ursache:
> Die Version des z. Zt. installierten,
> at 0x101 countryinfo, language 7_1,
> 1 code page 850, LSIH.MSG
> NWREQOS2.MSG: OS/2 help message 'REQ', 1302 messages,
> 1st number 98, 32-bit,
> at 0x15a5 I-type VeRsIoN=2.11,
> at 0x1477 countryinfo,
> 1 code page 437, nwreqos2.msg
> OS2PING.INF: OS/2 INF (OS2PING Help File)
> PNG.MSG: OS/2 help message 'IIC', 140 messages,
> 1st number 4001, version 0,
> at 0x137 I-type PING -
> ICMP Echoanforderung/-antwort %8.%9,
> at 0x164 I-type Copyright (c) 1995 Network TeleSystems,
> Inc. Alle Rechte vorbehalten.,
> at 0x1ac E-type Paketgráe zu groá. Max. Daten = %9 Byte
> REX.MSG: OS/2 help message 'REX', 127 messages,
> version 0,
> at 0x11d W-type ,
> at 0x120 W-type %1File Table full%2,
> at 0x136 W-type
> VPD.INI: OS/2 INI
> XDF.MSG: OS/2 help message 'XDF', 20 messages,
> 1st number 3502, number 373
> at 0x194 P-type Quellendiskette in
> Laufwerk %1 einlegen,
> at 0x47 countryinfo, at 0x817 next,
> 2 code pages 850 437, xdf.msg
> XI1.MSG: OS/2 help message 'XI1', 180 messages,
> version 0,
> at 0x187 ?-type ,
> at 0x18a E-type Fehler beim Aufruf \201ber
> die Befehlszeile.
> Es wurden nicht alle Parameter oder nicht
> unterst\201tzte Parameter/Werte angegeben.,
> at 0x208 E-type Datei CONFIG.SYS kann nicht wie
> \201ber Parameter "/TU:" vorgeschrieben
> gefunden werden.
>
> I hope my 2 diff files can be applied in a future version of the
> file utility.
>
> With best wishes
> Jörg Jenderek
> - --
> Jörg Jenderek
>
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCX0rE2QAKCRCv8rHJQhrU
> 1iThAKC/G+HW+bcCH7wa3GROVGBU9j1GLACfTYPco7gmR0YqhcOAT0HRwQ823Yo=
> =uzjF
> -----END PGP SIGNATURE-----
> <file-5_39-msdos-dos-os2_diff.DEFANGED-0><file-5_39-os2-msg_diff.DEFANGED-1><file-5_39-msdos-dos-os2_diff_sig.DEFANGED-2><file-5_39-os2-msg_diff_sig.DEFANGED-3>--
> File mailing list
> File at astron.com
> https://mailman.astron.com/mailman/listinfo/file
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 235 bytes
Desc: Message signed with OpenPGP
URL: <https://mailman.astron.com/pipermail/file/attachments/20200830/a203e4a3/attachment.asc>