[File] [PATCH] Magdir/msdos 4DOS help file; older versions are missed

Jörg Jenderek (GMX) joerg.jen.der.ek at gmx.net
Sat Nov 25 18:11:27 UTC 2023


Hello,

A few days ago I was handling files with the suffix HLP. Unfortunately
this suffix is used by many programs for their own help documentation,
and on my systems some HLP files are not identified. So in this session
I handle the HLP samples which are used by the 4DOS command
interpreter.

When running file command version 5.45 on such HLP examples I get
output like:

4DOS-7.HLP:        4DOS help file, version 701A
4DOS-750b130.HLP:  4DOS help file, version 701A
4DOS500f.HLP:      data
4DOS552.HLP:       4DOS help file, version 552A
4DOS602.HLP:       4DOS help file, version 602A
4DOS602b.HLP:      4DOS help file, version 602A
4DOS7501-real.HLP: 4DOS help file, version 701A
4DOS7501.HLP:      4DOS help file, version 701A
4DOS777-maybe.HLP: 4DOS help file, version 701A
4DOS800-real.HLP:  4DOS help file, version 701A
4dos100.hlp:       data
4dos402b.hlp:      data
4dos551c_ge.hlp:   data
DOS221.HLP:        ASCII text, with CRLF line terminators
DOS330.HLP:        data

For these help samples the --extension option displays only ???.
Furthermore, for these samples the -i option shows only the generic
application/octet-stream or text/plain.

For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). It does not
recognize these samples.

For comparison I also ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). It recognizes all
samples. The newest examples, which the file command also recognizes,
are described as "4DOS Help" by hlp-4dos.trid.xml, and
application/x-4dos-hlp is shown as MIME type. Slightly older samples
(like 4dos551c_ge.hlp and 4DOS500f.HLP) are described in the same way.
Samples older than those variants (like 4dos402b.hlp) are described as
"4DOS Help (v4)" by hlp-4dos-v4.trid.xml. Samples before that (like
DOS221.HLP) are described as "4DOS Help (v2)" by hlp-4dos-v2.trid.xml.
The oldest samples (like DOS330.HLP and 4dos100.hlp) are described as
"Turbo Pascal Help (v2)" by hlp-tp-2.trid.xml (see appended
trid-v-hlp-4dos.txt.gz).

TrID lists the file name extension used and, with the -v option, often
the related URL pointing to file format information. With the help of
these tools I found 2 pages on Wikipedia. The current homepage
4dos.info is listed there, and that is where I got most of my 4DOS
packages with HLP samples. This information is now expressed inside
Magdir/msdos by additional comment lines like:

# URL:		https://en.wikipedia.org/wiki/Turbo_Pascal
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/h/hlp-tp-2.trid.xml
# URL:		https://en.wikipedia.org/wiki/4DOS
# Reference:	https://4dos.info/4dsource/4helpsrc.zip/TPHELP.PAS
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/h/hlp-4dos.trid.xml
#		defs/h/hlp-4dos-v4.trid.xml
#		defs/h/hlp-4dos-v2.trid.xml

In the current Magdir/msdos the description is done by lines like:
0	ulelong	0x48443408		4DOS help file
  >4	string	x			\b, version %-4.4s

TrID checks only for the 3 byte magic 4DH at offset 1. Luckily 4DOS
became open source around version 8, so I found information inside the
Pascal source TPHELP.PAS in the 4HELPSRC directory. I am no Pascal
programmer, but now I understand some of the basics. At the beginning
the Pascal string HelpSystemID is stored (with a value like
'4DH701AA'). Because it is a Pascal string, a 1 byte string length is
stored at offset 0. For the samples recognized by the file command this
value is 8. For slightly older versions (like 4DOS500f.HLP and
4dos551c_ge.hlp) the string length is 6.

The Pascal string length of HelpID can be verified by a line like:
  >0	ubyte	x			PLENGHT=%x

So now I check for a valid Pascal string length (6 or 8) of HelpID, the
4DH magic and a valid major version number (5 6 7 8). This now starts
like:
0 ubequad&0xF1ffFFffF0000000	0x0034444830000000	4DOS help file
!:mime	application/x-4dos-hlp
!:ext	hlp
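
To illustrate what the masked ubequad test above accepts, here is a
minimal Python sketch (not part of the patch). The function name
matches_4dos_help is hypothetical, and the sample byte strings are
synthetic headers built from the HelpID values mentioned in this
message, not real files. Note that the mask is somewhat looser than
"6 or 8" and "5 6 7 8": it lets any even length byte from 0 to 14 and
any byte from 0x30 to 0x3F at offset 4 pass.

```python
import struct

# Mask and value copied from the magic line above.
MASK = 0xF1FFFFFFF0000000
WANT = 0x0034444830000000


def matches_4dos_help(header: bytes) -> bool:
    """Apply the masked big-endian 64-bit test to the first 8 bytes."""
    if len(header) < 8:
        return False
    (quad,) = struct.unpack(">Q", header[:8])
    return (quad & MASK) == WANT


# Synthetic headers: length byte, "4DH", then version digits.
print(matches_4dos_help(b"\x084DH701AA"))       # True  (length 8, major 7)
print(matches_4dos_help(b"\x064DH500\x00"))     # True  (length 6, major 5)
print(matches_4dos_help(b"4DH4" + bytes(4)))    # False (version 4 layout)
print(matches_4dos_help(b"\x024DH9" + bytes(3)))  # True (mask is loose)
```

The last call shows the looseness: a length byte of 2 and a major digit
9 still pass, so the test is best read as a cheap first filter.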

The help file name is hard coded as 4DOS.HLP, and it is searched for
in some specific directories (like the current one or those determined
by some environment variables). So the file name suffix is HLP.

If the help file does not exist, the error message of 4HELP.EXE is
like:
4DOS help error: help file not found or not accessible
Press any key to exit ...
If the wrong help file is used, then you get an error message like:
4DOS help error: incorrect version of help file

This is triggered inside TPHELP.PAS if the ID at the beginning of the
file does not match the hard coded HelpSystemID like '701AA'. In all
such samples the 3 byte 4DH is the shared start characteristic. The next
digit apparently is the major version number in most cases. The next 1
or 2 digits are the minor version number. This can be verified by
running the DOS VER command. More information is shown with the /R
option. The version is also stored in the 4DOS _4VER variable, which can
be shown by the echo command. So I show this version information from
the HLP file in a similar way, as "x.yy zz" for samples with string
length 8 and as "x.yy" for samples with string length 6. The x digit is
the major version number and the yy digits are the minor version number.

In my samples with string length 8 the 2 minor version digits are
followed by the 2 byte string AA at offset 7. I first thought that this
is the revision, but that is not true. For sample 4DOS602b.HLP the
revision is B (as reported by VER /R). So I do not know what this AA
means. Maybe this is a patch level. So I show this part of the version
string after the minor version digits. The version displaying part now
becomes like:
 >4	string	x			\b, version %-1.1s
 >>0	ubyte	8			\b.
 >>>5	string	x			\b%-2.2s
 >>>7	string	x			%-.2s
 >>0	ubyte	6			\b.
 >>>5	string	x			\b%-2.2s
#>>0	default	x			\b.
#>>>5	string	x			%-2.2s
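
The version formatting done by these magic lines can be sketched in
Python as follows. This is only an illustration under the layout
described above (length byte at offset 0, then "4DH", major digit, two
minor digits and, for length 8, two more characters). The function name
help_version is hypothetical and the inputs are synthetic.

```python
def help_version(data: bytes) -> str:
    """Format the HelpID Pascal string at offset 0 as x.yy or x.yy zz."""
    if not data:
        raise ValueError("empty input")
    length = data[0]                      # Pascal string length byte
    help_id = data[1:1 + length].decode("ascii", errors="replace")
    if length not in (6, 8) or not help_id.startswith("4DH"):
        raise ValueError("not a 4DH HelpID with length 6 or 8")
    major = help_id[3]                    # offset 4 in the file
    minor = help_id[4:6]                  # offsets 5 and 6
    version = f"{major}.{minor}"
    if length == 8:
        version += " " + help_id[6:8]     # offsets 7 and 8, e.g. "AA"
    return version


print(help_version(b"\x084DH701AA"))  # 7.01 AA
print(help_version(b"\x064DH550"))    # 5.50
```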

What confused me for some weeks is that the version in the help file
does not always match the 4DOS version. In an ideal world it would. But
as written on the home page 4dos.info, for 4DOS 5.51c German the command
interpreter 4DOS.COM is version 5.51 revision c, whereas the packaged
4HELP.EXE and 4DOS.HLP are version 5.50. Likewise the newest 4DOS
samples (like 4DOS800-real.HLP, 4DOS777-maybe.HLP and 4DOS7501.HLP) use
help version 7.01.

In newer versions (false for version 5.52 and older, but true for
version 6.02 and newer), according to the sources, the 13 byte
ExtHelpName is stored at offset 24. In many samples (like 4DOS602.HLP
and newer) this is the 8 byte string HELP.COM. But other names can also
occur here, like DOSBOOK.EXE from DR-DOS, as described in the help text.
Alternatively you can specify the external help program name by a DOS
environment variable. Its name is stored in newer versions as
ExtHelpEnv at offset 38, with a value like DOSHELP. For DOSBOOK.EXE
this works, but not for HELP.COM of dosbox-x. After the version part
comes first HighestTopic, followed by NumTopics and so on. I show only
the interesting external help fields. This is done by lines like:
 >4	ubeshort	>0x3535
#>>9	uleshort x	HighestTopic=%#4.4x
#>>11	uleshort x	NumTopics=%#4.4x
 >>24	pstring	x	\b, external help %s
 >>38	pstring	x	or specified by DOS environment variable %s
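
As a rough Python illustration of these lines (not part of the patch),
the sketch below reads the two Pascal strings at offsets 24 and 38 when
the two bytes at offset 4, taken as a big-endian short, are greater than
0x3535. The names read_pstring and external_help are hypothetical, the
minimum length 55 assumes a 16 character ExtHelpEnv field, and the
header is a synthetic one filled with the values named in this message.

```python
def read_pstring(data: bytes, offset: int) -> str:
    """Read a Pascal string: one length byte followed by that many bytes."""
    length = data[offset]
    return data[offset + 1:offset + 1 + length].decode("ascii", "replace")


def external_help(data: bytes):
    """Return (ExtHelpName, ExtHelpEnv), or None for older headers."""
    if len(data) < 55:                    # 38 + 1 length byte + 16 chars
        return None
    # Same comparison as ">4 ubeshort >0x3535": "60" passes, "55" does not.
    if int.from_bytes(data[4:6], "big") <= 0x3535:
        return None
    return read_pstring(data, 24), read_pstring(data, 38)


# Synthetic header with only the fields used here filled in.
header = bytearray(55)
header[0:9] = b"\x084DH602AA"
header[24:33] = b"\x08HELP.COM"
header[38:46] = b"\x07DOSHELP"
print(external_help(bytes(header)))   # ('HELP.COM', 'DOSHELP')

header[0:9] = b"\x084DH552AA"
print(external_help(bytes(header)))   # None
```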

My work on the above patterns was not in vain, because in slightly
older samples like 4dos402b.hlp the structure is organized a little
differently. There is no Pascal string length at the beginning. The file
starts directly with the 3 byte magic 4DH followed by the digit 4, which
apparently is the major version number four. So such samples are
described by lines like:
0	string		4DH4	4DOS help file, version 4.x
!:mime	application/x-4dos-hlp
!:ext	hlp

There are samples (like DOS221.HLP, not the oldest but before version
4) which are not recognized by the current definitions. In the newer
HLP samples the topic/command names are stored as ASCII strings after
the header, but in the last part the describing text is stored in a
compressed way. This version is one evolution step earlier in 4DOS. It
is neither the "Pascal" nor the "4DH" help system. Apparently everything
is stored here as ASCII text. First come the topic/command names,
starting with ALIAS, ASSIGN and ending with the sequence VOL, XCOPY, Y.
Every name is stored on a line of its own. Afterwards comes the
describing text. Every text section starts with a line which consists
of the topic name surrounded by asterisk characters (*). Another
difference is that the help program is here called HELP.EXE and the help
file name is DOS.HLP. Apparently to avoid collisions with the help of
Microsoft MS-DOS, in later versions these names became 4HELP.EXE and
4DOS.HLP. So such samples are described by lines like:
0	string	ALIAS\r\nASSIGN\r\n
 >13	search/3016	4DOS	4DOS help file, version 2.x
!:mime	application/x-4dos-hlp
!:ext	hlp
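
The test for this text based variant can be sketched in Python as shown
below. It is only an illustration: the function name is_4dos_v2_help is
hypothetical, the sample text is synthetic, and the window arithmetic
reflects my reading of search/3016 (a match may start at any of 3016
positions beginning at offset 13).

```python
PREFIX = b"ALIAS\r\nASSIGN\r\n"   # first two topic names, CRLF terminated
NEEDLE = b"4DOS"
SEARCH_START = 13
SEARCH_RANGE = 3016


def is_4dos_v2_help(data: bytes) -> bool:
    """Check the leading topic names, then look for 4DOS in a window."""
    if not data.startswith(PREFIX):
        return False
    # Allow the match to start at SEARCH_RANGE positions from SEARCH_START.
    end = SEARCH_START + SEARCH_RANGE + len(NEEDLE) - 1
    return NEEDLE in data[SEARCH_START:end]


# Synthetic example text, not taken from a real DOS.HLP.
sample = PREFIX + b"ATTRIB\r\n" + b"text that mentions 4DOS\r\n"
print(is_4dos_v2_help(sample))                    # True
print(is_4dos_v2_help(PREFIX + b"no marker\r\n")) # False
```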

Interesting is what is written about the 4DOS help system. It is a
modified version based on the TPHELP unit from the TurboPower Software
Turbo Professional 5.0 toolkit for Turbo Pascal. So apparently in the
first versions (like in samples 4dos100.hlp and DOS330.HLP) that
original help system was used. The oldest samples are therefore
described as "Turbo Pascal Help (v2)" by TrID definition
hlp-tp-2.trid.xml. Expressed as magic lines, these patterns now become:
0	string		TPH2	Turbo Pascal help, version 2
!:mime	application/x-pascal-hlp
!:ext	hlp
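
The two fixed string patterns in this message (4DH4 and TPH2) amount to
a simple prefix lookup. A minimal Python sketch, with the hypothetical
names FIXED_SIGNATURES and classify_fixed and with descriptions and MIME
types copied from the magic lines above:

```python
# (magic at offset 0, description, MIME type) as given in the magic lines.
FIXED_SIGNATURES = (
    (b"TPH2", "Turbo Pascal help, version 2", "application/x-pascal-hlp"),
    (b"4DH4", "4DOS help file, version 4.x", "application/x-4dos-hlp"),
)


def classify_fixed(data: bytes):
    """Return (description, mime) for a known 4 byte prefix, else None."""
    for magic, description, mime in FIXED_SIGNATURES:
        if data.startswith(magic):
            return description, mime
    return None


print(classify_fixed(b"TPH2" + bytes(16)))
print(classify_fixed(b"4DH4" + bytes(16)))
print(classify_fixed(b"\x084DH701AA"))   # None, handled by the masked test
```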

After applying the above mentioned modifications by patch
file-5.45-msdos-hlp-4dos.diff all 4DOS samples are recognized and I get
output like:

4DOS-7.HLP:        4DOS help file, version 7.01 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4DOS-750b130.HLP:  4DOS help file, version 7.01 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4DOS500f.HLP:      4DOS help file, version 5.00
4DOS552.HLP:       4DOS help file, version 5.52 AA
4DOS602.HLP:       4DOS help file, version 6.02 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4DOS602b.HLP:      4DOS help file, version 6.02 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4DOS7501-real.HLP: 4DOS help file, version 7.01 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4DOS7501.HLP:      4DOS help file, version 7.01 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4DOS777-maybe.HLP: 4DOS help file, version 7.01 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4DOS800-real.HLP:  4DOS help file, version 7.01 AA
		   , external help HELP.COM or
		   specified by DOS environment variable DOSHELP
4dos100.hlp:       Turbo Pascal help, version 2
4dos402b.hlp:      4DOS help file, version 4.x
4dos551c_ge.hlp:   4DOS help file, version 5.50
DOS221.HLP:        4DOS help file, version 2.x
DOS330.HLP:        Turbo Pascal help, version 2


I hope my diff file can be applied in a future version of the
file utility.

With best wishes
Jörg Jenderek
--
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-hlp-4dos.txt.gz
Type: application/x-gzip
Size: 666 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231125/c446595f/attachment.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/msdos.old	2023-05-21 18:04:05.000000000 +0200
+++ file-5.45/magic/Magdir/msdos	2023-11-25 18:51:32.433055200 +0100
@@ -1830,7 +1830,83 @@
 #!:ext	msg/dat
+
+# Summary:	Turbo Pascal Help
+# From:		Joerg Jenderek
+# URL:		https://en.wikipedia.org/wiki/Turbo_Pascal
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/h/hlp-tp-2.trid.xml
+# Note:		called "Turbo Pascal Help (v2)" by TrID
+0	string		TPH2	Turbo Pascal help, version 2
+#!:mime	application/octet-stream
+!:mime	application/x-pascal-hlp
+# 4DOS help file, version 1.00 3.30
+!:ext	hlp
+# URL:		https://en.wikipedia.org/wiki/4DOS
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/h/hlp-4dos-v2.trid.xml
+# Note:		called "4DOS Help (v2)" by TrID
+0	string	ALIAS\r\nASSIGN\r\n
+>13	search/3016	4DOS	4DOS help file, version 2.x
+#!:mime	text/plain
+!:mime	application/x-4dos-hlp
+# DOS.HLP 4DOS help file, version 2.21
+!:ext	hlp
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/h/hlp-4dos-v4.trid.xml
+# Note:		called "4DOS Help (v4)" by TrID
+0	string		4DH4	4DOS help file, version 4.x
+#!:mime	application/octet-stream
+!:mime	application/x-4dos-hlp
+# 4dos402b.hlp
+!:ext	hlp
+# Reference:	https://4dos.info/4dsource/4helpsrc.zip/TPHELP.PAS
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/h/hlp-4dos.trid.xml
 # 4DOS help (.HLP) files added by Joerg Jenderek from source TPHELP.PAS
 # of https://www.4dos.info/
-# pointer,HelpID[8]=4DHnnnmm
-0	ulelong	0x48443408		4DOS help file
->4	string	x			\b, version %-4.4s
+# check for valid pascal string length (6 or 8) of HelpID, 4DH magic, valid major number (5 6 7 8)
+0	ubequad&0xF1ffFFffF0000000	0x0034444830000000	4DOS help file
+#!:mime	application/octet-stream
+!:mime	application/x-4dos-hlp
+!:ext	hlp
+# pascal string length of of HelpID like: 6 8
+#>0	ubyte	x			PLENGHT=%x
+# Note:	version string correspond or is a little bit lower than value of _4VER variable or output of 4DOS command `VER /R`
+# one-digit major version number of version string
+>4	string	x			\b, version %-1.1s
+# two-digit minor version number depending on pascal string length at the beginning
+>>0	ubyte	8			\b.
+>>>5	string	x			\b%-2.2s
+# Byte at offset 7 (A=41h) and 8 (A=41h) is not Revison like C (=43h) as reported by VER /R for 4DOS602b.HLP
+# GRR: maybe this is patch level
+>>>7	string	x			%-.2s
+# few samples with string length 6 (implying exact 2 byte minor version digits) like in 4DOS500f.HLP 4dos551c_ge.hlp
+>>0	ubyte	6			\b.
+>>>5	string	x			\b%-2.2s
+# just in case pascal string length is neither 6 nor 8
+#>>0	default	x			\b.
+#>>>5	string	x			%-2.2s
+# false for version 5.52 and older, but true for version 6.02 and newer
+>4	ubeshort	>0x3535
+# HighestTopic; highest topic number
+#>>9	uleshort x			HighestTopic=%#4.4x
+# NumTopics; number of topics
+#>>11	uleshort x			NumTopics=%#4.4x
+# BiggestTopic; size of largest topic in uncompressed bytes
+#>>13	uleshort x			BiggestTopic=%#4.4x
+# NamedTopics; number of topics in help index
+#>>15	uleshort x			NamedTopics=%#4.4x
+# NameSize; Size of largest name, 0 for none
+#>>17	uleshort x			NameSize=%#4.4x
+# PickSize; size of each entry in pick table, 0 for none
+#>>18	uleshort x			PickSize=%#4.4x
+# width; width of help window, with frame if any
+#>>19	ubyte x				Width=%#2.2x
+# FirstTopic; topic to show first (0 = index)
+#>>20	uleshort x			FirstTopic=%#4.4x
+# KeysTopic; topic to show when keys help needed
+#>>22	uleshort x			KeysTopic=%#4.4x
+# ExtHelpName; string[13]; name for external help program like: HELP.COM DOSBOOK.EXE
+>>24	pstring	x			\b, external help %s
+# ExtHelpEnv; String[16]; environment variable for alternate external help program name like: DOSHELP
+>>38	pstring	x			or specified by DOS environment variable %s
+# XlateArray = array[0..29] of Byte; {Most common characters in help text}
+#>>55	ubequad x			XlateArray=%#16.16llx
+# SharewareData : SharewareDataRec; shareware info for 4DOS.COM
+#>>87	ubequad x			SharewareData=%#16.16llx
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-msdos-hlp-4dos.diff.sig
Type: application/octet-stream
Size: 1827 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231125/c446595f/attachment.obj>

