[File] [PATCH] Magdir/algol68 xBase program *.prg misidentified as Algol 68 source

Jörg Jenderek joerg.jen.der.ek at gmx.net
Thu Aug 12 22:17:37 UTC 2021


Hello,

Some days ago I inspected some dBase examples. Now I have looked at xBase
program scripts with the PRG file name extension.

When running the file command version 5.40 on such examples and
related files I get output like:

general.a68:  ASCII text, with CRLF line terminators
graph_2d.a68: ASCII text, with CRLF line terminators
is_prime.a68: Algol 68 source, ASCII text, with CRLF line terminators
pow_mod.a68:  Algol 68 source, ASCII text, with CRLF line terminators
sort.a68:     Algol 68 source, ASCII text, with CRLF line terminators
AREACODE.PRG: Algol 68 source, ASCII text, with CRLF line terminators
BARCOUNT.PRG: Algol 68 source, ASCII text, with CRLF line terminators
MENUS.PRG:    Algol 68 source, ASCII text, with CRLF line terminators
VENDORS.PRG:  Algol 68 source, ASCII text, with CRLF line terminators

For the examples I found a page about the ALGOL 68 programming language
on Wikipedia. More information can be found in the Revised Report on
the Algorithmic Language Algol 68. This information is now expressed
by additional comment lines inside Magdir/algol68 like:
  # URL: 	https://en.wikipedia.org/wiki/ALGOL_68
  # Reference:	http://www.softwarepreservation.org/projects/ALGOL/
  #		report/Algol68_revised_report-AB.pdf

My examples are detected by lines inside Magdir/algol68 like:
  0	search/8192	(input,			Algol 68 source text
  !:mime	text/x-Algol68
  0	regex/1024	\^PROC			Algol 68 source text
  !:mime	text/x-Algol68

With these lines the program looks for specific keywords or language
constructs near the beginning of the file and then displays the
describing text "Algol 68 source text". Unfortunately these patterns
are sometimes not unique enough.
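
As an illustration only, the effect of the PROC rule can be mimicked in
Python. This is a rough sketch and not how file(1) is implemented. It
assumes that the count after "regex/" is a byte limit and that "\^PROC"
means PROC at the start of a line. The function name and the two sample
inputs are made up for this example.

```python
import re

# Hypothetical stand-in for the magic line "0 regex/1024 \^PROC":
# report a hit if PROC starts any line within the first `limit` bytes.
def old_proc_rule(data: bytes, limit: int = 1024) -> bool:
    return re.search(rb"^PROC", data[:limit], re.MULTILINE) is not None

# Made-up sample contents, not taken from the real example files.
algol = b"PROC is prime = (INT n) BOOL:\r\n  n > 1\r\n"
xbase = b"* vendor maintenance\r\nPROCEDURE Vendors\r\nRETURN\r\n"

print(old_proc_rule(algol))  # True
print(old_proc_rule(xbase))  # True, the keyword PROCEDURE also starts with PROC
```

Both inputs match, which is the misidentification seen for the *.PRG
files above.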

So I put the displaying part inside a subroutine like:
  0	name		algol_68		Algol 68 source text
  !:mime	text/x-Algol68
  !:ext   a68
Now the file name extension "a68" is shown as well. The file name
extension "alg" may also be in use.

The first construct now becomes:
  0	search/8192	(input,
  >0	use		algol_68

Some Algol 68 source examples like general.a68 and graph_2d.a68 are
only described as "ASCII text" because the search range is too small.
Furthermore, the second construct also matches many dBase program
scripts (*.PRG) that use the keyword PROCEDURE followed by a name (like
Areacode, BarCount, Def_mens, Vendors). On the other hand, it also
matches the keyword PROC, probably followed by white space, which is
used to declare Algol procedures.
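
The distinction described here can be sketched in Python. Again this is
only an illustration under assumptions, not the code used by file(1):
the count is treated as a byte limit, only the first PROC at a line
start is examined, and the 4006-byte window is the value used in the
patch. The function name and the sample inputs are hypothetical.

```python
import re

# Hypothetical sketch: accept PROC at a line start unless the word
# found there is actually the xBase keyword PROCEDURE.
def is_algol_proc(data: bytes, limit: int = 4006) -> bool:
    match = re.search(rb"^PROC", data[:limit], re.MULTILINE)
    if match is None:
        return False
    # Step back 4 bytes from the end of the match to the "P" of PROC.
    start = match.end() - 4
    return not data.startswith(b"PROCEDURE", start)

# Made-up sample contents, not taken from the real example files.
print(is_algol_proc(b"PROC is prime = (INT n) BOOL:\r\n"))  # True
print(is_algol_proc(b"PROCEDURE Vendors\r\nRETURN\r\n"))    # False
```

Stepping back from the end of the match mirrors the relative offset
"&-4" used in the magic lines of the patch.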

So the second construct now becomes:
  0	regex/4006	\^PROC
  #>&-4	string		=PROCEDURE		\b, dBase PROCEDURE
  >&-4	string		!PROCEDURE
  >>0	use		algol_68

After applying the above-mentioned modifications with the patch
file-5.40-algol68-a68.diff, more Algol 68 sources are recognized
and the misidentification of xBase program scripts (*.prg)
vanishes:

general.a68:  Algol 68 source, ASCII text, with CRLF line terminators
graph_2d.a68: Algol 68 source, ASCII text, with CRLF line terminators
is_prime.a68: Algol 68 source, ASCII text, with CRLF line terminators
pow_mod.a68:  Algol 68 source, ASCII text, with CRLF line terminators
sort.a68:     Algol 68 source, ASCII text, with CRLF line terminators
AREACODE.PRG: ASCII text, with CRLF line terminators
BARCOUNT.PRG: ASCII text, with CRLF line terminators
MENUS.PRG:    ASCII text, with CRLF line terminators
VENDORS.PRG:  ASCII text, with CRLF line terminators

I hope my diff file can be applied in a future version of the file utility.

With best wishes
Jörg Jenderek
--
Jörg Jenderek

-------------- next part --------------
--- file-5.40/magic/Magdir/algol68.old	2021-02-22 23:49:24 +0000
+++ file-5.40/magic/Magdir/algol68	2021-08-12 21:59:37 +0000
@@ -5,15 +5,37 @@
 #
-0	search/8192	(input,			Algol 68 source text
-!:mime	text/x-Algol68
-0	regex/1024	\^PROC			Algol 68 source text
-!:mime	text/x-Algol68
-0	regex/1024	\bMODE[\t\ ]		Algol 68 source text
-!:mime	text/x-Algol68
-0	regex/1024	\bREF[\t\ ]		Algol 68 source text
-!:mime	text/x-Algol68
-0	regex/1024	\bFLEX[\t\ ]\*\\[	Algol 68 source text
+# URL: 		https://en.wikipedia.org/wiki/ALGOL_68
+# Reference:	http://www.softwarepreservation.org/projects/ALGOL/report/Algol68_revised_report-AB.pdf
+# Update:	Joerg Jenderek
+0	search/8192	(input,
+>0	use		algol_68
+# graph_2d.a68
+0	regex/4006	\^PROC
+#>&-4	string		x			\b, dBase or Algol "%s"
+# most xBase scripts *.prg with PROCEDURE like: Areacode BarCount Def_mens Vendors
+#>&-4	string		=PROCEDURE		\b, dBase PROCEDURE
+# skip xBase program scripts *.prg with PROCEDURE keyword
+# keyword proc probably followed by white space used to specify algol procedures
+>&-4	string		!PROCEDURE
+>>0	use		algol_68
+0	regex/1024	\bMODE[\t\ ]
+>0	use		algol_68
+0	regex/1024	\bMODE[\t\ ]
+>0	use		algol_68
+0	regex/1024	\bREF[\t\ ]
+>0	use		algol_68
+0	regex/1024	\bFLEX[\t\ ]\*\\[
+>0	use		algol_68
+
+# display information like mime type and file name extension of Algol 68 source text
+0	name		algol_68		Algol 68 source text
 !:mime	text/x-Algol68
+# https://file-extension.net/seeker/file_extension_a68
+!:ext   a68
+#!:ext   a68/alg
+
 #0	regex          	[\t\ ]OD		Algol 68 source text
+#>0	use		algol_68
 #!:mime	text/x-Algol68
 #0	regex          	[\t\ ]FI		Algol 68 source text
+#>0	use		algol_68
 #!:mime	text/x-Algol68

