[File] [PATCH] Magdir/printer Hewlett-Packard Graphics Language misidentifies QEMU shell scripts +

Jörg Jenderek (GMX) joerg.jen.der.ek at gmx.net
Sat Jun 3 19:44:48 UTC 2023


Hello,

Some time ago (16.12.2022) I sent a patch for Magdir/printer to recognize
Hewlett-Packard Graphics Language files, which typically have a file
suffix like hpgl/hpg/hp. When running file command version 5.44 on such
samples and on related misidentified files, I get output like:

Linux-syscall-note:
	Hewlett-Packard Graphics Language, starting with
	"SPDX-Exception-Identifier: Linux-syscall-note"
	with "SPDX-URL: "
miter.hp:
	Hewlett-Packard Graphics Language, starting with
	"PA4000,3000;" with "PW2;\012LA"
test3.hpgl:
	Hewlett-Packard Graphics Language, starting with
	"SP6;DI0,1;SR0.70,1.90;SC0,800,0,576;PA;PU20,0;LBDSA 60"
	with "SP3;SC-864"
test_msa_run_32r5eb.sh:
	Hewlett-Packard Graphics Language, starting with
	"PATH_TO_QEMU="../../../../../../mips-linux-user/qemu-m"

Obviously Hewlett-Packard Graphics Language files have no unique magic
pattern. Luckily, the display part is done by the subroutine hpgl inside
Magdir/printer, which starts like:
     0	name			hpgl
     >0	string	x		Hewlett-Packard Graphics Language
     !:mime	application/vnd.hp-HPGL
     !:ext	hpgl/hpg/hp/plt
     >0	string	x			\b, starting with "%-.54s"
Luckily, after the phrase "starting with", the output shows the first
line of the file (cut off after 54 characters by the "%-.54s" format).
This can be used for checking. With this output I can see what is going
wrong, and I must insert some additional tests before calling the
subroutine.
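
The truncation done by "%-.54s" can be reproduced outside of file. The
following Python sketch is only an illustration: Python's % operator
happens to accept the same precision syntax, and the helper name
starting_with is made up here, not part of file or libmagic.

```python
def starting_with(first_line: str) -> str:
    """Mimic the 'starting with' fragment of the hpgl subroutine output."""
    # Precision .54 keeps at most 54 characters; the "-" flag only
    # matters for padding, and no field width is given here.
    return 'starting with "%-.54s"' % first_line


# A short first line is shown unchanged.
print(starting_with("PA4000,3000;"))

# A longer line (60 placeholder characters) is cut off after 54.
print(len("%-.54s" % ("x" * 60)))
```

This is why the lines shown for test3.hpgl and test_msa_run_32r5eb.sh
above end abruptly: both displayed strings are exactly 54 characters long.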

I have often used only about 2 bytes (that is, 16 bits) for recognition.
You can blame me for not using 32 bits, but I do not have the time to
read and understand the whole specification and become a part-time HP
plotter expert. Now that some collisions occur, I have spent some time
refining the test lines.

Samples like test3.hpgl, which start with the Select Pen directive, are
currently handled by lines like:
     0	string	SP
     >0		use		hpgl
According to the documentation, the argument is the pen number n. This
argument is an integer, so it is in the range between -32767 and 32768.
If there is no pen number, or it is 0, the controller performs an end of
file command. Assuming that no negative pen numbers exist, such samples
are now caught by lines like:
     0	string	SP
     >2	regex	\^([0-9]{1,5})
     #>2	regex	\^([0-9]{1,5})	PEN_NUMBER=%s
     >>0		use		hpgl
So for real HP samples like test3.hpgl I get a pen number like 6. The
text file Linux-syscall-note inside the QEMU sources starts with a line
like:
	SPDX-Exception-Identifier: Linux-syscall-note
With the current test it is therefore identified as an HPGL file. With
the additional test line it is now skipped.
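
The effect of the additional pen number test can be sketched in Python.
This is only an approximation of what libmagic does, under the
assumption that the escaped caret in the magic regex anchors the match
at offset 2; the names PEN_NUMBER and looks_like_select_pen are invented
for this illustration.

```python
import re

# Python rendering of the magic regex \^([0-9]{1,5}); Pattern.match()
# already anchors at the given start position, so no caret is needed.
PEN_NUMBER = re.compile(r"[0-9]{1,5}")


def looks_like_select_pen(data: str) -> bool:
    """True if data starts with SP followed directly by 1 to 5 digits."""
    return data.startswith("SP") and PEN_NUMBER.match(data, 2) is not None


# Start of test3.hpgl: pen number 6 follows SP, so it is accepted.
print(looks_like_select_pen("SP6;DI0,1;SR0.70,1.90;"))

# First line of Linux-syscall-note: "D" follows SP, so it is rejected.
print(looks_like_select_pen("SPDX-Exception-Identifier: Linux-syscall-note"))
```

Note that in this sketch an SP directive without any pen number, such as
"SP;", is rejected as well, because at least one digit is required.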

Samples like miter.hp, which start with the Plot Absolute directive, are
currently handled by lines like:
     0	string	PA
     >0		use		hpgl

According to the documentation, the arguments are coordinates
x,y{,x,y{...}}. These arguments are integers, so they are in the range
between -32767 and 32768. So now I also check for a valid x coordinate.
Such samples are now caught by lines like:
     0	string	PA
     >2	regex	\^([-]{0,1}[0-9]{1,5})
     #>2	regex	\^([-]{0,1}[0-9]{1,5})	COORDINATE=%s
     >>0		use		hpgl
So for examples like miter.hp I get a coordinate value of 4000. Some
shell scripts like test_msa_run_32r5eb.sh inside the QEMU sources start
with a line like:
	PATH_TO_QEMU="../../../../../../mips-linux-user/qemu-m
With the current test these scripts are therefore identified as HPGL
files. With the additional test line these scripts are now skipped.
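
The coordinate test can be illustrated the same way in Python. Again
this is a sketch and not libmagic itself; COORDINATE and first_coordinate
are hypothetical names, and the regex is a Python rendering of the one
in the magic lines above.

```python
import re

# Python rendering of the magic regex \^([-]{0,1}[0-9]{1,5}):
# an optional minus sign followed by 1 to 5 digits.
COORDINATE = re.compile(r"-?[0-9]{1,5}")


def first_coordinate(data: str):
    """Return the x coordinate text after a leading PA, else None."""
    if not data.startswith("PA"):
        return None
    m = COORDINATE.match(data, 2)  # match must begin at offset 2
    return m.group(0) if m else None


# Start of miter.hp: the x coordinate 4000 is found.
print(first_coordinate("PA4000,3000;"))

# A shell variable assignment: "T" follows PA, so nothing is found.
print(first_coordinate('PATH_TO_QEMU="../'))
```

Returning the matched text corresponds to the commented-out
COORDINATE=%s debug line, which would print the value that was accepted.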

After applying the above modifications with the patch
file-5.44-printer-hpgl.diff, Hewlett-Packard Graphics samples are still
described as before, but the misidentifications vanish. The output now
looks like:
Linux-syscall-note:
	ASCII text
miter.hp:
	Hewlett-Packard Graphics Language, starting with
	"PA4000,3000;"
	with "PW2;\012LA"
test3.hpgl:
	Hewlett-Packard Graphics Language, starting with
	"SP6;DI0,1;SR0.70,1.90;SC0,800,0,576;PA;PU20,0;LBDSA 60"
	with "SP3;SC-864"
test_msa_run_32r5eb.sh:
	ASCII text

With best wishes,

Jörg Jenderek
-------------- next part --------------
--- file-5.44/magic/Magdir/printer.old	2022-12-26 19:00:48.000000000 +0100
+++ file-5.44/magic/Magdir/printer	2023-06-03 21:31:00.421682300 +0200
@@ -178,9 +178,13 @@
 0	string	DF;
 >0		use		hpgl
 # http://ftp.funet.fi/index/graphics/packages/hpgl2ps/hpgl2ps.tar.Z/hpgl2ps/test3.hpgl 
-# Select Pen n
+# Select Pen n; If no pen number or 0, the controller performs an end of file command; n in range between -32767 and 32768 like: 6
 0	string	SP
->0		use		hpgl
+# skip text Linux-syscall-note inside qemu sources starting with SPDX-Exception-Identifier: Linux-syscall-note
+# by checking for valid Pen number
+>2	regex	\^([0-9]{1,5})
+#>2	regex	\^([0-9]{1,5})	PEN_NUMBER=%s
+>>0		use		hpgl
 # charsize.hp pages.hp	set the scaling points (P1 and P2) to their default positions
 0	string	IP0
 >0		use		hpgl
@@ -200,8 +204,13 @@
 0	string	BP
 >0		use		hpgl
 # miter.hp
+# Plot Absolute x,y{,x,y{...}}; x and y in range between -32767 and 32768 like: PA4000,3000;
 0	string	PA
->0		use		hpgl
+# skip shell scripts test_msa_run_32r5eb.sh test_msa_run_32r5eb.sh with variable PATH_TO_QEMU
+# by checking for valid x coordinate
+>2	regex	\^([-]{0,1}[0-9]{1,5})
+#>2	regex	\^([-]{0,1}[0-9]{1,5})	COORDINATE=%s
+>>0		use		hpgl
 # pw.hpg	number of pens x
 0	string	NP
 >0		use		hpgl

