[File] [PATCH] Magdir/fonts, pdp Adobe Multiple Master font *.MMM misidentified as PDP-11 executable

Jörg Jenderek joerg.jen.der.ek at gmx.net
Thu Dec 2 00:11:29 UTC 2021


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Some days ago I installed an older piece of Adobe software that came
with some fonts, so I checked some other font-related files. The
inspected examples have the file name extension MMM.

When running the file command version 5.41 on such font
examples I get output like:
_MI_____.MMM:                 PDP-11 executable not stripped -
			      version 99
_MRG____.MMM:                 PDP-11 executable not stripped -
			      version 99
fmt-521-signature-id-814.mmm: data
sample.mmm:                   Adobe Multiple Master font
zx______.mmm:                 Adobe Multiple Master font
zy______.mmm:                 Adobe Multiple Master font

Furthermore, with the --extension option only the 3-character
sequence ??? is shown. With the -i option only the generic MIME type
application/octet-stream is shown.

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). All real examples
are described as "Adobe Type Manager Multiple Master Metrics" by
mmm-atm.trid.xml (see appended mmm-trid-v.txt.gz).

For comparison I also ran the file format identification
utility DROID (see https://sourceforge.net/projects/droid/). This
identifies many MMM examples as "Adobe Multiple Master Metrics font
file" by PUID fmt/521 (see appended mmm-droid.csv.gz).

With the -v option TrID displays the 3-byte file name extension MMM
and a reference URL pointing to Adobe Type Manager on Wikipedia. That
page mentions an internal link to Adobe Multiple master fonts.

So this information is now expressed by additional comment lines
inside Magdir/fonts like:

# URL:		https://en.wikipedia.org/wiki/Multiple_master_fonts
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/m/mmm-atm.trid.xml
#		http://www.nationalarchives.gov.uk/pronom/fmt/521

In the current version two patterns describe Adobe MMM examples like:
0 string	\007\001\001\000Copyright\ (c)\ 199
0 string	\012\001\001\000Copyright\ (c)\ 199
None of my examples is described by the first pattern, and for the
undetected examples like _MI_____.MMM and _MRG____.MMM an equivalent
pattern would look like:
0 string	\007\001\002\000Copyright\ (c)\ 199

Unfortunately I found no file format specification. So I now put the
displaying part inside a sub routine mmm-font. This routine starts like:
 0	name		mmm-font
 >0x53	ubyte		x	Adobe Multiple Master font Metric
 !:mime	application/x-font-mmm
 !:ext	mmm

Instead of the generic MIME type application/octet-stream a user-defined
one is shown, and the MMM file name extension is now shown as well.
Furthermore I add the phrase "Metric" because the MMM examples contain
only the metrics (as described by DROID and TrID), whereas the real
fonts are stored inside PFB examples.

At the beginning 4 bytes are stored whose purpose is unknown to
me. For control reasons these can be shown by a debugging line like:
 >0	ubelong		x			\b, at 0 %#8.8x
All identifier tools assume that the byte at offset 3 is nil.

At offset 4 apparently a 0-terminated copyright message is stored
which looks like:
Copyright (c) 1992, 1993, 1994, 1999 Adobe Systems Incorporated.  All R
Copyright (c) 1992, 1994 Adobe Systems Incorporated.  All Rights Reserv
Copyright (c) 1993, 1994, 1999 Adobe Systems Incorporated.  All Rights
For debugging purposes this can be shown by a line like:
 >4	string		x			"%s"

The DROID tool checks only for the start keyword Copyright followed by
one space character and does not check the year part of the message
(like 199?). The TrID tool only checks for the space character (0x20)
after the copyright character c enclosed in parentheses, which can be
checked by a debugging line like:
 >17	byte		!0x20			\b, at 17 "%c"

So the copyright string may have a year part outside the 1990s or be
completely different. But for now I check for the message part and the
leading nil byte, similar to the previous file version and matching
what I found in my inspected examples. So this now becomes:
 3	string		\000Copyright\ (c)\ 199
 >0	use		mmm-font
If this is not always true, then the test line must be changed or
further test lines must be added before calling the sub routine.
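
For readers less familiar with magic syntax, the test line above can be
sketched in Python roughly as follows. This is only an illustration, not
part of the patch: the function name looks_like_mmm is made up here, and
the sample headers are synthetic byte strings built from the patterns
quoted in this mail, not real font files.

```python
# Rough equivalent of the magic line: 3 string \000Copyright\ (c)\ 199
MMM_SIGNATURE = b"\x00Copyright (c) 199"
MMM_SIGNATURE_OFFSET = 3


def looks_like_mmm(data: bytes) -> bool:
    """Return True if the signature is found at offset 3 (hypothetical helper)."""
    end = MMM_SIGNATURE_OFFSET + len(MMM_SIGNATURE)
    return data[MMM_SIGNATURE_OFFSET:end] == MMM_SIGNATURE


# Synthetic headers for the three leading byte triples mentioned above
# (\007\001\001, \012\001\001 and \007\001\002); the first three bytes
# are ignored by the test, so all of them match.
for prefix in (b"\x07\x01\x01", b"\x0a\x01\x01", b"\x07\x01\x02"):
    header = prefix + b"\x00Copyright (c) 1992, 1994 Adobe Systems Incorporated."
    print(prefix.hex(), looks_like_mmm(header))
```

A header whose year part does not start with 199 would be rejected by
this sketch, which mirrors the limitation described above.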

After the copyright message probably a foo factor string (like: 001.001
001.002 001.003) is stored. That can be displayed by a debugging line
like:
 >0x4c	string		x		\b, factor %s
That string is also 0-terminated. That can be checked by a line like:
 >0x53	byte		!0		\b, at 0x53 %x

Afterwards a third string part occurs which apparently is the
font name with optional indicator MM (for Multiple Master font, like
MyriadMM-It MyriadMM AdobeSansMM AdobeSerifMM). So I show that useful
information by additional lines like:
 >0x53	ubyte		=0
 >>0x54	string		x					"%s"
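
The offsets discussed so far can be put together in a small Python
sketch. It is not derived from any specification (none was found); it
only reads the same offsets as the magic lines above. The helper names
and the synthetic sample are made up for illustration. One assumption is
labeled in the code: the quoted copyright messages are all cut at 72
characters, which would fit a fixed-width field from offset 4 up to
0x4c, so the sketch limits the copyright read to that range.

```python
def c_string(data: bytes, start: int, limit=None) -> str:
    """Read bytes from start up to limit and cut at the first NUL byte."""
    chunk = data[start:limit]
    return chunk.split(b"\x00", 1)[0].decode("latin-1")


def parse_mmm_header(data: bytes) -> dict:
    """Collect the fields that the mmm-font sub routine looks at (sketch)."""
    fields = {
        "leading": data[0:4].hex(),            # 4 bytes of unknown purpose
        # ASSUMPTION: copyright field ends before offset 0x4c
        "copyright": c_string(data, 4, 0x4C),  # >4    string
        "factor": c_string(data, 0x4C, 0x54),  # >0x4c string
        "name": None,
    }
    # >0x53 ubyte =0 : only show the name if the factor is NUL-terminated
    if len(data) > 0x54 and data[0x53] == 0:
        fields["name"] = c_string(data, 0x54)  # >>0x54 string
    return fields


def make_sample(name: bytes) -> bytes:
    """Build a synthetic header; NUL padding between fields is an assumption."""
    buf = bytearray(0x54)
    buf[0:4] = b"\x07\x01\x02\x00"
    text = b"Copyright (c) 1992, 1994 Adobe Systems Incorporated."
    buf[4:4 + len(text)] = text
    buf[0x4C:0x53] = b"001.002"
    return bytes(buf) + name + b"\x00"


print(parse_mmm_header(make_sample(b"MyriadMM-It")))
```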

Finally I looked at what the other tools are also checking, since
these facts can maybe be used as additional tests. Some hundred bytes
later the DROID tool checks for the value 76000000E803E803h, which is
true for examples zx______.mmm and zy______.mmm, but for examples
_MI_____.MMM and _MRG____.MMM the value is 69000000E803E803h. This can
be checked by a line like:
 >0xb8	ubequad		!0x76000000e803e803h	\b, at 0xB8 %#llx

The TrID tool looks for keywords like Weight and Width. These checks
transferred to magic lines look like:
 >0x55	search/0x10B5	Weight\0\0		\b, FOUND Weight
 >0x55	search/0x1131	Width\0\0		\b, FOUND Width
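
As a rough illustration of these two extra checks, here is a Python
sketch. The function names are made up, the test buffer is synthetic,
and the keyword search only approximates what search/N does in magic
(the exact window semantics are not reproduced here).

```python
import struct

# Values at offset 0xb8 reported above (DROID expects only the first one)
KNOWN_B8_VALUES = {0x76000000E803E803, 0x69000000E803E803}


def value_at_b8(data: bytes):
    """Read the big-endian 64-bit value at offset 0xb8 (like ubequad)."""
    if len(data) < 0xB8 + 8:
        return None
    return struct.unpack_from(">Q", data, 0xB8)[0]


def has_keyword(data: bytes, keyword: bytes, span: int, start: int = 0x55) -> bool:
    """Approximate '>0x55 search/span keyword' with a bounded find."""
    return data.find(keyword, start, start + span) != -1


# Synthetic buffer: the value seen in _MI_____.MMM and a Weight keyword
# placed at an arbitrary position (0x100 is NOT taken from a real file).
buf = bytearray(0x200)
buf[0xB8:0xC0] = bytes.fromhex("69000000e803e803")
buf[0x100:0x108] = b"Weight\x00\x00"
sample = bytes(buf)

print(hex(value_at_b8(sample)), value_at_b8(sample) in KNOWN_B8_VALUES)
print(has_keyword(sample, b"Weight\x00\x00", 0x10B5))
print(has_keyword(sample, b"Width\x00\x00", 0x1131))
```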

The MMM examples like _MI_____.MMM and _MRG____.MMM starting with
\007\001\002\000Copyright are misidentified as PDP-11 a.out via
Magdir/pdp by lines like:

 0	leshort		0407	PDP-11 executable
 >8	leshort		>0	not stripped
 >15	byte		>0	- version %d
because the 2 leading bytes are the same. I have no deeper
knowledge of the PDP executable file format, but the place where an
executable stores numeric integer values is occupied by the
copyright message in MMM examples. So the c character (0x63=99) of the
copyright sign enclosed in parentheses is misinterpreted as
version 99. By an additional test line for the copyright message
string the misidentified MMM examples are skipped. So this now looks
like:

 0	leshort		0407
 >4	string		!Copyright\040	PDP-11 executable
 >>8	leshort		>0		not stripped
 >>15	byte		>0		- version %d
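
The collision can be reproduced on a synthetic header with a few lines
of Python. This is only a sketch of the old and new test logic; the
function names are made up, and the header is built from the byte
pattern quoted above rather than read from a real file.

```python
import struct

# Synthetic first bytes of an MMM file like _MI_____.MMM
header = b"\x07\x01\x02\x00Copyright (c) 199"


def old_pdp_fields(data: bytes):
    """Values the old Magdir/pdp lines look at."""
    magic = struct.unpack_from("<H", data, 0)[0]  # 0   leshort 0407
    word8 = struct.unpack_from("<H", data, 8)[0]  # >8  leshort >0
    version = data[15]                            # >15 byte    >0
    return magic, word8, version


def new_pdp_test(data: bytes) -> bool:
    """Patched test: magic 0407 and no 'Copyright ' string at offset 4."""
    magic = struct.unpack_from("<H", data, 0)[0]
    return magic == 0o407 and data[4:14] != b"Copyright "


magic, word8, version = old_pdp_fields(header)
# Bytes 07 01 read as little-endian 16 bit give octal 0407; offset 15
# falls on the 'c' of "(c)", which is 99 in decimal.
print(oct(magic), word8 > 0, version, chr(version))
print(new_pdp_test(header))
```

With the patched test the synthetic MMM header is no longer accepted,
while a buffer that starts with 07 01 and has no copyright string at
offset 4 still is.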

After applying the above-mentioned modifications by the patches
file-5.41-fonts-mmm.diff and file-5.41-pdp-mmm.diff
the misidentification vanishes and the identification gets more details
(font name). This now looks like:

_MI_____.MMM:                 Adobe Multiple Master font Metric
			      "MyriadMM-It"
_MRG____.MMM:                 Adobe Multiple Master font Metric
			      "MyriadMM"
fmt-521-signature-id-814.mmm: data
sample.mmm:                   Adobe Multiple Master font Metric
			      "AdobeSerifMM"
zx______.mmm:                 Adobe Multiple Master font Metric
			      "AdobeSansMM"
zy______.mmm:                 Adobe Multiple Master font Metric
			      "AdobeSerifMM"

I hope my 2 diff files can be applied in a future version of the file
utility.

With best wishes
Jörg Jenderek
- --
Jörg Jenderek

-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYagPMQAKCRCv8rHJQhrU
1l2yAJ9BMEJS+NMIXtiAXuC085IwhDeRJQCgl6sVgvi7gESXw2N70kREGBmNsRk=
=Olxn
-----END PGP SIGNATURE-----
-------------- next part --------------
--- file-5.41/magic/Magdir/pdp.old	2020-05-31 10:34:40 +0000
+++ file-5.41/magic/Magdir/pdp	2021-11-28 20:47:00 +0000
@@ -9,7 +9,13 @@
 # PDP-11 a.out
 #
-0	leshort		0407		PDP-11 executable
->8	leshort		>0		not stripped
->15	byte		>0		- version %d
+# updated by Joerg Jenderek at Nov 2021
+# GRR: line below too general as it catches some Adobe Multiple Master font handled by ./fonts
+0	leshort		0407
+# c character (0x63=99) of font copy right message embraced by parentheses
+#>15	string		x		\b, at 15 %.1s
+# skip font _MI_____.MMM _MRG____.MMM with 0701h and copy right message near the beginning
+>4	string		!Copyright\040	PDP-11 executable
+>>8	leshort		>0		not stripped
+>>15	byte		>0		- version %d
 
 # updated by Joerg Jenderek at Mar 2013
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.41-pdp-mmm.diff.sig
Type: application/octet-stream
Size: 591 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20211202/a3f3b2d4/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mmm-droid.csv.gz
Type: application/x-gzip
Size: 388 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20211202/a3f3b2d4/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mmm-trid-v.txt.gz
Type: application/x-gzip
Size: 430 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20211202/a3f3b2d4/attachment-0001.bin>
-------------- next part --------------
--- file-5.41/magic/Magdir/fonts.old	2021-05-12 16:30:24 +0000
+++ file-5.41/magic/Magdir/fonts	2021-12-01 23:54:10 +0000
@@ -316,6 +316,44 @@
 >>>>>>>&(&-14.S-17)	lestring16	x	\b, %-11.96s
 
-0	string		\007\001\001\000Copyright\ (c)\ 199	Adobe Multiple Master font
-0	string		\012\001\001\000Copyright\ (c)\ 199	Adobe Multiple Master font
+# Update:	Joerg Jenderek
+# URL:		https://en.wikipedia.org/wiki/Multiple_master_fonts
+# Reference:	http://mark0.net/download/triddefs_xml.7z
+#		defs/m/mmm-atm.trid.xml
+#		http://www.nationalarchives.gov.uk/pronom/fmt/521
+# Note:		still used in Adobe Acrobat Reader
+#0	string		\007\001\001\000Copyright\ (c)\ 199	Adobe Multiple Master font
+#0	string		\012\001\001\000Copyright\ (c)\ 199	Adobe Multiple Master font
+#0	string		\007\001\002\000Copyright\ (c)\ 199	Adobe Multiple Master font
+3	string		\000Copyright\ (c)\ 199
+>0	use		mmm-font
+#	display Adobe Multiple Master font Metric information
+0	name		mmm-font
+>0x53	ubyte		x					Adobe Multiple Master font Metric
+#!:mime	application/octet-stream
+!:mime	application/x-font-mmm
+# http://file.fyicenter.com/c/sample.mmm
+!:ext	mmm
+# unknown like: 07010200 0A010100 07010100 (no example)
+#>0	ubelong		x					\b, at 0 %#8.8x
+# probably copyright message like:
+# Copyright (c) 1992, 1993, 1994, 1999 Adobe Systems Incorporated.  All R
+# Copyright (c) 1992, 1994 Adobe Systems Incorporated.  All Rights Reserv
+# Copyright (c) 1993, 1994, 1999 Adobe Systems Incorporated.  All Rights 
+#>4	string		x					"%s"
+# According to TrID space character (0x20) after font copyright character embraced by parentheses
+#>17	byte		!0x20					\b, at 17 "%c"
+# after copy right message probably foo factor like: 001.001 001.002 001.003
+#>0x4c	string		x					\b, factor %s
+# nul terminating character of foo factor
+#>0x53	byte		!0		\b, at 0x53 %x
+>0x53	ubyte		=0
+# 3rd string part probably font name with optional indicator MM like:
+# AdobeSansMM AdobeSerifMM MyriadMM MyriadMM-It
+>>0x54	string		x					"%s"
+# According to DROID 76000000E803E803h but also 69000000E803E803h (_MI_____.MMM _MRG____.MMM)
+#>0xb8	ubequad		!0x76000000e803e803h			\b, at 0xB8 %#llx
+# According to TrID keywords like: Weight Width
+#>0x55	search/0x10B5	Weight\0\0				\b, FOUND Weight 
+#>0x55	search/0x1131	Width\0\0				\b, FOUND Width 
 
 # TrueType/OpenType font collections (.ttc)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.41-fonts-mmm.diff.sig
Type: application/octet-stream
Size: 1205 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20211202/a3f3b2d4/attachment-0001.obj>

