[File] [PATCH] Magdir/wordprocessors for Aldus/Adobe PageMaker

Jörg Jenderek joerg.jen.der.ek at gmx.net
Sat Nov 27 21:12:36 UTC 2021


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Some time ago I installed an older version of Aldus PageMaker.
Its documents and templates use file name extensions such as
PM4 PM5 PM6 P65 PMD PT3 PT6 T65 PMT.

When running the file command version 5.41 on such documents, all
of them are described as "data".

For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). It identifies the
"middle-aged" examples, such as BCOMDOC2.PM4 as "Aldus PageMaker
document (v4)" by pm4-pagemaker.trid.xml and Mytest5.PM5 as "Aldus
PageMaker document (v5)" by pm5-pagemaker.trid.xml (see the attached
trid-v-pagemaker.txt.gz). TrID also mentions the Wikipedia page and
the file name extensions used.
Luckily I also found a page about PageMaker on the Archive Team file
formats wiki. This information is recorded in comment lines inside
Magdir/wordprocessors like:
# URL:		http://fileformats.archiveteam.org/wiki/PageMaker
#		https://en.wikipedia.org/wiki/Adobe_PageMaker
# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/p
#		pm4-pagemaker.trid.xml
#		pm5-pagemaker.trid.xml

Unfortunately the documentation is neither official nor complete. So
I put the display part into a subroutine named PageMaker.

At the end of that subroutine the numeric version described in the
documentation (like 4, 5, 6 or 6.50) is shown by lines like:

 #>110	uleshort	x		\b, VERSION=%#x
 >110	uleshort	>0x03FF
 >>110	uleshort/256	x		\b, version %u
 >>110	uleshort%256	>0		\b.%u
 >110	uleshort	<0x0400		\b, maybe version 3
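
As a rough illustration of what the version lines above compute, here
is a minimal Python sketch. The function name describe_version is
hypothetical and not part of the patch; the argument is assumed to be
the 16-bit value already read from offset 110 in the file's byte order.

```python
def describe_version(word: int) -> str:
    """Mimic the magic lines: high byte is the major, low byte the minor."""
    if word > 0x03FF:
        major, minor = divmod(word, 256)
        text = f"version {major}"
        if minor > 0:
            # 0x32 is printed as decimal 50, hence "6.50" for 0x0632
            text += f".{minor}"
        return text
    # below 0x0400 the field does not carry a usable version number
    return "maybe version 3"


if __name__ == "__main__":
    for word in (0x0000, 0x0400, 0x0500, 0x0600, 0x0632):
        print(f"{word:#06x}: {describe_version(word)}")
```

The minor part is printed as a plain decimal number, which is why the
value 0x0632 is shown as 6.50 and not as 6.5.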


Unfortunately, in version 3 examples this numeric version field is
zero, and in version 7 examples the value is 6.50, the same as for
version 6.5.

Some sub-classifications depend on the version. The product started
as Aldus PageMaker, but later (since version 6) it was acquired by
Adobe. These different names are expressed by a subroutine starting
like:
 0	name		PageMaker
 >110	uleshort	<0x0600			Aldus
 >110	uleshort	>0x05FF			Adobe
 >110	uleshort	x			PageMaker
 !:mime		application/vnd.pagemaker
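
The vendor choice in these lines can be sketched in Python as follows.
This is only an illustration; product_name is a hypothetical helper,
and the argument is assumed to be the 16-bit version value read from
offset 110.

```python
def product_name(word: int) -> str:
    """Pick the vendor prefix the same way the two uleshort tests do."""
    # <0x0600 and >0x05FF are complementary, so exactly one name is chosen
    vendor = "Aldus" if word < 0x0600 else "Adobe"
    return f"{vendor} PageMaker"


if __name__ == "__main__":
    for word in (0x0000, 0x0500, 0x0600, 0x0632):
        print(f"{word:#06x}: {product_name(word)}")
```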

The file name extensions and the Apple creator and type codes also
depend on the version. The codes are listed on the page about
signatures of Macintosh files on the macdisk.com web site. For
version 3 this looks like:
 >110		uleshort/256	=0		document
 !:apple	ALB3ALD3
 !:ext		pm3/pt3
The PT3 extension is used for templates. The documentation does not
say whether a template can be distinguished from a plain document.

For major version 6 there are two variants, 6 and 6.5, so this looks
a little different:
 >110	uleshort	=0x0600			document
 !:apple	ALD6ALB6
 !:ext	pm6/pt6
 >110	uleshort	=0x0632			document
 !:apple	AD65AB65
 !:ext	p65/t65/pmd/pmt
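
To make the version-dependent part easier to follow, the mapping used
in the patch can be written as a small Python lookup. This is a sketch
only: the function name classify is hypothetical, the strings are
copied from the !:ext and !:apple lines of the patch, and the 8
character Apple strings are kept verbatim without splitting them.

```python
from typing import Optional, Tuple


def classify(word: int) -> Optional[Tuple[str, str]]:
    """Return (extensions, apple string) for the version word at offset 110."""
    major = word // 256
    if major == 0:            # older files such as version 3
        return ("pm3/pt3", "ALB3ALD3")
    if major == 4:
        return ("pm4/pt4", "ALD4ALB4")
    if major == 5:
        return ("pm5/pt5", "ALD5ALB5")
    if word == 0x0600:        # version 6
        return ("pm6/pt6", "ALD6ALB6")
    if word == 0x0632:        # version 6.5, also seen in version 7 files
        return ("p65/t65/pmd/pmt", "AD65AB65")
    return None               # no matching line in the patch


if __name__ == "__main__":
    for word in (0x0000, 0x0400, 0x0500, 0x0600, 0x0632):
        print(f"{word:#06x}: {classify(word)}")
```

Any other value falls through without extension or Apple information,
just as in the magic lines.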

According to the documentation, PageMaker documents have the hex
bytes "FF 99" at offset 6 in the little-endian variant, and according
to TrID the preceding bytes are zero for versions 4 and 5. That
matches my examples, but in version 3 samples only the 2 bytes
directly before the marker are zero. So the test starts with lines
like:
 4	ubelong		=0x0000FF99
 >0	use		PageMaker
Most of the samples I inspected are little-endian, but at least I was
able to extract one big-endian example, Templates-3-BE.pt3. There the
byte order is swapped. That example is handled with inverted byte
order by additional lines like:
 4	ubelong		=0x000099FF
 >0	use		\^PageMaker
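
The two signature tests and the byte order switch can be illustrated
with a short Python sketch. The name sniff_pagemaker is hypothetical,
the header used in the demo is synthetic and not taken from a real
file, and requiring at least 112 bytes is an assumption of the sketch
so that the version field at offset 110 can be read.

```python
import struct
from typing import Optional, Tuple


def sniff_pagemaker(header: bytes) -> Optional[Tuple[str, int]]:
    """Return (byte order, version word) or None if the marker is absent."""
    if len(header) < 112:
        return None
    marker = header[4:8]
    if marker == b"\x00\x00\xff\x99":      # 4 ubelong =0x0000FF99
        order, fmt = "little-endian", "<H"
    elif marker == b"\x00\x00\x99\xff":    # 4 ubelong =0x000099FF
        order, fmt = "big-endian", ">H"
    else:
        return None
    # like "use \^PageMaker": the same field is read with swapped byte order
    (word,) = struct.unpack_from(fmt, header, 110)
    return order, word


if __name__ == "__main__":
    demo = bytearray(112)                  # synthetic header for illustration
    demo[4:8] = b"\x00\x00\xff\x99"
    struct.pack_into("<H", demo, 110, 0x0632)
    print(sniff_pagemaker(bytes(demo)))
```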

After applying the modifications above with the patch
file-5.41-wordprocessors-pagemaker.diff, all PageMaker documents I
inspected are described. The output now looks like:

02TEMPLT-stream.T65:       Adobe PageMaker document,
			   little-endian, version 6.50
BCOMDOC2.PM4:              Aldus PageMaker document,
			   little-endian, version 4
MyPage6-stream.PM6:        Adobe PageMaker document,
			   little-endian, version 6
Mytest5.PM5:               Aldus PageMaker document,
			   little-endian, version 5
SPECSHT.PT3:               Aldus PageMaker document,
			   little-endian, maybe version 3
Templates-3-BE.pt3:        Aldus PageMaker document,
			   big-endian, maybe version 3
brochus-stream.pt6:        Adobe PageMaker document,
			   little-endian, version 6
pm-70-stream.pmd:          Adobe PageMaker document,
			   little-endian, version 6.50
pm-70-template-stream.pmt: Adobe PageMaker document,
			   little-endian, version 6.50
strategies-stream.p65:     Adobe PageMaker document,
			   little-endian, version 6.50

I hope my diff file can be applied in a future version of the file
utility.

Since version 6 such documents are embedded inside Compound
Documents, so those examples must be handled by modifying
Magdir/ole2compounddocs. I will try to do this in a future session.

With best wishes
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYaKfRAAKCRCv8rHJQhrU
1g/kAJ9lrDRP6vFm2zeaiaqiKqAtsHIjCQCgjP/DW7dEaCRGeacQLG7114+7KnI=
=g6es
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-pagemaker.txt.gz
Type: application/x-gzip
Size: 570 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20211127/7f1b39b9/attachment.bin>
-------------- next part --------------
--- file-5.41/magic/Magdir/wordprocessors.old	2021-08-30 09:10:26 +0000
+++ file-5.41/magic/Magdir/wordprocessors	2021-11-27 20:44:49 +0000
@@ -230,4 +230,66 @@
 2	string	MMXPRa			Motorola Quark Express Document (Korean)
 
+# From:		Joerg Jenderek
+# URL:		http://fileformats.archiveteam.org/wiki/PageMaker
+#		https://en.wikipedia.org/wiki/Adobe_PageMaker
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/p
+#		pm4-pagemaker.trid.xml
+#		pm5-pagemaker.trid.xml
+# Note:		since version 6 in 1995 called Adobe PageMaker and
+#		embedded in Compound Document handled by ./ole2compounddocs
+#		mainly tested little endian variant
+4	ubelong		=0x0000FF99
+>0	use		PageMaker
+# big endian variant
+4	ubelong		=0x000099FF
+>0	use		\^PageMaker
+#	display information of Aldus/Adobe PageMaker document/publication
+0	name		PageMaker
+>110	uleshort	<0x0600			Aldus
+>110	uleshort	>0x05FF			Adobe
+>110	uleshort	x			PageMaker
+# "MP" marker for newer version 4 and above according to TrID
+#>108	string		x			\b, MARKER "%.2s"
+# http://www.nationalarchives.gov.uk/pronom/fmt/876
+!:mime		application/vnd.pagemaker	
+#!:mime		application/x-pagemaker
+# different file name extensions are used depending on version
+# older version like 3
+>110	uleshort/256	=0			document
+# https://www.macdisk.com/macsigen.php
+!:apple	ALB3ALD3
+# PT3 for template and no example for PageMaker document/publication with PM3 extension
+!:ext	pm3/pt3
+>110	uleshort/256	=4			document
+!:apple	ALD4ALB4
+# no example for PT4 template
+!:ext	pm4/pt4
+>110	uleshort/256	=5			document
+!:apple	ALD5ALB5
+# no example for PT5 template
+!:ext	pm5/pt5
+>110	uleshort	=0x0600			document
+!:apple	ALD6ALB6
+# PT6 for template
+!:ext	pm6/pt6
+# HOWTO to distinguish version 7 from 6.5 ?
+>110	uleshort	=0x0632			document
+!:apple	AD65AB65
+# no example for T65 template
+!:ext	p65/t65/pmd/pmt
+# version 7 with PMT extension for template
+#!:ext	pmd/pmt
+#!:apple	????PUBF
+# endian marker FF 99 for little endian
+>6	ubyte	=0xFF			\b, little-endian
+>6	ubyte	=0x99			\b, big-endian
+# newer numeric version like: 4 5 6 6.50
+#>110	uleshort	x			\b, VERSION=%#x
+>110	uleshort	>0x03FF
+>>110	uleshort/256	x			\b, version %u
+>>110	uleshort%256	>0			\b.%u
+# older version like 3
+>110	uleshort	<0x0400			\b, maybe version 3
+
 # adobe indesign (document, whatever...) from querkan
 0	belong	0x0606edf5		Adobe InDesign
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.41-wordprocessors-pagemaker.diff.sig
Type: application/octet-stream
Size: 1265 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20211127/7f1b39b9/attachment.obj>

