[File] [PATCH] Magdir/wordprocessors, ssl, ssh for "oldest" Microsoft Publisher, different public keys

Jörg Jenderek (GMX) joerg.jen.der.ek at gmx.net
Thu Jun 6 07:32:00 UTC 2024


Hello,

a few days ago I had to handle an old CD-ROM. It contains some older
Microsoft Publisher files with the file name suffix pub. These are not
recognized correctly, so I sent the patch file-ole2compounddocs-pub.diff
some weeks ago. Now I have found "oldest" Microsoft Publisher samples
like MSPublisherv1.PUB. Unfortunately the PUB file name suffix is also
used for public keys by various software, so I also looked for such
samples.


When running the file command version 5.45 on such "oldest" Microsoft
Publisher files, various public keys and related files, I get output like:

MSPublisherv1.PUB:        data
TMP00044.PUB:             data
Thanksgiving1.DTP:        data
format_gen.key:           OpenSSH private key (no password)
format_gen.pub:           ASCII text
id_dsa.pub:               OpenSSH DSA public key
id_ecdsa384.pub:          OpenSSH ECDSA public key
id_ecdsa521.pub:          OpenSSH ECDSA public key
id_rsa.pub:               OpenSSH RSA public key
localhost.priv:           PEM RSA private key
localhost.pub:            ASCII text
ssh_host_ecdsa_key.pub:   OpenSSH ECDSA public key
ssh_host_ed25519_key.pub: OpenSSH ED25519 public key

With option --extension only the 3 byte sequence ??? is shown for most
samples. With option -i only the generic application/octet-stream or
text/plain is shown for most examples. This looks like:

MSPublisherv1.PUB:        application/octet-stream; charset=binary
TMP00044.PUB:             application/octet-stream; charset=binary
Thanksgiving1.DTP:        application/octet-stream; charset=binary
format_gen.key:           text/plain; charset=us-ascii
format_gen.pub:           text/plain; charset=us-ascii
id_dsa.pub:               text/plain; charset=us-ascii
id_ecdsa384.pub:          text/plain; charset=us-ascii
id_ecdsa521.pub:          text/plain; charset=us-ascii
id_rsa.pub:               text/plain; charset=us-ascii
localhost.priv:           text/plain; charset=us-ascii
localhost.pub:            text/plain; charset=us-ascii
ssh_host_ecdsa_key.pub:   text/plain; charset=us-ascii
ssh_host_ed25519_key.pub: text/plain; charset=us-ascii
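
The -i listing above has a regular "name: type; charset=..." shape, so
the generic results can be picked out mechanically. The following is a
minimal Python sketch under that assumption. The helper names
parse_mime_line and is_generic are made up for illustration and are not
part of the file utility, and a file name containing a colon would not
be split correctly.

```python
# MIME types the post treats as too generic to be useful.
GENERIC_TYPES = {"application/octet-stream", "text/plain"}


def parse_mime_line(line):
    """Split one 'name: type; charset=...' line into (name, mime, charset)."""
    name, _, rest = line.partition(":")
    mime, _, charset = rest.strip().partition(";")
    charset = charset.strip()
    if charset.startswith("charset="):
        charset = charset[len("charset="):]
    return name.strip(), mime.strip(), charset


def is_generic(line):
    """Return True if the line reports one of the generic MIME types."""
    return parse_mime_line(line)[1] in GENERIC_TYPES
```

Applied to the listing above, is_generic would be true for every line,
which is the problem the patches try to address.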

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html).

It identifies some "data" samples (like Thanksgiving1.DTP and
TMP00044.PUB) as "COSMI document (generic)" with mime type
application/octet-stream and without a file name suffix by
cosmi.trid.xml. It identifies other "data" samples like
MSPublisherv1.PUB as "Microsoft Publisher document (v1)" with mime type
application/vnd.ms-publisher and file name suffix PUB. It identifies
some SSH public keys with text/plain mime type and PUB file name suffix.
Some samples (like id_dsa.pub) are described as "SSH-DSS Public key" by
pub-ssh-dss.trid.xml and others (like id_rsa.pub) are described as
"SSH-RSA Public key" by pub-ssh-rsa.trid.xml. Some ssh keys (like
ssh_host_ed25519_key.pub and id_ecdsa384.pub) are not recognized. The
sample localhost.priv is described as "ASCII armored RSA Private Key"
with mime type text/plain and KEY name suffix (see appended
trid-v-pub.txt.gz).

For comparison I also ran the file format identification
utility DROID (see https://sourceforge.net/projects/droid/). It
identifies MSPublisherv1.PUB correctly as "Microsoft Publisher" with
version 1 and mime type application/x-mspublisher by PUID fmt/1511.
The other PUB samples are wrongly described as "Microsoft Publisher"
too, because recognition is based on the file name suffix pub (see
appended droid-pub-key.csv.gz).

On Linux none of these examples is described by the shared MIME-info
database.

Luckily, with the information given by the other tools, I also found a
page about Microsoft Publisher on the File Formats Archive Team web
site. Links to downloadable samples are also listed there. This
information is expressed by comment lines after the Microsoft Works
entry inside Magdir/wordprocessors like:
# URL:	https://en.wikipedia.org/wiki/Microsoft_Publisher
# Ref.:	http://fileformats.archiveteam.org/wiki/Microsoft_Publisher
Newer Publisher files (since version 2) are OLE 2 Compound based and
described by Magdir/ole2compounddocs, for which I sent a patch some
weeks ago. According to the reference the oldest ones are based on
another file format. But at least the first four bytes are constant and
hopefully unique. So the additional magic lines look like:
0	ubelong		=0xE7AC2C00	Microsoft Publisher (1.0)
!:mime	application/vnd.ms-publisher
!:ext	pub
So I chose the mime type used for the other variant. This is also used
by the Linux mime database, but that type is not registered at IANA. So
maybe a user-defined mime type application/x-mspublisher is better suited.
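
As a rough illustration of what the "0 ubelong =0xE7AC2C00" line tests,
here is a minimal Python sketch. It only mirrors the comparison of the
first four bytes as a big-endian unsigned 32-bit value; the function
name looks_like_publisher_v1 is hypothetical and nothing here is taken
from the file sources.

```python
import struct

# Value copied from the magic line "0 ubelong =0xE7AC2C00".
PUBLISHER_V1_MAGIC = 0xE7AC2C00


def looks_like_publisher_v1(data: bytes) -> bool:
    """Return True if the first four bytes match the Publisher 1.0 magic."""
    if len(data) < 4:
        return False
    # ">I" reads an unsigned 32-bit big-endian integer, like "ubelong".
    (value,) = struct.unpack_from(">I", data, 0)
    return value == PUBLISHER_V1_MAGIC
```

Because the value is read big-endian, the bytes on disk are expected in
the order E7 AC 2C 00.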

Afterwards I added information as comments for the COSMI document. This
looks like:
# URL:	http://fileformats.archiveteam.org/wiki/COSMI_MultiMedia
#	https://en.wikipedia.org/wiki/Cosmi_Corporation
# Ref.:	http://mark0.net/download/triddefs_xml.7z/defs/c/cosmi.trid.xml
The recognition happens by lines like:
0	string/b	LCP		COSMI document
!:mime	application/x-cosmi
!:ext		dtp/pub/bro/bcd/crd
Instead of the generic application/octet-stream mime type I show a user
defined one. Besides the PUB suffix, other name suffixes are also used
(BCD~Business Card Maker BRO~Brochure Magic CRD~Greeting Card Magic
DTP~Print Perfect PUB~Desktop Publisher), but I do not know whether it
is possible to do sub classification for the different name suffixes.
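
Since the magic line above only inspects the leading "LCP" bytes, one
way to get such a sub classification is to do it outside the magic
file, in a wrapper that also sees the file name. The following Python
sketch assumes exactly that. The function describe_cosmi and the output
wording are hypothetical; only the suffix-to-product pairs come from
the list above.

```python
from pathlib import PurePath

# Suffix-to-product pairs copied from the list above.
COSMI_PRODUCTS = {
    "bcd": "Business Card Maker",
    "bro": "Brochure Magic",
    "crd": "Greeting Card Magic",
    "dtp": "Print Perfect",
    "pub": "Desktop Publisher",
}


def describe_cosmi(name: str, data: bytes):
    """Describe a COSMI document, refined by file name suffix if known."""
    # Same content test as the magic line: the string "LCP" at offset 0.
    if not data.startswith(b"LCP"):
        return None
    suffix = PurePath(name).suffix.lstrip(".").lower()
    product = COSMI_PRODUCTS.get(suffix)
    if product is None:
        return "COSMI document"
    return f"COSMI document ({product})"
```

This does not answer whether the content itself differs between the
products; it only shows the name-based fallback.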

Samples like localhost.priv are described inside Magdir/ssl by a line like:
0	string	-----BEGIN\040RSA\040PRIVATE	PEM RSA private key
Luckily, with the information given by the other tools, I also found a
page about SSL on the GitHub web site. This information is expressed by
comment lines inside Magdir/ssl like:
# Ref.:	https://github.com/openssl/openssl/blob/master/include/openssl/
#	pem.h
#	http://mark0.net/download/triddefs_xml.7z/
#	defs/k/key-rsa-pvt.trid.xml
So the above magic line now becomes:
0 string -----BEGIN\040RSA\040PRIVATE		PEM RSA private key
!:mime		text/x-ssl-private-key
!:ext		key/priv
Instead of the generic text/plain I show a user defined one. According
to TrID the file name suffix KEY is used, but in many samples I found
PRIV (apparently the abbreviation for PRIVATE). So I show both name
extensions. The counterpart is the public key. Here the phrase PUBLIC is
used instead of PRIVATE, and PUB is used as the file name suffix. So the
previously unrecognized key is now handled by additional lines, which
look like:
0 string -----BEGIN\040RSA\040PUBLIC\040KEY----- PEM RSA public key
!:mime		text/x-ssl-public-key
!:ext		pub
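
To make the two Magdir/ssl rules above concrete, here is a small Python
sketch that applies the same prefix tests (\040 in the magic lines is
an octal escape for a space). The table PEM_RULES and the function
classify_pem are illustrative names only; the descriptions, mime types
and extensions are the ones proposed above.

```python
# (prefix at offset 0, description, mime type, extensions)
PEM_RULES = [
    (b"-----BEGIN RSA PRIVATE", "PEM RSA private key",
     "text/x-ssl-private-key", "key/priv"),
    (b"-----BEGIN RSA PUBLIC KEY-----", "PEM RSA public key",
     "text/x-ssl-public-key", "pub"),
]


def classify_pem(data: bytes):
    """Return (description, mime, ext) for the first matching rule, else None."""
    for prefix, description, mime, ext in PEM_RULES:
        if data.startswith(prefix):
            return description, mime, ext
    return None
```

Note that the private key rule matches on a shorter prefix than the
public key rule, just as in the magic lines.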

Samples like id_rsa.pub are described inside Magdir/ssh by a line like:
0	string	ssh-rsa\040		OpenSSH RSA public key

Luckily, with the information given by the other tools, I also found a
page about SSH on the Wikipedia web site. This information is expressed
by comment lines inside Magdir/ssh like:
# URL:		https://en.wikipedia.org/wiki/Secure_Shell_Protocol
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/p/pub-ssh-rsa.trid.xml
So the above magic line now becomes:
0	string	ssh-rsa\040		OpenSSH RSA public key
!:mime		text/x-ssh-public-key
!:ext		pub
Instead of the generic text/plain I show a user defined one. According
to TrID the file name suffix PUB is used (apparently the abbreviation
for PUBLIC).

I then followed the same procedure for the other listed public ssh
keys. Samples like format_gen.key (found in the qemu version 9.0.0
source) are described inside Magdir/ssh by a line like:
0	string	-----BEGIN\040PRIVATE\040KEY-----	\
			OpenSSH private key (no password)
So the above magic line now becomes:
0	string	-----BEGIN\040PRIVATE\040KEY-----	\
			OpenSSH private key (no password)
!:mime		text/x-ssh-private-key
!:ext		key
Instead of the generic text/plain I show a user defined one. In the
sample the file name suffix KEY is used, so I show this name extension.
The counterpart is the public key. Here the phrase PUBLIC is used
instead of PRIVATE (see format_gen.pub), and PUB is used as the file
name suffix. So the previously unrecognized key is now handled by
additional lines, which look like:
0	string	-----BEGIN\040PUBLIC\040KEY-----	\
			OpenSSH public key
!:mime		text/x-ssh-public-key
!:ext		pub
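
The Magdir/ssh rules can be read as an ordered list of prefixes tested
at offset 0. The Python sketch below restates them that way, using the
strings and descriptions from the magic lines in this mail and the
attached ssh patch. The names SSH_RULES and classify_ssh are
hypothetical, and this is not how file itself evaluates magic entries.

```python
PRIVATE = ("text/x-ssh-private-key", "key")
PUBLIC = ("text/x-ssh-public-key", "pub")

# (prefix at offset 0, description, (mime type, extension))
SSH_RULES = [
    (b"-----BEGIN PRIVATE KEY-----", "OpenSSH private key (no password)", PRIVATE),
    (b"-----BEGIN PUBLIC KEY-----", "OpenSSH public key", PUBLIC),
    (b"ssh-dss ", "OpenSSH DSA public key", PUBLIC),
    (b"ssh-rsa ", "OpenSSH RSA public key", PUBLIC),
    (b"ecdsa-sha2-nistp256", "OpenSSH ECDSA public key", PUBLIC),
    (b"ecdsa-sha2-nistp384", "OpenSSH ECDSA public key", PUBLIC),
    (b"ecdsa-sha2-nistp521", "OpenSSH ECDSA public key", PUBLIC),
    (b"ssh-ed25519", "OpenSSH ED25519 public key", PUBLIC),
]


def classify_ssh(data: bytes):
    """Return (description, mime, ext) for the first matching prefix, else None."""
    for prefix, description, (mime, ext) in SSH_RULES:
        if data.startswith(prefix):
            return description, mime, ext
    return None
```

The ssh-dss and ssh-rsa prefixes include the trailing space from the
magic lines (\040), so a bare "ssh-rsa" without it does not match.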

After applying the above mentioned modifications with the patches
file-5.45-wordprocessors-pub.diff, file-5.45-ssl-pub.diff and
file-5.45-ssh-pub.diff, most of my inspected examples with PUB name
suffix are now described. This now looks like:
MSPublisherv1.PUB:        Microsoft Publisher (1.0)
TMP00044.PUB:             COSMI document
Thanksgiving1.DTP:        COSMI document
format_gen.key:           OpenSSH private key (no password)
format_gen.pub:           OpenSSH public key
id_dsa.pub:               OpenSSH DSA public key
id_ecdsa384.pub:          OpenSSH ECDSA public key
id_ecdsa521.pub:          OpenSSH ECDSA public key
id_rsa.pub:               OpenSSH RSA public key
localhost.priv:           PEM RSA private key
localhost.pub:            PEM RSA public key
ssh_host_ecdsa_key.pub:   OpenSSH ECDSA public key
ssh_host_ed25519_key.pub: OpenSSH ED25519 public key

I hope my diff files can be applied in a future version of the file
utility. Unfortunately the pub suffix is also used for PGP/GPG keys.
Here I also found some exceptions which are not recognized. So I need
some time to inspect what exactly is going wrong there. I will try to
handle this in a future session.

With best wishes,
Jörg Jenderek
-------------- next part --------------
-- 
File mailing list
File at astron.com
https://mailman.astron.com/mailman/listinfo/file

-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-ssl-pub.diff.sig
Type: application/octet-stream
Size: 705 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20240606/3819ab70/attachment.obj>
-------------- next part --------------
--- file-5.45/magic/Magdir/ssl.old	2021-02-23 01:49:24.000000000 +0100
+++ file-5.45/magic/Magdir/ssl	2024-06-05 17:49:47.403352900 +0200
@@ -8,10 +8,23 @@
 
 0	string	-----BEGIN\040CERTIFICATE-----	PEM certificate
 0	string	-----BEGIN\040CERTIFICATE\040REQ	PEM certificate request
+# Update:	Joerg Jenderek
+# Reference:	https://github.com/openssl/openssl/blob/master/include/openssl/pem.h
+#		http://mark0.net/download/triddefs_xml.7z/defs/k/key-rsa-pvt.trid.xml
+# Note:		called "ASCII armored RSA Private Key" by TrID
 0	string	-----BEGIN\040RSA\040PRIVATE	PEM RSA private key
+#!:mime		text/plain
+!:mime		text/x-ssl-private-key
+!:ext		key/priv
 0	string	-----BEGIN\040DSA\040PRIVATE	PEM DSA private key
 0	string	-----BEGIN\040EC\040PRIVATE	PEM EC private key
 0	string	-----BEGIN\040ECDSA\040PRIVATE	PEM ECDSA private key
+# From:		Joerg Jenderek
+# Reference:	https://github.com/openssl/openssl/blob/master/include/openssl/pem.h
+0	string	-----BEGIN\040RSA\040PUBLIC\040KEY-----	PEM RSA public key
+#!:mime		text/plain
+!:mime		text/x-ssl-public-key
+!:ext		pub
 
 # From Luc Gommans
 # OpenSSL enc file (recognized by a magic string preceding the password's salt)
-------------- next part --------------
--- file-5.45/magic/Magdir/wordprocessors.old	2023-02-09 18:43:53.000000000 +0100
+++ file-5.45/magic/Magdir/wordprocessors	2024-06-04 15:42:13.809049100 +0200
@@ -21,18 +21,40 @@
 >112	ubeshort	=0x0100		Microsoft Works 1-3 (DOS) or 2 (Windows) document
 # title like THE GREAT KHAN GAME
 >>0x100	string		x		%s
 !:mime	application/vnd-ms-works
 #!:mime	application/x-msworks
 # https://www.macdisk.com/macsigen.php
 !:apple	????AWWP
 !:ext	wps
 
+# From:		Joerg Jenderek
+# URL:		https://en.wikipedia.org/wiki/Microsoft_Publisher
+# Reference:	http://fileformats.archiveteam.org/wiki/Microsoft_Publisher
+# Note:		older non OLE 2 (./ole2compounddocs) Compound based version
+0	ubelong		=0xE7AC2C00	Microsoft Publisher (1.0)
+#!:mime	application/x-mspublisher
+# Not registered at IANA but
+# https://web.archive.org/web/20200930085807/https://reposcope.com/mimetype/application/vnd.ms-publisher
+!:mime	application/vnd.ms-publisher
+!:ext	pub
+
+# From:		Joerg Jenderek
+# URL:		http://fileformats.archiveteam.org/wiki/COSMI_MultiMedia
+#		https://en.wikipedia.org/wiki/Cosmi_Corporation
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/c/cosmi.trid.xml
+# Note:		called "COSMI document (generic)" by TrID
+0	string/b	LCP		COSMI document
+#!:mime		application/octet-stream
+!:mime	application/x-cosmi
+# BCD~Business Card Maker BRO~Brochure Magic CRD~Greeting Card Magic DTP~Print Perfect PUB~Desktop Publisher
+!:ext		bcd/bro/crd/dtp/pub
+
 # Corel/WordPerfect
 # URL:		https://en.wikipedia.org/wiki/WordPerfect
 # Reference:	https://github.com/OneWingedShark/WordPerfect/blob/master/doc/SDK_Help/FileFormats/WPFF_DocumentStructure.htm
 #		http://mark0.net/download/triddefs_xml.7z/defs/w/wp-generic.trid.xml
 0	string	\xffWPC
 # WordPerfect
 >8	byte	1
 # Reference:	http://mark0.net/download/triddefs_xml.7z/defs/w/wpm-macro.trid.xml
 # Note:		there exist other macro variants
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-wordprocessors-pub.diff.sig
Type: application/octet-stream
Size: 1098 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20240606/3819ab70/attachment-0001.obj>
-------------- next part --------------
--- file-5.45/magic/Magdir/ssh.old	2023-06-19 15:43:13.000000000 +0200
+++ file-5.45/magic/Magdir/ssh	2024-06-05 22:37:33.259603700 +0200
@@ -2,2 +2,4 @@
 # From:	Nicolas Collignon <tsointsoin at gmail.com>
+# Update:	Joerg Jenderek
+# URL:		https://en.wikipedia.org/wiki/Secure_Shell_Protocol
 
@@ -8,10 +10,41 @@
 0	string	-----BEGIN\040PRIVATE\040KEY-----	OpenSSH private key (no password)
+#!:mime		text/plain
+!:mime		text/x-ssh-private-key
+!:ext		key
 0	string	-----BEGIN\040ENCRYPTED\040PRIVATE\040KEY-----	OpenSSH private key (with password)
+# https://download.qemu.org/qemu-9.0.0.tar.xz
+# qemu-9.0.0/roms/skiboot/libstb/crypto/mbedtls/tests/data_files/format_gen.pub
+0	string	-----BEGIN\040PUBLIC\040KEY-----		OpenSSH public key
+#!:mime		text/plain
+!:mime		text/x-ssh-public-key
+!:ext		pub
 
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/p/pub-ssh-dss.trid.xml
+# Note:		called "SSH-DSS Public key" by TrID
 0	string	ssh-dss\040		OpenSSH DSA public key
+#!:mime		text/plain
+!:mime		text/x-ssh-public-key
+!:ext		pub
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/p/pub-ssh-rsa.trid.xml
+# Note:		called "SSH-RSA Public key" by TrID
 0	string	ssh-rsa\040		OpenSSH RSA public key
+#!:mime		text/plain
+!:mime		text/x-ssh-public-key
+!:ext		pub
 0	string	ecdsa-sha2-nistp256	OpenSSH ECDSA public key
+#!:mime		text/plain
+!:mime		text/x-ssh-public-key
+!:ext		pub
 0	string	ecdsa-sha2-nistp384	OpenSSH ECDSA public key
+#!:mime		text/plain
+!:mime		text/x-ssh-public-key
+!:ext		pub
 0	string	ecdsa-sha2-nistp521	OpenSSH ECDSA public key
+#!:mime		text/plain
+!:mime		text/x-ssh-public-key
+!:ext		pub
 0	string	ssh-ed25519		OpenSSH ED25519 public key
+#!:mime		text/plain
+!:mime		text/x-ssh-public-key
+!:ext		pub
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-ssh-pub.diff.sig
Type: application/octet-stream
Size: 792 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20240606/3819ab70/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-pub.txt.gz
Type: application/x-gzip
Size: 701 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20240606/3819ab70/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: droid-pub-key.csv.gz
Type: application/x-gzip
Size: 936 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20240606/3819ab70/attachment-0001.bin>

