[File] [PATCH] Magdir/wordprocessors, ssl, ssh for "oldest" Microsoft Publisher, different public keys

Christos Zoulas christos at zoulas.com
Mon Jun 10 23:24:40 UTC 2024


Committed, thanks!

christos

> On Jun 6, 2024, at 3:32 AM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
> 
> Hello,
> 
> A few days ago I had to handle an old CD-ROM. It contains some older
> Microsoft Publisher files with the file name suffix pub. These are not
> recognized correctly, so I sent the patch file-ole2compounddocs-pub.diff
> some weeks ago. Now I have found "oldest" Microsoft Publisher samples like
> MSPublisherv1.PUB. Unfortunately the PUB file name suffix is also used
> for public keys by different software, so I also looked for such samples.
> 
> 
> When running file command version 5.45 on such "oldest" Microsoft
> Publisher files, other public keys and related files, I get output like:
> 
> MSPublisherv1.PUB:        data
> TMP00044.PUB:             data
> Thanksgiving1.DTP:        data
> format_gen.key:           OpenSSH private key (no password)
> format_gen.pub:           ASCII text
> id_dsa.pub:               OpenSSH DSA public key
> id_ecdsa384.pub:          OpenSSH ECDSA public key
> id_ecdsa521.pub:          OpenSSH ECDSA public key
> id_rsa.pub:               OpenSSH RSA public key
> localhost.priv:           PEM RSA private key
> localhost.pub:            ASCII text
> ssh_host_ecdsa_key.pub:   OpenSSH ECDSA public key
> ssh_host_ed25519_key.pub: OpenSSH ED25519 public key
> 
> With option --extension only the 3 byte sequence ??? is shown for most
> samples. With option -i only the generic application/octet-stream or
> text/plain is shown for most examples. This looks like:
> 
> MSPublisherv1.PUB:        application/octet-stream; charset=binary
> TMP00044.PUB:             application/octet-stream; charset=binary
> Thanksgiving1.DTP:        application/octet-stream; charset=binary
> format_gen.key:           text/plain; charset=us-ascii
> format_gen.pub:           text/plain; charset=us-ascii
> id_dsa.pub:               text/plain; charset=us-ascii
> id_ecdsa384.pub:          text/plain; charset=us-ascii
> id_ecdsa521.pub:          text/plain; charset=us-ascii
> id_rsa.pub:               text/plain; charset=us-ascii
> localhost.priv:           text/plain; charset=us-ascii
> localhost.pub:            text/plain; charset=us-ascii
> ssh_host_ecdsa_key.pub:   text/plain; charset=us-ascii
> ssh_host_ed25519_key.pub: text/plain; charset=us-ascii
> 
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html).
> 
> It identifies some "data" samples (like Thanksgiving1.DTP and
> TMP00044.PUB) as "COSMI document (generic)" with mime type
> application/octet-stream and without a file name suffix by cosmi.trid.xml.
> It identifies other "data" samples like MSPublisherv1.PUB as "Microsoft
> Publisher document (v1)" with mime type application/vnd.ms-publisher and
> file name suffix PUB. It identifies some SSH public keys with text/plain
> mime type and PUB file name suffix. Some samples (like id_dsa.pub) are
> described as "SSH-DSS Public key" by pub-ssh-dss.trid.xml and others
> (like id_rsa.pub) are described as "SSH-RSA Public key" by
> pub-ssh-rsa.trid.xml. Some ssh keys (like ssh_host_ed25519_key.pub and
> id_ecdsa384.pub) are not recognized by it. The sample localhost.priv is
> described as "ASCII armored RSA Private Key" with mime type text/plain
> and KEY name suffix (see appended trid-v-pub.txt.gz).
> 
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies MSPublisherv1.PUB correctly as "Microsoft Publisher" with
> version 1 and mime type application/x-mspublisher by PUID fmt/1511.
> The other PUB samples are also described, wrongly, as "Microsoft Publisher"
> because recognition is based on the file name suffix pub (see appended
> droid-pub-key.csv.gz).
> 
> On Linux none of these examples is described by the shared MIME-info
> database.
> 
> Luckily, with the information given by the other tools, I also found a
> page about Microsoft Publisher on the file formats Archive Team web site.
> It also lists links to samples for download. This information is
> expressed after the Microsoft Works entry by comment lines inside
> Magdir/wordprocessors like:
> # URL:	https://en.wikipedia.org/wiki/Microsoft_Publisher
> # Ref.:	http://fileformats.archiveteam.org/wiki/Microsoft_Publisher
> Newer Publisher files (since version 2) are OLE 2 Compound based and
> described by Magdir/ole2compounddocs, for which I sent a patch some weeks
> ago. According to the reference the oldest are based on another file
> format. But at least the first four bytes are constant and hopefully
> unique. So the additional magic lines look like:
> 0	ubelong		=0xE7AC2C00	Microsoft Publisher (1.0)
> !:mime	application/vnd.ms-publisher
> !:ext	pub
> Here I chose the mime type used for the other variant. It is also used
> by the Linux mime database, but that type is not registered at IANA. So
> maybe a user defined mime type application/x-mspublisher is better suited.
> 
> Afterwards I added information as comments for the COSMI document. This looks like:
> # URL:	http://fileformats.archiveteam.org/wiki/COSMI_MultiMedia
> #	https://en.wikipedia.org/wiki/Cosmi_Corporation
> # Ref.:	http://mark0.net/download/triddefs_xml.7z/defs/c/cosmi.trid.xml
> The recognition happens by lines like:
> 0	string/b	LCP		COSMI document
> !:mime	application/x-cosmi
> !:ext		dtp/pub/bro/bcd/crd
> Instead of the generic application/octet-stream mime type I show a user
> defined one. Besides the PUB suffix other name suffixes are also used
> (BCD~Business Card Maker BRO~Brochure Magic CRD~Greeting Card Magic
> DTP~Print Perfect PUB~Desktop Publisher), but I do not know if it is
> possible to sub-classify by the different name suffixes.
> 
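For anyone who wants to check the two binary signatures outside of file(1), here is a minimal Python sketch of what the quoted magic lines test: the unsigned big-endian 32-bit value 0xE7AC2C00 at offset 0 for Publisher 1.0, and the bytes LCP at offset 0 for COSMI. The function name classify_header is hypothetical and not part of the patch; the descriptions and mime types are copied from the proposed entries.

```python
import struct

# Values copied from the magic lines quoted in this message.
PUBLISHER_V1_MAGIC = 0xE7AC2C00   # "0 ubelong =0xE7AC2C00"
COSMI_MAGIC = b"LCP"              # "0 string/b LCP"


def classify_header(data: bytes):
    """Return (description, mime) for the two binary formats, else None.

    Hypothetical helper for illustration only; file(1) itself evaluates
    the magic database, not this function.
    """
    if len(data) >= 4:
        # ubelong: unsigned 32-bit big-endian integer at offset 0
        (value,) = struct.unpack(">I", data[:4])
        if value == PUBLISHER_V1_MAGIC:
            return ("Microsoft Publisher (1.0)", "application/vnd.ms-publisher")
    if data.startswith(COSMI_MAGIC):
        return ("COSMI document", "application/x-cosmi")
    return None
```

Passing the first few bytes of a sample, for example the result of open(path, "rb").read(16), is enough, since both tests look only at offset 0.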
> Samples like localhost.priv are described inside Magdir/ssl by a line like:
> 0	string	-----BEGIN\040RSA\040PRIVATE	PEM RSA private key
> Luckily, with the information given by the other tools, I also found a
> page about SSL on the GitHub web site. This information is expressed by
> comment lines inside Magdir/ssl like:
> # Ref.:	https://github.com/openssl/openssl/blob/master/include/openssl/
> #	pem.h
> #	http://mark0.net/download/triddefs_xml.7z/
> #	defs/k/key-rsa-pvt.trid.xml
> So the above magic line now becomes:
> 0 string -----BEGIN\040RSA\040PRIVATE		PEM RSA private key
> !:mime		text/x-ssl-private-key
> !:ext		key/priv
> Instead of the generic text/plain I show a user defined one. According to
> TrID the file name suffix KEY is used, but in many samples I found PRIV
> (apparently the abbreviation for PRIVATE). So I show both name
> extensions. The counterpart is the public key. There the phrase PUBLIC
> is used instead of PRIVATE, and PUB is used as file name suffix. So the
> previously unrecognized key is now handled by additional lines, which look like:
> 0 string -----BEGIN\040RSA\040PUBLIC\040KEY----- PEM RSA public key
> !:mime		text/x-ssl-public-key
> !:ext		pub
> 
> Samples like id_rsa.pub are described inside Magdir/ssh by a line like:
> 0	string	ssh-rsa\040		OpenSSH RSA public key
> 
> Luckily, with the information given by the other tools, I also found a
> page about SSH on the Wikipedia web site. This information is expressed by
> comment lines inside Magdir/ssh like:
> # URL:		https://en.wikipedia.org/wiki/Secure_Shell_Protocol
> # Reference:	http://mark0.net/download/triddefs_xml.7z
> #		defs/p/pub-ssh-rsa.trid.xml
> So the above magic line now becomes:
> 0	string	ssh-rsa\040		OpenSSH RSA public key
> !:mime		text/x-ssh-public-key
> !:ext		pub
> Instead of the generic text/plain I show a user defined one. According to
> TrID the file name suffix PUB is used (apparently the abbreviation for PUBLIC).
> 
> Then I did the same for the other listed public ssh keys. Samples
> like format_gen.key (found in the qemu version 9.0.0 source) are described
> inside Magdir/ssl by a line like:
> 0	string	-----BEGIN\040PRIVATE\040KEY-----	\
> 			OpenSSH private key (no password)
> So the above magic line now becomes:
> 0	string	-----BEGIN\040PRIVATE\040KEY-----	\
> 			OpenSSH private key (no password)
> !:mime		text/x-ssh-private-key
> !:ext		key
> Instead of the generic text/plain I show a user defined one. In the sample
> the file name suffix KEY is used, so I show this name extension. The
> counterpart is the public key. There the phrase PUBLIC is used instead of
> PRIVATE (see format_gen.pub), and PUB is used as file name suffix. So the
> previously unrecognized key is now handled by additional lines, which look like:
> 0	string	-----BEGIN\040PUBLIC\040KEY-----	\
> 			OpenSSH public key
> !:mime		text/x-ssh-public-key
> !:ext		pub
> 
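The text key entries quoted above all match a fixed prefix at offset 0 (\040 is the octal escape for a space). As a rough illustration, assuming nothing beyond the lines quoted in this message, the same tests can be written as a small Python prefix table. The names KEY_SIGNATURES and describe_key are hypothetical, and only the entries spelled out above are included.

```python
# Prefix table mirroring the magic lines quoted in this message.
# Each entry: (bytes expected at offset 0, description, proposed mime type).
KEY_SIGNATURES = [
    (b"-----BEGIN RSA PRIVATE", "PEM RSA private key",
     "text/x-ssl-private-key"),
    (b"-----BEGIN RSA PUBLIC KEY-----", "PEM RSA public key",
     "text/x-ssl-public-key"),
    (b"ssh-rsa ", "OpenSSH RSA public key",
     "text/x-ssh-public-key"),
    (b"-----BEGIN PRIVATE KEY-----", "OpenSSH private key (no password)",
     "text/x-ssh-private-key"),
    (b"-----BEGIN PUBLIC KEY-----", "OpenSSH public key",
     "text/x-ssh-public-key"),
]


def describe_key(data: bytes):
    """Return (description, mime) for the first matching prefix, else None.

    Hypothetical helper for illustration; it does not reproduce how
    file(1) orders or weighs its magic entries.
    """
    for prefix, description, mime in KEY_SIGNATURES:
        if data.startswith(prefix):
            return (description, mime)
    return None
```

None of the listed prefixes is a prefix of another, so the order of the table does not change the result for these entries.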
> After applying the above mentioned modifications by the patches
> file-5.45-wordprocessors-pub.diff, file-5.45-ssl-pub.diff and
> file-5.45-ssh-pub.diff, most of my inspected examples with PUB name
> suffix are now described. This now looks like:
> MSPublisherv1.PUB:        Microsoft Publisher (1.0)
> TMP00044.PUB:             COSMI document
> Thanksgiving1.DTP:        COSMI document
> format_gen.key:           OpenSSH private key (no password)
> format_gen.pub:           OpenSSH public key
> id_dsa.pub:               OpenSSH DSA public key
> id_ecdsa384.pub:          OpenSSH ECDSA public key
> id_ecdsa521.pub:          OpenSSH ECDSA public key
> id_rsa.pub:               OpenSSH RSA public key
> localhost.priv:           PEM RSA private key
> localhost.pub:            PEM RSA public key
> ssh_host_ecdsa_key.pub:   OpenSSH ECDSA public key
> ssh_host_ed25519_key.pub: OpenSSH ED25519 public key
> 
> I hope my diff files can be applied in a future version of the file
> utility. Unfortunately the pub suffix is also used for PGP/GPG keys.
> Here I also found some exceptions which are not recognized. So I need
> some time to inspect what exactly is going wrong there. I will try to
> handle this in a future session.
> 
> With best wishes,
> Jörg Jenderek
