[File] [PATCH] Magdir/wordprocessors, ssl, ssh for "oldest" Microsoft Publisher, different public keys
Christos Zoulas
christos at zoulas.com
Mon Jun 10 23:24:40 UTC 2024
Committed, thanks!
christos
> On Jun 6, 2024, at 3:32 AM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
>
> Hello,
>
> Some days ago I had to handle an old CD-ROM. It contains some older
> Microsoft Publisher files with the file name suffix pub. These were not
> recognized correctly, so I sent the patch file-ole2compounddocs-pub.diff
> some weeks ago. Now I have found "oldest" Microsoft Publisher samples
> like MSPublisherv1.PUB. Unfortunately the PUB file name suffix is also
> used for public keys by different software, so I also looked for such
> samples.
>
>
> When running the file command version 5.45 on such "oldest" Microsoft
> Publisher files, public keys and related files, I get output like:
>
> MSPublisherv1.PUB: data
> TMP00044.PUB: data
> Thanksgiving1.DTP: data
> format_gen.key: OpenSSH private key (no password)
> format_gen.pub: ASCII text
> id_dsa.pub: OpenSSH DSA public key
> id_ecdsa384.pub: OpenSSH ECDSA public key
> id_ecdsa521.pub: OpenSSH ECDSA public key
> id_rsa.pub: OpenSSH RSA public key
> localhost.priv: PEM RSA private key
> localhost.pub: ASCII text
> ssh_host_ecdsa_key.pub: OpenSSH ECDSA public key
> ssh_host_ed25519_key.pub: OpenSSH ED25519 public key
>
> With the option --extension only the 3-byte sequence ??? is shown for
> most samples. With the option -i only the generic
> application/octet-stream or text/plain is shown for most examples. This
> looks like:
>
> MSPublisherv1.PUB: application/octet-stream; charset=binary
> TMP00044.PUB: application/octet-stream; charset=binary
> Thanksgiving1.DTP: application/octet-stream; charset=binary
> format_gen.key: text/plain; charset=us-ascii
> format_gen.pub: text/plain; charset=us-ascii
> id_dsa.pub: text/plain; charset=us-ascii
> id_ecdsa384.pub: text/plain; charset=us-ascii
> id_ecdsa521.pub: text/plain; charset=us-ascii
> id_rsa.pub: text/plain; charset=us-ascii
> localhost.priv: text/plain; charset=us-ascii
> localhost.pub: text/plain; charset=us-ascii
> ssh_host_ecdsa_key.pub: text/plain; charset=us-ascii
> ssh_host_ed25519_key.pub: text/plain; charset=us-ascii
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html).
>
> TrID identifies some "data" samples (like Thanksgiving1.DTP and
> TMP00044.PUB) as "COSMI document (generic)" with mime type
> application/octet-stream and without a file name suffix, by
> cosmi.trid.xml. It identifies some "data" samples like
> MSPublisherv1.PUB as "Microsoft Publisher document (v1)" with mime
> type application/vnd.ms-publisher and file name suffix PUB. It
> identifies some SSH public keys with mime type text/plain and file name
> suffix PUB. Some samples (like id_dsa.pub) are described as "SSH-DSS
> Public key" by pub-ssh-dss.trid.xml and others (like id_rsa.pub) as
> "SSH-RSA Public key" by pub-ssh-rsa.trid.xml. Some SSH keys (like
> ssh_host_ed25519_key.pub and id_ecdsa384.pub) are not recognized. The
> sample localhost.priv is described as "ASCII armored RSA Private Key"
> with mime type text/plain and file name suffix KEY (see the appended
> trid-v-pub.txt.gz).
>
> For comparison I also ran the file format identification utility
> DROID (see https://sourceforge.net/projects/droid/). It identifies
> MSPublisherv1.PUB correctly as "Microsoft Publisher" with version 1 and
> mime type application/x-mspublisher by PUID fmt/1511. The other PUB
> samples are wrongly described as "Microsoft Publisher" too, because
> recognition is based on the file name suffix pub (see the appended
> droid-pub-key.csv.gz).
>
> On Linux, the shared MIME-info database describes none of these
> examples.
>
> Luckily, with the information given by the other tools, I also found a
> page about Microsoft Publisher on the file formats Archive Team web
> site. It also lists links to downloadable samples. This information is
> expressed by comment lines after the Microsoft Works entry inside
> Magdir/wordprocessors, like:
> # URL: https://en.wikipedia.org/wiki/Microsoft_Publisher
> # Ref.: http://fileformats.archiveteam.org/wiki/Microsoft_Publisher
> Newer Publisher files (since version 2) are OLE 2 Compound based and
> described by Magdir/ole2compounddocs, for which I sent a patch some
> weeks ago. According to the reference, the oldest files are based on
> another file format. But at least the first four bytes are constant and
> hopefully unique. So the additional magic lines look like:
> 0 ubelong =0xE7AC2C00 Microsoft Publisher (1.0)
> !:mime application/vnd.ms-publisher
> !:ext pub
> I chose the mime type used for the other variant. It is also used
> by the Linux mime database, but that type is not registered at IANA. So
> maybe a user-defined mime type application/x-mspublisher is better
> suited.
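
As a rough illustration of what the `0 ubelong =0xE7AC2C00` test does, here is a minimal Python sketch: it reads the first four bytes as an unsigned big-endian 32-bit integer and compares them with the constant from the magic line. The helper name `looks_like_publisher_v1` is made up for this example and is not part of file(1).

```python
import struct

# Constant taken from the proposed magic line: 0 ubelong =0xE7AC2C00
PUBLISHER_V1_MAGIC = 0xE7AC2C00


def looks_like_publisher_v1(data: bytes) -> bool:
    """Mirror the offset-0 ubelong comparison (hypothetical helper)."""
    if len(data) < 4:
        # Too short to hold a 32-bit value at offset 0.
        return False
    # ">I" = big-endian unsigned 32-bit integer, like ubelong in magic(5).
    (value,) = struct.unpack(">I", data[:4])
    return value == PUBLISHER_V1_MAGIC
```

The byte order matters: the same four bytes read as little-endian would not match.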
>
> Afterwards I added information as comments for the COSMI document. This looks like:
> # URL: http://fileformats.archiveteam.org/wiki/COSMI_MultiMedia
> # https://en.wikipedia.org/wiki/Cosmi_Corporation
> # Ref.: http://mark0.net/download/triddefs_xml.7z/defs/c/cosmi.trid.xml
> The recognition is done by lines like:
> 0 string/b LCP COSMI document
> !:mime application/x-cosmi
> !:ext dtp/pub/bro/bcd/crd
> Instead of the generic application/octet-stream mime type I show a
> user-defined one. Besides the PUB suffix, other name suffixes are also
> used (BCD~Business Card Maker BRO~Brochure Magic CRD~Greeting Card Magic
> DTP~Print Perfect PUB~Desktop Publisher), but I do not know whether it
> is possible to sub-classify by the different name suffixes.
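
On the sub-classification question: magic rules test file contents, not file names, so one way to picture the split is a content test followed by a separate suffix lookup outside the magic rule. The sketch below is only an illustration under that assumption; `describe_cosmi` and `COSMI_PRODUCTS` are hypothetical names, and the suffix-to-product pairs are the ones listed in the mail.

```python
from pathlib import PurePath
from typing import Optional

# Suffix-to-product pairs as listed in the mail above.
COSMI_PRODUCTS = {
    "bcd": "Business Card Maker",
    "bro": "Brochure Magic",
    "crd": "Greeting Card Magic",
    "dtp": "Print Perfect",
    "pub": "Desktop Publisher",
}


def describe_cosmi(name: str, data: bytes) -> Optional[str]:
    """Content test first (string LCP at offset 0), then a suffix lookup."""
    if not data.startswith(b"LCP"):
        return None
    suffix = PurePath(name).suffix.lstrip(".").lower()
    product = COSMI_PRODUCTS.get(suffix)
    if product is None:
        # Unknown or missing suffix: fall back to the generic description.
        return "COSMI document"
    return f"COSMI document ({product})"
```

For example, `describe_cosmi("Thanksgiving1.DTP", data)` would add "Print Perfect" when `data` starts with `LCP`, while a file with an unlisted suffix keeps the generic description.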
>
> Samples like localhost.priv are described inside Magdir/ssl by a line like:
> 0 string -----BEGIN\040RSA\040PRIVATE PEM RSA private key
> Luckily, with the information given by the other tools, I also found a
> page about SSL on the GitHub web site. This information is expressed by
> comment lines inside Magdir/ssl like:
> # Ref.: https://github.com/openssl/openssl/blob/master/include/openssl/
> # pem.h
> # http://mark0.net/download/triddefs_xml.7z/
> # defs/k/key-rsa-pvt.trid.xml
> So the above magic line now becomes:
> 0 string -----BEGIN\040RSA\040PRIVATE PEM RSA private key
> !:mime text/x-ssl-private-key
> !:ext key/priv
> Instead of the generic text/plain I show a user-defined one. According
> to TrID the file name suffix KEY is used, but in many samples I found
> PRIV (apparently the abbreviation for PRIVATE). So I show both name
> extensions. The counterpart is the public key. There the word PUBLIC
> is used instead of PRIVATE, and PUB is used as file name suffix. So the
> previously unrecognized key is now handled by additional lines, which
> look like:
> 0 string -----BEGIN\040RSA\040PUBLIC\040KEY----- PEM RSA public key
> !:mime text/x-ssl-public-key
> !:ext pub
>
> Samples like id_rsa.pub are described inside Magdir/ssh by a line like:
> 0 string ssh-rsa\040 OpenSSH RSA public key
>
> Luckily, with the information given by the other tools, I also found a
> page about SSH on the Wikipedia web site. This information is expressed
> by comment lines inside Magdir/ssh like:
> # URL: https://en.wikipedia.org/wiki/Secure_Shell_Protocol
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # defs/p/pub-ssh-rsa.trid.xml
> So the above magic line now becomes:
> 0 string ssh-rsa\040 OpenSSH RSA public key
> !:mime text/x-ssh-public-key
> !:ext pub
> Instead of the generic text/plain I show a user-defined one. According
> to TrID the file name suffix PUB is used (apparently the abbreviation
> for PUBLIC).
>
> Then I did the same for the other listed public SSH keys. Samples
> like format_gen.key (found in the qemu version 9.0.0 source) are
> described inside Magdir/ssl by a line like:
> 0 string -----BEGIN\040PRIVATE\040KEY----- \
> OpenSSH private key (no password)
> So the above magic line now becomes:
> 0 string -----BEGIN\040PRIVATE\040KEY----- \
> OpenSSH private key (no password)
> !:mime text/x-ssh-private-key
> !:ext key
> Instead of the generic text/plain I show a user-defined one. In the
> sample the file name suffix KEY is used, so I show this name extension.
> The counterpart is the public key. There the word PUBLIC is used
> instead of PRIVATE (see format_gen.pub), and PUB is used as file name
> suffix. So the previously unrecognized key is now handled by additional
> lines, which look like:
> 0 string -----BEGIN\040PUBLIC\040KEY----- \
> OpenSSH public key
> !:mime text/x-ssh-public-key
> !:ext pub
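
The text-based key rules quoted above all follow one pattern: a fixed string at offset 0 selects a description, a mime type and a suffix list. A minimal Python sketch of that lookup, using only the prefixes and values from the magic lines in this mail, could look as follows. `KeyRule`, `RULES` and `identify_key` are illustrative names, not anything from file(1).

```python
from typing import NamedTuple, Optional


class KeyRule(NamedTuple):
    prefix: bytes      # string expected at offset 0
    description: str   # text printed by the rule
    mime: str          # value of the !:mime line
    extensions: str    # value of the !:ext line


# Values copied from the magic lines quoted in the mail.
RULES = [
    KeyRule(b"-----BEGIN RSA PRIVATE", "PEM RSA private key",
            "text/x-ssl-private-key", "key/priv"),
    KeyRule(b"-----BEGIN RSA PUBLIC KEY-----", "PEM RSA public key",
            "text/x-ssl-public-key", "pub"),
    KeyRule(b"ssh-rsa ", "OpenSSH RSA public key",
            "text/x-ssh-public-key", "pub"),
    KeyRule(b"-----BEGIN PRIVATE KEY-----", "OpenSSH private key (no password)",
            "text/x-ssh-private-key", "key"),
    KeyRule(b"-----BEGIN PUBLIC KEY-----", "OpenSSH public key",
            "text/x-ssh-public-key", "pub"),
]


def identify_key(data: bytes) -> Optional[KeyRule]:
    """Return the first rule whose prefix matches the start of data."""
    for rule in RULES:
        if data.startswith(rule.prefix):
            return rule
    return None
```

None of the listed prefixes is a prefix of another, so the order of `RULES` does not change the result here.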
>
> After applying the above-mentioned modifications with the patches
> file-5.45-wordprocessors-pub.diff, file-5.45-ssl-pub.diff and
> file-5.45-ssh-pub.diff, most of my inspected examples with the PUB name
> suffix are now described. This now looks like:
> MSPublisherv1.PUB: Microsoft Publisher (1.0)
> TMP00044.PUB: COSMI document
> Thanksgiving1.DTP: COSMI document
> format_gen.key: OpenSSH private key (no password)
> format_gen.pub: OpenSSH public key
> id_dsa.pub: OpenSSH DSA public key
> id_ecdsa384.pub: OpenSSH ECDSA public key
> id_ecdsa521.pub: OpenSSH ECDSA public key
> id_rsa.pub: OpenSSH RSA public key
> localhost.priv: PEM RSA private key
> localhost.pub: PEM RSA public key
> ssh_host_ecdsa_key.pub: OpenSSH ECDSA public key
> ssh_host_ed25519_key.pub: OpenSSH ED25519 public key
>
> I hope my diff files can be applied in a future version of the file
> utility. Unfortunately the pub suffix is also used for PGP/GPG keys.
> Here I also found some exceptions which are not recognized. So I need
> some time to inspect what exactly is going wrong there. I will try to
> handle this in a future session.
>
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek