[File] [PATCH] Magdir/ole2compounddocs for Microsoft PowerPoint Addin *.PPA and Wizard *.PWZ
Christos Zoulas
christos at zoulas.com
Sun May 29 20:10:26 UTC 2022
Committed, thanks!
christos
> On May 28, 2022, at 7:23 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> Some days ago I installed an old Microsoft 97. Out of interest I
> checked PowerPoint files from that version. When running file command
> version 5.41 with the -e cdf option on some examples, I get output
> like:
>
> AutoContent Wizard.pwz: OLE 2 Compound Document, v3.62,
> SecID 0x1, 4 FAT sectors,
> Mini FAT start sector 0x3,
> 2 Mini FAT sectors :
> UNKNOWN, clsid
> 0xf04672810a72cf11871800aa0060263b
> BSHPPT97.PPA: OLE 2 Compound Document, v3.62,
> SecID 0x1,
> Mini FAT start sector 0x3,
> 3 Mini FAT sectors :
> UNKNOWN, clsid
> 0xf04672810a72cf11871800aa0060263b
>
> Furthermore, only the generic mime type application/x-ole-storage is
> shown with the -i and -e cdf options. With option --extension only
> the 3 byte sequence ??? is shown.
>
> When running the file command with -e soft or no extra option on the
> inspected examples, I get output like:
>
> AutoContent Wizard.pwz: Composite Document File V2 Document,
> Little Endian, Os: Windows, Version 3.51,
> Code page: 1252, Title: No Slide Title,
> Author: Microsoft, Last Saved By: Microsoft,
> Revision Number: 1,
> Name of Creating Application:
> Microsoft PowerPoint, Total Editing Time:
> 00:17, Create Time/Date: Mon Nov 4 13:00:18
> 1996, Last Saved Time/Date: Mon Nov 4
> 13:00:36 1996, Number of Words: 0
> BSHPPT97.PPA: Composite Document File V2 Document,
> Little Endian, Os: Windows, Version 3.51,
> Code page: 1252,
> Author: Microsoft, Last Saved By: Microsoft,
> Revision Number: 1,
> Name of Creating Application:
> Microsoft PowerPoint, Total Editing Time:
> 00:06, Create Time/Date: Wed Oct 16 20:40:18
> 1996, Last Saved Time/Date: Wed Oct 16
> 20:40:24 1996, Number of Words: 0
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It also identifies
> all examples, with low priority, as "Generic OLE2 / Multistream
> Compound" by docfile.trid.xml. All examples are described as generic
> "Microsoft PowerPoint document" by ppt.trid.xml, but TrID does not
> recognize that they are the Addin or Wizard variant, so it shows the
> wrong extensions PPS/PPT (see appended trid-v-ppa.txt.gz).
>
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies the examples generically as "OLE2 Compound Document Format"
> by the fmt/111 signature (see appended droid-ppa.csv.gz).
>
> Luckily, with the information shown I found hints about "Wizard" on
> the Microsoft PowerPoint page on Wikipedia and on the file extensions
> web site. That information is expressed by comment lines inside
> Magdir/ole2compounddocs like:
>
> # URL: https://www.file-extensions.org/ppa-file-extension
> # https://en.wikipedia.org/wiki/Microsoft_PowerPoint#cite_note-231
> # Reference: http://fileformats.archiveteam.org/wiki/Microsoft_Compound_File
>
> The examples are recognized as "OLE 2 Compound Document"
> by the starting bytes (\320\317\021\340\241\261\032\341) tested at the
> beginning of Magdir/ole2compounddocs. Obviously there is no code
> fragment that does sub class identification, so the examples are
> described as "UNKNOWN". Furthermore, the examples have a registered
> Root storage object CLSID. That value is shown as
> 0xf04672810a72cf11871800aa0060263b, or expressed in the standard curly
> brace notation as {817246F0-720A-11CF-8718-00AA0060263B}.
> That means that lines must be added in the branch handling non null
> CLSID GUIDs. The most similar entry is Microsoft PowerPoint 97-2003
> presentation or template (ppt/pps/pot), so I added lines for my
> inspected examples after it. That looks like:
>
> >>88 ubequad 0x871800aa0060263b : Microsoft
> >>>80 ubequad 0xf04672810a72cf11 PowerPoint Addin or Wizard
> !:mime application/vnd.ms-powerpoint
> !:ext ppa/pwz
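
The relation between the raw CLSID bytes, the curly brace GUID and the two ubequad values can be checked with a short Python sketch. It only assumes the usual on-disk GUID layout (first three fields little-endian); the variable names are illustrative and not taken from the patch.

```python
import uuid

# Raw 16 CLSID bytes, in on-disk order, as printed by `file -e cdf`.
raw = bytes.fromhex("f04672810a72cf11871800aa0060263b")

# The first three GUID fields are stored little-endian on disk, which is
# the layout uuid.UUID(bytes_le=...) expects.
guid = "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"

# The magic lines test the same 16 bytes as two big-endian quads,
# at offsets 80 and 88 of the root directory entry.
quad_at_80 = int.from_bytes(raw[:8], "big")
quad_at_88 = int.from_bytes(raw[8:], "big")

print(guid)                              # {817246F0-720A-11CF-8718-00AA0060263B}
print(hex(quad_at_80), hex(quad_at_88))  # 0xf04672810a72cf11 0x871800aa0060263b
```

Because ubequad is big-endian, the constants in the magic lines are simply the two halves of the hex string that file prints, with no byte swapping needed.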
>
> Instead of the generic application/x-ole-storage, these get the mime
> type used by many other PowerPoint samples. The extension PPA is used
> for the Addin variant, for example BSHPPT97.PPA, and PWZ is used for
> the Wizard variant, for example "AutoContent Wizard.pwz". According
> to the file extensions web site, PWZ files are structurally identical
> to PPA files; only the extension differs. So I do not know how to
> distinguish them. For both, the second, third and fourth directory
> entries have names like VBA, PROJECT or PROJECTwm.
> In my installation it was registered as PowerPoint.Wizard.8. Following
> the hints about the wizard on Wikipedia, this type exists for
> PowerPoint versions 4.0 to 11.0 (2004), but according to
> file-extensions.org the Addin variant exists for versions 97 to 2003.
>
> After applying the above mentioned modifications with the patch
> file-ole2compounddocs-ppa.diff to a newer master variant, all my
> inspected examples are now described in more detail. With the -e cdf
> option this now looks like:
>
> AutoContent Wizard.pwz: OLE 2 Compound Document, v3.62,
> SecID 0x1, 4 FAT sectors,
> Mini FAT start sector 0x3,
> 2 Mini FAT sectors :
> Microsoft PowerPoint Addin or Wizard
> BSHPPT97.PPA: OLE 2 Compound Document, v3.62,
> SecID 0x1,
> Mini FAT start sector 0x3,
> 3 Mini FAT sectors :
> Microsoft PowerPoint Addin or Wizard
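
For readers who want to inspect the root storage CLSID without file, here is a minimal Python sketch. It assumes the standard compound file layout (sector shift at header offset 30, first directory SecID at header offset 48, CLSID at offset 80 of the 128-byte root entry); the function name root_clsid is hypothetical and this is not how file itself implements the check.

```python
import struct
import uuid

# \320\317\021\340\241\261\032\341 written as hex.
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def root_clsid(data: bytes) -> uuid.UUID:
    """Return the CLSID of the root storage of an OLE 2 compound document."""
    if data[:8] != OLE2_MAGIC:
        raise ValueError("not an OLE 2 compound document")
    # Header offset 30: sector shift (9 means 512-byte sectors).
    (sector_shift,) = struct.unpack_from("<H", data, 30)
    # Header offset 48: SecID of the first directory sector.
    (dir_secid,) = struct.unpack_from("<I", data, 48)
    # Sector n starts at (n + 1) * sector_size because the header
    # occupies the space in front of sector 0.
    dir_offset = (dir_secid + 1) << sector_shift
    # The root entry is the first directory entry; its CLSID is at offset 80.
    return uuid.UUID(bytes_le=data[dir_offset + 80 : dir_offset + 96])
```

Called as root_clsid(open("BSHPPT97.PPA", "rb").read()), it should, going by the output quoted earlier in this message, yield 817246f0-720a-11cf-8718-00aa0060263b for the inspected examples.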
>
> I hope my diff file can be applied in a future version of the file
> utility.
>
> With best wishes,
> Jörg Jenderek
> - --
> Jörg Jenderek
>
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYpKu1AAKCRCv8rHJQhrU
> 1pN+AKC+BDV7iwx2I/CU8HAiGgTy+mi1VACfa6L3wdvRL15x0GYEAyfw6bK+p74=
> =IgIU
> -----END PGP SIGNATURE-----
> <droid-ppa.csv.gz><trid-v-ppa.txt.gz><file-ole2compounddocs-ppa_diff.DEFANGED-0><file-ole2compounddocs-ppa_diff_sig.DEFANGED-1>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 235 bytes
Desc: Message signed with OpenPGP
URL: <https://mailman.astron.com/pipermail/file/attachments/20220529/8ab4bafa/attachment.asc>