[File] [PATCH] Magdir/ole2compounddocs for Microsoft Office Binder *.OBD, *.OBT

Christos Zoulas christos at zoulas.com
Tue May 31 17:40:30 UTC 2022


On 2022-05-29 8:24 pm, Jörg Jenderek wrote:
> Hello,
> 
> some days ago I installed an old Microsoft Office 97. Out of
> interest I checked Office Binder files from that version. When
> running file command version 5.41 with the -e cdf option on some
> examples I get output like:
> 
> BINDER.OBD: OLE 2 Compound Document, v3.62,
> 	    SecID 0x4,
> 	    Mini FAT start sector 0x7 :
> 	    UNKNOWN, clsid
> 	    0x0004855964661b10b21c00aa004ba90b
> REPORT.OBT: OLE 2 Compound Document, v3.62,
> 	    SecID 0x119, 3 FAT sectors,
> 	    Mini FAT start sector 0x122 :
> 	    UNKNOWN, clsid
> 	    0x0004855964661b10b21c00aa004ba90b
> 
> Furthermore only the generic MIME type application/x-ole-storage is
> shown with the -i and -e cdf options. With the --extension option
> only the 3-byte sequence ??? is shown.
> 
> When running the file command with -e soft or with no extra option
> on the inspected examples I get output like:
> 
> BINDER.OBD: Composite Document File V2 Document,
> 	    Little Endian, Os: Windows, Version 4.0,
> 	    Code page: 1252, Revision Number: 2,
> 	    Name of Creating Application:
> 	    Microsoft Office Binder, Create Time/Date:
> 	    Tue Oct 22 23:28:36 1996, Last Saved Time/Date:
> 	    Tue Oct 22 23:28:36 1996, Number of Pages: 0,
> 	    Number of Words: 0, Number of Characters: 0
> REPORT.OBT: Composite Document File V2 Document,
> 	    Little Endian, Os: Windows, Version 4.0,
> 	    Code page: 1252, Title: Report, Revision Number: 3,
> 	    Name of Creating Application:
> 	    Microsoft Office Binder, Create Time/Date:
> 	    Thu Oct 17 22:52:34 1996, Last Saved Time/Date:
> 	    Thu Oct 17 22:52:34 1996, Number of Pages: 0,
> 	    Number of Words: 0, Number of Characters: 0
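The invocations compared above can be collected in a small helper. This
is only a sketch: it assumes a `file` binary on PATH, and the names
file_commands and run_all are made up for illustration. The flags
themselves (-e cdf, -e soft, -i, --extension) are the ones quoted in
the message.

```python
import subprocess


def file_commands(path):
    """Return the `file` command lines discussed above for one sample."""
    return {
        # -e cdf excludes the built-in CDF test, so the description
        # comes from the magic entries (Magdir/ole2compounddocs).
        "magic": ["file", "-e", "cdf", path],
        "mime": ["file", "-i", "-e", "cdf", path],
        "extension": ["file", "--extension", "-e", "cdf", path],
        # -e soft excludes the magic files, so the built-in CDF test
        # produces the "Composite Document File V2" description.
        "cdf": ["file", "-e", "soft", path],
    }


def run_all(path):
    """Run every variant and return the stdout of each, keyed by name."""
    results = {}
    for name, cmd in file_commands(path).items():
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[name] = proc.stdout.strip()
    return results
```

Calling run_all("BINDER.OBD") would return the four outputs side by
side, which makes it easy to see which test produced which line.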
> 
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It also identifies
> all examples with low priority as "Generic OLE2 / Multistream
> Compound" by docfile.trid.xml. Some examples are described with a
> high rate as "Office Binder Document" by obd.trid.xml. But it does
> not recognize that a file is a template, so it shows the wrong
> extension OBD (see appended trid-v-obd.txt.gz).
> 
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies the examples as "Microsoft Office Binder File for Windows"
> with version 97-2000 by the fmt/240 signature, but it complains about
> the OBT suffix for templates (see appended droid-obd.csv.gz).
> 
> Luckily, with the information shown I found hints about "Binder" on
> the Microsoft Office shared tools page on Wikipedia and on a file
> extensions web site. That information is recorded in comment lines
> inside Magdir/ole2compounddocs like:
> # URL:		https://en.wikipedia.org/wiki/
> #		Microsoft_Office_shared_tools#Binder
> # Reference:	http://mark0.net/download/triddefs_xml.7z
> #		defs/o/obd.trid.xml
> #		http://fileformats.archiveteam.org/wiki/
> #		Microsoft_Compound_File
> 
> The examples are recognized as "OLE 2 Compound Document"
> by the starting bytes (\320\317\021\340\241\261\032\341) at the
> beginning inside Magdir/ole2compounddocs. Apparently there is no
> code fragment that does subclass identification for them, so the
> examples are described as "UNKNOWN". Furthermore the examples have
> a registered Root storage object CLSID. That value is shown as
> 0x0004855964661b10b21c00aa004ba90b, or in the standard curly
> braces notation as {59850400-6664-101B-B21C-00AA004BA90B}.
> That means that lines must be added in the branch handling non-null
> CLSID GUIDs. The most similar entry was Microsoft Project (*.mpp),
> so I added the lines for my inspected examples after it. That
> looks like:
> 
>  >>88  ubequad 0xb21c00aa004ba90b : Microsoft
>  >>>80 ubequad 0x0004855964661b10 Office Binder Document, Template
>  !:mime	application/x-msbinder
>  !:ext	obd/obt
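The relation between the hex value printed by file and the curly
braces form of the CLSID can be checked with a few lines of Python.
This is a minimal sketch, not the code used by file: it assumes the
usual compound file layout (sector shift at header offset 30, first
directory sector at offset 48, CLSID at offset 80 of the root
directory entry), and the function names are made up for illustration.

```python
import struct
import uuid

# Same bytes as \320\317\021\340\241\261\032\341 in the magic entry.
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def root_clsid(data: bytes) -> uuid.UUID:
    """Return the CLSID stored in the root directory entry."""
    if data[:8] != OLE2_MAGIC:
        raise ValueError("not an OLE 2 compound document")
    (sector_shift,) = struct.unpack_from("<H", data, 30)
    (dir_secid,) = struct.unpack_from("<I", data, 48)
    # Sector 0 starts right after the header sector (assumed layout).
    root_entry = (dir_secid + 1) << sector_shift
    raw = data[root_entry + 80:root_entry + 96]
    if len(raw) != 16:
        raise ValueError("truncated root directory entry")
    # On disk the first three GUID fields are little-endian.
    return uuid.UUID(bytes_le=raw)


def as_file_hex(clsid: uuid.UUID) -> str:
    """Format the CLSID in on-disk byte order, as in the file output."""
    return "0x" + clsid.bytes_le.hex()


def as_registry(clsid: uuid.UUID) -> str:
    """Format the CLSID in the curly braces notation."""
    return "{" + str(clsid).upper() + "}"
```

The two ubequad tests in the magic lines compare the same 16 bytes in
on-disk order: bytes 0-7 at offset 80 and bytes 8-15 at offset 88.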
> 
> Instead of the generic application/x-ole-storage these get a
> user-defined MIME type mentioned on a file extensions web site. The
> type application/vnd.ms-binder is also mentioned, but I did not
> find such an official type. The extension OBT is used for the
> template variant, for example REPORT.OBT, and OBD is used for the
> Microsoft Office Binder Document variant, for example BINDER.OBD.
> I do not know if it is possible to distinguish OBT from OBD.
> Furthermore a third variant is mentioned: the suffix OBZ is used
> for the wizard variant, but I did not find such examples myself,
> only entries in the Windows registry.
> 
> When reactivating some debugging lines like:
> >>128	lestring16	x \b, 2nd %.20s
> I get ",2nd Binder" for my examples. That is the
> characteristic that is used by the TrID definition.
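What the debugging line prints can be reproduced outside of file. The
sketch below assumes that offset 128 is relative to the start of the
root directory entry, so that it lands on the name of the second
128-byte directory entry, and that the name length field sits at
offset 64 of an entry. The function name is made up for illustration.

```python
import struct

OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def second_entry_name(data: bytes) -> str:
    """Return the UTF-16LE name of the second directory entry."""
    if data[:8] != OLE2_MAGIC:
        raise ValueError("not an OLE 2 compound document")
    (sector_shift,) = struct.unpack_from("<H", data, 30)
    (dir_secid,) = struct.unpack_from("<I", data, 48)
    # Directory entries are 128 bytes each; skip the root entry.
    entry = ((dir_secid + 1) << sector_shift) + 128
    # Name length in bytes, including the 2-byte terminator.
    (name_len,) = struct.unpack_from("<H", data, entry + 64)
    name_len = min(name_len, 64)
    raw = data[entry:entry + max(name_len - 2, 0)]
    return raw.decode("utf-16-le")
```

For the examples described here this would be expected to return the
name that appears as "Binder" in the debugging output.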
> 
> For my installation it was registered as Office.Binder.8. According
> to Wikipedia it is included with Microsoft Office 95, 97, and 2000.
> Binder files could be opened in Office versions until 2003.
> 
> After applying the above-mentioned modifications with the patch
> file-5.41-ole2compounddocs-obd.diff, all my inspected Binder
> examples are described in more detail. With the -e cdf option the
> output now looks like:
> 
> BINDER.OBD: OLE 2 Compound Document, v3.62,
> 	    SecID 0x4,
> 	    Mini FAT start sector 0x7 :
> 	    Microsoft Office Binder Document, Template or wizard
> REPORT.OBT: OLE 2 Compound Document, v3.62,
> 	    SecID 0x119, 3 FAT sectors,
> 	    Mini FAT start sector 0x122 :
> 	    Microsoft Office Binder Document, Template or wizard
> 
> I hope my diff file can be applied in a future version of the file
> utility.
> 
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek
Committed, thanks!

christos