[File] [PATCH] Magdir/ole2compounddocs Family Tree Maker *.ftw described only generic

Christos Zoulas christos at zoulas.com
Mon May 15 16:46:25 UTC 2023


Committed, thanks!

christos

> On May 10, 2023, at 7:41 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hello,
> 
> some days ago I sent a patch to handle some SQLite 3.x databases. This
> format is also used by some genealogy software. One such program is
> called Family Tree Maker. The file name suffixes for generated
> projects are FTW and FBK.
> When running file command version 5.44 and newer with the -e cdf
> option on such examples I get output like:
> 
> MY.FBK: OLE 2 Compound Document, v3.62, SecID 0x1,
> 	6 FAT sectors, Mini FAT start sector 0x2b9
> MY.ftw: OLE 2 Compound Document, v3.62, SecID 0x1,
> 	6 FAT sectors, Mini FAT start sector 0x2b9
> 
> Furthermore, only the generic mime type application/x-ole-storage or
> application/octet-stream is shown with -i. With option --extension
> only the 3-byte sequence ??? is shown.
> 
> When running the file command with -e soft or no extra option on the
> examples I get output like:
> 
> MY.FBK: Composite Document File V2 Document, Cannot read section info
> MY.ftw: Composite Document File V2 Document, Cannot read section info
> 
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It also identifies
> all examples, with low priority, as "Generic OLE2 / Multistream
> Compound" by docfile.trid.xml. The examples are described as "Family
> Tree Maker Family Tree" with FTW file name extension by ftw.trid.xml
> (see appended trid-v-ftw.txt.gz).
> 
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies the samples as "FamilyTree Maker Database" with additional
> version "1-4" by the fmt/1352 signature. Both suffixes, FTW and FBK,
> are considered valid here (see appended droid-ftw.csv.gz).
> 
> Luckily, with the information given by the other tools I also found a
> page about Family Tree Maker on the File Formats Archive Team web
> site. There it is written that the intermediate versions are based on
> the Microsoft Compound File format. An internal link leads to the
> page for that format. The CLSID is also mentioned there. This
> information is expressed by comment lines inside
> Magdir/ole2compounddocs like:
> # URL:		http://fileformats.archiveteam.org/wiki/Family_Tree_Maker
> #		https://en.wikipedia.org/wiki/Family_Tree_Maker
> # Reference:	http://mark0.net/download/triddefs_xml.7z/defs/f/ftw.trid.xml
> 
> The Family Tree Maker examples are recognized as "OLE 2 Compound
> Document" by the starting bytes (\320\317\021\340\241\261\032\341) at
> the beginning inside Magdir/ole2compounddocs. Obviously there exists
> no code fragment to do sub-class identification. But the examples are
> not described with the additional phrase "UNKNOWN". Furthermore the
> examples have a registered root storage object CLSID. That value is
> not shown as 0x57020000000000000000000000000000 or expressed in
> the standard notation {00000257-0000-0000-0000-000000000000} with
> curly braces.
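
To illustrate how the two notations of the CLSID above relate, here is
a minimal Python sketch. It is not part of the patch. It assumes the
16 CLSID bytes are stored in the usual mixed-endian GUID layout, which
is what Python's uuid.UUID(bytes_le=...) decodes; the function name
clsid_from_quads is made up for this example.

```python
import uuid


def clsid_from_quads(first: int, second: int) -> str:
    """Build the curly-brace CLSID from the two big-endian quad values
    that the magic tests read at offsets 80 and 88."""
    # Rebuild the 16 raw bytes exactly as they are stored on disk.
    raw = first.to_bytes(8, "big") + second.to_bytes(8, "big")
    # bytes_le decodes the first three GUID fields as little-endian.
    return "{" + str(uuid.UUID(bytes_le=raw)) + "}"


# Quad values of the Family Tree Maker samples described above.
print(clsid_from_quads(0x5702000000000000, 0x0))
```

The same conversion applied to the Visio quads quoted further down
(0x131a020000000000 and 0xc000000000000046) gives
{00021a13-0000-0000-c000-000000000046}.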
> 
> So there is a minor logic error in handling the sub classification.
> 
> In the first branch I look for samples without CLSID. That means the
> guid value is nil at offset 80. In older file command versions the
> "guid" test does not exist, so the test was realised by checking two
> quad values (that are the 16 bytes of the guid). Then in that branch
> I look for characteristic directory entry names. The first entry of
> that kind was Microstation V8 CAD and the last was for PageMaker. If
> no known sample is found in that branch then sub routine ole2-unknown
> is called by the default clause. That is realised by lines like:
>>> 88 	ubequad		0x0
>>>> 80 	ubequad		0x0
>>>>> 128 	lestring16	Dgn~	: Microstation V8 CAD
> ...
>>>>> 128 	lestring16	PageMaker		:
> ...
>>>>> 128 	default		x
>>>>>> 0 	use		ole2-unknown
> 
> In the other branch I look for known CLSID GUIDs. If such a guid is
> found then its sub classification is printed. The first entry in
> that branch was Microsoft Visio 2000-2002 and the last entry was
> Autodesk 3ds Max. If no known guid is found here then subroutine
> ole2-unknown is also called by the default clause. That is realised
> by lines like:
> 
>>> 88 	ubequad		0xc000000000000046
>>>> 80 	ubequad		0x131a020000000000	: Microsoft Visio
> ...
>>> 88 	ubequad		0x9fed04143144cc1e	: Autodesk
>>>> 80 	ubequad		0x7b8cdd1cc081a045	3ds Max
> ...
>>> 88 	default		x
>>>> 0 	use		ole2-unknown
> 
> For FTW examples the second part of the CLSID (8 bytes at offset 88)
> is 0, so the first test succeeds, but the first part (8 bytes at
> offset 80) is not 0, so the second test fails. I am trapped at this
> point and get no further message. So I insert a branch in between,
> which handles samples where the "second" part is 0 and the first part
> is not zero. If no match is found in that branch, sub routine
> ole2-unknown is called by the default clause. That is realised by
> additional lines like:
>>>> 80 	ubequad		!0x0
>>>>> 80 	default		x
>>>>>> 0 	use		ole2-unknown
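
The three-way decision described above can be sketched in Python as
follows. This is only an illustration of the branching logic, not how
file itself evaluates magic entries; the function name and the branch
labels are invented for this example.

```python
import struct


def clsid_branch(clsid: bytes) -> str:
    """Return which branch a 16-byte root storage CLSID would take."""
    if len(clsid) != 16:
        raise ValueError("a CLSID must be exactly 16 bytes")
    # Same view as the magic tests: two big-endian quads, the first at
    # offset 80 and the second at offset 88.
    first, second = struct.unpack(">QQ", clsid)
    if second == 0:
        if first == 0:
            # No CLSID at all: identify by directory entry names.
            return "nil-clsid"
        # New branch: second half zero, first half set (the FTW case).
        return "second-half-zero"
    # Remaining case: compare against the list of known full CLSIDs.
    return "full-clsid"


ftw = bytes.fromhex("57020000000000000000000000000000")
print(clsid_branch(ftw))
```

Before the patch the "second-half-zero" case had no handler, which is
why the FTW samples got neither a name nor the "UNKNOWN" phrase.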
> 
> After applying the above mentioned modifications my family tree
> examples are described with more details. With the -e cdf option
> this now looks like:
> MY.FBK: OLE 2 Compound Document, v3.62, SecID 0x1,
> 	6 FAT sectors, Mini FAT start sector 0x2b9
> 	: UNKNOWN, clsid 0x57020000000000000000000000000000
> 	{00000257-0000-0000-0000-000000000000} with names
> 	IND.DB AUX.DB GENERAL.DB NAME.NDX BIRTH.NDX EXTRA.DB
> MY.ftw: OLE 2 Compound Document, v3.62, SecID 0x1,
> 	6 FAT sectors, Mini FAT start sector 0x2b9
> 	: UNKNOWN, clsid 0x57020000000000000000000000000000
> 	{00000257-0000-0000-0000-000000000000} with names
> 	IND.DB AUX.DB GENERAL.DB NAME.NDX BIRTH.NDX EXTRA.DB
> 
> The first five directory entry names (IND.DB AUX.DB GENERAL.DB
> NAME.NDX BIRTH.NDX) are used by TrID as characteristic for FTW
> samples. Because a CLSID is present in the inspected samples, I use
> it to recognize such samples. So in the appropriate middle sub-class
> branch this is now expressed by additional lines like:
>>>>> 80 	ubequad	0x5702000000000000 : Family Tree Maker Windows database, version 1-4
> #>>>>>0	search/0x5460c/s	\
> 	F\0i\0l\0e\0\040\0F\0o\0r\0m\0a\0t\0\040\0(\0C\0)\0	\
> 				\b, version
> #>>>>>>&-8	ubyte x		%u
> #!:mime	application/x-ole-storage
> !:mime	application/x-fmt
> !:ext	ftw/fbk
> 
> For Family Tree Maker for Windows (with the OLE container) the file
> name suffix is FTW, whereas for the older Family Tree Maker for DOS
> the FTM suffix is used. According to the Family Tree Maker help, the
> program can create backups of FTW files. These samples get the same
> base name but the suffix FBK. I found no official or commonly used
> mime type, so I display a user-defined one.
> 
> According to documentation, in stream GENERAL.DB there exists a
> string "File Format (C) Copyright 1993 Banner Blue Software Inc. -
> All Rights Reserved". In my examples this string was stored as
> UTF-16 and not as ASCII. In byte 4 the version is stored. In my
> examples (created by version 2.0) I got hex value 0x02, but the
> relative jump instruction does not work. So I keep these efforts as
> comment lines and label this variant with the additional phrase
> "version 1-4" according to DROID.
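
As a small illustration of why the commented-out search pattern
contains a \0 after every character, the Python sketch below encodes
the start of that string as UTF-16 little-endian and renders the bytes
in the escape style used above. It assumes little-endian byte order,
as the pattern in the patch suggests; the helper to_magic_escapes is
hypothetical and only covers the escapes needed here.

```python
TEXT = "File Format (C)"

# UTF-16LE stores each of these characters as the ASCII byte followed
# by a zero byte.
utf16 = TEXT.encode("utf-16-le")


def to_magic_escapes(data: bytes) -> str:
    """Render bytes the way the search pattern above is written."""
    parts = []
    for byte in data:
        if byte == 0:
            parts.append("\\0")
        elif byte == 0x20:
            parts.append("\\040")  # space written as an octal escape
        else:
            parts.append(chr(byte))
    return "".join(parts)


print(to_magic_escapes(utf16))
# A plain ASCII search cannot match the UTF-16 form of the string.
print(b"File Format" in utf16)
```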
> 
> After applying the above mentioned modifications with patch
> file-ole2compounddocs-ftw.diff all my inspected family tree
> examples are now described. With the -e cdf option this now looks
> like:
> 
> MY.FBK: OLE 2 Compound Document, v3.62, SecID 0x1,
> 	6 FAT sectors, Mini FAT start sector 0x2b9
> 	: Family Tree Maker Windows database, version 1-4
> MY.ftw: OLE 2 Compound Document, v3.62, SecID 0x1,
> 	6 FAT sectors, Mini FAT start sector 0x2b9
> 	: Family Tree Maker Windows database, version 1-4
> 
> I hope my diff file can be applied in a future version of the file
> utility.
> 
> There exist some other family tree formats. Some are zip based and
> some are SQLite based. So probably these variants are not recognised
> by the current file command.
> 
> With best wishes,
> Jörg Jenderek
> - --
> Jörg Jenderek
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
> 
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCZFwrqwAKCRCv8rHJQhrU
> 1jXbAKDFVzBmzjvvZnexbYsnxdUtD5D24wCfbxTJsruZt9tP2U52xvaWwd/CShM=
> =lgNa
> -----END PGP SIGNATURE-----
> <trid-v-ftw.txt.gz><droid-ftw.csv.gz><file-ole2compounddocs-ftw_diff.DEFANGED-979><file-ole2compounddocs-ftw_diff_sig.DEFANGED-980>


