[File] [PATCH] Magdir/ole2compounddocs Family Tree Maker *.ftw described only generic

Jörg Jenderek joerg.jen.der.ek at gmx.net
Wed May 10 23:41:32 UTC 2023


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

a few days ago I sent a patch to handle some SQLite 3.x databases. That
format is also used by some genealogy software. One such program is
called Family Tree Maker. The file name suffixes for its generated
projects are FTW and FBK.
When running file command version 5.44 or newer with the -e cdf option
on such examples I get output like:

MY.FBK: OLE 2 Compound Document, v3.62, SecID 0x1,
	6 FAT sectors, Mini FAT start sector 0x2b9
MY.ftw: OLE 2 Compound Document, v3.62, SecID 0x1,
	6 FAT sectors, Mini FAT start sector 0x2b9

Furthermore, only the generic MIME type application/x-ole-storage or
application/octet-stream is shown with -i. With option --extension
only the 3-byte sequence ??? is shown.

When running the file command with -e soft or with no extra option on
the examples I get output like:

MY.FBK: Composite Document File V2 Document, Cannot read section info
MY.ftw: Composite Document File V2 Document, Cannot read section info

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). It also identifies
all examples with low priority as "Generic OLE2 / Multistream
Compound" by docfile.trid.xml. The examples are described as "Family
Tree Maker Family Tree" with the FTW file name extension by
ftw.trid.xml (see appended trid-v-ftw.txt.gz).

For comparison I also ran the file format identification
utility DROID (see https://sourceforge.net/projects/droid/). It
identifies the samples as "FamilyTree Maker Database" with additional
version "1-4" by the fmt/1352 signature. Both suffixes FTW and FBK are
considered valid here (see appended droid-ftw.csv.gz).

Luckily, with the information given by the other tools I also found a
page about Family Tree Maker on the File Formats Archive Team web site.
There it is written that the middle-aged versions are based on the
Microsoft Compound File format. Following the internal link leads to
the page about that format. The CLSID is also mentioned there. This
information is expressed by comment lines inside
Magdir/ole2compounddocs like:
# URL:		http://fileformats.archiveteam.org/wiki/
#		Family_Tree_Maker
#		https://en.wikipedia.org/wiki/Family_Tree_Maker
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/f/ftw.trid.xml

The family tree examples are recognized as "OLE 2 Compound Document"
by their starting bytes (\320\317\021\340\241\261\032\341), which are
tested at the beginning of Magdir/ole2compounddocs. Obviously there
exists no code fragment to do the sub class identification. But the
examples are also not described with the additional phrase "UNKNOWN".
Furthermore, the examples have a registered root storage object CLSID.
That value is not shown, neither as 0x57020000000000000000000000000000
nor in the standard notation {00000257-0000-0000-0000-000000000000}
with curly braces.
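
The relation between the two notations of the CLSID can be checked with a
small Python sketch. It assumes that the 16 CLSID bytes are stored in the
usual mixed-endian GUID order, which Python's uuid module calls bytes_le.
The helper name clsid_to_text is made up for this illustration and is not
part of the file utility.

```python
import uuid

# OLE2 signature, written with the same octal escapes as in the magic test.
OLE2_SIGNATURE = b"\320\317\021\340\241\261\032\341"


def clsid_to_text(raw: bytes) -> str:
    """Format 16 raw CLSID bytes in curly brace notation (hypothetical helper)."""
    if len(raw) != 16:
        raise ValueError("a CLSID has exactly 16 bytes")
    # bytes_le: first three fields little-endian, last 8 bytes as stored
    # (assumption about the on-disk byte order).
    return "{" + str(uuid.UUID(bytes_le=raw)) + "}"


raw_clsid = bytes.fromhex("57020000000000000000000000000000")
print(OLE2_SIGNATURE.hex())
print(clsid_to_text(raw_clsid))
```

The first print shows the signature in hex, the second the CLSID of the
family tree samples in the curly brace form.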

So there is a minor logic error in handling the sub classification.

In the first branch I look for samples without a CLSID. That means the
guid value is nil at offset 80. In older file command versions the
"guid" test does not exist, so the test was realised by checking two
quad values (that are the 16 bytes of the guid). Then in that branch I
look for characteristic directory entry names. The first entry of that
kind was Microstation V8 CAD and the last was for PageMaker. If no
known sample is found in that branch then sub routine ole2-unknown is
called by the default clause. That is realised by lines like:
 >>88 	ubequad		0x0
 >>>80 	ubequad		0x0
 >>>>128 	lestring16	Dgn~	: Microstation V8 CAD
 ...
 >>>>128 	lestring16	PageMaker		:
 ...
 >>>>128 	default		x
 >>>>>0 	use		ole2-unknown

In the other branch I look for a known CLSID GUID. If such a guid is
found then this sub classification is printed. The first entry in that
branch was Microsoft Visio 2000-2002 and the last entry was Autodesk
3ds Max. If no known guid is found here then sub routine ole2-unknown
is also called by the default clause. That is realised by lines like:

 >>88 	ubequad		0xc000000000000046
 >>>80 	ubequad		0x131a020000000000	: Microsoft Visio
 ...
 >>88 	ubequad		0x9fed04143144cc1e	: Autodesk
 >>>80 	ubequad		0x7b8cdd1cc081a045	3ds Max
 ...
 >>88 	default		x
 >>>0 	use		ole2-unknown

For the FTW examples the second part of the CLSID (8 bytes at offset
88) is 0. So the first test (offset 88 is 0) succeeds, but the second
test (offset 80 is 0) does not. Because the test at offset 88 matched,
the default clause on that level is not used either. So I am trapped
at this point and get no further message. So I insert a branch in
between which handles samples where the "second" part is 0 and the
first part is not zero. If no match is found in that branch then sub
routine ole2-unknown is called by the default clause. That is realised
by additional lines like:
 >>>80 	ubequad		!0x0
 >>>>80 	default		x
 >>>>>0 	use		ole2-unknown
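
The resulting three-way decision can be sketched in Python. This is only
an illustration of the logic, not how the file utility works internally.
It assumes that offsets 80 and 88 are counted from the start of the root
directory entry, as in the magic lines above, and that ubequad means an
unsigned big-endian 64-bit value. The function name clsid_branch and the
branch labels are made up.

```python
import struct

FTW_FIRST_HALF = 0x5702000000000000  # value tested at offset 80 for FTW/FBK


def clsid_branch(entry: bytes) -> str:
    """Return which branch a root directory entry falls into (hypothetical)."""
    if len(entry) < 96:
        raise ValueError("need at least 96 bytes of the directory entry")
    # Two big-endian quads: "first" part at offset 80, "second" at offset 88.
    first, second = struct.unpack_from(">QQ", entry, 80)
    if second == 0:
        if first == 0:
            return "null clsid: look at directory entry names"
        if first == FTW_FIRST_HALF:
            return "Family Tree Maker"
        return "second part zero: ole2-unknown"
    return "full clsid: look for known GUID"


ftw_entry = bytes(80) + bytes.fromhex("57020000000000000000000000000000") + bytes(32)
print(clsid_branch(ftw_entry))
```

Before the patch the middle case did not exist, so an entry like
ftw_entry matched the outer test but none of the inner ones.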

After applying the above mentioned modifications my family tree
examples are described with more details. With the -e cdf option this
now looks like:
MY.FBK: OLE 2 Compound Document, v3.62, SecID 0x1,
	6 FAT sectors, Mini FAT start sector 0x2b9
	: UNKNOWN, clsid 0x57020000000000000000000000000000
	{00000257-0000-0000-0000-000000000000} with names
	IND.DB AUX.DB GENERAL.DB NAME.NDX BIRTH.NDX EXTRA.DB
MY.ftw: OLE 2 Compound Document, v3.62, SecID 0x1,
	6 FAT sectors, Mini FAT start sector 0x2b9
	: UNKNOWN, clsid 0x57020000000000000000000000000000
	{00000257-0000-0000-0000-000000000000} with names
	IND.DB AUX.DB GENERAL.DB NAME.NDX BIRTH.NDX EXTRA.DB

The first five directory entry names (IND.DB AUX.DB GENERAL.DB
NAME.NDX BIRTH.NDX) are used by TrID as characteristic for FTW
samples. Because the inspected samples have a CLSID, I use that
to recognize such samples. So in the new middle sub class
branch this is now expressed by additional lines like:
>>>>80 	ubequad		0x5702000000000000	: Family Tree Maker Windows database, version 1-4
#>>>>>0	search/0x5460c/s	F\0i\0l\0e\0\040\0F\0o\0r\0m\0a\0t\0\040\0(\0C\0)\0	\b, version
#>>>>>>&-8	ubyte x		%u
#!:mime	application/x-ole-storage
!:mime	application/x-fmt
!:ext	ftw/fbk

For Family Tree Maker for Windows (with the OLE container) the file
name suffix is FTW, whereas for the older Family Tree Maker for DOS the
FTM suffix is used. According to the Family Tree Maker help the program
can create backups of FTW files. These samples get the same main name
but the suffix FBK. I found no official or often used MIME type. So I
display a user defined one.

According to the documentation, stream GENERAL.DB contains the string
"File Format (C) Copyright 1993 Banner Blue Software Inc. - All Rights
Reserved". In my examples this string is stored as UTF-16 and not as
ASCII. In byte 4 the version is stored. In my examples (created by
version 2.0) I got hex value 0x02, but the relative jump
instruction does not work. So I keep these efforts as comment lines
and label this variant with the additional phrase "version 1-4"
according to DROID.
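
Why the commented search pattern has a \0 after every character can be
seen with a short Python sketch. It assumes the string is stored as
little-endian UTF-16, which is what the pattern implies. The function name
find_marker is made up, and the range handling only approximates
search/0x5460c. The position of the version byte is deliberately not
modelled here.

```python
# Start of the copyright string, encoded as little-endian UTF-16 (assumption).
MARKER = "File Format (C)".encode("utf-16-le")


def find_marker(data: bytes, limit: int = 0x5460C) -> int:
    """Return the offset of MARKER within the first limit bytes, or -1."""
    # Hypothetical helper; roughly mimics the search range of the magic line.
    return data.find(MARKER, 0, limit)


sample = bytes(10) + MARKER + " Copyright 1993".encode("utf-16-le")
print(MARKER)
print(find_marker(sample))
```

An ASCII search for "File Format (C)" would not match such data, because
every second byte is a null byte.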

After applying the above mentioned modifications by patch
file-ole2compounddocs-ftw.diff all my inspected family tree
examples are now described. With the -e cdf option this now looks like:

MY.FBK: OLE 2 Compound Document, v3.62, SecID 0x1,
	6 FAT sectors, Mini FAT start sector 0x2b9
	: Family Tree Maker Windows database, version 1-4
MY.ftw: OLE 2 Compound Document, v3.62, SecID 0x1,
	6 FAT sectors, Mini FAT start sector 0x2b9
	: Family Tree Maker Windows database, version 1-4

I hope my diff file can be applied in a future version of the file
utility.

There exist some other family tree formats. Some are zip based and
some are SQLite based. So probably these variants are not recognised
by the current file command.

With best wishes,
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCZFwrqwAKCRCv8rHJQhrU
1jXbAKDFVzBmzjvvZnexbYsnxdUtD5D24wCfbxTJsruZt9tP2U52xvaWwd/CShM=
=lgNa
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-ftw.txt.gz
Type: application/x-gzip
Size: 527 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230511/ddb0a2f4/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: droid-ftw.csv.gz
Type: application/x-gzip
Size: 288 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230511/ddb0a2f4/attachment-0001.bin>
-------------- next part --------------
--- file-master/magic/Magdir/ole2compounddocs.old	2023-05-10 17:45:26.691921700 +0200
+++ file-master/magic/Magdir/ole2compounddocs	2023-05-11 01:22:06.083277100 +0200
@@ -319,12 +319,38 @@
 #>>>>>>&0	use		PageMaker
 # THIS WORKS PARTLY!
 >>>>>>&0	indirect	x
 #	remaining null clsid
 >>>>128 	default		x
 >>>>>0 	use		ole2-unknown
+# look for CLSID where "second" part is 0
+>>>80 	ubequad		!0x0
+#
+# Summary:	Family Tree Maker
+# From:		Joerg Jenderek
+# URL:		http://fileformats.archiveteam.org/wiki/Family_Tree_Maker
+#		https://en.wikipedia.org/wiki/Family_Tree_Maker
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/f/ftw.trid.xml
+# Note		called "Family Tree Maker Family Tree" by TrID and
+#		"FamilyTree Maker Database" with version "1-4" by DROID via PUID fmt/1352
+#		tested only with version 2.0
+#		verified by Michal Mutl Structured Storage Viewer `SSView.exe my.ftw`
+#		newer versions are SQLite based and handled by ./sql
+# directory names like: IND.DB AUX.DB GENERAL.DB NAME.NDX BIRTH.NDX EXTRA.DB
+>>>>80 	ubequad		0x5702000000000000	: Family Tree Maker Windows database, version 1-4
+# look for "File Format (C) Copyright 1993 Banner Blue Software Inc. - All Rights Reserved" in GENERAL.DB
+#>>>>>0	search/0x5460c/s	F\0i\0l\0e\0\040\0F\0o\0r\0m\0a\0t\0\040\0(\0C\0)\0	\b, VERSION
+# GRR: jump to version value like 2 does not work!
+#>>>>>>&-8	ubyte		x							%u
+#!:mime	application/x-ole-storage
+!:mime	application/x-fmt
+# FBK is used for backup of FTW
+!:ext	ftw/fbk
+#
+>>>>80 	default		x
+>>>>>0 	use		ole2-unknown
 #	look for known clsid GUID
 # - Visio documents
 # URL:	http://fileformats.archiveteam.org/wiki/Visio
 #   Last update on 10/23/2006 by Lester Hightower, 07/20/2019 by Joerg Jenderek
 >>88 	ubequad		0xc000000000000046
 >>>80 	ubequad		0x131a020000000000	: Microsoft Visio 2000-2002 Document, stencil or template
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-ole2compounddocs-ftw.diff.sig
Type: application/octet-stream
Size: 1222 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230511/ddb0a2f4/attachment.obj>

