[File] [PATCH] Magdir/frame FrameMaker Book; mime type and extension

Jörg Jenderek (GMX) joerg.jen.der.ek at gmx.net
Sat Dec 9 22:24:06 UTC 2023


Hello,

a few days ago I had to handle some old software samples from Adobe
FrameMaker. One of the file types is the Book document.

So I looked for more such files. When running the file command
version 5.45 on these samples I get output like:

CLIPART.BK:      FrameMaker Book file Y)
CRC.BK:          data
CUSTM.BK:        FrameMaker Book file Y)
FIELD.BK:        data
RADIO.BK:        FrameMaker Book file Y)
SampleBook.book: FrameMaker Book file 0)
qrgfm.book:      data
tut.book:        data

With the --extension option only the 3-byte sequence ??? is shown, and
with the -i option application/x-mif is shown.

For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). Most of the samples
recognized by the file command are described there as "FrameMaker book"
by book-fm.trid.xml, with file name suffix BOOK but without a MIME type.
All of these samples are also described, with lower priority, as "Maker
Interchange Format Book" with MIME type application/vnd.mif and suffix
MIF by mif-book.trid.xml. SampleBook.book is described with highest
priority as "Adobe Extensible Metadata Platform" with suffix XMP by
xmp.trid.xml, and with lowest priority as "HyperText Markup Language"
with suffix HTML by html.trid.xml. Some samples not detected by the file
command (like CRC.BK FIELD.BK) are also described there as "Maker
Interchange Format Book". With the newest database all samples are now
recognized and described as "FrameMaker book (binary)" with MIME type
application/vnd.framemaker and 2 suffixes (.BK/BOOK) by bk-fm.trid.xml
(see appended trid-v-book.txt.gz).

For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). It identifies
none of the samples.

On Linux, according to the shared MIME-info database, such samples are
called "Adobe FrameMaker document". Here application/vnd.framemaker is
also used as the MIME type, and file name suffix fm is shown. The
samples are recognized there just by looking for the 5-byte sequence
<Book at the beginning. That information can be seen in the source
freedesktop.org.xml.in, found for example on gitlab.freedesktop.org.

This information is now expressed by comment lines inside Magdir/frame
like:
# URL:		http://fileformats.archiveteam.org/wiki/FrameMaker
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/b/bk-fm.trid.xml
# 		defs/b/book-fm.trid.xml

Until now the description was produced by lines inside Magdir/frame like:
  0	string		\<BookFile	FrameMaker Book file
  !:mime	application/x-mif
  >10	string		3.0		 (3.0
  >10	string		2.0		 (2.0
  >10	string		1.0		 (1.0
  >13	byte		x		  %c)

As a check, you can look at the first line of the samples with a
command like:
	head -1 *.bk *.book

So we see (appended head-1.txt.gz) that for the correctly described
samples (like CLIPART.BK RADIO.BK) BookFile is stored at offset 1. For
samples described only as MIF (like CRC.BK FIELD.BK) Bookfile (just
capitalised) is stored at offset 1.
For samples described as unknown (like qrgfm.book tut.book) BOOKFILE
(upper-case variant) is stored at offset 1.
All of these descriptions are triggered by the less-than sign stored at
the beginning. That is why one sample, SampleBook.book, is described as
HTML by TrID.
At first I had only 2 samples, so I thought these were accidents, but in
the end I had 8 samples.
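
One way to cover all three spellings with a single test is the /c flag
of the string type: according to magic(5), lower case characters in the
magic then match both cases in the target, while upper case characters
must match exactly. The following Python sketch only illustrates that
rule. The function name match_string_c and the sample headers are
assumptions, not taken from file(1) or from the real samples.

```python
def match_string_c(magic: bytes, data: bytes) -> bool:
    """Illustrate the /c rule: lower case magic bytes match either case."""
    if len(data) < len(magic):
        return False
    for m, d in zip(magic, data):
        if ord("a") <= m <= ord("z"):
            # lower case in the magic: accept the same letter in either case
            if d != m and d != m - 32:
                return False
        elif d != m:
            # everything else (including upper case) must match exactly
            return False
    return True


if __name__ == "__main__":
    # Hypothetical headers showing the three observed spellings
    for hdr in (b"<BookFile 5.0Y>", b"<Bookfile 3.0F>", b"<BOOKFILE 4.0K>"):
        print(hdr, match_string_c(b"<Bookfile", hdr))
```

A pattern written as <Bookfile therefore accepts BookFile, Bookfile and
BOOKFILE, but not a header starting with a lower case b.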

Apparently a version string (like 3.0F 4.0K 5.0Y 10.0) is stored at
offset 10. The file command does not detect these versions because it
checks for only 3 versions (1.0 2.0 3.0). Older TrID checks for a major
version digit followed by a point character and minor version digit 0.
So SampleBook.book with version 10.0 was missed.
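
The two older checks can be compared in a small Python sketch. The
regular expression is only my reading of the TrID behaviour described
above, not taken from the TrID definition itself. The names
version_token, OLD_FILE_VERSIONS and OLDER_TRID_LIKE are hypothetical,
and the headers are reconstructed from the offsets mentioned in this
mail.

```python
import re
from typing import Optional

# The three versions the old file(1) entry tests for
OLD_FILE_VERSIONS = ("1.0", "2.0", "3.0")
# Assumed shape of the older TrID check: one digit, a point, then 0
OLDER_TRID_LIKE = re.compile(r"[0-9]\.0")


def version_token(header: bytes) -> Optional[str]:
    """Return the text between offset 10 and the closing greater-than sign."""
    end = header.find(b">", 10)
    if end == -1:
        return None
    return header[10:end].decode("ascii", errors="replace")


if __name__ == "__main__":
    # Hypothetical headers, not copied from the real samples
    for hdr in (b"<Bookfile 3.0F>", b"<BOOKFILE 4.0K>", b"<BookFile 10.0>"):
        token = version_token(hdr)
        print(token,
              token[:3] in OLD_FILE_VERSIONS,
              OLDER_TRID_LIKE.match(token) is not None)
```

For 10.0 the second character is a digit instead of a point, so a
pattern of this shape cannot match.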

So the entry now becomes:
  0	string/c	\<Bookfile	FrameMaker Book file
  !:mime	application/vnd.framemaker
  !:ext	bk/book
  >10	string		x		(%-0.3s
  >13	ubyte		=0x3e		\b)
  >13	ubyte		<0x3A		\b%c)
  >13	ubyte		>0x3A		%c)
In most cases we get a 3-byte version string, which is terminated by a
greater-than sign. If the fourth character is a digit, then assume and
print a 4-byte version string. If the fourth character is a letter, then
assume and print a 3-byte version string with the sub level appended.
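
The intended version handling can be sketched in Python as follows. This
follows the prose description with mutually exclusive branches; it is
not a literal translation of the magic tests and not how file(1) is
implemented. The function name describe_new and the header bytes are
assumptions.

```python
from typing import Optional

MAGIC = b"<Bookfile"


def describe_new(header: bytes) -> Optional[str]:
    """Sketch of the intended output; return None when the magic fails."""
    # string/c: lower case magic bytes match either case, others exactly
    if len(header) < len(MAGIC) or not all(
        d == m or (97 <= m <= 122 and d == m - 32)
        for m, d in zip(MAGIC, header)
    ):
        return None
    # first 3 characters of the version string at offset 10
    out = "FrameMaker Book file (" + header[10:13].decode("latin-1")
    if len(header) > 13:
        fourth = header[13]
        if fourth == 0x3E:        # greater-than sign: exact 3-byte version
            out += ")"
        elif fourth < 0x3A:       # digit: 4-byte version such as 10.0
            out += chr(fourth) + ")"
        else:                     # letter: sub level after the version
            out += " " + chr(fourth) + ")"
    return out


if __name__ == "__main__":
    # Hypothetical headers, not copied from the real samples
    for hdr in (b"<BookFile 5.0Y>", b"<BOOKFILE 4.0K>", b"<BookFile 10.0>"):
        print(describe_new(hdr))
```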

After applying the above-mentioned modifications with the patch
file-5.45-frame-book.diff, all my inspected samples are described and
the correct version information is also shown. The output now looks
like:

CLIPART.BK:      FrameMaker Book file (5.0 Y)
CRC.BK:          FrameMaker Book file (3.0 F)
CUSTM.BK:        FrameMaker Book file (5.0 Y)
FIELD.BK:        FrameMaker Book file (3.0 F)
RADIO.BK:        FrameMaker Book file (5.0 Y)
SampleBook.book: FrameMaker Book file (10.0)
qrgfm.book:      FrameMaker Book file (4.0 K)
tut.book:        FrameMaker Book file (4.0 K)

I hope my diff file can be applied in a future version of the file
utility.

With best wishes,
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-book.txt.gz
Type: application/x-gzip
Size: 824 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231209/84b4227d/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: head-1.txt.gz
Type: application/x-gzip
Size: 153 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231209/84b4227d/attachment-0001.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/frame.old	2021-02-23 01:49:24.000000000 +0100
+++ file-5.45/magic/Magdir/frame	2023-12-09 23:08:05.631538600 +0100
@@ -48,8 +48,27 @@
 !:mime	application/x-mif
-0	string		\<BookFile	FrameMaker Book file
-!:mime	application/x-mif
->10	string		3.0		 (3.0
->10	string		2.0		 (2.0
->10	string		1.0		 (1.0
->13	byte		x		  %c)
+# URL:		http://fileformats.archiveteam.org/wiki/FrameMaker
+# Reference:	http://mark0.net/download/triddefs_xml.7z
+# 		defs/b/book-fm.trid.xml
+# 		defs/b/bk-fm.trid.xml
+# Update:	Joerg Jenderek 2023 Dez
+# Note:		called "FrameMaker book (binary)" by TrID and
+#		"Adobe FrameMaker document" by shared MIME-info database from freedesktop.org
+# look for BookFile, Bookfile (capitalized) or BOOKFILE (upcased) directive
+0	string/c	\<Bookfile	FrameMaker Book file
+#!:mime	application/octet-stream
+#!:mime	application/x-mif
+!:mime	application/vnd.framemaker
+# http://extension.nirsoft.net/book
+!:ext	bk/book
+# version like: 1.0 2.0 3.0 4.0 5.0 5.5 6.0 7.0 8.0 10.0
+# 3 characters of version number string
+>10	string		x		(%-0.3s
+# if greater sign then exact 3 byte version string
+>13	ubyte		=0x3e		\b)
+# if digit then 4 byte version string
+>13	ubyte		<0x3A		\b%c)
+# if letter then this is appended sub level after 3 byte version string
+>13	ubyte		>0x3A		%c)
+# first directive typically is followed by one space character
+>9	ubyte		!0x20		\b, no space before version
 # XXX - this book entry should be verified, if you find one, uncomment this
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-frame-book.diff.sig
Type: application/octet-stream
Size: 955 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231209/84b4227d/attachment.obj>

