[File] [PATCH] Magdir/frame FrameMaker Book; mime type and extension
Christos Zoulas
christos at zoulas.com
Sun Dec 10 15:31:09 UTC 2023
Committed, thanks!
christos
> On Dec 9, 2023, at 5:24 PM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
>
> Hello,
>
> some days ago I had to handle some old software samples from Adobe
> FrameMaker. One of the file types is Book documents.
>
> So I looked for more such files. When running the file command
> version 5.45 on such samples I get output like:
>
> CLIPART.BK: FrameMaker Book file Y)
> CRC.BK: data
> CUSTM.BK: FrameMaker Book file Y)
> FIELD.BK: data
> RADIO.BK: FrameMaker Book file Y)
> SampleBook.book: FrameMaker Book file 0)
> qrgfm.book: data
> tut.book: data
>
> With the --extension option only the 3-byte sequence ??? is shown, and
> with the -i option application/x-mif is shown.
>
> For comparison I ran the file format identification utility TrID
> (see https://mark0.net/soft-trid-e.html). Most of the samples
> recognized by the file command are described there as "FrameMaker
> book" by book-fm.trid.xml, without a mime type and with file name
> suffix BOOK. All of these samples are also described, with lower
> priority, as "Maker Interchange Format Book" with mime type
> application/vnd.mif and suffix MIF by mif-book.trid.xml.
> SampleBook.book is described with highest priority as "Adobe
> Extensible Metadata Platform" with suffix XMP by xmp.trid.xml, and
> with lowest priority as "HyperText Markup Language" with suffix HTML
> by html.trid.xml. Some examples (like CRC.BK FIELD.BK) that are not
> detected by the file command are also described there as "Maker
> Interchange Format Book". With the newest database all samples are
> now recognized and described as "FrameMaker book (binary)" with mime
> type application/vnd.framemaker and 2 suffixes (.BK/.BOOK) by
> bk-fm.trid.xml (see appended trid-v-book.txt.gz).
>
> For comparison I also ran the file format identification utility
> DROID (see https://sourceforge.net/projects/droid/). It identifies
> none of the examples.
>
> On Linux, according to the shared MIME-info database, such samples are
> called "Adobe FrameMaker document". There too
> application/vnd.framemaker is used as the mime type, and file name
> suffix fm is shown. The samples are recognized there just by looking
> for the 5-byte sequence <Book at the beginning. That information can
> be seen in the source freedesktop.org.xml.in, found for example on
> gitlab.freedesktop.org.
>
> This information is now expressed by comment lines inside Magdir/frame
> like:
> # URL: http://fileformats.archiveteam.org/wiki/FrameMaker
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # defs/b/bk-fm.trid.xml
> # defs/b/book-fm.trid.xml
>
> The description is currently done by lines inside Magdir/frame like:
> 0 string \<BookFile FrameMaker Book file
> !:mime application/x-mif
> >10 string 3.0 (3.0
> >10 string 2.0 (2.0
> >10 string 1.0 (1.0
> >13 byte x %c)
>
> As a check, you can look at the first line of the samples with a
> command like:
> head -1 *.bk *.book
>
> So we see (appended head-1.txt.gz) that for the correctly described
> samples (like CLIPART.BK RADIO.BK) BookFile is stored at offset 1. For
> samples described only as MIF (like CRC.BK FIELD.BK) Bookfile (only
> the first letter capitalised) is stored at offset 1.
> For samples described as unknown (like qrgfm.book tut.book) BOOKFILE
> (the upper-case variant) is stored at offset 1.
> All of these descriptions arise because a less-than sign is stored at
> the beginning. That is why one sample, SampleBook.book, is described
> as HTML by TrID.
> At first I had only 2 samples, so I thought these were accidents, but
> in the end I had 8 samples.
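
The three spellings above differ only in letter case. As a rough sketch, the comparison rule that magic(5) documents for the "c" string flag (lower-case letters in the pattern match either case, everything else must match exactly) can be written as below. The function name match_c and the sample header bytes are assumptions made up for illustration; the real headers are only known from the offsets described in the mail.

```python
def match_c(pattern: bytes, data: bytes) -> bool:
    """Compare like a magic(5) 'string/c' test anchored at offset 0.

    Lower-case ASCII letters in the pattern match both cases in the data;
    every other pattern byte must match exactly.
    """
    if len(data) < len(pattern):
        return False
    for p, d in zip(pattern, data):
        if ord("a") <= p <= ord("z"):
            # Fold the data byte to lower case before comparing.
            if bytes([d]).lower()[0] != p:
                return False
        elif p != d:
            return False
    return True


# Hypothetical first bytes for the three spellings (assumed layout).
PATTERN = b"<Bookfile"
SAMPLES = [b"<BookFile 5.0Y>", b"<Bookfile 3.0F>", b"<BOOKFILE 4.0K>"]
RESULTS = [match_c(PATTERN, s) for s in SAMPLES]
```

Under this rule the capital B in the pattern still has to be a capital B in the file, which holds for all three spellings seen in the samples.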
>
> Apparently a version string (like 3.0F 4.0K 5.0Y 10.0) is stored at
> offset 10. The file command does not detect these versions because it
> checks for only 3 versions (1.0 2.0 3.0). Older TrID definitions check
> for a major version digit followed by a point character and the minor
> version digit 0, so SampleBook.book with version 10.0 was missed.
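
To illustrate why file 5.45 printed only fragments such as "Y)" and "0)", here is a small sketch that mimics the old Magdir/frame lines quoted earlier: a case-sensitive "<BookFile" test, three literal version tests at offset 10, and an unconditional print of the byte at offset 13. The function name old_describe and the header bytes are assumptions for illustration (the byte at offset 9 is assumed to be a space); this is not the file implementation itself.

```python
def old_describe(header: bytes):
    """Approximate the file 5.45 description for a FrameMaker Book header."""
    # 0 string \<BookFile : exact, case-sensitive match
    if len(header) < 14 or not header.startswith(b"<BookFile"):
        return None  # file would fall through and report "data"
    parts = ["FrameMaker Book file"]
    # >10 string 3.0 / 2.0 / 1.0 : only these literal versions are known
    for version in (b"3.0", b"2.0", b"1.0"):
        if header[10:13] == version:
            parts.append("(" + version.decode("ascii"))
    # >13 byte x %c) : always prints the byte at offset 13
    parts.append(chr(header[13]) + ")")
    return " ".join(parts)


OLD_5_0Y = old_describe(b"<BookFile 5.0Y>")
OLD_10_0 = old_describe(b"<BookFile 10.0>")
OLD_LOWER = old_describe(b"<Bookfile 3.0F>")
```

With these assumed headers the sketch yields the same "FrameMaker Book file Y)" and "FrameMaker Book file 0)" fragments as in the output listing at the top of the mail, and nothing at all for the Bookfile spelling.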
>
> So this now becomes:
> 0 string/c \<Bookfile FrameMaker Book file
> !:mime application/vnd.framemaker
> !:ext bk/book
> >10 string x (%-0.3s
> >13 ubyte =0x3e \b)
> >13 ubyte <0x3A \b%c)
> >13 ubyte >0x3A %c)
> In most cases we get a 3-byte version string, which is terminated by
> a greater-than sign. If the fourth character is a digit, then assume
> and print a 4-byte version string. If the fourth character is a
> letter, then assume and print a 3-byte version string with an
> appended sub level.
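
The version rule described above can be sketched in Python as follows. This is only an illustration of the intended behaviour, not of how file evaluates its magic lines: the branches are made mutually exclusive, the header check is a simplified, fully case-insensitive comparison, and the name describe_book and the header bytes are assumptions (the byte at offset 9 is assumed to be a space).

```python
def describe_book(header: bytes):
    """Return a description in the style of the proposed magic lines, or None."""
    if len(header) < 14 or header[:9].lower() != b"<bookfile":
        return None
    version = header[10:13].decode("ascii", "replace")  # 3 bytes at offset 10
    fourth = header[13]                                  # byte after the version
    if fourth == 0x3E:        # '>' terminates a plain 3-byte version
        suffix = ")"
    elif fourth < 0x3A:       # below ':' is treated as a digit: 4-byte version
        suffix = chr(fourth) + ")"
    elif fourth > 0x3A:       # above ':' is treated as a sub-level letter
        suffix = " " + chr(fourth) + ")"
    else:                     # ':' itself matches none of the tests
        suffix = ""
    return "FrameMaker Book file (" + version + suffix


NEW_5_0Y = describe_book(b"<BookFile 5.0Y>")
NEW_3_0F = describe_book(b"<Bookfile 3.0F>")
NEW_4_0K = describe_book(b"<BOOKFILE 4.0K>")
NEW_10_0 = describe_book(b"<BookFile 10.0>")
NEW_3_0 = describe_book(b"<BookFile 3.0>")
```

For the assumed headers this gives strings of the form "(5.0 Y)", "(3.0 F)", "(4.0 K)" and "(10.0)", in line with the output listing that follows.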
>
> After applying the above modifications with the patch
> file-5.45-frame-book.diff, all my inspected samples are described,
> and the correct version information is also shown. This now looks
> like:
>
> CLIPART.BK: FrameMaker Book file (5.0 Y)
> CRC.BK: FrameMaker Book file (3.0 F)
> CUSTM.BK: FrameMaker Book file (5.0 Y)
> FIELD.BK: FrameMaker Book file (3.0 F)
> RADIO.BK: FrameMaker Book file (5.0 Y)
> SampleBook.book: FrameMaker Book file (10.0)
> qrgfm.book: FrameMaker Book file (4.0 K)
> tut.book: FrameMaker Book file (4.0 K)
>
> I hope my diff file can be applied in a future version of the file
> utility.
>
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek
> <trid-v-book.txt.gz><head-1.txt.gz><file-5_45-frame-book_diff.DEFANGED-9169><file-5_45-frame-book_diff_sig.DEFANGED-9170>--
> File mailing list
> File at astron.com
> https://mailman.astron.com/mailman/listinfo/file
> <sanitizer.log>