[File] [PATCH] of Magdir/fonts for GEM GDOS font; update
Christos Zoulas
christos at zoulas.com
Tue Jul 16 11:13:21 UTC 2019
Hi,
I've committed the changes, but I've left them commented out for now. Perhaps we
can add them again with a negative strength, so that they don't interfere with
other magic? Or should there be magic entries with negative strength that are
only considered if a flag is specified on the command line?
christos
> On Jul 9, 2019, at 9:47 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> Hello,
> some weeks ago I sent a patch for file version 5.36 to recognize
> GEM GDOS fonts with the file extension fnt or gft.
>
> Unfortunately the test lines were not specific enough, so some other
> files were misidentified as "GEM GDOS font" by Magdir/fonts. For such
> bad examples and some extreme font examples (*.FNT) I got output like:
>
> CAROUSEL.DOC: GEM GDOS font _@\011\004 45, ID 0xa5db,
> lightening mask 0x0, skewing mask 0x0
> Microsoft WinWord 2.0 Document
> cl8m8ocofedso.testfile: Audio file with ID3 version 2.4.0, contains:
> GEM GDOS font 25776, ID 0xfbff
> crmanual.doc: GEM GDOS font L \002 30, ID 0xa59b,
> lightening mask 0x0, skewing mask 0x0
> DOC20A.DOC: GEM GDOS font !@\011\004 45, ID 0xa5db,
> lightening mask 0x0, skewing mask 0x0
> Microsoft WinWord 2.0 Document
> H1CELT72.FNT: GEM GDOS font Celtic #s 72, ID 0x00ca
> HyperMover: GEM GDOS font
> STAK\377\377\377\377\357\260\362 10, ID 0x0000,
> lightening mask 0x0, skewing mask 0x0
> oem.hlp: MS Windows 3.1 help,
> Mon May 01 20:47:30 1995, 6033 bytes
> GEM GDOS font C 3, ID 0x5f3f,
> lightening mask 0x0, skewing mask 0x0
> PRICELIS.DOC: GEM GDOS font Z@\011\004 45, ID 0xa5db,
> lightening mask 0x0, skewing mask 0x0
> Microsoft WinWord 2.0 Document
> TECHREF.DOC: DOS 2.0-3.2 backed up file \TECHREF.DOC;
> GEM GDOS font \220 \002 33, ID 0xa59b,
> lightening mask 0x0, skewing mask 0x0
> TEMPLATE.DOC: GEM GDOS font p@\011\004 45, ID 0xa5db,
> lightening mask 0x0, skewing mask 0x0
> Microsoft WinWord 2.0 Document
> winword2.doc: GEM GDOS font 1@\011\004 45, ID 0xa5db,
> lightening mask 0x0, skewing mask 0x0
> Microsoft WinWord 2.0 Document
> WYEE24HI.FNT: GEM GDOS font WYE 24, ID 0x0073
>
> Unfortunately the existing tests are not unique enough, but this is
> not a problem, because the identifying part and the displaying part
> are separate. So I added additional test lines for such examples.
>
> Furthermore, the specification for GEM fonts defines no fixed values
> to test against. The font name is shown by the line
>> 4 string x %.32s
> I often found common font names like Century-Schoolbook-Normal,
> Courier and ding bats, which are also used in other font types.
> The names consist of "long" words meant to be recognized by human
> readers and are made up mainly of Latin letters, but sometimes a name
> contains special printable characters, as in "LC-S. Clay Wilson",
> "Celtic #s", "Big&Tall" and "Hollywood ** DEMO VERSION **".
> The shortest font name I found was the 3-byte string WYE. For many bad
> examples the interpreted font name contains low control characters.
> So the test line used so far for a valid font name was too permissive:
>>>> 4 ubeshort >0x1F00
> It is made stricter by the line
>>>> 4 ulelong >0x001F1f1F
> Now bad samples like oem.hlp are skipped.
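
To illustrate why the second test is stricter, here is a minimal Python sketch of the two comparisons. It assumes only what the magic lines state: the old test reads a big-endian 16-bit value at offset 4, the new one a little-endian 32-bit value at the same offset. The helper make_header and its sample names are hypothetical and only build test data; the one-character name followed by nul bytes is an invented input similar in shape to the "C" shown for oem.hlp, not that file's actual bytes.

```python
import struct


def old_name_test(header: bytes) -> bool:
    """Mimic the magic line `4 ubeshort >0x1F00` (first two name bytes)."""
    (value,) = struct.unpack_from(">H", header, 4)
    return value > 0x1F00


def new_name_test(header: bytes) -> bool:
    """Mimic the magic line `4 ulelong >0x001F1f1F` (first four name bytes)."""
    (value,) = struct.unpack_from("<I", header, 4)
    return value > 0x001F1F1F


def make_header(name: bytes) -> bytes:
    """Hypothetical helper: 4 leading bytes, then a nul-padded 32-byte name."""
    return b"\x00" * 4 + name.ljust(32, b"\x00")


if __name__ == "__main__":
    for name in (b"C", b"WYE", b"Celtic #s"):
        hdr = make_header(name)
        print(name, old_name_test(hdr), new_name_test(hdr))
```

Because the 32-bit value is little-endian, the third and fourth name bytes carry the most weight in the comparison. A three-letter name like WYE still passes, while a single printable character followed by nul bytes passes only the old test.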
>
> The face size in points is shown by the line
>> 2 uleshort x %u
> Typical values are 12, 18, 24 and 36, which are known from other font
> types. Theoretically 65535 can appear, but such large font sizes are
> unrealistic. So I first tested against the highest value I had found,
> 48, as in the KLINGON font H1KLIN48.FNT, by the line
>>> 2 uleshort <49
> Unfortunately this test was too strict, because I then found a font
> with size 72, the Celtic font H1CELT72.FNT. So the relaxed test line
> now becomes
>>> 2 uleshort <73
> The audio file cl8m8ocofedso.testfile was interpreted as the GEM font
> variant with 5555h mask values and the high font size 25776. So I also
> added this font size test inside the branch with 5555h mask values.
>
> At that point there were still samples like HyperMover with a valid
> font size value and a font name that looks valid at first glance, like
> STAK\377. So I looked for additional tests. The minimal GEM font header
> size is 84 bytes (54h). After the header come other structures such as
> the horizontal offset table, the character offset table and the font
> data, in no fixed order. If a structure follows the header without a
> gap, the lowest possible offset for that structure is 54h. If the
> structures are not very big, the 4-byte offset is only a little above
> that value, like 20Eh.
> So now I also test for a valid low positive offset to the font data by
> the line:
>>>>> 76 ulelong >83
> Now bad examples like HyperMover and the remaining Microsoft WinWord
> 2.0 documents are skipped.
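
The three tests described above can be combined in a short Python sketch. This is an illustration under stated assumptions, not the actual Magdir/fonts logic: it covers only the size, name and font data offset tests quoted in this message, and it ignores the nesting and the other branches (such as the 5555h mask variant) of the real magic. The function names and the make_header helper are hypothetical, and the sample values are taken from the output listings in this message or invented for the failing cases.

```python
import struct

GEM_HEADER_SIZE = 84  # minimal header size, 54h


def passes_gem_font_tests(data: bytes) -> bool:
    """Apply the three plausibility tests quoted in the message."""
    if len(data) < GEM_HEADER_SIZE:
        return False
    (size,) = struct.unpack_from("<H", data, 2)          # 2 uleshort <73
    (name_word,) = struct.unpack_from("<I", data, 4)     # 4 ulelong >0x001F1f1F
    (font_offset,) = struct.unpack_from("<I", data, 76)  # 76 ulelong >83
    return size < 73 and name_word > 0x001F1F1F and font_offset > 83


def make_header(size: int, name: bytes, font_offset: int) -> bytes:
    """Hypothetical helper: build a zero-filled 84-byte header for testing."""
    header = bytearray(GEM_HEADER_SIZE)
    struct.pack_into("<H", header, 2, size)
    header[4:4 + len(name)] = name[:32]
    struct.pack_into("<I", header, 76, font_offset)
    return bytes(header)


if __name__ == "__main__":
    print(passes_gem_font_tests(make_header(72, b"Celtic #s", 0x142)))
    print(passes_gem_font_tests(make_header(24, b"WYE", 0x158)))
    # Invented failing inputs: an oversized face size and a zero offset.
    print(passes_gem_font_tests(make_header(25776, b"WYE", 0x158)))
    print(passes_gem_font_tests(make_header(10, b"STAK", 0)))
```

The first two calls use the size, name and foffset values printed for H1CELT72.FNT and WYEE24HI.FNT and pass all three tests. The last two fail on the size test and on the offset test respectively.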
>
> After applying the modifications described above with the patch
> file-5.37-fonts-gem.diff, the misidentifications vanish and I get
> output like:
>
> CAROUSEL.DOC: Microsoft WinWord 2.0 Document
> cl8m8ocofedso.testfile: Audio file with ID3 version 2.4.0
> crmanual.doc: data
> DOC20A.DOC: Microsoft WinWord 2.0 Document
> H1CELT72.FNT: GEM GDOS font Celtic #s 72, ID 0x00ca,
> 0x142 foffset
> HyperMover: data
> oem.hlp: MS Windows 3.1 help,
> Mon May 01 20:47:30 1995, 6033 bytes
> PRICELIS.DOC: Microsoft WinWord 2.0 Document
> TECHREF.DOC: DOS 2.0-3.2 backed up file \TECHREF.DOC
> TEMPLATE.DOC: Microsoft WinWord 2.0 Document
> winword2.doc: Microsoft WinWord 2.0 Document
> WYEE24HI.FNT: GEM GDOS font WYE 24, ID 0x0073,
> 0x158 foffset
>
> I hope my diff file can be applied in a future version of the
> file utility.
>
> With best wishes
> Jörg Jenderek
> --
> Jörg Jenderek
> <file-5_37-fonts-gem_diff.DEFANGED-6>--
> File mailing list
> File at astron.com
> https://mailman.astron.com/mailman/listinfo/file