[File] [PATCH] of Magdir/database for dBase III DBT, version number 0
Christos Zoulas
christos at zoulas.com
Wed Mar 25 01:50:16 UTC 2020
Committed, thanks!
christos
> On Mar 23, 2020, at 7:12 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
> some months ago I sent patches to handle dBase database files. In the
> meantime I gathered some examples that are also misidentified as
> "dBase III DBT, version number 0". When running file command version
> 5.38 on such misidentified samples with the -e cdf and -m
> Magdir/database options, I get output like:
>
> AI070GEP.EPS:
> dBase III DBT, version number 0, next free block index
> 458766
> gluon-ffhat-0.9.4.8-tp-link-tl-wr1043n-nd-v1-sysupgrade.bin:
> dBase III DBT, version number 0, next free block index
> 1, 1st item "\037 \010"
> gluon-ffhat-1.0-tp-link-tl-wr1043n-nd-v1-sysupgrade.bin:
> dBase III DBT, version number 0, next free block index
> 1, 1st item "\037 \010"
> gluon-ffhat-1.0-tp-link-tl-wr1043n-nd-v2-sysupgrade.bin:
> dBase III DBT, version number 0, next free block index
> 1, 1st item "m"
> planmaker-pmd-2010.pmd:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \010\020"
> planmaker-pmd-2012.pmd:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \010\020"
> planmaker-pmv.pmv:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \010\020"
> planmaker-xls-5.0-7.0.xls:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \010\010"
> planmaker-xls-97-2003.xls:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \010\020"
> planmaker-xlt.xlt:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \010\020"
> Sammlung.wsb:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \004"
> sm-presentation-pot.pot:
> dBase III DBT, version number 0, next free block index
> 3759263696
> sm-presentation-pps.pps:
> dBase III DBT, version number 0, next free block index
> 3759263696
> sm-presentation-ppt-2000-2003.ppt:
> dBase III DBT, version number 0, next free block index
> 3759263696
> sm-presentation-ppt-97.ppt:
> dBase III DBT, version number 0, next free block index
> 3759263696
> sm-presentation-prd.prd:
> dBase III DBT, version number 0, next free block index
> 3759263696
> sm-presentation-prv.prv:
> dBase III DBT, version number 0, next free block index
> 3759263696
> softmaker-doc-6.0-95.doc:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " h"
> softmaker-doc-97-2003.doc:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \001\001U@ \004"
> softmaker-dot-6.0-95.dot:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " h"
> softmaker-dot-97-2003.dot:
> dBase III DBT, version number 0, next free block index
> 3759263696, 1st item " \001\001U@ \004"
> WinStore.App.exe:
> dBase III DBT, version number 0, next free block index
> 23117
> WORD1XW.DOC:
> dBase III DBT, version number 0, next free block index
> 2205083, 1st item "\200\001"
>
> Unfortunately DBT files have no really good characteristic magic byte
> sequence. Luckily, the display part for such dBase files is
> encapsulated in the subroutine dbase3-memo-print inside
> Magdir/database, so only the magic lines for the test conditions
> must be changed or added.
>
> For misidentified samples, exorbitantly high values for the next free
> block index are shown, whereas in real-world examples only "low"
> values occur.
> The dBase documentation mentions that the upper limit for a dBase
> database is 2 GiB. The memo file uses a block size of 512 bytes, so
> index values are below hexadecimal 0x400000 (2 GiB / 512 = 0x400000).
> The values are stored as 4-byte little-endian integers. Nothing is
> said explicitly about the sign of the values, but negative values
> make no sense for a real block index. So type ulelong must be used
> instead of lelong in lines with comparisons; otherwise high values
> like decimal 3759263696 are treated as negative values.
>
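The arithmetic behind the 0x400000 bound and the signed/unsigned difference can be checked with a short sketch. This is only an illustration in Python, not part of the patch; the constant names are made up here, and the value 3759263696 is taken from the misidentified samples listed above.

```python
import struct

BLOCK_SIZE = 512               # memo block size stated in the message
MAX_FILE_SIZE = 2 * 1024 ** 3  # 2 GiB upper limit stated in the message

# Highest possible number of 512-byte blocks in a 2 GiB file.
max_blocks = MAX_FILE_SIZE // BLOCK_SIZE

# The four bytes that produce "next free block index 3759263696".
raw = struct.pack("<L", 3759263696)

# lelong reads the bytes as a signed value, ulelong as an unsigned one.
signed = struct.unpack("<l", raw)[0]
unsigned = struct.unpack("<L", raw)[0]

print(hex(max_blocks))        # upper bound for a block index
print(signed, unsigned)       # same bytes, two interpretations
print(signed < 2205083)       # old signed comparison
print(unsigned < max_blocks)  # new unsigned comparison
```

Read as signed, the value is negative and therefore passes any "less than" comparison against a positive limit, which is why the unsigned type matters here.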
> Furthermore, in real-world examples the first memo item is longer
> (more than 2 characters) printable ASCII text. That means the byte
> value of each character is equal to or higher than hexadecimal 0x20
> or octal 040, which is the value of the space character.
>
> So one test branch looks like:
>>>>>>>>>> 0 lelong <2205083
>>>>>>>>>>> 0 use dbase3-memo-print
> To skip samples like WORD1XW.DOC with an improbably high free block
> index and samples like WinStore.App.exe with an unprintable second
> character in the first memo item field, this now becomes:
>>>>>>>>>> 0 ulelong <0x400000
>>>>>>>>>>> 513 ubyte >037
>>>>>>>>>>>> 0 use dbase3-memo-print
>
> Another test branch looks like:
>>>>>>>>>>> 0 lelong <458766
>>>>>>>>>>>> 0 use dbase3-memo-print
> The changed test lines skip bad samples with an improbably high free
> block index or a non-printable first or second character in the memo
> field. This now becomes:
>>>>>>>>>>> 0 ulelong <0x400000
>>>>>>>>>>>> 512 ubyte >037
>>>>>>>>>>>>> 513 ubyte >037
>>>>>>>>>>>>>> 0 use dbase3-memo-print
>
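For readers less familiar with magic syntax, the three new conditions of this second branch can be written out as a rough Python sketch. It mirrors only the lines shown above (ulelong at offset 0, ubyte at offsets 512 and 513); the enclosing tests in Magdir/database are not reproduced, and the function name is hypothetical.

```python
import struct

MAX_BLOCK_INDEX = 0x400000  # 2 GiB / 512-byte blocks, as derived in the message


def passes_new_dbt_tests(data: bytes) -> bool:
    """Apply the three conditions of the changed test branch to a buffer."""
    if len(data) < 514:
        # Offsets 512 and 513 must exist to be tested.
        return False
    # 0 ulelong <0x400000: unsigned little-endian next free block index.
    (next_free,) = struct.unpack_from("<L", data, 0)
    if not next_free < MAX_BLOCK_INDEX:
        return False
    # 512 ubyte >037 and 513 ubyte >037: the first two bytes of the first
    # memo item must be above octal 037, i.e. at least the space character.
    return data[512] > 0o37 and data[513] > 0o37
```

A buffer with index 3 and the text "1st memo text" at offset 512 passes, while a buffer whose first four bytes decode to 3759263696 is rejected by the first condition.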
> After applying the modifications above with the patch
> file-5.38-database-dbt.diff, the examples above are no longer
> misidentified as "dBase III DBT, version number 0", and for real DBT
> files I still get correct output like:
>
> dbase3dbt0.dbt: dBase III DBT, version number 0,
> next free block index 3,
> 1st item "1st memo text\032\032"
> dbase3dbt0_1.dbt: dBase III DBT, version number 0,
> next free block index 2,
> 1st item "1st memo. test umlaut with cp 1252:
> ä=ae, ö=oe, ü=ue, ß=ss,\200=euro, Ä=Ae, Ö=Oe, Ü=Ue\032\032
> dbase3dbt0_4.dbt: dBase III DBT, version number 0,
> next free block index 2,
> 1st item "first memo\032\032"
> fsadress.dbt: dBase III DBT, version number 0,
> next free block index 5,
> 1st item "This is a note for Karl Müller. "
>
> I hope that the test lines for such DBT files are now sufficient and
> that my diff file can be applied in a future version of the file
> utility.
>
> With best wishes
> Jörg Jenderek
> - --
> Jörg Jenderek
>
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCXnlCVQAKCRCv8rHJQhrU
> 1uKzAKCxWEReTbX3HsEfiwSXlIvYYybFwQCgvnpQesjsCDPkqhhrBpUEqG0mgY8=
> =dAHI
> -----END PGP SIGNATURE-----