[File] [PATCH] of Magdir/database for dBase III DBT, version number 0

Christos Zoulas christos at zoulas.com
Wed Mar 25 01:50:16 UTC 2020


Committed, thanks!

christos

> On Mar 23, 2020, at 7:12 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
> 
> Hello,
> some months ago I sent patches to handle dBase database files. In the
> meantime I have gathered some examples that are also misidentified as
> "dBase III DBT, version number 0". Running file command version
> 5.38 on these misidentified samples with the options -e cdf and
> -m Magdir/database gives output like:
> 
> AI070GEP.EPS:
> 	dBase III DBT, version number 0, next free block index
> 	458766
> gluon-ffhat-0.9.4.8-tp-link-tl-wr1043n-nd-v1-sysupgrade.bin:
> 	dBase III DBT, version number 0, next free block index
> 	1, 1st item "\037 \010"
> gluon-ffhat-1.0-tp-link-tl-wr1043n-nd-v1-sysupgrade.bin:
> 	dBase III DBT, version number 0, next free block index
> 	1, 1st item "\037 \010"
> gluon-ffhat-1.0-tp-link-tl-wr1043n-nd-v2-sysupgrade.bin:
> 	dBase III DBT, version number 0, next free block index
> 	1, 1st item "m"
> planmaker-pmd-2010.pmd:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "	\010\020"
> planmaker-pmd-2012.pmd:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "	\010\020"
> planmaker-pmv.pmv:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "	\010\020"
> planmaker-xls-5.0-7.0.xls:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "	\010\010"
> planmaker-xls-97-2003.xls:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "	\010\020"
> planmaker-xlt.xlt:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "	\010\020"
> Sammlung.wsb:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "    \004"
> sm-presentation-pot.pot:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696
> sm-presentation-pps.pps:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696
> sm-presentation-ppt-2000-2003.ppt:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696
> sm-presentation-ppt-97.ppt:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696
> sm-presentation-prd.prd:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696
> sm-presentation-prv.prv:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696
> softmaker-doc-6.0-95.doc:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "  h"
> softmaker-doc-97-2003.doc:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "  \001\001U@	\004"
> softmaker-dot-6.0-95.dot:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "  h"
> softmaker-dot-97-2003.dot:
> 	dBase III DBT, version number 0, next free block index
> 	3759263696, 1st item "  \001\001U@	\004"
> WinStore.App.exe:
> 	dBase III DBT, version number 0, next free block index
> 	23117
> WORD1XW.DOC:
> 	dBase III DBT, version number 0, next free block index
> 	2205083, 1st item "\200\001"
> 
> Unfortunately, DBT files have no really distinctive magic byte
> sequence. Luckily, the part that displays such dBase files is
> encapsulated in the subroutine dbase3-memo-print inside
> Magdir/database, so only the magic lines for the test conditions
> must be changed or added.
> 
> For the misidentified samples, exorbitantly high values are shown
> for the next free block index, whereas in real-world examples only
> "low" values occur.
> The dBase documentation mentions that the upper size limit for a
> dBase database is 2 GiB. The memo file uses a block size of 512
> bytes, which means that index values are below hexadecimal
> 0x400000 (2 GiB / 512). The values are stored as 4-byte integers in
> little-endian order. Nothing is said explicitly about the sign of
> the values, but negative values make no sense for a real block index
> number. So the type ulelong must be used instead of lelong in lines
> with comparisons; otherwise high values like decimal 3759263696 are
> treated as negative values.
> 
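As a rough illustration of the limit and of the signedness issue
described above, here is a minimal Python sketch. It is not part of the
patch; the constant names are made up for this example, and struct's
"<I" and "<i" formats merely stand in for the ulelong and lelong magic
types.

```python
import struct

BLOCK_SIZE = 512              # memo block size stated above
MAX_DB_SIZE = 2 * 1024**3     # 2 GiB upper limit stated above
MAX_BLOCK_INDEX = MAX_DB_SIZE // BLOCK_SIZE

# Four header bytes chosen so that the unsigned little-endian value
# equals 3759263696, the index reported for many samples above.
header = struct.pack("<I", 3759263696)

as_unsigned = struct.unpack("<I", header)[0]   # stand-in for ulelong
as_signed = struct.unpack("<i", header)[0]     # stand-in for lelong

print(hex(MAX_BLOCK_INDEX))                    # 0x400000
print(as_unsigned, as_unsigned < MAX_BLOCK_INDEX)
print(as_signed, as_signed < MAX_BLOCK_INDEX)
```

Read as signed, the value is negative and slips under any "less than"
limit, which is why the comparison has to be done on the unsigned type.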
> Furthermore, in real-world examples the first memo item is longer
> (more than 2 characters) printable ASCII text. That means the byte
> value of each character is equal to or higher than hexadecimal 0x20
> or octal 040, the value of the space character.
> 
> So one test branch looks like:
>>>>>>>>>> 0	lelong		<2205083
>>>>>>>>>>> 0	use		dbase3-memo-print
> To skip samples like WORD1XW.DOC with an improbably high free block
> index and samples like WinStore.App.exe with an unprintable second
> character in the first memo item field, this now becomes:
>>>>>>>>>> 0	ulelong		<0x400000
>>>>>>>>>>> 513	ubyte		>037
>>>>>>>>>>>> 0	use		dbase3-memo-print
> 
> Another test branch looks like:
>>>>>>>>>>> 0	lelong		<458766
>>>>>>>>>>>> 0	use		dbase3-memo-print
> The changed test lines skip bad samples with an improbably high free
> block index or a non-printable first or second character in the memo
> field. This now becomes:
>>>>>>>>>>> 0	ulelong		<0x400000
>>>>>>>>>>>> 512	ubyte		>037
>>>>>>>>>>>>> 513	ubyte		>037
>>>>>>>>>>>>>> 0	use		dbase3-memo-print
> 
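For readers who do not use magic syntax every day, the second set of
test lines above can be approximated in Python as follows. This is only
a sketch under stated assumptions: the function name passes_dbt_tests
is hypothetical, the enclosing magic levels that lead to this branch
are not modelled, and the sample buffers are synthetic rather than
taken from the files listed above.

```python
import struct

MAX_BLOCK_INDEX = 0x400000   # limit used in the ulelong test lines
FIRST_ITEM_OFFSET = 512      # offsets 512 and 513 from the test lines


def passes_dbt_tests(data: bytes) -> bool:
    """Mimic: ulelong at 0 < 0x400000, ubyte at 512 and 513 > 037."""
    if len(data) < FIRST_ITEM_OFFSET + 2:
        return False
    # Unsigned little-endian 32-bit value at offset 0 (like ulelong).
    (next_free,) = struct.unpack_from("<I", data, 0)
    if next_free >= MAX_BLOCK_INDEX:
        return False
    # Both leading characters of the first memo item must be above 037.
    return data[512] > 0o37 and data[513] > 0o37


def make_sample(index: int, first_item: bytes) -> bytes:
    """Build a synthetic buffer: index, zero padding, first memo item."""
    return struct.pack("<I", index) + bytes(508) + first_item


print(passes_dbt_tests(make_sample(3, b"1st memo text\x1a\x1a")))  # True
print(passes_dbt_tests(make_sample(3759263696, b"  h")))           # False
print(passes_dbt_tests(make_sample(2205083, b"\x80\x01")))         # False
```

The third buffer has an index below the limit and is rejected only
because its second character is not above octal 037.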
> After applying the above modifications with the patch
> file-5.38-database-dbt.diff, the above examples are no longer
> misidentified as "dBase III DBT, version number 0", and for real DBT
> files I still get correct output like:
> 
> dbase3dbt0.dbt:   dBase III DBT, version number 0,
> 	next free block index 3,
> 	1st item "1st memo text\032\032"
> dbase3dbt0_1.dbt: dBase III DBT, version number 0,
> 	next free block index 2,
> 	1st item "1st memo. test umlaut with cp 1252:
> 	ä=ae, ö=oe, ü=ue, ß=ss,\200=euro, Ä=Ae, Ö=Oe, Ü=Ue\032\032
> dbase3dbt0_4.dbt: dBase III DBT, version number 0,
> 	next free block index 2,
> 	1st item "first memo\032\032"
> fsadress.dbt:     dBase III DBT, version number 0,
> 	next free block index 5,
> 	1st item "This is a note for Karl Müller. "
> 
> I hope that the test lines for such DBT files are now sufficient and
> that my diff file can be applied in a future version of the file utility.
> 
> With best wishes
> Jörg Jenderek
