From toni.ruottu at iki.fi  Wed Aug  3 18:35:28 2022
From: toni.ruottu at iki.fi (Toni Ruottu)
Date: Wed, 3 Aug 2022 21:35:28 +0300
Subject: [File] Matching metadata near the end of a large file

Hi!

I'm trying to create magic for a file format that stores some metadata close to the end of the file. There is a pointer to the metadata near the beginning, so I do know the offset where the metadata is located. However, the file command seems to give up around offset 0x100000, refusing to match any data past that point. The debug mode shows only zeros instead of the actual data.

Have I encountered a bug or is this a known limitation? What would be a good approach to take? Should I test my magic with a tiny file and ignore the potential failure due to file size? The files I'm trying to process are quite large and would typically not fit entirely within the first 0x100000 bytes.

--Toni

From christos at zoulas.com  Wed Aug  3 21:56:53 2022
From: christos at zoulas.com (Christos Zoulas)
Date: Wed, 3 Aug 2022 17:56:53 -0400
Subject: [File] Matching metadata near the end of a large file
Message-ID: <870D53DD-DE03-4549-B5F1-138315ACBDDC@zoulas.com>

File only inspects the first 1048576 bytes (0x100000) of the file by default; this limit can be changed with -P bytes=XXXX. If the data is close to the end of the file you can use a negative offset (counted from the end of the file) and file will start looking backwards if it can seek.

Best,

christos

From joerg.jen.der.ek at gmx.net  Sat Aug  6 14:44:09 2022
From: joerg.jen.der.ek at gmx.net (Jörg Jenderek)
Date: Sat, 6 Aug 2022 16:44:09 +0200
Subject: [File] [PATCH] of Magdir/wordprocessors for Corel WordPerfect Writing Tools *.CBT *.CBD

Hello,

some days ago I sent patches for DOS COM executables. One Syslinux COMboot variant uses the file name extension CBT instead of COM.

As a cross-check I looked for other files with the CBT extension on my systems. There are dozens of such CBT files that are part of the Corel WordPerfect Office suite. They are found in the sub directory WritingTools inside the WordPerfect program directory "c:\Program Files (x86)\Corel\WordPerfect Office 2021". The file names look like:
Wt13cbede.cbt Wt13cbeit.cbt Wt13cbefr.cbt WT21cbede.cbt Wt13cbeEN.CBD WT21cbeEN.CBD
They start with the two letters WT followed by digits that correspond to the WordPerfect version. For version 2021 these digits are 21, and for an older version I found the digits 13. The last letters apparently correspond to the language used. For English the file name extension is CBD, whereas for all other languages it is CBT.

In the sub directory there are more similar files with other file name extensions like adv, hyd, icr, lex, mor and sav. The Writing Tools are used for spell checking, grammar correction and thesaurus lookup in the chosen language. Unfortunately I did not find out which part the CBT files are used for. So I chose a "general" name like "Writing Tools" for such CBT samples.

When running the file command (version 5.42) on such examples and related files I get output like:

WT21cbeEN.CBD: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbeEN.CBD: Corel WordPerfect: Unknown filetype 70, v1.0
WT21cbede.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
WT21cbeit.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbeaf.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbede.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbedk.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbees.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbefr.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbeit.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbekd.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbenl.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbeno.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbepo.cbt: Corel WordPerfect: Unknown filetype 70, v1.0
Wt13cbesv.cbt: Corel WordPerfect: Unknown filetype 70, v1.0

With the --extension option only ??? is displayed. Furthermore, with the -i option only the generic application/octet-stream is shown for my samples.

For comparison I also ran the file format identification utility DROID (see https://sourceforge.net/projects/droid/). It wrongly identifies all such examples as "Comic Book Archive" by PUID fmt/1462, based on the file name extension (see appended droid-wordperfect-cbt.csv.gz).

For comparison I also ran the file format identification utility TrID (see https://mark0.net/soft-trid-e.html). It identifies all such examples with a low rate as "WordPerfect (generic)" by wp-generic.trid.xml, and most examples are described with a high rate as "WordPerfect Writing Tools data" by cbt-wp.trid.xml (see appended trid-wordperfect-cbt.txt.gz).

Unfortunately I found no information, in particular no file format specification, for such WordPerfect CBT files. TrID lists the file name extensions used and, with the -v option, often a related URL pointing to some information. This is expressed by comment lines inside Magdir/wordprocessors like:
# URL:        https://en.wikipedia.org/wiki/WordPerfect
# Reference:  https://github.com/OneWingedShark/WordPerfect/
#             blob/master/doc/SDK_Help/FileFormats/
#             WPFF_DocumentStructure.htm
#             http://mark0.net/download/triddefs_xml.7z
#             defs/w/wp-generic.trid.xml
#             defs/c/cbt-wp.trid.xml

The description inside Magdir/wordprocessors starts like:
0       string  \xffWPC
So we see that the first 4 bytes are the generic magic for all WordPerfect samples. The sub classification is done by the bytes at offsets 8 and 9. If the sub class is not known, it is shown as a last step by a line like:
>>>9    byte    x       Corel WordPerfect: Unknown filetype %d
So for my CBT examples I must insert, before that line, lines like:
>>9     byte    70      WordPerfect Writing Tools
!:mime  application/x-wordperfect-cbt
!:ext   cbd/cbt
Instead of the generic mime type application/octet-stream I show a user defined one.
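
The lookup described above can be sketched outside of magic syntax. The following Python is only an illustration and not how file implements it: the function names are made up, the sample header is synthetic, and the value 1 at offset 8 is an assumption, since only the byte at offset 9 (70) is quoted for the CBT samples.

```python
WPC_MAGIC = b"\xffWPC"  # first 4 bytes shared by all WordPerfect samples


def wpc_subclass(data: bytes):
    """Return the bytes at offsets 8 and 9, or None without the generic magic."""
    if len(data) < 10 or not data.startswith(WPC_MAGIC):
        return None
    return data[8], data[9]


def describe(data: bytes) -> str:
    """Mimic only the single branch discussed in this message."""
    sub = wpc_subclass(data)
    if sub is None:
        return "not a WordPerfect-family file"
    _product, file_type = sub  # product byte is not checked in this sketch
    if file_type == 70:
        return "WordPerfect Writing Tools"
    return "Corel WordPerfect: Unknown filetype %d" % file_type


# Synthetic 10-byte header: magic, 4 filler bytes, then offsets 8 and 9.
# The 1 at offset 8 is an assumed value, not taken from a real sample.
sample = WPC_MAGIC + bytes(4) + bytes([1, 70])
print(describe(sample))  # prints: WordPerfect Writing Tools
```

Any other value at offset 9 falls through to the "Unknown filetype" text, which corresponds to the output shown before the patch.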

After applying the above mentioned modifications with the patch file-5.42-wordprocessors-cbt.diff I get more precise output like:

WT21cbede.cbt: WordPerfect Writing Tools, v1.0
WT21cbeit.cbt: WordPerfect Writing Tools, v1.0
Wt13cbeaf.cbt: WordPerfect Writing Tools, v1.0
Wt13cbede.cbt: WordPerfect Writing Tools, v1.0
Wt13cbedk.cbt: WordPerfect Writing Tools, v1.0
Wt13cbees.cbt: WordPerfect Writing Tools, v1.0
Wt13cbefr.cbt: WordPerfect Writing Tools, v1.0
Wt13cbeit.cbt: WordPerfect Writing Tools, v1.0
Wt13cbekd.cbt: WordPerfect Writing Tools, v1.0
Wt13cbenl.cbt: WordPerfect Writing Tools, v1.0
Wt13cbeno.cbt: WordPerfect Writing Tools, v1.0
Wt13cbepo.cbt: WordPerfect Writing Tools, v1.0
Wt13cbesv.cbt: WordPerfect Writing Tools, v1.0
WT21cbeEN.CBD: WordPerfect Writing Tools, v1.0
Wt13cbeEN.CBD: WordPerfect Writing Tools, v1.0

I hope my diff file can be applied in a future version of the file utility.

With best wishes
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-wordperfect-cbt.txt.gz
Type: application/x-gzip
Size: 870 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: droid-cbt-wordperfect.csv.gz
Type: application/x-gzip
Size: 531 bytes
Desc: not available
URL:
-------------- next part --------------
--- file-5.42/magic/Magdir/wordprocessors.old 2021-12-06 16:25:22.000000000 +0100
+++ file-5.42/magic/Magdir/wordprocessors 2022-08-06 15:21:15.335304200 +0200
@@ -57,6 +57,13 @@
 >>9    byte    44      WordPerfect 3.5 document
 >>9    byte    45      WordPerfect 4.2 document
 >>9    byte    69      WordPerfect dialog file
+# From:        Joerg Jenderek
+# Note:        found in sub directory WritingTools inside WordPerfect 2021 program directory
+>>9    byte    70      WordPerfect Writing Tools
+#!:mime        application/octet-stream
+!:mime application/x-wordperfect-cbt
+# like:        Wt13cbede.cbt Wt13cbeit.cbt Wt13cbefr.cbt WT21cbede.cbt Wt13cbeEN.CBD WT21cbeEN.CBD
+!:ext  cbd/cbt
 >>9    byte    76      WordPerfect button bar
 >>9    default x
 >>>9   byte    x       Corel WordPerfect: Unknown filetype %d

From joerg.jen.der.ek at gmx.net  Sat Aug  6 23:24:56 2022
From: joerg.jen.der.ek at gmx.net (Jörg Jenderek)
Date: Sun, 7 Aug 2022 01:24:56 +0200
Subject: [File] [PATCH] of Magdir/wordprocessors for Corel WordPerfect dictionary advise *.ADV

Hello,

some days ago I sent a patch for WordPerfect CBT samples. These are found in the sub directory WritingTools inside the WordPerfect program directory "c:\Program Files (x86)\Corel\WordPerfect Office 2021". In the sub directory there are more similar files with other file name extensions like adv, hyd, icr, lex, mor and sav.

As a cross-check I looked for other WordPerfect files there. The ADV samples are used for giving advice to the user. The file names look like:
WT21de.adv Wt13de.adv Wt13es.adv Wt13fr.adv wt13us.adv
They start with the two letters WT followed by digits that correspond to the WordPerfect version.
For version 2021 these digits are 21, and for an older version I found the digits 13. The last 2 letters apparently correspond to the language used. For English we get uk (or us), de for German, fr for French, nl for Dutch and so on.

When running the file command (version 5.42) on such examples I get output like:

WT21de.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0
Wt13de.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0
Wt13es.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0
Wt13fr.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0
Wt13nl.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0
wt13kd.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0
wt13uk.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0
wt13us.adv: Unknown Corel/Wordperfect product 34, file type 11, v6.0

With the --extension option only ??? is displayed. Furthermore, with the -i option only the generic application/octet-stream is shown for my samples.

For comparison I ran the file format identification utility TrID (see https://mark0.net/soft-trid-e.html). It identifies all such examples with a low rate as "WordPerfect (generic)" by wp-generic.trid.xml, and the examples are described with a high rate as "WordPerfect dictionary advise" by adv-wp.trid.xml (see appended trid-v-wordperfect-adv.txt.gz).

Unfortunately I found no information, in particular no file format specification, for such WordPerfect ADV files. TrID lists the file name extensions used and, with the -v option, often a related URL pointing to some information. This is expressed by comment lines inside Magdir/wordprocessors like:
# URL:        https://en.wikipedia.org/wiki/WordPerfect
# Reference:  https://github.com/OneWingedShark/WordPerfect/
#             blob/master/doc/SDK_Help/FileFormats/
#             WPFF_DocumentStructure.htm
# Reference:  http://mark0.net/download/triddefs_xml.7z
#             defs/a/adv-wp.trid.xml

The description inside Magdir/wordprocessors starts like:
0       string  \xffWPC
So we see that the first 4 bytes are the generic magic for all WordPerfect samples. The sub classification is done by the bytes at offsets 8 and 9. If the sub class is not known, it is shown as a last step by lines like:
>8      default x
>>8     byte    x       Unknown Corel/Wordperfect product %d,
>>>9    byte    x       file type %d

So for my ADV examples I must insert, before those lines, lines like:
>8      byte    34
>>9     byte    11      Corel WordPerfect dictionary advise
!:mime  application/x-wordperfect-adv
!:ext   adv
Instead of the generic mime type application/octet-stream I show a user defined one.

According to the unofficial WordPerfect file format documentation, a pointer is stored at offset 16. When inspecting the area it points to in ADV samples, we find advice text in the respective language, like "This is too informal for most writing." for English examples. Unfortunately some tags like 580A often come before the plain text. So I show an excerpt of the advice text with an additional line like:
>>>(16.s+16)    string  x       (...%-.33s...)

After applying the above mentioned modifications with the patch file-5.42-wordprocessors-adv.diff I get more precise output like:

WT21de.adv: Corel WordPerfect dictionary advise (...schen Dezimalzahlen und ganzen Za...), v6.0
Wt13de.adv: Corel WordPerfect dictionary advise (...schen Dezimalzahlen und ganzen Za...), v6.0
Wt13es.adv: Corel WordPerfect dictionary advise (...ica porcentaje debe ir inmediatam...), v6.0
Wt13fr.adv: Corel WordPerfect dictionary advise (...rd_ en genre et en nombre dans ce...), v6.0
Wt13nl.adv: Corel WordPerfect dictionary advise (... een hoofdletter. Als dit niet he...), v6.0
wt13kd.adv: Corel WordPerfect dictionary advise (...schen Dezimalzahlen und ganzen Za...), v6.0
wt13uk.adv: Corel WordPerfect dictionary advise (...s too informal for most writing.|...), v6.0
wt13us.adv: Corel WordPerfect dictionary advise (...s too informal for most writing.|...), v6.0

I hope my diff file can be applied in a future version of the file utility.

With best wishes
Jörg Jenderek
-------------- next part --------------
-- 
File mailing list
File at astron.com
https://mailman.astron.com/mailman/listinfo/file
-------------- next part --------------
--- file-5.42/magic/Magdir/wordprocessors.old 2021-12-06 16:25:22.000000000 +0100
+++ file-5.42/magic/Magdir/wordprocessors 2022-08-07 00:46:18.271993200 +0200
@@ -27,8 +27,11 @@
 !:apple        ????AWWP
 !:ext  wps
 # Corel/WordPerfect
+# URL:         https://en.wikipedia.org/wiki/WordPerfect
+# Reference:   https://github.com/OneWingedShark/WordPerfect/blob/master/doc/SDK_Help/FileFormats/WPFF_DocumentStructure.htm
+#              http://mark0.net/download/triddefs_xml.7z/defs/w/wp-generic.trid.xml
 0      string  \xffWPC
 # WordPerfect
 >8     byte    1
 >>9    byte    1       WordPerfect macro
@@ -200,8 +203,22 @@
 >8     byte    33
 >>9    byte    10      IntelliTAG (SGML) compiled DTD
 >>9    default x
 >>>9   byte    x       IntelliTAG: Unknown filetype %d
+# Summary:     Corel WordPerfect WritingTools advise part
+# From:        Joerg Jenderek
+# Reference:   http://mark0.net/download/triddefs_xml.7z/defs/a/adv-wp.trid.xml
+>8     byte    34
+>>9    byte    11      Corel WordPerfect dictionary advise
+#!:mime        application/octet-stream
+!:mime application/x-wordperfect-adv
+#!:mime        application/vnd.wordperfect.adv
+# like:        WT21de.adv Wt13de.adv Wt13es.adv Wt13fr.adv wt13us.adv
+!:ext  adv
+# advise text part often start with tag like: 580A
+#>>>(16.s)     ubequad x       ADVISE PART %#llx
+# part of advise text like: "This is too informal for most writing."
+>>>(16.s+16)   string  x       (...%-.33s...)
 # everything else
 >8     default x
 >>8    byte    x       Unknown Corel/Wordperfect product %d,
 >>>9   byte    x       file type %d
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-wordperfect-adv.txt.gz
Type: application/x-gzip
Size: 718 bytes
Desc: not available
URL:

From christos at zoulas.com  Mon Aug  8 12:59:42 2022
From: christos at zoulas.com (Christos Zoulas)
Date: Mon, 8 Aug 2022 15:59:42 +0300
Subject: [File] [PATCH] of Magdir/wordprocessors for Corel WordPerfect dictionary advise *.ADV

Applied, thanks!

christos

From christos at zoulas.com  Mon Aug  8 13:01:13 2022
From: christos at zoulas.com (Christos Zoulas)
Date: Mon, 8 Aug 2022 16:01:13 +0300
Subject: [File] [PATCH] of Magdir/wordprocessors for Corel WordPerfect Writing Tools *.CBT *.CBD
Message-ID: <83AA92E0-E84A-4A93-A071-BBC900F52DB7@zoulas.com>

Committed, thanks!

christos