[File] [PATCH] Magdir/windows cdrtfe Project *.CFP described as Generic INI + 1 section variant not detected
Christos Zoulas
christos at zoulas.com
Mon Feb 20 15:25:51 UTC 2023
Committed, thanks!
christos
> On Feb 19, 2023, at 10:01 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> Some days ago I ran Piriform CCleaner. Its registry cleaner has an
> item for file extensions that scans for errors. There it complained
> about the suffix CFP.
>
> So I looked for such files on my system. Samples such as
> test-iso.cfp, Win95DE950.cfp and WIN-XP_SP3.cfp were created by me
> with the CD-burning tool cdrtfe. This is a Windows front end for the
> cdrtools (cdrecord, mkisofs, readcd, cdda2wav) and other well-known
> tools.
>
> When running the file command version 5.44 on the CFP samples and
> related files I get output like:
>
> WIN-XP_SP3.cfp:
> Generic INItialization configuration [FileExplorer]
> Win95DE950.cfp:
> Generic INItialization configuration [FileExplorer]
> gnucash-guide.hhmap:
> ASCII text, with CRLF line terminators
> gnucash-help.hhmap:
> ASCII text, with CRLF line terminators
> io.github.peazip.PeaZip.flatpakref:
> ASCII text, with very long lines (3799)
> test-cfp.cfp:
> Generic INItialization configuration [FileExplorer]
> test-unknown.ini:
> Generic INItialization configuration [bar]
>
> With option --extension only the 3 byte sequence ??? or the wrong
> ini/inf is shown, and with option -i only
> application/x-wine-extension-ini or text/plain is shown.
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). All examples are
> described with low priority as "Generic INI configuration" by
> ini.trid.xml. The CFP samples are described with high priority as
> "cdrtfe Project" with mime type text/x-cfp by cfp-cdrtfe.trid.xml.
> The flatpakref sample is described as "Flatpack Reference" with
> mime type text/plain by flatpakref.trid.xml (see appended
> trid-v-cfp.txt.gz).
>
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It does
> not recognize the samples.
>
> The INI based samples should be recognized by the subroutine ini-file
> inside Magdir/windows. If a sample is only described as Generic
> INItialization configuration, like test-unknown.ini, that means a
> suitable branch is missing inside the subroutine. The hhmap and
> flatpakref samples are not even recognized as "Generic INItialization
> configuration". So I first checked for this error by looking at the
> program logic inside Magdir/windows. The first step in the subroutine
> is to look for a left bracket, which is the beginning of a section.
> This looks like:
>> 0 search/8192 [
> Then it looks for known keywords or section names. If it finds one,
> it displays a specific description with the corresponding mime type
> and file suffix. For the Windows boot loader this looks, for example,
> like:
>>> &0 regex/c \^boot\x20loader] Windows boot.ini
> !:mime application/x-wine-extension-ini
> !:ext ini
>
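The first two steps described above can be sketched in Python. This is only an illustration of the lookup order, not the code file(1) runs: the function name first_section_label and the slicing used to imitate the search/8192 range are assumptions made for this sketch, and only the "boot loader" keyword from the quoted magic line is included.

```python
import re

SEARCH_RANGE = 8192  # imitates the range given in "search/8192 ["

# Only the keyword quoted in the mail; the real subroutine knows many more.
KNOWN_FIRST_SECTIONS = [
    (re.compile(rb"boot loader\]", re.IGNORECASE), "Windows boot.ini"),
]

def first_section_label(data: bytes):
    """Return a description if the first section name is known, else None."""
    pos = data[:SEARCH_RANGE].find(b"[")   # step 1: find the left bracket
    if pos < 0:
        return None                        # no section at all
    for pattern, label in KNOWN_FIRST_SECTIONS:
        # step 2: the keyword must start right after the bracket
        if pattern.match(data, pos + 1):
            return label
    return None

print(first_section_label(b"[Boot Loader]\r\ntimeout=30\r\n"))
```

A result of None here corresponds to the case where the subroutine has to continue with the second section.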
> Unfortunately the rules for INI files are not so strict. So it can
> happen that the relevant section name is not the first one.
> So if no known keyword is found after the opening bracket, the
> subroutine looks for the bracket of the second section. This is done
> by lines like:
>>> &0 default x
>>>> &0 search/8192 [
> Then it looks again for known keywords as before. This way some
> Windows setup INFormation files are detected. If no known keyword is
> found in the second section, such samples are handled by lines like:
>>>>> &0 default x
>>>>>> &0 ubyte x
>>>>>>> &-1 regex/T \^([A-Za-z0-9_\(\)\ ]+)\]\r \
> Generic INItialization configuration [%-.40s
> !:mime application/x-wine-extension-ini
> !:ext ini/inf
>
> The hhmap and flatpakref samples are not recognized because they
> contain only 1 section. So the search for a second left bracket fails
> and then nothing more happens. So I must catch the failing search at
> that level with a default clause. Samples with only 1 section and an
> unknown section name are now described by additionally inserted lines
> like:
>>>> &0 default x Generic INItialization configuration
>>>>> 0 string x \b, 1st line "%s"
>
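The generic fallback and the new single-section default can be pictured with the following Python sketch. It is an approximation under stated assumptions: describe_generic is a hypothetical name, the known-keyword checks are left out, and the trimming and 40 character limit only imitate the /T flag and the %-.40s format of the quoted lines.

```python
import re
from typing import Optional

GENERIC = "Generic INItialization configuration"
# Same character class as the quoted regex: section name, "]" and CR.
SECTION_NAME = re.compile(rb"[A-Za-z0-9_() ]+\]\r")

def describe_generic(data: bytes) -> Optional[str]:
    first = data.find(b"[")
    if first < 0:
        return None                       # not INI-like at all
    second = data.find(b"[", first + 1)
    if second < 0:
        # New default clause: only 1 section, so show the first line.
        first_line = data.splitlines()[0].decode("ascii", "replace")
        return f'{GENERIC}, 1st line "{first_line}"'
    m = SECTION_NAME.match(data, second + 1)
    if m is None:
        return None
    name = m.group(0).strip()[:40].decode("ascii")  # e.g. "bar]"
    return f"{GENERIC} [{name}"

print(describe_generic(b"[MAP]\r\nkey=1\r\n"))
print(describe_generic(b"[foo]\r\na=1\r\n[bar]\r\nb=2\r\n"))
```

Without the `second < 0` branch the function would return nothing for one-section files, which is the behaviour reported for the hhmap and flatpakref samples before the patch.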
> Unfortunately I found no page about the text file format used for CFP,
> especially by cdrtfe. So I use the cdrtfe home page on SourceForge
> as reference. That is expressed by comment lines like:
> # URL: https://cdrtfe.sourceforge.io/
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # defs/c/cfp-cdrtfe.trid.xml
> According to TrID the first line looks like:
> [General]
> So maybe this is too unspecific to be used as a magic pattern. Luckily
> the second section name always seems to look like:
> [FileExplorer]
> That is also shown by the current magic line. So I chose this as test
> criterion. So in the branch handling second section names this is now
> done by additional lines like:
>>>>> &0 string FileExplorer] cdrtfe Project
> !:mime text/x-cfp
> !:ext cfp
>
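As a rough Python equivalent of this test criterion: the sketch below checks whether the second section name is FileExplorer. The function name is_cdrtfe_project and the key names in the sample data are made up for illustration; only the two section names, the mime type and the suffix come from the description above.

```python
CFP_MIME = "text/x-cfp"   # as set by the added !:mime line
CFP_EXT = "cfp"           # as set by the added !:ext line

def is_cdrtfe_project(data: bytes) -> bool:
    """True if the second '[' is followed by 'FileExplorer]'."""
    first = data.find(b"[")
    if first < 0:
        return False
    second = data.find(b"[", first + 1)
    if second < 0:
        return False                      # only 1 section present
    return data.startswith(b"FileExplorer]", second + 1)

# Hypothetical sample content; real CFP files contain other keys.
sample = b"[General]\r\nChoice=1\r\n[FileExplorer]\r\nShowing=1\r\n"
if is_cdrtfe_project(sample):
    print("cdrtfe Project", CFP_MIME, CFP_EXT)
```

Testing the second section instead of [General] keeps the check specific, since [General] is a common first section name in INI files.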
> Some information for Flatpak can be found on Wikipedia. So that
> information is expressed by additional lines like:
> # URL: https://en.wikipedia.org/wiki/Flatpak
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # defs/f/flatpakref.trid.xml
> According to the documentation such samples contain a section which
> looks like:
> [Flatpak Ref]
> Often this appears as the first line. That is described by TrID via
> flatpakref.trid.xml. There may also exist variants that start with a
> comment (that is, the # character), described via
> flatpakref-rem.trid.xml. But my example is of the first variant. So I
> inserted inside Magdir/windows, after the first bracket search,
> additional lines that look like:
>>> &0 string Flatpak\ Ref] Flatpak repository reference
> !:mime application/vnd.flatpak.ref
> !:ext flatpakref
> According to the Linux database, which can be found for example at
> reposcope.com, such samples are called "Flatpak repository
> reference" in English. Furthermore, instead of the generic mime
> type text/plain I chose the displayed one. But I found no such
> officially registered type on iana.org.
>
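The Flatpak check works on the first section instead of the second. A minimal Python sketch, assuming the hypothetical name is_flatpakref and invented sample keys:

```python
FLATPAK_MIME = "application/vnd.flatpak.ref"  # chosen in the patch, not found on iana.org
FLATPAK_EXT = "flatpakref"

def is_flatpakref(data: bytes, search_range: int = 8192) -> bool:
    """True if the first '[' within the range is followed by 'Flatpak Ref]'."""
    first = data[:search_range].find(b"[")
    return first >= 0 and data.startswith(b"Flatpak Ref]", first + 1)

# Hypothetical sample content for illustration only.
sample = b"[Flatpak Ref]\nName=org.example.App\n"
if is_flatpakref(sample):
    print("Flatpak repository reference", FLATPAK_MIME, FLATPAK_EXT)
```

Because the sketch searches for the bracket instead of requiring it at offset 0, it would also accept a file that starts with a # comment line, as long as that comment contains no left bracket. Whether the patched magic behaves the same way for the comment variant was not tested in the mail.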
> After applying the above mentioned modifications with the patch
> file-5.44-windows-cfp.diff, most of the samples are described
> with more details and the correct name suffix. This now looks like:
>
> WIN-XP_SP3.cfp:
> cdrtfe Project
> Win95DE950.cfp:
> cdrtfe Project
> gnucash-guide.hhmap:
> Generic INItialization configuration, 1st line "[MAP]"
> gnucash-help.hhmap:
> Generic INItialization configuration, 1st line "[MAP]"
> io.github.peazip.PeaZip.flatpakref:
> Flatpak repository reference
> test-cfp.cfp:
> cdrtfe Project
> test-unknown.ini:
> Generic INItialization configuration [bar]
>
> I hope my diff file can be applied in a future version of the file
> utility.
>
> There is still something to do: add test lines for samples with the
> hhmap name suffix.
>
> With best wishes,
> Jörg Jenderek
> - --
> Jörg Jenderek
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY/LiggAKCRCv8rHJQhrU
> 1kTLAJ40+dxPF0WkyMDc+xPK5AfTOerERgCgrVfmMCXgLW7q7GmkSL46x2UodeM=
> =GQ5N
> -----END PGP SIGNATURE-----
> <trid-v-cfp.txt.gz><file-5_44-windows-cfp_diff.DEFANGED-63><file-5_44-windows-cfp_diff_sig.DEFANGED-64>--