[File] [PATCH] Magdir/windows cdrtfe Project *.CFP described as Generic INI + 1 section variant not detected

Jörg Jenderek joerg.jen.der.ek at gmx.net
Mon Feb 20 03:01:22 UTC 2023


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

A few days ago I ran Piriform CCleaner. Its registry cleaner has an
item for file extensions, and when I scanned for errors there it
complained about the suffix CFP.

So I looked for such files on my system. Samples such as
test-iso.cfp, Win95DE950.cfp and WIN-XP_SP3.cfp were created by myself
with the CD-burning tool cdrtfe. This is a Windows front end for the
cdrtools (cdrecord, mkisofs, readcd, cdda2wav) and other well-known
tools.

When running the file command version 5.44 on the CFP samples and
related files, I get output like:

WIN-XP_SP3.cfp:
	Generic INItialization configuration [FileExplorer]
Win95DE950.cfp:
	Generic INItialization configuration [FileExplorer]
gnucash-guide.hhmap:
	ASCII text, with CRLF line terminators
gnucash-help.hhmap:
	ASCII text, with CRLF line terminators
io.github.peazip.PeaZip.flatpakref:
	ASCII text, with very long lines (3799)
test-cfp.cfp:
	Generic INItialization configuration [FileExplorer]
test-unknown.ini:
	Generic INItialization configuration [bar]

With option --extension only the 3 byte sequence ??? or the wrong
ini/inf is shown, and with option -i only
application/x-wine-extension-ini or text/plain is shown.

For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). All examples are
described with low priority as "Generic INI configuration" by
ini.trid.xml. The CFP samples are described with high priority as
"cdrtfe Project" with mime type text/x-cfp by cfp-cdrtfe.trid.xml.
The flatpakref sample is described as "Flatpack Reference" with
mime type text/plain by flatpakref.trid.xml (see appended
trid-v-cfp.txt.gz).

For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). It does not
recognize the samples.

The INI based samples should be recognized by the subroutine ini-file
inside Magdir/windows. If samples are only described as Generic
INItialization configuration, like test-unknown.ini, that means a
suitable branch is missing inside the subroutine. The hhmap and
flatpakref samples are not even recognized as "Generic INItialization
configuration". So I first checked for this error by looking at the
program logic inside Magdir/windows. The first step in the subroutine
is to look for a left bracket, which is the beginning of a section.
This looks like:
 >0	search/8192	[
Then look for known keywords or section names. If one is found, then
display a specific describing text with the corresponding mime type
and file suffix. For the Windows Boot Loader this looks for example
like:
 >>&0	regex/c	\^boot\x20loader]	Windows boot.ini
 !:mime application/x-wine-extension-ini
 !:ext	ini
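
The first step can be sketched in Python. This is only a simplified
illustration under my own assumptions and not how the file utility
works internally; the function names first_section_name and
looks_like_boot_ini are made up for this sketch.

```python
from typing import Optional

SEARCH_LIMIT = 8192  # mirrors the range of "search/8192" in the magic line


def first_section_name(data: bytes) -> Optional[str]:
    """Return the text between the first '[' and the next ']', if any."""
    start = data.find(b"[", 0, SEARCH_LIMIT)
    if start < 0:
        return None
    end = data.find(b"]", start + 1)
    if end < 0:
        return None
    return data[start + 1:end].decode("latin-1")


def looks_like_boot_ini(data: bytes) -> bool:
    """Compare case-insensitively, similar to the /c flag of the regex."""
    name = first_section_name(data)
    return name is not None and name.lower() == "boot loader"
```

Called on a buffer such as b"[Boot Loader]\r\ntimeout=30\r\n" the
second function returns True, while a buffer without any bracket makes
first_section_name return None.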

Unfortunately the rules for INI files are not so strict. So it can
happen that the relevant section name is not the first one.
So if no known keyword is found after the opening bracket, then look
for the bracket of the second section. This is done by lines like:
 >>&0	default				x
 >>>&0	search/8192			[
Then look again for known keywords like before. So some Windows setup
INFormation files are detected now. If no known keyword is found in
the second section, then handle such samples by lines like:
 >>>>&0	default				x
 >>>>>&0	ubyte				x
 >>>>>>&-1 regex/T \^([A-Za-z0-9_\(\)\ ]+)\]\r	\
	Generic INItialization configuration [%-.40s
 !:mime	application/x-wine-extension-ini
 !:ext	ini/inf
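
The generic fallback for the second section can be illustrated with
Python's re module. This is a rough sketch under my own assumptions:
the relative offsets of the magic lines are approximated by plain
byte searches, the truncation to 40 characters imitates the %-.40s
format, and the function name describe_by_second_section is made up.

```python
import re
from typing import Optional

SEARCH_LIMIT = 8192  # mirrors the range of "search/8192"

# Same character class as the magic regex; the name must be followed
# directly by "]" and a carriage return.
SECTION_TAIL = re.compile(rb"([A-Za-z0-9_() ]+)\]\r")


def describe_by_second_section(data: bytes) -> Optional[str]:
    """Describe data by its second section name, or return None."""
    first = data.find(b"[", 0, SEARCH_LIMIT)
    if first < 0:
        return None
    second = data.find(b"[", first + 1, first + 1 + SEARCH_LIMIT)
    if second < 0:
        return None  # only 1 section: this branch is never reached
    match = SECTION_TAIL.match(data, second + 1)
    if match is None:
        return None
    shown = (match.group(1) + b"]").decode("ascii")[:40]
    return "Generic INItialization configuration [" + shown
```

Because the pattern requires a carriage return after the closing
bracket, a sample with plain LF line terminators is not described by
this sketch.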

The hhmap and flatpakref samples are not recognized because they
contain only 1 section. So the search for a second left bracket fails
and then nothing more happens. So I must catch the failing search on
that level by a default clause. So samples with only 1 section and an
unknown section name are now described by additionally inserted lines
like:
 >>>&0	default	x	Generic INItialization configuration
 >>>>0	string	x	\b, 1st line "%s"
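
The effect of this default clause can be sketched in Python as
follows. It is only an illustration of the intended behaviour under
my own assumptions, not the real implementation; the function name
describe_single_section is made up.

```python
from typing import Optional

SEARCH_LIMIT = 8192  # mirrors the range of "search/8192"


def describe_single_section(data: bytes) -> Optional[str]:
    """Describe data that contains exactly one left bracket in range."""
    first = data.find(b"[", 0, SEARCH_LIMIT)
    if first < 0:
        return None  # no section at all
    if data.find(b"[", first + 1, first + 1 + SEARCH_LIMIT) >= 0:
        return None  # a second section exists: handled by other branches
    # Like ">>>>0 string x": show the text at offset 0 up to the line end.
    first_line = data.splitlines()[0].decode("latin-1")
    return 'Generic INItialization configuration, 1st line "%s"' % first_line
```

For a buffer like b"[MAP]\r\n1=index.html\r\n" this returns the text
Generic INItialization configuration, 1st line "[MAP]".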

Unfortunately I found no page about the text file format used for CFP
and especially by cdrtfe. So I use the cdrtfe home page on SourceForge
as reference. That is expressed by comment lines like:
# URL:		https://cdrtfe.sourceforge.io/
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/c/cfp-cdrtfe.trid.xml
According to TrID the first line looks like:
[General]
So maybe this is too unspecific to be used as magic pattern. Luckily
the second section name seems to always look like:
[FileExplorer]
That is also shown by the current output of the file command. So I
chose this as test criterion. So in the branch handling second section
names this is now done by additional lines like:
 >>>>&0	string	FileExplorer]			cdrtfe Project
 !:mime	text/x-cfp
 !:ext	cfp
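
As a cross-check of this criterion, the test for the second section
name can be sketched in Python. This is a simplified illustration
under my own assumptions; the function name looks_like_cdrtfe_project
and the Key=Value lines in the usage note are made up and are not
taken from a real CFP sample.

```python
SEARCH_LIMIT = 8192  # mirrors the range of "search/8192"


def looks_like_cdrtfe_project(data: bytes) -> bool:
    """True if the second section name starts with "FileExplorer]"."""
    first = data.find(b"[", 0, SEARCH_LIMIT)
    if first < 0:
        return False
    second = data.find(b"[", first + 1, first + 1 + SEARCH_LIMIT)
    if second < 0:
        return False
    # Case-sensitive comparison, like a magic "string" test without flags.
    return data.startswith(b"FileExplorer]", second + 1)
```

A buffer like b"[General]\r\nKey=Value\r\n[FileExplorer]\r\n" gives
True, while a buffer where FileExplorer is the first and only section
gives False, because only the second section is tested here.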

Some information about Flatpak can be found on Wikipedia. That
information is expressed by additional lines like:
# URL:		https://en.wikipedia.org/wiki/Flatpak
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/f/flatpakref.trid.xml
According to the documentation such samples contain a section which
looks like:
[Flatpak Ref]
Often this appears as the first line. That is described by TrID via
flatpakref.trid.xml. There may also exist variants that start with a
comment (that is, with a # character), described via
flatpakref-rem.trid.xml. But my example belongs to the first variant.
So I insert inside Magdir/windows, after the first bracket search,
additional lines that look like:
 >>&0	string	Flatpak\ Ref]	Flatpak repository reference
 !:mime	application/vnd.flatpak.ref
 !:ext	flatpakref
According to the Linux database, which can be found for example at
reposcope.com, such samples are called "Flatpak repository
reference" in English. Furthermore, instead of the generic mime
type text/plain I chose the displayed one. But I found no such
officially registered type on iana.org.
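
Independently of the magic lines, the presence of this section can be
cross-checked with Python's standard configparser module. This is
only a sketch under my own assumptions and not part of the patch; the
function name has_flatpak_ref_section and the Name= value in the
usage note are made up.

```python
import configparser


def has_flatpak_ref_section(text: str) -> bool:
    """True if text parses as INI and has a [Flatpak Ref] section."""
    # interpolation=None avoids special handling of "%" in values and
    # strict=False tolerates duplicate sections or keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    try:
        parser.read_string(text)
    except configparser.Error:
        return False  # for example text without any section header
    return parser.has_section("Flatpak Ref")
```

A text like "[Flatpak Ref]\nName=org.example.App\n" gives True. The
variant that starts with a # comment line is accepted too, because
configparser skips such comment lines by default.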

After applying the above mentioned modifications by the patch
file-5.44-windows-cfp.diff, most of the samples are now described
with more details and the correct name suffix. This now looks like:

WIN-XP_SP3.cfp:
	cdrtfe Project
Win95DE950.cfp:
	cdrtfe Project
gnucash-guide.hhmap:
	Generic INItialization configuration, 1st line "[MAP]"
gnucash-help.hhmap:
	Generic INItialization configuration, 1st line "[MAP]"
io.github.peazip.PeaZip.flatpakref:
	Flatpak repository reference
test-cfp.cfp:
	cdrtfe Project
test-unknown.ini:
	Generic INItialization configuration [bar]

I hope my diff file can be applied in a future version of the file
utility.

There is still something to do: add test lines for samples with the
hhmap name suffix.

With best wishes,
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY/LiggAKCRCv8rHJQhrU
1kTLAJ40+dxPF0WkyMDc+xPK5AfTOerERgCgrVfmMCXgLW7q7GmkSL46x2UodeM=
=GQ5N
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-cfp.txt.gz
Type: application/x-gzip
Size: 580 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230220/4550e04a/attachment-0001.bin>
-------------- next part --------------
--- file-5.44/magic/Magdir/windows.old	2022-12-02 17:18:19.000000000 +0100
+++ file-5.44/magic/Magdir/windows	2023-02-20 03:50:26.554355400 +0100
@@ -830,6 +830,14 @@
 !:mime	text/x-ms-tag
 # like: DATA.TAG
 !:ext	tag
+# URL:		https://en.wikipedia.org/wiki/Flatpak
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/f/flatpakref.trid.xml
+# Note:		called "Flatpack Reference" by TrID
+>>&0	string		Flatpak\ Ref]					Flatpak repository reference
+#!:mime	text/plain
+# https://reposcope.com/mimetype/application/vnd.flatpak.ref
+!:mime	application/vnd.flatpak.ref
+!:ext	flatpakref
 # unknown keyword after opening bracket
 >>&0	default				x
 #>>>&0	string/c			x	UNKNOWN [%s
@@ -839,6 +847,12 @@
 >>>>&0	string/c			version				Windows setup INFormation
 !:mime application/x-setupscript
 !:ext	inf
+# From:		Joerg Jenderek
+# URL:		https://cdrtfe.sourceforge.io/
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/c/cfp-cdrtfe.trid.xml
+>>>>&0	string				FileExplorer]			cdrtfe Project
+!:mime	text/x-cfp
+!:ext	cfp
 # https://en.wikipedia.org/wiki/Initialization_file	Windows Initialization File or other
 >>>>&0	default				x
 >>>>>&0	ubyte				x
@@ -850,6 +864,9 @@
 !:mime	application/x-wine-extension-ini
 #!:mime	text/plain
 !:ext	ini/inf
+# samples with only 1 and unknown section name
+>>>&0	default				x				Generic INItialization configuration
+>>>>0	string				x				\b, 1st line "%s"
 # UTF-16 BOM
 0	ubeshort		=0xFFFE
 # look for phrase of Windows policy ADMinistrative template (UTF-16 by adm-uni.trid.xml)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.44-windows-cfp.diff.sig
Type: application/octet-stream
Size: 960 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230220/4550e04a/attachment-0001.obj>

