[File] [PATCH] Magdir/compress,blender Blender3D; mime type ; some extensions wrong or missing BLEND XOJ ADZ DIA GNUCASH KMY

Jörg Jenderek joerg.jen.der.ek at gmx.net
Tue Dec 20 00:56:19 UTC 2022


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

From time to time, as a routine check, I inspect my systems with a
disk visualization tool. I chose SequoiaView because I can easily add
specific colours for different file name extensions, as ASCII text,
to its configuration file. I then look for big or numerous areas
that are uncoloured or grey. This means the tool does not recognize
the extensions found.

One uncoloured extension is BLEND, so I looked for such samples.
On my systems these are part of the OpenShot Video Editor.

When running the file command version 5.43 on such samples and on
other gzip compressed files, I get output like:

2020-07-19-Note-16-24.xoj:   gzip compressed data,
			     from NTFS filesystem (NT),
			     original size modulo 2^32 3463
BUILDINGEDGE.gxc:            General CADD, Drawing or Component,
			     original size modulo 2^32 701
HANDS96.MCC:                 Monu-Cad Drawing, Component or Font,
			     original size modulo 2^32 11764
Logo.xcfgz:                  gzip compressed data,
			     was "Logo.xcf",
			     last modified: Fri May  4 23:38:28 2018
			     , max compression,
			     from FAT filesystem (MS-DOS, OS/2, NT),
			     original size modulo 2^32 63213
MYgnucash-gz.gnucash:        gzip compressed data,
			     from TOPS/20,
			     original size modulo 2^32 2214
MYrdata.RData:               gzip compressed data,
			     from HPFS filesystem (OS/2, NT),
			     original size modulo 2^32 33
PostbankTest.kmy:            gzip compressed data,
			     was "",
			     last modified: Mon Mar  9 14:49:24 2020,
			     from Unix,
			     original size modulo 2^32 16799
RSI-Mega-Demo_Disk1.adz:     gzip compressed data,
			     was "RSI-MD1.adf",
			     last modified: Tue Apr 22 20:09:00 2008,
			     from Unix,
			     original size modulo 2^32 901120
bzless.1.gz:                 gzip compressed data
			     , max compression,
			     from Unix, truncated
chess_board.ps.gz:           gzip compressed data,
			     was "chess_board.ps",
			     last modified: Tue Jan  6 23:29:47 2009
			     , max compression,
			     from Unix, truncated
earth.blend:                 Blender3D, saved as
			     64-bits little endian with version 2.61
earth_real.blend:            Blender3D, saved as
			     64-bits little endian with version 2.72
escher.blend:                Blender3D, saved as
			     32-bits big endian with version 1.70
explode.blend:               Blender3D, saved as
			     64-bits little endian with version 2.62
file-5.43.tar.gz:            gzip compressed data,
			     last modified: Tue Sep 13 18:49:49 2022
			     , max compression,
			     from Unix,
			     original size modulo 2^32 4392960
foo.gz:                      gzip compressed data,
			     was "small",
			     last modified: Mon Aug 12 18:22:04 2013,
			     from Unix, truncated
kleopatra_splashscreen.svgz: gzip compressed data,
			     was "kleo_1b_splashscreen.svg",
			     last modified: Thu Dec 17 14:42:31 2009,
			     from Unix,
			     original size modulo 2^32 314947
lens_flare.blend:            Blender3D, saved as
			     64-bits little endian with version 2.80
relative.blend:              Blender3D, saved as
			     32-bits little endian with version 1.66
roundvehicle.blend:          Blender3D, saved as
			     32-bits little endian with version 2.11
snow.blend:                  Blender3D, saved as
			     64-bits little endian with version 2.56
text-rotate.dia:             gzip compressed data
			     , max compression,
			     from NTFS filesystem (NT),
			     original size modulo 2^32 316410
trees.blend:                 gzip compressed data,
			     from Unix,
			     original size modulo 2^32 1231020

With option --extension I get unexpected output like:

2020-07-19-Note-16-24.xoj:   ???
BUILDINGEDGE.gxc:            gxc/gxd
HANDS96.MCC:                 mcc/mcd/fnt
Logo.xcfgz:                  gz/tgz/tpz/zabw/svgz
MYgnucash-gz.gnucash:        ???
MYrdata.RData:               ???
PostbankTest.kmy:            gz/tgz/tpz/zabw/svgz
RSI-Mega-Demo_Disk1.adz:     gz/tgz/tpz/zabw/svgz
bzless.1.gz:                 gz/tgz/tpz/ipk/vbox-extpack/svgz
chess_board.ps.gz:           gz/tgz/tpz/zabw/svgz
earth.blend:                 ???
earth_real.blend:            ???
escher.blend:                ???
explode.blend:               ???
file-5.43.tar.gz:            ???
foo.gz:                      gz/tgz/tpz/zabw/svgz
kleopatra_splashscreen.svgz: gz/tgz/tpz/zabw/svgz
lens_flare.blend:            ???
relative.blend:              ???
roundvehicle.blend:          ???
snow.blend:                  ???
text-rotate.dia:             ???
trees.blend:                 ???

With file version 5.37 I get output closer to what I expect:

2020-07-19-Note-16-24.xoj:   gz/tgz/tpz/ipk/vbox-extpack/svgz
BUILDINGEDGE.gxc:            gxc/gxd
bzless.1.gz:                 gz/tgz/tpz/ipk/vbox-extpack/svgz
chess_board.ps.gz:           gz/tgz/tpz/zabw/svgz
earth.blend:                 ???
earth_real.blend:            ???
escher.blend:                ???
explode.blend:               ???
file-5.43.tar.gz:            gz/tgz/tpz/ipk/vbox-extpack/svgz
foo.gz:                      gz/tgz/tpz/zabw/svgz
HANDS96.MCC:                 mcc/mcd/fnt
kleopatra_splashscreen.svgz: gz/tgz/tpz/zabw/svgz
lens_flare.blend:            ???
Logo.xcfgz:                  gz/tgz/tpz/zabw/svgz
MYgnucash-gz.gnucash:        gz/tgz/tpz/ipk/vbox-extpack/svgz
MYrdata.RData:               gz/tgz/tpz/ipk/vbox-extpack/svgz
PostbankTest.kmy:            gz/tgz/tpz/zabw/svgz
relative.blend:              ???
roundvehicle.blend:          ???
RSI-Mega-Demo_Disk1.adz:     gz/tgz/tpz/zabw/svgz
snow.blend:                  ???
text-rotate.dia:             gz/tgz/tpz/ipk/vbox-extpack/svgz
trees.blend:                 gz/tgz/tpz/ipk/vbox-extpack/svgz

With option -i only the generic MIME type application/octet-stream is
shown for the non-gzipped Blender samples.

For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). It also correctly
identifies gzip compressed samples like trees.blend as
"GZipped data" with MIME type application/gzip, but with the wrong
extensions GZ/GZIP, by ark-gz.trid.xml. The other examples are
described as "Blender 3D data" with extension BLEND by blend.trid.xml.
Because these examples start with the uppercase letter B, they are
also wrongly described, with low priority, as
"PrintFox/Pagefox bitmap (320x200)" by bitmap-printfox-s.trid.xml
(see appended trid-v-blend.txt.gz).

For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). The compressed
samples are described as "GZIP Format" by PUID x-fmt/266. Here DROID
complains about the extension BLEND instead of GZ. The 64 bit
examples are described as "Blender 3D" without MIME type and with
version phrase "64 bit" by PUID fmt/903, and the others are described
with version phrase "32 bit" by PUID fmt/902 (see appended
droid-blend.csv.gz).

Luckily, with the information given by the other tools, I also found
a page about BLEND on the Archive Team file formats wiki, where I
also found samples for download. This information is now expressed
by comment lines inside Magdir/blender like:
# URL: 		http://fileformats.archiveteam.org/wiki/BLEND
#		http://www.blender.org/
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/b/blend.trid.xml
#		http://formats.kaitai.io/blender_blend/index.html

The detection happens inside Magdir/blender by lines starting like:
 0		string	=BLENDER	Blender3D,
 >7		string	=_		saved as 32-bits
So this now becomes like:
 0		string	=BLENDER	Blender3D,
 !:mime		application/x-blender
 !:ext		blend
 >7		string	=_		saved as 32-bits
On Linux the phrase blender is also mentioned as a suffix, but in my
examples I only found blend as a suffix. According to the MIME
database on Linux, a user-defined type is used instead of the generic
application/octet-stream.
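
As a rough illustration of what these magic lines test, here is a
minimal Python sketch that decodes the first 12 bytes of an
uncompressed sample. Only the "BLENDER" string and the "_" marker for
32-bits come from the lines quoted above; the "-" marker for 64-bits,
the "v"/"V" endianness byte and the three version digits are my
assumptions based on the format descriptions referenced above. The
function name describe_blend_header is hypothetical.

```python
def describe_blend_header(header: bytes) -> str:
    """Describe the first 12 bytes of an uncompressed .blend file."""
    if len(header) < 12 or header[:7] != b"BLENDER":
        raise ValueError("not an uncompressed Blender file")
    # Byte 7: pointer size ("_" is quoted in the magic; "-" is an assumption).
    pointer = {ord("_"): "32-bits", ord("-"): "64-bits"}.get(
        header[7], "unknown pointer size")
    # Byte 8: endianness (assumption: "v" little endian, "V" big endian).
    endian = {ord("v"): "little endian", ord("V"): "big endian"}.get(
        header[8], "unknown endianness")
    # Bytes 9-11: three ASCII digits of the version (assumption), e.g. "261".
    digits = header[9:12].decode("ascii", errors="replace")
    return (f"Blender3D, saved as {pointer} {endian} "
            f"with version {digits[0]}.{digits[1:]}")


print(describe_blend_header(b"BLENDER-v261"))
```

Under these assumptions the header of a sample like earth.blend would
be described the same way as in the file output shown earlier.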

That was easy, but now comes the tricky part. As mentioned in the
documentation, Blender samples can be stored compressed. The
compression used is gzip and the suffix is still BLEND. So such
compressed samples should be handled by Magdir/compress.
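
To illustrate the two variants, here is a small Python sketch that
tells a plain sample from a gzipped one by decompressing only the
start of the stream. This is not how the magic entries work (they only
look at the gzip header); it is just a way to double check samples by
hand. The function name blend_kind is hypothetical.

```python
import gzip
import io
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member (RFC 1952)


def blend_kind(data: bytes) -> str:
    """Return 'plain', 'gzip' or 'other' for the content of a .blend file."""
    if data[:7] == b"BLENDER":
        return "plain"
    if data[:2] == GZIP_MAGIC:
        try:
            # Decompress only the first 7 bytes of the payload.
            with gzip.GzipFile(fileobj=io.BytesIO(data)) as stream:
                start = stream.read(7)
        except (OSError, EOFError, zlib.error):
            return "other"  # damaged or truncated gzip data
        return "gzip" if start == b"BLENDER" else "other"
    return "other"


print(blend_kind(gzip.compress(b"BLENDER-v272" + b"\x00" * 100)))
```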

The standard suffix for gzipped files is GZ. Much software uses this
compression to store its files, and many programs use other file
name extensions. So the file command shows all known possible file
name extensions. This is done inside Magdir/compress by a line like:
!:ext	gz/tgz/tpz/ipk/vbox-extpack/svgz
So the first thought is to just add the blend suffix to this list.
For version 5.37 (compress,v 1.75 2019/04/19) this thought is
correct. For version 5.43 it is not so easy, because there exist
many sub-branches and apparently some parts are missing. So I
compared the 5.37 definition with version 5.43
(compress,v 1.83 2022/08/16).

In the old variant one branch shows the size information as the last
part of the message, via lines like:
 >>>-4	ulelong		x	\b, original size modulo 2^32 %u
 !:ext	gz/tgz/tpz/ipk/vbox-extpack/svgz
In the current definition this has become:
 >>-0	offset		>48
 >>>-4	ulelong		x	\b, original size modulo 2^32 %u
 >>-0	offset		<48	\b, truncated
 !:ext	gz/tgz/tpz/ipk/vbox-extpack/svgz
So an additional level of sub-classification was introduced here:
first for "original big" files ( > 48), then for "small" ones ( < 48,
called truncated). But the extension line now follows the truncated
test, so the extension information is shown only for truncated
samples like bzless.1.gz.
To do this correctly, you must duplicate the extension part and
insert it at every sub-classification level. Then of course you must
also correct the range: with the current definition, no size
information is shown for samples where the tested value is exactly 48.
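
The gap in the range can be sketched in Python as follows. This is
only a model, assuming the two offset tests behave as plain numeric
comparisons against the same value; the function name size_branch is
hypothetical.

```python
def size_branch(value: int) -> str:
    """Mimic the tests '>>-0 offset >48' and '>>-0 offset <48'."""
    if value > 48:
        return "original size modulo 2^32 ..."
    if value < 48:
        return "truncated"
    return ""  # exactly 48: neither test matches, nothing is printed


for n in (47, 48, 49):
    print(n, repr(size_branch(n)))
```

A test such as <49 (or >47 on the other side) would close the gap.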

Furthermore I see no benefit in showing "truncated" instead of the
real size value for "small" files. On my systems I found 206
truncated samples. Some are compressed man pages containing just a
reference to other man pages. Some samples are compressed log
files. Often the log file was empty, and then the reported
original size is 0.
Some samples are just compressed data (in my case from TV-Browser).
I checked the reported size by gunzipping the samples and looking at
the file size of the result. It was always valid, and I see no hint
in the specification that it can be invalid.

So I undid this last sub-classification level, and then I only had to
add the undetected suffixes to the extension lines.

While working on this item I also added more gzipped extensions
besides blend.
xoj are compressed XML files of the note-taking program Xournal
(https://xournal.sourceforge.net/manual.html).
gnucash are optionally compressed XML files of the finance software
GnuCash (https://wiki.gnucash.org/wiki/GnuCash_XML_format).
dia are by default compressed XML files of the diagram software Dia
(https://en.wikipedia.org/wiki/Dia_(software)).
adz is used for gzipped adf, that is Amiga Disk Files
(http://fileformats.archiveteam.org/wiki/ADF_(Amiga)).
kmy are optionally compressed XML files of the finance software
KMyMoney
(https://docs.kde.org/stable5/en/kmymoney/kmymoney/details.formats.compressed.html).
xcfgz is used for xcf.gz, that is optionally gzip compressed xcf
pictures of the graphics tool GIMP
(http://fileformats.archiveteam.org/wiki/XCF).
rdata is the by default compressed workspace of the statistical
software R (https://en.wikipedia.org/wiki/R_(programming_language)).

So for the branch with blender the extensions are now shown by a line
like:
 !:ext	gz/tgz/tpz/ipk/vbox-extpack/svgz/blend/dia/gnucash/rdata/xoj
In the other branch the extensions are now shown by a line like:
 !:ext	gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz

After applying the above mentioned modifications with the patches
file-5.43-blender.diff and file-5.43-compress-blend.diff,
my Blender samples are still described as before. But now with
option --extension I get correct output like:

2020-07-19-Note-16-24.xoj:   gz/tgz/tpz/ipk/vbox-extpack/svgz/
			     blend/dia/gnucash/rdata/xoj
BUILDINGEDGE.gxc:            gxc/gxd
HANDS96.MCC:                 mcc/mcd/fnt
Logo.xcfgz:                  gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz
MYgnucash-gz.gnucash:        gz/tgz/tpz/ipk/vbox-extpack/svgz/
			     blend/dia/gnucash/rdata/xoj
MYrdata.RData:               gz/tgz/tpz/ipk/vbox-extpack/svgz/
			     blend/dia/gnucash/rdata/xoj
PostbankTest.kmy:            gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz
RSI-Mega-Demo_Disk1.adz:     gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz
bzless.1.gz:                 gz/tgz/tpz/ipk/vbox-extpack/svgz/
			     blend/dia/gnucash/rdata/xoj
chess_board.ps.gz:           gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz
earth.blend:                 blend
earth_real.blend:            blend
escher.blend:                blend
explode.blend:               blend
file-5.43.tar.gz:            gz/tgz/tpz/ipk/vbox-extpack/svgz/
			     blend/dia/gnucash/rdata/xoj
foo.gz:                      gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz
kleopatra_splashscreen.svgz: gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz
lens_flare.blend:            blend
relative.blend:              blend
roundvehicle.blend:          blend
snow.blend:                  blend
text-rotate.dia:             gz/tgz/tpz/ipk/vbox-extpack/svgz/
			     blend/dia/gnucash/rdata/xoj
trees.blend:                 gz/tgz/tpz/ipk/vbox-extpack/svgz/
			     blend/dia/gnucash/rdata/xoj
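
Such a listing can be checked mechanically. The following Python
sketch tests whether the suffix of a sample appears in the
slash-separated list reported for it; it compares case-insensitively
(RData versus rdata) and ignores the line wrapping used above. The
function name extension_listed is hypothetical.

```python
def extension_listed(filename: str, reported: str) -> bool:
    """Check whether the file's suffix is in the reported extension list."""
    suffix = filename.rsplit(".", 1)[-1].lower()
    # Remove any whitespace from wrapped output, then split on "/".
    listed = "".join(reported.split()).lower().split("/")
    return suffix in listed


wrapped = "gz/tgz/tpz/ipk/vbox-extpack/svgz/\n\t\tblend/dia/gnucash/rdata/xoj"
print(extension_listed("MYrdata.RData", wrapped))
print(extension_listed("trees.blend", "???"))
```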


With best wishes,

Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY6EIKQAKCRCv8rHJQhrU
1j92AJ408AX2Mck0uAh/2SmynondmD6bZgCePgvr68BoDVAN+kTzZSOOVp4svzY=
=twe5
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-blend.txt.gz
Type: application/x-gzip
Size: 931 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221220/17bd95d8/attachment-0002.bin>
-------------- next part --------------
--- file-5.43/magic/Magdir/compress.old	2022-08-16 13:16:39.000000000 +0200
+++ file-5.43/magic/Magdir/compress	2022-12-20 01:06:29.069081600 +0100
@@ -20,3 +20,3 @@
 # Reference: https://tools.ietf.org/html/rfc1952
-# Update: Joerg Jenderek, Apr 2019
+# Update: Joerg Jenderek, Apr 2019, Dec 2022
 #   Edited by Chris Chittleborough <cchittleborough at yahoo.com.au>, March 2002
@@ -63,5 +63,4 @@
 # size of the original (uncompressed) input data modulo 2^32
->>-0	offset		>48
+# TODO: check for GXD MCD cad the reported size
 >>>-4	ulelong		x		\b, original size modulo 2^32 %u
->>-0	offset		<48		\b, truncated
 # gzipped TAR or VirtualBox extension package
@@ -70,3 +69,3 @@
 # https://www.w3.org/TR/SVG/mimereg.html
-#!:mime	image/image/svg+xml-compressed
+#!:mime	image/svg+xml-compressed
 #	zlib.3.gz
@@ -76,3 +75,8 @@
 #	Oracle_VM_VirtualBox_Extension_Pack-5.0.12-104815.vbox-extpack
-!:ext	gz/tgz/tpz/ipk/vbox-extpack/svgz
+#	trees.blend			http://fileformats.archiveteam.org/wiki/BLEND
+#	2020-07-19-Note-16-24.xoj	https://xournal.sourceforge.net/manual.html
+#	MYgnucash-gz.gnucash		https://wiki.gnucash.org/wiki/GnuCash_XML_format
+#	text-rotate.dia			https://en.wikipedia.org/wiki/Dia_(software)
+#	MYrdata.RData			https://en.wikipedia.org/wiki/R_(programming_language)
+!:ext	gz/tgz/tpz/ipk/vbox-extpack/svgz/blend/dia/gnucash/rdata/xoj
 # FNAME/FCOMMENT bit implies file name/comment as iso-8859-1 text
@@ -85,8 +89,9 @@
 #	kleopatra_splashscreen.svgz	gzipped .svg
-!:ext	gz/tgz/tpz/zabw/svgz
+#	RSI-Mega-Demo_Disk1.adz		gzipped .adf	http://fileformats.archiveteam.org/wiki/ADF_(Amiga)
+#	PostbankTest.kmy		gzipped XML	https://docs.kde.org/stable5/en/kmymoney/kmymoney/details.formats.compressed.html
+#	Logo.xcfgz			gzipped .xcf	http://fileformats.archiveteam.org/wiki/XCF
+!:ext	gz/tgz/tpz/zabw/svgz/adz/kmy/xcfgz
 >>0	use	gzip-info
 # size of the original (uncompressed) input data modulo 2^32
->>-0	offset		>48
->>>-4	ulelong		x		\b, original size modulo 2^32 %u
->>-0	offset		<48		\b, truncated
+>>-4	ulelong		x		\b, original size modulo 2^32 %u
 #	display information of gzip compressed files
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.43-compress-blend.diff.sig
Type: application/octet-stream
Size: 1192 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221220/17bd95d8/attachment-0002.obj>
-------------- next part --------------
--- file-5.43/magic/Magdir/blender.old	2021-02-23 01:49:24.000000000 +0100
+++ file-5.43/magic/Magdir/blender	2022-12-20 01:20:40.846092700 +0100
@@ -8,4 +8,15 @@
 # GLOB chunk was moved near start and provides subversion info since 2.42
-
+# Update:	Joerg Jenderek
+# URL: 		http://fileformats.archiveteam.org/wiki/BLEND
+#		http://www.blender.org/
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/b/blend.trid.xml
+#		http://formats.kaitai.io/blender_blend/index.html
+# Note:		called "Blender 3D data" by TrID
+#		and gzip compressed variant handled by ./compress
 0		string	=BLENDER	Blender3D,
+#!:mime		application/octet-stream
+!:mime		application/x-blender
+!:ext		blend
+# no sample found with extension blender
+#!:ext		blend/blender
 >7		string	=_		saved as 32-bits
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.43-blender.diff.sig
Type: application/octet-stream
Size: 640 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221220/17bd95d8/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: droid-blend.csv.gz
Type: application/x-gzip
Size: 1067 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20221220/17bd95d8/attachment-0003.bin>

