[File] [PATCH] Magdir/database Mozilla Mork database *.MSF versus *.DAT *.MAB

Jörg Jenderek (GMX) joerg.jen.der.ek at gmx.net
Sat Oct 7 22:07:23 UTC 2023


Hello,

Some months ago I migrated my system to Windows 10, so I also had to
transfer my mail data handled by Thunderbird. I therefore looked at the
files belonging to Thunderbird.
When running the file command version 5.45 on such Thunderbird samples
I get output like:

Drafts.msf:                   Mozilla Mork database, version 1.4
Drafts_new.msf:               Mozilla Mork database, version 1.4
INBOX.msf:                    Mozilla Mork database, version 1.4
Trash.msf:                    Mozilla Mork database, version 1.4
empty.mab:                    Mozilla Mork database, version 1.4
fmt-612-signature-id-948.mab: exported SGML document, ASCII text
			      , with no line terminators
panacea.dat:                  Mozilla Mork database, version 1.4

With option -i only the generic text/plain is displayed, and with option
--extension only ??? is displayed.

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). Many of the MSF
samples are described as "Mozilla Mail Summary file" by msf.trid.xml.
The sample panacea.dat is described as "Mozilla Mail folder cache"
by dat-mork.trid.xml with the correct suffix. The real MAB samples are
described as "Mozilla Address Book" by mab.trid.xml with the correct
suffix (see the appended trid-v-mork.txt.gz).

For comparison I also ran the file format identification
utility DROID (see https://sourceforge.net/projects/droid/).
Here all examples are described as "Mork" by PUID fmt/612.

TrID lists the file name extension used and, with the -v option, often
also the related URL pointing to information about the file format.

With the help of these tools I added more lines. This is now expressed
inside Magdir/database by additional comment lines like:
# URL:		http://fileformats.archiveteam.org/wiki/Mork
#		https://en.wikipedia.org/wiki/Mork_(file_format)
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/d/msf.trid.xml
#		defs/m/mab.trid.xml
#		defs/d/dat-mork.trid.xml

In the current Magdir/database the description is done by lines like:
  0	string	//\ <!--\ <mdb:mork:z\ v="	Mozilla Mork database
  >23	string	x		\b, version %.3s
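
To illustrate the offsets used in these two lines, here is a minimal
Python sketch (not part of the patch). The name mork_version and the
sample bytes are only assumptions for illustration; the sketch mimics
the test at offset 0 and the "%.3s" output at offset 23, it is not how
file(1) itself is implemented.

```python
# The magic string tested at offset 0 (the "\ " in the magic file is an
# escaped space).
MORK_MAGIC = b'// <!-- <mdb:mork:z v="'


def mork_version(data: bytes):
    """Return the 3-character version string, or None if not Mork."""
    if not data.startswith(MORK_MAGIC):
        return None
    # The magic string is 23 bytes long, so the version starts at
    # offset 23, which is why the magic line uses ">23".
    offset = len(MORK_MAGIC)
    return data[offset:offset + 3].decode("ascii", errors="replace")


if __name__ == "__main__":
    # Hypothetical header line, made up for this example.
    sample = b'// <!-- <mdb:mork:z v="1.4"/> -->\n'
    print(len(MORK_MAGIC), mork_version(sample))
```

Since the version occupies offsets 23 to 25, any further test on the
content can start at offset 26.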

Instead of the generic text/plain MIME type I chose a user-defined one.
Following TrID, I look for other specific keywords to do
sub-classification with different file name extensions. This is now done
by lines like:

  0	string	//\ <!--\ <mdb:mork:z\ v="	Mozilla Mork database
  !:mime	text/x-mozilla-mork
  >23	string	x		\b, version %.3s
  >26	search/7516	mailboxName		\b, Mail Summary file
  !:ext						msf
  >26	search/192	addrbk			\b, Address Book
  !:ext						mab
  >26	search/210	indexingPriority	\b, Mail folder cache
  !:ext						dat
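
The sub-classification above can be sketched in Python roughly as
follows. This is only an approximation under stated assumptions: the
names SUBTYPES and classify_mork are made up for this example, and the
search window is modelled as "the keyword may start at any of the next
N positions from offset 26", which is my reading of search/N in
magic(5). As in the magic file, the three tests are independent, so
more than one of them may match.

```python
MORK_MAGIC = b'// <!-- <mdb:mork:z v="'
SEARCH_START = 26  # first byte after the 3-character version

# (keyword, search range, description, extension) taken from the
# magic lines above.
SUBTYPES = [
    (b"mailboxName", 7516, "Mail Summary file", "msf"),
    (b"addrbk", 192, "Address Book", "mab"),
    (b"indexingPriority", 210, "Mail folder cache", "dat"),
]


def classify_mork(data: bytes):
    """Return a list of (description, extension) for every keyword hit."""
    if not data.startswith(MORK_MAGIC):
        return []
    hits = []
    for keyword, span, label, ext in SUBTYPES:
        # Allow the keyword to start at SEARCH_START .. SEARCH_START+span-1.
        end = SEARCH_START + span - 1 + len(keyword)
        if data.find(keyword, SEARCH_START, end) != -1:
            hits.append((label, ext))
    return hits


if __name__ == "__main__":
    # Synthetic data, not a real Thunderbird file.
    sample = MORK_MAGIC + b'1.4"/> -->\n< <(a=c)> (80=mailboxName)>'
    print(classify_mork(sample))
```

A file that starts with the Mork magic but contains none of the three
keywords inside its window still gets only the generic description.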

After applying the above-mentioned modifications with the patch
file-5.45-database-mork.diff, my Thunderbird samples are
described in more accurate detail. The output now looks like:
Drafts.msf:                   Mozilla Mork database, version 1.4
			      , Mail Summary file
Drafts_new.msf:               Mozilla Mork database, version 1.4
			      , Mail Summary file
INBOX.msf:                    Mozilla Mork database, version 1.4
			      , Mail Summary file
Trash.msf:                    Mozilla Mork database, version 1.4
			      , Mail Summary file
empty.mab:                    Mozilla Mork database, version 1.4
			      , Address Book
fmt-612-signature-id-948.mab: ASCII text, with no line terminators
panacea.dat:                  Mozilla Mork database, version 1.4
			      , Mail folder cache

I hope my diff file can be applied in a future version of the file
utility.

With best wishes,
Jörg Jenderek
--
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-mork.txt.gz
Type: application/x-gzip
Size: 635 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231008/bd12e159/attachment.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/database.old	2023-02-09 18:43:52.000000000 +0100
+++ file-5.45/magic/Magdir/database	2023-10-07 23:28:21.567455600 +0200
@@ -873,4 +873,27 @@
 # From: David Korth <gerbilsoft at gerbilsoft.com>
+# Update:	Joerg Jenderek
+# URL:		http://fileformats.archiveteam.org/wiki/Mork
+#		https://en.wikipedia.org/wiki/Mork_(file_format)
+# Note:		called "Mork" by DROID via fmt/612
 0	string	//\ <!--\ <mdb:mork:z\ v="	Mozilla Mork database
+# display Mozilla Mork database (strength=260=260+0) before "exported SGML document" (strength=28=38-10) via ./sgml
+#!:strength +0
+#!:mime	text/plain
+!:mime	text/x-mozilla-mork
+# version like 1.4
 >23	string	x		\b, version %.3s
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/d/msf.trid.xml
+# Note:		called "Mozilla Mail Summary file" by TrID
+>26	search/7516	mailboxName		\b, Mail Summary file
+# like: Archives.msf Drafts.msf INBOX.msf Junk.msf Sent.msf Templates.msf Trash.msf 
+!:ext						msf
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/m/mab.trid.xml
+# Note:		called "Mozilla Address Book" by TrID
+>26	search/192	addrbk			\b, Address Book
+!:ext						mab
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/d/dat-mork.trid.xml
+# Note:		called "Mozilla Mail folder cache" by TrID
+>26	search/210	indexingPriority	\b, Mail folder cache
+# panacea.dat
+!:ext						dat
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-database-mork.diff.sig
Type: application/octet-stream
Size: 858 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231008/bd12e159/attachment.obj>

