[File] [PATCH] Magdir/mail.news for Mailbox *.mbox

Christos Zoulas christos at zoulas.com
Mon Oct 9 15:57:12 UTC 2023


Added, thanks!

christos

> On Oct 8, 2023, at 1:39 PM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
> 
> Hello,
> 
> Some months ago I migrated my system to Windows 10, so I also had to
> transfer my mail, which is handled by Thunderbird. I ran into some
> problems, so I looked at the files belonging to Thunderbird.
> When running the file command version 5.45 on the mail messages, I get
> output like:
> 
> INBOX:                         Unicode text, UTF-8 text, with CRLF line terminators
> Stromanbieter.mbox:            ASCII text, with CRLF, LF line terminators
> Verivox.mbox:                  ASCII text, with CRLF line terminators
> file5.18patch-dyadic.mbox:     ASCII text
> file5.19patchWindows.PIF.mbox: ASCII text
> 
> With option -i only the generic text/plain is displayed, and with
> option --extension only ??? is displayed.
> 
> For comparison I ran the file format identification utility TrID
> (see https://mark0.net/soft-trid-e.html). Many of the mail samples are
> described with highest priority as "Standard Unix Mailbox" by
> mbox.trid.xml, with the correct file name suffix MBOX and MIME type
> application/mbox. All samples are described with low priority as
> "E-Mail message (Var. 2)" by eml-var2.trid.xml, with MIME type
> message/rfc822 and the wrong file name suffix EML (see the appended
> trid-v-mbox.txt.gz).
> 
> For comparison I also ran the file format identification utility
> DROID (see https://sourceforge.net/projects/droid/). Here all examples
> are described as "MIME Email" with MIME type message/rfc822 by PUID
> fmt/950. For samples with the mbox suffix and for those without any
> file name suffix, the names are considered invalid (see
> EXTENSION_MISMATCH true in droid-mbox.csv.gz).
> 
> According to the shared-mime-info database, the samples are called
> "Mailbox file" with MIME type application/mbox and file name suffix
> mbox.
> 
> TrID lists the file name extension used and, with the -v option, often
> also the related URL pointing to information about the file format.
> 
> With the help of these tools I added more lines. This is now expressed
> inside Magdir/mail.news, after the other mail/news entries, by
> additional comment lines like:
> # URL:		https://tools.ietf.org/rfc/rfc4155.txt
> # Reference:	http://mark0.net/download/triddefs_xml.7z
> #		defs/m/mbox.trid.xml
> 
> According to all tools and documentation, the mail samples start with
> the capitalized word From followed by one space character. Instead of
> text/plain, an officially registered MIME type should be used.
> So these are now described by lines like:
> 0	string			From\040	Mailbox text
> !:mime	application/mbox
> !:ext	/mbox
>  >0	string		x	\b, 1st line "%s"
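
For readers who do not read magic syntax daily, here is a minimal
Python sketch of what the entry above tests. The function name
describe_mailbox is hypothetical, and this only illustrates the
byte-prefix check and the first-line output shown in the mail; it is
not how file(1) itself is implemented.

```python
from typing import Optional


def describe_mailbox(data: bytes) -> Optional[str]:
    """Return a file(1)-style description, or None if the prefix does not match."""
    # The magic entry matches the literal bytes "From" plus one space at offset 0.
    if not data.startswith(b"From "):
        return None
    # Show the complete first line as a sanity check, without CR/LF.
    first_line = data.splitlines()[0].decode("utf-8", errors="replace")
    return f'Mailbox text, 1st line "{first_line}"'


if __name__ == "__main__":
    sample = b"From - Tue May 30 21:55:54 2023\r\nX-Mozilla-Status: 0001\r\n"
    print(describe_mailbox(sample))
```

Note that a header line such as "From: someone" does not match, because
the fifth byte is a colon rather than a space.
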
> 
> As described in the documentation, the file name suffix mbox is often
> used. But I also found samples like INBOX without a suffix. I am not
> sure that the starting pattern is unique enough, so as a check the
> complete first line is shown. Additional test lines could be added if
> that turns out to be necessary.
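
If the bare "From " prefix turns out to be too weak, one possible
additional test is to check that the rest of the first line looks like
the default mbox separator described in RFC 4155: a sender token
followed by an asctime-style timestamp. A rough Python sketch, assuming
that layout (the name looks_like_mbox and the regular expression are
illustrative and not part of the patch):

```python
import re

# "From " + sender token + asctime-style date, e.g.
# "From - Tue May 30 21:55:54 2023" (assumed layout, see lead-in).
FROM_LINE = re.compile(
    rb"^From \S+ +"
    rb"(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    rb"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    rb"[ 0-3]?\d \d{2}:\d{2}:\d{2} \d{4}$"
)


def looks_like_mbox(data: bytes) -> bool:
    """Return True if the first line matches the assumed separator layout."""
    # Take the first line only and tolerate CRLF line terminators.
    first_line = data.split(b"\n", 1)[0].rstrip(b"\r")
    return FROM_LINE.match(first_line) is not None


if __name__ == "__main__":
    print(looks_like_mbox(b"From - Tue May 30 21:55:54 2023\r\n"))
    print(looks_like_mbox(b"From the beginning, this is plain prose.\n"))
```

Lines with "-" as the sender and lines with a real sender address both
pass, while ordinary prose that happens to start with "From " does not.
Any mailbox variant that deviates from this layout would be rejected by
such a strict pattern, so it is a trade-off rather than a drop-in
replacement.
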
> 
> After applying the above-mentioned modifications with the patch
> file-5.45-mail.news-mbox.diff, my mail messages are now recognized
> and described in some detail. This now looks like:
> 
> INBOX:                         Mailbox text, 1st line "From - Tue May 30 21:55:54 2023"
> Stromanbieter.mbox:            Mailbox text, 1st line "From - Wed Apr 08 17:44:27 2015"
> Verivox.mbox:                  Mailbox text, 1st line "From - Tue Apr 07 18:34:15 2015"
> file5.18patch-dyadic.mbox:     Mailbox text, 1st line "From joerg.jen.der.ek at gmx.net Sat May 31 20:31:20 2014"
> file5.19patchWindows.PIF.mbox: Mailbox text, 1st line "From joerg.jen.der.ek at gmx.net Fri Aug 22 17:56:31 2014"
> 
> The world seems to be crazy. Everyone talks about AI and wastes much
> time and resources in this area, but mail handling, standardized and
> established for decades, still does not work 100% today. What a shame
> for all people working in the IT sector.
> 
> I hope my diff file is unique enough and can be applied in a future
> version of the file utility.
> 
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek
> -- 
> File mailing list
> File at astron.com
> https://mailman.astron.com/mailman/listinfo/file


