[File] [PATCH] Magdir/mail.news, msdos Microsoft TNEF duplicates
Christos Zoulas
christos at zoulas.com
Fri Jun 17 18:05:44 UTC 2022
Committed, thanks!
christos
> On Jun 14, 2022, at 9:42 AM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> A few days ago I worked with some Microsoft Outlook files, so I also
> looked at Outlook files with the file name extension DAT.
> When I run file version 5.42 with the -k option on such examples and
> related files, I get output like:
>
> minimal.tnef: Transport Neutral Encapsulation Format
> TNEF
> rtf.tnef: Transport Neutral Encapsulation Format
> TNEF
> triples.tnef: Transport Neutral Encapsulation Format
> TNEF
> voice.tnef: Transport Neutral Encapsulation Format
> TNEF
> winmail.dat: Transport Neutral Encapsulation Format
> TNEF
>
> The correct MIME type application/vnd.ms-tnef is also shown with
> option -i. With option --extension, only the 3-byte sequence ??? is
> shown.
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It describes the
> examples as "Transport Neutral Encapsulation Format", without a
> MIME type, via tnef.trid.xml (see the attached trid-v-tnef.txt.gz).
>
> Fortunately, TrID with the -v option shows a related URL and the
> file name extensions DAT and TNEF. With this information I was able
> to find a page about Transport Neutral Encapsulation Format on the
> File Formats Archive Team web site. Wikipedia links to the official
> Microsoft description [MS-OXTNEF]-210817.pdf. This information is
> now expressed by additional comment lines inside Magdir/mail.news
> like:
>
> # URL: http://fileformats.archiveteam.org/wiki/Transport_Neutral_Encapsulation_Format
> # https://en.wikipedia.org/wiki/Transport_Neutral_Encapsulation_Format
> # Reference: http://mark0.net/download/triddefs_xml.7z/defs/t/tnef.trid.xml
> # https://interoperability.blob.core.windows.net/files/MS-OXTNEF/%5bMS-OXTNEF%5d-210817.pdf
>
> Magdir/msdos describes this type by its abbreviation, with lines
> like:
> 0 lelong 0x223e9f78 TNEF
> !:mime application/vnd.ms-tnef
> Magdir/mail.news describes the same type by its full name, with
> lines like:
> 0 lelong 0x223E9F78 Transport Neutral Encapsulation Format
> !:mime application/vnd.ms-tnef
>
> So I removed the lines from Magdir/msdos and merged that information
> into Magdir/mail.news. The entry now starts like:
> 0 lelong 0x223E9F78 Transport Neutral Encapsulation Format (TNEF)
> !:mime application/vnd.ms-tnef
> !:ext tnef/dat
> In Microsoft Outlook the standard name is winmail.dat or win.dat.
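The signature test in the magic entry above can be sketched in Python. This is only an illustration of what "0 lelong 0x223E9F78" checks; the function name looks_like_tnef is made up for this example and is not part of file or of any library.

```python
import struct

# Value tested by the magic line "0 lelong 0x223E9F78".
TNEF_MAGIC = 0x223E9F78


def looks_like_tnef(header: bytes) -> bool:
    """Return True if the first four bytes, read little-endian, match."""
    if len(header) < 4:
        return False
    (value,) = struct.unpack_from("<I", header, 0)
    return value == TNEF_MAGIC


# Read little-endian, the signature is stored as the bytes 78 9F 3E 22.
print(looks_like_tnef(bytes.fromhex("789f3e22") + b"\x01\x00"))  # True
print(looks_like_tnef(b"MZ\x90\x00"))                            # False
```

To try it on a real file, pass the first bytes of that file, for example open("winmail.dat", "rb").read(4).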
>
> With the help of the specification I began to interpret the bytes
> after the signature. Each attribute consists of five parts. First
> comes the level, where one means the message itself and two means an
> attachment. Then the ID of the attribute is stored, followed by the
> length. Then comes the data of the attribute, followed by a 16-bit
> checksum. So the first attribute is shown by lines like:
> >6 ubyte !1 \b, 1st level %#2.2x
> >7 ubelong !0x06900800 \b, 1st id %#8.8x
> >7 ubelong =0x06900800
> >>11 ulelong !4 \b, TnefVersion length %x
> >>15 ulelong !0x00010000h \b, version %#8.8x
> >>19 uleshort !1 \b, checksum %#4.4x
>
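The five-part attribute layout described above can be walked with a short Python sketch. The assumptions are these: attributes start at offset 6, as in the magic lines; the ID is read big-endian only so that it prints in the same notation as the magic lines (0x06900800); and the checksum is taken to be the 16-bit sum of the data bytes, which agrees with the checksums shown in this message (for example 0xe8 for code page 1252). The helper names and the sample bytes are invented for illustration and do not come from the patch.

```python
import struct

TNEF_SIGNATURE = 0x223E9F78  # little-endian 32-bit value at offset 0


def byte_sum16(data: bytes) -> int:
    """16-bit sum of the data bytes (assumed checksum rule, see lead-in)."""
    return sum(data) & 0xFFFF


def iter_attributes(buf: bytes):
    """Yield (level, attr_id, data, checksum) for each attribute in buf."""
    (signature,) = struct.unpack_from("<I", buf, 0)
    if signature != TNEF_SIGNATURE:
        raise ValueError("not a TNEF stream")
    offset = 6  # 4-byte signature plus 2 more bytes, then the first level
    while offset + 9 <= len(buf):
        level = buf[offset]
        # Big-endian read, to match the ubelong notation of the magic lines.
        (attr_id,) = struct.unpack_from(">I", buf, offset + 1)
        (length,) = struct.unpack_from("<I", buf, offset + 5)
        data_start = offset + 9
        data_end = data_start + length
        if data_end + 2 > len(buf):
            break  # truncated attribute, stop quietly
        data = buf[data_start:data_end]
        (checksum,) = struct.unpack_from("<H", buf, data_end)
        yield level, attr_id, data, checksum
        offset = data_end + 2


def build_attribute(level: int, attr_id: int, data: bytes) -> bytes:
    """Build one attribute for the synthetic sample below."""
    return (bytes([level]) + struct.pack(">I", attr_id)
            + struct.pack("<I", len(data)) + data
            + struct.pack("<H", byte_sum16(data)))


# Synthetic sample: signature, 2 filler bytes, TnefVersion, OEMCodePage.
sample = (struct.pack("<IH", TNEF_SIGNATURE, 0x0001)
          + build_attribute(1, 0x06900800, struct.pack("<I", 0x00010000))
          + build_attribute(1, 0x07900600, struct.pack("<II", 1252, 0)))

for level, attr_id, data, checksum in iter_attributes(sample):
    ok = checksum == byte_sum16(data)
    print(f"level {level} id {attr_id:#010x} length {len(data)} "
          f"checksum {checksum:#06x} ok={ok}")
```

With this layout the second attribute starts at offset 21, which is where the magic lines look for the next level byte.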
> For most samples this is the TnefVersion attribute (with
> idTnefVersion=06900800h) and version value 00010000h. The one
> exception was the example minimal.tnef, where the first attribute
> has id 0x02900000.
>
> For examples with TnefVersion, the second attribute was OEMCodePage
> (with idOEMCodePage=07900600h) with a data length of 8 bytes. The
> first 4 bytes of data are the primary code page (like 1251 or 1252).
> The next 4 data bytes are the secondary code page. At the moment
> these are unused and SHOULD contain zero. So this information is
> shown by lines like:
> >>21 ubyte !1 \b, level %#2.2x
> >>22 ubelong =0x07900600 \b, OEM codepage
> >>>26 ulelong =8
> >>>>30 ulelong x %u
> >>>>34 ulelong !0 and %u
> >>>>38 uleshort x (checksum %#x)
>
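As a small illustration of the OEMCodePage data described above, the 8 data bytes can be split into two little-endian 32-bit values. The checksum rule used here (16-bit sum of the data bytes) is an assumption, but it reproduces the checksums 0xe8 and 0xe7 that file prints for code pages 1252 and 1251 in this message. The function names are hypothetical.

```python
import struct


def decode_oem_codepage(data: bytes) -> tuple:
    """Split the 8-byte OEMCodePage data into (primary, secondary)."""
    if len(data) != 8:
        raise ValueError("OEMCodePage data must be 8 bytes long")
    primary, secondary = struct.unpack("<II", data)
    return primary, secondary


def byte_sum16(data: bytes) -> int:
    """16-bit sum of the data bytes (assumed checksum rule)."""
    return sum(data) & 0xFFFF


for codepage in (1252, 1251):
    data = struct.pack("<II", codepage, 0)  # secondary code page is zero
    primary, secondary = decode_oem_codepage(data)
    print(primary, secondary, hex(byte_sum16(data)))
# prints:
# 1252 0 0xe8
# 1251 0 0xe7
```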
> For examples with TnefVersion, the third attribute was
> attMessageClass (with idMessageClass=08800700h) with a variable data
> length (like 16, 24 or 25). The data is a string like
> "IPM.Appointment" or "IPM.Note.Microsoft.Voicemail.UM.CA". So this
> information is shown by lines like:
> >>40 ubyte !1 \b, level %u
> >>41 ubelong =0x08800700 \b, MessageAttribute
> >>>45 pstring/l x "%s"
> This information can be partly verified by command-line tools like:
> tnef --list -v -f voice.tnef
> ytnef -v triples.tnef
> So we see that the example voice.tnef contains MP3 files and the
> example triples.tnef with "IPM.Appointment" contains something like
> calendar.ics.
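The message class field read by the pstring/l line above (a 4-byte little-endian length followed by the string) can be sketched as follows. The trailing NUL byte is an assumption; it is consistent with the lengths quoted above, since "IPM.Appointment" has 15 characters and the reported length is 16. The function name read_message_class is made up for this example.

```python
import struct


def read_message_class(buf: bytes, offset: int = 0) -> str:
    """Read a 4-byte little-endian length, then that many string bytes."""
    (length,) = struct.unpack_from("<I", buf, offset)
    raw = buf[offset + 4 : offset + 4 + length]
    if len(raw) != length:
        raise ValueError("truncated message class")
    # Drop the assumed trailing NUL terminator before decoding.
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


name = b"IPM.Appointment\x00"
field = struct.pack("<I", len(name)) + name
print(len(name), read_message_class(field))  # 16 IPM.Appointment
```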
>
> After applying the modifications above with the patches
> file-5.42-mail.news-tnef.diff and file-5.42-msdos-tnef.diff,
> the Outlook files are described only once and in more
> detail. The output now looks like:
> minimal.tnef: Transport Neutral Encapsulation Format (TNEF)
> , 1st level 0x02, 1st id 0x02900000
> rtf.tnef: Transport Neutral Encapsulation Format (TNEF)
> , OEM codepage 1252 (checksum 0xe8)
> , MessageAttribute "IPM.Microsoft Mail.Note"
> triples.tnef: Transport Neutral Encapsulation Format (TNEF)
> , OEM codepage 1251 (checksum 0xe7)
> , MessageAttribute "IPM.Appointment"
> voice.tnef: Transport Neutral Encapsulation Format (TNEF)
> , OEM codepage 1252 (checksum 0xe8)
> , MessageAttribute "IPM.Note.Microsoft.Voicemail.UM.CA"
> winmail.dat: Transport Neutral Encapsulation Format (TNEF)
> , OEM codepage 1252 (checksum 0xe8)
> , MessageAttribute "IPM.Note.Portada Newseum"
>
>
> I hope my diff files can be applied in a future version of the file
> utility.
>
> With best wishes
> Jörg Jenderek
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYqiQEAAKCRCv8rHJQhrU
> 1vGcAJ9JbUwUegVSAOP2HNNDNUh+sKcNzQCg2C9iU7UkVI30bVR/uXq6RhuaKF8=
> =rHk+
> -----END PGP SIGNATURE-----