[File] [PATCH] Magdir/mail.news,msdos Microsoft TNEF duplicates

Jörg Jenderek joerg.jen.der.ek at gmx.net
Tue Jun 14 13:42:13 UTC 2022


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

a few days ago I worked with some Microsoft Outlook files, so I also
looked at Outlook files with the file name extension DAT.
When I run the file command version 5.42 with the -k option on such
examples and related files, I get output like:

minimal.tnef: Transport Neutral Encapsulation Format
	      TNEF
rtf.tnef:     Transport Neutral Encapsulation Format
	      TNEF
triples.tnef: Transport Neutral Encapsulation Format
	      TNEF
voice.tnef:   Transport Neutral Encapsulation Format
	      TNEF
winmail.dat:  Transport Neutral Encapsulation Format
	      TNEF

Furthermore, the correct MIME type application/vnd.ms-tnef is shown
with option -i. With option --extension only the 3-byte sequence ???
is shown.

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). There the examples are
described as "Transport Neutral Encapsulation Format" without a
MIME type by tnef.trid.xml (see appended trid-v-tnef.txt.gz).

Luckily TrID with the -v option shows a related URL and the
file name extensions DAT and TNEF in use. With this information I was
able to find a page about Transport Neutral Encapsulation Format on
the File Formats Archive Team web site. Wikipedia links to the
official Microsoft description [MS-OXTNEF]-210817.pdf. This
information is now expressed by additional comment lines inside
Magdir/mail.news like:

# URL:		http://fileformats.archiveteam.org/
#		wiki/Transport_Neutral_Encapsulation_Format
#		https://en.wikipedia.org/
#		wiki/Transport_Neutral_Encapsulation_Format
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/t/tnef.trid.xml
#		https://interoperability.blob.core.windows.net/
#		files/MS-OXTNEF/%5bMS-OXTNEF%5d-210817.pdf

The description with the abbreviation comes from Magdir/msdos via
lines like:
0 lelong 0x223e9f78	TNEF
!:mime	application/vnd.ms-tnef
The description of the same type with the full name comes from
Magdir/mail.news via lines like:
0 lelong 0x223E9F78	Transport Neutral Encapsulation Format
!:mime	application/vnd.ms-tnef

So I removed the lines from Magdir/msdos and merged that information
into Magdir/mail.news. The entry there now starts like:
0 lelong 0x223E9F78	Transport Neutral Encapsulation Format (TNEF)
!:mime	application/vnd.ms-tnef
!:ext	tnef/dat
In Microsoft Outlook the standard name is winmail.dat or win.dat.
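
The signature test of the magic line above can be sketched in Python.
This is only an illustration, not part of the patch; the function name
is_tnef is made up here. Because the magic type is lelong, the value
0x223E9F78 appears in the file as the bytes 78 9F 3E 22.

```python
import struct

# Value taken from the magic line "0 lelong 0x223E9F78".
TNEF_SIGNATURE = 0x223E9F78


def is_tnef(data: bytes) -> bool:
    """Return True if data starts with the little-endian TNEF signature.

    Hypothetical helper for illustration; it mirrors only the first
    magic line and does not look at any attribute.
    """
    if len(data) < 4:
        return False
    (signature,) = struct.unpack_from("<I", data, 0)
    return signature == TNEF_SIGNATURE
```

A file that stores the same four bytes in big-endian order would not
match, just as with the lelong test.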

With the help of the specification I began to interpret the bytes
after the signature. Each attribute consists of five parts. First
comes the level, where one means the message itself and two means an
attachment. Then the ID of the attribute is stored, followed by the
length. Then comes the data of the attribute, followed by a 16-bit
checksum. So the first attribute is shown by lines like:
 >6	ubyte		!1		\b, 1st level %#2.2x
 >7	ubelong		!0x06900800	\b, 1st id %#8.8x
 >7	ubelong		=0x06900800
 >>11	ulelong		!4		\b, TnefVersion length %x
 >>15	ulelong		!0x00010000h	\b, version %#8.8x
 >>19	uleshort	!1		\b, checksum %#4.4x

For most samples this is the TnefVersion attribute (with
idTnefVersion=06900800h) and version value 00010000h. The one
exception was the example minimal.tnef, where the first attribute has
ID 0x02900000.
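
The five-part attribute layout can also be walked in a few lines of
Python. This is a rough sketch under the assumptions visible in the
magic lines: a 1-byte level, 4 ID bytes (kept raw, as the magic
compares them with ubelong), a 4-byte little-endian length, the data,
and a 2-byte little-endian checksum. The names Attribute and
read_attribute are invented for this illustration.

```python
import struct
from typing import NamedTuple


class Attribute(NamedTuple):
    level: int        # 1 = message, 2 = attachment
    attr_id: bytes    # 4 raw ID bytes as stored in the file
    data: bytes       # attribute payload
    checksum: int     # 16-bit little-endian value after the data
    next_offset: int  # offset of the following attribute


def read_attribute(buf: bytes, offset: int) -> Attribute:
    """Read one attribute starting at offset (hypothetical helper)."""
    level = buf[offset]
    attr_id = buf[offset + 1:offset + 5]
    (length,) = struct.unpack_from("<I", buf, offset + 5)
    data_start = offset + 9
    data = buf[data_start:data_start + length]
    (checksum,) = struct.unpack_from("<H", buf, data_start + length)
    return Attribute(level, attr_id, data, checksum, data_start + length + 2)


# Hand-built sample: signature, 2 key bytes, then a TnefVersion attribute.
sample = bytes.fromhex("789f3e22" + "0000" + "01" + "06900800"
                       + "04000000" + "00000100" + "0100")
first = read_attribute(sample, 6)
```

With this sample the first attribute starts at offset 6 and the next
one would start at offset 21, which matches the offsets used in the
magic lines.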

For examples with TnefVersion the second attribute was OEMCodePage
(with idOEMCodePage=07900600h) and a data length of 8 bytes. The first
4 bytes of data are the Primary CodePage in use (like: 1251 1252). The
next 4 data bytes are the Secondary CodePage. At the moment these are
unused and SHOULD contain zero. So this information is shown by lines
like:
 >>21	ubyte		!1		\b, level %#2.2x
 >>22	ubelong		=0x07900600	\b, OEM codepage
 >>>26	ulelong		=8
 >>>>30	ulelong		x		%u
 >>>>34	ulelong		!0		and %u
 >>>>38	uleshort	x		(checksum %#x)
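
As a small illustration of the 8 data bytes, the two code pages can be
unpacked in Python like this. The function name decode_oem_codepage is
hypothetical, and the sketch assumes both values are 32-bit
little-endian integers, as the ulelong tests above do.

```python
import struct


def decode_oem_codepage(data: bytes) -> tuple[int, int]:
    """Split the 8 OEMCodePage data bytes into (primary, secondary).

    Hypothetical helper; assumes two 32-bit little-endian integers.
    """
    if len(data) != 8:
        raise ValueError("OEMCodePage data must be exactly 8 bytes")
    primary, secondary = struct.unpack("<II", data)
    return primary, secondary


# Data bytes of a sample with primary code page 1252 and no secondary one.
primary, secondary = decode_oem_codepage(bytes.fromhex("e404000000000000"))
```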

For examples with TnefVersion the third attribute was attMessageClass
(with idMessageClass=08800700h) and a variable data length (like: 16
24 25). The data is a string like "IPM.Appointment" or
"IPM.Note.Microsoft.Voicemail.UM.CA". So this information is shown by
lines like:
 >>40	ubyte		!1		\b, level %u
 >>41	ubelong		=0x08800700	\b, MessageAttribute
 >>>45	pstring/l	x		"%s"
That information can be partly verified by command line tools like:
	tnef --list -v -f voice.tnef
	ytnef -v triples.tnef
So we see that example voice.tnef contains MP3 files and example
triples.tnef with "IPM.Appointment" contains something like
calendar.ics.
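
The pstring/l test reads a 4-byte little-endian length followed by
that many bytes of string data. A minimal Python sketch of the same
idea follows. The function name read_message_class is made up, and it
assumes the stored length includes a trailing NUL byte (15 characters
of "IPM.Appointment" plus one NUL would give the observed length 16).

```python
import struct


def read_message_class(buf: bytes, offset: int = 0) -> str:
    """Read a length-prefixed message class string (hypothetical helper).

    Assumes a 4-byte little-endian length at offset, followed by the
    string bytes, usually ending in a NUL that is stripped here.
    """
    (length,) = struct.unpack_from("<I", buf, offset)
    raw = buf[offset + 4:offset + 4 + length]
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


# Hand-built field: length 16, then the string with its trailing NUL.
field = struct.pack("<I", 16) + b"IPM.Appointment\x00"
message_class = read_message_class(field)
```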

After applying the above mentioned modifications by the patches
file-5.42-mail.news-tnef.diff and file-5.42-msdos-tnef.diff,
the Outlook files are described by only one text and with more
details. This now looks like:
minimal.tnef: Transport Neutral Encapsulation Format (TNEF)
	      , 1st level 0x02, 1st id 0x02900000
rtf.tnef:     Transport Neutral Encapsulation Format (TNEF)
	      , OEM codepage 1252 (checksum 0xe8)
	      , MessageAttribute "IPM.Microsoft Mail.Note"
triples.tnef: Transport Neutral Encapsulation Format (TNEF)
	      , OEM codepage 1251 (checksum 0xe7)
	      , MessageAttribute "IPM.Appointment"
voice.tnef:   Transport Neutral Encapsulation Format (TNEF)
	      , OEM codepage 1252 (checksum 0xe8)
	      , MessageAttribute "IPM.Note.Microsoft.Voicemail.UM.CA"
winmail.dat:  Transport Neutral Encapsulation Format (TNEF)
	      , OEM codepage 1252 (checksum 0xe8)
	      , MessageAttribute "IPM.Note.Portada Newseum"
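
The checksum values shown above (0xe8 for code page 1252, 0xe7 for
1251, and 1 for the version data 00010000h) are consistent with a
plain sum of the data bytes truncated to 16 bits. The following Python
sketch assumes exactly that rule; it should be checked against
MS-OXTNEF before relying on it, and the function name tnef_checksum is
invented here.

```python
import struct


def tnef_checksum(data: bytes) -> int:
    """Sum all data bytes modulo 65536 (assumed checksum rule)."""
    return sum(data) & 0xFFFF


# OEMCodePage data for primary code page 1252 and secondary 0.
codepage_1252 = struct.pack("<II", 1252, 0)
# TnefVersion data 00010000h stored as a little-endian 32-bit value.
version_data = struct.pack("<I", 0x00010000)
```

Under this assumption tnef_checksum(codepage_1252) gives 0xe8 and
tnef_checksum(version_data) gives 1, matching the samples.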


I hope my diff files can be applied in a future version of the file
utility.

With best wishes
Jörg Jenderek
- --
Jörg Jenderek



-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYqiQEAAKCRCv8rHJQhrU
1vGcAJ9JbUwUegVSAOP2HNNDNUh+sKcNzQCg2C9iU7UkVI30bVR/uXq6RhuaKF8=
=rHk+
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.42-msdos-tnef.diff.sig
Type: application/octet-stream
Size: 570 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220614/2557f830/attachment.obj>
-------------- next part --------------
--- file-5.42/magic/Magdir/msdos.old	2022-04-11 15:07:12.000000000 +0200
+++ file-5.42/magic/Magdir/msdos	2022-06-12 19:28:29.782632500 +0200
@@ -1275,11 +1275,6 @@
 >>>20	long		>0		TIFF starts at byte %d
 >>>>24	long		>0		length %d
 
-# TNEF magic From "Joomy" <joomy at se-ed.net>
-# Microsoft Outlook's Transport Neutral Encapsulation Format (TNEF)
-0	lelong		0x223e9f78	TNEF
-!:mime	application/vnd.ms-tnef
-
 # Norton Guide (.NG , .HLP) files added by Joerg Jenderek from source NG2HTML.C
 # of http://www.davep.org/norton-guides/ng2h-105.tgz
 # https://en.wikipedia.org/wiki/Norton_Guides
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.42-mail.news-tnef.diff.sig
Type: application/octet-stream
Size: 1431 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220614/2557f830/attachment-0001.obj>
-------------- next part --------------
--- file-5.42/magic/Magdir/mail.news.old	2021-09-11 21:20:15.000000000 +0200
+++ file-5.42/magic/Magdir/mail.news	2022-06-14 01:25:35.153303700 +0200
@@ -45,6 +45,52 @@
 
 # TNEF files...
-0	lelong		0x223E9F78	Transport Neutral Encapsulation Format
+# URL:		http://fileformats.archiveteam.org/wiki/Transport_Neutral_Encapsulation_Format
+#		https://en.wikipedia.org/wiki/Transport_Neutral_Encapsulation_Format
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/t/tnef.trid.xml
+#		https://interoperability.blob.core.windows.net/files/MS-OXTNEF/%5bMS-OXTNEF%5d-210817.pdf
+# Update:	Joerg Jenderek
+# Note:		moved and merged from ./msdos (version 1.154) there just called "TNEF"
+#		partly verified by `tnef --list -v -f voice.tnef` and `ytnef -v  triples.tnef`
+# TNEF magic From "Joomy" <joomy at se-ed.net>
+# TNEF_SIGNATURE 
+0	lelong		0x223E9F78	Transport Neutral Encapsulation Format (TNEF)
 !:mime	application/vnd.ms-tnef
+# winmail.dat or win.dat by Microsoft Outlook
+!:ext	tnef/dat
+# https://docs.microsoft.com/en-us/openspecs/exchange_server_protocols/ms-oxtnef/7fdb64ee-7f63-4d95-9af1-c672e7475c3a
+# LegacyKey
+#>4	uleshort	x		\b, key %#4.4x
+# attrLevelMessage; Level where attribute applies like: 1~attrLevelMessage 2~attrLevelAttachment
+>6	ubyte		!1		\b, 1st level %#2.2x
+# other ID (like 02900000h) or TnefVersion ID (idTnefVersion=06900800h)
+>7	ubelong		!0x06900800	\b, 1st id %#8.8x
+>7	ubelong		=0x06900800
+# TnefVersion length like: 4
+>>11	ulelong		!4		\b, TnefVersion length %x
+# TNEFVersionData; TnefVersion data like: 00010000h
+>>15	ulelong		!0x00010000h	\b, version %#8.8x
+# Checksum like: 1
+>>19	uleshort	!1		\b, checksum %#4.4x
+# attrLevelMessage; level of attOemCodepage like: 1
+>>21	ubyte		!1		\b, level %#2.2x
+# idOEMCodePage; OEMCodePage ID like: 07900600h
+>>22	ubelong		=0x07900600	\b, OEM codepage
+# OEMCodePage length like: 8
+>>>26	ulelong		=8		
+# OEMCodePageData; PrimaryCodePage like: 1251 1252
+>>>>30	ulelong		x		%u
+# OEMCodePageData; SecondaryCodePage; unused and SHOULD contain zero
+>>>>34	ulelong		!0		and %u
+# OEMCodePageData Checksum like: E7h E8h
+>>>>38	uleshort	x		(checksum %#x)
+# attrLevelMessage of attMessageClass like: 1
+>>40	ubyte		!1		\b, level %u
+# idMessageClass; ID of attMessageClass like: 08800700h
+>>41	ubelong		=0x08800700	\b, MessageAttribute
+# attMessageClass length like: 16 24 25
+#>>>45	ulelong		x		(length %u)
+# attMessageClass data like: "IPM.Microsoft Mail.Note" "IPM.Note.Portada Newseum"
+# "IPM.Appointment" "IPM.Note.Microsoft.Voicemail.UM.CA"
+>>>45	pstring/l	x		"%s"
 
 # From: Kevin Sullivan <ksulliva at psc.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-tnef.txt.gz
Type: application/x-gzip
Size: 512 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220614/2557f830/attachment.bin>

