[File] [PATCH] Magdir/Windows Microsoft Outlook email *.PAB, *.PST, *.OST
Christos Zoulas
christos at zoulas.com
Fri Jun 17 18:05:59 UTC 2022
Committed, thanks!
christos
> On Jun 6, 2022, at 4:38 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> Hello,
>
> Some days ago I ran the Piriform CCleaner tool. It complained about the
> file name extension PAB, so I looked for such files on my systems.
>
> When running the file command, version 5.41, on such examples and related
> files, I get output like:
>
> OL2003Password.pst:             Microsoft Outlook email folder (>=2003)
> OL2003Password2.pst:            Microsoft Outlook email folder (>=2003)
> Outlook-hj.pst:                 Microsoft Outlook email folder (>=2003)
> example-64bit.pst:              Microsoft Outlook email folder (>=2003)
> mailbox.PAB:                    Microsoft Outlook email folder (<=2002)
> outlook.pst:                    Microsoft Outlook email folder (<=2002)
> test-ost.ost:                   Microsoft Outlook email folder
> test-v15.pst:                   Microsoft Outlook email folder
> test-v16.pst:                   Microsoft Outlook email folder
> test-v37.pst:                   Microsoft Outlook email folder
> x-fmt-248-signature-id-260.pst: Microsoft Outlook email folder (<=2002)
> x-fmt-249-signature-id-261.pst: Microsoft Outlook email folder (>=2003)
> x-fmt-75-signature-id-472.pab:  Microsoft Outlook email folder
>
>
> Furthermore, only the generic MIME type application/octet-stream is
> shown with -i. With the option --extension, only the 3-byte sequence ???
> is shown.
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html).
> Most PAB examples are described as "Microsoft Personal Address Book"
> by pab.trid.xml. The PST examples marked with ">=2003" are described
> first as "Microsoft Outlook Personal Folder (Unicode)" by
> pst-unicode.trid.xml. The PST examples marked with "<=2002" are
> described only as "Microsoft OutLook Personal Folder (ANSI)" by
> pst.trid.xml. The OST example is described as "Outlook Exchange
> Offline Storage" by ost.trid.xml (see appended trid-v-outlook.txt.gz).
>
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/).
> It does not identify real PAB examples like mailbox.PAB as
> "Microsoft Outlook Personal Address Book" by PUID x-fmt/75, because at
> offset 8 the wMagicClient, read as a 2-byte string, is BA and not AB as
> in x-fmt-75-signature-id-472.pab. So this seems to be a byte-swap
> bug. For PST samples it uses the same names as TrID. It also shows
> year ranges under version. The ANSI variant is described by PUID
> x-fmt/248 with the range 1997-2002, whereas for the Unicode variant the
> range 2003-2007 is shown by PUID x-fmt/249. So here we also get the
> year information that is shown by the file command. Samples with
> unlikely or possibly nonexistent versions like test-v16.pst and
> test-v37.pst are not recognized. The OST example is not recognized
> either (see appended droid-outlook.csv.gz).
>
> Luckily DROID, and TrID with the -v option, show a related URL and the
> file name extensions used. With this information I was able to find a
> page about Personal Folder File on the File Formats Archive Team web
> site. There a link to the official Microsoft description [MS-PST].pdf
> is mentioned. An unofficial PFF format specification is also
> listed as "Personal Folder File (PFF) format.pdf".
> This information is now expressed by additional comment lines
> inside Magdir/Windows like:
> # URL: http://fileformats.archiveteam.org/wiki/Personal_Folder_File
> # Reference: https://interoperability.blob.core.windows.net/files/MS-PST/%5bMS-PST%5d.pdf
> # http://mark0.net/download/triddefs_xml.7z
> # defs/p/pab.trid.xml
> # defs/p/pst.trid.xml
> # defs/p/pst-unicode.trid.xml
> # defs/o/ost.trid.xml
>
> The description is produced inside Magdir/Windows by lines like:
> 0 lelong 0x4E444221 Microsoft Outlook email folder
>> 10 leshort 0x0e (<=2002)
>> 10 leshort 0x17 (>=2003)
>
> After the test for the leading 4-byte dwMagic !BDN, the describing text
> is shown. The next 2 lines show year information for the two versions 14
> and 23. These 2 versions seem to be the common ones. For unusual
> versions like test-v15.pst no year information is shown.
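
The three pre-patch magic lines quoted above can be read as the following minimal Python sketch. It is only an illustration of the offsets and values in those lines, not how file(1) is implemented; the function name old_description is made up for this example.

```python
import struct

# The bytes "!BDN" read as a little-endian 32-bit value (lelong).
DW_MAGIC = 0x4E444221


def old_description(header: bytes):
    """Approximate the three pre-patch magic lines on the first bytes of a file."""
    if len(header) < 4 or struct.unpack_from("<I", header, 0)[0] != DW_MAGIC:
        return None  # the top-level test fails, so nothing is printed
    text = "Microsoft Outlook email folder"
    if len(header) >= 12:
        (w_ver,) = struct.unpack_from("<h", header, 10)  # leshort at offset 10
        if w_ver == 0x0E:
            text += " (<=2002)"
        elif w_ver == 0x17:
            text += " (>=2003)"
    return text
```

Any other wVer value, such as 15, 16 or 37, falls through both comparisons, which is why those samples get no year information.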
>
> Unfortunately this version variable wVer is not clearly explained.
> It is written that this value must be 14 (=Eh) or 15 (=Fh) if
> the file is an ANSI PST file. From version 21 (=15h, according to
> unofficial documentation) or for a value greater than 23 it is a
> Unicode PST file (UTF-16 little-endian), and the highest mentioned value
> is 37. So the version information now becomes:
>
>>> 10 uleshort x (
>>> 10 leshort <0x10 \b<=2002, ANSI,
>>> 10 leshort >0x14 \b>=2003, Unicode,
>>> 10 uleshort x version %u)
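
As a rough cross-check of the four lines above, here is a small Python sketch of the text they would produce for a given wVer. It assumes the usual file(1) behaviour that a description starting with \b is appended without a separating space; describe_version is a hypothetical helper name, not part of the patch.

```python
import struct


def describe_version(header: bytes) -> str:
    """Build the parenthesised version text from the 16-bit value at offset 10."""
    (signed_ver,) = struct.unpack_from("<h", header, 10)    # leshort comparisons
    (unsigned_ver,) = struct.unpack_from("<H", header, 10)  # uleshort for %u
    text = "("
    if signed_ver < 0x10:
        text += "<=2002, ANSI,"     # \b: appended without a space
    if signed_ver > 0x14:
        text += ">=2003, Unicode,"  # \b: appended without a space
    # The last line has no \b, so a space precedes it.
    text += f" version {unsigned_ver})"
    return text
```

For a value between 16 and 20 neither comparison matches, so the result is "( version 16)" with the stray space, as in the test-v16.pst sample.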
>
> In the "newer" variant the format has not only changed to Unicode, but
> the size of some fields has also grown from 32-bit to 64-bit, or their
> meaning has changed. So after the first twenty-four bytes the fields
> also appear at other positions.
>
> So for Unicode there is a branch with additional information that
> looks like:
>>> 10 uleshort >20
>>>> 184 ulequad x \b, %llu bytes
>>>> 513 ubyte x \b, bCryptMethod=%u
> The size of the file is stored as the 8-byte integer variable
> ibFileEof. The variable bCryptMethod describes the encryption type.
> Zero means no encryption. One is used for encryption with the
> 'permutation algorithm'. Two is used for encryption with the 'cyclic
> algorithm', and 16 is used for encryption with Windows Information
> Protection (WIP).
> For the ANSI variant the same information is shown by a branch which
> looks like:
>
>>> 10 uleshort <16
>>>> 168 ulelong x \b, %u bytes
>>>> 461 ubyte x \b, bCryptMethod=%u
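
The two branches can be summarised in a short Python sketch that picks the offsets by wVer and decodes bCryptMethod using the four values listed above. The offsets and widths are taken from the quoted magic lines; the function name, the table name and the short labels are assumptions made for this example.

```python
import struct

# Values as described in the text above; the labels are shortened here.
CRYPT_METHODS = {0: "none", 1: "permutation", 2: "cyclic", 16: "WIP"}


def size_and_crypt(data: bytes):
    """Return (file size, encryption label) or None if no branch applies."""
    if len(data) < 12:
        return None
    (w_ver,) = struct.unpack_from("<H", data, 10)
    if w_ver > 20:      # Unicode branch: 8-byte size at 184, method at 513
        size_fmt, size_off, crypt_off = "<Q", 184, 513
    elif w_ver < 16:    # ANSI branch: 4-byte size at 168, method at 461
        size_fmt, size_off, crypt_off = "<I", 168, 461
    else:               # versions 16..20 match neither branch
        return None
    if len(data) <= crypt_off:
        return None     # truncated header, field not present
    (size,) = struct.unpack_from(size_fmt, data, size_off)
    method = data[crypt_off]
    return size, CRYPT_METHODS.get(method, f"unknown ({method})")
```

Reading roughly the first 514 bytes of a file is enough for either branch.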
>
> The DROID samples x-fmt-75-signature-id-472.pab,
> x-fmt-248-signature-id-260.pst and x-fmt-249-signature-id-261.pst are
> not real Outlook examples. They contain just the first few dozen
> bytes of such Outlook files. To keep these samples from being
> misidentified, I also test for the existence of a later field like the
> bPlatformCreate value. So this additional part looks like:
>> 14 ubyte x Microsoft Outlook
> !:mime application/vnd.ms-outlook
> Instead of the generic MIME type application/octet-stream I display
> application/vnd.ms-outlook, which is mentioned on the reference site. But
> it is not mentioned on other sites and is not officially registered. So
> maybe this must be changed again.
>
> The wMagicClient can be shown by a line like:
>>> 8 leshort x \b, wMagicClient=%#x
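> The string value AB (4142h) is used for PAB files, SM (534Dh) is
> used for PST files and SO (534Fh) is used for OST files. For SM and SO
> the hex values give the bytes in file order, so read as a little-endian
> short they become 0x4D53 and 0x4F53. Real PAB samples store BA at
> offset 8 (see the DROID remark above), which reads as 0x4142. Depending
> on that value a sub-classification (with a different type description
> and file name extension) is done. This is now expressed by lines like: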
>
>>> 8 leshort 0x4142 Personal Address Book
> !:ext pab
>>> 8 leshort 0x4D53 Personal Storage
> !:ext pst
>>> 8 leshort 0x4F53 Offline Storage
> !:ext ost
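
The byte order is easy to get wrong here, so the following Python sketch shows the same lookup on raw bytes. The three values and extensions come from the lines above; CLIENT_TYPES and classify_client are names invented for this illustration.

```python
import struct

# leshort value at offset 8 -> (description, file name extension)
CLIENT_TYPES = {
    0x4142: ("Personal Address Book", "pab"),  # bytes "BA" in the file
    0x4D53: ("Personal Storage", "pst"),       # bytes "SM" in the file
    0x4F53: ("Offline Storage", "ost"),        # bytes "SO" in the file
}


def classify_client(header: bytes):
    """Map the little-endian 16-bit wMagicClient at offset 8 to a type."""
    if len(header) < 10:
        return None
    (magic_client,) = struct.unpack_from("<H", header, 8)
    return CLIENT_TYPES.get(magic_client)
```

A header whose bytes at offset 8 are "AB" in file order reads as 0x4241 and is therefore not matched by the 0x4142 line.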
>
> After applying the above-mentioned modifications with the patch
> file-5.41-windows-pab.diff, the Outlook files are described
> with more details and the misidentifications vanish. This now looks like:
>
> OL2003Password.pst: Microsoft Outlook Personal Storage
>   (>=2003, Unicode, version 23), dwUnique=0x17, 271360 bytes,
>   bCryptMethod=1, CRC32 0xfc6a0096
> OL2003Password2.pst: Microsoft Outlook Personal Storage
>   (>=2003, Unicode, version 23), dwUnique=0x15, 271360 bytes,
>   bCryptMethod=2, CRC32 0x6ba5f580
> Outlook-hj.pst: Microsoft Outlook Personal Storage
>   (>=2003, Unicode, version 23), dwUnique=0x10f31, 556680192 bytes,
>   bCryptMethod=1, CRC32 0x5de74682
> example-64bit.pst: Microsoft Outlook Personal Storage
>   (>=2003, Unicode, version 23), dwUnique=0x1d, 271360 bytes,
>   CRC32 0x89cb68c4
> mailbox.PAB: Microsoft Outlook Personal Address Book
>   (<=2002, ANSI, version 14), bPlatformCreate=2, bPlatformAccess=2,
>   dwUnique=0x5, 32768 bytes
> outlook.pst: Microsoft Outlook Personal Storage
>   (<=2002, ANSI, version 14), bPlatformCreate=2, bPlatformAccess=2,
>   dwReserved1=0x8361a034, dwReserved2=0x373263, dwUnique=0x82,
>   278528 bytes, bCryptMethod=1
> test-ost.ost: Microsoft Outlook Offline Storage
>   (<=2002, ANSI, version 15), dwUnique=0x4c08, 2556928 bytes,
>   bCryptMethod=1
> test-v15.pst: Microsoft Outlook Personal Storage
>   (<=2002, ANSI, version 15), dwUnique=0x4c08, 2556928 bytes,
>   bCryptMethod=1
> test-v16.pst: Microsoft Outlook Personal Storage
>   ( version 16)
> test-v37.pst: Microsoft Outlook Personal Storage
>   (>=2003, Unicode, version 37), dwUnique=0x400, 9 bytes,
>   bSentinel=0x83, CRC32 0x58585858
> x-fmt-248-signature-id-260.pst: data
> x-fmt-249-signature-id-261.pst: data
> x-fmt-75-signature-id-472.pab: data
>
> I hope my diff file can be applied in a future version of the file
> utility.
>
> With best wishes
> Jörg Jenderek