[File] [PATCH] of Magdir/msdos Microsoft OneNote Package misidentified as Microsoft Cabinet archive

Christos Zoulas christos at zoulas.com
Wed Sep 7 11:17:42 UTC 2022


Added, thanks!

christos

> On Sep 6, 2022, at 6:56 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hello,
> 
> A few days ago I ran the cleaning tool czkawka, found at
> https://qarmin.github.io/czkawka/. One of its menu items reports bad
> extensions. After running the tool I looked through the saved list
> results_bad_extensions.txt for examples of bad extensions.
> One of the listed extensions is ONEPKG.
> 
> When I run the file command (version 5.42) on such examples, I get
> output like:
> 
> DemoNotebook.onepkg: Microsoft Cabinet archive data,
> 		     many, 234775 bytes, 8 files,
> 		     at 0x2c
> 		     +A "Bespechungsnotizen.one"
> 		     +A "Forschung.one"
> 		     , number 1,
> 		     12 datablocks, 0x1203 compression
> Notebook03.onepkg:   Microsoft Cabinet archive data,
> 		     many, 1272589 bytes, 2 files,
> 		     at 0x44
> 		     +Utf "Editor \303\266ffnen.onetoc2"
> 		     +Utf "Allgemein.one"
> 		     , flags 0x4, number 1,
> 		     extra bytes 20 in head,
> 		     44 datablocks, 0xf03 compression
> ONGuide.onepkg:      Microsoft Cabinet archive data,
> 		     many, 248915 bytes, 2 files,
> 		     at 0x44
> 		     + "Editor \303\266ffnen.onetoc2"
> 		     + "Erste Schritte - Beta 1.one"
> 		     , flags 0x4, number 1,
> 		     extra bytes 20 in head,
> 		     9 datablocks, 0xf03 compression
> test-onenote.onepkg: Microsoft Cabinet archive data,
> 		     Windows 2000/XP setup, 3977 bytes, 1 file,
> 		     at 0x2c
> 		     +A "test-onenote.one"
> 		     , number 1,
> 		     1 datablock, 0x1203 compression
> 
> With the --extension option, cab or _/?_/??_ is displayed. Furthermore,
> with the -i option only the generic application/vnd.ms-cab-compressed
> is shown.
> 
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It identifies all
> such examples, with a low rate, as "Microsoft Cabinet Archive" via
> ark-cab.trid.xml. Many examples are described, with a high rate,
> as "Microsoft OneNote Package" via onepkg.trid.xml
> (see appended trid-v-onepkg.txt.gz).
> 
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies a few examples only generically as "Windows Cabinet File"
> with mime type application/vnd.ms-cab-compressed via PUID x-fmt/414,
> but it complains about the file suffix ONEPKG. Many examples are
> described as "Microsoft OneNote Package File" via fmt/987 (see
> appended output/droid-onepkg.csv). This utility identifies onepkg by
> looking for the file name extension onetoc inside the first 2 KB.
> But this does not always work, because some packages have no table
> of contents.
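
A rough sketch of why a heuristic based on the table of contents can miss some packages. This is not DROID's actual signature; the function name `onetoc_heuristic` and the 2048-byte window are assumptions taken only from the description above.

```python
def onetoc_heuristic(data: bytes) -> bool:
    """Hypothetical check: CAB magic plus '.onetoc' somewhere in the first 2 KB."""
    if not data.startswith(b"MSCF\0\0\0\0"):
        return False
    # Compare case-insensitively; this also covers the .onetoc2 extension.
    return b".onetoc" in data[:2048].lower()
```

A package whose only member is something like "test-onenote.one" contains no such string, so a check of this kind falls back to the generic cabinet description.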
> 
> Luckily, TrID with the -v option shows a related URL and the file name
> extension. With this help, information about file formats can be
> added. That is expressed inside Magdir/msdos by comment lines like:
> 
> # URL:		https://en.wikipedia.org/wiki/Microsoft_OneNote
> #		http://fileformats.archiveteam.org/wiki/OneNote
> # Reference:	https://mark0.net/download/triddefs_xml.7z
> #		defs/o/onepkg.trid.xml
> 
> According to that documentation, ONEPKG files are just CAB archives
> containing Microsoft OneNote files with the 3-byte file name extension
> one. OneNote tables of contents have the file name extension ONETOC
> or ONETOC2.
> 
> So when looking at the current output of the file command, we see that
> the first member name is something like "Class Notes.one",
> "test-onenote.one", "Open Notebook.onetoc2" or "Editor Öffnen.onetoc2".
> This can be verified with an unpacking command-line tool, using a
> command line like:
> 	7z l -tcab *onepkg
> 
> The description inside Magdir/msdos starts with a line like:
> 0	string/b MSCF\0\0\0\0	Microsoft Cabinet archive data
> 
> Now I must insert lines that look for the 3-byte file name suffix one.
> The jump to the first member entry and the search for the dot character
> before the suffix are done by lines like:
>>> (16.l+16)	ubyte	x
> .
>>>>> &-1	search/255 	.
> Now I am in the branch for the file name extension. After the last
> entry of that kind (that is theme, for a Windows 7 or 8 Theme Pack)
> I insert the lines for OneNote Package. They look like:
>>>>>> &0	string/c	one		\b, OneNote Package
> !:mime	application/msonenote
> !:ext	onepkg
> Instead of the generic mime type application/vnd.ms-cab-compressed
> or application/octet-stream, I show a type mentioned on the nirsoft web
> site. But this type is not officially registered.
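
The magic lines quoted above can be mirrored in a few lines of Python, which may help when checking sample files by hand. This is only a sketch under stated assumptions: it relies on the CAB header layout (a little-endian 32-bit offset of the first file entry at byte 16, and the member name starting 16 bytes into that entry), and the helper names `first_member_name` and `looks_like_onepkg` are made up for illustration. Unlike `search/255`, it stops at the end of the first member name.

```python
import struct
from typing import Optional

CAB_MAGIC = b"MSCF\0\0\0\0"

def first_member_name(data: bytes) -> Optional[bytes]:
    """Return the raw name of the first cabinet member, or None."""
    if len(data) < 20 or not data.startswith(CAB_MAGIC):
        return None
    # Equivalent of (16.l+16): offset of first file entry, plus 16 bytes
    # of fixed fields before the NUL-terminated file name.
    (coff_files,) = struct.unpack_from("<I", data, 16)
    start = coff_files + 16
    end = data.find(b"\0", start, start + 256)
    if end == -1:
        return None
    return data[start:end]

def looks_like_onepkg(data: bytes) -> bool:
    """True if the text after the first dot of the first member name starts with 'one'."""
    name = first_member_name(data)
    if name is None:
        return False
    dot = name.find(b".")  # first dot, like the search in the magic
    if dot == -1:
        return False
    # Like string/c with a lower-case pattern: case-insensitive prefix match,
    # so both ".one" and ".onetoc2" are accepted.
    return name[dot + 1:dot + 4].lower() == b"one"
```

Because only the first three bytes after the dot are compared, a package whose first member is a table of contents such as "Editor Öffnen.onetoc2" is accepted as well, which matches the outputs shown for Notebook03.onepkg and ONGuide.onepkg.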
> 
> After applying the above-mentioned modifications with the patch
> file-5.42-msdos-onepkg.diff, my OneNote Packages are described
> more precisely, like:
> 
> DemoNotebook.onepkg: Microsoft Cabinet archive data,
> 		     OneNote Package, 234775 bytes, 8 files,
> 		     at 0x2c
> 		     +A "Bespechungsnotizen.one"
> 		     +A "Forschung.one"
> 		     , number 1,
> 		     12 datablocks, 0x1203 compression
> Notebook03.onepkg:   Microsoft Cabinet archive data,
> 		     OneNote Package, 1272589 bytes, 2 files,
> 		     at 0x44
> 		     +Utf "Editor \303\266ffnen.onetoc2"
> 		     +Utf "Allgemein.one"
> 		     , flags 0x4, number 1,
> 		     extra bytes 20 in head,
> 		     44 datablocks, 0xf03 compression
> ONGuide.onepkg:      Microsoft Cabinet archive data,
> 		     OneNote Package, 248915 bytes, 2 files,
> 		     at 0x44
> 		     + "Editor \303\266ffnen.onetoc2"
> 		     + "Erste Schritte - Beta 1.one"
> 		     , flags 0x4, number 1,
> 		     extra bytes 20 in head,
> 		     9 datablocks, 0xf03 compression
> test-onenote.onepkg: Microsoft Cabinet archive data,
> 		     OneNote Package, 3977 bytes, 1 file,
> 		     at 0x2c
> 		     +A "test-onenote.one"
> 		     , number 1,
> 		     1 datablock, 0x1203 compression
> 
> I hope my diff file can be applied in a future version of the
> file utility.
> 
> With best wishes
> Jörg Jenderek
> - --
> Jörg Jenderek
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
> 
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYxfQBgAKCRCv8rHJQhrU
> 1v9rAKCdBfg22WJHViuJPPCmi4tT1XFSyQCgqhcIFHC+MO6hLZ4FT+hdNE2DxKw=
> =CI3n
> -----END PGP SIGNATURE-----
