[File] [PATCH] of Magdir/msdos Microsoft OneNote Package misidentified as Microsoft Cabinet archive
Christos Zoulas
christos at zoulas.com
Wed Sep 7 11:17:42 UTC 2022
Added, thanks!
christos
> On Sep 6, 2022, at 6:56 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> Some days ago I ran the cleaning tool czkawka, found at
> https://qarmin.github.io/czkawka/. One menu item concerns bad
> extensions. After running the tool I looked in the saved file list
> results_bad_extensions.txt for examples of bad extensions.
> One listed extension is ONEPKG.
>
> When running the file command (version 5.42) on such examples I get
> output like:
>
> DemoNotebook.onepkg: Microsoft Cabinet archive data,
> many, 234775 bytes, 8 files,
> at 0x2c
> +A "Bespechungsnotizen.one"
> +A "Forschung.one"
> , number 1,
> 12 datablocks, 0x1203 compression
> Notebook03.onepkg: Microsoft Cabinet archive data,
> many, 1272589 bytes, 2 files,
> at 0x44
> +Utf "Editor \303\266ffnen.onetoc2"
> +Utf "Allgemein.one"
> , flags 0x4, number 1,
> extra bytes 20 in head,
> 44 datablocks, 0xf03 compression
> ONGuide.onepkg: Microsoft Cabinet archive data,
> many, 248915 bytes, 2 files,
> at 0x44
> + "Editor \303\266ffnen.onetoc2"
> + "Erste Schritte - Beta 1.one"
> , flags 0x4, number 1,
> extra bytes 20 in head,
> 9 datablocks, 0xf03 compression
> test-onenote.onepkg: Microsoft Cabinet archive data,
> Windows 2000/XP setup, 3977 bytes, 1 file,
> at 0x2c
> +A "test-onenote.one"
> , number 1,
> 1 datablock, 0x1203 compression
>
> With the --extension option, cab or _/?_/??_ is displayed.
> Furthermore, with the -i option only the generic
> application/vnd.ms-cab-compressed is shown.
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It identifies all
> such examples with a low match rate as "Microsoft Cabinet Archive"
> via ark-cab.trid.xml. Many examples are described with a high match
> rate as "Microsoft OneNote Package" via onepkg.trid.xml
> (see appended trid-v-onepkg.txt.gz).
>
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies a few examples only generically as "Windows Cabinet File"
> with mime type application/vnd.ms-cab-compressed via PUID x-fmt/414,
> but it complains about the file suffix ONEPKG. Many examples are
> described as "Microsoft OneNote Package File" via fmt/987 (see
> appended output/droid-onepkg.csv). This utility identifies onepkg by
> looking for the file name extension onetoc within the first 2 KB.
> But this test does not always succeed, because some packages have no
> table of contents.
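
To illustrate why a table-of-contents test misses some packages, here is
a minimal Python sketch. It is not DROID's actual signature, only a
simplified stand-in: the function name has_onetoc_hint and the 2048-byte
window are assumptions made for this example.

```python
def has_onetoc_hint(data: bytes, window: int = 2048) -> bool:
    """Return True if '.onetoc' occurs in the first `window` bytes.

    Hypothetical stand-in for a "look for onetoc near the start" test;
    the window size of 2048 bytes is an assumption.
    """
    return b".onetoc" in data[:window].lower()


# Member names as they would appear inside the cabinet header area.
with_toc = b"MSCF\0\0\0\0" + "Editor \u00d6ffnen.onetoc2\0Allgemein.one\0".encode("utf-8")
without_toc = b"MSCF\0\0\0\0" + b"test-onenote.one\0"

print(has_onetoc_hint(with_toc))     # package with a table of contents
print(has_onetoc_hint(without_toc))  # package with only a .one section
```

A package such as test-onenote.onepkg above, whose only member is a .one
file, is not caught by a test of this kind.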
>
> Luckily TrID with the -v option shows the related URL and the file
> name extension. With this help, information about the file format
> can be added. This is expressed inside Magdir/msdos by comment lines
> like:
>
> # URL: https://en.wikipedia.org/wiki/Microsoft_OneNote
> # http://fileformats.archiveteam.org/wiki/OneNote
> # Reference: https://mark0.net/download/triddefs_xml.7z
> # defs/o/onepkg.trid.xml
>
> According to that documentation, ONEPKG files are just CAB archives
> containing Microsoft OneNote files, which have the 3-byte file name
> extension one. OneNote tables of contents have the file name
> extension ONETOC or ONETOC2.
>
> So when looking at the current output of the file command, we see
> that the first member name is something like "Class Notes.one",
> "test-onenote.one", "Open Notebook.onetoc2" or
> "Editor Öffnen.onetoc2". This can be verified with an unpacking
> command-line tool via a command line like:
> 7z l -tcab *onepkg
>
> The description inside Magdir/msdos starts like:
> 0 string/b MSCF\0\0\0\0 Microsoft Cabinet archive data
>
> Now I must insert lines looking for the 3-byte one file name suffix.
> The jump to the first member entry and the search for the point
> character before the suffix are done by lines like:
> >>(16.l+16) ubyte x
> .
> >>>>&-1 search/255 .
> Now I am in the branch for the file name extension. After the last
> entry of that kind (the one for the Windows 7 or 8 Theme Pack) I
> insert lines for the OneNote Package. This looks like:
> >>>>>&0 string/c one \b, OneNote Package
> !:mime application/msonenote
> !:ext onepkg
> Instead of the generic mime type application/vnd.ms-cab-compressed
> or application/octet-stream I show a type mentioned on the NirSoft
> web site. But this is not officially registered.
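
For readers who want to follow the offset arithmetic outside of the
magic syntax, the same test can be sketched in Python. This is an
approximation, not a replacement for the Magdir/msdos lines: the
function names are made up for this example, and unlike search/255 the
sketch only looks for the point inside the first member name instead of
scanning a fixed 255-byte range. It relies on the cabinet layout in
which the 4-byte little-endian value at offset 16 is the offset of the
first file entry and the member name starts 16 bytes into that entry.

```python
import struct
from typing import Optional

CAB_MAGIC = b"MSCF\0\0\0\0"


def first_member_name(data: bytes) -> Optional[bytes]:
    """Return the raw name of the first cabinet member, or None."""
    if len(data) < 20 or not data.startswith(CAB_MAGIC):
        return None
    # Mirrors (16.l+16): offset of the first file entry, plus the
    # 16 bytes of fixed fields that precede the member name.
    (coff_files,) = struct.unpack_from("<I", data, 16)
    start = coff_files + 16
    # The name is NUL-terminated; allow at most 255 name bytes.
    end = data.find(b"\0", start, start + 256)
    if end == -1:
        return None
    return data[start:end]


def looks_like_onepkg(data: bytes) -> bool:
    """True if the first member's suffix starts with 'one' (any case)."""
    name = first_member_name(data)
    if name is None:
        return False
    dot = name.find(b".")  # first point character, like search/255 .
    # Mirrors string/c one: case-insensitive match right after the point,
    # so .one, .onetoc and .onetoc2 all qualify.
    return dot != -1 and name[dot + 1:dot + 4].lower() == b"one"
```

A typical call would be looks_like_onepkg(open(path, "rb").read(4096)),
where path is whatever file is being examined; reading only the first
few KB assumes the first file entry sits near the start of the cabinet,
as in the examples above (0x2c and 0x44).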
>
> After applying the above-mentioned modifications with the patch
> file-5.42-msdos-onepkg.diff, my OneNote Packages are described
> more precisely, like:
>
> DemoNotebook.onepkg: Microsoft Cabinet archive data,
> OneNote Package, 234775 bytes, 8 files,
> at 0x2c
> +A "Bespechungsnotizen.one"
> +A "Forschung.one"
> , number 1,
> 12 datablocks, 0x1203 compression
> Notebook03.onepkg: Microsoft Cabinet archive data,
> OneNote Package, 1272589 bytes, 2 files,
> at 0x44
> +Utf "Editor \303\266ffnen.onetoc2"
> +Utf "Allgemein.one"
> , flags 0x4, number 1,
> extra bytes 20 in head,
> 44 datablocks, 0xf03 compression
> ONGuide.onepkg: Microsoft Cabinet archive data,
> OneNote Package, 248915 bytes, 2 files,
> at 0x44
> + "Editor \303\266ffnen.onetoc2"
> + "Erste Schritte - Beta 1.one"
> , flags 0x4, number 1,
> extra bytes 20 in head,
> 9 datablocks, 0xf03 compression
> test-onenote.onepkg: Microsoft Cabinet archive data,
> OneNote Package, 3977 bytes, 1 file,
> at 0x2c
> +A "test-onenote.one"
> , number 1,
> 1 datablock, 0x1203 compression
>
> I hope my diff file can be applied in a future version of the
> file utility.
>
> With best wishes
> Jörg Jenderek
> - --
> Jörg Jenderek
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYxfQBgAKCRCv8rHJQhrU
> 1v9rAKCdBfg22WJHViuJPPCmi4tT1XFSyQCgqhcIFHC+MO6hLZ4FT+hdNE2DxKw=
> =CI3n
> -----END PGP SIGNATURE-----
> <trid-v-onepkg.txt.gz><droid-onepkg.csv.gz><file-5_42-msdos-openpkg_diff.DEFANGED-303><file-5_42-msdos-openpkg_diff_sig.DEFANGED-304>