[File] [PATCH] Magdir/filesystems UDF versus 9660 CD-ROM *.ISO
Jörg Jenderek
joerg.jen.der.ek at gmx.net
Fri Mar 17 21:19:20 UTC 2023
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
A few days ago I handled CD-ROM images (401 samples, including
duplicates). The standard file name suffix is iso. 344 are
described as "ISO 9660 CD-ROM filesystem data" by the file command.
The remaining ones are described differently or not at all. Some of
these samples were created by me with CD-ROM burning software (such
as Nero, ImgBurn, Power2Go, Ashampoo) or are used by me, so I know
that they must be valid CD-ROM/DVD images.
When running the file command, version 5.44, on such samples I get
output like:
BOOKSHELF.ISO: High Sierra CD-ROM filesystem data 'BOOKSHELF'
MyAshampoo-7.iso: data
Shareware Grab Bag.iso: High Sierra CD-ROM filesystem data 'GRAB_BAG'
TEST-11-imgburn.iso: data
TEST-imgburn.iso: UDF filesystem data (version 1.5) 'TEST_IMGBURN'
ct_dvd_2019.iso: data
imgburn-12-udf.iso: data
nero-UDF1.iso: data
nero-UDFv26.iso: data
power2go.iso: data
test-imgburn-2.udf: data
With the --extension option I get output like:
BOOKSHELF.ISO: ???
MyAshampoo-7.iso: ???
Shareware Grab Bag.iso: ???
TEST-11-imgburn.iso: ???
TEST-imgburn.iso: iso/udf
ct_dvd_2019.iso: ???
imgburn-12-udf.iso: ???
nero-UDF1.iso: ???
nero-UDFv26.iso: ???
power2go.iso: ???
test-imgburn-2.udf: ???
For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). Of these samples it
recognizes only TEST-imgburn.iso, as "ISO 9660 CD image" by
iso-9660-image.trid.xml. All the other samples are described
wrongly or are not recognized (see appended trid-v-udf.txt.gz).
For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). The recognized
TEST-imgburn.iso is described there as "UDF-ISO 9660 Bridge Disc" by
PUID fmt/1739. Many of the samples are described as "UDF Disc Image"
by PUID fmt/1738. But only the ISO suffix is considered valid, and
the UDF suffix is considered bad (EXTENSION_MISMATCH true; see
appended droid-udf.csv.gz).
With the help of the DROID tool I found pages about the UDF file
format. These are referenced inside Magdir/filesystems by comment
lines like:
# URL: http://fileformats.archiveteam.org/
# wiki/Universal_Disk_Format
# https://en.wikipedia.org/wiki/Universal_Disk_Format
# Reference: https://wiki.osdev.org/UDF
The detected samples are matched by lines inside Magdir/filesystems
that look like:
32769 string CD001
>0 use cdrom
0 name cdrom
>38913 string !NSR0 ISO 9660 CD-ROM filesystem data
!:mime application/x-iso9660-image
!:ext iso/iso9660
>38913 string NSR0 UDF filesystem data
!:mime application/x-iso9660-image
!:ext iso/udf
>>38917 string 1 (version 1.0)
>>38917 string 2 (version 1.5)
>>38917 string 3 (version 2.0)
So samples like TEST-imgburn.iso are detected. These contain two
parts: one for ISO 9660 and one for UDF. Such hybrid images are
therefore called "UDF-ISO 9660 Bridge Disc". The currently
undetected samples apparently contain no ISO 9660 part but only a
UDF part. This can be verified with the udftools and/or 7-Zip
command lines, like:
udfinfo nero-UDFv26.iso
7z l -tUdf nero-UDF1.iso
7z l -tIso TEST-imgburn.iso
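As a rough illustration of why UDF-only images fall through to
"data", the existing rules quoted above can be mimicked in a few
lines of Python. This is only a sketch of the offsets in those magic
lines, not how file itself is implemented. The function name
classify_with_current_rules is made up for this example, and every
other rule in Magdir/filesystems (such as High Sierra) is ignored.

```python
def classify_with_current_rules(image: bytes) -> str:
    """Mimic the existing CD001/NSR0 rules quoted above (sketch only)."""
    # Top-level test: string CD001 at offset 32769.
    if image[32769:32774] != b"CD001":
        # No ISO 9660 descriptor, so the cdrom rules are never reached.
        return "data"
    # Inside the cdrom rule: NSR0 at offset 38913 marks a hybrid image.
    if image[38913:38917] == b"NSR0":
        return "UDF filesystem data"
    return "ISO 9660 CD-ROM filesystem data"
```

An image whose block 16 starts with BEA01 instead of CD001 never
reaches the cdrom rule, which matches the "data" results listed
earlier.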
The method used by DROID for UDF, expressed as magic lines, looks like:
32769 string BEA01
>34817 string NSR0 UDF filesystem data
!:mime application/x-udf-image
!:ext iso/udf
>>34821 ubyte 0x32 (version 1.x)
>>34821 ubyte 0x33 (version 2.x)
Instead of the string CD001, here I find an extended descriptor
section (indicated by the string BEA01) at relative offset 1 of
block 16, with a block size of 2048. In the next block I find the
string NSR0. This type descriptor is an indicator for UDF. For
CD-ROM/DVD images with an ISO 9660 filesystem the suffix ISO is
used. It is also used for images with a UDF part. To distinguish
these from the older format, the suffix UDF is apparently also used.
This is not mentioned or described officially, but it is essential
on Windows systems, which rely on the file name suffix. Furthermore,
I found in the shared MIME database a user-defined MIME type for the
hybrid variant, so I use it. Unfortunately I have neither the time
nor the brainpower to read and understand hundreds of pages of
specification, so I do not know whether the lines with the version
are correct.
The output of udfinfo reports versions in lines like:
udfrev=2.50
udfwriterev=2.60
The images also contain more meta-information fields (like owner,
organization, contact, appid, impid, winserialnum) which may be
useful and are shown by the mentioned tools, but I was not able to
show them with magic lines.
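The offsets in the proposed magic lines follow directly from the
block layout described above: the identifier sits one byte into a
2048-byte block, so block 16 gives 16 * 2048 + 1 = 32769 and block
17 gives 17 * 2048 + 1 = 34817. A minimal Python sketch of the same
test follows. The names descriptor_offset and udf_version_hint are
hypothetical, and the mapping of the byte after NSR0 to "1.x" or
"2.x" simply copies the magic lines, whose correctness is left open
above.

```python
from typing import Optional

BLOCK_SIZE = 2048  # block size stated in the text


def descriptor_offset(block: int) -> int:
    """Offset of the identifier string, one byte into the given block."""
    return block * BLOCK_SIZE + 1


def udf_version_hint(image: bytes) -> Optional[str]:
    """Return a version hint if the BEA01/NSR0 pair is found, else None."""
    bea = descriptor_offset(16)  # 32769
    nsr = descriptor_offset(17)  # 34817
    if image[bea:bea + 5] != b"BEA01":
        return None
    if image[nsr:nsr + 4] != b"NSR0":
        return None
    # Byte after "NSR0": 0x32 is ASCII '2', 0x33 is ASCII '3'.
    digit = image[nsr + 4:nsr + 5]
    if digit == b"\x32":
        return "1.x"
    if digit == b"\x33":
        return "2.x"
    return "unknown"
```

Like the magic lines, this only looks at block 17 for NSR0, so an
image that stores it elsewhere would not be matched.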
After applying the above modifications with the patch
file-5.44-filesystems-udf.diff, most of my UDF samples are
recognized.
A few of my images, like BOOKSHELF.ISO and "Shareware Grab Bag.iso",
are matched inside Magdir/filesystems by lines like:
32777 string CDROM High Sierra CD-ROM filesystem data
>32816 string/T >\0 '%.32s'
This variant is also described on the File Formats Archive Team web
site. That is expressed inside Magdir/filesystems by an additional
comment line like:
# URL: http://fileformats.archiveteam.org/wiki/High_Sierra
So I add a line to show the file name suffix:
!:ext iso
Unfortunately I found no MIME type. I only got some vague hints like
application/x-hsfs-image, where HSFS means High Sierra File System.
Such hints can be found when searching for man pages about mounting
such hsfs file system images.
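The High Sierra test can be sketched in Python as well. This is an
illustration only: it checks for CDROM at offset 32777 and, if
found, reads up to 32 bytes at offset 32816 as the label, roughly
like the string/T and '%.32s' parts of the magic line. The function
name high_sierra_label is made up, and the handling of NUL bytes and
whitespace is a simplification of what file does.

```python
from typing import Optional


def high_sierra_label(image: bytes) -> Optional[str]:
    """Return the label of a High Sierra image, or None if not matched."""
    # Magic line: 32777 string CDROM
    if image[32777:32782] != b"CDROM":
        return None
    # Magic line: >32816 string/T >\0 '%.32s' (at most 32 bytes, trimmed).
    field = image[32816:32848].split(b"\0", 1)[0]
    # An empty result corresponds to the case where file prints no label.
    return field.decode("ascii", errors="replace").strip()
```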
After applying the above modifications with the patch
file-5.44-filesystems-udf.diff, most of my samples are now
recognized. The output now looks like:
BOOKSHELF.ISO: High Sierra CD-ROM filesystem data 'BOOKSHELF'
MyAshampoo-7.iso: UDF filesystem data (version 1.x)
Shareware Grab Bag.iso: High Sierra CD-ROM filesystem data 'GRAB_BAG'
TEST-11-imgburn.iso: UDF filesystem data (version 2.x)
TEST-imgburn.iso: UDF filesystem data (version 1.5) 'TEST_IMGBURN'
ct_dvd_2019.iso: UDF filesystem data (version 1.x)
imgburn-12-udf.iso: data
nero-UDF1.iso: UDF filesystem data (version 1.x)
nero-UDFv26.iso: UDF filesystem data (version 2.x)
power2go.iso: UDF filesystem data (version 2.x)
test-imgburn-2.udf: UDF filesystem data (version 2.x)
For the High Sierra CD-ROM filesystem the file name suffix is now
shown when running with the --extension option. The output now
looks like:
BOOKSHELF.ISO: iso
MyAshampoo-7.iso: iso/udf
Shareware Grab Bag.iso: iso
TEST-11-imgburn.iso: iso/udf
TEST-imgburn.iso: iso/udf
ct_dvd_2019.iso: iso/udf
imgburn-12-udf.iso: ???
nero-UDF1.iso: iso/udf
nero-UDFv26.iso: iso/udf
power2go.iso: iso/udf
test-imgburn-2.udf: iso/udf
I hope my diff file can be applied in a future version of the file
utility.
There is still something to do. Some ISO samples are still not
recognized. These also seem to have a UDF file system.
With the --apple option the file command outputs the file type and
creator code as used by older Mac OS versions. This option works
properly only for file formats that have the apple-style output
defined. For GIF image data this is done by a magic line like:
!:apple 8BIMGIFf
So it would be nice to have an option to show information about the
TrID and DROID identification tools, via lines like:
!:trid bitmap-gif89a.trid.xml
!:puid fmt/4
At the moment I put such information as comments inside the magic
lines. The situation is similar to antivirus software: every company
or institution has its own naming convention. If you are lucky, with
some thought you can see that the same thing is meant because the
descriptions are nearly the same. But sometimes they are quite
different, which confuses people.
With best wishes,
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCZBTZRQAKCRCv8rHJQhrU
1pZDAJ9aS++IR6VQzZ4ZNhZ2Ry0Bhg3uxQCgmFNWnIqwfRXAqbmINvf/1OBEolo=
=riYG
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: droid-udf.csv.gz
Type: application/x-gzip
Size: 642 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230317/ff3090c7/attachment.bin>
-------------- next part --------------
--- file-5.44/magic/Magdir/filesystems.old 2022-12-26 19:00:47.000000000 +0100
+++ file-5.44/magic/Magdir/filesystems 2023-03-17 17:59:22.381480100 +0100
@@ -1957,7 +1957,14 @@
>34816 string \000CD001\001EL\ TORITO\ SPECIFICATION (bootable)
37633 string CD001 ISO 9660 CD-ROM filesystem data (raw 2352 byte sectors)
!:mime application/x-iso9660-image
+# URL: http://fileformats.archiveteam.org/wiki/High_Sierra
+# Update: Joerg Jenderek
32777 string CDROM High Sierra CD-ROM filesystem data
+# https://www.unix.com/man-page/OpenSolaris/7fs/hsfs/
+#!:mime application/octet-stream
+#!:mime application/x-hsfs-image
+# BOOKSHELF.ISO "Shareware Grab Bag.iso"
+!:ext iso
# "application id" which appears to be used as a volume label
>32816 string/T >\0 '%.32s'
@@ -1973,6 +1980,26 @@
!:strength +35
>0 use cdrom
+# From: Joerg Jenderek
+# URL: http://fileformats.archiveteam.org/wiki/Universal_Disk_Format
+# https://en.wikipedia.org/wiki/Universal_Disk_Format
+# Reference: https://wiki.osdev.org/UDF
+# Note: called "UDF Disc Image" by DROID via PUID fmt/1738
+# verified by udftools `udfinfo nero-UDFv26.iso` and 7-Zip `7z l -tUdf nero-UDF1.iso`
+# there seems to exist variants which are not recognized by current test lines
+#
+# look for type descriptor at relative offset 1 of block 16 with size 2048
+# it is an extended descriptor section
+32769 string BEA01
+# look for type descriptor at relative offset 1 of block 17 with size 2048
+>34817 string NSR0 UDF filesystem data
+#!:mime application/octet-stream
+!:mime application/x-udf-image
+!:ext iso/udf
+# reported as udfrev and udfwriterev by udfinfo
+>>34821 ubyte 0x32 (version 1.x)
+>>34821 ubyte 0x33 (version 2.x)
+
# URL: https://en.wikipedia.org/wiki/NRG_(file_format)
# Reference: https://dl.opendesktop.org/api/files/download/id/1460731811/
# 11577-mount-iso-0.9.5.tar.bz2/mount-iso-0.9.5/install.sh
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-udf.txt.gz
Type: application/x-gzip
Size: 1407 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230317/ff3090c7/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.44-filesystems-udf.diff.sig
Type: application/octet-stream
Size: 1182 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230317/ff3090c7/attachment.obj>