[File] [PATCH] Magdir/filesystems UDF versus 9660 CD-ROM *.ISO

Jörg Jenderek joerg.jen.der.ek at gmx.net
Fri Mar 17 21:19:20 UTC 2023


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

A few days ago I handled CD-ROM images (401 samples, including
duplicates). The standard file name suffix for this format is iso.
344 of them are described as "ISO 9660 CD-ROM filesystem data" by the
file command. The remaining 57 are described differently or not at
all. Some of these samples were created by myself with CD-ROM burning
software (like nero, imgburn, power2go, Ashampoo) or are used by
myself, so I know that these must be valid CD-ROM/DVD images.
When running the file command version 5.44 on such samples I get
output like:

BOOKSHELF.ISO:		High Sierra CD-ROM filesystem data
			'BOOKSHELF'
MyAshampoo-7.iso:	data
Shareware Grab Bag.iso: High Sierra CD-ROM filesystem data
			'GRAB_BAG'
TEST-11-imgburn.iso:    data
TEST-imgburn.iso:       UDF filesystem data (version 1.5)
			'TEST_IMGBURN'
ct_dvd_2019.iso:        data
imgburn-12-udf.iso:     data
nero-UDF1.iso:          data
nero-UDFv26.iso:        data
power2go.iso:           data
test-imgburn-2.udf:     data

With the option --extension I get output like:

BOOKSHELF.ISO:          ???
MyAshampoo-7.iso:       ???
Shareware Grab Bag.iso: ???
TEST-11-imgburn.iso:    ???
TEST-imgburn.iso:       iso/udf
ct_dvd_2019.iso:        ???
imgburn-12-udf.iso:     ???
nero-UDF1.iso:          ???
nero-UDFv26.iso:        ???
power2go.iso:           ???
test-imgburn-2.udf:     ???

For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). Of these samples it only
recognizes TEST-imgburn.iso, as "ISO 9660 CD image" by
iso-9660-image.trid.xml. All the other samples are described wrongly
or are not recognized (see the appended trid-v-udf.txt.gz).

For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). The recognized
TEST-imgburn.iso is described there as "UDF-ISO 9660 Bridge Disc" by
PUID fmt/1739. Many of the samples are described as "UDF Disc Image"
by PUID fmt/1738. But only the suffix ISO is considered valid; the
suffix UDF is considered bad (EXTENSION_MISMATCH true, see the
appended droid-udf.csv.gz).

With the help of the DROID tool I found pages about the UDF file
format. That is expressed inside Magdir/filesystems by comment lines
like:
# URL:		http://fileformats.archiveteam.org/
#		wiki/Universal_Disk_Format
#		https://en.wikipedia.org/wiki/Universal_Disk_Format
# Reference:	https://wiki.osdev.org/UDF

The recognized samples are detected by lines inside
Magdir/filesystems which look like:
 32769	string    CD001
 >0	use	cdrom
 0	name				cdrom
 >38913	string   !NSR0      ISO 9660 CD-ROM filesystem data
 !:mime	application/x-iso9660-image
 !:ext	iso/iso9660
 >38913	string    NSR0      UDF filesystem data
 !:mime	application/x-iso9660-image
 !:ext	iso/udf
 >>38917	string    1         (version 1.0)
 >>38917	string    2         (version 1.5)
 >>38917	string    3         (version 2.0)
So samples like TEST-imgburn.iso are detected. These contain two
parts: one for ISO 9660 CD-ROM and one for UDF. Such hybrid images
are therefore called "UDF-ISO 9660 Bridge Disc". The currently
undetected samples apparently contain no ISO 9660 CD-ROM part but
only a UDF part. This can be verified with the udftools command line
and/or the 7-Zip command line like:
	udfinfo nero-UDFv26.iso
	7z l -tUdf nero-UDF1.iso
	7z l -tIso TEST-imgburn.iso
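
As a rough illustration (not part of the patch), the existing cdrom
rule quoted above can be mimicked in Python. The function names
offset and classify_cd001 are made up for this sketch, and a block
size of 2048 bytes is assumed, as in the magic offsets.

```python
from typing import Optional

BLOCK_SIZE = 2048  # assumption: logical block size behind the magic offsets


def offset(block: int, rel: int = 1) -> int:
    """Byte offset of a descriptor identifier: block start plus relative offset."""
    return block * BLOCK_SIZE + rel


def classify_cd001(image: bytes) -> Optional[str]:
    """Mimic the quoted rule: CD001 in block 16, then test for NSR0 in block 19."""
    start = offset(16)                      # 32769
    if image[start:start + 5] != b"CD001":
        return None                         # rule does not match at all
    nsr = offset(19)                        # 38913
    if image[nsr:nsr + 4] == b"NSR0":
        return "UDF filesystem data"        # hybrid (bridge) image
    return "ISO 9660 CD-ROM filesystem data"
```

An image without CD001 in block 16 yields None here, which
corresponds to the samples reported as "data".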

The method used by DROID for UDF, expressed as magic lines, looks
like:
 32769	string    	BEA01
 >34817	string    	NSR0	UDF filesystem data
 !:mime	application/x-udf-image
 !:ext	iso/udf
 >>34821	ubyte    	0x32	(version 1.x)
 >>34821	ubyte    	0x33	(version 2.x)

Instead of the string CD001, I find here an extended descriptor
section (indicated by the string BEA01) at relative offset 1 of block
16, with a block size of 2048 bytes. In the next block I find the
string NSR0. This descriptor type is an indicator for UDF. For
CD-ROM/DVD images with an ISO 9660 CD-ROM filesystem the suffix ISO
is used. It is also used for images with a UDF part. To distinguish
these from the older format, the suffix UDF is apparently also used.
This is not mentioned or described officially, but it is essential
for Windows systems, which rely on the file name suffix.
Furthermore, I found in the shared MIME database a user-defined MIME
type for the hybrid variant, so I take this. Unfortunately I have
neither the time nor the brains to read and understand hundreds of
pages of specification, so I do not know whether the version lines
are correct. The output of udfinfo reports revisions by lines like:
	udfrev=2.50
	udfwriterev=2.60
The images also contain more meta information fields (like owner,
organization, contact, appid, impid, winserialnum) which may be
useful and are shown by the mentioned tools, but I was not able to
show them.
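
The signature test described above (BEA01 at relative offset 1 of
block 16, NSR0 in the next block, then one version byte) can be
sketched in Python as follows. This is only an illustration of the
proposed magic lines, not a full UDF parser; the name detect_udf is
hypothetical and the 1.x/2.x mapping is simply taken from the magic
lines.

```python
from typing import Optional

BLOCK_SIZE = 2048  # assumption: same block size as in the magic lines


def detect_udf(image: bytes) -> Optional[str]:
    """Apply the proposed test: BEA01 in block 16 and NSR0 in block 17."""
    bea = 16 * BLOCK_SIZE + 1               # 32769
    nsr = 17 * BLOCK_SIZE + 1               # 34817
    if image[bea:bea + 5] != b"BEA01":
        return None
    if image[nsr:nsr + 4] != b"NSR0":
        return None
    # Byte after "NSR0" (offset 34821): 0x32 is '2', 0x33 is '3'.
    version = {b"2": "1.x", b"3": "2.x"}.get(image[nsr + 4:nsr + 5])
    if version is None:
        return "UDF filesystem data"
    return "UDF filesystem data (version %s)" % version
```

Like the magic lines, the sketch prints no version when the byte
after NSR0 is neither 0x32 nor 0x33.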

After applying the above-mentioned modifications with the patch
file-5.44-filesystems-udf.diff, most of my UDF samples are now
recognized.

A few of my images, like BOOKSHELF.ISO and "Shareware Grab Bag.iso",
are described inside Magdir/filesystems by lines like:
 32777	string    CDROM     High Sierra CD-ROM filesystem data
 >32816	string/T  >\0       '%.32s'

This variant is also described on the Archive Team file formats web
site. That is expressed inside Magdir/filesystems by an additional
comment line like:
# URL:		http://fileformats.archiveteam.org/wiki/High_Sierra
So I add a line to show the file name suffix:
!:ext	iso
Unfortunately I found no MIME type. I only got some vague hints like
application/x-hsfs-image, where HSFS means High Sierra File System.
Such hints can be found when searching for man pages about mounting
such hsfs file system images.
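
For illustration only, the two High Sierra magic lines quoted above
(string CDROM at offset 32777, then up to 32 characters at offset
32816) correspond roughly to this Python sketch. The function name
high_sierra_label is hypothetical, and the trimming only approximates
the /T flag.

```python
from typing import Optional


def high_sierra_label(image: bytes) -> Optional[str]:
    """Return the label printed by '%.32s', or None if CDROM is absent."""
    if image[32777:32782] != b"CDROM":
        return None
    raw = image[32816:32816 + 32]           # at most 32 bytes, as in %.32s
    raw = raw.split(b"\0", 1)[0]            # stop at the first NUL byte
    # /T trims surrounding whitespace; an empty result means the
    # magic line would print no label at all.
    return raw.decode("ascii", errors="replace").strip()
```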

After applying the above-mentioned modifications with the patch
file-5.44-filesystems-udf.diff, most of my samples are now
recognized. The output now looks like:

BOOKSHELF.ISO:		High Sierra CD-ROM filesystem data
			'BOOKSHELF'
MyAshampoo-7.iso:	UDF filesystem data (version 1.x)
Shareware Grab Bag.iso: High Sierra CD-ROM filesystem data
			'GRAB_BAG'
TEST-11-imgburn.iso:	UDF filesystem data (version 2.x)
TEST-imgburn.iso:	UDF filesystem data (version 1.5)
			'TEST_IMGBURN'
ct_dvd_2019.iso:	UDF filesystem data (version 1.x)
imgburn-12-udf.iso:	data
nero-UDF1.iso:		UDF filesystem data (version 1.x)
nero-UDFv26.iso:	UDF filesystem data (version 2.x)
power2go.iso:		UDF filesystem data (version 2.x)
test-imgburn-2.udf:	UDF filesystem data (version 2.x)

For the High Sierra CD-ROM filesystem the file name suffix is now
shown when running with the --extension option. The output now looks
like:

BOOKSHELF.ISO:          iso
MyAshampoo-7.iso:       iso/udf
Shareware Grab Bag.iso: iso
TEST-11-imgburn.iso:    iso/udf
TEST-imgburn.iso:       iso/udf
ct_dvd_2019.iso:        iso/udf
imgburn-12-udf.iso:     ???
nero-UDF1.iso:          iso/udf
nero-UDFv26.iso:        iso/udf
power2go.iso:           iso/udf
test-imgburn-2.udf:     iso/udf

I hope my diff file can be applied in a future version of the file
utility.

There is still something to do. Some ISO samples are still not
recognized. These also seem to have a UDF file system.

With the option --apple the file command outputs the file type and
creator code as used by older MacOS versions. This option works
properly only for file formats that have the apple-style output
defined. For GIF image data this is done by a magic line like:
!:apple	8BIMGIFf
So it would be nice to have an option to show information concerning
the TrID and DROID identification tools by lines like:
!:trid	bitmap-gif89a.trid.xml
!:puid	fmt/4
At the moment I put such information as comments inside the magic
lines. The situation is similar to antivirus software: every company
or institution has its own naming convention. If you are lucky, you
can see with some thought that the same thing is meant because the
descriptions are nearly the same. But sometimes they are quite
different, which confuses people.
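
To make the proposal a bit more concrete, here is a small Python
sketch that collects "!:" annotation lines from a magic fragment.
Note that !:trid and !:puid are only suggested here and are not
keywords of the current file utility; the function name annotations
is also hypothetical.

```python
from typing import Dict, List


def annotations(magic_text: str) -> Dict[str, List[str]]:
    """Collect '!:key value' lines; comment lines ('#...') are skipped."""
    found: Dict[str, List[str]] = {}
    for line in magic_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("!:"):
            continue
        parts = stripped[2:].split(None, 1)  # key, then the rest of the line
        if not parts:
            continue
        value = parts[1] if len(parts) > 1 else ""
        found.setdefault(parts[0], []).append(value)
    return found


# The GIF example from this message, including the two proposed keywords.
GIF_FRAGMENT = "!:apple\t8BIMGIFf\n!:trid\tbitmap-gif89a.trid.xml\n!:puid\tfmt/4\n"
```

Calling annotations(GIF_FRAGMENT) gives one entry each for apple,
trid and puid.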

With best wishes,
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCZBTZRQAKCRCv8rHJQhrU
1pZDAJ9aS++IR6VQzZ4ZNhZ2Ry0Bhg3uxQCgmFNWnIqwfRXAqbmINvf/1OBEolo=
=riYG
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: droid-udf.csv.gz
Type: application/x-gzip
Size: 642 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230317/ff3090c7/attachment.bin>
-------------- next part --------------
--- file-5.44/magic/Magdir/filesystems.old	2022-12-26 19:00:47.000000000 +0100
+++ file-5.44/magic/Magdir/filesystems	2023-03-17 17:59:22.381480100 +0100
@@ -1957,7 +1957,14 @@
 >34816	string    \000CD001\001EL\ TORITO\ SPECIFICATION    (bootable)
 37633	string    CD001     ISO 9660 CD-ROM filesystem data (raw 2352 byte sectors)
 !:mime	application/x-iso9660-image
+# URL:		http://fileformats.archiveteam.org/wiki/High_Sierra
+# Update:	Joerg Jenderek
 32777	string    CDROM     High Sierra CD-ROM filesystem data
+# https://www.unix.com/man-page/OpenSolaris/7fs/hsfs/
+#!:mime	application/octet-stream
+#!:mime	application/x-hsfs-image
+# BOOKSHELF.ISO "Shareware Grab Bag.iso"
+!:ext	iso
 # "application id" which appears to be used as a volume label
 >32816	string/T  >\0       '%.32s'
 
@@ -1973,6 +1980,26 @@
 !:strength +35
 >0	use	cdrom
 
+# From:		Joerg Jenderek
+# URL:		http://fileformats.archiveteam.org/wiki/Universal_Disk_Format
+#		https://en.wikipedia.org/wiki/Universal_Disk_Format
+# Reference:	https://wiki.osdev.org/UDF
+# Note:		called "UDF Disc Image" by DROID via PUID fmt/1738
+#		verified by udftools `udfinfo nero-UDFv26.iso` and 7-Zip `7z l -tUdf nero-UDF1.iso`
+# 		there seems to exist variants which are not recognized by current test lines
+#		
+# look for type descriptor at relative offset 1 of block 16 with size 2048
+# it is an extended descriptor section
+32769	string    	BEA01
+# look for type descriptor at relative offset 1 of block 17 with size 2048
+>34817	string    	NSR0	UDF filesystem data
+#!:mime	application/octet-stream
+!:mime	application/x-udf-image
+!:ext	iso/udf
+# reported as udfrev and udfwriterev by udfinfo
+>>34821	ubyte    	0x32	(version 1.x)
+>>34821	ubyte    	0x33	(version 2.x)
+
 # URL: https://en.wikipedia.org/wiki/NRG_(file_format)
 # Reference: https://dl.opendesktop.org/api/files/download/id/1460731811/
 #	11577-mount-iso-0.9.5.tar.bz2/mount-iso-0.9.5/install.sh
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-udf.txt.gz
Type: application/x-gzip
Size: 1407 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230317/ff3090c7/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.44-filesystems-udf.diff.sig
Type: application/octet-stream
Size: 1182 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230317/ff3090c7/attachment.obj>

