[File] [PATCH] Magdir/archive Atari MSA archive misidentifies setup.skin

Jörg Jenderek joerg.jen.der.ek at gmx.net
Tue May 24 22:38:58 UTC 2022


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

A few days ago I handled some skin examples. When running the file
command, version 5.41, on such SKIN examples and related files, I
get output like:

PDATS578.msa: Atari MSA archive data,
	      9 sectors per track, 2 sided,
	      starting track: 0, ending track: 79
XXX_INT.MSA:  Atari MSA archive data,
	      10 sectors per track, 2 sided,
	      starting track: 0, ending track: 81
adr_1.msa:    Atari MSA archive data,
	      10 sectors per track, 2 sided,
	      starting track: 0, ending track: 80
maggie4a.msa: Atari MSA archive data,
	      10 sectors per track, 2 sided,
	      starting track: 0, ending track: 39
setup.skin:   Atari MSA archive data,
	      -11636 sectors per track,
	      starting track: 22332, ending track: 3470


With option -i only the generic application/octet-stream is shown.
With option --extension only the 3-byte sequence ??? is shown.

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). It identifies the
Atari examples as "Atari MSA Disk Image" via msa.trid.xml. It does not
misidentify the example setup.skin as MSA, but it was not able to
recognise it either (described as "Unknown!", see appended
trid-v-msa.txt.gz).

Luckily the TrID tool with option -v shows the file name extension
used. With this information I was able to find a page about
MSA (Magic Shadow Archiver) on the File Formats Archive Team web site.
A link to the Atari Image File Formats specification is also listed
there. That information is added to Magdir/archive as comment
lines like:

# URL:		http://fileformats.archiveteam.org/
#		wiki/MSA_(Magic_Shadow_Archiver)
# Reference:	http://info-coach.fr
#		atari/documents/_mydoc/FD_Image_File_Format.pdf
#		http://mark0.net/download/triddefs_xml.7z
#		defs/m/msa.trid.xml

On the mentioned site, download links for examples and tools are
listed. I partly verified the information with the decoding tool deark
(see appended deark-l-msa.txt.gz) using command lines like:
	deark -l -m msa -d2 PDATS578.msa
Here, too, the example setup.skin is not misidentified.

The detection happens inside Magdir/archive by lines like:
# Atari MSA archive - Teemu Hukkanen <tjhukkan at iki.fi>
0	beshort 0x0e0f		Atari MSA archive data

So only 2 bytes are used for detection, and obviously this
recognition method is not strong enough. Further information is
shown by additional lines like:
 >2	beshort x		\b, %d sectors per track
 >6	beshort x		\b, starting track: %d
 >8	beshort x		\b, ending track: %d

For real MSA examples I get low values:
	9 10		sectors per track
	0		starting track
	39 79 80 81	ending track
For bad examples like setup.skin I get unrealistically high numbers,
or numbers so high that they are interpreted as negative values.
So I could use those values as an additional test, but it is hard
to define or research the limits of what is valid and what is wrong.
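
To illustrate where the negative number comes from, here is a minimal
Python sketch, not part of the patch. The helper name parse_msa_header
is hypothetical, and the sample header bytes are an assumption rebuilt
from the values reported above for setup.skin (with the sides word
52255 taken from the comment in the attached diff). It reads the same
offsets as the magic lines, treating "beshort" fields as signed, which
is how %d prints them:

```python
import struct

def parse_msa_header(data: bytes) -> dict:
    """Decode the first 10 bytes as five big-endian 16-bit fields.

    Hypothetical helper for illustration only. Offsets follow the
    magic lines: 0 = magic, 2 = sectors per track, 4 = sides minus
    one, 6 = starting track, 8 = ending track.
    """
    # "h" is signed (like beshort with %d), "H" is unsigned (like ubeshort)
    magic, sectors, sides, start, end = struct.unpack_from(">HhHhh", data, 0)
    return {
        "magic": magic,
        "sectors_per_track": sectors,
        "sides_minus_one": sides,
        "starting_track": start,
        "ending_track": end,
    }

# Assumed header bytes, reconstructed from the reported setup.skin values
skin = bytes.fromhex("0e0fd28ccc1f573c0d8e")
# Assumed header bytes matching the output shown for PDATS578.msa
real = bytes.fromhex("0e0f000900010000004f")

print(parse_msa_header(skin))
print(parse_msa_header(real))
```

The word 0xd28c is 53900 when read unsigned, but -11636 when read as
a signed 16-bit value, which matches the "sectors per track" output
for setup.skin.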

The number of sides minus one is stored as a 2-byte integer in
big-endian format, so only the values zero and one can occur here. That
information is shown by lines like:
 >4	beshort 0		\b, 1 sided
 >4	beshort 1		\b, 2 sided

So I use that reliable information, and recognition now starts with
lines like:
 0	beshort 0x0e0f
 >4	ubeshort <2		Atari MSA archive data
 !:mime	application/x-atari-msa
 !:ext	msa
So now the bad example setup.skin is skipped. Instead of the generic
mime type application/octet-stream for such binary files, I apply a
user-defined one.
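
The strengthened check can be mirrored outside of file as a rough
Python sketch. This is only an illustration of the two magic lines
above, not how file itself is implemented; the function name
looks_like_msa and the sample header bytes (rebuilt from the reported
field values) are assumptions:

```python
import struct

def looks_like_msa(data: bytes) -> bool:
    """Return True if data passes both tests of the patched magic entry.

    Hypothetical helper: checks the 2-byte magic 0x0e0f at offset 0
    and requires the unsigned big-endian word at offset 4 (sides
    minus one) to be less than 2.
    """
    if len(data) < 6:
        return False  # too short to hold the magic and the sides word
    magic, = struct.unpack_from(">H", data, 0)
    sides_minus_one, = struct.unpack_from(">H", data, 4)
    return magic == 0x0E0F and sides_minus_one < 2

# Assumed header bytes matching the output shown for maggie4a.msa
# (10 sectors per track, 2 sided, tracks 0 to 39)
real = bytes.fromhex("0e0f000a000100000027")
# Assumed header bytes reconstructed from the setup.skin values
skin = bytes.fromhex("0e0fd28ccc1f573c0d8e")

print(looks_like_msa(real))
print(looks_like_msa(skin))
```

Comparing the sides word as unsigned matters here: read as a signed
value, 0xcc1f would be negative and would wrongly pass a "<2" test.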

After applying the above-mentioned modifications with the patch
file-5.41-archive-msa.diff, all my inspected MSA examples are
still described, but the misidentification vanishes. This now looks
like:

PDATS578.msa: Atari MSA archive data,
	      9 sectors per track, 2 sided,
	      starting track: 0, ending track: 79
XXX_INT.MSA:  Atari MSA archive data,
	      10 sectors per track, 2 sided,
	      starting track: 0, ending track: 81
adr_1.msa:    Atari MSA archive data,
	      10 sectors per track, 2 sided,
	      starting track: 0, ending track: 80
maggie4a.msa: Atari MSA archive data,
	      10 sectors per track, 2 sided,
	      starting track: 0, ending track: 39
setup.skin:   data

I hope my diff file can be applied in a future version of the file
utility.

With best wishes,
Jörg Jenderek

-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYo1egQAKCRCv8rHJQhrU
1vfaAKCs+4Ft0ISxNuYhQv5tm9m0Sul0SACgqQ3sSWCHFpkwNEwY7mXzKgnm3g0=
=2miB
-----END PGP SIGNATURE-----
-------------- next part --------------
--- file-5.41/magic/Magdir/archive.old	2021-08-30 11:10:26.000000000 +0200
+++ file-5.41/magic/Magdir/archive	2022-05-25 00:26:40.428775500 +0200
@@ -1523,10 +1523,28 @@
 
 # Atari MSA archive - Teemu Hukkanen <tjhukkan at iki.fi>
-0	beshort 0x0e0f		Atari MSA archive data
->2	beshort x		\b, %d sectors per track
->4	beshort 0		\b, 1 sided
->4	beshort 1		\b, 2 sided
->6	beshort x		\b, starting track: %d
->8	beshort x		\b, ending track: %d
+# URL:		http://fileformats.archiveteam.org/wiki/MSA_(Magic_Shadow_Archiver)
+# Reference:	http://info-coach.fr/atari/documents/_mydoc/FD_Image_File_Format.pdf
+#		http://mark0.net/download/triddefs_xml.7z/defs/m/msa.trid.xml
+# Update:	Joerg Jenderek
+# Note:		called by TrID "Atari MSA Disk Image" and verified by
+#		command like `deark -l -m msa -d2 PDATS578.msa` as " Atari ST floppy disk image"
+# GRR: line below is too general as it matches setup.skin
+0	beshort 0x0e0f
+# skip foo setup.skin with unrealistic high number 52255 of sides by check for valid "low" value
+>4	ubeshort <2		Atari MSA archive data
+#!:mime	application/octet-stream
+!:mime	application/x-atari-msa
+!:ext	msa
+# sectors per track like: 9 10
+>>2	beshort x		\b, %d sectors per track
+# sides (0 or 1; add 1 to this to get correct number of sides)
+>>4	beshort 0		\b, 1 sided
+>>4	beshort 1		\b, 2 sided
+# starting track like: 0
+>>6	beshort x		\b, starting track: %d
+# ending track like: 39 79 80 81
+>>8	beshort x		\b, ending track: %d
+# tracks content
+#>>10	ubequad x		\b, track content %#16.16llx
 
 # Alternate ZIP string (amc at arwen.cs.berkeley.edu)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.41-archive-msa.diff.sig
Type: application/octet-stream
Size: 992 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220525/f15cf02a/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deark-l-msa.txt.gz
Type: application/x-gzip
Size: 12880 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220525/f15cf02a/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-msa.txt.gz
Type: application/x-gzip
Size: 325 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220525/f15cf02a/attachment-0003.bin>

