[File] [PATCH] Magdir/archive Atari MSA archive misidentifies setup.skin
Jörg Jenderek
joerg.jen.der.ek at gmx.net
Tue May 24 22:38:58 UTC 2022
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
some days ago I handled some skin examples. When running file
command version 5.41 on such SKIN examples and related files, I
get output like:
PDATS578.msa: Atari MSA archive data,
9 sectors per track, 2 sided,
starting track: 0, ending track: 79
XXX_INT.MSA: Atari MSA archive data,
10 sectors per track, 2 sided,
starting track: 0, ending track: 81
adr_1.msa: Atari MSA archive data,
10 sectors per track, 2 sided,
starting track: 0, ending track: 80
maggie4a.msa: Atari MSA archive data,
10 sectors per track, 2 sided,
starting track: 0, ending track: 39
setup.skin: Atari MSA archive data,
-11636 sectors per track,
starting track: 22332, ending track: 3470
With option -i only the generic application/octet-stream is shown.
With option --extension only the 3-byte sequence ??? is shown.
For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). It identifies the
Atari examples as "Atari MSA Disk Image" by msa.trid.xml. It does not
misidentify the example setup.skin as MSA, but it was not able to
recognise it either (described as "Unknown!", see appended
trid-v-msa.txt.gz).
Luckily the TrID tool with option -v shows the file name extension
used. With this information I was able to find a page about
MSA (Magic Shadow Archiver) on the File Formats Archive Team web site.
A link to the Atari Image File Formats Specifications is also
listed there. That information is added inside Magdir/archive by
comment lines like:
# URL: http://fileformats.archiveteam.org/
# wiki/MSA_(Magic_Shadow_Archiver)
# Reference: http://info-coach.fr
# atari/documents/_mydoc/FD_Image_File_Format.pdf
# http://mark0.net/download/triddefs_xml.7z
# defs/m/msa.trid.xml
On the mentioned site, download links for examples and tools are
listed. I partly verified the information with the decoding tool
deark (see appended deark-l-msa.txt.gz) by command lines like:
deark -l -m msa -d2 PDATS578.msa
Here too, the example setup.skin is not misidentified.
The detection happens inside Magdir/archive by lines like:
# Atari MSA archive - Teemu Hukkanen <tjhukkan at iki.fi>
0 beshort 0x0e0f Atari MSA archive data
So only 2 bytes are used for detection, and obviously this
recognition method is not strong enough. Further information is
shown by additional lines like:
>2 beshort x \b, %d sectors per track
>6 beshort x \b, starting track: %d
>8 beshort x \b, ending track: %d
For real MSA examples I get low values:
9 10 sectors per track
0 starting track
39 79 80 81 ending track
For a bad example like setup.skin I get unrealistically high numbers,
or numbers so high that they are interpreted as negative values.
So I could use those values as an additional test, but it is
difficult to define or research the limits for what is valid or wrong.
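The negative values come from printing a 16-bit big-endian field with
the signed format %d. The following is only a small Python sketch of
that effect, not part of the patch. The byte pair 0xD2 0x8C is not
read from setup.skin itself; it is derived from the -11636 shown
above, and the helper name is hypothetical.

```python
import struct


def as_signed_and_unsigned(raw):
    """Decode two bytes as a big-endian 16-bit value, signed and unsigned."""
    (signed,) = struct.unpack(">h", raw)    # how beshort with %d prints it
    (unsigned,) = struct.unpack(">H", raw)  # how ubeshort would see it
    return signed, unsigned


# Assumed bit pattern: 0xD28C is the 16-bit value that prints as -11636.
print(as_signed_and_unsigned(bytes([0xD2, 0x8C])))
# A realistic sectors-per-track value is the same either way.
print(as_signed_and_unsigned(bytes([0x00, 0x09])))
```

The same two bytes read as unsigned give 53900, which is just as
implausible for sectors per track as the negative number.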
The number of sides minus one is stored as a 2-byte integer in big
endian format. So only the values zero or one can occur here. That
information is shown by lines like:
>4 beshort 0 \b, 1 sided
>4 beshort 1 \b, 2 sided
So I use that reliable information, and recognition now starts with
lines like:
0 beshort 0x0e0f
>4 ubeshort <2 Atari MSA archive data
!:mime application/x-atari-msa
!:ext msa
So now the bad example setup.skin is skipped. Instead of the generic
MIME type application/octet-stream for such binary files I apply a
user-defined one.
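To illustrate the strengthened test outside of file(1), here is a
minimal Python sketch of the same check: magic 0x0e0f at offset 0 and
an unsigned "sides minus one" value below 2 at offset 4. The function
name and the two 10-byte headers are hypothetical; the headers are
synthetic, built from the values reported in this mail (52255 is the
sides value named in the patch comment), not copied from real files.
Requiring all 10 header bytes is a simplification of this sketch.

```python
import struct

MSA_MAGIC = 0x0E0F  # first two bytes, as tested by the magic entry


def parse_msa_header(data):
    """Return the header fields as a dict, or None if data is rejected."""
    if len(data) < 10:
        return None
    # Five unsigned big-endian 16-bit fields at offsets 0, 2, 4, 6, 8.
    magic, sectors, sides_minus_one, start, end = struct.unpack(
        ">5H", data[:10]
    )
    # Same idea as ">4 ubeshort <2": only 0 or 1 is a valid sides field.
    if magic != MSA_MAGIC or sides_minus_one >= 2:
        return None
    return {
        "sectors_per_track": sectors,
        "sides": sides_minus_one + 1,
        "starting_track": start,
        "ending_track": end,
    }


# Synthetic headers (assumption): one plausible, one like setup.skin.
good = struct.pack(">5H", MSA_MAGIC, 9, 1, 0, 79)
bad = struct.pack(">5H", MSA_MAGIC, 53900, 52255, 22332, 3470)

print(parse_msa_header(good))
print(parse_msa_header(bad))
```

The first header is accepted as a 2 sided image, the second is
rejected because of its sides field, even though both start with the
same 2 magic bytes.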
After applying the above mentioned modifications with the patch
file-5.41-archive-msa.diff, all my inspected MSA examples are
still described but the misidentification vanishes. This now looks
like:
PDATS578.msa: Atari MSA archive data,
9 sectors per track, 2 sided,
starting track: 0, ending track: 79
XXX_INT.MSA: Atari MSA archive data,
10 sectors per track, 2 sided,
starting track: 0, ending track: 81
adr_1.msa: Atari MSA archive data,
10 sectors per track, 2 sided,
starting track: 0, ending track: 80
maggie4a.msa: Atari MSA archive data,
10 sectors per track, 2 sided,
starting track: 0, ending track: 39
setup.skin: data
I hope my diff file can be applied in a future version of the file
utility.
With best wishes,
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCYo1egQAKCRCv8rHJQhrU
1vfaAKCs+4Ft0ISxNuYhQv5tm9m0Sul0SACgqQ3sSWCHFpkwNEwY7mXzKgnm3g0=
=2miB
-----END PGP SIGNATURE-----
-------------- next part --------------
--- file-5.41/magic/Magdir/archive.old 2021-08-30 11:10:26.000000000 +0200
+++ file-5.41/magic/Magdir/archive 2022-05-25 00:26:40.428775500 +0200
@@ -1523,10 +1523,28 @@
# Atari MSA archive - Teemu Hukkanen <tjhukkan at iki.fi>
-0 beshort 0x0e0f Atari MSA archive data
->2 beshort x \b, %d sectors per track
->4 beshort 0 \b, 1 sided
->4 beshort 1 \b, 2 sided
->6 beshort x \b, starting track: %d
->8 beshort x \b, ending track: %d
+# URL: http://fileformats.archiveteam.org/wiki/MSA_(Magic_Shadow_Archiver)
+# Reference: http://info-coach.fr/atari/documents/_mydoc/FD_Image_File_Format.pdf
+# http://mark0.net/download/triddefs_xml.7z/defs/m/msa.trid.xml
+# Update: Joerg Jenderek
+# Note: called by TrID "Atari MSA Disk Image" and verified by
+# command like `deark -l -m msa -d2 PDATS578.msa` as " Atari ST floppy disk image"
+# GRR: line below is too general as it matches setup.skin
+0 beshort 0x0e0f
+# skip foo setup.skin with unrealistic high number 52255 of sides by check for valid "low" value
+>4 ubeshort <2 Atari MSA archive data
+#!:mime application/octet-stream
+!:mime application/x-atari-msa
+!:ext msa
+# sectors per track like: 9 10
+>>2 beshort x \b, %d sectors per track
+# sides (0 or 1; add 1 to this to get correct number of sides)
+>>4 beshort 0 \b, 1 sided
+>>4 beshort 1 \b, 2 sided
+# starting track like: 0
+>>6 beshort x \b, starting track: %d
+# ending track like: 39 79 80 81
+>>8 beshort x \b, ending track: %d
+# tracks content
+#>>10 ubequad x \b, track content %#16.16llx
# Alternate ZIP string (amc at arwen.cs.berkeley.edu)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.41-archive-msa.diff.sig
Type: application/octet-stream
Size: 992 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220525/f15cf02a/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deark-l-msa.txt.gz
Type: application/x-gzip
Size: 12880 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220525/f15cf02a/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-msa.txt.gz
Type: application/x-gzip
Size: 325 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20220525/f15cf02a/attachment-0003.bin>