[File] [PATCH] Magdir/archive for Ai32 archive *.ai

Jörg Jenderek joerg.jen.der.ek at gmx.net
Thu Jan 12 22:29:44 UTC 2023


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Some time ago I tried to add file name extensions for PostScript
document text, but I ran into difficulties. Some samples with the
suffix AI are PostScript files, for which the suffix PS is normally
used. Unfortunately, the suffix AI is also used for other file
formats, one of which is a compressed archive format.

When running the file command, version 5.44, on such compressed
files, I get output like:

lmhosts-m3-solid.ai: Ai32 archive data
readme-m4.ai:        Ai32 archive data
test-Ai-1.ai:        Ai archive data
test-Ai-2.ai:        Ai archive data

With option -i only the generic application/octet-stream is shown.
Furthermore, with the --extension option only ??? is displayed.

For comparison I ran the file format identification utility TrID
(see https://mark0.net/soft-trid-e.html). The samples which the file
command describes as "Ai32" are described there as "Ai Archivator
compressed archive" with the correct suffix AI by ark-ai.trid.xml
(see the appended trid-v-ai.txt.gz).

For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). Here all AI
examples are wrongly described as "Adobe Illustrator" by PUID
fmt/421, based on the file name extension.

With the help of the TrID output I found pages on the Archive Team
file formats web site. This information is expressed by comment
lines like:
# URL:		http://fileformats.archiveteam.org/wiki/Ai_Archiver
# Reference:	http://mark0.net/download/triddefs_xml.7z
#		defs/a/ark-ai.trid.xml

The current description is handled inside Magdir/archive. There are
4 similar entries for such compressed files. The fourth entry
consists of just one line like:
0	string	Ai\2\1 Ai32 archive data
So I add 2 lines after it for the MIME type and file name suffix, like:
!:mime	application/x-compress-ai
!:ext	ai
Instead of the generic MIME type application/octet-stream I chose a
user-defined one. At offset 10 the name of the original uncompressed
file is apparently stored. At offset 8 the length of this name is
apparently stored as a 2-byte little-endian integer. So I show this
information with an additional line like:
 >8	pstring/h x	"%s"
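
As a rough illustration of the layout described above, here is a
minimal Python sketch that reads the stored name. The layout is only
what the samples suggest, not a documented specification. The function
name read_ai32_name and the sample bytes are made up for this example,
and the latin-1 decoding is an assumption because the real encoding is
unknown.

```python
import struct


def read_ai32_name(data: bytes) -> str:
    """Return the original file name stored in an Ai32 archive header.

    Assumed layout (inferred from samples): a 2-byte little-endian
    length at offset 8, followed by the name bytes at offset 10.
    """
    if len(data) < 10:
        raise ValueError("header too short")
    (name_len,) = struct.unpack_from("<H", data, 8)
    name = data[10:10 + name_len]
    if len(name) != name_len:
        raise ValueError("truncated file name")
    # Encoding is unknown; latin-1 maps every byte value and never fails.
    return name.decode("latin-1")


# Synthetic header built for illustration only, not a real sample.
sample = (b"Ai\x02\x01" + b"\x00" * 4
          + struct.pack("<H", 11) + b"lmhosts.sam" + b"payload")
print(read_ai32_name(sample))  # lmhosts.sam
```

This mirrors what the pstring/h line does: the /h modifier selects a
2-byte little-endian length prefix.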

According to TrID the next 3 bytes after the start pattern are nil,
but I do not know whether this is always true. So in case of
unexpected values I show them with lines like:
 >5	ubyte	!0	\b, at 5 %#x
 >6	ubyte	!0	\b, at 6 %#x
 >7	ubyte	!0	\b, at 7 %#x
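
The same check can be sketched outside of magic syntax. The following
Python fragment is only an illustration: the helper name
unexpected_bytes is hypothetical, and the offsets 5 to 7 are simply
the ones tested by the lines above.

```python
def unexpected_bytes(data: bytes) -> list[str]:
    """Report non-zero bytes at offsets 5, 6 and 7.

    The message format imitates the magic lines above ("at 5 %#x").
    Offsets past the end of the data are silently skipped.
    """
    notes = []
    for offset, value in enumerate(data[5:8], start=5):
        if value != 0:
            notes.append(f"at {offset} {value:#x}")
    return notes


# Synthetic 8-byte header with a non-zero byte at offset 6.
print(unexpected_bytes(b"Ai\x02\x00\x00\x00\x1f\x00"))  # ['at 6 0x1f']
```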

According to the documentation, and as also found in my examples, the
fourth byte with value 0x01 is probably a flag for "solid" mode. So I
show this with an additional line:
 >3	ubyte	=0x01	\b, solid mode

Then I make the same modification for the third entry, which consists
of 1 line like:
0	string	Ai\2\0 Ai32 archive data
The only difference here is that the byte field at offset 3 has the
value 0, which apparently means "non solid". That is the default mode
when compressing files with the ai tool. So this information could be
expressed by a line like the following, which I leave commented out:
 #>3	ubyte	=0x00	\b, unsolid mode

The first and second entries are apparently for other, older versions.
They are expressed by lines like:
0	string	Ai\1\1\0 Ai archive data
0	string	Ai\1\0\0 Ai archive data
Unfortunately I found no real samples for these versions, so I only
add the same lines for the MIME type and file name suffix.
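
To summarise the four entries in one place, here is a small Python
sketch of the same decision table. It is not how file(1) works
internally. The names AiInfo and identify_ai are hypothetical, and the
solid flag is left as None for the older "Ai" variants because nothing
is known about them from samples.

```python
from typing import NamedTuple, Optional


class AiInfo(NamedTuple):
    description: str
    mime: str
    ext: str
    solid: Optional[bool]  # None where no claim is made


MIME = "application/x-compress-ai"  # user-defined type used in the patch


def identify_ai(data: bytes) -> Optional[AiInfo]:
    """Map the leading bytes to one of the four Magdir/archive entries."""
    # First and second entry: apparently older versions, no samples found.
    if data.startswith((b"Ai\x01\x01\x00", b"Ai\x01\x00\x00")):
        return AiInfo("Ai archive data", MIME, "ai", None)
    # Third entry: byte at offset 3 is 0, apparently "non solid" (default).
    if data.startswith(b"Ai\x02\x00"):
        return AiInfo("Ai32 archive data", MIME, "ai", False)
    # Fourth entry: byte at offset 3 is 1, probably "solid" mode.
    if data.startswith(b"Ai\x02\x01"):
        return AiInfo("Ai32 archive data", MIME, "ai", True)
    return None


print(identify_ai(b"Ai\x02\x01\x00\x00\x00\x00"))
print(identify_ai(b"%!PS-Adobe-3.0"))  # None
```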

After applying the above-mentioned modifications with the patch
file-5.44-archive-ai.diff, all inspected AI compressed files are
still described, but with more details. The output now looks like:

lmhosts-m3-solid.ai: Ai32 archive data "lmhosts.sam", solid mode
readme-m4.ai:        Ai32 archive data "readme.txt"
test-Ai-1.ai:        Ai archive data
test-Ai-2.ai:        Ai archive data

I hope my diff file can be applied in a future version of the file utility.

With best wishes
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iFwEARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY8CJ2AAKCRCv8rHJQhrU
1qIyAJiVzn8KYoNknquMJr2sRAzmfIcDAJ0cPO15yGgSihVcsaN4DsuonTiPmA==
=KN8F
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-ai.txt.gz
Type: application/x-gzip
Size: 321 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230112/51a05431/attachment.bin>
-------------- next part --------------
--- file-5.44/magic/Magdir/archive.old	2022-12-26 19:00:47.000000000 +0100
+++ file-5.44/magic/Magdir/archive	2023-01-12 22:43:35.038255500 +0100
@@ -1010,7 +1010,35 @@
 # Ai
+# Update:	Joerg Jenderek
+# URL:		http://fileformats.archiveteam.org/wiki/Ai_Archiver
 0	string	Ai\1\1\0 Ai archive data
+#!:mime	application/octet-stream
+!:mime	application/x-compress-ai
+!:ext	ai
 0	string	Ai\1\0\0 Ai archive data
+#!:mime	application/octet-stream
+!:mime	application/x-compress-ai
+!:ext	ai
 # Ai32
+# Reference:	http://mark0.net/download/triddefs_xml.7z/defs/a/ark-ai.trid.xml
+# Note:		called "Ai Archivator compressed archive" by TrID
 0	string	Ai\2\0 Ai32 archive data
+#!:mime	application/octet-stream
+!:mime	application/x-compress-ai
+!:ext	ai
+# original file name
+>8	pstring/h x	"%s"
+# according to TrID the next 3 bytes are nil
+>5	ubyte	!0	\b, at 5 %#x
+>6	ubyte	!0	\b, at 6 %#x
+>7	ubyte	!0	\b, at 7 %#x
+# the fourth byte with value 0 is probably a flag for "non solid" mode
+#>3	ubyte	=0x00	\b, unsolid mode
 0	string	Ai\2\1 Ai32 archive data
+#!:mime	application/octet-stream
+!:mime	application/x-compress-ai
+!:ext	ai
+# original file name
+>8	pstring/h x	"%s"
+# the fourth byte with value 0x01 is probably a flag for "solid" mode; this is not the default
+>3	ubyte	=0x01	\b, solid mode
 # SBC
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.44-archive-ai.diff.sig
Type: application/octet-stream
Size: 719 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230112/51a05431/attachment.obj>

