[File] [PATCH] Magdir/archive for EDI LZSS compressed file *.??_ *.??$ *.LZS
Christos Zoulas
christos at zoulas.com
Fri Nov 18 15:57:13 UTC 2022
Committed, thanks!
christos
> On Nov 17, 2022, at 9:28 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> Some time ago I installed an old piece of Windows software from
> Greenstreet. Its installation directory contains files whose file
> name extension ends in an underscore.
>
> When running the file command, version 5.43, on such compressed
> files and the related unpacked files, I get output like:
>
> 4WAY.WA$: data
> 4WAY.WAW: RIFF (little-endian) data, WAVE audio,
> Microsoft PCM, 8 bit, mono 11025 Hz
> BOOK01A.IC$: data
> BOOK01A.ICO: MS Windows icon resource - 1 icon, 32x32, 16 colors
> CTL3D.DL$: data
> CTL3D.DLL: MS-DOS executable, NE for MS Windows 3.x (DLL or font)
> GUNSHOT.LZS: data
> GUNSHOT.bmp: PC bitmap, Windows 3.x format, 335 x 364 x 8,
> image size 122304, resolution 3543 x 3543 px/m,
> cbSize 123382, bits offset 1078
> HERBTEXT.LZS: data
> HERBTEXT.txt: ASCII text, with very long lines (369)
> LACERATE.LZS: data
> LACERATE.bmp: PC bitmap, Windows 3.x format, 261 x 351 x 8,
> image size 92664, resolution 2756 x 2756 px/m,
> cbSize 93742, bits offset 1078
> PLANTAIN.LZS: data
> SKYMAP.EXE: MS-DOS executable, NE for MS Windows 3.x (EXE)
> SKYMAP.EX_: data
> SPELMATE.H: C source, ASCII text, with CRLF line terminators
> SPELMATE.H$: data
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). It identifies
> some examples with a dollar sign or underscore as the last character,
> like 4WAY.WA$ or SKYMAP.EX_, as "EDI Install Pro LZSS2 compressed
> data" by edi-lzss2.trid.xml. The other compressed examples are
> described as "EDI Install LZS compressed data" by
> ediinstall-lzss1.trid.xml (see the appended trid-v-edi.txt.gz).
>
> With the help of the TrID output I found pages on the Archive Team
> file formats web site. That information is expressed by comment
> lines like:
> # URL: http://fileformats.archiveteam.org/wiki/
> # EDI_Install_packed_file
> # EDI_LZSSLib
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # /defs/e/ediinstall-lzss1.trid.xml
> # /defs/e/edi-lzss2.trid.xml
>
> The compressed data format is similar or identical to Okumura's LZSS,
> so I added the new lines to Magdir/archive after the existing LZSS
> compressed archive section.
>
> Following that documentation, I added magic lines like:
> 0 string EDILZSS
>> 7 string 2
> !:mime application/x-edi-pack-lzss
> !:ext ??$/??_
>>> 8 string x "%-0.13s"
>>> 21 ulelong x \b, %u bytes
>>>> 25 ubequad x \b, data %#llx...
> After the 8-byte signature EDILZSS2, the original NUL-terminated
> file name (like 4way.wav or skymap.exe), padded to 13 bytes, is
> stored. It is followed by the original file size as a 4-byte
> little-endian integer, and then by the compressed data. Instead of
> the generic MIME type application/octet-stream I show a user-defined
> one. The name of a compressed file often ends in the character '$'
> or '_'.
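The LZSS2 header layout described above can be sketched in Python. This is only an illustration, not part of the patch: the function name parse_edilzss2 and the EdiLzss2Header type are hypothetical, and the offsets (name at 8, size at 21, data at 25) are taken from the quoted magic lines.

```python
import struct
from typing import NamedTuple


class EdiLzss2Header(NamedTuple):
    """Fields of the header as described in the mail (names are hypothetical)."""
    name: str         # original file name, e.g. "4way.wav"
    size: int         # original (unpacked) size in bytes
    data_offset: int  # where the compressed data starts


def parse_edilzss2(buf: bytes) -> EdiLzss2Header:
    """Parse the 25-byte header that precedes the compressed data."""
    if len(buf) < 25 or buf[:8] != b"EDILZSS2":
        raise ValueError("not an EDILZSS2 file")
    # 13-byte name field at offset 8, NUL-terminated and NUL-padded
    raw_name = buf[8:21].split(b"\0", 1)[0]
    name = raw_name.decode("ascii", errors="replace")
    # 4-byte little-endian size at offset 21 (ulelong in the magic lines)
    (size,) = struct.unpack_from("<I", buf, 21)
    return EdiLzss2Header(name, size, 25)


# Sample built from the values reported for 4WAY.WA$ in the mail
sample = (b"EDILZSS2" + b"4way.wav".ljust(13, b"\0")
          + struct.pack("<I", 60430) + bytes.fromhex("ff5249464606ec00"))
header = parse_edilzss2(sample)
print(header.name, header.size)
```

The sample bytes are assembled from the reported file name, size and leading data bytes, not read from a real file.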
>
> There is also a '1' variant, whose start magic is the 8-byte
> signature EDILZSS1. In it the file size field is missing. I had to
> put the displaying part inside the subroutine edi-pack. That looks
> like:
> 0 name edi-pack
>> 8 string x EDI LZSS packed "%-.13s"
> !:mime application/x-edi-pack-lzss
> !:ext ??$/?$
>> 21 ubequad x \b, data %#16.16llx...
> That variant is described as "EDI Pack LZSS1" by the deark
> software. That can be verified by running a command like:
> deark -l -d2 SPELMATE.H$
>
> Unfortunately there exists a third variant, in which the original
> file name field is missing. In the examples I inspected, the suffix
> LZS was used. That variant is described as "EDI LZSSLib" by
> deark. That can be verified by running a command like:
> deark -l -d2 GUNSHOT.LZS
> Unfortunately I was not able to express this as a regular
> expression, because then the sample HERBTEXT.LZS is misidentified.
> So I put the displaying part in the subroutine edi-lzs. This looks
> like:
> 0 name edi-lzs
>> 8 string x EDI LZSSLib packed
> !:mime application/x-edi-pack-lzss
> !:ext lzs
>> 8 ubequad x \b, data %#16.16llx...
>
> Instead of a regular expression I use a bunch of test lines. They
> look like:
> 0 string EDILZSS
>> 7 string 1
>>> 8 search/9/b .
>>>> &0 ubyte <0x20
>>>>> 0 use edi-lzs
>>>> &0 ubyte >0x1F
>>>>> &0 ubyte =0
>>>>>> 0 use edi-pack
>>>>> &0 ubyte >0x1F
>>>>>> &0 ubyte =0
>>>>>>> 0 use edi-pack
>>>>>> &0 ubyte >0x1F
>>>>>>> &0 ubyte =0
>>>>>>>> 0 use edi-pack
>>>>>>> &0 ubyte !0
>>>>>>>> 0 use edi-lzs
>>>>>> &0 default x
>>>>>>> 0 use edi-lzs
>>>>> &0 default x
>>>>>> 0 use edi-lzs
>>> 8 default x
>>>> 0 use edi-lzs
> So I look for the dot character before the original file name
> extension in the possible 13-byte name field. If I find no dot, it
> must be the LZS variant. If I find a dot, I inspect the characters
> of the possible suffix part. If the value is NUL, it is the file
> name terminator and this is the pack variant. If the value is "low"
> (below 0x20), it is not a valid file name character, so this must
> be the LZS variant. If the value is "high" (above 0x1F), I must
> inspect the next character by the same procedure. This must be
> repeated until the maximal length of the file name suffix (that is
> 3) is reached.
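The decision procedure above can be sketched in Python to make the branching easier to follow. This is an illustration only: the function name classify_edilzss1 is hypothetical, and the sketch follows the quoted test lines, so a byte below 0x20 (including NUL) directly after the dot is treated as the LZSSLib variant. Treating truncated input as LZSSLib is also an assumption.

```python
def classify_edilzss1(buf: bytes) -> str:
    """Return "edi-pack" or "edi-lzs" for data starting with EDILZSS1."""
    if buf[:8] != b"EDILZSS1":
        raise ValueError("missing EDILZSS1 signature")
    # Look for a dot starting in the 9 bytes at offsets 8..16 (search/9)
    dot = buf.find(b".", 8, 17)
    if dot < 0:
        return "edi-lzs"          # no dot: no file name field
    tail = buf[dot + 1:dot + 5]   # up to 3 suffix characters + terminator
    if not tail or tail[0] < 0x20:
        return "edi-lzs"          # first suffix byte is not printable
    for byte in tail[1:]:
        if byte == 0:
            return "edi-pack"     # NUL terminator ends a valid file name
        if byte < 0x20:
            return "edi-lzs"      # control character: not a file name
    return "edi-lzs"              # suffix longer than 3 characters


# Samples built from names and leading data bytes reported in the mail
spelmate = (b"EDILZSS1" + b"spelmate.h".ljust(13, b"\0")
            + bytes.fromhex("ff2f2a207370656c"))
book = b"EDILZSS1" + b"book01a.ico".ljust(13, b"\0")
gunshot = b"EDILZSS1" + bytes.fromhex("bf424df6e10100f3")
for sample in (spelmate, book, gunshot):
    print(classify_edilzss1(sample))
```

The reported data bytes of HERBTEXT.LZS (0xff416c6f652e6c7a) decode to 0xff followed by "Aloe.lz", which shows why a dot alone is not enough and the bytes after it have to be checked as well.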
>
> After applying the above-mentioned modifications with the patch
> file-5.43-archive-edi.diff and using Magdir/msdos, images and riff,
> all the inspected EDI LZSS compressed files are now described. This
> now looks like:
>
> 4WAY.WA$: EDI install LZSS2 packed
> "4way.wav",
> 60430 bytes,
> data 0xff5249464606ec00...
> 4WAY.WAW: RIFF (little-endian) data, WAVE audio,
> Microsoft PCM, 8 bit, mono 11025 Hz
> BOOK01A.IC$: EDI LZSS packed
> "book01a.ico",
> data 0xf7000001eff02020...
> BOOK01A.ICO: MS Windows icon resource - 1 icon, 32x32, 16 colors
> CTL3D.DL$: EDI LZSS packed
> "ctl3d.dll",
> data 0xff4d5aa900020000...
> CTL3D.DLL: MS-DOS executable, NE for MS Windows 3.x (DLL or font)
> GUNSHOT.LZS: EDI LZSSLib packed
> data 0xbf424df6e10100f3...
> GUNSHOT.bmp: PC bitmap, Windows 3.x format, 335 x 364 x 8,
> image size 122304, resolution 3543 x 3543 px/m,
> cbSize 123382, bits offset 1078
> HERBTEXT.LZS: EDI LZSSLib packed
> data 0xff416c6f652e6c7a...
> HERBTEXT.txt: ASCII text, with very long lines (369)
> LACERATE.LZS: EDI LZSSLib packed
> data 0xbf424d2e6e0100f3...
> LACERATE.bmp: PC bitmap, Windows 3.x format, 261 x 351 x 8,
> image size 92664, resolution 2756 x 2756 px/m,
> cbSize 93742, bits offset 1078
> PLANTAIN.LZS: EDI LZSSLib packed
> data 0xbf424d962e0100f3...
> SKYMAP.EXE: MS-DOS executable, NE for MS Windows 3.x (EXE)
> SKYMAP.EX_: EDI install LZSS2 packed
> "skymap.exe", 576032 bytes,
> data 0xff4d5aa601010000...
> SPELMATE.H: ASCII text, with CRLF line terminators
> SPELMATE.H$: EDI LZSS packed
> "spelmate.h",
> data 0xff2f2a207370656c...
>
> I hope my diff file can be applied in a future version of the file
> utility.
>
> With best wishes
> Jörg Jenderek
> - --
> Jörg Jenderek
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY3btxQAKCRCv8rHJQhrU
> 1lo/AJoC6tcfma1nfbLIo0HRzLgDqUk5qACfZ9ElsRcq2lu4mTRvcFdGrj6MTOQ=
> =oBQp
> -----END PGP SIGNATURE-----
> <file-5_43-archive-edi_diff.DEFANGED-273><file-5_43-archive-edi_diff_sig.DEFANGED-274><trid-v-edi.txt.gz>--