[File] [PATCH] Magdir/archive+windows; InstallShield setup header *.HDR+Language Identifier *.LID ; *.INS; *.TAG
Christos Zoulas
christos at zoulas.com
Sun Nov 7 16:26:48 UTC 2021
Applied, thanks!
christos
> On Nov 4, 2021, at 11:30 AM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> Hello,
>
> A few days ago I installed an old Windows program. Its installation
> directory contains files that the file command either does not
> recognise or describes only partially or generically.
>
> Running file version 5.41 on these examples gives output like:
>
> DATA.TAG: ASCII text, with CRLF line terminators
> Setup.exe: PE32 executable (GUI) Intel 80386, for MS Windows
> _sys1.cab: InstallShield CAB
> _sys1.hdr: InstallShield CAB
> _user1.cab: InstallShield CAB
> _user1.hdr: InstallShield CAB
> data1.cab: InstallShield CAB
> data1.hdr: InstallShield CAB
> data2.cab: InstallShield CAB
> setup.ins: COM executable for DOS
> setup.lid: ASCII text, with CRLF line terminators
>
> Furthermore, the --extension option shows only the three-character
> sequence ???, and the -i option shows only generic MIME types such as
> text/plain or application/octet-stream.
>
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html).
> Some HDR examples are described correctly by TrID, first as
> "InstallShield setup header" by ark-cab-ishield-hdr.trid.xml and
> second, generically, as "InstallShield compressed Archive" by
> ark-cab-ishield.trid.xml.
> Most INS examples are described correctly as "InstallShield Script"
> by ins.trid.xml.
> The examples that the file command describes as "ASCII text" are
> described as "Generic INI configuration" by ini.trid.xml. All LID
> examples are described more specifically as "InstallShield Language
> Identifier" by lid-is.trid.xml, and all DATA.TAG examples are described
> as "TagInfo data" by taginfo.trid.xml (see the appended
> installshield-trid-v.txt.gz).
>
> TrID lists the correct file name extensions and, with the -v option,
> often the related URL pointing to the file format information used.
> Luckily there is a free software tool, unshield, that can handle such
> InstallShield Cabinet archives. The relevant information is found in
> the header file cabfile.h and the C source file helper.c. This
> information is now expressed by comment lines inside
> Magdir/archive like:
> # URL: https://en.wikipedia.org/wiki/InstallShield
> # Reference: https://github.com/twogood/unshield
> # /blob/master/lib/cabfile.h
> # https://github.com/twogood/unshield/blob/master/lib/helper.c
>
> In the current version the only magic line looks like:
> 0 string ISc( InstallShield CAB
>
> Now, after the test for this CAB_SIGNATURE (0x28635349), the version
> information is printed, following the C source, by a line like:
>> 4 ulelong x \b, version %#x
>
> Afterwards volume_info and cab_descriptor_offset are printed if they
> have unusual values, by lines like:
>> 8 ulelong !0 \b, volume_info %#x
>> 12 ulelong !0x200 \b, offset %#x
> Afterwards the cab_descriptor_size is shown if it is non-zero, by a line like:
>> 16 ulelong !0 \b, descriptor size %#x
>
> Across the hundreds of InstallShield files I inspected, this value was
> zero in all my CAB examples and non-zero in all my HDR examples. Hoping
> that this is always true, I use this observation to distinguish HDR
> from CAB files and to assign the correct file name extensions, by
> lines like:
> 0 string ISc( InstallShield
> !:mime application/x-installshield
>> 16 ulelong !0 setup header
> !:ext hdr
>> 16 ulelong =0 CAB
> !:ext cab
> Instead of the generic MIME type application/octet-stream I display a
> user-defined one.
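
To illustrate the HDR/CAB rule described above, here is a minimal Python sketch. It is not part of the patch. The field order follows the offsets used in the magic lines (signature at 0, version at 4, volume_info at 8, descriptor offset at 12, descriptor size at 16); the function name classify_installshield is made up for this example, and the "size is zero only in CAB files" rule is the observation from the examples above, not a documented guarantee.

```python
import struct
from typing import Optional

# The bytes "ISc(" read as a little-endian 32-bit integer.
CAB_SIGNATURE = 0x28635349


def classify_installshield(header: bytes) -> Optional[str]:
    """Return "hdr", "cab", or None if the data lacks the ISc( signature."""
    if len(header) < 20:
        return None
    # Five little-endian 32-bit fields at offsets 0, 4, 8, 12 and 16.
    signature, _version, _volume_info, _desc_offset, desc_size = struct.unpack_from(
        "<5I", header, 0
    )
    if signature != CAB_SIGNATURE:
        return None
    # Heuristic from the examples: a non-zero descriptor size means a setup header.
    return "hdr" if desc_size != 0 else "cab"
```

Feeding it the first 20 bytes of a file is enough, since the test only looks at the fixed-size start of the header.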
>
> Unfortunately no official or complete documentation exists for the LID
> file format, so I use the information provided by TrID. This
> information is recorded inside Magdir/windows by comment lines like:
> # URL: https://en.wikipedia.org/wiki/InstallShield
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # defs/l/lid-is.trid.xml
> Following TrID, I created equivalent magic lines inside the ini-file
> subroutine of Magdir/windows, after the Windows code page translator
> section. This now looks like:
>>> &0 regex/c \^(Languages)] InstallShield Language Identifier
> !:mime text/x-installshield-lid
> !:ext lid
>
> Instead of the generic MIME type text/plain I display a user-defined one.
> The test for the keyword Languages in a bracketed section was sufficient
> to recognize my LID examples. If this is not sufficient, then
> additional tests for keywords (such as the three keywords count,
> Default and key0, mentioned in the global strings section of the TrID
> definition) must be added.
>
> Unfortunately no official or complete documentation exists for the
> TagInfo file format, so I use the information provided by TrID. This
> information is recorded inside Magdir/windows, after the LID section,
> by comment lines like:
> # URL: https://www.file-extensions.org/tag-file-extension
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # defs/t/taginfo.trid.xml
> Following TrID, I created equivalent magic lines inside the ini-file
> subroutine of Magdir/windows, after the Windows code page translator
> section. This now looks like:
>>> &0 regex/c \^(TagInfo)] TagInfo
> !:mime text/x-ms-tag
> !:ext tag
> Instead of the generic MIME type text/plain I display a user-defined one.
> The test for the keyword TagInfo in a bracketed section was sufficient
> to recognize my DATA.TAG examples. If this is not sufficient, then
> additional tests for keywords (such as Application, Category, Company,
> Misc and Version, mentioned in the global strings section of the TrID
> definition) must be added.
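
The two INI-style tests above (LID and TagInfo) come down to a case-insensitive match on the first bracketed section name. A rough Python sketch of that idea follows; it is only an illustration, not how libmagic evaluates the ini-file subroutine. The names SECTION_TYPES and classify_ini are hypothetical, and the sketch assumes the section of interest is the first one in the file.

```python
import re
from typing import Optional

# Hypothetical mapping from section keyword to the description printed above.
SECTION_TYPES = {
    "languages": "InstallShield Language Identifier",
    "taginfo": "TagInfo",
}

# First "[name]" that starts a line, allowing leading whitespace or blank lines.
FIRST_SECTION = re.compile(r"^\s*\[([^\]\r\n]+)\]", re.MULTILINE)


def classify_ini(text: str) -> Optional[str]:
    """Classify INI-like text by its first section name, ignoring case."""
    match = FIRST_SECTION.search(text)
    if match is None:
        return None
    return SECTION_TYPES.get(match.group(1).strip().lower())
```

If the section name alone turned out to be too weak, the extra keyword checks mentioned above (count, Default, key0 for LID; Application, Category, Company, Misc, Version for TagInfo) could be added as further conditions before returning a description.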
>
> Unfortunately no official or complete documentation exists for the
> InstallShield INS file format, so I use the information provided by
> TrID. This information is recorded at the end of Magdir/windows by
> comment lines like:
> # URL: https://en.wikipedia.org/wiki/InstallShield
> # Reference: http://mark0.net/download/triddefs_xml.7z
> # defs/i/ins.trid.xml
> Following TrID, I created equivalent magic lines in Magdir/windows.
> These start with lines like:
> 0 ubelong 0xB8C90C00 InstallShield Script
> !:mime application/x-installshield-ins
> !:ext ins
> Instead of the generic MIME type application/octet-stream I display a
> user-defined one.
> Because I am unsure whether this starting 4-byte value is always
> present, I looked for additional information inside the INS examples.
> Apparently each string is preceded by its length, stored as a 2-byte
> little-endian integer. The first string seems to be a copyright
> message at a fixed offset, like
> "Stirling Technologies, Inc. (c) 1990-1994" or
> "InstallSHIELD Software Coporation (c) 1990-1997". This is displayed
> by a line like:
>> 13 pstring/h x "%s"
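
For readers less familiar with pstring/h: it reads a 2-byte little-endian length followed by that many bytes of string data. A minimal Python sketch of the same read, assuming the layout described above (4-byte magic 0xB8C90C00, copyright string at offset 13), could look like this. The helper names read_pstring_h and ins_copyright are made up for illustration, and decoding as Latin-1 is an assumption.

```python
import struct
from typing import Optional

INS_MAGIC = bytes.fromhex("B8C90C00")  # big-endian 0xB8C90C00 at offset 0
COPYRIGHT_OFFSET = 13                  # fixed offset used by the magic line


def read_pstring_h(data: bytes, offset: int) -> Optional[str]:
    """Read a string prefixed by a 2-byte little-endian length."""
    if offset < 0 or offset + 2 > len(data):
        return None
    (length,) = struct.unpack_from("<H", data, offset)
    start = offset + 2
    if start + length > len(data):
        return None
    # Latin-1 never fails to decode; the real encoding is not documented.
    return data[start:start + length].decode("latin-1")


def ins_copyright(data: bytes) -> Optional[str]:
    """Return the copyright string of an INS file, or None if the magic differs."""
    if not data.startswith(INS_MAGIC):
        return None
    return read_pstring_h(data, COPYRIGHT_OFFSET)
```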
>
> The global strings section of the TrID definition mentions some
> keywords like SRCDIR, SRCDISK, TARGETDISK, TARGETDIR, WINDIR,
> WINDISK, WINSYSDIR and LOGHANDLE. These seem to be variable
> names, which can perhaps be used to configure some install options.
>
> A few dozen bytes later inside the INS examples, at varying offsets,
> these names appear in the same order. Each name seems to be preceded
> by a kind of sequence number stored as a 2-byte integer. So I show
> this variable name information by additional lines like:
>> 1 search/0x121/s SRCDIR \b, variable names:
>>> &-4 leshort x #%u
>>> &-2 pstring/h x %s
>>>> &0 leshort x #%u
>>>> &2 pstring/h x %s
>>>>> &0 leshort x #%u
>>>>> &2 pstring/h x %s
>>>>>> &0 leshort x #%u
>>>>>> &2 pstring/h x %s
>>>>>>> &0 leshort x #%u
>>>>>>> &2 pstring/h x %s
>>>>>>>> &0 leshort x #%u
>>>>>>>> &2 pstring/h x %s
>>>>>>>>> &0 leshort x #%u
>>>>>>>>> &2 pstring/h x %s
>> 0 ubelong x ...
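
The chain of nested magic lines above walks a run of records that each look like: 2-byte sequence number, 2-byte little-endian length, name. As a sketch only, the same walk can be written as a loop in Python. The function name read_variable_names and its limit parameter are hypothetical, the search window only approximates what search/0x121 does, and the sequence number is read here as unsigned.

```python
import struct
from typing import List, Tuple

FIRST_NAME = b"SRCDIR"


def read_variable_names(data: bytes, limit: int = 7) -> List[Tuple[int, str]]:
    """Collect up to `limit` (sequence number, name) pairs starting at SRCDIR."""
    # Look for the first variable name near the start of the file.
    pos = data.find(FIRST_NAME, 1, 1 + 0x121 + len(FIRST_NAME))
    if pos < 4:
        return []  # not found, or no room for the 4 bytes before the name
    offset = pos - 4  # the sequence number sits 4 bytes before the name
    names: List[Tuple[int, str]] = []
    while len(names) < limit and offset + 4 <= len(data):
        seq, length = struct.unpack_from("<HH", data, offset)
        end = offset + 4 + length
        if end > len(data):
            break  # truncated record
        names.append((seq, data[offset + 4:end].decode("latin-1")))
        offset = end  # the next record follows immediately
    return names
```

The default limit of 7 mirrors the seven number/name pairs the magic lines print; a loop is not available in the magic language, which is why the patch spells the levels out.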
>
> After applying the above-mentioned modifications with the patches
> file-5.41-archive-installshield.diff and file-5.41-windows-lid.diff,
> all my InstallShield examples are now recognised or are
> described in more detail, like:
> DATA.TAG: TagInfo
> Setup.exe: PE32 executable (GUI) Intel 80386, for MS Windows
> _sys1.cab: InstallShield CAB, version 0x1005201
> _sys1.hdr: InstallShield setup header, version 0x1005201,
> descriptor size 0x116c
> _user1.cab: InstallShield CAB, version 0x1005201
> _user1.hdr: InstallShield setup header, version 0x1005201,
> descriptor size 0x130b
> data1.cab: InstallShield CAB, version 0x4000834
> data1.hdr: InstallShield setup header, version 0x1007000,
> descriptor size 0x1e9ee
> data2.cab: InstallShield CAB, version 0x20005dc
> setup.ins: InstallShield Script
> "InstallSHIELD Software Coporation (c) 1990-1997",
> variable names: #0 SRCDIR #1 SRCDISK #2 TARGETDISK ...
> setup.lid: InstallShield Language Identifier
>
> I hope my two diff files can be applied in a future version of the file utility.
>
> With best wishes
> Jörg Jenderek
> --
> Jörg Jenderek