[File] [PATCH] of Magdir/msdos,printer for DOS EPS Binary File; - duplicates + *.eps *.ept
Christos Zoulas
christos at zoulas.com
Sun Jan 22 15:03:16 UTC 2023
Committed, thanks!
christos
> On Jan 13, 2023, at 8:43 PM, Jörg Jenderek <joerg.jen.der.ek at gmx.net> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
>
> Some days ago I wanted to install an Intel-based WiFi card.
> Under the directory "c:\Program Files\Intel\WiFi\" in the subdirectory
> ProfileImporters I found samples with the suffix EPI (like MurocImp.epi
> M100Imp.epi SbrngImp.epi). For that suffix I expected Encapsulated
> PostScript files.
>
> When running file command version 5.44 on such examples and some
> other test samples with the -k option I get output like:
>
> M100Imp.epi: data
> SOCCER.WMF: Windows metafile data
> abydos.tiff: TIFF image data, little-endian,
> direntries=17, height=600, bps=28946,
> compression=deflate,
> PhotometricInterpretation=RGB,
> orientation=upper-left\012- , width=800
> drawX8-ps2wmf.eps: DOS EPS Binary File
> Postscript starts at byte 30
> length 37402
> Metafile starts at byte 37432
> length 452
> DOS EPS Binary File
> Postscript starts at byte 30
> length 37402
> Metafile starts at byte 37432
> length 452
> OpenPGP Secret Key
> dreieck.ept: DOS EPS Binary File
> Postscript starts at byte 30
> length 6367
> TIFF starts at byte 6397
> length 12910
> DOS EPS Binary File
> Postscript starts at byte 30
> length 6367
> TIFF starts at byte 6397
> length 12910
> OpenPGP Secret Key
> example.eps: DOS EPS Binary File
> Postscript starts at byte 43350
> length 263893
> TIFF starts at byte 30
> length 43320
> DOS EPS Binary File
> Postscript starts at byte 43350
> length 263893
> TIFF starts at byte 30
> length 43320
> OpenPGP Secret Key
> fmt-122-signature-id-174.eps: DOS EPS Binary File
> Postscript starts at byte 1397760293
> length 1868841261
> Metafile starts at byte 841835874
> length 1159737390
> TIFF starts at byte 759583568
> length 221392433
> DOS EPS Binary File
> Postscript starts at byte 1397760293
> length 1868841261
> Metafile starts at byte 841835874
> length 1159737390
> TIFF starts at byte 759583568
> length 221392433
> OpenPGP Secret Key
> fmt-123-signature-id-178.eps: DOS EPS Binary File
> Postscript starts at byte 1397760293
> length 1868841261
> Metafile starts at byte 841835874
> length 1159737390
> TIFF starts at byte 759583568
> length 221261362
> DOS EPS Binary File
> Postscript starts at byte 1397760293
> length 1868841261
> Metafile starts at byte 841835874
> length 1159737390
> TIFF starts at byte 759583568
> length 221261362
> OpenPGP Secret Key
> fmt-124-signature-id-180.eps: DOS EPS Binary File
> Postscript starts at byte 1397760293
> length 1868841261
> Metafile starts at byte 858613090
> length 1159737390
> TIFF starts at byte 759583568
> length 221261363
> DOS EPS Binary File
> Postscript starts at byte 1397760293
> length 1868841261
> Metafile starts at byte 858613090
> length 1159737390
> TIFF starts at byte 759583568
> length 221261363
> OpenPGP Secret Key
>
> Furthermore, with the -i option the expected image/x-eps is shown for
> DOS EPS Binary samples, but with --extension only ??? is displayed
> for such samples.
>
> For comparison I ran other utilities. The file identifier
> tool TrID (see http://mark0.net/soft-trid-e.html) describes such
> DOS EPS Binary examples with low priority as "Adobe Encapsulated
> PostScript" by definition eps-adobe.trid.xml.
> Most of the real DOS EPS samples (that is, excluding the DROID test
> samples fmt-122-signature-id-174.eps fmt-123-signature-id-178.eps
> fmt-124-signature-id-180.eps) are described with highest priority as
> "Encapsulated PostScript binary (with TIFF preview)" by
> eps-tiff.trid.xml. The few real DOS EPS samples not described by this
> definition (like sample drawX8-ps2wmf.eps) are described with
> highest rate as "Encapsulated PostScript binary (with WMF preview)"
> by eps-wmf.trid.xml (see appended trid-v-DOS-EPS.txt.gz).
>
> DROID (Digital Record and Object Identification) is a software tool
> developed by The National Archives of the UK to perform automated batch
> identification of file formats. See
> https://digital-preservation.github.io/droid/
> According to that tool the samples are described as "Encapsulated
> PostScript File Format" with mime type application/postscript. The
> suffix EPS is accepted here whereas EPT is not. The sub
> classification with version "1.2" happens by PUID fmt/122. The
> sub classification with version "2.0" happens by PUID fmt/123. The
> sub classification with version "3" happens by PUID fmt/124 (see
> appended droid-DOS-EPS.csv.gz).
>
> I also ran the command line tool of the XnView graphics tool with a
> command line like:
> nconvert -info *.EP?
> Here the real samples with TIFF images are described as Format TIFF
> and name epsp. For samples with WMF like drawX8-ps2wmf.eps it
> failed (see appended nconvert-info-DOS-EPS.txt.gz).
>
> I also ran the command line tool of the ImageMagick graphics tool
> with a command line like:
> identify -verbose *
> Here all real DOS binary samples are described as EPT (Encapsulated
> PostScript with TIFF preview), even the samples with WMF preview
> (see appended identify-verbose-DOS-EPS.txt.gz).
>
> First we see that we get duplicate messages, because Magdir/msdos
> and Magdir/printer contain in principle the same recognition lines,
> starting with the line:
> 0 belong 0xC5D0D3C6 DOS EPS Binary File
>
> So first I delete the lines concerned inside Magdir/msdos by patch
> file-5.44-msdos-eps.diff to remove the duplicate messages.
>
> In Magdir/printer the mime type line is missing. In Magdir/msdos the
> next lines look like:
> !:mime image/x-eps
> >4 long >0 Postscript starts at byte %d
> >>8 long >0 length %d
> >>>12 long >0 Metafile starts at byte %d
> >>>>16 long >0 length %d
> >>>20 long >0 TIFF starts at byte %d
> >>>>24 long >0 length %d
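The layout these magic lines test can be sketched in Python. This is only a minimal sketch under assumptions: the function and field names are hypothetical, the six values are read as unsigned little-endian 32-bit integers (the magic lines above use the native-order "long"), and the two further header bytes before offset 30 are ignored. The example values are the ones reported above for dreieck.ept.

```python
import struct

DOS_EPS_MAGIC = b"\xC5\xD0\xD3\xC6"  # matched by "0 belong 0xC5D0D3C6"


def parse_dos_eps_header(data):
    """Return the six offset/length fields after the magic, or None.

    Hypothetical helper: the field names are chosen for illustration only.
    """
    if len(data) < 28 or data[:4] != DOS_EPS_MAGIC:
        return None
    # Offsets 4, 8, 12, 16, 20, 24: six 32-bit little-endian values.
    names = ("ps_offset", "ps_length",
             "wmf_offset", "wmf_length",
             "tiff_offset", "tiff_length")
    return dict(zip(names, struct.unpack_from("<6I", data, 4)))


# Synthetic header using the values reported for dreieck.ept;
# the last two bytes are placeholder filler up to offset 30.
header = DOS_EPS_MAGIC + struct.pack("<6I", 30, 6367, 0, 0, 6397, 12910) + b"\x00\x00"
fields = parse_dos_eps_header(header)
print(fields)
```

A zero offset and length, as for the metafile fields here, corresponds to the ">0" tests not matching, so nothing is printed for that part.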
>
> Encapsulated PostScript can contain a TIFF preview. Such variants
> are described by TrID as "Encapsulated PostScript binary (with TIFF
> preview)" by eps-tiff.trid.xml. If the stored offset and length of this
> embedded image are not zero, this information is printed with the
> leading phrase "TIFF starts". The stored values do not always point
> to a real TIFF image: the sample can be corrupted. This also fails
> for the DROID test samples
> fmt-122-signature-id-174.eps fmt-123-signature-id-178.eps
> fmt-124-signature-id-180.eps. These are used by the DROID tool to
> recognize Encapsulated PostScript samples and contain just the
> header bytes. With the help of the offset I can jump to that location
> and inspect that part again via an indirect call by the file command.
> So the magic lines concerned now become:
> >>>20 long >0 at byte %d
> !:ext eps/ept
> >>>>24 long >0 length %d
> >>>>>(20.l) indirect x
> So for the DROID samples nothing is shown, whereas for real samples
> additional information about the embedded TIFF is shown by
> Magdir/images. For this variant the suffix EPT is also used instead
> of the standard EPS.
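The idea of jumping to the stored offset and inspecting the embedded part again can be sketched in Python. This is only an illustration under assumptions, not how the file command implements "indirect": both helper names are hypothetical, and the TIFF check only looks at the two standard TIFF byte-order signatures.

```python
def embedded_part(data, offset, length):
    """Return the embedded span, or None if it is empty or out of bounds."""
    if offset <= 0 or length <= 0 or offset + length > len(data):
        return None
    return data[offset:offset + length]


def looks_like_tiff(part):
    """True if the span starts with a little- or big-endian TIFF signature."""
    return part is not None and part[:4] in (b"II*\x00", b"MM\x00*")


# Synthetic file: 30 filler bytes standing in for the header,
# then a 16-byte stand-in for a big-endian TIFF preview.
sample = b"\x00" * 30 + b"MM\x00*" + b"\x00" * 12
real = embedded_part(sample, 30, 16)

# Offset and length as reported above for a DROID test sample:
# far beyond the end of such a short file, so nothing is returned.
bogus = embedded_part(sample, 759583568, 221392433)

print(looks_like_tiff(real), looks_like_tiff(bogus))
```

With values like those of the DROID samples the span lies outside the data, which matches the observation that nothing is shown for them.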
>
> If the Encapsulated PostScript contains no TIFF preview, it contains
> a Windows Metafile (*.WMF) instead and the values for TIFF are nil.
> Such variants are described by TrID as "Encapsulated PostScript
> binary (with WMF preview)" by eps-wmf.trid.xml. If the stored offset
> and length of this embedded image are not zero, this information is
> printed with the leading phrase "Metafile starts". The stored values
> do not always point to a real metafile: the sample can be corrupted.
> This also fails for the DROID test samples.
> These are used by the DROID tool to recognize Encapsulated PostScript
> samples and contain just the header bytes. With the help of the
> offset I can jump to that location and inspect that part again via
> an indirect call by the file command. So the magic lines concerned
> now become:
> >>>12 long >0 at byte %d
> !:ext eps
> >>>16 long >0 length %d
> >>>>(12.l) indirect x
> So for the DROID samples nothing is shown, whereas for real samples
> additional information about the embedded WMF is shown by Magdir/msdos.
> For this variant apparently only the EPS suffix is used.
>
> In the test lines "long" is used as the integer type. This works for
> me on my machines, which are all little endian, but I think the above
> test lines fail when running the file command on big-endian machines.
> So I believe the right expression must use something like "lelong".
> Unfortunately I have no big-endian machine. So maybe somebody
> can check this?
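The concern can be illustrated with a small Python sketch. It does not test the file command itself; it only shows how the same four bytes give different numbers depending on byte order, which is what a native-order "long" would run into on a big-endian machine. The bytes are the little-endian encoding of the common PostScript offset 30.

```python
import struct

# Offset field as stored in the file: 30 in little-endian byte order.
field = bytes([0x1E, 0x00, 0x00, 0x00])

as_little = struct.unpack("<I", field)[0]  # what "lelong" reads
as_big = struct.unpack(">I", field)[0]     # what a big-endian native read gives

print(as_little)  # 30
print(as_big)     # 503316480
```

Both values pass a ">0" test, so on a big-endian machine the line would still match but would print a wrong offset, and an indirect jump to it would land far outside the file.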
>
> Then I do the same procedure for the embedded PostScript part, which
> often comes directly after the header. So often (850/857 on my
> systems) this offset is 30 or 32, but I also found a few samples with
> values like 2788 10644 43350 71828. So the PostScript part now becomes:
> >4 long >0 at byte %d
> >>8 long >0 length %d
> >>>(4.l) indirect x
> Here, calling indirect into ./printer, I get a phrase like "length
> 263893 PostScript document text" when adding 1 space character after
> the length value. In the TIFF parts I get a slightly "strange" phrase
> like "length 43320\012- TIFF image data,". In the WMF parts I get a
> slightly "strange" phrase like "length 452\012- Windows metafile". So
> maybe this is a BUG in the file command.
>
> The DROID samples are not real Encapsulated PostScript. So I add an
> additional test right after the first magic test, checking for the
> existence of content after the header. I do this with a second test
> line like:
> >32 ulelong >0 DOS EPS Binary File
> In version 5.44 some other variants do not work, like:
> >32 long !0 DOS EPS Binary File
> >32 lelong !0 DOS EPS Binary File
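The huge offsets and lengths reported above for the DROID samples can be packed back into bytes to see what they encode. The following Python sketch does this for the six values shown for fmt-122-signature-id-174.eps, assuming they were read in little-endian order (as on the little-endian machines mentioned above).

```python
import struct

# Values reported for fmt-122-signature-id-174.eps, in header order:
# PostScript start/length, Metafile start/length, TIFF start/length.
reported = [1397760293, 1868841261, 841835874,
            1159737390, 759583568, 221392433]

raw = struct.pack("<6I", *reported)
print(raw)  # b'%!PS-Adobe-2.0 EPSF-1.2\r'
```

This suggests that in this sample the bytes right after the magic are the start of a textual PostScript header line rather than real offsets, which fits the statement that the DROID samples are not real DOS EPS files.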
>
> After applying the above-mentioned modifications by patches
> file-5.44-msdos-eps.diff and file-5.44-printer-eps.diff and using
> Magdir/images for the TIFF parts I get output like:
>
> M100Imp.epi: data
> SOCCER.WMF: Windows metafile
> abydos.tiff: TIFF image data, little-endian,
> direntries=17, height=600, bps=28946,
> compression=deflate,
> PhotometricInterpretation=RGB,
> orientation=upper-left, width=800
> drawX8-ps2wmf.eps: DOS EPS Binary File
> at byte 30
> length 37402
> PostScript document text
> conforming DSC level 3.0, type EPS,
> Level 2
> at byte 37432
> length 452
> \012- Windows metafile
> dreieck.ept: DOS EPS Binary File
> at byte 30
> length 6367
> PostScript document text
> conforming DSC level 3.0, type EPS,
> Level 1
> at byte 6397
> length 12910
> \012- TIFF image data, big-endian,
> direntries=20, height=25, bps=16,
> compression=none,
> PhotometricInterpretation=BlackIsZero,
> orientation=upper-left, width=100
> example.eps: DOS EPS Binary File
> at byte 43350
> length 263893
> PostScript document text
> conforming DSC level 3.1, type EPS,
> Level 2
> at byte 30
> length 43320
> \012- TIFF image data, little-endian,
> direntries=16, height=708, bps=8,
> compression=LZW,
> PhotometricInterpretation=RGB Palette,
> width=498
> fmt-122-signature-id-174.eps: ISO-8859 text, with CR line terminators
> fmt-123-signature-id-178.eps: ISO-8859 text, with CR line terminators
> fmt-124-signature-id-180.eps: ISO-8859 text, with CR line terminators
>
> I hope my diff files can be applied in a future version of the
> file utility.
>
> With best wishes
> Jörg Jenderek
> - --
> Jörg Jenderek
>
>
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCY8IIuwAKCRCv8rHJQhrU
> 1jYLAKDaw2FMZAkVLj1GkQFQOGtGzBvTLACg3stQpM6+xrPSBGDI8fy37SdITK8=
> =UJvN
> -----END PGP SIGNATURE-----
> <trid-v-DOS-EPS.txt.gz><droid-DOS-EPS.csv.gz><nconvert-info-DOS-EPS.txt.gz><identify-verbose-DOS-EPS.txt.gz><file-5_44-msdos-eps_diff.DEFANGED-558><file-5_44-msdos-eps_diff_sig.DEFANGED-559><file-5_44-printer-eps_diff.DEFANGED-560><file-5_44-printer-eps_diff_sig.DEFANGED-561>