[File] [PATCH] of Magdir/images DCX multi-page +mime type +extension *.dcx

Jörg Jenderek joerg.jen.der.ek at gmx.net
Sun Apr 25 23:25:54 UTC 2021


Hello,
some days ago my Windows 10 system updated itself. When I checked the
system with CCleaner afterwards, many file name associations were broken.
So I looked at the reported file name extensions.

One of these file name extensions is DCX. So I ran the file command,
version 5.40, on such examples and related files. The examples are
identified correctly by Magdir/images like this:

abydos.dcx:   DCX multi-page PCX image data
input.dcx:    DCX multi-page PCX image data
SAMPLE.DCX:   DCX multi-page PCX image data
FAXCOVER.PCX: PCX ver. 2.8 image data,
	      without palette bounding box [0, 0] - [1727, 575],
	      1-bit 640 x 480 dpi,
	      RLE compressed
FAXMEMO.PCX:  PCX ver. 2.8 image data,
	      without palette bounding box [0, 0] - [1727, 575],
	      1-bit 640 x 480 dpi,
	      RLE compressed

For the DCX examples no file name extension is found ("???" is displayed
with the --extension option) and only the generic mime type
"application/octet-stream" is shown.

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html). It describes the
DCX examples as "Graphics Multipage PCX bitmap" by definition
bitmap-dcx.trid.xml (see the attached dcx_trid-v.txt.gz). It also
displays the file name extension "DCX".

Some information about the DCX image file format can be found on the
Archive Team file formats web site. This is now expressed by an
additional comment line inside Magdir/images like:
  # URL:		http://fileformats.archiveteam.org/wiki/DCX
The file name extension is now expressed by an additional line like:
    !:ext	dcx

For comparison I also ran the file format identification
utility DROID (see http://sourceforge.net/projects/droid/). It
describes the DCX examples as "Multipage Zsoft Paintbrush Bitmap" by
definition x-fmt/348. It also displays the mime type "image/x-dcx".
This information is now expressed by an additional line like:
  !:mime	image/x-dcx

Until now the examples were identified by a magic line like:
  0	lelong	987654321	DCX multi-page PCX image data
I changed this to lines like:
  0	lelong	987654321	DCX multi-page
  >4	lelong	x		\b, at 0x%x
  >(4.l)	indirect		x

According to the documentation, the offset of the first embedded
PCX image is stored at position 4; in most cases it is 0x1004. So this
offset is now shown. With it, the file command can inspect the embedded
PCX image at that offset via the indirect directive. As a result,
information such as the dimensions of the first PCX image is now
shown as well.
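
For readers who do not use magic files every day, the lookup described above can be sketched in Python. This is only an illustration, not how the file command is implemented: the function names are invented for this example, the sample data is synthetic, and the check for the PCX manufacturer byte 0x0A stands in for the much more complete PCX test in Magdir/images.

```python
import struct

DCX_MAGIC = 987654321      # 0x3ADE68B1, stored as little-endian long at offset 0
PCX_MANUFACTURER = 0x0A    # first byte of a PCX header


def first_pcx_offset(data: bytes) -> int:
    """Return the little-endian long stored at byte 4 of a DCX file."""
    if len(data) < 8:
        raise ValueError("too short for a DCX header")
    magic, offset = struct.unpack_from("<II", data, 0)
    if magic != DCX_MAGIC:
        raise ValueError("not a DCX file")
    return offset


def looks_like_pcx(data: bytes, offset: int) -> bool:
    """Rough stand-in for the indirect PCX test: check only the first byte."""
    return offset < len(data) and data[offset] == PCX_MANUFACTURER


# Synthetic sample: header pointing at 0x1004, zero padding, start of a PCX header.
sample = struct.pack("<II", DCX_MAGIC, 0x1004).ljust(0x1004, b"\0") + b"\x0a\x05\x01\x08"
offset = first_pcx_offset(sample)
print(f"DCX multi-page, at 0x{offset:x}, PCX header found: {looks_like_pcx(sample, offset)}")
```

The two steps mirror the two magic lines: the first reads the value at position 4, the second jumps to that value, like ">(4.l) indirect x".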

If more information is needed, for example about a possible second
embedded PCX image, it can be shown by additional lines like:
  >8		lelong	!0	\b, at 0x%x
  >>(8.l)	indirect	x
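
The same idea extends to all pages. The following Python sketch walks the offset table that follows the magic number. It assumes that the table is a run of little-endian longs ending with a zero entry, as the "!0" test above suggests, and that it holds at most 1023 entries; both the function name and that limit are assumptions of this example, not something taken from the patch.

```python
import struct

DCX_MAGIC = 987654321   # little-endian long at offset 0
MAX_PAGES = 1023        # assumed upper bound on the number of table entries


def dcx_page_offsets(data: bytes) -> list[int]:
    """Collect the page offsets stored at positions 4, 8, 12, ... until a zero entry."""
    if len(data) < 4 or struct.unpack_from("<I", data, 0)[0] != DCX_MAGIC:
        raise ValueError("not a DCX file")
    offsets = []
    for index in range(MAX_PAGES):
        pos = 4 + 4 * index
        if pos + 4 > len(data):
            break                      # table is truncated, stop quietly
        (offset,) = struct.unpack_from("<I", data, pos)
        if offset == 0:
            break                      # zero marks the end of the table
        offsets.append(offset)
    return offsets


# Synthetic header with two pages followed by the terminating zero.
header = struct.pack("<IIII", DCX_MAGIC, 0x1004, 0x2000, 0)
print([hex(o) for o in dcx_page_offsets(header)])   # ['0x1004', '0x2000']
```

A magic file has no loop construct for this, which is why each further page needs its own pair of lines at positions 8, 12 and so on.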

After applying the above modifications with the patch
file-5.40-images-dcx.diff, such images are described in more
detail, like:

abydos.dcx:   DCX multi-page, at 0x1004
	      PCX ver. 3.0 image data
	      bounding box [0, 0] - [799, 599],
	      4 planes each of 8-bit colour,
	      RLE compressed
input.dcx:    DCX multi-page, at 0x1004
	      PCX ver. 3.0 image data
	      bounding box [0, 0] - [69, 45],
	      3 planes each of 8-bit colour, 70 x 46 dpi,
	      RLE compressed
SAMPLE.DCX:   DCX multi-page, at 0x1000
	      PCX ver. 2.8 image data,
	      with palette bounding box [0, 0] - [1728, 373],
	      1-bit 640 x 200 dpi,
	      RLE compressed
FAXCOVER.PCX: PCX ver. 2.8 image data,
	      without palette bounding box [0, 0] - [1727, 575],
	      1-bit 640 x 480 dpi,
	      RLE compressed
FAXMEMO.PCX:  PCX ver. 2.8 image data,
	      without palette bounding box [0, 0] - [1727, 575],
	      1-bit 640 x 480 dpi,
	      RLE compressed

Now the right mime type and file name extension are also shown.
I hope my diff file can be applied in a future version of the file utility.

With best wishes
Jörg Jenderek
--
Jörg Jenderek



-------------- next part --------------
A non-text attachment was scrubbed...
Name: dcx_trid-v.txt.gz
Type: application/x-gzip
Size: 627 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20210426/ab7a21ce/attachment.bin>
-------------- next part --------------
--- file-5.40/magic/Magdir/images.old	2021-02-22 23:49:24 +0000
+++ file-5.40/magic/Magdir/images	2021-04-25 22:06:26 +0000
@@ -1333,3 +1333,18 @@
 # From: Joerg Wunsch <joerg_wunsch at uriah.heep.sax.de>
-0	lelong	987654321	DCX multi-page PCX image data
+# Update:	Joerg Jenderek
+# URL:		http://fileformats.archiveteam.org/wiki/DCX
+0	lelong	987654321	DCX multi-page
+# http://www.nationalarchives.gov.uk/pronom/x-fmt/348
+!:mime	image/x-dcx
+!:ext	dcx
+# The first file offset usually starts at file offset 0x1004
+# print 1 space after 0x100? offset and then handles PCX images by ./images
+>4	lelong	x		\b, at 0x%x 
+>(4.l)	indirect		x
+# possible 2nd PCX image
+#>8	lelong	!0		\b, at 0x%x 
+#>>(8.l)	indirect		x
+# possible 3rd PCX image
+#>12	lelong	!0		\b, at 0x%x 
+#>>(12.l)	indirect		x
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.40-images-dcx.diff.sig
Type: application/octet-stream
Size: 95 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20210426/ab7a21ce/attachment.obj>