[File] [PATCH] Magdir/jpeg,images for "unusual" JPEG; extensions *.jxr *.wdp; still duplicates

Jörg Jenderek (GMX) joerg.jen.der.ek at gmx.net
Fri Oct 6 20:08:09 UTC 2023


On 03.06.2022 at 14:55, Jörg Jenderek wrote:
> Hello,

When I run the file command version 5.45 with the -k option on further
JPEG-XR images and related files, I get output like:

FLOWER.wdp:
	JPEG-XR Image
	, hard tiling, spatial xform=TL, short header
	, 2592x3904, bitdepth=5-6-5, colorfmt=YONLY
	JPEG-XR
	, hard tiling, spatial xform=TL, short header
	, 2592x3904, bitdepth=5-6-5, colorfmt=YONLY
MARKET-3361-ipm-bg-DE-treat[1].wdp:
	JPEG-XR
MARKET.tif:
	TIFF image data, little-endian, direntries=15
	, height=600, bps=194, compression=none
	, PhotometricInterpretation=RGB
	, orientation=upper-left\012- , width=800
SAKURA.wdp:
	JPEG-XR Image
	, hard tiling, spatial xform=TL, short header
	, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
	JPEG-XR
	, hard tiling, spatial xform=TL, short header
	, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
SMALLTOMATO.wdp:
	JPEG-XR Image
	, hard tiling, spatial xform=TL, short header
	, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
	JPEG-XR
	, hard tiling, spatial xform=TL, short header
	, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
abydos.jxr:
	JPEG-XR Image
	, spatial xform=TL, short header
	, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
	JPEG-XR
	, spatial xform=TL, short header
	, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
example.tif:
	TIFF image data, little-endian, direntries=15
	, height=800, bps=194, compression=none
	, PhotometricInterpretation=RGB
	, orientation=upper-left\012- , width=1200
example.wdp:
	JPEG-XR
fmt-590-signature-id-931.wdp:
	data

For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html).
The JXR and WDP samples are still described as "JPEG XR bitmap" by
bitmap-wmp.trid.xml, but now with lower priority. That definition still
lists 4 file name extensions (HDP/JXR/WDP/WMP) and shows the MIME type
image/vnd.ms-photo. With highest priority, TrID now identifies the
samples as "JPEG XR bitmap (WMPHOTO)" by bitmap-jxr.trid.xml. That
definition lists only 2 file name extensions (JXR/WDP) and shows the
officially registered MIME type image/jxr. (See appended
trid-v-jxr.txt.gz)

For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). It recognizes the
examples as well and describes them as "JPEG Extended Range" with
MIME type image/jxr by PUID fmt/590.

During my work I had to take several steps. First I needed to verify
that my two "strange" samples (MARKET-3361-ipm-bg-DE-treat[1].wdp and
example.wdp) really are JXR images. I tried to do this with an
ImageMagick command like:
	    identify -verbose *.wdp *.jxr
This partly works on my Linux system, but fails on Windows. As written
on the Wikipedia page, ImageMagick does not support JXR natively but
needs the jxrlib package. On Linux identify works because I installed
the library package libjxr0 and the command line tools JxrDecApp and
JxrEncApp (package libjxr-tools). After clearing this hurdle, as a
check I converted the "strange" JXR samples with these command line
tools like:
      JxrDecApp -v -i example.wdp -o example.tif

I also verified the validity of the JXR samples with the XnView
graphic viewer, which was able to open and display the images. As a
check, its command line tool reports relevant image information such as
the dimensions via a line like
	nconvert -fullinfo *.jxr *.wdp

So now I am very sure that my "strange" JXR samples are real and valid
JXR images (see appended nconvert-jxr.txt.gz).

First, I still often get duplicate messages! Magdir/images contains an
entry for "JPEG-XR Image". It looks like:
90	ubequad		0x574D50484F544F00	JPEG-XR Image
>98	ubyte&0x08	=0x08			\b, hard tiling
...
>>101	ubeshort&0xf0	0x80			\bRGBE
>>101	ubeshort&0xf0	>0x80			\b(reserved %#x)

So I had moved that part and merged it into Magdir/jpeg. Unfortunately
the patch file-5.41-images-jpeg.diff that I sent to remove those lines
was not applied, so I am sending it again.

Magdir/jpeg contains a similar entry for "JPEG-XR". Its first tests
check the first five bytes. That is done by lines like:
0	string		\x49\x49\xbc
 >3	byte		1
 >>4	lelong%2	0	JPEG-XR
!:mime	image/jxr
!:ext	jxr/wdp/hdp
 >90	bequad		0x574D50484F544F00
 >>98	byte&0x08	=0x08			\b, hard tiling
...
 >>>101	beshort&0xf0	0x80			\bRGBE
 >>>101	beshort&0xf0	>0x80			\b(reserved %#x)

Unfortunately I have neither the time nor the brainpower to read and
understand the full file format specification, but luckily I read the
information about MIME type image/jxr at iana.org more carefully.

Under the item "magic number" the following is written:
Data begins with a FILE_HEADER( ) data structure, which begins with a
FIXED_FILE_HEADER_II_2BYTES field equal to 0x4949, followed by a
FIXED_FILE_HEADER_0XBC_BYTE field equal to 0xBC, followed by a
FILE_VERSION_ID which is equal to 1 for the current version of the
Recommendation and International Standard (with other values reserved
for future use, as modified in additional parts or amendments, by ITU-T
or ISO/IEC).

That is expressed by the first 2 magic lines. The JXR format started
before 2009 and now it is 2023. Newer image formats like WebP or HEIF
exist by now. So in my opinion an evolution of JXR from version 1 to
something like version 2 will probably never become reality.
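As an illustration only, the fixed header test described above can be
sketched outside of file(1) in Python. The function name is
hypothetical and not part of any patch; the parity test on the
following little-endian 32-bit value mirrors the `lelong%2 0` magic
line, which the patch comment calls FIRST_IFD_OFFSET.

```python
import struct


def looks_like_jxr_header(data: bytes) -> bool:
    """Hypothetical helper mirroring the first three magic tests."""
    if len(data) < 8:
        return False
    # FIXED_FILE_HEADER_II_2BYTES (0x4949) followed by
    # FIXED_FILE_HEADER_0XBC_BYTE (0xBC)
    if data[0:3] != b"\x49\x49\xbc":
        return False
    # FILE_VERSION_ID shall be equal to 1
    if data[3] != 1:
        return False
    # Little-endian 32-bit value at offset 4 must be a multiple of 2
    (first_ifd_offset,) = struct.unpack_from("<I", data, 4)
    return first_ifd_offset % 2 == 0
```

A caller would pass the first 8 bytes of a candidate file, for example
`looks_like_jxr_header(open("example.wdp", "rb").read(8))`.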

Within the payload data, JPEG XR IMAGE_HEADER data structures begin with
a GDI_SIGNATURE, which is a 64-bit syntax element that has the value
0x574D50484F544F00 that corresponds to "WMPHOTO" using the UTF-8
character set encoding specified in Annex D of ISO/IEC 10646, followed
by a byte equal to 0.

In many examples (like: FLOWER.wdp abydos.jxr SMALLTOMATO.wdp) this
signature is stored at offset 90, so such samples are described
with many details. But in a few samples (like example.wdp and
MARKET-3361-ipm-bg-DE-treat[1].wdp) this characteristic string occurs
at other offsets, which the documentation apparently allows.
I verified this by running a command like:
	grep WMPHOTO "MARKET-3361-ipm-bg-DE-treat[1].wdp" example.wdp
Binary file MARKET-3361-ipm-bg-DE-treat[1].wdp matches
Binary file example.wdp matches
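grep only reports that the string occurs somewhere. To see at which
offset, a small Python sketch like the following could be used. The
helper name is hypothetical; it returns -1 when the signature is
absent.

```python
GDI_SIGNATURE = b"WMPHOTO\x00"  # 0x574D50484F544F00


def find_gdi_signature(data: bytes, start: int = 0) -> int:
    """Return the offset of the first GDI_SIGNATURE at or after start,
    or -1 if it does not occur (hypothetical helper)."""
    return data.find(GDI_SIGNATURE, start)


def report(path: str) -> None:
    """Print the signature offset found in the file at path."""
    with open(path, "rb") as handle:
        offset = find_gdi_signature(handle.read())
    print(f"{path}: GDI_SIGNATURE offset {offset}")
```

Calling `report("example.wdp")` would then show whether the signature
sits at offset 90 or elsewhere.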

So I only had to adjust some minor details. First I put the code
fragment that starts with the check for the GDI_SIGNATURE inside a
subroutine that looks like:
0	name	jxr-info
 >90	bequad		0x574D50484F544F00
 >>98	byte&0x08	=0x08			\b, hard tiling
 >>>101	beshort&0xf0	0x80			\bRGBE
 >>>101	beshort&0xf0	>0x80			\b(reserved %#x)

For the many samples with the signature at offset 90 the subroutine is
now simply called. For the other samples (like
MARKET-3361-ipm-bg-DE-treat[1].wdp example.wdp) I search for that
signature and call the subroutine with a relative offset. So this now
becomes:
0	string		\x49\x49\xbc
 >3	byte		1
 >>4	lelong%2	0	JPEG-XR Image
!:mime	image/jxr
!:ext	jxr/wdp/hdp
 >90	bequad		0x574D50484F544F00
 >>0	use	jxr-info
 >90	bequad		!0x574D50484F544F00
 >>4	search/3267/sb	WMPHOTO\0
 >>>&-90	use	jxr-info
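The two branches above boil down to choosing a base offset for
jxr-info. The following Python sketch illustrates that idea under some
assumptions: the function names are hypothetical, the search window
only approximates how file(1) applies search/3267, and the field
offsets (flag byte at 100, 16-bit sizes at 102 and 104, each stored
minus one) are taken from the Magdir/images entry in the attached diff.

```python
import struct

GDI_SIGNATURE = b"WMPHOTO\x00"
SEARCH_START = 4      # offset of the search line
SEARCH_RANGE = 3267   # range of the search line


def jxr_info_base(data: bytes):
    """Return the base offset to hand to jxr-info, or None."""
    # Common case: signature at offset 90, so call with base 0.
    if data[90:98] == GDI_SIGNATURE:
        return 0
    # Otherwise look for the signature near the start of the file.
    # Assumption: the match must start within SEARCH_RANGE positions.
    end = SEARCH_START + SEARCH_RANGE + len(GDI_SIGNATURE) - 1
    pos = data.find(GDI_SIGNATURE, SEARCH_START, end)
    if pos < 0:
        return None
    # Equivalent of "&-90": jxr-info expects the signature at base+90.
    return pos - 90


def short_header_dimensions(data: bytes, base: int):
    """Return (width, height) for a short header, else None."""
    if len(data) < base + 106:
        return None
    if not data[base + 100] & 0x80:  # 0x80 set means short header
        return None
    # Read unsigned here; the magic entry uses beshort (signed).
    width_m1, height_m1 = struct.unpack_from(">HH", data, base + 102)
    return width_m1 + 1, height_m1 + 1
```

With this layout the same decoding code serves both cases; only the
base offset differs.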

After applying the above-mentioned modifications by means of the 2
patches, the duplicate messages vanish and all my inspected JPEG-XR
images are still recognized, but now for all samples I get the detail
information provided by the code fragment that is embedded inside
subroutine jxr-info. This now looks like:
FLOWER.wdp:
	JPEG-XR Image
	, hard tiling, spatial xform=TL, short header
	, 2592x3904, bitdepth=5-6-5, colorfmt=YONLY
MARKET-3361-ipm-bg-DE-treat[1].wdp:
	JPEG-XR Image
	, codestream present, spatial xform=TL, short header
	, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
MARKET.tif:
	TIFF image data, little-endian, direntries=15
	, height=600, bps=194, compression=none
	, PhotometricInterpretation=RGB
	, orientation=upper-left, width=800
SAKURA.wdp:
	JPEG-XR Image, hard tiling, spatial xform=TL, short header
	, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
SMALLTOMATO.wdp:
	JPEG-XR Image, hard tiling, spatial xform=TL, short header
	, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
abydos.jxr:
	JPEG-XR Image, spatial xform=TL, short header
	, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
example.tif:
	TIFF image data, little-endian, direntries=15
	, height=800, bps=194, compression=none
	, PhotometricInterpretation=RGB
	, orientation=upper-left, width=1200
example.wdp:
	JPEG-XR Image
	, codestream present, spatial xform=TL, short header
	, 1200x800, bitdepth=16-FLOAT, colorfmt=YONLY
fmt-590-signature-id-931.wdp:
	data

With best wishes,
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.41-images-jpeg.diff.sig
Type: application/octet-stream
Size: 923 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment.obj>
-------------- next part --------------
--- file-5.41/magic/Magdir/images.old	2021-10-18 16:20:03.000000000 +0200
+++ file-5.41/magic/Magdir/images	2022-06-03 00:40:07.046440900 +0200
@@ -1865,54 +1865,4 @@
 0	string	\x46\x4d\x52\x00	ISO/IEC 19794-2 Format Minutiae Record (FMR)
 
-# doc: https://www.shikino.co.jp/eng/products/images/FLOWER.jpg.zip
-# example: https://www.shikino.co.jp/eng/products/images/FLOWER.wdp.zip
-90	bequad		0x574D50484F544F00	JPEG-XR Image
->98	byte&0x08	=0x08			\b, hard tiling
->99	byte&0x80	=0x80			\b, tiling present
->99	byte&0x40	=0x40			\b, codestream present
->99	byte&0x38	x			\b, spatial xform=
->99	byte&0x38	0x00			\bTL
->99	byte&0x38	0x08			\bBL
->99	byte&0x38	0x10			\bTR
->99	byte&0x38	0x18			\bBR
->99	byte&0x38	0x20			\bBT
->99	byte&0x38	0x28			\bRB
->99	byte&0x38	0x30			\bLT
->99	byte&0x38	0x38			\bLB
->100	byte&0x80	=0x80			\b, short header
->>102	beshort+1	x			\b, %d
->>104	beshort+1	x			\bx%d
->100	byte&0x80	=0x00			\b, long header
->>102	belong+1	x			\b, %x
->>106	belong+1	x			\bx%x
->101	beshort&0xf	x			\b, bitdepth=
->>101	beshort&0xf	0x0			\b1-WHITE=1
->>101	beshort&0xf	0x1			\b8
->>101	beshort&0xf	0x2			\b16
->>101	beshort&0xf	0x3			\b16-SIGNED
->>101	beshort&0xf	0x4			\b16-FLOAT
->>101	beshort&0xf	0x5			\b(reserved 5)
->>101	beshort&0xf	0x6			\b32-SIGNED
->>101	beshort&0xf	0x7			\b32-FLOAT
->>101	beshort&0xf	0x8			\b5
->>101	beshort&0xf	0x9			\b10
->>101	beshort&0xf	0xa			\b5-6-5
->>101	beshort&0xf	0xb			\b(reserved %d)
->>101	beshort&0xf	0xc			\b(reserved %d)
->>101	beshort&0xf	0xd			\b(reserved %d)
->>101	beshort&0xf	0xe			\b(reserved %d)
->>101	beshort&0xf	0xf			\b1-BLACK=1
->101	beshort&0xf0	x			\b, colorfmt=
->>101	beshort&0xf0	0x00			\bYONLY
->>101	beshort&0xf0	0x10			\bYUV240
->>101	beshort&0xf0	0x20			\bYWV422
->>101	beshort&0xf0	0x30			\bYWV444
->>101	beshort&0xf0	0x40			\bCMYK
->>101	beshort&0xf0	0x50			\bCMYKDIRECT
->>101	beshort&0xf0	0x60			\bNCOMPONENT
->>101	beshort&0xf0	0x70			\bRGB
->>101	beshort&0xf0	0x80			\bRGBE
->>101	beshort&0xf0	>0x80			\b(reserved %#x)
-
 # From: Johan van der Knijff <johan.vanderknijff at kb.nl>
 #
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-jxr.txt.gz
Type: application/x-gzip
Size: 822 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nconvert-jxr.txt.gz
Type: application/x-gzip
Size: 673 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment-0001.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/jpeg.old	2022-12-21 16:58:05.000000000 +0100
+++ file-5.45/magic/Magdir/jpeg	2023-10-06 21:44:18.412532000 +0200
@@ -167,2 +167,3 @@
 # JPEG extended range
+# Update:	Joerg Jenderek 2023
 # URL:		http://fileformats.archiveteam.org/wiki/JPEG_XR
@@ -170,8 +171,12 @@
 #		http://mark0.net/download/triddefs_xml.7z/defs/b/bitmap-wmp.trid.xml
-# Note:         called by TrID "JPEG XR bitmap"
+#		http://mark0.net/download/triddefs_xml.7z/defs/b/bitmap-jxr.trid.xml
+# Note:         called by TrID "JPEG XR bitmap" and "JPEG XR bitmap (WMPHOTO)"
+#		verified as "JPEG XR" by XnView `nconvert -fullinfo *.jxr *.wdp`
+#		partly by ImageMagick command `identify -verbose *.wdp`
+#		and libjxr-tools `JxrDecApp -v -i example.wdp -o example.tif`
 0	string		\x49\x49\xbc
-# FILE_VERSION_ID; shall be equal to 1; other values are reserved for future use
+# FILE_VERSION_ID; shall be equal to 1; other values are reserved for future use and are unlikely to appear
 >3	byte		1
 # FIRST_IFD_OFFSET; shall be an integer multiple of 2; so skip DROID fmt-590-signature-id-931.wdp
->>4	lelong%2	0	JPEG-XR
+>>4	lelong%2	0	JPEG-XR Image
 #!:mime	image/vnd.ms-photo
@@ -182,6 +187,16 @@
 #!:ext	jxr/wdp/hdp/wmp
-# moved from ./images (version 1.205 ), merged and
-# partly verified by XnView `nconvert -info abydos.jxr FLOWER.wdp`
-# example: https://web.archive.org/web/20160403012904/
+# moved from ./images (version 1.243 ) and merged
+# example:
 # http://shikino.co.jp/solution/upfile/FLOWER.wdp.zip
+# often GDI_SIGNATURE "WMPHOTO\0" at offset 90 like: FLOWER.wdp abydos.jxr SMALLTOMATO.wdp
+>90	bequad		0x574D50484F544F00
+>>0	use	jxr-info
+# seldom no GDI_SIGNATURE WMPHOTO\0 at offset 90 like: example.wdp MARKET-3361-ipm-bg-DE-treat[1].wdp
+>90	bequad		!0x574D50484F544F00
+# look for GDI_SIGNATURE WMPHOTO\0 at other offset
+>>4	search/3267/sb	WMPHOTO\0
+>>>&-90	use	jxr-info
+#
+0	name	jxr-info
+# check for GDI_SIGNATURE that corresponds to "WMPHOTO\0"
 >90	bequad		0x574D50484F544F00
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-jpeg-jxr.diff.sig
Type: application/octet-stream
Size: 1198 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment-0001.obj>

