[File] [PATCH] Magdir/jpeg,images for "unusual" JPEG; extensions *.jxr *.wdp; still duplicates
Jörg Jenderek (GMX)
joerg.jen.der.ek at gmx.net
Fri Oct 6 20:08:09 UTC 2023
On 03.06.2022 at 14:55, Jörg Jenderek wrote:
> Hello,
When running the file command version 5.45 with the -k option on more
JPEG-XR images and related files, I get output like:
FLOWER.wdp:
JPEG-XR Image
, hard tiling, spatial xform=TL, short header
, 2592x3904, bitdepth=5-6-5, colorfmt=YONLY
JPEG-XR
, hard tiling, spatial xform=TL, short header
, 2592x3904, bitdepth=5-6-5, colorfmt=YONLY
MARKET-3361-ipm-bg-DE-treat[1].wdp:
JPEG-XR
MARKET.tif:
TIFF image data, little-endian, direntries=15
, height=600, bps=194, compression=none
, PhotometricInterpretation=RGB
, orientation=upper-left\012- , width=800
SAKURA.wdp:
JPEG-XR Image
, hard tiling, spatial xform=TL, short header
, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
JPEG-XR
, hard tiling, spatial xform=TL, short header
, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
SMALLTOMATO.wdp:
JPEG-XR Image
, hard tiling, spatial xform=TL, short header
, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
JPEG-XR
, hard tiling, spatial xform=TL, short header
, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
abydos.jxr:
JPEG-XR Image
, spatial xform=TL, short header
, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
JPEG-XR
, spatial xform=TL, short header
, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
example.tif:
TIFF image data, little-endian, direntries=15
, height=800, bps=194, compression=none
, PhotometricInterpretation=RGB
, orientation=upper-left\012- , width=1200
example.wdp:
JPEG-XR
fmt-590-signature-id-931.wdp:
data
For comparison I ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html).
The JXR and WDP samples are still described as "JPEG XR bitmap" by
bitmap-wmp.trid.xml, but now with lower priority. That definition still
lists 4 file name extensions (HDP/JXR/WDP/WMP) and shows mime type
image/vnd.ms-photo. With highest priority the samples are now identified
as "JPEG XR bitmap (WMPHOTO)" by bitmap-jxr.trid.xml. That definition
lists only 2 file name extensions (JXR/WDP) and shows the officially
registered mime type image/jxr (see appended trid-v-jxr.txt.gz).
For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/). It also recognizes
the examples and describes them as "JPEG Extended Range" with
mime type image/jxr via PUID fmt/590.
During my work I had to take several steps. First I needed to verify
that my two "strange" samples (MARKET-3361-ipm-bg-DE-treat[1].wdp and
example.wdp) really are JXR images. I tried to do this with an
ImageMagick command like:
identify -verbose *.wdp *.jxr
This works partly on a Linux system, but fails on Windows. According to
the Wikipedia page, ImageMagick does not support JXR natively but needs
the jxrlib package. On Linux identify works because I installed the
library package libjxr0 and the command line tools JxrDecApp and
JxrEncApp (package libjxr-tools). After clearing this hurdle, as a
cross-check I converted the "strange" JXR samples with these command
line tools like:
JxrDecApp -v -i example.wdp -o example.tif
I also verified the validity of the JXR samples with the XnView
graphic viewer, which was able to open and display the images. As a
cross-check, relevant image information such as the dimensions can be
obtained with its command line tool via a line like:
nconvert -fullinfo *.jxr *.wdp
So now I am very sure that my "strange" JXR samples are real and valid
JXR images (see appended nconvert-jxr.txt.gz).
First, I still often get duplicate messages! In Magdir/images there is
an entry for "JPEG-XR Image". It looks like:
90 ubequad 0x574D50484F544F00 JPEG-XR Image
>98 ubyte&0x08 =0x08 \b, hard tiling
...
>>101 ubeshort&0xf0 0x80 \bRGBE
>>101 ubeshort&0xf0 >0x80 \b(reserved %#x)
So I had moved that part and merged it into Magdir/jpeg. Unfortunately
my previously sent patch file-5.41-images-jpeg.diff, which removes those
lines from Magdir/images, was not applied. So I am sending it again.
In Magdir/jpeg there is a similar entry for "JPEG-XR". There the first
tests check the first five bytes. That was done by lines like:
0 string \x49\x49\xbc
>3 byte 1
>>4 lelong%2 0 JPEG-XR
!:mime image/jxr
!:ext jxr/wdp/hdp
>90 bequad 0x574D50484F544F00
>>98 byte&0x08 =0x08 \b, hard tiling
...
>>>101 beshort&0xf0 0x80 \bRGBE
>>>101 beshort&0xf0 >0x80 \b(reserved %#x)
Unfortunately I do not have enough time and brain to read and understand
the full file format specification, but luckily I read the mime type
image/jxr information at iana.org more carefully.
Under the item "magic number" the following is written:
Data begins with a FILE_HEADER( ) data structure, which begins with a
FIXED_FILE_HEADER_II_2BYTES field equal to 0x4949, followed by a
FIXED_FILE_HEADER_0XBC_BYTE field equal to 0xBC, followed by a
FILE_VERSION_ID which is equal to 1 for the current version of the
Recommendation and International Standard (with other values reserved
for future use, as modified in additional parts or amendments, by ITU-T
or ISO/IEC).
That is expressed by the first 2 magic lines. The JXR format started
before 2009 and now it is 2023. Newer image formats like WebP or HEIF
now exist. So in my opinion an evolution of JXR from version 1 to
something like version 2 will probably never become reality.
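
As a rough illustration of what these first magic tests check, here is
a minimal Python sketch. The function name looks_like_jxr is
hypothetical and not part of file(1). It only mirrors the three tests
shown above (the bytes 0x49 0x49 0xBC, FILE_VERSION_ID equal to 1, and
an even little-endian 32-bit value at offset 4) and assumes nothing
else about the format.

```python
import struct


def looks_like_jxr(data: bytes) -> bool:
    """Mirror the three leading magic tests; not a full format validator."""
    if len(data) < 8:
        return False
    # FIXED_FILE_HEADER_II_2BYTES (0x4949) followed by the 0xBC byte
    if data[0:3] != b"\x49\x49\xbc":
        return False
    # FILE_VERSION_ID must be 1
    if data[3] != 1:
        return False
    # lelong at offset 4 must be a multiple of 2 (the "lelong%2 0" test)
    (first_ifd_offset,) = struct.unpack_from("<I", data, 4)
    return first_ifd_offset % 2 == 0
```

A buffer that starts with II, 0xBC, 1 and an even offset passes, while
an odd offset (the case the magic comment associates with the DROID
skeleton fmt-590-signature-id-931.wdp) is rejected.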
Within the payload data, JPEG XR IMAGE_HEADER data structures begin with
a GDI_SIGNATURE, which is a 64-bit syntax element that has the value
0x574D50484F544F00 that corresponds to "WMPHOTO" using the UTF-8
character set encoding specified in Annex D of ISO/IEC 10646, followed
by a byte equal to 0.
In many examples (like FLOWER.wdp, abydos.jxr and SMALLTOMATO.wdp) this
signature is stored at offset 90, so such samples are described
with many details. But for a few samples (like example.wdp and
MARKET-3361-ipm-bg-DE-treat[1].wdp) this signature string occurs at
other offsets, which apparently is in accordance with the documentation.
I verified this by running a command like:
grep WMPHOTO "MARKET-3361-ipm-bg-DE-treat[1].wdp" example.wdp
Binary file MARKET-3361-ipm-bg-DE-treat[1].wdp matches
Binary file example.wdp matches
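
grep only reports that the string occurs somewhere. To see at which
offsets it occurs, a small Python sketch like the following could be
used. The helper name signature_offsets is hypothetical; the 8-byte
value is the GDI_SIGNATURE quoted above.

```python
from typing import List

# "WMPHOTO" followed by a zero byte, i.e. 0x574D50484F544F00
GDI_SIGNATURE = b"WMPHOTO\x00"


def signature_offsets(data: bytes) -> List[int]:
    """Return every byte offset at which GDI_SIGNATURE starts in data."""
    offsets = []
    pos = data.find(GDI_SIGNATURE)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(GDI_SIGNATURE, pos + 1)
    return offsets
```

Called on the content of a sample read in binary mode, a result of [90]
would correspond to the common layout, and any other value to the
"strange" one.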
So I only have to adjust some minor details. First I put the code
fragment that starts with the check for GDI_SIGNATURE inside a
subroutine that looks like:
0 name jxr-info
>90 bequad 0x574D50484F544F00
>>98 byte&0x08 =0x08 \b, hard tiling
>>>101 beshort&0xf0 0x80 \bRGBE
>>>101 beshort&0xf0 >0x80 \b(reserved %#x)
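
To make the offsets used in the subroutine easier to follow, here is a
hedged Python sketch that reads the same bytes as the magic lines of
the full fragment (see the images diff below). The names jxr_info,
XFORM and BITDEPTH are hypothetical. Note that beshort at offset 101
spans bytes 101 and 102, so the masks 0xf and 0xf0 select bits of byte
102, which is also the first byte of the short-header width; the sketch
reproduces that reading as-is and does not claim that it matches the
specification. The colour format is returned as the raw masked value.

```python
import struct

GDI_SIGNATURE = bytes.fromhex("574D50484F544F00")  # "WMPHOTO\0"

XFORM = {0x00: "TL", 0x08: "BL", 0x10: "TR", 0x18: "BR",
         0x20: "BT", 0x28: "RB", 0x30: "LT", 0x38: "LB"}
BITDEPTH = {0x0: "1-WHITE=1", 0x1: "8", 0x2: "16", 0x3: "16-SIGNED",
            0x4: "16-FLOAT", 0x6: "32-SIGNED", 0x7: "32-FLOAT",
            0x8: "5", 0x9: "10", 0xA: "5-6-5", 0xF: "1-BLACK=1"}


def jxr_info(data, base=0):
    """Decode the fields the magic prints; base shifts all offsets."""
    sig = base + 90                      # the magic tests offset 90
    if sig < 0 or len(data) < sig + 20:
        return None
    if data[sig:sig + 8] != GDI_SIGNATURE:
        return None
    b98, b99, b100 = data[sig + 8], data[sig + 9], data[sig + 10]
    info = {
        "hard_tiling": bool(b98 & 0x08),
        "tiling_present": bool(b99 & 0x80),
        "codestream_present": bool(b99 & 0x40),
        "spatial_xform": XFORM[b99 & 0x38],
        "short_header": bool(b100 & 0x80),
    }
    if info["short_header"]:
        # beshort+1 at offsets 102 and 104
        w, h = struct.unpack_from(">HH", data, sig + 12)
    else:
        # belong+1 at offsets 102 and 106
        w, h = struct.unpack_from(">II", data, sig + 12)
    info["width"], info["height"] = w + 1, h + 1
    # beshort at offset 101, masked exactly as in the magic lines
    (word101,) = struct.unpack_from(">H", data, sig + 11)
    info["bitdepth"] = BITDEPTH.get(word101 & 0xF, "reserved")
    info["colorfmt_bits"] = word101 & 0xF0
    return info
```

With a short header and a stored width of 2591 (0x0A1F), this reading
yields 2592 for the width and "5-6-5" for the bit depth, which is the
combination shown for FLOWER.wdp.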
For the many samples with the signature at offset 90 the magic now just
calls this subroutine. For other samples (like
MARKET-3361-ipm-bg-DE-treat[1].wdp and example.wdp) I just search for
that signature and call the subroutine with a relative offset. So this
now becomes:
0 string \x49\x49\xbc
>3 byte 1
>>4 lelong%2 0 JPEG-XR Image
!:mime image/jxr
!:ext jxr/wdp/hdp
>90 bequad 0x574D50484F544F00
>>0 use jxr-info
>90 bequad !0x574D50484F544F00
>>4 search/3267/sb WMPHOTO\0
>>>&-90 use jxr-info
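
The offset arithmetic behind "&-90" can be sketched in Python as
follows. This is only an approximation of the magic lines above, under
two assumptions: that the /s flag makes the relative offset count from
the start of the match, and that the match has to start within 3267
bytes of offset 4. The function name jxr_info_base is hypothetical.

```python
GDI_SIGNATURE = bytes.fromhex("574D50484F544F00")  # "WMPHOTO\0"
SEARCH_START = 4      # offset given to the search line
SEARCH_RANGE = 3267   # range given to the search line


def jxr_info_base(data):
    """Return the base offset to hand to the subroutine, or None.

    The subroutine tests offset 90, so the base is chosen such that
    base + 90 is where the signature starts.
    """
    if data[90:98] == GDI_SIGNATURE:
        return 0                         # common case: plain "use"
    # Assumed search window: match may start in [4, 4 + 3267)
    end = SEARCH_START + SEARCH_RANGE + len(GDI_SIGNATURE) - 1
    pos = data.find(GDI_SIGNATURE, SEARCH_START, end)
    if pos == -1:
        return None
    return pos - 90                      # the "&-90" step
```

For a file with the signature at offset 200 this gives a base of 110,
so the subroutine's test at offset 90 lands on byte 200.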
After applying the above mentioned modifications via the 2 patches,
the duplicate messages vanish and all my inspected JPEG-XR images are
still recognized, but now I get for all samples the detail information
provided by the code fragment embedded inside subroutine
jxr-info. The output now looks like:
FLOWER.wdp:
JPEG-XR Image
, hard tiling, spatial xform=TL, short header
, 2592x3904, bitdepth=5-6-5, colorfmt=YONLY
MARKET-3361-ipm-bg-DE-treat[1].wdp:
JPEG-XR Image
, codestream present, spatial xform=TL, short header
, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
MARKET.tif:
TIFF image data, little-endian, direntries=15
, height=600, bps=194, compression=none
, PhotometricInterpretation=RGB
, orientation=upper-left, width=800
SAKURA.wdp:
JPEG-XR Image, hard tiling, spatial xform=TL, short header
, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
SMALLTOMATO.wdp:
JPEG-XR Image, hard tiling, spatial xform=TL, short header
, 3888x2592, bitdepth=1-BLACK=1, colorfmt=YONLY
abydos.jxr:
JPEG-XR Image, spatial xform=TL, short header
, 800x600, bitdepth=16-SIGNED, colorfmt=YONLY
example.tif:
TIFF image data, little-endian, direntries=15
, height=800, bps=194, compression=none
, PhotometricInterpretation=RGB
, orientation=upper-left, width=1200
example.wdp:
JPEG-XR Image
, codestream present, spatial xform=TL, short header
, 1200x800, bitdepth=16-FLOAT, colorfmt=YONLY
fmt-590-signature-id-931.wdp:
data
With best wishes,
Jörg Jenderek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.41-images-jpeg.diff.sig
Type: application/octet-stream
Size: 923 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment.obj>
-------------- next part --------------
--- file-5.41/magic/Magdir/images.old 2021-10-18 16:20:03.000000000 +0200
+++ file-5.41/magic/Magdir/images 2022-06-03 00:40:07.046440900 +0200
@@ -1865,54 +1865,4 @@
0 string \x46\x4d\x52\x00 ISO/IEC 19794-2 Format Minutiae Record (FMR)
-# doc: https://www.shikino.co.jp/eng/products/images/FLOWER.jpg.zip
-# example: https://www.shikino.co.jp/eng/products/images/FLOWER.wdp.zip
-90 bequad 0x574D50484F544F00 JPEG-XR Image
->98 byte&0x08 =0x08 \b, hard tiling
->99 byte&0x80 =0x80 \b, tiling present
->99 byte&0x40 =0x40 \b, codestream present
->99 byte&0x38 x \b, spatial xform=
->99 byte&0x38 0x00 \bTL
->99 byte&0x38 0x08 \bBL
->99 byte&0x38 0x10 \bTR
->99 byte&0x38 0x18 \bBR
->99 byte&0x38 0x20 \bBT
->99 byte&0x38 0x28 \bRB
->99 byte&0x38 0x30 \bLT
->99 byte&0x38 0x38 \bLB
->100 byte&0x80 =0x80 \b, short header
->>102 beshort+1 x \b, %d
->>104 beshort+1 x \bx%d
->100 byte&0x80 =0x00 \b, long header
->>102 belong+1 x \b, %x
->>106 belong+1 x \bx%x
->101 beshort&0xf x \b, bitdepth=
->>101 beshort&0xf 0x0 \b1-WHITE=1
->>101 beshort&0xf 0x1 \b8
->>101 beshort&0xf 0x2 \b16
->>101 beshort&0xf 0x3 \b16-SIGNED
->>101 beshort&0xf 0x4 \b16-FLOAT
->>101 beshort&0xf 0x5 \b(reserved 5)
->>101 beshort&0xf 0x6 \b32-SIGNED
->>101 beshort&0xf 0x7 \b32-FLOAT
->>101 beshort&0xf 0x8 \b5
->>101 beshort&0xf 0x9 \b10
->>101 beshort&0xf 0xa \b5-6-5
->>101 beshort&0xf 0xb \b(reserved %d)
->>101 beshort&0xf 0xc \b(reserved %d)
->>101 beshort&0xf 0xd \b(reserved %d)
->>101 beshort&0xf 0xe \b(reserved %d)
->>101 beshort&0xf 0xf \b1-BLACK=1
->101 beshort&0xf0 x \b, colorfmt=
->>101 beshort&0xf0 0x00 \bYONLY
->>101 beshort&0xf0 0x10 \bYUV240
->>101 beshort&0xf0 0x20 \bYWV422
->>101 beshort&0xf0 0x30 \bYWV444
->>101 beshort&0xf0 0x40 \bCMYK
->>101 beshort&0xf0 0x50 \bCMYKDIRECT
->>101 beshort&0xf0 0x60 \bNCOMPONENT
->>101 beshort&0xf0 0x70 \bRGB
->>101 beshort&0xf0 0x80 \bRGBE
->>101 beshort&0xf0 >0x80 \b(reserved %#x)
-
# From: Johan van der Knijff <johan.vanderknijff at kb.nl>
#
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-jxr.txt.gz
Type: application/x-gzip
Size: 822 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nconvert-jxr.txt.gz
Type: application/x-gzip
Size: 673 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment-0001.bin>
-------------- next part --------------
--- file-5.45/magic/Magdir/jpeg.old 2022-12-21 16:58:05.000000000 +0100
+++ file-5.45/magic/Magdir/jpeg 2023-10-06 21:44:18.412532000 +0200
@@ -167,2 +167,3 @@
# JPEG extended range
+# Update: Joerg Jenderek 2023
# URL: http://fileformats.archiveteam.org/wiki/JPEG_XR
@@ -170,8 +171,12 @@
# http://mark0.net/download/triddefs_xml.7z/defs/b/bitmap-wmp.trid.xml
-# Note: called by TrID "JPEG XR bitmap"
+# http://mark0.net/download/triddefs_xml.7z/defs/b/bitmap-jxr.trid.xml
+# Note: called by TrID "JPEG XR bitmap" and "JPEG XR bitmap (WMPHOTO)"
+# verified as "JPEG XR" by XnView `nconvert -fullinfo *.jxr *.wdp`
+# partly by ImageMagick command `identify -verbose *.wdp`
+# and libjxr-tools `JxrDecApp -v -i example.wdp -o example.tif`
0 string \x49\x49\xbc
-# FILE_VERSION_ID; shall be equal to 1; other values are reserved for future use
+# FILE_VERSION_ID; shall be equal to 1; other values are reserved for future use and are unlike to appear
>3 byte 1
# FIRST_IFD_OFFSET; shall be an integer multiple of 2; so skip DROID fmt-590-signature-id-931.wdp
->>4 lelong%2 0 JPEG-XR
+>>4 lelong%2 0 JPEG-XR Image
#!:mime image/vnd.ms-photo
@@ -182,6 +187,16 @@
#!:ext jxr/wdp/hdp/wmp
-# moved from ./images (version 1.205 ), merged and
-# partly verified by XnView `nconvert -info abydos.jxr FLOWER.wdp`
-# example: https://web.archive.org/web/20160403012904/
+# moved from ./images (version 1.243 ) and merged
+# example:
# http://shikino.co.jp/solution/upfile/FLOWER.wdp.zip
+# often GDI_SIGNATURE "WMPHOTO\0" at offset 90 like: FLOWER.wdp abydos.jxr SMALLTOMATO.wdp
+>90 bequad 0x574D50484F544F00
+>>0 use jxr-info
+# seldom no GDI_SIGNATURE WMPHOTO\0 at offset 90 like: example.wdp MARKET-3361-ipm-bg-DE-treat[1].wdp
+>90 bequad !0x574D50484F544F00
+# look for GDI_SIGNATURE WMPHOTO\0 at other offset
+>>4 search/3267/sb WMPHOTO\0
+>>>&-90 use jxr-info
+#
+0 name jxr-info
+# check for GDI_SIGNATURE that corresponds to "WMPHOTO\0"
>90 bequad 0x574D50484F544F00
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.45-jpeg-jxr.diff.sig
Type: application/octet-stream
Size: 1198 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20231006/625fd70d/attachment-0001.obj>