[File] [PATCH] Magdir/images FITS image; more extensions+mime type

Christos Zoulas christos at zoulas.com
Fri Jan 5 16:18:50 UTC 2024


Committed, thanks!

christos

> On Dec 24, 2023, at 10:29 PM, Jörg Jenderek (GMX) <joerg.jen.der.ek at gmx.net> wrote:
> 
> Hello,
> 
> Some days ago I had to handle some files with the FTS suffix. Some of
> the samples are astronomical images.
> 
> When running the file command version 5.45 on such images and
> related files I get output like:
> 
> DDTSUVDATA.fits:                FITS image data	, 32-bit
> 				, floating point, single precision
> M57.FIT:                        FITS image data
> M57.PGM:                        Netpbm image data
> 				, size = 192 x 165, rawbits, greymap
> MOON.FTS:                       FITS image data
> arange.fits:                    FITS image data, 32-bit
> 				, two's complement binary integer
> blank.fits:                     FITS image data
> example.fit:                    FITS image data, 8-bit
> 				, character or unsigned binary integer
> group.fits:                     FITS image data, 32-bit
> 				, floating point, single precision
> header_newlines.fits:           FITS image data, 64-bit
> 				, floating point, double precision
> ngc1316r-d.fz:                  FITS image data, 16-bit
> 				, two's complement binary integer
> ngc1316r-gzip.fz:               FITS image data, 16-bit
> 				, two's complement binary integer
> ngc1316r-gzip2.fz:              FITS image data, 16-bit
> 				, two's complement binary integer
> ngc1316r-hcomp.fz:              FITS image data, 16-bit
> 				, two's complement binary integer
> ngc1316r-rice.fz:               FITS image data, 16-bit
> 				, two's complement binary integer
> ngc1316r.fit:                   FITS image data, 16-bit
> 				, two's complement binary integer
> o4sp040b0_raw-p.fz:             FITS image data, 16-bit
> 				, two's complement binary integer
> x-fmt-383-signature-id-57.fits: FITS image data, 8-bit
> 				, character or unsigned binary integer
> x-fmt-383-signature.fits:       FITS image data, 8-bit
> 				, character or unsigned binary integer
> 
> With the --extension option only 2 suffixes (fits/fts) are shown, and
> with the -i option image/fits is shown, which is not wrong.
> 
> For comparison I ran the file format identification utility
> TrID (see https://mark0.net/soft-trid-e.html). Most samples are
> "recognized" and described with low priority as "Flexible Image
> Transport System bitmap (gen)" with mime type image/fits by
> bitmap-fts.trid.xml. Here 4 suffixes (.FITS/FIT/FTS/FZ) are listed.
> The samples with the FZ suffix are also described with higher priority
> as "Flexible Image Transport System bitmap (compressed)" via
> bitmap-fz.trid.xml. Here only 1 suffix is listed. Some samples (like
> M57.FIT) and the DROID samples (like x-fmt-383-signature.fits
> x-fmt-383-signature-id-57.fits) are described as "Unknown!" (see
> appended trid-v-fits.txt.gz).
> 
> For comparison I also ran the file format identification
> utility DROID (see https://sourceforge.net/projects/droid/). It
> identifies most examples as "Flexible Image Transport System" described
> by PUID x-fmt/383. Here 2 mime types (application/fits image/fits)
> are listed. Only the fits suffix is considered valid. A few samples
> (like M57.FIT MOON.FTS) are not recognized (see appended
> droid-fits.csv.gz).
> 
> Luckily, with the information given by the other tools I also found
> pages about the Flexible Image Transport System on Wikipedia and on the
> file formats archive team web site. Links to samples, suitable software
> and references are listed there as well. That information is expressed
> by comment lines inside Magdir/images like:
> 
> # URL:	http://fileformats.archiveteam.org/
> #	wiki/Flexible_Image_Transport_System
> #	https://en.wikipedia.org/wiki/FITS
> # Ref.:	https://mark0.net/download/triddefs_xml.7z
> #	defs/b/bitmap-fts.trid.xml
> # URL:	https://heasarc.nasa.gov/fitsio/fpack/
> # Ref.:	https://mark0.net/download/triddefs_xml.7z
> #	defs/b/bitmap-fz.trid.xml
> 
> The description inside Magdir/images starts with lines like:
> 0	string	SIMPLE\ \ =	FITS image data
> !:mime	image/fits
> !:ext	fits/fts
> 
> To improve recognition I summarize the file format specification.
> At the beginning of the file the data are organized in blocks (called
> card images) of 80 bytes. A card consists of 3 parts (keyword,
> assignment indicator, value with optional comment). The keyword is a 1-
> to 8-character, left-justified ASCII string. The assignment indicator
> starts with an equal sign (=). This indicator always occupies columns
> nine and ten in the card image. The value is an ASCII representation of
> the numerical or string data associated with the keyword. A comment is
> separated from the value by a slash (/) or a space and a slash ( /); the
> latter is recommended. A boolean value always occupies column 30.
> Columns that do not contain data are filled with spaces. Integer and
> floating-point values are located in columns 11 through 30 and are
> right-justified, padded with spaces if necessary. If a keyword contains
> fewer than eight characters, it is padded with spaces. There are five
> keywords that are required in every FITS file: SIMPLE, BITPIX, NAXIS,
> NAXISn, and END. (EXTEND is also a required keyword if extensions are
> present in the file.) At the moment the file command only checks the
> keyword and assignment of the first card. TrID and DROID also check the
> keyword and assignment of the second card. DROID also checks the
> keyword, assignment and value of the third card.
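The card layout summarized above can be sketched in Python. This is only an illustration of the column rules and not part of the patch; the name parse_card is hypothetical, and the sketch ignores doubled quotes inside string values.

```python
def parse_card(card: bytes):
    """Split one 80-byte FITS header card into (keyword, value, comment)."""
    if len(card) != 80:
        raise ValueError("a FITS card is exactly 80 bytes long")
    text = card.decode("ascii")
    keyword = text[:8].rstrip()           # columns 1-8, left-justified
    if text[8:10] != "= ":                # columns 9-10: assignment indicator
        return keyword, None, None        # e.g. END, which carries no value
    rest = text[10:]                      # columns 11-80: value and comment
    stripped = rest.lstrip()
    if stripped.startswith("'"):
        # String value; doubled quotes ('') inside it are not handled here.
        close = stripped.find("'", 1)
        if close == -1:
            return keyword, stripped[1:].rstrip(), None
        value = stripped[1:close].rstrip()
        tail = stripped[close + 1:]
    else:
        value, slash, tail = rest.partition("/")
        value = value.strip()
        tail = slash + tail
    # Whatever follows the first slash after the value is the comment.
    _, slash, comment = tail.partition("/")
    comment = comment.strip() if slash else None
    return keyword, value, comment or None


if __name__ == "__main__":
    sample = b"BITPIX  =                   16 / bits per pixel".ljust(80)
    print(parse_card(sample))
```

The value is returned as text on purpose; turning it into an integer, float or boolean is left to the caller.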
> 
> On the one hand the file command cannot be a validator for FITS samples.
> That is done for example by the tool fitsverify (see appended
> DDTSUVDATA-fitsverify.txt.gz x-fmt-383-fitsverify.txt.gz
> M57-fitsverify.txt.gz MOON-fitsverify.txt.gz). On the other hand, in my
> opinion it should not be as strict as the other mentioned tools, because
> some of the other mentioned software accepts "strange" samples (like
> M57.FIT MOON.FTS) while some crashes or freezes without giving a reason.
> So I decided that the file command should accept most "strange" samples
> and report information about their "strangeness". I only skip the DROID
> samples (like x-fmt-383-signature-id-57.fits). These are used by the
> DROID tool to recognize FITS samples. They are not real files but
> contain just some leading characteristic bytes, or more concretely the
> first 2 cards and the keyword/assignment part of the third card. The
> difference from real examples is that most values are filled with dummy
> bytes (0x00 or 0xAB), whereas real files use space characters (0x20)
> for padding. So to skip the DROID samples the magic lines now start like:
> 0	string	SIMPLE\ \ =
> >89	ubeshort	=0x2020	FITS image
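As a rough Python equivalent of the two magic lines above (a sketch for illustration only; the function name looks_like_real_fits is hypothetical), the test requires the SIMPLE keyword at offset 0 and two space bytes at offsets 89 and 90, where the DROID signature samples carry dummy bytes instead:

```python
def looks_like_real_fits(data: bytes) -> bool:
    """Mimic: 0 string 'SIMPLE  =' followed by >89 ubeshort =0x2020."""
    if not data.startswith(b"SIMPLE  ="):
        return False
    # Offsets 89 and 90 are columns 10 and 11 of the second card; the
    # rule above expects both to be spaces (0x20) in real files.
    return data[89:91] == b"\x20\x20"
```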
> 
> The bit depth is shown afterwards by lines like:
> >109	string	8	\b, 8-bit, character or unsigned binary integer
> >108	string	16	\b, 16-bit, two's complement binary integer
> >107	string	\ 32	\b, 32-bit, two's complement binary integer
> >107	string	-32	\b, 32-bit, floating point, single precision
> >107	string	-64	\b, 64-bit, floating point, double precision
> 
> What is wrong here? An entry for the second 64-bit variant (two's
> complement binary integer, found for example in blank.fits) is missing.
> In well formed samples the bit numbers are stored at defined positions.
> But in a few samples (like MOON.FTS) the digit eight is found some bytes
> further left, and in a few samples (like M57.FIT) it is found some bytes
> further right. Furthermore in a few samples (like M57.FIT) card 2
> (BITPIX) and card 3 (NAXIS) are swapped. To also describe these samples
> the bit part now becomes:
> 
> >>80	search/81/b	BITPIX\040\040=
> >>>&28	string	8	\b, 8-bit, character or unsigned binary integer
> >>>>0	string	x		(too right positioned)
> >>>&11	string	8	\b, 8-bit, character or unsigned binary integer
> >>>>0	string	x		(too left positioned)
> >>>&20	string	8	\b, 8-bit, character or unsigned binary integer
> >>>&19	string	16	\b, 16-bit, two's complement binary integer
> >>>&18	string	\04032	\b, 32-bit, two's complement binary integer
> >>>&18	string	-32	\b, 32-bit, floating point, single precision
> >>>&18	string	-64	\b, 64-bit, floating point, double precision
> >>>&18	string	\04064	\b, 64-bit, two's complement binary integer
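The same tolerance can be sketched in Python. This is an illustration and not the magic logic itself: instead of fixed offsets it accepts the number anywhere after the assignment indicator, looks at card 2 and card 3 only, and does not report the "too left/right positioned" remarks. The names BITPIX_MEANING and describe_bitpix are hypothetical.

```python
import re

# Descriptions taken from the magic lines above.
BITPIX_MEANING = {
    8: "8-bit, character or unsigned binary integer",
    16: "16-bit, two's complement binary integer",
    32: "32-bit, two's complement binary integer",
    64: "64-bit, two's complement binary integer",
    -32: "32-bit, floating point, single precision",
    -64: "64-bit, floating point, double precision",
}


def describe_bitpix(header: bytes):
    """Return the description for BITPIX in card 2 or card 3, else None."""
    for offset in (80, 160):              # card 2 normally, card 3 if swapped
        card = header[offset:offset + 80]
        match = re.match(rb"BITPIX  = *(-?[0-9]+)", card)
        if match:
            return BITPIX_MEANING.get(int(match.group(1)))
    return None
```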
> 
> Additional information for "strange" samples (like M57.FIT) is shown at
> the end by new lines like:
> >>80	string	!BITPIX\040\040=	\b, at 80
> >>>80	string	x			"%-0.9s"
> >>160	string	!NAXIS\040\040\040=	\b, at 160
> >>>160	string	x			"%-0.9s"
> 
> Samples can be compressed (with types like NOCOMPRESS GZIP_1 GZIP_2
> HCOMPRESS_1 PLIO_1 RICE_1). To avoid complications (some software can
> not handle compressed samples) a different file name suffix, FZ, is used
> for these. With the help of the TrID definition and the fpack user guide
> I see that this information is stored in the card with the ZCMPTYPE
> keyword. So the extension part is now done by lines like:
> >>240	search/0x4790/b	ZCMPTYPE=	data, compression type
> #>>>&0	string		x		COMPRESSION=%0.13s
> >>>&0	regex		[A-Z_1-2]{4,11}	%s
> >>240	default		x		data
> !:ext	fits/fit/fts
> 
> The number of dimensions is stored in the card with keyword NAXIS,
> normally the third. The single digit 2 implies a conventional bitmap in
> most cases. The digit 3 implies data cubes of three dimensions
> (animated bitmap or similar). Such samples can often be
> displayed/converted by graphic tools (like XnView ImageMagick GIMP).
> 
> I verified the information with the XnView command line tool by a line like:
> 	nconvert -in fits -fullinfo M57.FIT MOON.FTS
> Here some samples (like example.fit M57.FIT MOON.FTS ngc1316r.fit) are
> recognized (see appended nconvert-fits.txt.gz).
> I also tried ImageMagick version 7.1.1. Here some other samples
> (like arange.fits blank.fits example.fit header_newlines.fits MOON.FTS
> ngc1316r.fit) are recognized (see appended identify-fits.txt.gz) by a
> command line like:
> 	identify MOON.FTS
> I also tried the NetPBM tools. This can be done by a command line like:
> 	fitstopnm M57.FIT | file -
> 
> So such samples get mime type image/fits. Samples with other numbers
> of axes (like 0 5 6) can normally not be displayed by graphic tools.
> So such samples get mime type application/fits. This is now done by
> lines like:
> 
> >>80	search/81/b	NAXIS\040\040\040=		\b,
> #>>>>&0 string		x		NAXIS=%-0.31s
> >>>&0	search/31/b	\0400\040	0 axes
> !:mime	application/fits
> >>>&-1	search/31/b	\0401\040	1 axis
> !:mime	application/fits
> #!:mime	image/fits
> >>>&0	search/31/b	\0402\040	2 axes
> !:mime	image/fits
> >>>&0	search/31/b	\0403\040	3 axes
> !:mime	image/fits
> >>>&0	default		x
> >>>>&0	regex/31/s	=[0-9]{1,3} 	%s axis
> !:mime	application/fits
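The mime type decision above can be summarized by a short Python sketch. It is an illustration under assumptions, not the patch: the name fits_mime_type is hypothetical, NAXIS is looked for in card 2 and card 3 only, and returning None when no NAXIS card is found is a choice of this sketch.

```python
import re


def fits_mime_type(header: bytes):
    """Map the NAXIS value to a mime type as the magic lines above do."""
    for offset in (80, 160):              # card 2 or card 3
        card = header[offset:offset + 80]
        match = re.match(rb"NAXIS   = *([0-9]{1,3})(?![0-9])", card)
        if match:
            naxis = int(match.group(1))
            # 2 or 3 axes: displayable bitmap or data cube; anything else
            # (0, 1, 5, 6, ...) is treated as generic application data.
            return "image/fits" if naxis in (2, 3) else "application/fits"
    return None
```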
> 
> Then of course you want to get the dimensions of the data as shown for
> other graphics (like M57.PGM). For real images you often get familiar
> dimensions (like 1200x800 for example.fit) whereas for application
> samples you often get "strange" dimensions (like 0x3 for
> DDTSUVDATA.fits, 0x5 for group.fits, 8x300 for ngc1316r-gzip.fz). This
> information is stored in the cards with keywords NAXIS1 and NAXIS2. So
> this is now shown by lines like:
> 
> >>240	search/29400/bs	NAXIS1\040\040=		\b,
> >>>&9	regex	=[0-9]{1,31} 	%s
> >>>320	search/29120/bs	NAXIS2\040\040=		x
> >>>>&9	regex	=[0-9]{1,31} 	%s
> 
> After applying the above mentioned modifications with the patch
> file-5.45-images-fits.diff all my inspected astronomical images are
> still described, but now I get bit depth information for all my
> inspected samples. Dimension and compression information is
> now shown as well. This then looks like:
> 
> DDTSUVDATA.fits:                FITS image data, 32-bit
> 				, floating point, single precision
> 				, 6 axis, 0 x 3
> M57.FIT:                        FITS image data, 8-bit
> 				, character or unsigned binary integer
> 				(too right positioned)
> 				, 2 axes, 192 x 165
> 				, at 80 "NAXIS   =", at 160 "BITPIX  ="
> M57.PGM:                        Netpbm image data
> 				, size = 192 x 165, rawbits, greymap
> MOON.FTS:                       FITS image data, 8-bit
> 				, character or unsigned binary integer
> 				(too left positioned)
> 				, 2 axes, 192 x 165
> arange.fits:                    FITS image data, 32-bit
> 				, two's complement binary integer
> 				, 3 axes, 11 x 10
> blank.fits:                     FITS image data, 64-bit
> 				, two's complement binary integer
> 				, 2 axes, 1 x 1
> example.fit:                    FITS image data, 8-bit
> 				, character or unsigned binary integer
> 				, 3 axes, 1200 x 800
> group.fits:                     FITS image data, 32-bit
> 				, floating point, single precision
> 				, 5 axis, 0 x 5
> header_newlines.fits:           FITS image data, 64-bit
> 				, floating point, double precision
> 				, 2 axes, 1 x 1
> ngc1316r-d.fz:                  FITS image data
> 				, compression type NOCOMPRESS, 16-bit
> 				, two's complement binary integer
> 				, 0 axes, 16 x 300
> ngc1316r-gzip.fz:               FITS image data
> 				, compression type GZIP_1, 16-bit
> 				, two's complement binary integer
> 				, 0 axes, 8 x 300
> ngc1316r-gzip2.fz:              FITS image data
> 				, compression type GZIP_2, 16-bit
> 				, two's complement binary integer
> 				, 0 axes, 8 x 300
> ngc1316r-hcomp.fz:              FITS image data
> 				, compression type HCOMPRESS_1 , 16-bit
> 				, two's complement binary integer
> 				, 0 axes, 8 x 19
> ngc1316r-rice.fz:               FITS image data
> 				, compression type RICE_1, 16-bit
> 				, two's complement binary integer
> 				, 0 axes, 8 x 300
> ngc1316r.fit:                   FITS image data, 16-bit
> 				, two's complement binary integer
> 				, 2 axes, 440 x 300
> o4sp040b0_raw-p.fz:             FITS image data
> 				, compression type PLIO_1, 16-bit
> 				, two's complement binary integer
> 				, 0 axes, 8 x 44
> x-fmt-383-signature-id-57.fits: data
> x-fmt-383-signature.fits:       ISO-8859 text, with no line terminators
> 
> I hope my diff file can be applied in a future version of the file
> utility.
> 
> With best wishes,
> Jörg Jenderek
> --
> Jörg Jenderek
> Attachments: <message part as attachment.DEFANGED-28>
> <nconvert-fits.txt.gz> <identify-fits.txt.gz> <trid-v-fits.txt.gz>
> <droid-fits.csv.gz> <DDTSUVDATA-fitsverify.txt.gz>
> <x-fmt-383-fitsverify.txt.gz> <M57-fitsverify.txt.gz>
> <MOON-fitsverify.txt.gz> <file-5_45-images-fits_diff_sig.DEFANGED-29>
> <file-5_45-images-fits_diff.DEFANGED-30>