[File] [PATCH] of Magdir/cad for Intergraph MicroStation update; *.dgn *.cel *.cit *.rgb *.rle
Jörg Jenderek
joerg.jen.der.ek at gmx.net
Mon Aug 5 22:24:26 UTC 2019
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
some weeks ago I handled MicroStation V8 CAD variants, which are based
on the Compound Document Format (abbreviated as CDF). This time I ran file
command version 5.37 on non-CDF-based CAD files with file name extension
dgn and on related files. Those are libraries with file name extension
cel and raster images (*.cit *.rle *.rgb). With the options -k -m Magdir/cad
I get output like:
civsur.cel: Bentley/Intergraph MicroStation DGN cell library
COMP27.RGB: Microstation
Bentley/Intergraph MicroStation
COMP9.rle: Microstation
Bentley/Intergraph MicroStation
FLOORPLA.DGN: Bentley/Intergraph MicroStation DGN vector CAD
Microstation
Bentley/Intergraph MicroStation
LONGLAT.CIT: Microstation CITFile
Bentley/Intergraph MicroStation CIT raster CAD
samp15.dgn: Bentley/Intergraph MicroStation DGN vector CAD
Microstation
Bentley/Intergraph MicroStation
seed2d_b.dgn: Bentley/Intergraph MicroStation DGN vector CAD
Microstation
Bentley/Intergraph MicroStation
seed3d_b.dgn: Bentley/Intergraph MicroStation DGN vector CAD
WHEEL.DGN: Bentley/Intergraph MicroStation DGN vector CAD
WRENCH.DGN: Bentley/Intergraph MicroStation DGN vector CAD
Microstation DGNFile
Bentley/Intergraph MicroStation
The message starting with the phrase "Bentley/Intergraph" appears twice,
because the following two lines in Magdir/cad both match the same leading bytes:
0 belong 0x0809fe02 Bentley/Intergraph MicroStation DGN vector CAD
0 beshort 0x0809 Bentley/Intergraph MicroStation
The remaining third message, starting with the phrase Microstation, is
triggered by the same byte pattern, only written in octal
by lines like:
0 string \010\011\376 Microstation
>3 string \002
>>30 string x DGNFile
Furthermore, with the --extension option only ??? is displayed, and with the -i
option only application/octet-stream is displayed.
The raster images are identified by octal expressions like
>4 string \030\000\000 CITFile
>4 string \030\000\003 CITFile
In principle the same is done by a hexadecimal expression like
>>0x04 beshort 0x1800 CIT raster CAD
As reference I use the page about DGN files found on the dgnlib site. So I add
a comment line like
# reference: http://dgnlib.maptools.org/dgn.html
On the same site I found the MicroStation 95 Reference Guide as
ref18.pdf. Neither is complete, but with that information it is
possible to understand the current magic identifications and to correct
the lines. According to the documentation, information for debugging purposes
can be shown by lines like
>0 ubyte&0x3F x \b, level %u
>0 ubyte &0x80 \b, complex
>0 ubyte &0x40 \b, reserved
>1 ubyte&0x7F x \b, type %u
>2 uleshort x \b, words 0x%4.4x to follow
Level seems to be always 8. DGN files always start with an element of TCB
type, that is value 9. That is also matched for samples like
seed3d_b.dgn or WHEEL.DGN with the complex and reserved bits set. These
samples were described with only one text by the magic line
0 belong 0xc809fe02 Bentley/Intergraph MicroStation DGN vector CAD
CEL libraries always start with an element of type Group Data Elements, that
is value 5. For such libraries the words to follow in element (WTF) have
the value 0017h. This was expressed by the magic line
0 belong 0x08051700 Bentley/Intergraph MicroStation DGN cell library
So this magic line assumes that all cell libraries have a WTF value of
17h, but in the documentation I see no hint that this should always be
true. So for libraries I removed the test relying on the WTF value.
So I replace all magic lines concerning the inspected samples and first
test for level 8 and type 5 or 9 by the magic line
0 beshort&0x3F73 0x0801
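To see what this mask lets through, here is a small Python sketch. It is not part of the patch, and the function name is only chosen for illustration. It applies the same mask to the first two bytes, read as a big-endian short, and lists the element types that pass when the level is 8.

```python
MASK = 0x3F73      # from the magic line: beshort&0x3F73
EXPECTED = 0x0801  # value the masked short must equal

def first_test_matches(header: bytes) -> bool:
    """Mimic `0 beshort&0x3F73 0x0801` on the first two bytes."""
    value = int.from_bytes(header[:2], "big")
    return (value & MASK) == EXPECTED

# Element types (7 bits) that pass the mask when the level is 8
passing_types = [t for t in range(128) if ((0x0800 | t) & MASK) == EXPECTED]
print(passing_types)
```

The mask ignores the complex and reserved bits of the first byte, so 08h and C8h both pass. In the type byte it ignores the bits 0x04 and 0x08 (and the top bit), so besides types 5 and 9 the values 1 and 13 also pass this first test. The later tests on the second element narrow this down.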
By adding the 2 leading words to the WTF value you get the size of the first
element in words, and then by multiplying by 2 you get the size of the first
element in bytes. Or use a pointer expression to jump to the second element
by the line
>(2.s*2) ulong x
For debugging purposes the type value of the second element can be displayed
by a line like
>>&1 ubyte&0x7F x \b, 2nd type %u
According to the documentation, for DGN files this is always 8 for the
Digitizer element, and for CEL files this is always 1 for the library cell
header.
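The offset arithmetic behind the pointer expression and the following &1 test can be sketched in Python. This rests on an assumption about how magic(5) evaluates the lines: (2.s*2) reads the little-endian short at offset 2 and doubles it, the ulong test consumes 4 bytes, and &1 is relative to the end of that match, which lands on the type byte of the second element. The helper name and the synthetic data are made up for illustration.

```python
import struct

def second_element_type(data: bytes) -> int:
    """Return the type (low 7 bits) of the second element."""
    (wtf,) = struct.unpack_from("<H", data, 2)  # words to follow (WTF)
    indirect = wtf * 2                          # what (2.s*2) evaluates to
    type_offset = indirect + 4 + 1              # ulong is 4 bytes, then &1
    # Same position as: first element size in bytes, plus 1
    assert type_offset == (wtf + 2) * 2 + 1
    return data[type_offset] & 0x7F

# Synthetic cell library start: level 8, type 5, WTF 17h, then a type 1 element
first = bytes([0x08, 0x05, 0x17, 0x00]) + bytes(0x17 * 2)
sample = first + bytes([0x08, 0x01, 0x00, 0x00])
print(second_element_type(sample))
```

So the 4 bytes consumed by the ulong test stand in for the 2 leading words of the first element, which is why no explicit "+4" is needed in the pointer expression.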
So test for second element type 1 for the cell library branch by
>>&1 ubyte&0x7F 1
Afterwards test for a 1st element with level 8 and type 5 for a cell
library by the line
>>>0 beshort 0x0805 Bentley/Intergraph Microstation CAD cell library
Afterwards show a user-defined MIME type and file name extension by
lines
!:mime application/x-bentley-cel
!:ext cel
The branch for DGN files tests for a Digitizer element as second element by the line
>>&1 ubyte&0x7F 8
For DGN files the documentation explicitly mentions that the first element
has 1536 bytes, that is 3 blocks of 512 bytes. Dividing by 2 gives
an element size of 768 words. Subtracting the 2 leading
words gives a WTF value of 766, or 2FEh in hexadecimal. So
here a test for a valid WTF can be used by a line starting with
>>>2 uleshort =0x02FE Bentley/Intergraph Microstation CAD drawing
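The arithmetic can be checked with a few lines of Python. This is only a sketch; the function and constant names are chosen for illustration and do not come from the documentation.

```python
BLOCK_BYTES = 512

def wtf_for_blocks(blocks: int) -> int:
    """Words to follow for a first element spanning `blocks` 512-byte blocks."""
    words = blocks * BLOCK_BYTES // 2
    return words - 2  # the 2 leading header words are not counted

wtf = wtf_for_blocks(3)
print(wtf, hex(wtf), wtf.to_bytes(2, "little").hex())
```

Stored as a little-endian short, 2FEh gives the bytes FE 02 at offset 2, which are the same bytes the old line `0 belong 0x0809fe02` tested for.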
I changed the name to a phrase with "CAD drawing" instead of "DGN vector CAD"
or "DGNFile", following how others call such files, for example on the web
site http://file-extension.net/seeker/file_extension_dgn .
I also removed the phrase "DGN" because this information is now
visible through the user-defined MIME type and file name extension given by the
additional lines
!:mime application/x-bentley-dgn
!:ext dgn
With the help of the documentation some more useful information
can be displayed. The 0x40 bit of the byte at offset 1214 is 1 if the file is 3D,
otherwise 0 for two-dimensional samples. This is expressed by lines
>>>>1214 ubyte &0x40 3D
>>>>1214 ubyte ^0x40 2D
This dimensional information is not always as obvious from the file name as
in samples like seed2d_b.dgn or seed3d_b.dgn.
Furthermore the 2-character abbreviations for sub unit and master unit
can be displayed by lines
>>>>1120 string x \b, units %-.2s
>>>>1122 string >\0 %-.2s
In CAD samples like FLOORPLA.DGN made by people using the metric system
you often find something like m mm here.
In samples like seed2d_b.dgn or samp15.dgn made by people using feet
and inches as units you often find something like FT IN or ' " here.
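What the dimension and unit lines print can be imitated with a short Python sketch. It assumes the offsets from the magic lines above (1214 for the flag byte, 1120 and 1122 for the two unit fields) and that each unit field is at most 2 characters, cut at the first NUL byte. The function name and the test buffer are hypothetical.

```python
def describe_tcb(data: bytes) -> str:
    """Imitate the 2D/3D and units output for the first (TCB) element."""
    dimension = "3D" if data[1214] & 0x40 else "2D"

    def unit_field(offset: int) -> str:
        raw = data[offset:offset + 2]          # at most 2 characters, like %-.2s
        return raw.split(b"\0", 1)[0].decode("latin-1")

    text = f"{dimension}, units {unit_field(1120)}"
    second = unit_field(1122)
    if second:                                  # like `string >\0`: skip if empty
        text += f" {second}"
    return text

# Synthetic 1536-byte first element with metric unit names
tcb = bytearray(1536)
tcb[1120:1124] = b"m\0mm"
print(describe_tcb(bytes(tcb)))
```

For real files the dgndump tool remains the better check; the sketch only shows how the two magic lines combine into one description.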
For debugging purposes the words to optional attribute linkage can be
shown by lines
>>>>30 ubyte x \b, attindx \%o
>>>>31 ubyte x \b\%o
These values differ, but apparently only about a dozen combinations
appear. This was used as the last test for DGN files by 19 lines
like
>>30 string \026\105 DGNFile
...
>>30 string \376\103 DGNFile
I do not understand why these tests for attindx values are used. For
me this makes no sense. So I removed these lines. Instead I used the test
for the documented second element type 8 mentioned above.
The shown information can be verified by running the dgndump tool
from the dgnlib suite on DGN files.
The third branch is for Intergraph raster images (INGR). Information is
found on the fileformats.archiveteam.org web site. So I add the comment line
# URL: http://fileformats.archiveteam.org/wiki/Intergraph_Raster
That page also links to the specification of the Intergraph Raster File Format (from
archive.org).
Unfortunately the second element trick is not useful here,
because the documentation says nothing about a second element.
According to the documentation, 3 bytes at the end of the first block are
reserved and always null. For CEL and DGN files their value is
not null; there the "conversion" variable of the ViewInfo structure is stored.
So catch raster images by a new second test line
>508 ubelong&0xFFffFF00 =0
According to the docs a raster image always starts with the byte sequence 08 09.
So test for level 8 and type 9 by a third test line like
>>0 beshort 0x0809
According to the documentation the first element occupies some blocks of 512
bytes. So the size of the element in bytes is something like 0200h. Dividing
by 2 gives the size in words like 0100h. Subtracting 2 for the
leading words gives a WTF value like 00FEh. So test for the length of the 1st
element by the line
>>>2 ubyte 0xfe
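The three raster tests can be put together in a small Python sketch. It only mirrors the magic lines above and is not a validated detector; the function name is hypothetical.

```python
def looks_like_ingr_raster(data: bytes) -> bool:
    """Mirror the three tests used for the Intergraph raster branch."""
    if len(data) < 512:
        return False
    reserved_is_zero = data[508:511] == b"\x00\x00\x00"  # >508 ubelong&0xFFffFF00 =0
    starts_with_0809 = data[0:2] == b"\x08\x09"          # >>0 beshort 0x0809
    wtf_low_byte_fe = data[2] == 0xFE                     # >>>2 ubyte 0xfe
    return reserved_is_zero and starts_with_0809 and wtf_low_byte_fe

# For n blocks of 512 bytes the WTF value is n*256 - 2, so its low byte is FEh
low_bytes = {(n * 256 - 2) & 0xFF for n in range(1, 256)}
print(low_bytes == {0xFE})
```

Testing only the low byte of WTF therefore accepts any whole number of blocks, for example the values 9FEh and DFEh seen in CIT samples, and not just 00FEh.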
Afterwards call a new subroutine to describe INGR raster images.
>>>>0 use ingr-image
0 name ingr-image
At offset 4 the 2-byte variable DataTypeCode is stored. It
indicates the format, the depth of the pixel data and the compression used.
What version 5.37 called "CITFile" and "CIT raster CAD" I
now describe by lines like
>4 uleshort x Intergraph raster image
>>4 uleshort 0x0018 \b, CCITT Group 4 1-bit
!:mime image/x-intergraph-cit
!:ext cit
>>4 default x
>>>4 uleshort x \b, Type %u
!:mime image/x-intergraph
I changed the name. I removed the "CIT" phrase because this information is
now shown by the --extension and MIME type options. I looked at how others
call such images on sites like
http://file-extension.net/seeker/file_extension_cit .
And I also looked at the reference, where type 24 is described as "CCITT
Group 4 1-bit". I removed the additional magic lines with the test for
DataTypeCode 18h; instead I use the test for the 3 reserved null bytes.
With the old test only CIT images are recognized, and for the 33 other
image types you get an unspecific description like MicroStation or
Microstation Bentley/Intergraph for samples like COMP27.RGB and
COMP9.rle. Unfortunately I only got samples for 2 other image
types. So I insert matching code segments:
>>4 uleshort 0x0009 \b, Run-Length Encoded 1-bit
!:mime image/x-intergraph-rle
!:ext rle
>>4 uleshort 27 \b, Adaptive RLE RGB
!:mime image/x-intergraph-rgb
!:ext rgb
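The DataTypeCode branch amounts to a small lookup table with a fallback. A Python sketch under that reading follows; it contains only the three codes covered by the patch, and the names are chosen for illustration.

```python
import struct

# Only the three codes handled by the patch; all others fall back to "Type N"
DATA_TYPES = {
    9: ("Run-Length Encoded 1-bit", "image/x-intergraph-rle"),
    24: ("CCITT Group 4 1-bit", "image/x-intergraph-cit"),
    27: ("Adaptive RLE RGB", "image/x-intergraph-rgb"),
}

def data_type(header: bytes) -> tuple:
    """Return (description, MIME type) for the DataTypeCode at offset 4."""
    (code,) = struct.unpack_from("<H", header, 4)  # uleshort at offset 4
    return DATA_TYPES.get(code, (f"Type {code}", "image/x-intergraph"))

cit_header = bytes([0x08, 0x09, 0xFE, 0x00, 0x18, 0x00])
print(data_type(cit_header))
```

The fallback corresponds to the `default` line of the subroutine, so images with one of the other type codes still get a generic description and MIME type.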
Afterwards show the ApplicationType, which can have ten possible
values, by the line:
>6 uleshort !0 \b, ApplicationType %u
0 means generic raster image, 3 means drawing, scanning. So in
version 5.37 only CIT examples with these 2 ApplicationType values were
recognized by the lines
>4 string \030\000\000 CITFile
>4 string \030\000\003 CITFile
So I removed these additional magic lines, because I now use an
additional line which tests for the 3 reserved null bytes.
According to the documentation, now also show the image dimensions by lines
>184 ulelong x \b, %u x
>188 ulelong x %u
The variable ScanlineOrient indicates the origin and the orientation
of the scan lines. This is now shown by lines
>194 ubyte x \b, orientation
>194 ubyte &0x01 right
>194 ubyte ^0x01 left
>194 ubyte &0x02 down
>194 ubyte ^0x02 top
>194 ubyte &0x04 horizontal
>194 ubyte ^0x04 vertical
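How the dimension and orientation lines combine into one description can be sketched in Python. The offsets and bit meanings are taken from the magic lines above; the function name and the synthetic header are hypothetical.

```python
import struct

def describe_ingr(header: bytes) -> str:
    """Imitate the dimension and orientation part of the ingr-image output."""
    width, height = struct.unpack_from("<II", header, 184)  # two ulelong values
    orient = header[194]                                     # ScanlineOrient
    words = [
        "right" if orient & 0x01 else "left",
        "down" if orient & 0x02 else "top",
        "horizontal" if orient & 0x04 else "vertical",
    ]
    return f"{width} x {height}, orientation {' '.join(words)}"

# Synthetic 512-byte header: 640 x 480 with only the 0x04 bit set
hdr = bytearray(512)
struct.pack_into("<II", hdr, 184, 640, 480)
hdr[194] = 0x04
print(describe_ingr(bytes(hdr)))
```

Each bit gets one line with & (bit set) and one with ^ (bit clear) in the magic file, so exactly three words are printed for every image.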
The shown information for the inspected images can be verified by running
nconvert of the xnview suite with the -fullinfo option.
After applying the above mentioned modifications by the patch
file-5.37-cad-intergraph.diff the duplicate identifications vanish and
I get more precise output like:
civsur.cel: Bentley/Intergraph Microstation CAD cell library
COMP27.RGB: Intergraph raster image, Adaptive RLE RGB,
640 x 480, orientation left top horizontal
COMP9.rle: Intergraph raster image, Run-Length Encoded 1-bit,
640 x 480, orientation left top horizontal
FLOORPLA.DGN: Bentley/Intergraph Microstation CAD drawing 2D,
units m mm
LONGLAT.CIT: Intergraph raster image, CCITT Group 4 1-bit,
1064 x 1201, orientation left top horizontal
samp15.dgn: Bentley/Intergraph Microstation CAD drawing 2D,
units FT IN
seed2d_b.dgn: Bentley/Intergraph Microstation CAD drawing 2D,
units ' "
seed3d_b.dgn: Bentley/Intergraph Microstation CAD drawing 3D,
units ' "
WHEEL.DGN: Bentley/Intergraph Microstation CAD drawing 3D,
units mu su
WRENCH.DGN: Bentley/Intergraph Microstation CAD drawing 2D,
units in th
I hope my diff file can be applied in a future version of the
file utility.
With best wishes
Jörg Jenderek
- --
Jörg Jenderek
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
iF0EARECAB0WIQS5/qNWKD4ASGOJGL+v8rHJQhrU1gUCXUiskQAKCRCv8rHJQhrU
1isuAJ9qeD1o0rElk6xm+yW+ZpfjpXV7jQCdFsbvG+BSMS0uuf7UYgTN3/zuDr4=
=ZFKl
-----END PGP SIGNATURE-----
-------------- next part --------------
--- file-5.37/magic/Magdir/cad.old 2019-04-19 00:42:27 +0000
+++ file-5.37/magic/Magdir/cad 2019-08-05 21:38:08 +0000
@@ -18,29 +18,162 @@
# 3F86C928&method=display&p_objectid=97F351F5-9C35-4E5E-89C280A93F86C928
# https://www.bentley.com/products/default.cfm?objectid=A5C2FD43-3AC9-4C71-B682
# 721C479F&method=display&p_objectid=A5C2FD43-3AC9-4C71-B682C7BE721C479F
-0 string \010\011\376 Microstation
->3 string \002
->>30 string \026\105 DGNFile
->>30 string \034\105 DGNFile
->>30 string \073\107 DGNFile
->>30 string \073\110 DGNFile
->>30 string \106\107 DGNFile
->>30 string \110\103 DGNFile
->>30 string \120\104 DGNFile
->>30 string \172\104 DGNFile
->>30 string \172\105 DGNFile
->>30 string \172\106 DGNFile
->>30 string \234\106 DGNFile
->>30 string \273\105 DGNFile
->>30 string \306\106 DGNFile
->>30 string \310\104 DGNFile
->>30 string \341\104 DGNFile
->>30 string \372\103 DGNFile
->>30 string \372\104 DGNFile
->>30 string \372\106 DGNFile
->>30 string \376\103 DGNFile
->4 string \030\000\000 CITFile
->4 string \030\000\003 CITFile
+#
+# URL: https://en.wikipedia.org/wiki/MicroStation
+# reference: http://dgnlib.maptools.org/dgn.html
+# http://dgnlib.maptools.org/dl/ref18.pdf
+# Update: Joerg Jenderek
+# Note: verified by command like `dgndump seed2d_b.dgn`
+# test for level 8 and type 5 or 9
+0 beshort&0x3F73 0x0801
+# level of element like 8
+#>0 ubyte&0x3F x \b, level %u
+#>0 ubyte &0x80 \b, complex
+#>0 ubyte &0x40 \b, reserved
+# type of element 9~TCB 8~Digitizer setup 5~Group Data Elements
+#>1 ubyte&0x7F x \b, type %u
+# words to follow in element: 17H~CEL library 2FEh~DGN 9FEh,DFEh~CIT
+#>2 uleshort x \b, words 0x%4.4x to follow
+# test for 3 reserved 0 bytes in CIT or "conversion" in ViewInfo structure (DGN CEL)
+#>508 ubelong x \b, RESERVED %8.8x
+>508 ubelong&0xFFffFF00 =0
+# test for level 8 and type 9 for INGR raster image
+>>0 beshort 0x0809
+# test for length of 1st element is multiple of blocks a 512 bytes
+>>>2 ubyte 0xfe
+>>>>0 use ingr-image
+# test for DGN or CEL by jump words (uleshort) forward to next element
+>(2.s*2) ulong x
+# 2nd element type: 8~Digitizer~DesiGNfile 1~library cell header other~CIT
+#>>&1 ubyte&0x7F x \b, 2nd type %u
+# DGN
+>>&1 ubyte&0x7F 8
+>>>2 uleshort =0x02FE Bentley/Intergraph Microstation CAD drawing
+!:mime application/x-bentley-dgn
+!:ext dgn
+# The 0x40 bit of this byte is 1 if the file is 3D, otherwise 0
+>>>>1214 ubyte &0x40 3D
+>>>>1214 ubyte ^0x40 2D
+# 2 chars for name of subunits like ft FT in IN mu m mm '\0 '\040
+>>>>1120 string x \b, units %-.2s
+# 2 chars for name of master unit like IN in ML SU tn th TH HU mm "\0 "\040 \0\0
+>>>>1122 string >\0 %-.2s
+#>>>>1120 ubelong x \b, units 0x%8.8x
+# element range low,high x y z like xlow=0 08010000h 01080000h
+#>>>>4 ubelong !0 \b, xlow %8.8x
+#>>>>8 ubelong !0 \b, ylow %8.8x
+#>>>>12 ubelong !0 \b, zlow %8.8x
+#>>>>16 ubelong !0 \b, xhigh %8.8x
+#>>>>20 ubelong !0 \b, yhigh %8.8x
+#>>>>24 ubelong !0 \b, zhigh %8.8x
+# graphic group number; all other elements in that group have same non-0 number
+#>>>>28 leshort x \b, grphgrp 0x%4.4x
+# words to optional attribute linkage
+#>>>>30 ubyte x \b, attindx \%o
+#>>>>31 ubyte x \b\%o
+# >>30 string \026\105 DGNFile
+# >>30 string \034\105 DGNFile
+# >>30 string \073\107 DGNFile
+# >>30 string \073\110 DGNFile
+# >>30 string \106\107 DGNFile
+# >>30 string \110\103 DGNFile
+# >>30 string \120\104 DGNFile
+# >>30 string \172\104 DGNFile
+# >>30 string \172\105 DGNFile
+# >>30 string \172\106 DGNFile
+# >>30 string \234\106 DGNFile
+# >>30 string \273\105 DGNFile
+# >>30 string \306\106 DGNFile
+# >>30 string \310\104 DGNFile
+# >>30 string \341\104 DGNFile
+# >>30 string \372\103 DGNFile
+# >>30 string \372\104 DGNFile
+# >>30 string \372\106 DGNFile
+# >>30 string \376\103 DGNFile
+# elements properties indicator
+#>>>>32 uleshort !0 \b, properties 0x%4.4x
+# class 0~Primary
+#>>>>>32 uleshort&0x000F !0 \b, class 0x%4.4x
+# Symbology
+#>>>>>34 uleshort x \b, Symbology 0x%4.4x
+# test for 2nd element type 1~library cell header
+>>&1 ubyte&0x7F 1
+# test for 1st element with level 8 and type 5 for cell library
+>>>0 beshort 0x0805 Bentley/Intergraph Microstation CAD cell library
+!:mime application/x-bentley-cel
+!:ext cel
+#
+# URL: http://fileformats.archiveteam.org/wiki/Intergraph_Raster
+# reference: https://web.archive.org/web/20140903185431/
+# http://oreilly.com/www/centers/gff/formats/ingr/index.htm
+# note: verified by command like `nconvert -fullinfo LONGLAT.CIT`
+# display information for intergraph raster bitmap
+0 name ingr-image
+# in 5.37 "Microstation CITFile" "Bentley/Intergraph MicroStation CIT raster CAD"
+# DataTypeCode indicates format, depth of the pixel data and used compression
+>4 uleshort x Intergraph raster image
+>>4 uleshort 0x0009 \b, Run-Length Encoded 1-bit
+!:mime image/x-intergraph-rle
+!:ext rle
+>>4 uleshort 0x0018 \b, CCITT Group 4 1-bit
+!:mime image/x-intergraph-cit
+!:ext cit
+>>4 uleshort 27 \b, Adaptive RLE RGB
+!:mime image/x-intergraph-rgb
+!:ext rgb
+>>4 default x
+>>>4 uleshort x \b, Type %u
+!:mime image/x-intergraph
+# TODO:
+#>4 uleshort 0 \b, no data
+# ...
+#>4 uleshort 0x0045 \b, Continuous Tone CMKY (Uncompressed)
+# ApplicationType: 0~generic raster image 3~drawing, scanning
+# 8~I/IMAGE and MicroStation Imager 9~ModelView
+>6 uleshort !0 \b, ApplicationType %u
+#>6 uleshort x \b, ApplicationType %u
+# XViewOrigin; Raster grid data X origin
+#>8 ulequad !0 \b, XViewOrigin %llx
+# PixelsPerLine is the number of pixels in a scan line of bitmap
+>184 ulelong x \b, %u x
+# NumberOfLines is height of the raster data in scanlines
+>188 ulelong x %u
+# DeviceResolution; resolution of scanning device
+# positive indicates number of micros between lines; negative indicates DPI
+#>192 leshort x \b, DeviceResolution %d
+# ScanlineOrient indicates the origin and the orientation of the scan lines
+#>194 ubyte x \b, ScanlineOrient %x
+>194 ubyte x \b, orientation
+>194 ubyte &0x01 right
+>194 ubyte ^0x01 left
+>194 ubyte &0x02 down
+>194 ubyte ^0x02 top
+>194 ubyte &0x04 horizontal
+>194 ubyte ^0x04 vertical
+# ScannableFlag; Scanline indexing method used
+#>195 ubyte !0 \b, ScannableFlag 0x%x
+# RotationAngle; Rotation angle of raster data
+#>196 ubequad !0 \b, RotationAngle 0x%llx
+# SkewAngle; Skew angle of raster data
+#>204 ubequad !0 \b, SkewAngle %llx
+# DataTypeModifier; Additional raster data format info
+#>212 uleshort !0 \b, DataTypeModifier 0x%4.4x
+# DesignFile[66]; Name of the design file
+>214 string >\0 \b, DesignFile %-.66s
+# DatabaseFile[66]; Name of the database file
+>280 string >\0 \b, DatabaseFile %-.66s
+# ParentGridFile[66]; Name of parent grid file
+>346 string >\0 \b, ParentGridFile %-.66s
+# FileDescription[80]; Text description of file and contents
+>412 string >\0 \b, FileDescription %-.80s
+# MinValue
+#>492 ubequad !0 \b, MinValue 0x%llx
+# MaxValue
+#>500 ubequad !0 \b, MaxValue 0x%llx
+# Reserved[3]; Unused (always 0)
+#>508 ubelong&0xFFffFF00 x \b, RESERVED %8.8x
+# GridFileVersion; Grid File Version like 2 3
+#>511 ubyte x \b, GridFileVersion %x
# AutoCAD
# Merge of the different contributions and updates from https://en.wikipedia.org/wiki/Dwg
@@ -140,12 +273,6 @@
# Phillip Griffith <phillip dot griffith at gmail dot com>
# AutoCAD magic taken from the Open Design Alliance's OpenDWG specifications.
#
-0 belong 0x08051700 Bentley/Intergraph MicroStation DGN cell library
-0 belong 0x0809fe02 Bentley/Intergraph MicroStation DGN vector CAD
-0 belong 0xc809fe02 Bentley/Intergraph MicroStation DGN vector CAD
-0 beshort 0x0809 Bentley/Intergraph MicroStation
->0x02 byte 0xfe
->>0x04 beshort 0x1800 CIT raster CAD
# 3DS (3d Studio files)
0 leshort 0x4d4d
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.37-cad-intergraph.diff.sig
Type: application/octet-stream
Size: 95 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20190806/496d4ce2/attachment.obj>