[File] [PATCH] Recognize SOSI map data (Was: Advice on writing magic entry for SOSI map format?)

Petter Reinholdtsen pere at hungry.com
Thu May 9 12:12:25 UTC 2019


Is this the wrong mailing list to ask for advice on writing magic
entries?  I've seen no replies so far, and wonder if I am using the
wrong contact point.

Anyway, here is my current patch, which seems to work fairly well.  For
some reason ", ISO-8859 text, with CRLF line terminators" is appended to
the match from my magic rule, so I dropped the reporting of SOSI charset
(..TEGNSETT), to avoid duplicate info in the output from file.

The match is not foolproof, but seems to match all the test files I have
used so far.  It is possible to construct a valid SOSI file not matched
by this rule, though.
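
To make the logic of the rule easier to review, here is a rough Python
paraphrase of what the patch below checks.  It is only a sketch, not how
file(1) evaluates magic: the function names, the sample bytes and the
choice to try both ISO-8859-1 and UTF-8 spellings of "Å" are assumptions
made for illustration.  The second helper shows the trailing ".SLUTT"
check that the magic rule does not yet enforce.

```python
from typing import Optional

# Keyword containing "Å"; trying two encodings is an assumption of this sketch.
OMRAADE = ("..OMRÅDE".encode("latin-1"), "..OMRÅDE".encode("utf-8"))


def sosi_version(data: bytes) -> Optional[str]:
    """Return None if the checks fail, else the version ("" if not found)."""
    if not any(key in data for key in OMRAADE):
        return None
    if b"..TRANSPAR" not in data:
        return None
    head = data.find(b".HODE")
    if head < 0:
        return None
    # The version keyword is only looked for after the ".HODE" match.
    key = b"..SOSI-VERSJON"
    pos = data.find(key, head + len(b".HODE"))
    if pos < 0:
        return ""  # still counts as SOSI map data, just without a version
    # Skip the keyword and one separator byte, then read to the line end.
    rest = data[pos + len(key) + 1:]
    lines = rest.splitlines()
    return lines[0].decode("latin-1").strip() if lines else ""


def ends_with_slutt(data: bytes) -> bool:
    """Check for "\\n.SLUTT" followed only by optional separators."""
    return data.rstrip(b" \t\r\n").endswith(b"\n.SLUTT")


# Hypothetical minimal input, not taken from a real SOSI file.
sample = (b".HODE\r\n..TEGNSETT ISO8859-10\r\n..TRANSPAR\r\n"
          b"..OMR\xc5DE\r\n..SOSI-VERSJON 4.0\r\n.SLUTT\r\n")
print(sosi_version(sample), ends_with_slutt(sample))
```

As in the rule, the first two keywords may appear anywhere in the data
and in any order, so a file lacking either of them is not recognized.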

diff --git a/magic/Makefile.am b/magic/Makefile.am
index 244eebdc..21312cb4 100644
--- a/magic/Makefile.am
+++ b/magic/Makefile.am
@@ -256,6 +256,7 @@ $(MAGIC_FRAGMENT_DIR)/smalltalk \
 $(MAGIC_FRAGMENT_DIR)/smile \
 $(MAGIC_FRAGMENT_DIR)/sniffer \
 $(MAGIC_FRAGMENT_DIR)/softquad \
+$(MAGIC_FRAGMENT_DIR)/sosi \
 $(MAGIC_FRAGMENT_DIR)/spec \
 $(MAGIC_FRAGMENT_DIR)/spectrum \
 $(MAGIC_FRAGMENT_DIR)/sql \
diff --git a/magic/Magdir/sosi b/magic/Magdir/sosi
new file mode 100644
index 00000000..afbd7d6c
--- /dev/null
+++ b/magic/Magdir/sosi
@@ -0,0 +1,36 @@
+#------------------------------------------------------------------------------
+# $File: $
+# SOSI
+# Summary: Systematic Organization of Spatial Information
+# Long description: Norwegian text based map format
+# File extension: .sos
+# Full name:    Petter Reinholdtsen (pere at hungry.com)
+# Reference: https://en.wikipedia.org/wiki/SOSI
+#
+# Example SOSI files available from
+# https://trac.osgeo.org/gdal/ticket/3638
+# https://nedlasting.geonorge.no/geonorge/Basisdata/N50Kartdata/SOSI/
+# https://nedlasting.geonorge.no/geonorge/Samferdsel/Elveg/SOSI/
+#
+# Files start with optional comments (from "!" to the next line end)
+# followed by ".HODE", and end with "\n.SLUTT" followed by an optional
+# separator (any number of " ", "\t", "\n" or "\r").  A BOM may precede
+# the content.  Near the start, after ".HODE", come "..OMRÅDE",
+# "..TRANSPAR", "..TEGNSETT " followed by the charset and a separator,
+# as well as "..SOSI-VERSJON " followed by the format version and a
+# separator.
+#
+# FIXME figure out how to accept any of [space], [tab], [newline] and
+# [carriage return] as separators, not only line end.
+
+0      search  ..OMRÅDE
+>0     search  ..TRANSPAR
+>>0       search  .HODE           SOSI map data
+>>>&0      search  ..SOSI-VERSJON
+>>>>&1 string  x               \b, version %s
+# FIXME could not figure out way to make a match for .SLUTT at the end required
+#>-7      string  \n.SLUTT     slutt
+#>-8      string  \n.SLUTT\n   slutt-nl
+#>-9      string  \n.SLUTT\r\n slutt-crnl2
+!:mime application/vnd.sosi
+!:ext sos

-- 
Happy hacking
Petter Reinholdtsen

