[File] [PATCH] of Magdir/archive,console,c64 ; *.LNX; LyNX archive,Lynx cartridge
Jörg Jenderek
joerg.jen.der.ek at gmx.net
Fri Jun 2 15:38:20 UTC 2023
Hello,
some days ago I wanted to handle some Linux kernel images. I remembered
that the lnx suffix is sometimes used for them, so I searched my system
for files with that suffix.
When running file command version 5.44 with the -k option on such LNX
samples and related files, I get output like:
Berania-install.lnx: LyNX archive
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
, offset 0000, line 8205,
token (0x32)
Darkon.lnx: LyNX archive
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
, offset 0000, line 8205,
token (0x35)
, 3 last bytes 0x4c4c29
Hockey (NA).lnx: Lynx cartridge, bank 0 256k,
"hockey.lyx", "Atari"
Jimmy Conners Tennis (NA).lnx: Lynx cartridge, bank 0 512k,
"jconnort.lyx", "Atari"
Splat_and_Shout.lnx: LyNX archive
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
, offset 0000, line 8205,
token (0x31)
, 3 last bytes 0xb1fc60
Warlords.lnx: LyNX archive
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
, offset 0000, line 8205,
token (0x31)
atari-lynx-chips-challenge.lnx: Lynx cartridge, bank 0 128k,
"Atari"
helloWorld.prg: Commodore C64 BASIC program
, offset 0x0815, line 10,
token (0x99) PRINT "Hello world"
, offset 0000, line 0, token (0)
phassine.lnx: LyNX archive
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
, offset 0000, line 8205,
token (0x33)
saveroms: Commodore C64 BASIC program
, offset 0x0828, line 10,
token (0x8f) REM
********************************
, offset 0x084f, line 12,
token (0x8f) REM
* SAVEROMS *
Furthermore, with the -i option only the generic mime type
application/octet-stream is shown, or application/x-atari-lynx-rom for
Lynx cartridges. With option --extension only the 3 byte sequence ??? is
shown.
For comparison I also ran the file format identification utility
DROID (see https://sourceforge.net/projects/droid/).
None of the samples are recognized.
For comparison I also ran the file format identification utility
TrID (see https://mark0.net/soft-trid-e.html).
Samples like atari-lynx-chips-challenge.lnx are described with highest
priority as "Atari Lynx ROM" with name suffix LNX by lnx.trid.xml.
The other LNX samples are all described correctly, with low priority, as
"Commodore 64 BASIC V2 program" with suffix PRG by prg-c64.trid.xml.
These LNX variants are described with highest priority as "Lynx archive"
with the correct suffix LNX by ark-lnx.trid.xml (see appended
trid-v-lnx.txt.gz).
TrID lists the file name extensions used and, with the -v option, often
the related URL pointing to information about the file format.
With the help of these tools I added more lines. The TrID reference is
now recorded inside Magdir/console by additional comment lines like:
# Reference: http://mark0.net/download/triddefs_xml.7z
# defs/l/lnx.trid.xml
This starts with lines like:
0 string LYNX Lynx cartridge
!:mime application/x-atari-lynx-rom
Now, according to TrID, I show the correct suffix by a line like:
!:ext lnx
Afterwards the page sizes of the banks are shown. For the first bank I
get values like 128, 256 or 512, and for the second bank I got no values.
This is done by lines like:
>4 leshort/4 >0 \b, bank 0 %dk
>6 leshort/4 >0 \b, bank 1 %dk
Afterwards, if available, the 32 byte cart name like "jconnort.lyx",
"viking~1.lyx", "Eye of the Beholder" or "C:\EMU\LYNX\ROMS\ULTCHESS.LYX"
is shown by a line like:
>10 string >\0 \b, "%.32s"
Afterwards, if available, the 16 byte manufacturer name like "Atari",
"NuFX Inc." or "Matthias Domin" is shown by a line like:
>42 string >\0 \b, "%.16s"
Only the 4 byte ASCII-like start magic is checked. Just in case more is
needed, I looked at further header fields according to source exehdr.s.
The version number seems to be always 1 and the last bytes in the header
are apparently nil. So these facts are handled by commented-out lines
like:
#>8 leshort !1 \b, version number %u
#>59 lelong !0 \b, spare %#x
Most examples are not rotated, but example "Lexis (NA).lnx" is left
rotated and example "Centipede (Prototype).lnx" is right rotated. So I
also show this information by a line like:
>58 ubyte >0 \b, rotation %u
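The header fields tested by these magic lines can also be read outside of
file. The following is only a minimal Python sketch, assuming the offsets
used above (page sizes at 4 and 6, version at 8, cart name at 10,
manufacturer at 42, rotation at 58); the function name parse_lynx_header
and the dictionary keys are made up for illustration and do not come from
exehdr.s.

```python
import struct

def parse_lynx_header(data: bytes):
    """Decode the Lynx cartridge header fields checked by the magic lines.

    Returns None when the 4 byte start magic "LYNX" is missing or the
    buffer is too short to hold the rotation byte at offset 58.
    """
    if len(data) < 59 or data[:4] != b"LYNX":
        return None
    # three little-endian 16 bit values: bank 0 and bank 1 page size, version
    bank0, bank1, version = struct.unpack_from("<HHH", data, 4)

    def text(raw: bytes) -> str:
        # fixed-size field, padded with nil bytes
        return raw.split(b"\0", 1)[0].decode("latin-1")

    return {
        "bank0_k": bank0 // 4,   # same division as "leshort/4" in the magic
        "bank1_k": bank1 // 4,
        "version": version,
        "cart_name": text(data[10:42]),      # 32 bytes
        "manufacturer": text(data[42:58]),   # 16 bytes
        "rotation": data[58],                # 0 none, 1 left, 2 right
    }
```

For a header with page size 1024 in bank 0 this reports 256, which is the
"bank 0 256k" printed for Hockey (NA).lnx.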
With the help of the TrID tools I found a page about LNX (LyNX
containers). This is now expressed inside Magdir/archive by additional
comment lines like:
# URL: http://fileformats.archiveteam.org/wiki/Lynx_archive
# Reference: http://ist.uwaterloo.ca/~schepers/formats/LNX.TXT
# http://mark0.net/download/triddefs_xml.7z
# defs/a/ark-lnx.trid.xml
According to that documentation the archive starts with a small BASIC
program which, when loaded and run, displays the message "Use LYNX to
dissolve this file". That was already used by the file command as test
criterion. According to the documentation this BASIC program looks like:
10 POKE53280,0:
POKE53281,0:
POKE646,PEEK(162):
PRINT"<CLS><DOWN><DOWN><DOWN><DOWN><DOWN><DOWN><DOWN><DOWN>":
PRINT" USE LYNX TO DISSOLVE THIS FILE":
GOTO10
But in the current version the output (via Magdir/c64) shows only the
starting phrase "POKE 53280,0", because sub routine basic-line interprets
the tokenized POKE command (97h=0227) by lines like:
>4 string \x97 POKE
>>5 regex \^[0-9,\040]+ %s
With the above regular expression only the POKE arguments <Memory
address>,<number> are grabbed and shown, but several BASIC commands can
be put on the same line, separated by the colon character (:=3Ah=072).
So I show the rest by additional lines afterwards like:
>>>>&-2 ubyte =0x3A with
>>>>>&0 string x "%s"
This can produce many columns of output, because a whole BASIC line is
limited to about 256 bytes and is terminated by a \0 character. So for
the LNX archives I now get a string here like:
\22753281,0:
\227646,\302(162):
\231"\223\021\021\021\021\021\021\021\021":
\231" USE LYNX TO DISSOLVE THIS FILE":
\21110
Now we see that this is the mentioned BASIC program in tokenized form.
That means the BASIC keywords are replaced by 1 or more bytes according
to a table that is specific to each BASIC variant. The TrID command also
checks for these BASIC byte sequences. But according to the
documentation there may exist samples which contain, for example, a
different number of spaces, although I myself found no such examples.
For those the magic lines in Magdir/archive may have to be adapted.
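To make the tokenized form easier to read, the few tokens that occur in
this mail can be expanded again. This is only a small Python sketch: the
table holds just the token values quoted here (89h GOTO, 8Dh GOSUB, 8Fh
REM, 97h POKE, 99h PRINT, 9Eh SYS, C2h PEEK), not the full BASIC V2
table, and the name detokenize is made up. It assumes that bytes inside
double quotes are characters and not tokens.

```python
# Token values as quoted in the magic lines and octal escapes of this mail.
TOKENS = {
    0x89: "GOTO",
    0x8D: "GOSUB",
    0x8F: "REM",
    0x97: "POKE",
    0x99: "PRINT",
    0x9E: "SYS",
    0xC2: "PEEK",
}

def detokenize(line: bytes) -> str:
    """Expand known tokens of one BASIC line (without its \\0 terminator)."""
    out = []
    in_quotes = False
    for b in line:
        if b == 0x22:                    # double quote toggles string mode
            in_quotes = not in_quotes
        if not in_quotes and b in TOKENS:
            out.append(TOKENS[b])
        elif 0x20 <= b < 0x7F:           # printable ASCII is kept as is
            out.append(chr(b))
        else:                            # control codes and unknown tokens
            out.append("<%02X>" % b)
    return "".join(out)

line = b'\x9753280,0:\x9753281,0:\x97646,\xc2(162):\x99"\x93\x11":\x8910'
print(detokenize(line))
# POKE53280,0:POKE53281,0:POKE646,PEEK(162):PRINT"<93><11>":GOTO10
```

The bytes 93h and 11h inside the quotes are the <CLS> and <DOWN> control
characters from the documented listing.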
Inside Magdir/c64 the samples are described as "Commodore C64 BASIC
program" by c64-prg. This interprets the first 2 BASIC fragments via sub
routine basic-line. That first displays the offset to the next fragment,
then the BASIC line number and then the BASIC line content, typically
starting with a token.
An offset value 0000 indicates the end of the program, as in sample
helloWorld.prg. There the end of the program is also the end of the file.
That means interpreting the bytes afterwards as a BASIC line is not
useful and gives garbage. Normally, for real BASIC programs, the last 3
bytes of the file are a nil character as BASIC line terminator and 2
bytes 0000 for the offset. This is checked and displayed by magic lines
like:
>-3 ubyte !0 \b, 3 last bytes %#2.2x
>>-2 ubeshort x \b%4.4x
So for real BASIC examples like helloWorld.prg I get after the offset
part a useless phrase like "line 0, token (0)". For LNX samples appended
data follows the BASIC program, so in many cases the last 3 bytes are
not nil. For sample Darkon.lnx I get 0x4c4c29 and for
Splat_and_Shout.lnx I get 0xb1fc60. Here after the offset part I get a
garbage/misleading phrase like "line 8205, token (0x35)" or "line 8205,
token (0x31)"; 8205 is just the data bytes 0x0d 0x20 read as a
little-endian number, and the "token" is the ASCII digit that follows.
So I must insert an additional line that checks for a non-nil offset and
only then continues as before. Sub routine basic-line now becomes:
0 name basic-line
>0 uleshort x \b, offset %#4.4x
>0 uleshort >0
>>2 uleshort x \b, line %u
>>4 ubyte x \b, token (%#x)
...
For LNX the data comes at this point (not a line number, not a token,
...). So I show it in hexadecimal form. Just for interest or debugging
purposes it can also be shown as a string. Samples with an appended data
part are now handled by an additional branch which looks like:
>0 uleshort 0
>>2 ubeshort x \b, data %#4.4x
>>4 ubeshort x \b%4.4x
>>6 ubequad x \b%16.16llx
#>>3 string x "%-0.30s"
When interpreting the data bytes as a string I get something like:
Berania-install.lnx: " 2 *LYNX BY CBMCONVERT 2.0*"
Darkon.lnx: " 5 LYNX IX BY WILL CORLEY"
Splat_and_Shout.lnx: " 1 *LYNX XII BY WILL CORLEY"
Warlords.lnx: " 1 *STAR LYNX 0.72 JOE/STA"
phassine.lnx: " 3 *LYNX BY CBMCONVERT 2.0*"
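The way basic-line follows the offsets and then reaches the appended data
can be sketched in Python. This is not what file does internally, only
an illustration under the assumptions described above: the first 2 bytes
are the load address (0x0801), each BASIC line starts with a 2 byte
little-endian pointer to the next line and a 2 byte line number, and a
pointer of 0000 ends the program. The name walk_basic is made up.

```python
import struct

def walk_basic(prg: bytes):
    """Return (lines, trailing) for a C64 program image.

    lines is a list of (line_number, tokenized_bytes); trailing holds
    whatever follows the 0000 end-of-program offset (empty for a plain
    BASIC program, the LyNX directory for an LNX archive).
    """
    if len(prg) < 2:
        return [], b""
    load_addr, = struct.unpack_from("<H", prg, 0)
    pos = 2
    lines = []
    while pos + 2 <= len(prg):
        next_addr, = struct.unpack_from("<H", prg, pos)
        if next_addr == 0:                 # end of program
            return lines, prg[pos + 2:]
        end = prg.find(b"\0", pos + 4)     # BASIC line terminator
        if end < 0:
            break
        line_no, = struct.unpack_from("<H", prg, pos + 2)
        lines.append((line_no, prg[pos + 4:end]))
        new_pos = next_addr - load_addr + 2   # memory address -> file offset
        if new_pos <= pos:                 # broken link, avoid looping
            break
        pos = new_pos
    return lines, b""
```

For a plain program like helloWorld.prg the trailing part is empty. For
an LNX archive it starts with the Carriage Return and space that the old
magic misread as line number 8205.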
Now we can check the recognition as "LyNX archive". This is done with
strength 330 inside Magdir/archive by a line like:
56 string USE\040LYNX\040TO\040DISSOLVE\040THIS\040FILE LyNX archive
So this part of the text is shown by BASIC PRINT. I am not sure that
this always holds, because according to the documentation there may
exist variants where a different number of spaces is used. So this magic
string may occur at other offsets in unusual cases, but for the moment I
just keep this test line. Afterwards I now show the file name suffix and
a user defined mime type instead of the generic
application/octet-stream. This is done by additional lines like:
!:mime application/x-commodore-lnx
!:ext lnx
When handling dozens of such LNX samples I want some more information in
order to distinguish the LNX files. So afterwards I look for the
tokenized BASIC GOTO (89h) 10, line terminator \0, end of program tag
\0\0 and Carriage Return by magic lines like:
>86 search/10 \x8910\0\0\0\r \b,
#>>&0 string x STRING="%s"
For debugging purposes the data bytes after the BASIC program can be
looked at as a string, as done in Magdir/c64. According to the
documentation I look for the number of directory blocks (with values
like 1 2 3 5) in ASCII with spaces on both sides by the next additional
line like:
>>&0 regex [0-9]{1,5} %s directory blocks
Afterwards I show the signature, like "*LYNX XII BY WILL CORLEY",
" LYNX IX BY WILL CORLEY" or "*LYNX BY CBMCONVERT 2.0*", by a line like:
>>>&2 regex [^\r]{1,24} \b, signature "%s"
Afterwards I show the number of files (with values like 2 3 6 13 69;
144 = maximum?) in ASCII, surrounded by spaces and delimited by Carriage
Return, via a line like:
>>>>&1 regex [0-9]{1,3} \b, %s files
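The three values picked out by these magic lines can be extracted in one
step. The following Python sketch assumes the layout described above:
after the end of the BASIC loader comes a Carriage Return, the number of
directory blocks, the signature, a Carriage Return, the number of files
and another Carriage Return. The names END_OF_LOADER, HEADER and
parse_lnx_header are made up, and unlike the magic line the search is not
limited to 10 bytes starting at offset 86.

```python
import re

# tokenized GOTO (89h) "10", line terminator, 0000 end-of-program tag, CR
END_OF_LOADER = b"\x8910\x00\x00\x00\r"

HEADER = re.compile(
    rb" *(\d{1,5}) {0,2}"   # directory blocks, then up to 2 spaces skipped
    rb"([^\r]*)\r"          # signature up to the next Carriage Return
    rb" *(\d{1,3}) *\r"     # number of files surrounded by spaces
)

def parse_lnx_header(data: bytes):
    """Return (directory_blocks, signature, files) or None."""
    start = data.find(END_OF_LOADER)
    if start < 0:
        return None
    m = HEADER.match(data, start + len(END_OF_LOADER))
    if m is None:
        return None
    blocks, signature, files = m.groups()
    return int(blocks), signature.decode("latin-1"), int(files)
```

Like the &2 offset in the magic line, at most 2 spaces are skipped after
the block count, so a signature such as " LYNX IX BY WILL CORLEY" keeps
its leading space.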
After applying the above mentioned modifications by patches
file-5.44-console-lnx.diff, file-5.44-c64-lnx.diff and
file-5.44-archive-lnx.diff, some more details are shown, and with the -k
option I now get output like:
Berania-install.lnx: LyNX archive, 2 directory blocks,
signature "*LYNX BY CBMCONVERT 2.0*",
6 files
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
"\22753281,0:
\227646,\302(162):
\231"\223\021\021\021\021\021\021\021\021":
\231" USE LYNX TO DISSOLVE THIS FILE":
\21110"
, offset 0000, data
0x0d203220202a4c594e582042592043424d434f4e...
Darkon.lnx: LyNX archive, 5 directory blocks,
signature " LYNX IX BY WILL CORLEY",
69 files
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
"\22753281,0:
\227646,\302(162):
\231"\223\021\021\021\021\021\021\021\021":
\231" USE LYNX TO DISSOLVE THIS FILE":
\21110"
, offset 0000, data
0x0d20352020204c594e5820495820204259205749...
, 3 last bytes 0x4c4c29
Hockey (NA).lnx: Lynx cartridge, bank 0 256k
, "hockey.lyx", "Atari"
Jimmy Conners Tennis (NA).lnx: Lynx cartridge, bank 0 512k
, "jconnort.lyx", "Atari"
Splat_and_Shout.lnx: LyNX archive, 1 directory blocks,
signature "*LYNX XII BY WILL CORLEY",
2 files
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
"\22753281,0:
\227646,\302(162):
\231"\223\021\021\021\021\021\021\021\021":
\231" USE LYNX TO DISSOLVE THIS FILE":
\21110"
, offset 0000, data
0x0d203120202a4c594e5820584949204259205749...
, 3 last bytes 0xb1fc60
Warlords.lnx: LyNX archive, 1 directory blocks,
signature "*STAR LYNX 0.72 JOE/STA",
3 files
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
"\22753281,0:
\227646,\302(162):
\231"\223\021\021\021\021\021\021\021\021":
\231" USE LYNX TO DISSOLVE THIS FILE":
\21110"
, offset 0000, data
0x0d203120202a53544152204c594e5820302e3732...
atari-lynx-chips-challenge.lnx: Lynx cartridge, bank 0 128k
, "Atari"
helloWorld.prg: Commodore C64 BASIC program
, offset 0x0815, line 10,
token (0x99) PRINT "Hello world"
, offset 0000, data
0000000000000000000000000000000000000000...
phassine.lnx: LyNX archive, 3 directory blocks,
signature "*LYNX BY CBMCONVERT 2.0*",
13 files
Commodore C64 BASIC program
, offset 0x085b, line 10,
token (0x97) POKE 53280,0
"\22753281,0:
\227646,\302(162):
\231"\223\021\021\021\021\021\021\021\021":
\231" USE LYNX TO DISSOLVE THIS FILE":
\21110"
, offset 0000, data
0x0d203320202a4c594e582042592043424d434f4e...
saveroms: Commodore C64 BASIC program
, offset 0x0828, line 10,
token (0x8f) REM
********************************
, offset 0x084f, line 12,
token (0x8f) REM
* SAVEROMS *
I hope my diff files can be applied in a future version of the file
utility. Maybe "LyNX archive" in Magdir/archive and "Commodore C64
BASIC program" in Magdir/c64 should be merged. Furthermore, besides
some Linux kernels, I found more samples with the LNX file name suffix
but in other file formats.
With best wishes
Jörg Jenderek
--
Jörg Jenderek
-------------- next part --------------
--- file-5.44\magic\Magdir\console.old Mon Dec 26 19:00:47 2022
+++ file-5.44\magic\Magdir\console Mon May 29 21:36:56 2023
@@ -697,12 +697,25 @@
>6 string BS93 Lynx homebrew cartridge
!:mime application/x-atari-lynx-rom
>>2 beshort x \b, RAM start $%04x
+# Update: Joerg Jenderek
+# Reference: http://mark0.net/download/triddefs_xml.7z/defs/l/lnx.trid.xml
+# Note: called "Atari Lynx ROM" by TrID
0 string LYNX Lynx cartridge
!:mime application/x-atari-lynx-rom
+!:ext lnx
+# bank 0 page size like: 128 256 512
>4 leshort/4 >0 \b, bank 0 %dk
>6 leshort/4 >0 \b, bank 1 %dk
+# 32 bytes cart name like: "jconnort.lyx" "viking~1.lyx" "Eye of the Beholder" "C:\EMU\LYNX\ROMS\ULTCHESS.LYX"
>10 string >\0 \b, "%.32s"
+# 16 bytes manufacturer like: "Atari" "NuFX Inc." "Matthias Domin"
>42 string >\0 \b, "%.16s"
+# version number
+#>8 leshort !1 \b, version number %u
+# rotation: 1~left Lexis (NA).lnx 2~right Centipede (Prototype).lnx
+>58 ubyte >0 \b, rotation %u
+# spare
+#>59 lelong !0 \b, spare %#x
# Opera file system that is used on the 3DO console
# From: Serge van den Boom <svdb at stack.nl>
-------------- next part --------------
--- file-5.44\magic\Magdir\archive.old Mon Dec 26 19:00:47 2022
+++ file-5.44\magic\Magdir\archive Fri Jun 02 00:18:55 2023
@@ -2124,7 +2124,28 @@
>3 byte x version %d
# LyNX archive
+# Update: Joerg Jenderek
+# URL: http://fileformats.archiveteam.org/wiki/Lynx_archive
+# Reference: http://ist.uwaterloo.ca/~schepers/formats/LNX.TXT
+# http://mark0.net/download/triddefs_xml.7z/defs/a/ark-lnx.trid.xml
+# Note: called "Lynx archive" by TrID and "Commodore C64 BASIC program" with "POKE 53280" by ./c64
+# TODO: merge and unify with Commodore C64 BASIC program
56 string USE\040LYNX\040TO\040DISSOLVE\040THIS\040FILE LyNX archive
+# display "Lynx archive" (strength=330) before Commodore C64 BASIC program (strength=50) handled by ./c64
+#!:strength +0
+#!:mime application/octet-stream
+!:mime application/x-commodore-lnx
+!:ext lnx
+# afterwards look for BASIC tokenized GOTO (89h) 10, line terminator \0, end of programm tag \0\0 and CarriageReturn
+>86 search/10 \x8910\0\0\0\r \b,
+# for DEBUGGING
+#>>&0 string x STRING="%s"
+# number in ASCII of directory blocks with spaces on both sides like: 1 2 3 5
+>>&0 regex [0-9]{1,5} %s directory blocks
+# signature like: "*LYNX XII BY WILL CORLEY" " LYNX IX BY WILL CORLEY" "*LYNX BY CBMCONVERT 2.0*"
+>>>&2 regex [^\r]{1,24} \b, signature "%s"
+# number of files in ASCII surrounded by spaces and delimited by CR like: 2 3 6 13 69 144 (maximum?)
+>>>>&1 regex [0-9]{1,3} \b, %s files
# From: Joerg Jenderek
# URL: https://www.acronis.com/
-------------- next part --------------
--- file-5.44\magic\Magdir\c64.old Wed Nov 30 00:04:06 2022
+++ file-5.44\magic\Magdir\c64 Fri Jun 02 00:46:17 2023
@@ -203,6 +203,8 @@
# TODO: unify Commodore BASIC/program sub routines
# Note: "PUCrunch archive data" moved from ./archive and merged with c64-exe
0 leshort 0x0801
+# display Commodore C64 BASIC program (strength=50) after "Lynx archive" (strength=330) handled by ./archive
+#!:strength +0
# if first token is not SYS this implies BASIC program in most cases
>6 ubyte !0x9e
# but sELF-ExTRACTING-zIP executable unzp6420.prg contains SYS token at end of second BASIC line (at 0x35)
@@ -499,33 +501,49 @@
# pointer to memory address of beginning of "next" BASIC line
# greater then previous offset but maximal 100h difference
>0 uleshort x \b, offset %#4.4x
+# offset 0x0000 indicates the end of BASIC program; so bytes afterwards may be some other data
+>0 uleshort 0
+# not line number but first 2 data bytes
+>>2 ubeshort x \b, data %#4.4x
+# not token but next 2 data bytes
+>>4 ubeshort x \b%4.4x
+# not token arguments but next data bytes
+>>6 ubequad x \b%16.16llx
+>>14 ubequad x \b%16.16llx...
+# like 0x0d20352020204c594e5820495820204259205749 "\r 5 LYNX IX BY WILL CORLEY" for LyNX archive Darkon.lnx handled by ./archive
+#>>3 string x "%-0.30s"
+>0 uleshort >0
# BASIC line number with range from 0 to 65520; practice to increment numbers by some value (5, 10 or 100)
->2 uleshort x \b, line %u
+>>2 uleshort x \b, line %u
# https://www.c64-wiki.com/wiki/BASIC_token
# The "high-bit" bytes from #128-#254 stood for the various BASIC commands and mathematical operators
->4 ubyte x \b, token (%#x)
+>>4 ubyte x \b, token (%#x)
# https://www.c64-wiki.com/wiki/REM
->4 string \x8f REM
+>>4 string \x8f REM
# remark string like: ** SYNTHESIZER BY RICOCHET **
->>5 string >\0 %s
-#>>>&1 uleshort x \b, NEXT OFFSET %#4.4x
+>>>5 string >\0 %s
+#>>>>&1 uleshort x \b, NEXT OFFSET %#4.4x
# https://www.c64-wiki.com/wiki/PRINT
->4 string \x99 PRINT
+>>4 string \x99 PRINT
# string like: "Hello world" "\021 \323ELF-E\330TRACTING-\332IP (64 ONLY)\016\231":\2362141
->>5 string x %s
-#>>>&0 ubequad x AFTER_PRINT=%#16.16llx
+>>>5 string x %s
+#>>>>&0 ubequad x AFTER_PRINT=%#16.16llx
# https://www.c64-wiki.com/wiki/POKE
->4 string \x97 POKE
+>>4 string \x97 POKE
# <Memory address>,<number>
->>5 regex \^[0-9,\040]+ %s
+>>>5 regex \^[0-9,\040]+ %s
+# BASIC command delimiter colon (:=3Ah)
+>>>>&-2 ubyte =0x3A
+# after BASIC command delimiter colon remaining (<255) other tokenized BASIC commands
+>>>>>&0 string x "%s"
# https://www.c64-wiki.com/wiki/SYS 0x9e=\236
->4 string \x9e SYS
+>>4 string \x9e SYS
# SYS <Address> parameter is a 16-bit unsigned integer; in the range 0 - 65535
->>5 regex \^[0-9]{1,5} %s
+>>>5 regex \^[0-9]{1,5} %s
# maybe followed by spaces, "control-characters" or colon (:) followed by next commnds or in victracker.prg
# (\302(43)\252256\254\302(44)\25236) /T.L.R/
-#>>5 string x SYS_STRING="%s"
+#>>>5 string x SYS_STRING="%s"
# https://www.c64-wiki.com/wiki/GOSUB
->4 string \x8d GOSUB
+>>4 string \x8d GOSUB
# <line>
->>5 string >\0 %s
+>>>5 string >\0 %s
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.44-console-lnx.diff.sig
Type: application/octet-stream
Size: 95 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230602/d07d50d7/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.44-archive-lnx.diff.sig
Type: application/octet-stream
Size: 95 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230602/d07d50d7/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file-5.44-c64-lnx.diff.sig
Type: application/octet-stream
Size: 95 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230602/d07d50d7/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: trid-v-lnx.txt.gz
Type: application/x-gzip
Size: 770 bytes
Desc: not available
URL: <https://mailman.astron.com/pipermail/file/attachments/20230602/d07d50d7/attachment.bin>