[File] suggestion or maybe a documentation point
Patrice Duroux
patrice.duroux at gmail.com
Tue Mar 7 09:15:29 UTC 2023
Hi,
Goal: also get the character encoding and EOL style of script files, such as a Perl script.
Currently:
```
$ file -v
file-5.44
magic file from /etc/magic:/usr/share/misc/magic
$ file test.cgi
test.cgi: Perl script text executable
$ cat test.cgi
#!/usr/bin/perl
print "€\n";
```
The man page says:
```
If a file does not match any of the entries in the magic file, it is
examined to see if it seems to be a text file. ASCII, ISO-8859-x,
non-ISO 8-bit extended-ASCII character sets (such as those used on
Macintosh and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded
Unicode, and EBCDIC character sets can be distinguished by the
different ranges and sequences of bytes that constitute printable text
in each set. If a file passes any of these tests, its character set is
reported. ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are
identified as “text” because they will be mostly readable on nearly
any terminal; UTF-16 and EBCDIC are only “character data” because,
while they contain text, it is text that will require translation
before it can be read. In addition, file will attempt to determine
other characteristics of text-type files. If the lines of a file are
terminated by CR, CRLF, or NEL, instead of the Unix-standard LF, this
will be reported. Files that contain embedded escape sequences or
overstriking will also be identified.
```
So if I understand correctly, the behaviour is like the following pseudo-code, where the text tests run only when no magic entry matches:
magic(input) || text(input)
Would it be possible then to have an option to get something like
('--force-text'?):
magic(input) ; text(input)
or maybe just ('--no-magic'?):
text(input)
Admittedly, I have not tried creating a pseudo (empty) magic file and
passing it with the -m option.
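To make the three variants concrete, here is a minimal Python sketch. It is only an illustration of the pseudo-code above, not how file is implemented: the functions magic() and text() are hypothetical stand-ins (the first only knows Perl shebang lines, the second only distinguishes ASCII from UTF-8 and looks for CR/CRLF), and the description strings are illustrative rather than file's exact output.

```python
from typing import Optional


def magic(data: bytes) -> Optional[str]:
    """Hypothetical stand-in for the magic-file tests (Perl shebang only)."""
    first_line = data.split(b"\n", 1)[0]
    if first_line.startswith(b"#!") and b"perl" in first_line:
        return "Perl script text executable"
    return None


def text(data: bytes) -> Optional[str]:
    """Hypothetical stand-in for the text tests: charset and line endings."""
    try:
        data.decode("ascii")
        charset = "ASCII"
    except UnicodeDecodeError:
        try:
            data.decode("utf-8")
            charset = "UTF-8"
        except UnicodeDecodeError:
            return None  # this simplified sketch knows no other charsets
    description = f"{charset} text"
    if b"\r\n" in data:
        description += ", with CRLF line terminators"
    elif b"\r" in data:
        description += ", with CR line terminators"
    return description


def magic_or_text(data: bytes) -> Optional[str]:
    """Current behaviour as I understand it: magic(input) || text(input)."""
    return magic(data) or text(data)


def magic_then_text(data: bytes) -> Optional[str]:
    """Suggested '--force-text' behaviour: magic(input) ; text(input)."""
    results = [r for r in (magic(data), text(data)) if r is not None]
    return ", ".join(results) if results else None


def text_only(data: bytes) -> Optional[str]:
    """Suggested '--no-magic' behaviour: text(input)."""
    return text(data)


# The test.cgi file from above, encoded as UTF-8.
sample = '#!/usr/bin/perl\nprint "€\\n";\n'.encode("utf-8")

for variant in (magic_or_text, magic_then_text, text_only):
    print(f"{variant.__name__}: {variant(sample)}")
```

With the first variant the encoding information is lost as soon as a magic entry matches, which is exactly what I see for test.cgi; the other two variants would keep it.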
If this has already been raised (probably many times) and solved in
some way, could the solution be added to the examples section of the
man page?
Many thanks,
Patrice