[File] suggestion or maybe a documentation point

Christos Zoulas christos at zoulas.com
Thu Mar 30 13:45:39 UTC 2023



> On Mar 7, 2023, at 5:15 AM, Patrice Duroux <patrice.duroux at gmail.com> wrote:
> 
> Hi,
> 
> Goal: also get the character encoding and EOL style for script files, such as a Perl script.
> 
> Currently:
> ```
> $ file -v
> file-5.44
> magic file from /etc/magic:/usr/share/misc/magic
> $ file test.cgi
> test.cgi: Perl script text executable
> $ cat test.cgi
> #!/usr/bin/perl
> 
> print "€\n";
> ```
> 
> The man page says:
> ```
>     If a file does not match any of the entries in the magic file, it is
>     examined to see if it seems to be a text file.  ASCII, ISO-8859-x,
>     non-ISO 8-bit extended-ASCII character sets (such as those used on
>     Macintosh and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded
>     Unicode, and EBCDIC character sets can be distinguished by the
>     different ranges and sequences of bytes that constitute printable
>     text in each set.  If a file passes any of these tests, its character
>     set is reported.  ASCII, ISO-8859-x, UTF-8, and extended-ASCII files
>     are identified as “text” because they will be mostly readable on
>     nearly any terminal; UTF-16 and EBCDIC are only “character data”
>     because, while they contain text, it is text that will require
>     translation before it can be read.  In addition, file will attempt to
>     determine other characteristics of text-type files.  If the lines of
>     a file are terminated by CR, CRLF, or NEL, instead of the
>     Unix-standard LF, this will be reported.  Files that contain embedded
>     escape sequences or overstriking will also be identified.
> ```
> 
> So if I understand correctly, it behaves like the following pseudo-code:
> magic(input) || text(input)
> 
> Would it be possible to have an option (perhaps '--force-text') that
> gives something like:
> magic(input) ; text(input)
> or maybe just ('--no-magic'?):
> text(input)
> 
> Admittedly, I did not try creating a pseudo (empty) magic file and
> passing it with the -m option.
> 
> If this point has already been addressed (probably many times) and
> solved in some way, could it be added to the examples section of the
> man page?
> 

Can't you just run 'file -e soft'?
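
A minimal sketch of what that could look like, assuming a file(1) build that accepts `soft` as a test name for `-e` and also supports `--mime-encoding`. The temporary directory and the `enc` variable are illustrative only, and the exact wording of the output varies between versions, so none is shown here:

```shell
#!/bin/sh
# Recreate a small UTF-8 Perl script like the one in the report.
# \342\202\254 is the UTF-8 byte sequence for the euro sign.
tmp=$(mktemp -d)
printf '#!/usr/bin/perl\n\nprint "\342\202\254\\n";\n' > "$tmp/test.cgi"

# Default behaviour: a soft magic entry matches first, so only the
# script type is reported.
file "$tmp/test.cgi"

# Exclude the soft magic tests so the text tests get to run.
file -e soft "$tmp/test.cgi"

# Possible alternative if only the character encoding is wanted.
enc=$(file -b --mime-encoding "$tmp/test.cgi")
echo "$enc"

rm -r "$tmp"
```

With soft magic skipped, the text tests described in the quoted man page excerpt get a chance to run, and that is where the character set and any non-LF line terminators are reported.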

christos
