[Tcsh] "Readable" Unicode in setenv
Jamie Landeg-Jones
jamie at catflap.org
Tue Nov 16 10:52:58 UTC 2021
"H.Merijn Brand" <tcsh at tux.freedom.nl> wrote:
> I expect \u20AC (not interpreted), but yes, this is becoming a gray
> area where expectations might/will differ and DWIM is not the same
> for all.
>
> > I mean, quotes are optional for strings in shell, and that makes life complicated :-)
> >
> > christos
Rather than altering anything that already exists, have you thought of adding
a new quoting type, e.g. like the Bourne shell format $'xxxx'?
This would remove the ambiguity, and would also be easy to remember for those
used to sh. It already covers escaping and Unicode code points:
From the FreeBSD sh(1) man page:
| Quoting
| Quoting is used to remove the special meaning of certain characters or
| words to the shell, such as operators, whitespace, keywords, or alias
| names.
|
| There are four types of quoting: matched single quotes, dollar-single
| quotes, matched double quotes, and backslash.
|
| Single Quotes
| Enclosing characters in single quotes preserves the literal
| meaning of all the characters (except single quotes, making it
| impossible to put single-quotes in a single-quoted string).
|
| Dollar-Single Quotes
| Enclosing characters between $' and ' preserves the literal
| meaning of all characters except backslashes and single quotes.
| A backslash introduces a C-style escape sequence:
|
|     \a          Alert (ring the terminal bell)
|
|     \b          Backspace
|
|     \cc         The control character denoted by ^c in stty(1). If c
|                 is a backslash, it must be doubled.
|
|     \e          The ESC character (ASCII 0x1b)
|
|     \f          Formfeed
|
|     \n          Newline
|
|     \r          Carriage return
|
|     \t          Horizontal tab
|
|     \v          Vertical tab
|
|     \\          Literal backslash
|
|     \'          Literal single-quote
|
|     \"          Literal double-quote
|
|     \nnn        The byte whose octal value is nnn (one to three
|                 digits)
|
|     \xnn        The byte whose hexadecimal value is nn (one or more
|                 digits only the last two of which are used)
|
|     \unnnn      The Unicode code point nnnn (four hexadecimal digits)
|
|     \Unnnnnnnn  The Unicode code point nnnnnnnn (eight hexadecimal
|                 digits)
|
| The sequences for Unicode code points are currently only useful
| with UTF-8 locales. They reject code point 0 and UTF-16
| surrogates.
|
| If an escape sequence would produce a byte with value 0, that
| byte and the rest of the string until the matching single-quote
| are ignored.
|
| Any other string starting with a backslash is an error.
|
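
To make the difference between the two quoting styles in the excerpt concrete,
here is a minimal sketch. It assumes a shell that already implements $'...'
(FreeBSD sh, bash, ksh93 or zsh); it does not work in tcsh today, which is the
point of the suggestion. The variable names are illustrative only.

```sh
#!/usr/bin/env bash
# Sketch only: assumes a shell that implements $'...' (FreeBSD sh, bash, ksh93, zsh).

plain='\t'      # single quotes: backslash and t stay as two literal characters
ansi=$'\t'      # dollar-single quotes: one horizontal tab character

printf 'plain has %d character(s)\n' "${#plain}"
printf 'ansi  has %d character(s)\n' "${#ansi}"

# \x41 (hexadecimal) and \101 (octal) both name the byte 0x41, the letter A.
hex=$'\x41'
oct=$'\101'

# \' puts a single quote inside the string, which plain single quotes cannot do.
quoted=$'it\'s'

# \u20AC is the code point from the quoted message (the euro sign). The bytes
# produced depend on the locale (the man page says UTF-8 locales only), so no
# expected output is claimed for this one.
euro=$'\u20AC'

printf '%s\n' "$hex" "$oct" "$quoted" "$euro"
```

Because the $'...' form is a separate quoting type, the existing meaning of
'\u20AC' and "\u20AC" would not have to change; only strings written with the
new prefix would be interpreted.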