[Tcsh] "Readable" Unicode in setenv

H.Merijn Brand tcsh at tux.freedom.nl
Sun Nov 14 13:29:56 UTC 2021


On Sat, 13 Nov 2021 19:16:07 -0500, Christos Zoulas <christos at zoulas.com> wrote:

> Committed, thanks!

Wow, thanks. Would it be hard(er) to also allow

 % setenv EURO "\u20AC"

directly too instead of

 % setenv EURO `echo "\u20AC"`

(You could also point me to the relevant place in the code and I'll
experiment myself.)
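
As a rough illustration of what decoding at setenv time would involve, here
is a standalone sketch. It is not tcsh code and does not reflect how tcsh is
structured internally: the names utf8_encode and expand_u_escapes are
hypothetical, it works on plain char strings rather than tcsh's Char type,
it assumes a UTF-8 target encoding, and it handles only \u followed by up to
six hex digits (the limit used in the quoted patch).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode one code point as UTF-8.
 * Returns the number of bytes written, or 0 if cp is not encodable. */
static size_t
utf8_encode(unsigned long cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not valid */
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: copy `in` to `out`, replacing \u<1..6 hex digits>
 * with the UTF-8 encoding of that code point. Everything else, including
 * other backslash sequences, is copied unchanged.
 * Returns 0 on success, -1 on an invalid code point or too-small buffer. */
static int
expand_u_escapes(const char *in, char *out, size_t outsz)
{
    size_t o = 0;

    if (outsz == 0)
        return -1;
    while (*in) {
        char buf[4];
        size_t n;

        if (in[0] == '\\' && in[1] == 'u' && isxdigit((unsigned char)in[2])) {
            unsigned long cp = 0;
            int digits = 0;

            in += 2;                        /* skip the "\u" prefix */
            while (digits < 6 && isxdigit((unsigned char)*in)) {
                int ch = tolower((unsigned char)*in++);
                cp = cp * 16 + (unsigned long)(isdigit(ch) ? ch - '0'
                                                           : ch - 'a' + 10);
                digits++;
            }
            n = utf8_encode(cp, buf);
            if (n == 0)
                return -1;
        } else {
            buf[0] = *in++;                 /* ordinary byte: copy as-is */
            n = 1;
        }
        if (o + n >= outsz)                 /* keep room for the NUL */
            return -1;
        memcpy(out + o, buf, n);
        o += n;
    }
    out[o] = '\0';
    return 0;
}
```

The expanded string could then be passed to POSIX setenv(3), for example
setenv("EURO", out, 1). One consequence of accepting a variable number of
digits is that the escape greedily swallows any hex digits that follow it,
so "\u20acab" would be read as the single code point 0x20acab and rejected.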

> christos
> 
> > On Nov 12, 2021, at 10:46 AM, H.Merijn Brand <tcsh at tux.freedom.nl> wrote:
> > 
> > On Fri, 12 Nov 2021 16:32:42 +0100, "H.Merijn Brand" <tcsh at tux.freedom.nl> wrote:
> >   
> >> If I have an environment variable that is to contain something Unicodish,
> >> I currently have to do something similar to
> >> 
> >> % setenv EURO_CH `perl -CO -e'print "\N{EURO SIGN}"'`
> >> or
> >> % setenv EURO_CH `perl -CO -e'print "\x{20ac}"'`
> >> 
> >> so this works
> >> % echo $EURO_CH
> >> €
> >> 
> >> I browsed the tcsh manual, but could not find anything that would hint
> >> to doing this natively. Is there a (hidden) feature to set Unicode
> >> characters from the command line by their name or hex value (in the
> >> current encoding)? Something similar to
> >> 
> >> % setenv EURO_CH "\x{20ac}"
> >> % setenv EURO_CH "\u20AC"
> >> 
> >> In my digging, I found that
> >> 
> >> % setenv TAB_CH "\t"
> >> 
> >> just sets the environment variable TAB_CH to a literal '\' followed by
> >> a 't'. Which was kinda surprising to me, as I expected a TAB to be in
> >> there as a literal TAB. That was done in echo in sh.func.c, so
> >> 
> >> % echo $TAB_CH
> >> 
> >> translated \t to TAB. To do the same for \x{20ac}, \xbf, and \u0020ac
> >> I changed sh.func.c like below, but I eventually want those escapes
> >> to end up literally in the environment. Thoughts welcome
> >> 
> >> --8<---
> >> diff --git a/sh.func.c b/sh.func.c
> >> index cdfb6d8d..cbc4ff41 100644
> >> --- a/sh.func.c
> >> +++ b/sh.func.c
> >> @@ -1196,6 +1196,22 @@ doglob(Char **v, struct command *c)
> >>     flush();
> >> }
> >> 
> >> +static Char
> >> +parse_hex_range(Char **cp, int l)
> >> +{
> >> +    int  ui = 0;
> >> +    char ub[9];
> >> +
> >> +    if (l > 8) return 0; /* Unsupported length */
> >> +
> >> +    while (**cp && ui < l && isxdigit(**cp)) {
> >> +       ub[ui++] = (char)**cp;
> >> +       (*cp)++;
> >> +    }
> >> +    ub[ui] = (char)0;
> >> +    return strtol (ub, NULL, 16);
> >> +}
> >> +
> >> static void
> >> xecho(int sep, Char **v)
> >> {
> >> @@ -1289,6 +1305,28 @@ xecho(int sep, Char **v)
> >>                    if (*cp >= '0' && *cp < '8')
> >>                        c = c * 8 + *cp++ - '0';
> >>                    break;
> >> +               case 'x':
> >> +                   if (*cp == '{' && isxdigit(*(cp + 1))) { /* \x{20ac} */
> >> +                       cp++;
> >> +                       c = parse_hex_range (&cp, 8);
> >> +                       if (*cp != '}')
> >> +                           stderror(ERR_NAME | ERR_VARBEGIN);  
> >     This needs a proper new error message of course
> >   
> >> +                       cp++;
> >> +                   }
> >> +                   else if (isxdigit(*cp)) {   /* \x9f */
> >> +                       c = parse_hex_range (&cp, 2);
> >> +                   }
> >> +                   else /* backward compat */
> >> +                       xputchar('\\' | QUOTE);
> >> +                   break;
> >> +               case 'u':
> >> +                   if (isxdigit(*cp)) {        /* \u0020ac */
> >> +                       c = parse_hex_range (&cp, 6);
> >> +                   }
> >> +                   else /* backward compat */
> >> +                       xputchar('\\' | QUOTE);
> >> +                   break;
> >> +
> >>                case '\0':
> >>                    c = '\\';
> >>                    cp--;  
>  [...]  
> > 
> > And a demo of course
> > 
> > % unsetenv EURO
> > % echo $EURO
> > EURO: Undefined variable.
> > 
> > % setenv EURO "\u20ac"
> > % env | grep EURO
> > EURO=\u20ac
> > % echo $EURO
> > €
> > 
> > % setenv EURO `echo "\u20ac"`
> > % env | grep EURO
> > EURO=€
> > % echo $EURO
> > €
> > 
> > --
> > H.Merijn Brand  https://tux.nl   Perl Monger   http://amsterdam.pm.org/
> > using perl5.00307 .. 5.33        porting perl5 on HP-UX, AIX, and Linux
> > https://tux.nl/email.html http://qa.perl.org https://www.test-smoke.org
> > 
> > 
> >   
> 


-- 
H.Merijn Brand  https://tux.nl   Perl Monger   http://amsterdam.pm.org/
using perl5.00307 .. 5.33        porting perl5 on HP-UX, AIX, and Linux
https://tux.nl/email.html http://qa.perl.org https://www.test-smoke.org
                           

