[Tcsh] Multi-byte characters in promptchars

Amol Deshpande amol.vinayak.deshpande at gmail.com
Fri Sep 20 16:30:41 UTC 2024


So, this is how the Windows version works and if you can tell me that this
applies to Linux as well, I can hopefully whip up a patch over the weekend.

If the prompt string being output is UTF-8, tcsh is still putting each byte
into a 32-bit int when WIDE_CHAR is set.

Therefore, displaying an emoji or the like may actually consume 2 Chars
(or 3, or 4) of the prompt.
However, the code in ed.refresh.c (RefreshPromptpart) is written as if
each Char were a complete character on its own.

So, in addition to using wcwidth to find the display width of the output,
NLSWidthMB should also report how many Chars of the input string it
consumed to represent the emoji. (line 318)

We should then increment cp by "+= consumed" in the above function while
looping through the buffer, instead of by just 1. (line 320)
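
To make the idea concrete, here is a minimal standalone sketch. It is not
the tcsh code: Char is a local stand-in typedef, and toy_width,
width_and_consumed and prompt_columns are hypothetical names. toy_width
is a crude replacement for wcwidth(3) so the example does not depend on
the locale. It only treats one emoji range as double width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for tcsh's Char: a 32-bit slot that holds ONE UTF-8 byte. */
typedef uint32_t Char;

/* Hypothetical replacement for wcwidth(3): 2 columns for one emoji
 * range, 1 column for everything else. Not a real width table. */
int toy_width(uint32_t cp)
{
    return (cp >= 0x1F300 && cp <= 0x1FAFF) ? 2 : 1;
}

/* Decode one UTF-8 sequence starting at s[0] (n Chars available).
 * Returns the display width and stores the number of Chars used in
 * *consumed. Malformed or truncated input counts as 1 Char, width 1. */
int width_and_consumed(const Char *s, size_t n, size_t *consumed)
{
    uint32_t b0, cp;
    size_t len, i;

    assert(n > 0);
    *consumed = 1;                     /* fallback for malformed input */
    b0 = s[0] & 0xFF;
    if (b0 < 0x80)                { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
    else return 1;                     /* stray continuation byte */

    if (len > n)
        return 1;                      /* sequence cut off by buffer end */
    for (i = 1; i < len; i++) {
        uint32_t b = s[i] & 0xFF;
        if ((b & 0xC0) != 0x80)
            return 1;                  /* not a continuation byte */
        cp = (cp << 6) | (b & 0x3F);
    }
    *consumed = len;
    return toy_width(cp);
}

/* Count display columns, advancing by "consumed" instead of by 1. */
int prompt_columns(const Char *buf, size_t n)
{
    const Char *cp = buf;
    const Char *end = buf + n;
    int cols = 0;

    while (cp < end) {
        size_t consumed;
        cols += width_and_consumed(cp, (size_t)(end - cp), &consumed);
        cp += consumed;
    }
    return cols;
}
```

With the loop stepping by 1 instead, the three continuation bytes of a
4-byte emoji would each be counted as a separate column, which would
explain the cursor drifting when editing the line.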

Does that sound right, or am I missing something?

thanks,
-amol



On Fri, Sep 20, 2024 at 7:38 AM Kimmo Suominen <kim at netbsd.org> wrote:

> On Fri, Sep 20, 2024 at 10:46:49AM +0200, H.Merijn Brand wrote:
> > With multibyte promptchars and a complicated prompt like
> >
> >  promptchars ����
> >  prompt      xyz%{\e[47;34m0392\e[0m%} %U%m:%u%{\e[1m%}%/ %h %{\e[0;38;2;255;24;0m%}%#%{\e[0m%}
> >
> > Positioning inside the line when editing still frequently messes up. A
> > control-R fixes that, but I guess it should be smooth
>
> I think the issue is highlighted by this question I made:
>
> On Fri, 5 Apr 2024 12:47:05 +0300, Kimmo Suominen <kim at netbsd.org>
> wrote:
> > Why does it only apply to NLSCLASS_ILLEGAL2?
>
> I think the whole logic there across the NLSCLASS_ILLEGALn cases is
> incorrect.
>
> > Do we have something already that correctly provides the rendering
> > width of the character?  Here is a Stack Overflow answer that points
> > to using wcwidth(3) and wcswidth(3):
> >
> >     https://stackoverflow.com/a/9145712/1511370
> >
> >     https://man.netbsd.org/wcwidth.3
> >     https://man.netbsd.org/wcswidth.3
>
> I think the correct implementation needs to include the detection of the
> rendering width of the characters.
>
> Kind regards,
> + Kimmo
>
> --
> Tcsh mailing list
> Tcsh at astron.com
> https://mailman.astron.com/mailman/listinfo/tcsh
>