Re: Suggestion: Redefine \Uxxxxx in double-quoted strings

>
> Vim is now capable of displaying any Unicode codepoint for which the
> installed 'guifont' has a glyph, even outside the BMP (i.e., even above
> U+FFFF),

Tony,

Good news.

Many may not know that MacVim has been doing this rather well for quite
a while. I routinely edit texts in the Deseret Alphabet and the Shaw
(Shavian) Alphabet, both of which lie in the supplementary planes.

> but there's no easy way to represent those "high" codepoints by
> Unicode value in strings: I mean, "\uxxxx" and "\Uxxxx" still accept no
> more than four hex digits.
>
> I propose to keep "\uxxxx" at its present meaning, but extend
> "\Uxxxxxxxx" to allow additional hex digits (either up to a total of 8
> hex digits, in line with ^VUxxxxxxxx as opposed to ^Vuxxxx in Insert
> mode, or at least up to the value \U10FFFF,

Sounds good.

\Uxxxxxxxx is also the Python convention for representing supplementary
characters in strings. I think it requires exactly 8 hex digits, just as
\uxxxx requires exactly four, but I'm willing to be corrected.
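
A quick way to check the Python side, as a rough sketch rather than
anything authoritative. It assumes Python 3.3 or later, where len()
counts code points; the variable names are only for illustration.

```python
# Assumes Python 3.3+, where strings are indexed by code point.
s = "\U00010428"  # DESERET SMALL LETTER LONG I, written with exactly 8 hex digits
assert len(s) == 1
assert ord(s) == 0x10428

# \u takes exactly 4 hex digits, so "\u10428" is U+1042 followed by a literal "8".
t = "\u10428"
assert len(t) == 2 and ord(t[0]) == 0x1042 and t[1] == "8"

# Fewer than 8 hex digits after \U is rejected when the literal is compiled.
try:
    compile(r'"\U10428"', "<example>", "eval")
except SyntaxError as err:
    short_form_error = err
else:
    short_form_error = None

print(short_form_error)
```

So in Python the two escapes have fixed widths, and the short \U form is
an error rather than a shorter character.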

The other reasonable convention is the Perl-like \x{x...} (the prefix \x
is literally backslash, small x), which, being delimited by curly
braces, can contain any number of hex digits without confusing the
tokenization. But your proposal is more in line with what Vim already
has.

>
>
> I'm aware that this is an "incompatible" change, but I believe the risk
> is low compared with the advantages

For what it's worth, I agree.

> The notation "\<Char-0x20000>" or "\<Char-131072>" doesn't work: here
> (in my GTK2/Gnome2 gvim with 'encoding' set to UTF-8), ":echo"ing such a
> string displays <f0><a0><80><fe>X<80><fe>X instead of just the one CJK
> character 𠀀 (and, yes, I've set my mailer to send this post as UTF-8 so
> if yours is "well-behaved" it should display that character properly).
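
For comparison, here is what the plain UTF-8 encoding of U+20000 looks
like, formatted the same way as the output quoted above. This is only a
sketch in Python (assuming Python 3); it says nothing about what Vim
does internally.

```python
cp = 0x20000                 # the same value as decimal 131072
assert cp == 131072

utf8 = chr(cp).encode("utf-8")

# Format each byte the way the quoted :echo output shows raw bytes.
shown = "".join("<%02x>" % b for b in utf8)
print(shown)
```

The expected four bytes are f0 a0 80 80. The quoted output agrees up to
the first <80>, and then has <fe>X after each <80>.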

In MacVim, at least, supplementary code point values can be used in the
<Char- > notation in keymap files. Entries like the following appear in
my deseret-sampa_utf-8.vim keymap file. It all works great.

"in out comment
i <Char-0x10428> DESERET SMALL LETTER LONG I (e.g. i in machine)
e <Char-0x10429> DESERET SMALL LETTER LONG E (e.g. a in make)
A <Char-0x1042A> DESERET SMALL LETTER LONG A (e.g. a in father)
O <Char-0x1042B> DESERET SMALL LETTER LONG AH (e.g. a in call, au in caught, British/USEastCoastCity pronunciation)
o <Char-0x1042C> DESERET SMALL LETTER LONG O (e.g. oa in boat)
u <Char-0x1042D> DESERET SMALL LETTER LONG OO (e.g. oo in boot)
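
If anyone wants to double-check entries like these, the code points can
be looked up against the Unicode character names. A small sketch in
Python 3 using the standard unicodedata module; the keymap dictionary
just restates the six entries above and is not read from the keymap
file.

```python
import unicodedata

# Input key -> code point, copied from the six keymap entries above.
keymap = {
    "i": 0x10428,
    "e": 0x10429,
    "A": 0x1042A,
    "O": 0x1042B,
    "o": 0x1042C,
    "u": 0x1042D,
}

# Look up the official character name for each code point.
names = {key: unicodedata.name(chr(cp)) for key, cp in keymap.items()}

for key, cp in keymap.items():
    print(key, "U+%04X" % cp, names[key])
```

All six values are above U+FFFF, so they exercise exactly the
supplementary range under discussion.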

Thanks to all those developers who have toiled to handle Unicode in Vim.

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_multibyte" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---