[PATCH v3 0/3] have the vt console preserve unicode characters
From: Nicolas Pitre @ 2018-06-27 3:56 UTC
To: Greg Kroah-Hartman
Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
linux-kernel, linux-console

The vt code translates UTF-8 strings into glyph index values and stores
those glyph values in the screen buffer. Because there can be at most
512 glyphs at the moment, most unicode characters cannot be represented,
in which case a default glyph (often '?') is displayed instead. The
original unicode value is then lost.

The 512-glyph limitation is inherent to the text-mode VGA displays after
which the core console code was modelled. This also means that the
/dev/vcs* devices only provide user space with glyph index values, and
user applications must then get hold of the unicode-to-glyph table the
kernel is using in order to back-translate those into actual characters.
It is not possible to get back the original unicode value when multiple
unicode characters map to the same glyph, which is notably the case for
the vast majority that map to the default replacement glyph.
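
The information loss described above can be illustrated with a toy model.
This is only a sketch: the table contents, the glyph numbers and the
helper names (to_glyph, back_translate) are made up for illustration and
are not the kernel's actual data structures.

```python
# Toy model of a unicode-to-glyph table limited to 512 entries.
# All names and glyph numbers here are hypothetical.

MAX_GLYPHS = 512
DEFAULT_GLYPH = ord("?")  # assumed index of the replacement glyph

# A tiny font map: several code points may share one glyph.
UNICODE_TO_GLYPH = {
    ord("A"): 65,
    0x0391: 65,   # GREEK CAPITAL LETTER ALPHA drawn with the same glyph as 'A'
    ord("?"): DEFAULT_GLYPH,
    0x00E9: 130,  # LATIN SMALL LETTER E WITH ACUTE
}
assert all(0 <= g < MAX_GLYPHS for g in UNICODE_TO_GLYPH.values())


def to_glyph(code_point):
    """Forward translation: the value stored in the glyph screen buffer."""
    return UNICODE_TO_GLYPH.get(code_point, DEFAULT_GLYPH)


def back_translate(glyph):
    """Reverse lookup: every code point that could have produced this glyph."""
    return sorted(cp for cp, g in UNICODE_TO_GLYPH.items() if g == glyph)


text = "A\u00e9\u0391\u2603"  # 'A', e-acute, Greek Alpha, SNOWMAN (not in the table)
glyphs = [to_glyph(ord(ch)) for ch in text]
candidates = [back_translate(g) for g in glyphs]
```

In this model candidates[0] holds two code points, so a reader of the
glyph buffer cannot tell 'A' from Alpha, and the snowman can only be
back-translated to '?'.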

Users of /dev/vcs* shouldn't be restricted to a narrow unicode range
and lossy screen content because of that. This is especially true for
accessibility applications such as BRLTTY that rely on /dev/vcs to
render screen content onto braille terminals.

It was also argued that the VGA-centric glyph buffer should eventually
go away entirely. The current design made sense when hardware was slow
and managing the screen directly in VGA memory made a difference (i.e.
25 years ago). Modern console display drivers no longer have to be
limited to 512 glyphs.
Quoting Alan Cox:

|The only driver that it suits is the VGA text mode driver, which at
|2GHz+ is going to be fast enough whatever format you convert from. We
|have the memory, the processor power and the fact almost all our
|displays are bitmapped (or more complex still) all in favour of
|throwing away that limit.

This patch series introduces unicode screen support to the core console
code with /dev/vcs* as a first user. Memory is allocated, and possible
CPU overhead introduced, only if /dev/vcsu is read at least once. For
now both the glyph and unicode buffers are maintained in parallel to
allow for a smooth transition.
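
For readers who want to experiment from user space, here is a minimal
sketch of decoding such a buffer. It assumes (this cover letter does not
spell out the format) that /dev/vcsu returns one 32-bit code point per
character cell in native byte order, row by row. The function name
decode_cells and the explicit cols parameter are illustrative only.

```python
import struct


def decode_cells(raw, cols):
    """Split a buffer of native-endian 32-bit code points into text rows.

    Assumption: one 4-byte value per character cell, no attributes and
    no header. `cols` is the screen width in cells.
    """
    if cols <= 0:
        raise ValueError("cols must be positive")
    count = len(raw) // 4
    # "=" selects native byte order with standard sizes (I is 4 bytes).
    code_points = struct.unpack(f"={count}I", raw[: count * 4])
    # Values outside the Unicode range become U+FFFD instead of raising.
    chars = [chr(cp) if cp <= 0x10FFFF else "\ufffd" for cp in code_points]
    return ["".join(chars[i : i + cols]) for i in range(0, count, cols)]


# Self-contained demonstration on synthetic data (a 2x3 "screen").
sample = struct.pack("=6I", *map(ord, "h\u00e9\u2603ok "))
rows = decode_cells(sample, cols=3)
```

On a real system the raw bytes would come from reading /dev/vcsu in
binary mode, which typically requires appropriate permissions, and the
screen width would have to be obtained separately.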

I'm a prime user of this new /dev/vcsu interface, as is the BRLTTY
maintainer Dave Mielke, who implemented support for it in BRLTTY. We
therefore have a vested interest in maintaining this feature as
necessary. It has also received extensive testing at this point.

This is also available on top of v4.18-rc2 here:

  git://git.linaro.org/people/nicolas.pitre/linux vt-unicode

Changes from v2:

- Dropped patch #4 as it was useful only for initial debugging and it
  attracted all the review comments so far -- actually more than the
  patch is worth.

- Added Adam Borowski's ACK.

Changes from v1:

- Rebased to v4.18-rc1.

- Dropped first patch (now in mainline as commit 4b4ecd9cb8).

- Removed a printk instance from an error path easily triggerable
  from user space.

- Minor cleanup.

Diffstat:

 drivers/tty/vt/vc_screen.c     |  90 ++++++++--
 drivers/tty/vt/vt.c            | 308 +++++++++++++++++++++++++++++++--
 include/linux/console_struct.h |   2 +
 include/linux/selection.h      |   5 +
 4 files changed, 380 insertions(+), 25 deletions(-)

Re: [PATCH v3 0/3] have the vt console preserve unicode characters
From: Greg Kroah-Hartman @ 2018-06-28 12:38 UTC
To: Nicolas Pitre
Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
linux-kernel, linux-console

On Tue, Jun 26, 2018 at 11:56:39PM -0400, Nicolas Pitre wrote:
> [...]
>
> Changes from v2:
>
> - Dropped patch #4 as it was useful only for initial debugging and it
> attracted all the review comments so far -- actually more than the
> patch is worth.

If you want this "feature" back, I'll be glad to take it, as odds are it
will help when any future person wants to test any changes in the code.

So feel free to resend it, I have no objection to it as-is.

And I've queued the other 3 up now, nice job.

greg k-h

Re: [PATCH v3 0/3] have the vt console preserve unicode characters
From: Nicolas Pitre @ 2018-07-18 1:00 UTC
To: Greg Kroah-Hartman
Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
linux-kernel, linux-console

On Thu, 28 Jun 2018, Greg Kroah-Hartman wrote:
> On Tue, Jun 26, 2018 at 11:56:39PM -0400, Nicolas Pitre wrote:
> > [...]
> >
> > Changes from v2:
> >
> > - Dropped patch #4 as it was useful only for initial debugging and it
> > attracted all the review comments so far -- actually more than the
> > patch is worth.
>
> If you want this "feature" back, I'll be glad to take it, as odds are it
> will help when any future person wants to test any changes in the code.
>
> So feel free to resend it, I have no objection to it as-is.
>
> And I've queued the other 3 up now, nice job.

Thanks!

I'm about to send 3 more patches to put on top of what you already have:
patch #1 is that debugging code (still disabled by default), patch #2
removes the VLA, and patch #3 updates devices.txt.


Nicolas