Keyboard input methods and i18n on elementary OS

This post contains some ideas I’ve had lately on how to improve the internationalization of input methods on elementary OS (mainly keyboard input). Originally, my idea was to write a blueprint but I think there’s still a lot to be discussed before proposing what to actually do. I will write down my ideas here and hopefully with some feedback we will manage to get a blueprint we can work on.

This is a very long text but it is what I currently need to clearly explain my reasoning. The idea is to summarize everything later, but you can always skip to the end where the most important information is summarized in bullet points.

The problem

There are a lot of different types of keyboard devices, and it’s very hard to guess which one the user has. However, this layer has been simplified greatly by the fact that almost all modern keyboards use USB, and that the kernel handles the translation of key presses to keycodes and provides them to us via evdev. Still, even if we can decode keyboards relatively easily, there is also the problem of which language the user wants to type in on their PC. We cannot guess this from anywhere except maybe the user’s locale configuration. Even if we tried, people who type in a language other than English will most likely want to type in several languages, each with very specific needs: the applications they use have shortcuts better suited for US layouts, their language can be typed in several ways, or they are fluent in more than two languages. Currently this is solved by letting users switch keyboard layouts with a sequence of keys, but this has broken constantly because there has never been a definitive solution that encompasses enough languages to suit a very large audience. Languages get patched on top, and then any change breaks something.

When X was developed, keyboard input was very basic and offered almost no features beyond the ones needed for a US layout. This is why the X keyboard extension (xkb) was devised: it allowed switching the symbols assigned to every key on the fly, added more modifier keys, and added support for Unicode (I think this wasn’t there before but I’m not sure). This was a good thing, but it still left out several quite complex languages such as Japanese or Chinese. Later on, ibus and several other input methods were created to support these, but to make layout switching work correctly distributions had to disable several xkb options, because xkb modified the keyboard layout at a lower layer than the input methods and would then produce some weird behaviors.

A lot of the design of xkb was influenced by the limitations imposed by the X protocol, and by how fast (or slow) computers were back then. But now people are trying to leave X behind, so we can try to get rid of most of this cruft and design a more robust solution that won’t break as often. We want something that supports typing in as many languages as we can, allows users to switch seamlessly between them, and makes it as easy as possible for them.

Current state of things

Some time ago most distributions exposed all xkb options through a very ugly interface, just as elementary still does. Ubuntu and GNOME did the same, but they decided to remove these options in favor of better support for complex languages that use ibus, compromising a lot of the xkb options (which sometimes can even conflict with each other). This angered users greatly because they couldn’t easily change their layouts as they did before. This was solved, but a lot of the flexibility that the xkb options allowed was lost and feelings were hurt. I think there is no need for such a compromise.

Currently the workflow for anyone trying to type in another language is:

1. Try to set up the language from the operating system’s settings panel. If they find it there, they are fine and happy.

2. If this does not work, search for “how to type in <language name> on Linux/elementaryOS/Ubuntu” and get to some tutorial about configuring ibus, or spend hours deciding which of all the available options they should try for their language.

3. Install ibus (or another input method), and in the case of ibus also install the actual engine for the language they want to type in.

4. Use the interface provided by the input method (which will always try to override what the operating system does because it knows better). Then the user just hopes the operating system can handle this and waits for it to magically work, which sometimes doesn’t happen.

After doing this, even if they succeed at step 1, there are some caveats. For example, a lot of people got used to changing layouts by pressing both shift keys. They will be disappointed to see this does not work anymore, but the only reason it worked before was that X was the sole manager of the keyboard, and this is not the case anymore.

There is also the issue that the panels provided by these input methods look ugly in elementary.

The solution

To me, input methods can be classified into two types; let’s call them basic and advanced. Basic input methods map one input to exactly one keysym, where an input is either a single key press, several modifier keys pressed together with a non-modifier key, or a dead key followed by several other keys. The point here is that the computer can know by itself when to translate the input into the needed keysym, either because the keymap file tells it or because a dead key sequence matched. These can be easily specified and configured with xkb and its keymap file format for describing layouts.

These files have often been kept hidden from users, and xkb options were provided as an “easier” way of adapting them to the user’s needs. I think these options have evolved to fit a lot of particular requirements that few people actually need, like “Left Alt as Ctrl, Left Ctrl as Win, Left Win as Alt”. Nevertheless, distributions often just dump all of them on the user in some GUI. This has grown to a point where I think it may be even simpler to explain the format of a keymap file than to describe what these options do, don’t do, or how they interact when conflicting ones are enabled, like “Swap Ctrl and Caps Lock” and “Swap ESC and Caps Lock”. I think the definition of a keymap file is not difficult to understand once you remove a lot of the complexity added back then that we don’t need anymore, like groups, geometry and rules. On the contrary, the flexibility gained by learning this can’t be matched by any GUI or set of options provided by someone else. We should just expose this to power users and stop trying to digest it for them.
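As a rough illustration of what I mean by a basic input method, here is a minimal Python sketch. The tables are made up for the example and are not taken from any real xkb layout; the point is only that every complete input, including a dead key sequence, resolves to exactly one symbol without asking the user anything.

```python
# Toy "basic" input method: every complete input maps to exactly one symbol.
# Both tables are illustrative assumptions, not data from a real xkb layout.

# (key, held modifiers) -> symbol
KEYMAP = {
    ("a", frozenset()): "a",
    ("a", frozenset({"Shift"})): "A",
    ("e", frozenset()): "e",
    ("e", frozenset({"AltGr"})): "€",
    ("'", frozenset()): "dead_acute",
}

# (dead key, following symbol) -> composed symbol
DEAD_KEY_SEQUENCES = {
    ("dead_acute", "a"): "á",
    ("dead_acute", "e"): "é",
}


class BasicInputMethod:
    def __init__(self):
        self.pending_dead_key = None

    def press(self, key, modifiers=()):
        """Return the symbol for this key press, or None if more input is needed."""
        symbol = KEYMAP.get((key, frozenset(modifiers)))
        if symbol is None:
            # Unknown key: forget any pending dead key and produce nothing.
            self.pending_dead_key = None
            return None
        if symbol.startswith("dead_"):
            # Dead keys produce nothing by themselves; wait for the next key.
            self.pending_dead_key = symbol
            return None
        if self.pending_dead_key is not None:
            # Fall back to the plain symbol when the sequence is not defined.
            composed = DEAD_KEY_SEQUENCES.get((self.pending_dead_key, symbol), symbol)
            self.pending_dead_key = None
            return composed
        return symbol
```

Pressing the dead key and then “e” returns nothing for the first press and “é” for the second, which is the behavior a keymap file describes declaratively.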

Contrary to basic input methods, advanced input methods are required when the input character sequence can yield several candidate keysyms. This happens in languages where you type the sound of a sentence but it can be written in several ways, so the user must choose the one they want. In some cases the program even has to guess how to separate characters into words (note that I don’t know any of these languages, so this is what I have concluded from reading about them). This would also be the case if we wanted to provide some kind of predictive text input method like the one on phones nowadays. The difference here is that the input method can’t know what the user wants to translate their key presses into or when the translation should happen, so it needs to provide an interface to them (usually a popup with options), and then wait for them to choose the correct one.

However, these two do not need different implementations. Advanced input methods actually need a basic input method to get the characters they want to translate into keysyms, so they sit on top of basic ones and should work with them instead of trying to override everything they do.
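A small Python sketch of this layering follows. The candidate table is a tiny hand-written example (I don’t speak the language, so treat it purely as an illustration), and the class and method names are hypothetical. The symbols passed to type_symbol are assumed to come from whatever basic input method is active.

```python
# Toy "advanced" input method: the typed sounds are ambiguous, so the user
# has to pick. CANDIDATES is illustrative only, not real dictionary data.
CANDIDATES = {
    "ma": ["妈", "马", "吗"],
    "hao": ["好", "号"],
}


class AdvancedInputMethod:
    def __init__(self, candidates):
        self.candidates = candidates
        self.preedit = ""  # text typed so far but not yet committed

    def type_symbol(self, symbol):
        """Append a symbol produced by the basic method; return the options to show."""
        self.preedit += symbol
        return self.candidates.get(self.preedit, [])

    def choose(self, index):
        """Commit the option the user picked and clear the pre-edit text."""
        committed = self.candidates[self.preedit][index]
        self.preedit = ""
        return committed
```

Nothing is committed until choose is called, which is why this kind of method needs a popup and cannot be described by a keymap file alone.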

On top of all this we should provide nice graphical interfaces so users can configure input methods to their liking, which basically means providing ways of changing the basic input method and the specific options of the advanced method they are using (if they are using one). Advanced methods should be “bundled” with a basic one that the user will be able to change if, for example, they want to use Pinyin on an AZERTY keyboard instead of QWERTY. The keyboard layout configuration would consist of a set of basic methods and a set of advanced methods, each advanced method with its own basic method bundled with it.
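One possible shape for this configuration, sketched in Python. All the names here (InputSource, KeyboardConfig, the field names) are hypothetical and only meant to show the bundling: every entry has a basic layout, and an advanced method is an optional layer on top of it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InputSource:
    basic_layout: str                      # e.g. "us" or "fr"; always required
    advanced_method: Optional[str] = None  # e.g. "pinyin"; None for a plain layout
    options: dict = field(default_factory=dict)  # options of the advanced method


@dataclass
class KeyboardConfig:
    sources: List[InputSource]
    current: int = 0

    def switch_next(self) -> InputSource:
        """Cycle to the next configured source and return it."""
        self.current = (self.current + 1) % len(self.sources)
        return self.sources[self.current]
```

With this shape, Pinyin on AZERTY is just InputSource("fr", "pinyin"), and changing the bundled basic layout does not touch the options of the advanced method.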

Implementation

Probably the most important factor here is that X is being replaced by Wayland. This will render all X-specific stuff useless but will also give us the opportunity to choose where to go next. Either way, we must be aware that some code will become useless after a while unless we start moving to the actual libraries that will be used on Wayland.

To provide the kind of integration a user would expect from elementary, I believe we will need to add some new functionality to Gala. For the basic level of input methods libxkbcommon should be more than enough. This library gives access to all the layouts currently used on Linux, but in an X-free environment, and it is the proposed solution for when we move to Wayland. It can load a keymap from several sources, such as a description like the one used before by setxkbmap with the RMLVO syntax (which is the only one currently used by Mutter), or a full .xkb file containing everything we need for a layout. We should provide a gsettings interface that allows choosing a layout using any of these options.
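To make the two kinds of sources concrete, here is a Python sketch of how one entry of such a setting could be parsed. The “file:” and “rmlvo:” prefixes and the function name are assumptions I made up for this example; no such gsettings key exists today. The RMLVO fields mirror the acronym (rules, model, layout, variant, options), and the defaults are just common values.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RmlvoSource:
    # Field names follow the RMLVO acronym.
    rules: str = "evdev"
    model: str = "pc105"
    layout: str = "us"
    variant: str = ""
    options: str = ""


@dataclass
class FileSource:
    path: str  # path to a complete .xkb keymap file


def parse_layout_setting(value: str) -> Union[RmlvoSource, FileSource]:
    """Parse one entry of a hypothetical settings key.

    Assumed format: "file:<path>" or "rmlvo:<layout>[+<variant>]".
    """
    kind, _, rest = value.partition(":")
    if kind == "file":
        return FileSource(path=rest)
    if kind == "rmlvo":
        layout, _, variant = rest.partition("+")
        return RmlvoSource(layout=layout, variant=variant)
    raise ValueError(f"unknown layout source: {value!r}")
```

In the real implementation each variant would be handed to the matching libxkbcommon entry point (xkb_keymap_new_from_names for RMLVO, xkb_keymap_new_from_file for a file); the sketch only covers the settings side.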

We also need to finally get rid of the options tab on the keyboard plug in switchboard because, as stated before, it is hard to understand, provides conflicting options, breaks often because X is not the only one controlling the keyboard anymore, and is not flexible enough for the user’s specific needs. Instead, I have thought about a solution that takes into account the most common complaints I’ve found from people losing this tab: “I can’t swap X and Y keys anymore”, “I can’t enable the compose key where I want it” and “I can’t change layouts anymore”. The last one will hopefully be handled by Gala and the fact that layouts will mostly be in one place. So my idea is the following:

Add a small interface that asks the user to press a key and provides a menu of common actions to bind it to. These actions can be “Control, Caps Lock, Shift, Alt, AltGr, Compose Key, Menu” and maybe some Japanese-specific keys like the kana key, but I don’t know enough about this to have a concrete idea. The implementation of this is surprisingly easy if we use libxkbcommon to load a layout file where we create an alias for that key, or change its keycode.

A mockup attempt for the widget I’m talking about.
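As a sketch of what the widget could produce behind the scenes, the Python function below writes a small xkb symbols section that assigns a keysym to the pressed key. This takes the route of overriding the key’s symbol rather than the alias or keycode change mentioned above, and it is only one way to do it. The keysym names are standard ones, but the action table, the section name “user_remap” and the function name are my own assumptions. Keys that must act as modifiers will likely also need a modifier_map entry, which this sketch leaves out.

```python
# Menu entries from the widget mapped to keysym names. The keysyms are
# standard; the table itself is an assumption made for this sketch.
ACTION_KEYSYMS = {
    "Control": "Control_L",
    "Caps Lock": "Caps_Lock",
    "Shift": "Shift_L",
    "Alt": "Alt_L",
    "AltGr": "ISO_Level3_Shift",
    "Compose Key": "Multi_key",
    "Menu": "Menu",
}


def remap_section(key_name: str, action: str) -> str:
    """Return an xkb symbols section binding one key (e.g. "CAPS") to an action."""
    keysym = ACTION_KEYSYMS[action]
    return (
        'partial xkb_symbols "user_remap" {\n'
        f"    key <{key_name}> {{ [ {keysym} ] }};\n"
        "};\n"
    )
```

For example, remap_section("CAPS", "Compose Key") yields a section whose only statement puts Multi_key on the Caps Lock key, which could then be included on top of the user’s layout.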

For any other use case not handled by this, I suggest we allow an arbitrary xkb file to be loaded from the switchboard plug. I really didn’t know how powerful this was, but I think most of the complaints people have could be addressed if they knew how to write their own keyboard layout file, and sometimes the result can even be better than switching layouts. Even if people are really not willing to learn the file format, instead of trying to agree on a set of configuration options it would be easier to design a graphical application that generates layout files in an intuitive way, without most of the limitations that the original X input method imposed on the xkb file format, similar to what Ukelele does on OS X.

We could also fix some bugs as we move along. Currently, every time a layout switch happens Mutter calls libxkbcommon to compile the next layout and load it into Clutter. We could be more clever about this: load all layouts into memory and just switch the reference to the current one. This would fix an annoying bug where you switch layouts and the first key you type after that doesn’t register because the new keymap file hasn’t been compiled yet.
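The idea of compiling everything up front can be sketched in a few lines of Python. The compile_keymap argument stands in for whatever actually compiles a layout (in Mutter that would be a call into libxkbcommon); the class and its names are hypothetical.

```python
class KeymapCache:
    """Compile every configured layout once, then switch by reference."""

    def __init__(self, compile_keymap, layouts):
        # All the expensive work happens here, before any switch is requested.
        self._keymaps = {name: compile_keymap(name) for name in layouts}
        self.current = self._keymaps[layouts[0]]

    def switch_to(self, name):
        # Switching is only a dictionary lookup; nothing is compiled here.
        self.current = self._keymaps[name]
        return self.current
```

The trade-off is a bit more memory and startup work in exchange for layout switches that cannot drop the first key press while waiting for compilation.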

For advanced input methods people have always supported ibus, but I have found out this is not unanimous and some people prefer others like Fcitx. What ibus did was provide a framework that desktop environments used to display the bubble with options, interpret the user’s feedback and send it to the application; it did not provide any language-specific capabilities. Instead, other people used this framework to create engines for specific languages, which are daemons that communicate with the shell through D-Bus. On Wayland, a merge was accepted in Weston that extends the Wayland protocol to allow a preedit section and feedback from the user. This is still a Weston-only thing and I haven’t come across information about it being merged into the core protocol, but this new framework will eventually replace what ibus did with a much more standardized version of it. So, on the X side of things we are pretty much left with using ibus, but it will be useless once we move to Wayland if im-wayland gets into the core protocol, so we may just support ibus as a temporary solution.

In any case, what I would like to do here (and I haven’t thought about this at length implementation-wise, mostly because I don’t know enough about these input methods to have an idea of the requirements) is to provide a set of advanced input methods out of the box, which can be selected from the keyboard plug without installing anything, each providing layout-specific options on the right panel. To do this we would need to narrow down the problem to a subset of languages, and then choose one of the input methods available for each of them. Then we would need to be able to configure them from our keyboard plug (I think ibus engines provide a D-Bus interface we could use for this). This would imply looking at each language and deciding which options are the most useful, which is a very language-specific thing and would require someone who is fluent in the language to provide us with feedback.

Undoubtedly, for advanced input methods to happen nicely we need a lot of feedback from people who actually need them.

What needs to be done

If all that was too long to read for you, here is a summary of the key points I think need to be done:

Gala

Use libxkbcommon to add a way of loading arbitrary xkb files (this change should actually happen on Mutter).

Try to load all layouts into memory and just switch the reference to the current one.

Provide a gsettings interface that allows specifying the list of layouts, each coming either from a file or from an RMLVO description, and also leave room to choose advanced input methods.

Switchboard keyboard plug

Kill the options tab.

Add a way of loading arbitrary xkb files into Gala.

Add an interface that asks for a key press and shows a menu with options for the action that should be bound to it, similar to how custom shortcuts are added.

See if ibus packages are installed and list them on the keyboard plug. Even better, a set of preinstalled ibus engines can be provided so that people choose a language and it just works.

Provide a subset of the options given on the specific engine’s configuration panel directly on the keyboard plug (this should be doable through d-bus).

Team work

Agree on a set of officially supported languages to narrow down the problem.

Get at least one person who uses each of these languages on a daily basis and is willing to spend time giving feedback to developers.

Decide on an input method per language from all the ones available.

Work with each one of the language advisors to agree on a subset of configuration options that will be provided directly from the switchboard plug.

Final Notes

All of this has come to my mind after a lot of time reading, mostly by myself, about this problem. I’m not entirely sure about everything here, but I would really like people to do some mockups for the keyboard plug. I have some on paper that I could try to draw in Inkscape, but my skills aren’t great.

Also my native language is Spanish so my assumptions regarding advanced input methods on other languages may be wrong.

Finally, I would really like to get general feedback about this. If you have any comments please feel free to send them to me; my mail is: santileortiz@gmail.com