In this post, I'll outline how pointer emulation on touch events works.
This post assumes basic knowledge of the XI2 Xlib interfaces.

Why pointer emulation?

One of the base requirements of adding multitouch support to the X server
was that traditional, non-multitouch applications could still be used.
Multitouch should be a transparent addition, available where needed, not
required where not supported.

So we do pointer emulation for multitouch events, and how we do it is
actually specified in the protocol, mainly so that it's reliable and
predictable for clients.

What is pointer emulation in X?

Pointer emulation simply means that for specific touch sequences, we
generate pointer events. The conditions for emulation are that the
touch sequence is eligible for pointer emulation (details
below) and that no other client has a touch selection on that window/grab.

The second condition is important: if your client selects for both touch and
pointer events on a window, you will never see the emulated pointer events.
If you are an XI 2.2 client and you select for pointer but not touch events,
you will see pointer events. These events are marked with the
XIPointerEmulated flag so that you know they come from an emulated source.
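In code, a client can check that flag on incoming pointer events. A minimal sketch (the event loop, XGetEventData() cookie handling and error handling are omitted for brevity):

```c
#include <stdio.h>
#include <X11/extensions/XInput2.h>

/* Called with a pointer event already unpacked from the event cookie. */
static void check_emulated(XIDeviceEvent *event)
{
    if (event->evtype == XI_ButtonPress ||
        event->evtype == XI_Motion ||
        event->evtype == XI_ButtonRelease) {
        if (event->flags & XIPointerEmulated)
            printf("pointer event emulated from a touch sequence\n");
        else
            printf("real pointer event\n");
    }
}
```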

Emulation on direct-touch devices

For direct-touch devices, we emulate pointer events for a touch sequence provided the touch is the first touch on the device, i.e. no other touch sequences were active for this device when the touch started. The touch sequence is emulated until it ends, even if other touches start and end while that sequence is active.

Emulation on dependent-touch devices

Dependent touch devices do not emulate pointer events. Rather, we send the
normal mouse movements from the device as regular pointer events.

Button events and button state

Pointer emulation triggers motion events and, more importantly, button
events. The button number for touches is hardcoded to 1 (any more specific
handling, such as a long-press triggering a right-button click, should be
handled by touch-aware clients instead), so the detail field of an emulated
button event is 1 (unless the button is logically mapped).

The button state field on emulated pointer events is updated as it would be for regular button events. The button state is thus (usually) 0x0 for the emulated ButtonPress and 0x100 for the subsequent MotionNotify and ButtonRelease events.

Likewise, any request that returns the button state will have the appropriate
state set, even if no emulated event actually got sent.

Grab handling works as for regular pointer events, though the interactions
between touch grabs and emulated pointer grabs are somewhat complex. I'll
get to that in a later post.

The confusing bit

There is one behaviour of the pointer emulation that may be confusing, even
though it is within the specs and logical once you know the reason.

If you put one finger down, it will emulate pointer events. If you then put
another finger down, the first finger will continue to emulate pointer
events. If you now lift the first finger (keeping the second down) and put
the first finger down again, that finger will not emulate pointer events.
This is noticeable mainly in bi-manual or multi-user interaction.

The reason this doesn't work is simple: to the X server, putting the first
finger down just looks like another touchpoint appearing when there is
already one present. The server does not know that this is the same finger
again, it doesn't know that your intention was to emulate again with that
finger. Most of the semantics for such interaction is in your head alone and
hard to guess. Guessing it wrong can be quite bad, since that new touchpoint
may have been part of a two-finger gesture with the second finger and whoops
- instead of scrolling you just closed a window, pasted your password
somewhere or killed a kitten. So we err on the side of
caution, because, well, think of the kittens.

I recommend re-reading Thoughts on Linux multitouch from last year for some higher-level comments.
In this post, I'll outline how to identify touch devices and register for touch events.

This post assumes basic knowledge of the XI2 Xlib interfaces. Code examples
should not be scrutinised for language-correctness.

New event types

XI 2.2 defines four new event types: XI_TouchBegin, XI_TouchUpdate and XI_TouchEnd are the standard events that most applications will be using. The fourth event, XI_TouchOwnership, is mainly for handling specific situations where reaction speed is at a premium and for gesture processing when grabs are active. I won't be covering it in this post.

Identifying touch devices

To use multitouch functionality from a client application, the client must announce support for the X Input Extension version 2.2 through the XIQueryVersion(3) request.
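A minimal sketch of that version negotiation (error handling trimmed; the server may reply with a lower version than requested):

```c
#include <stdio.h>
#include <X11/extensions/XInput2.h>

/* Announce XI 2.2 support; must be done before any multitouch calls. */
static int init_xi22(Display *dpy)
{
    int major = 2, minor = 2;

    if (XIQueryVersion(dpy, &major, &minor) != Success ||
        major < 2 || (major == 2 && minor < 2)) {
        fprintf(stderr, "Server supports XI %d.%d only, need 2.2\n",
                major, minor);
        return -1;
    }
    return 0;
}
```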

Once announced, an XIQueryDevice(3) call may return a new class type, the XITouchClass. If this class is present on a device, the device supports multitouch. The class struct itself is defined like this:
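The original listing appears to have been lost in this copy; for reference, the struct as it appears in the XInput2.h header (reproduced from memory, so check your local header):

```c
typedef struct
{
    int         type;        /* always XITouchClass */
    int         sourceid;    /* id of the source device */
    int         mode;        /* XIDirectTouch or XIDependentTouch */
    int         num_touches; /* supported simultaneous touches, 0 if unknown */
} XITouchClassInfo;
```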

The num_touches field specifies the number of simultaneous touches supported by the device. If the number is 0, we simply don't know (likely) or the device supports an unlimited number of touches (less likely). Regardless of the value, expect that some devices lie, so it's best to treat this value as a guide only.

The mode field specifies the type of touch device. We currently define two types, and the server behaviour differs depending on the type:

XIDirectTouch for direct-input touch devices (e.g. your average touchscreen or tablet). For this type of device, touch events are delivered to the window at the position of the touch point. Again, similar to what you would expect from a tablet interface - you press in the top-left corner and the application in the top-left corner responds.

XIDependentTouch for indirect input devices with multi-touch functionality. Touchpads are the prime example here. Touch events on such devices will be sent to the window underneath the cursor, and clients are expected to interpret the touchpoints as (semantically) relative to the cursor position. For example, if your cursor is inside a Firefox window and you touch with two fingers on the top-left corner of the touchpad, Firefox will get those events. It can then decide how to interpret those touchpoints.

A device that has a TouchClass may send touch events, but these events use the same axes as pointer events. Having said that, a touch device may still send pointer events as well - if the physical device generates both.
Your code to identify touch devices could roughly look like this:
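The listing seems to be missing here; a sketch of such a check using XIQueryDevice(3), with error handling omitted:

```c
#include <stdio.h>
#include <X11/extensions/XInput2.h>

/* Enumerate all devices and report those with a touch class. */
static void find_touch_devices(Display *dpy)
{
    int ndevices, i, j;
    XIDeviceInfo *info = XIQueryDevice(dpy, XIAllDevices, &ndevices);

    for (i = 0; i < ndevices; i++) {
        XIDeviceInfo *dev = &info[i];
        for (j = 0; j < dev->num_classes; j++) {
            if (dev->classes[j]->type == XITouchClass) {
                XITouchClassInfo *t =
                    (XITouchClassInfo*)dev->classes[j];
                printf("'%s' is a %s touch device (%d touches)\n",
                       dev->name,
                       t->mode == XIDirectTouch ? "direct" : "dependent",
                       t->num_touches);
            }
        }
    }
    XIFreeDeviceInfo(info);
}
```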

Selecting for touch events

Selecting for touch events on a window is mostly identical to selecting for pointer events. A client creates an event mask and submits it with XISelectEvents(3). One exception applies: a client must always select for all three touch events [1]: XI_TouchBegin, XI_TouchUpdate, XI_TouchEnd. Selecting for only one or two will result in a BadValue error.
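A sketch of such a selection; selecting on XIAllMasterDevices is just one possible choice of deviceid:

```c
#include <string.h>
#include <X11/extensions/XInput2.h>

static void select_touch_events(Display *dpy, Window win)
{
    XIEventMask mask;
    unsigned char bits[XIMaskLen(XI_LASTEVENT)];

    memset(bits, 0, sizeof(bits));
    mask.deviceid = XIAllMasterDevices;
    mask.mask_len = sizeof(bits);
    mask.mask = bits;

    /* all three must be selected together, or BadValue is returned */
    XISetMask(bits, XI_TouchBegin);
    XISetMask(bits, XI_TouchUpdate);
    XISetMask(bits, XI_TouchEnd);

    XISelectEvents(dpy, win, &mask, 1);
}
```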

As with button events, only one client may select for touch events on any given window, and event delivery attempts traverse from the bottom-most window in the window tree up to the root window. Where a matching event selection is found, the event is delivered and the traversal stops.

Handling touch events

The three event types [1] are XIDeviceEvents like pointer and keyboard events. So from a client's point of view, in essence all we added was new event types.

The detail field of touch events specifies the touch ID, a unique ID for this particular touch for the lifetime of the touch sequence. Each touch sequence consists of a TouchBegin event, zero or more TouchUpdate events and one TouchEnd event. Since multiple touch sequences may be ongoing at any time, keeping track of the ID is important. The server guarantees that the touch ID is unique per device and that it will not be re-used [2]. Note that while touch IDs increase, they increase by an implementation-defined amount. Don't rely on the next touch ID to be the current ID + 1.
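A sketch of dispatching on the touch ID inside an event handler; a real client would look the ID up in its own tracking structure instead of printing:

```c
#include <stdio.h>
#include <X11/extensions/XInput2.h>

/* Called with a touch event already unpacked from the event cookie. */
static void handle_touch(XIDeviceEvent *event)
{
    int touchid = event->detail; /* unique per device for the sequence */

    switch (event->evtype) {
    case XI_TouchBegin:
        printf("touch %d began at %.2f/%.2f\n",
               touchid, event->event_x, event->event_y);
        break;
    case XI_TouchUpdate:
        printf("touch %d moved to %.2f/%.2f\n",
               touchid, event->event_x, event->event_y);
        break;
    case XI_TouchEnd:
        printf("touch %d ended\n", touchid);
        break;
    }
}
```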

The button state in a touch event is the state of the physical buttons only. A TouchUpdate or TouchEnd event will thus usually have a zero button state. [3]

That's pretty much it, otherwise the handling of touch events is identical to pointer or keyboard events. Touch event handling should be straightforward and the significant deviations from the current protocol are in the grab handling, something I'll handle in a future post.

[1] I know, it's four. Good that you're paying attention.
[2] Technically ID collision may occur. For that to happen, you'd need to hold at least one touch down while triggering enough touches to exhaust a 32-bit ID range. And hope that after the wraparound you will get the same ID. There are better ways to spend your weekend.
[3] Pointer emulation changes this, but I'll get to that some other time.

Thursday, December 15, 2011

After pulling way too many 12+ hour days, I've finally polished the patchset for native multitouch support in the X.Org server into a reasonable state. The full set of patches is now on the list. And I'm still expecting this to get merged for 1.12 (and thus in time for Fedora 17).

The code is available from the multitouch branches of the following repositories:

Below is a short summary of what multitouch in X actually means, but one thing is important: being the windowing system, X provides multitouch support. That does not mean that every X application now supports multitouch; it merely means that they can now use multitouch if they want to. The same goes for gestures: they need application support.

A car analogy: X provides a new road, the applications still have to opt to drive on it.

Multitouch events

XI 2.2 adds three main event types: XI_TouchBegin, XI_TouchUpdate and XI_TouchEnd. These three make up a touch sequence. X clients must subscribe to all three events at once and will then receive the events as they come in from the device (more or less, grabs can interfere here). Each touch event has a unique touch ID so clients can track the touches over time.

We support two device types: XIDirectTouch devices include tablets and touchscreens, where the events are delivered to the position the touch occurs at. XIDependentTouch devices include multitouch-capable touchpads. Such devices still control a normal pointer by default, but multi-finger gestures are possible. For such devices, the touchpoints are delivered to the window underneath the pointer.

That is pretty much the gist of it. I'll post more information over time as the release gets closer, so stay tuned.

Pointer emulation

Multitouch can be a compelling interaction method but as said above, X only provides support for multitouch. It will take a while for applications to pick it up (Carlos Garnacho is working on GTK3) and some never will. Since we still need to interact with those applications, we provide backwards-compatible pointer emulation. Again, the details are in the protocol but the gist of it is that for the first touchpoint we emulate pointer events.

That's the really nasty bit, because you now have to sync up the grab event
semantics of the core, XI 1.x and XI2 protocols and wrap it all around the
new grab semantics, so that if you have a multitouch app running under a
window manager without multitouch support, everything still works as expected.
That framework is now in place too though I expect it to still have bugs, especially in the hairier corner cases.

But other than that, it should work just as intended. I can interact with my GNOME3 desktop quite well and I get multitouch events to my test applications.

Tuesday, December 6, 2011

For the last couple of weeks I've been pretty much working full-time on getting multitouch/XI 2.2 ready for the merge (well, I was on holidays for a bit too). So first of all - sorry if I've been ignoring bugs or emails, I'm working to a few deadlines here. Anyway, here's a bit of a status update.

Right now, it looks like touch event delivery is working, including nested grabs. Chase Douglas started on the pointer emulation while I was away and we're now at the point where emulation works, except that pointer grabs on top of multitouch clients aren't handled yet. I'm still rather optimistic to get this into 1.12, though it's getting a bit unwieldy. Carlos Garnacho has already sent me some patches, so he's testing the lot against the GTK branches.

However, since touch support cannot simply be bolted on top and needs to be integrated properly, this has triggered some extra rewrites here and there. I'm currently some 200 commits ahead of the master sync-point. I'm planning to get this number down to something sane before merging, but meanwhile, sorry, I'll have to keep ignoring you until this is done.
