The touch action

Over the past few weeks I have done some fundamental research into the touch action and
its consequences, and it’s time to present my conclusions in the form of the
inevitable compatibility table. I have also written an
advisory paper that details what browser vendors must do in order to
get by in the mobile touchscreen space. Finally, I discuss a few aspects of my research in this article.

Disclosure: This research was commissioned and paid for by Vodafone. Nokia, Microsoft, Palm, and RIM
have helped it along by donating devices to me.

When a user touches the screen of a touchscreen phone, sufficient events should fire
so that web developers know what’s going on and can decide what actions to
take. Unfortunately most mobile browsers, especially Opera and Firefox, are severely
deficient here.

The touch action is way overloaded, and most browsers
have trouble distinguishing between a click action
and a scroll action. Properly making this distinction is the only way of creating a truly
captivating mobile touchscreen browsing experience.

The iPhone’s touch event model is excellent and should be copied by all
other browsers. In fact, these events are so important that I feel that any browser that does
not support them by the end of 2010 is out of the mobile browser arms race. There’s only
one problem with the iPhone model, and it’s relatively easy to fix.

I have created a drag-and-drop script that works on iPhone and Android
as well as the desktop browsers,
a multitouch drag-and-drop script that works only on
the iPhone, and a scrolling layer script that
forms the basis of faking position: fixed on iPhone and Android, which do not
support that declaration natively.

I will present my research at the DIBI conference in Newcastle upon Tyne on 28th April. The talk will likely
include future discoveries and thoughts.

Problems with the touch action

One of the most serious problems of the current touchscreen interfaces is that the
touch action is way, way overloaded. When the user touches the screen he may want to start
a click action, a scroll action, or a resize action. Distinguishing correctly between
these three actions is what sets a good touchscreen interface apart from a bad one.

This is especially important with regard to the click action. Far too often, an intended
click on an element does not work because the user moves his finger a tiny little bit
during the action, and the operating system concludes he wants to perform a scroll action
instead. The page does not really scroll because the user does not really move his finger,
but the click action is canceled, and this results in a seemingly unresponsive interface.
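A page script can guard against this itself by treating a touch as a click as long as the finger stays within a small slop radius. This is a minimal sketch of the idea; the 10-pixel threshold is an assumption for illustration, not a documented value.

```javascript
// Decide whether a finished touch counts as a tap or a scroll,
// based on how far the finger moved. Kept as a pure helper.
function isTap(dx, dy, slop) {
  return Math.sqrt(dx * dx + dy * dy) <= slop;
}

var SLOP = 10; // pixels of allowed finger wobble; an assumed value

if (typeof document !== 'undefined') {
  var startX, startY;
  document.addEventListener('touchstart', function (e) {
    var t = e.touches ? e.touches[0] : e; // iPhone stores coordinates in e.touches
    startX = t.clientX;
    startY = t.clientY;
  }, false);
  document.addEventListener('touchend', function (e) {
    var t = e.changedTouches ? e.changedTouches[0] : e;
    if (isTap(t.clientX - startX, t.clientY - startY, SLOP)) {
      // treat it as a click even though the finger wobbled slightly
    }
  }, false);
}
```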

This problem is especially severe in the Samsung WebKit that runs in the Samsung H1 and M1
Widget Manager, but it occurs on other operating systems, too.

Only iPhone and Palm have really solved this problem; the other browser vendors are
in various stages of catching up.

Current state

Events can be divided into three groups:

Touch events that describe the exact actions of the user on the screen without
reference to the result of these actions. Examples: touchstart, touchmove, touchend.

Interface events that describe the result of the touch action. Examples: click, resize,
scroll, contextmenu.

Legacy mouse events that are fired for compatibility with sites created for the
desktop. Examples: mouseover, mousemove, mousedown, mouseup.

The touch events are supported only by iPhone and Android, in that order. Even multitouch Androids
do not support more than one series of touch events (i.e. more than one finger) at a time.

All browsers support the legacy events, although some don’t support every one of them. Still,
these differences are largely irrelevant.

The real chaos is in the interface event group. Most WebKit-based browsers support most events
reasonably well, but all others, including Opera and Firefox, don’t support them at all or
make a mess of them. That’s annoying, because it’s these events that actually tell
us what the user is trying to achieve.

Touch events

The touch events are touchstart, touchmove, and touchend. As you’d expect, the first
fires once when the user initially touches the screen, the second continuously while the user is
moving his finger, and the third when the user releases the screen.
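Wiring up this lifecycle looks like the sketch below. The element id 'testEl' is a hypothetical placeholder, and the handler registration only runs in a browser.

```javascript
// Describe the lifecycle phase a touch event represents.
function describeTouchPhase(type) {
  switch (type) {
    case 'touchstart': return 'finger down';
    case 'touchmove':  return 'finger moving';
    case 'touchend':   return 'finger up';
    default:           return 'unknown';
  }
}

if (typeof document !== 'undefined') {
  var el = document.getElementById('testEl'); // hypothetical test element
  if (el) {
    ['touchstart', 'touchmove', 'touchend'].forEach(function (type) {
      el.addEventListener(type, function () {
        console.log(type + ': ' + describeTouchPhase(type));
      }, false);
    });
  }
}
```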

All browsers MUST (in the sense of RFC 2119) support these events at their earliest opportunity.
Any browser that does not support them by the end of 2010 is out of the mobile browser race.

The touch events are currently supported only by iPhone and Android.
There are a few differences between their models:

The iPhone also supports the gesture events, which fire when more than one
finger touches the screen. I’m not yet totally sure how useful they will be;
I’d like to see (or create) a practical test script first.

The iPhone stores information about the touch, notably the coordinates, in
a special touches interface, while the Android stores them directly
on the event object itself. Google is right, and Apple is wrong. I explain why in my
advisory paper.
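Until the models converge, a script has to read coordinates from both places. A small helper along these lines covers the two models described above:

```javascript
// Read the first touch point's coordinates regardless of model:
// iPhone exposes them on e.touches[0], while Android (at the time
// of writing) puts them directly on the event object.
function getTouchPoint(e) {
  var src = (e.touches && e.touches.length) ? e.touches[0] : e;
  return { x: src.clientX, y: src.clientY };
}
```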

There is also the touchcancel event that fires when the user’s touch is “canceled.”
I haven’t yet studied it; I feel that it’s useful only in a very few edge cases.

The touchmove and touchend events also fire when the touch action moves out of the
element where the touchstart event took place.

If a touch action moves into an element, touchstart generally fires. I’m wondering
if we should use touchenter instead, which in turn presupposes the existence of a
touchleave event.

I’m also wondering whether we need a touchhold event, and whether the touch events
should return an area instead of just a coordinate of a single pixel.

Interface events

The interface events fire when the user actually takes an action instead of aimlessly touching
the screen. These actions include:

Clicking on a link or otherwise activating an element. The click event should fire.

Scrolling. The scroll event should fire.

Zooming. We need a new zoom event for this situation. Currently a few browsers fire
a resize event instead, but that’s not good enough, and it might be that resize should
be assigned to another situation entirely. I need to do more research on this.

Calling up a context menu. The contextmenu event should fire. More in general we might
need a touchhold event that fires when the user touches the screen and holds his finger
in place, whether or not a context menu pops up.
On the other hand, this event is pretty easy to fake in JavaScript.
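Faking a touchhold is indeed straightforward: start a timer on touchstart and cancel it if the finger moves or lifts first. The 750 ms delay is an assumed value, and the timer functions are injected so the logic stays testable.

```javascript
// Fake a 'touchhold' event with a timer. If the finger stays down
// for HOLD_MS without moving or lifting, onHold fires.
var HOLD_MS = 750; // assumed hold delay

function makeHoldDetector(onHold, setTimer, clearTimer) {
  var timer = null;
  return {
    start: function () { timer = setTimer(onHold, HOLD_MS); },
    cancel: function () {
      if (timer !== null) { clearTimer(timer); timer = null; }
    }
  };
}

if (typeof document !== 'undefined') {
  var detector = makeHoldDetector(function () {
    // show a custom context menu here
  }, setTimeout, clearTimeout);
  document.addEventListener('touchstart', detector.start, false);
  document.addEventListener('touchmove', detector.cancel, false);
  document.addEventListener('touchend', detector.cancel, false);
}
```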

Browser compatibility is a bloody mess:

The click event fires in all browsers.

The scroll event does not fire in Opera, Samsung WebKit, Firefox, MicroB,
and NetFront when the user scrolls.

The zoom event does not exist yet; I invented it during my research. We need
it badly, though.

The contextmenu event does not fire when a context menu appears in Opera,
Android, MicroB, or NetFront. It fires correctly on Iris and IE. The other browsers
do not have a touch-based context menu.

Legacy events

Current websites are created exclusively for desktop, and they use the legacy events
extensively. Since mobile browsers want to give their users access to the “fixed”
web, they have to support these legacy events.

Still, the touchscreen environment is not the same as the desktop environment. Most
notably, a mouse action on the desktop is not equivalent to a touch action on
a phone. The user of a touchscreen phone needs to touch the screen for pretty much any action
(zoom, scroll, click), something that is not true for a mouse.

This is how the legacy events work currently:

If a touchstart and a touchend action occur on (roughly) the same coordinates this constitutes
a touchclick action. The legacy events are fired, as is the click event.

The legacy events are mouseover, mousemove, mousedown, and mouseup. The event
order may differ, but that doesn’t matter. In addition
the :hover styles, if any, are applied to the element.

Only one mousemove event fires.

When a touchclick action occurs on another element, the mouseout event is fired on
the original element and the :hover styles are removed.

The dblclick event is not supported. (It’s totally useless, anyway.)

The Vodafone Widget Managers fire the mousedown event when the user initially
touches the screen.

Thus, the S60 Widget Manager in theory allows a mousedown/mousemove-based drag-and-drop.
In practice performance is so painful that I advise you not to bother.

iPhone and S60 WebKit cancel the rest of the events if a DOM change occurs onmouseover
or onmousemove. I’m sure this is a good idea in the short run, but the mobile space
will have to emancipate itself from the “fixed web” and this rule
must eventually disappear.

BlackBerry and NetFront do not support the mouseover and mousemove events.
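To see which of these legacy events a given browser actually fires on a touchclick, and in what order, you can attach one logger to all of them. A sketch, with the logger kept as a plain function:

```javascript
// Attach one logger to every legacy event plus click, to observe
// which ones a browser fires on a "touchclick" and in what order.
var LEGACY_EVENTS = ['mouseover', 'mousemove', 'mousedown',
                     'mouseup', 'mouseout', 'click'];

function makeEventLogger(log) {
  return function (e) { log.push(e.type); };
}

if (typeof document !== 'undefined') {
  var log = [];
  var handler = makeEventLogger(log);
  LEGACY_EVENTS.forEach(function (type) {
    document.addEventListener(type, handler, true);
  });
  // inspect `log` after touching the screen
}
```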

Drag-and-drop and position: fixed

A drag-and-drop script that works on iPhone and Android
in addition to the desktop browsers. The trick here is removing the mouse events when the
browser turns out to support the touchstart event.

A multitouch drag-and-drop script that works only on
the iPhone. Android does not support more than one finger at a time. The iPhone
handles the touch event properties wrongly.

A scrolling layer script that works on the iPhone and Android.
Remy Sharp pointed out that it also contains the solution to
getting position: fixed to work on the iPhone and Android.
Essentially, if you make this script vertical instead of horizontal it’ll create a scrollable
layer between “fixed” panels. There are various tricky bits involved, though, and
I’ll have to return to this problem later. Still, we now have the basic solution.
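The trick from the drag-and-drop script above can be sketched as follows: register both mouse and touch handlers, then retire the mouse handlers the first time a real touchstart fires. This is an illustrative reduction, not the actual script; the function and parameter names are my own.

```javascript
// Sketch of the drag-and-drop trick: wire up both mouse and touch
// handlers, then remove the mouse handler as soon as a real
// touchstart fires, so the two models never run simultaneously.
function initDraggable(el, onDragStart) {
  function mouseHandler(e) {
    onDragStart(e.clientX, e.clientY);
  }
  function touchHandler(e) {
    // the browser supports touch events: retire the mouse handler
    el.removeEventListener('mousedown', mouseHandler, false);
    var t = e.touches[0];
    onDragStart(t.clientX, t.clientY);
  }
  el.addEventListener('mousedown', mouseHandler, false);
  el.addEventListener('touchstart', touchHandler, false);
}
```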

iPhone clickable area

I suspect that on the iPhone the actual clickable area is slightly
shifted downwards with regard to the visible HTML element. That is, a click just below the
element may also be counted as a click on the element itself. The same is not true of the area
just above the element.
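One way to collect evidence for this shift would be to log, on every touch, the vertical offset between the touch point and the target element's bounding box. A hypothetical test-page sketch:

```javascript
// Log the vertical offset between where the finger landed and the
// target element's box: negative means above, positive means below,
// zero means inside. Evidence for the suspected downward shift would
// be clicks registering with a positive offset but not a negative one.
function verticalOffset(touchY, elTop, elBottom) {
  if (touchY < elTop) return touchY - elTop;       // above the element
  if (touchY > elBottom) return touchY - elBottom; // below the element
  return 0;                                        // inside the element
}

if (typeof document !== 'undefined') {
  document.addEventListener('touchstart', function (e) {
    var t = e.touches[0];
    var r = e.target.getBoundingClientRect();
    console.log('offset: ' + verticalOffset(t.clientY, r.top, r.bottom));
  }, false);
}
```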

To try it for yourself, use the multitouch drag-and-drop.
Try it in normal vertical orientation first, and try to touch the element as low as possible.
Then turn the device 180 degrees and try again. You’ll find that you have to touch
the HTML element distinctly higher now.

I’m still trying to figure out how to prove conclusively that Apple does this; maybe
I’ve misinterpreted the test results.

iPhone documentation is lousy

Finally, during my research I noticed time and again how unbelievably lousy Apple’s
Safari iPhone documentation site is. I advise developers to use my pages instead — as usual.

Part of the problem is the content; once you get to the correct page you will find a terse
summary of the situation that is correct as far as it goes but mostly leaves out stuff that
doesn’t fall exactly in the page’s topic, even if it’s closely related.

Worse, the invention of the cross-reference has taken Apple completely by surprise.
I wish I could say it’s scrambling to catch up, but it isn’t. No cross-references
whatsoever. Anywhere.

Try it yourself. Go to the TouchEvent page. It contains absolutely zero reference to
the crucial TouchList interface that exposes useful information about the touch events. There
is in fact a workable page about TouchList. Try to find it from the TouchEvent page. You
won’t.

Alternatively, try finding the same information from the
Safari Reference Library home page. Search works — if you know
what to search for. The official navigation is absolutely useless.

Apple’s pseudo-frames or something also drive me crazy. The keyboard
focus does not snap to the actual documentation page, which means I initially can’t scroll.
(Firefox)

Comments

I noticed the iPhone "target below" behavior everywhere. You can test it on the home screen, where tapping the top of an icon doesn't trigger a click on the app, while tapping well below it works as expected.
No proof either, but I think it makes the UI more forgiving. A pretty good tweak when you hold your phone the way it's meant to be held.

* When you click with your finger, do you prefer to see a bit of the button above your fingernail, or to cover the button completely?

* For that matter, if you had long nails and tried to click a button, where would your skin actually touch the screen?

I think this might be Apple's solution to the lack of precision from a finger, instead of generating a "touch area" property. Can you imagine the different implementations of "touch area"? And which percentage of the touch area creates a valid event?

For example, let's say there's a really large button, so that 50% of your finger touches the button, but that half of your finger only covers 20% of the button. Should that be a click?

Great work once again, PPK! I loved to see the results of your extensive testing; your post has now given me a good knowledge of what to expect in the varied implementations of events in mobiles. This is certainly a problem - desktop browsers are all over the place (especially IE), but when compared to mobile browsers' implementations, they look very consistent. Thank you very much for this post, it has helped me a lot.

As others have said, Apple's touch-below behavior is common across the iPhone/iPod. While I can see the rationale behind it, I touch with the finger, not the nail, and having got an iPod after a year of using Android I find it quite annoying. An option in the system menu from Apple would be nice.

Assuming the web model is like the Cocoa model, touchcancel is in fact important. When a touch gesture begins, its type (say, a flick, scroll, or swipe) is not yet resolved. Suppose the user puts his finger down on a button in a scroll view. Then he begins to move. The original press should be canceled and replaced with a scroll. Etc. Cancellation is important because gesture recognition cannot be completed before events are issued. You need to be able to issue a retraction.