Cocoa Therapy

Why?

I’m very interested in the iPad as a tool for content creation, or what you might call “casual content creation”. With iWork for iPad, Apple sent a clear message: the iPad is more than a content consumption device.

I believe content creation apps are going to be key to the iPad as a platform.

The competitive strategy angle is that HTML5 is narrowing the gap, so much so that a content consumption web app can come pretty close to a native app. However, for key content creation aspects like interactivity, integration with the system, media richness, speed, disconnected use and polish, native apps still have an edge. This means that native content creation apps differentiate the iPad from (upcoming) competing tablets more than a content consumption app can.

The platform strategy angle is that content creation apps complete the iOS platform and, in the context of the laid back/feet on the table target, make it general purpose.
Content creation on a touch screen is a shocker to people who believe you’re only creating content when you’re typing text—how could you type all the text on a soft keyboard?

Once you overcome the concept of keyboard and mouse as the primary input devices, you realize they might in fact have been a blessing and a curse, stifling creativity by forcing tools and users to adapt, instead of the other way around. What would music sound like today if we had never evolved from harpsichord and lute? Would Mark Knopfler play Sultans of Swing on a harp?

Mouse

The basics of computer literacy today involve learning to coordinate eye and hand to internalize the separation between hand movement and on screen activity. Change your mouse acceleration settings and realize just how delicate the abstraction is.

To a novice user, aiming at something on screen with a mouse is like trying to ring a doorbell using a broomstick. The tool that’s between you and the target object is the cause of the lack of directness. You will get used to it out of necessity, but that doesn’t make it better than direct interaction.

To sum it up, a first level of indirection is removed by touching objects on screen: you directly touch and manipulate information you want to act upon.

Selection

This is a tricky one. Twenty-six years of Windows, Icons, Menus and Pointer leave us with conventions, behaviors and encrustations that build on input device limitations in ways that make user interfaces hard to learn.

Object selection is one of the relics of WIMP interfaces: the selected state is a form of intrinsic UI modality. Objects and actions are pleasingly orthogonal to the mathematically inclined, but performing actions on objects that are in a selected state is another form of indirection, akin to using a marker to highlight the Lego brick that you’ll then pick up with your “hand tool” to perform an assembly action.

Selection is the premise to the Great Inspector Hunt, whereby you click on an object to manipulate it and then go to an entirely different place to hunt down the property you’re looking for.

Despite the trend to simplify user interfaces and remove features, there still are too many features to expose only through gestures, particularly considering the lack of a shared gesture vocabulary among different apps.

Multiple selection is a common UI feature that doesn’t map well to touch screens, and in fact it doesn’t really map well to some kinds of objects even on the desktop, like discontinuous text selection. Multiple selection also prevents contextual UI placement, which I believe to be a problem of multiple selection, not of contextual inspection.

On a touch screen the two “obvious” ways of implementing multiple selection are the iWork way (tap and hold the first object, then use the other hand to tap other objects) and the desktop-inspired drawing of a rectangular “rubber band” that selects all objects it touches. Multitouch with two hands is a pretty demanding technique: let go of the first object and you lose the selection, the first hand partially obscures the screen and is in the way, and you really have to put the iPad down. Dan Messing’s otherwise excellent FreeForm drawing app implements rubber band object selection. The fundamental downside is that because there’s no hovering mouse cursor and because the entire canvas is a target that initiates a rubber band, it is exceedingly easy to activate accidentally, deselecting the object you’re working with.

The solution is to think about what multiple selection is really needed for and try to do that differently. Text style can be applied to multiple objects by copy/pasting style information. Keynote on iPad uses a suboptimal multi-hand gesture to match object sizes, which is essentially style cloning. Mail on iPad groups “objects” by entering a mailbox “edit” mode, where multiple messages can be selected to be moved or deleted. There likely are other workable solutions that can help kill the need for multiple selection.

So while from a software development point of view you can still think of an object as being selected, from an interaction design point of view it probably helps to think selection doesn’t exist, that by tapping an object the user is asking for the object manipulation UI to be exposed. Near the object of course.
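As a rough illustration of exposing the manipulation UI “near the object”, here is a minimal, framework-independent sketch in Python. Everything in it is hypothetical (the `Rect` type, the `place_controls_near` function, the 8 point gap); it is not how iWork or any Apple framework does it, just one plausible placement rule: prefer the right side of the tapped object, fall back to the left, and keep the panel on screen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle, origin at the top left (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float


def place_controls_near(obj: Rect, panel_w: float, panel_h: float,
                        screen: Rect, gap: float = 8.0) -> Rect:
    """Return a frame for the contextual controls next to the tapped object."""
    # Prefer the space immediately to the right of the object.
    x = obj.x + obj.width + gap
    if x + panel_w > screen.x + screen.width:
        # Not enough room on the right: fall back to the left side.
        x = obj.x - gap - panel_w
    # Vertically centre the panel on the object.
    y = obj.y + obj.height / 2 - panel_h / 2
    # Clamp so the panel never leaves the screen.
    x = min(max(x, screen.x), screen.x + screen.width - panel_w)
    y = min(max(y, screen.y), screen.y + screen.height - panel_h)
    return Rect(x, y, panel_w, panel_h)


screen = Rect(0, 0, 1024, 768)
tapped = Rect(100, 100, 200, 100)
controls = place_controls_near(tapped, 240, 160, screen)
```

The point of the sketch is only that the tap itself decides where the controls appear, so there is no separate inspector to hunt down.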

Semantics

When you approach a building you don’t type in the door angle in degrees to open it. Sometimes the door handle shape doesn’t match how you actually use the door, but you definitely manipulate the door directly.

Yet apps use generic number or string editing controls all the time, instead of meaningful visual metaphors. User interface kits of standard controls help developers build apps faster, at the cost of seldom representing object properties in a way that directly manipulates the property.

The lack of directness, a result of using a surrogate representation, might have been a reasonable compromise between immediacy and implementation simplicity in the past. Today it’s just forcing the user to think in terms of the app’s representation of a property, rather than the property itself. So, once again, it’s a form of indirection that should be removed.

Fixing this is very dependent on the kind of property, and there’s a tradeoff between frequency of use of a property and clutter when all the property controls come on screen.

Keyboard

A landscape touch screen makes for a surprisingly functional keyboard, considering it gives no tactile feedback. Clearly it is no match for a real keyboard: if your job is to crank out a few pages of text a day you’ll obviously want a real keyboard and perhaps a full desktop OS.

But actually if you are a gifted writer and can dump your brain and turn thoughts into actual textual content in an unretouched stream of words, the glass keyboard might be reasonably usable. The issues become apparent when you need to edit and move text around, to manipulate text instead of to produce content. Moving the cursor around with arrow keys or using keyboard accelerators to manipulate text, like hand-eye coordination in using a mouse, is second nature to anybody who spends any amount of time typing.

Beginners will instead just backspace through perfectly good text to get to a typo, until they learn the magic of the left arrow key: it’s like a backspace that doesn’t delete!
This is not to reiterate the “iPad is for young/old/dumb people” cliché, rather to point out that keyboard-based text manipulation is not a natural interface, that we should consider the idea that there might be alternate interfaces for text editing, that function and meta keys that have popped up over the decades are barnacles on the “content keys” rock.

A new interface for something so fundamental isn’t something you dream up without testing and I haven’t given it a huge amount of thought, but I do believe that a keyboard-less multitouch text navigation and manipulation UI might be a workable solution, better than trying to replicate keyboard and mouse based manipulation on a multitouch screen.

Approximating

The iPad is sexy and makes you want to use it even for content creation. In a future article I plan on discussing how the above applies to the iWork UI, though it’s clear that iWork is just a first good shot, definitely not the final word on touch content creation UIs.

I believe content creation apps will define the iPad and make it many times more useful than it currently is, and I believe only UIs that remove indirection and bring content closer to the user will succeed in disappearing, so that, as Adam Engst puts it, “the iPad becomes the app you’re using”.

You shouldn’t write off the iPad as a content creation tool just because iWork isn’t quite there yet. Remember the state of Mac software on January 24th, 1984? Yeah, MacWrite or MacPaint, one at a time. And the Mac was born for content creation, as most dead-tree-age, printer-front-end solutions were at the time.

I’m at WWDC for a few more days, so if you’re in San Francisco I’d be happy to chat about this, just contact me @duncanwilcox.

Real apps

I am curious about how “real apps” will work on the iPad, how an application that is more than a utility or casual/light use exposes functionality in a touch-optimized user interface.

The only available information is what Apple has shown at the iPad announcement on January 27th. We will surely discover more soon, but I wanted to get a head start and, well, couldn’t help watching the keynote a few times and grab all available footage from the post-keynote hands-on.

Many of the interactions are pretty obvious on the surface, but on a second (or third) look reveal the non obvious or the unexpected. So here’s a quick rundown of what I have found, mainly in the iPad version of Keynote, Pages and Numbers.

A lot of the UI and interaction that wasn’t shown on stage is visible in the hands-on videos. It is interesting and sometimes hilarious to see the Apple representative stumble into features and expose the difficulties as he or she learns the UI, sometimes even failing to discover the proper gesture, notably how to exit a Keynote presentation in the Engadget hands-on video.

Controls

A slight indentation of the slides in the outline on the left in Keynote could seem to be just a visual representation of hierarchy, but it actually is a fully functional outline with collapsible sections; note the disclosure triangle.

Outline hierarchy

Collapse sequence after the disclosure triangle has been pressed

In two instances in Keynote popovers and UI persist in-context (next to content) only when in a mode explicitly entered and exited by the user, which seems like a reasonable rule to follow.

Transition build popovers

Image masking in-context controls

Interaction issues

There are inconsistencies in how UI modes are entered in Keynote. Slide reordering mode is entered by tapping and holding a slide in the outline, transition setup is entered by tapping the corresponding menu icon, and image resize mode is entered by double tapping an image.

Pictures, charts and shapes can’t be dragged out of the “insert object” popover, so in Keynote the new items are just lumped on the slide. The inability to drag is inconsistent with press-and-drag to reorder slides on the sidebar and while we don’t know what pressing and holding does, I’ll guess that since a single tap inserts the image, pressing and holding has no effect. The reason for the non functional drag and drop might also be related to the modality of popovers.

Attempt to drag an image to the slide fails

Reordering columns in Numbers doesn’t appear hard to perform, but I feel like it’s nearly impossible to discover that you have to tap exactly on the column header to select it, then tap and hold to drag it.

Dragging columns in Numbers

I haven’t found any footage of pictures or charts being deleted (or slides for that matter), nor any UI that could be thought to perform deletion, and the Apple representative in the Slashgear video solves it by repeatedly pressing the (prominent) undo button. This could either mean it hasn’t been implemented, or the delete gesture isn’t all that obvious.

Reaching for the visible Undo button

Multitouch is one thing, multihand a whole other. It doesn’t look at all comfortable, and Phil Schiller actually had to put the iPad down when moving multiple slides and when matching image sizes in Keynote.

Multihand multitouch. Ouch!

Complete?

The inspector icon always shows the proper inspector for the selected object, in this case text. Aside from the questionable, arguably not-keynote-ready design, the text inspectors scream incomplete! There are text styles, but where is the text style editor? Where’s a font picker? How about a color picker? These are UI elements we take for granted on a desktop, and they are necessary for content creation. Clearly they’re coming soon.

Text style popover

Text layout popover

Finds

Here are screenshots of things I haven’t seen elsewhere.

Pressing the bottom left + opens the “Tap to insert a new slide” popover, tapping on one of the theme-preset slides animates it to the center of the screen.

The new slide popover

Even though the popover arrow points to the left, the tools menu appeared after pressing the wrench icon. I didn’t expect to see “Find” or “Help” in here. The last two items are “Slide numbers” and “Check spelling”.

The Tools menu

The shapes pane of the “insert object” popover.

Shapes pane

Here Keynote had just been reopened after a crash, evidently unable to save a preview of the document.

Keynote document browser after a crash

Content creation?

In conclusion, is the iPad going to be a good platform for productivity apps, apps that go beyond content consumption? For the time being I will have to answer with a conditional yes, despite Phil Schiller insisting that you can do “really advanced” things “with just a finger”. I feel like Apple is still figuring out the UI and iWork will likely be a little different when it ships.

The overarching problem seems to be that finding a place for UI elements and finding a way to interact with them is a new science and needs new conventions. Additionally, gestures aren’t always easily discovered even when they are consistent across apps, and the lack of conventions leads to non-uniformity, which makes discovery even harder. The best bet here is following iWork’s behavior as much as possible.

I will say that the iPad is going to be good enough for “casual content creation” at the very least. Given time and research and experimentation with touch UIs, it definitely feels like it has the potential of eventually replacing desktop UIs for many tasks.

Inside Macintosh

The “Inside Macintosh” book series, started in 1985, is without doubt a milestone in the modern computing experience, and not just on the Mac.

Inside Macintosh was largely a reference to what has today become Carbon. Part of the first volume was what has become the Human Interface Guidelines. The main goal of the HIG was to push applications towards a consistent appearance and behaviour. What is remarkable, and what still defines the Mac user experience today, is that Mac developers have pretty consistently adhered to the HIG.

The current Human Interface Guidelines is a three part document: “Application Design Fundamentals”, “The Macintosh Experience” and “The Aqua Interface”. The first two, a fifth of the content, largely pertain to the design process and common sense application design issues. The third part, over 80% of the content, is a very specific and detailed set of guidelines on the purpose, interaction and visuals of the elements of the Aqua UI.

In the 23 years since the first Inside Macintosh much has changed in the Mac look and feel, but I believe a turning point in the evolution of the HIG was the development of iLife around 2002-2003, when Apple engineers and designers began experimenting outside of the sanctioned guidelines. Apple is continuously pushing the envelope beyond HIG conformance out of necessity, to innovate and refresh the look and feel, and iLife was and is the testbed for this.

Extending the Interface

In “Extending the Interface”, the guidelines leave the door open for extensions to the Aqua UI: “When a need arises that can’t be met by the standard elements, you can extend the set of controls […]”. So you’re violating the HIG only if you’re extending the interface when a standard element would meet the needs of the UI you’re implementing.

In the context of solving user needs, this explains why for example Aperture and Final Cut use grayscale controls and translucent (“HUD”) panels: standard controls and panels create visual noise that interferes with the perception of color of the actual content the application is trying to render. But does it also explain the iPhoto toolbar location? The iTunes 7 dark scrollbar look?

A reasonable argument might be that while Apple is technically violating the current HIG, they aren’t violating future guidelines: system-provided controls and frameworks, and the guidelines themselves, will eventually be updated to reflect the innovation.

What should a third party developer do? John Gruber’s C4[0] talk “Consistency vs. Uniformity in UI Design” spurred some debate and spread the “the HIG is dead!” meme.

A guideline is not a hard rule, and the HIG is open to extensions when “a need arises” anyway. I can’t help thinking that while there are important details regarding what Aqua controls should look like, they are still only details. If you look at any productivity app, you’ll find that the major part of the UI is composed of elements that are impossible to codify in guidelines.

The image editor view of Acorn, Pixelmator, Photoshop? The spreadsheet and charts editor in Numbers and Excel? The waveform view in FuzzMeasure, Fission or Logic? Clearly the visualization and the interaction depend on the nature of the data, and while there are several types of data that have natural or established visual mappings, interaction and editing are a whole other can of worms. This is where developers spend a lot of their coding time, and take the tough decisions that make or break the app.

An application with visually distinctive window style or control look will never stand a chance against an application with distinctive data visualization and interaction. As JLG once put it, “At a risk of being called sexist, ageist and French, if you put multimedia, a leather skirt and lipstick on a grandmother and take her to a nightclub, she’s still not going to get lucky” — he was referring to Windows 98 at the time, but it’s a timeless quote.

Visual user interfaces

Now imagine you’re John Geleynse (oh perhaps you are — hi John!), you desperately want to help Mac developers build better apps, you’ve covered the use of system-provided user interface elements, now what? The closest to describing what goes in the main content view is probably the “Reflect the User’s Mental Model” section in the “Human Interface Design Principles” chapter. Good, solid principles, but kinda fuzzy when you’re trying to build a new UI.

The Apple Human Interface Guidelines never covered the hardest part of building user interfaces, beyond general high level design principles. What’s changing now is what counts as a good, versus merely acceptable, user interface:

a table with numbers (covered by the guidelines) becomes a 3D shaded translucent chart (not covered)

a set of parameters controlled by sliders (covered) is now a rich visual representation of the parameter-generated data set (not covered)

a list of cities in a popup menu (covered) is now a map with the cities over it (not covered)

… and so on.

Many apps already expose core content data visually. For many of the apps that don’t, the future is for the standard controls-based UI to be replaced with application specific graphic design and interaction design, packaged in a custom view with contextual editing controls and direct manipulation.

Edward Tufte’s work on information visualization is a great source of inspiration, and I believe the main view of many apps could be implemented as interactive Tufteian graphics. Bret Victor dissects this process in his amazing Magic Ink essay, where he describes how he built the user interface for the award-winning BART widget (pictured above).

In my experience Apple UI people are quite helpful in anything from brainstorming to refining UI concepts, for example in the WWDC labs and UI design sessions. However, discussions inevitably are on a case by case basis. It has become apparent that building a high quality app on the Mac now requires having a designer on the team.

Visceral user interfaces

How has GUI interaction evolved? Let’s put a square on screen:

You are prompted for X&Y coordinates, you enter them, click a button, the square appears;

You are shown a square, you tweak the X&Y numbers and it moves;

You are shown a square, you drag a horizontal and vertical slider and it moves;

You are shown a square, you click on the square and drag it where you want it;

You are shown a square, you barely touch the screen and the square follows your finger.

Each step removes a little bit of abstraction and a little bit of indirection, until interaction is natural, because while technically there is an interface, you’re manipulating the content with your hand and you don’t feel like there’s an intermediary.
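To make the difference between the middle and the last steps concrete, here is a small sketch in Python rather than in any real UI framework. All names (`Square`, `move_with_sliders`, `DirectDrag`) are made up for illustration. The slider version maps an abstract 0..1 value to a position, while the drag version simply keeps the square under the finger, preserving the point where it was grabbed.

```python
from dataclasses import dataclass


@dataclass
class Square:
    """The square on screen; x and y are its top-left corner (hypothetical model)."""
    x: float
    y: float
    size: float = 50.0


def move_with_sliders(square: Square, h_slider: float, v_slider: float,
                      canvas_w: float, canvas_h: float) -> None:
    """Slider step: values in 0..1 are translated into canvas coordinates."""
    square.x = h_slider * (canvas_w - square.size)
    square.y = v_slider * (canvas_h - square.size)


class DirectDrag:
    """Click/touch steps: the square follows the pointer, with no mapping to learn."""

    def __init__(self, square: Square, touch_x: float, touch_y: float) -> None:
        self.square = square
        # Remember where inside the square the finger landed,
        # so the square does not jump when the drag starts.
        self.dx = touch_x - square.x
        self.dy = touch_y - square.y

    def move_to(self, touch_x: float, touch_y: float) -> None:
        self.square.x = touch_x - self.dx
        self.square.y = touch_y - self.dy


square = Square(0, 0)
move_with_sliders(square, 0.5, 1.0, canvas_w=1050, canvas_h=750)
slider_position = (square.x, square.y)

drag = DirectDrag(square, touch_x=510, touch_y=720)
drag.move_to(110, 220)
drag_position = (square.x, square.y)
```

In the slider version the user has to learn the mapping from slider travel to canvas position; in the drag version the only state is the grab offset, which the user never has to think about.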

Is it dead?

The HIG is still good. In fact the first fifth of it is pure gold, still 100% current and relevant. The Aqua guidelines are just less relevant to applications where user interaction and data representation are tightly coupled, where content and UI aren’t separated by indirection, where interaction is more visceral, where the content is the user interface.

About the Author

Hi, I’m Duncan Wilcox, I’m a software developer and chocolate addict, living in Florence, Italy. I’m passionate about the Mac, photography and user interaction, among other things. Contact me at duncan@wilcox.it or follow me on Twitter