Design

I will be in Brussels next weekend to attend FOSDEM and give an updated version of the “Sketching Interactions” talk that I delivered at GUADEC (summary).

In order to have some more material to talk about, I have revisited my interactive sketch for navigating among open pages in Epiphany.

To recap: the problem that we are looking into is how to manage open pages in your Web browser. This is commonly done with tabs, but these have some problems: they display very little information, are hard to use on touch screens, and scale badly.

To illustrate this last point, here is how 15 open pages would look in Epiphany right now:

Hardly ideal.

This work looks into alternative in-app navigation among open pages that would (hopefully!) improve Web browsing. I started by prototyping the current proposal in the GNOME wiki and have continued from there. From the previous iteration, it seemed that a grid view might be a good solution for choosing among open pages:

Grid with open pages

I have extended that idea with a “New Page” view that would allow the user to review and search among their bookmarks, recently visited pages, reading list, etc. For now, this view just offers a fixed list of sites to illustrate how navigation would work, but it wouldn’t be hard to extend it to try more complex behaviour:

Part of the reason for working on this was to offer some ideas to my Igalia colleagues and other members of the GNOME community who are working on Epiphany and WebKitGTK+. The other part was to encourage people to try out ideas, not just argue about them. Too much time is lost arguing when we could be showing.

This can be done quickly and inexpensively: in this particular example, in 300+ lines of QML. The key is to focus on doing just the bare minimum to portray the experience that we are interested in. Because these sketches are quick and cheap, they enable us to explore and discard many ideas easily.

Communication of design ideas and decisions is especially complex in a distributed community like GNOME. Interactive sketches like the one here could help improve this situation.

Some time ago, I wrote a small functional prototype to explore some of the design ideas for the evolution of the GNOME Web browser (maintained by my colleagues at Igalia). I thought that it would be a good idea to show these experiments to a wider public.

The basic idea by the GNOME designers is that, instead of tabs, open pages would be placed in an overview: you would click on a thumbnail there to return to a certain web page, and clicking again on “Pages” would take you back to the overview. A possible evolution of this would be to integrate bookmarks and reading lists in that overview.

This first video shows the interaction as described in the current design: in the overview, open pages are shown in a horizontal list, which gets reordered so that the leftmost element in the list corresponds to the last open tab. Note how the thumbnail is updated whenever we go back to “Pages”, and how the list scrolls to the left to show the most recently opened sites.
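The reordering described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the prototype: the `PagesOverview` class and its method names are made up for this example, and it assumes that "last open tab" means the page that was visited most recently.

```python
from collections import OrderedDict


class PagesOverview:
    """Hypothetical model of the overview: open pages, most recent first."""

    def __init__(self):
        # Maps URL -> thumbnail label; the last item is the most recent page.
        self._pages = OrderedDict()

    def visit(self, url):
        """Open a page, or return to it, and refresh its thumbnail."""
        self._pages[url] = f"thumbnail of {url}"
        self._pages.move_to_end(url)

    def thumbnails(self):
        """Return the URLs as laid out in the overview, leftmost first."""
        return list(reversed(self._pages))


overview = PagesOverview()
for url in ["gnome.org", "igalia.com", "planet.gnome.org"]:
    overview.visit(url)

# Going back to a page that is already open moves it to the leftmost slot.
overview.visit("gnome.org")
print(overview.thumbnails())
```

With this ordering, the page that the user has just left is always the first thumbnail they see when they go back to "Pages".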

“Getting the design right” refers to the things that concern us when refining a design: usability, accessibility, visual appearance, performance, etc.

However, what exactly is “the right design”?

Here are photos of three objects to help us start thinking about this.

“Cars in Cuba - 57”, by patrick_nouhailler

The first one is an old car in Cuba, carefully kept and maintained. Terribly attractive at a first look. After this first impression, questions begin to come up. Is it easy to handle? Expensive to maintain? Ecological?

“My Armory: Chef's, Boning, Utility Knives”, by osakajon

The second photo shows a set of Japanese knives. Are they beautiful? Kind of, as far as knives go. They feel great in the hand, though, and are sharp and sturdy. And what do they tell about their owner? What would it say about yourself to own a set like this?

“La petite tour”, by esm723

The third photo is a small figurine of the Eiffel Tower. It is not particularly attractive. It is absolutely, perfectly useless. And yet, it has value for its owner. But that value does not come from its looks or its usage. The value of this object is not in the object itself: it comes from how it makes its owner think (in this case, of a past trip to Paris).

There have been different attempts at defining the characteristics that make things attractive and valuable for us. In his book “Emotional Design”, Donald Norman details three levels of processing:

Visceral: related to the sensory properties of a product

Behavioural: how it feels when used

Reflective: how it makes us think differently about ourselves, and how it changes the perception of others

A more recent approach has been carried out by Karen Holtzblatt (InContext). She and her team identified a set of characteristics that make a product “cool”:

Allows you to accomplish your intent anywhere, on your time

Goes direct into action without hassle

Provides connection with the people that matter to you

Helps you build your own identity

Has nice aesthetics and sensation

Laseau's funnel

Laseau’s funnel describes design as a combination of ideation and reduction. There are many different solutions to explore at the beginning of a project, and we need to be able to go through all those ideas quickly, evaluating and rejecting them. It means asking question upon question in order to start getting the right answers.

Requirements-driven development is not enough to provide the characteristics that people really value. A much better approach is to be able to quickly try out design ideas, using those sketches to help us elicit the real user requirements and refine the designs.

These sketches need to be fast and cheap, so that they can be plentiful. And because they are plentiful and cheap and fast, they can be disposed of, to leave room for more and better ideas.

As it is often said, design is about saying “no”: design is a negative craft.

Besides its benefits for ideation, this approach also helps improve the communication of design ideas among designers, and between them and other stakeholders (i.e. the GNOME community at large, in our case).

The Wizard of Oz

One (surprisingly) influential film for the field of UX design has been “The Wizard of Oz”. You are probably familiar with the story: Dorothy’s house is carried away by a tornado, all of a sudden she is not in Kansas any more, she meets a cast of wacky characters and runs into several adventures until she and her friends finally meet the famous Wizard of Oz.

Large, loud and surrounded by fire, the Wizard is a terrible and dreadful sight. Dorothy and her friends are appropriately afraid and have no choice but to do as he commands. Until the little dog Toto gets so scared that he runs and pulls at a curtain: behind it there is a small old man, pulling levers.

So the terrible Wizard was actually just an old man using smoke and mirrors. But what’s interesting is that, up until the point of this revelation, Dorothy’s and her friends’ reactions were governed by the belief that he was real. The Wizard was fake, but their experiences of him weren’t.

The Wizard of Oz is actually the name of a specific technique for sketching interactions where part of the behaviour of a computer system that would be too expensive to build is replaced by a hidden human operator. It has been successfully used to, for instance, simulate the experience of using a perfectly accurate speech-to-text system way before those systems were as common as they are beginning to get today.

Palm Pilot wooden prototype (1995)

The talk continued with descriptions of a number of examples and different techniques to quickly sketch the behaviour and experience of using a product way before that product is a reality. Several tangible prototypes were discussed, from simple paper-based sketches to full living rooms.

Living room at CWI Amsterdam, used to conduct experiments on different TV and teleconferencing technologies.

Some of the examples relied on storytelling to engage the audience’s empathy and help them experience the proposed interaction through the protagonists of the story. The cases discussed ranged from written stories to the acting out of interaction flows and even professional theatre performances.

Theatre used to elicit requirements from elderly users, University of Dundee

Communication of design decisions is a problem in GNOME. We are a project composed of small teams working in different and remote organisations, but with a lot of potential stakeholders (a whole community of them!). It is important to develop and practice techniques that would allow us to refine and communicate ideas more fluently.

To end with a bit of advice: just turn your computer off from time to time, grab some pen & paper, and try things out 🙂

My blog has just been added to Planet GNOME, so let this post be my way to say that I am very happy to join the fine people there (including several colleagues at Igalia).

This week I will be attending the CHI2012 conference in Austin, TX. This is the main conference on human-computer interaction and UX design. I will write a long recap once it is over, with special attention to those bits that may be interesting to GNOME and other work that we are carrying out at Igalia, and also to things that are just too wonderful, eye-catching or plain weird to miss.

As a design exercise, I started thinking some time ago about how we could port Butaca to a tablet form factor. As usual, this was the tool selected to help my thinking and get started in the design:

My basic idea was to keep the current presentation in pages, replacing the Back button with a horizontal navigation that would let you move back and forth along your history. The pages that you have seen are on your left; the “forward” ones are to the right, along with shortcuts to launch each of the main sections of the application.
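A minimal Python sketch of this horizontal history, for illustration only. The `NavigationHistory` class and the section names in `SECTION_SHORTCUTS` are hypothetical, and the sketch assumes (as browsers usually do) that opening a new page discards the "forward" pages.

```python
# Hypothetical names for the main sections of the application.
SECTION_SHORTCUTS = ["Search", "Favourites", "Showtimes"]


class NavigationHistory:
    """Pages already seen sit to the left; 'forward' pages sit to the right."""

    def __init__(self, start_page):
        self.pages = [start_page]
        self.position = 0

    def open(self, page):
        # Assumption: opening a new page drops everything to the right.
        self.pages = self.pages[: self.position + 1] + [page]
        self.position += 1

    def back(self):
        self.position = max(0, self.position - 1)

    def forward(self):
        self.position = min(len(self.pages) - 1, self.position + 1)

    def left(self):
        return self.pages[: self.position]

    def current(self):
        return self.pages[self.position]

    def right(self):
        # Forward pages come first, followed by the section shortcuts.
        return self.pages[self.position + 1 :] + SECTION_SHORTCUTS


history = NavigationHistory("Home")
history.open("Search results")
history.open("Movie details")
history.back()
print(history.left(), history.current(), history.right())
```

Swiping left or right on the tablet would then map to `back()` and `forward()`, while the shortcuts at the far right would call `open()`.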

To illustrate this, I drew a few mockups with Inkscape. This is one of them:

This helped in visualizing the solution, but important questions had not been quite answered yet. Would this idea really work on a tablet? Would it look nice? Does it really make sense to structure navigation history in this way? I might have tried to build a functional prototype, but prototypes are used to confirm decisions rather than raise questions. It is important to avoid committing too soon to a particular design, before the space of possible solutions has been fully explored. What was needed at this stage was something else, something tentative and exploratory. What was needed was a sketch.

Sketches are cheap, disposable, quick. So, I took a bunch of screenshots of the N9 application and joined them together with just ~160 lines of QML, in a sort of interactive collage. This method would work just as well with nice ad-hoc mockups, but those take more time and the point was to keep it quick. Igalia had provided me with a WeTab running GNOME 3, so I just had to load my little application on it and take the following video:

The take-home lesson here is that trying out ideas is important. There are techniques that assist us in creating interactive sketches quickly, which can be very useful when we need to explore ideas and build on them to generate new ones. Furthermore, a sketch can be a great communication tool, as I hope to have shown with the preceding video.

I have just read Bret Victor’s excellent post on the future of interaction. He accurately describes the current mainstream interaction paradigms (and even many futuristic visions!) as pictures under glass and claims that they don’t really take advantage of most of our natural capabilities, chief among them the huge sensitivity and precision of human hands. He mentions some lines of research that might help pave the way for future interactive technologies that are more aligned with our natural capabilities. I am just going to add pointers to a few of them and a couple of comments.

Another aspect is proprioception, the sense that informs us of the positions of our limbs and body; for instance, our use of a computer mouse depends on our ability to know the position of our hand and arm in space. Of great interest is the concept of extended physiological proprioception (original paper from the 70’s): the information obtained via a tool (e.g. the point of a pencil, prosthetic limbs in the original research, etc.) is actually perceived in a very similar way as if it were part of the body. The current thinking is that this capability might have evolved as our species began to use tools, many thousands of years ago.

For instance, this means that we can grab a pen, and use it to touch and successfully identify the characteristics of physical objects. You can even try it now. Haptic pens, such as those created by SensAble, exploit this idea by providing a device that can be moved in three dimensions, with small motors that simulate the feedback one would get if they were using the pen to touch and interact with a physical object. They have been a very expensive technology for some time, reserved for fields such as advanced engineering and medicine; as the price of this technology goes down over time, we might begin to see more and more applications.

Manipulative and haptic interfaces seem a good fit for dealing with computerised representations of physical objects. However, much of the work that we do on computers has to do with manipulating abstract information and there might not be an immediate way to translate and represent it in a physical interface. Different metaphors will need to be developed and evaluated.

This is linked to multimodal interaction, a branch of HCI research that tries to provide information to different human senses and gather inputs coming from varied sources. The expansion of mobile devices requires us to provide forms of interaction that are more flexible for when keyboard and screen are not readily available. Generally speaking, this is also a more human approach, in that it has the potential to make better use of our innate abilities, and also more humane, as not all of us have the same senses available or with the same precision.

Following up on my last post, I want to share a few ideas that could improve the use of the Web from GNOME. Many of these come from other people and I am trying to combine them into one coherent package.

The first goal would be to offer better support for common Web browsing patterns, revisitation and exploration. Specifically, this means supporting web applications, a more convenient and agile presentation for favourites, better history and bookmarks management, better tab management within the browser window for pages that are related to the same tasks, and better tab management from the Shell to help users align the different sets of tabs with their current activities and interests.

The second goal would be to do so in a way that is not cumbersome and complex, but light and consistent.

Revisitation: Home and History

As noted in my previous post, there are different kinds of Web revisitation; one of them comprises sites that we visit often because they lead to new information, which is not exactly the same as storing a link to a page because of the information that it contains at the moment of reading (e.g. an article). In a manner similar to what Firefox does, I propose to have a Home tab as the starting point for browsing. This tab could include a search field, links to recent pages and groups of pages, favourites and Reading List. Being able to define a page as “favourite” and “pin” it to the Home page would ease mid- and long-term revisitation, which makes up a large percentage of our activity on the Web.

The Home tab would be a way to get to new content, but what about returning to sites that were visited some time ago? Next to the Home tab, we could place a History&Bookmarks tab that offered a rich search interface to retrieve pages that have already been seen.

Tabs on top, with Home and History on the top left.

Fine-grain tab management

Modern browsers are placing their tabs on top with good reason. The main advantage is that this helps establish a visual hierarchy inside the browser window that reinforces the proper mental model, so that controls that operate on the same scope are grouped together. To decide which controls should be given priority in the interface, we could use usage data from Firefox as a guide, always keeping in mind that we cannot assume that everybody will know how to use all the available shortcuts (e.g. a similar study found that over 80% of users never used Ctrl+F to search). Browser-level functionality (New Window, Preferences, Quit…) could be moved to the application menu.

Tabs provide a number of benefits that make them a convenient way to organise your Web browsing. However, one of their problems is that as their number grows, it can become difficult to go back to a certain tab; a way to improve this situation could be to show a thumbnail of the tab’s content on mouse-over, allowing for a quick scan of open tabs without having to open them one by one.

Tab thumbnail on mouse-over.

There is a difference between following a link and opening it in a new tab. In the first case, the original page has disappeared from the UI and has to be kept in the user’s memory, to be accessed again via the Back button; in the second, it is still visible and readily accessible. These two different actions can allow the user to create a curated version of their trail through the Web, one that does not contain all the pages that they have visited but just those that have been deemed important. These tab trees are an important feature but tab-focused interfaces (e.g. tree-style tabs, other ideas) might be far too complex. A compromise could be to include a visual hint at the existence of different tab groups, but without making it the main point of the interface.

Without text, can you tell which of these tabs are related?

Coarse-grain tab management

Tabs are a good way of structuring your browsing when their number is low enough (research shows that a usual number of open tabs is around 6). When their number grows, you can have trouble because there are simply too many unrelated tabs in one window. So we have a problem with the organisation of a lot of content that is related to different activities: well, the GNOME Shell is a solution for that. I propose to allow high-level management of Web tabs directly from the Shell Overview (not too different from Panorama with a bit of this), providing an overview of the open tabs and supporting their movement between different browser windows and workspaces.

Epiphany window in the Shell overview, displaying the open tabs.

Wrap-up

I have tried to describe a situation where Web browsing is more tightly integrated with the desktop. There is still a lot of work to do: detailed functionality needs to be refined, assumptions need to be verified, mockups and prototypes need to be created and evaluated…

A browser is a very complex application to design, but luckily there is a lot of knowledge already available that should help us generate ideas and make informed decisions.

Earlier this week I began to look at some of the many available works in the field of Web browsers for the desktop, with the goal of improving the design of the Epiphany browser and taking advantage of the fact that Igalia is one of the main maintainers of WebKitGTK+. The first task, of course, is to correctly understand the problem: in a field as big and complex as this, this means a lot of reading and synthesising. Today I will explore two particular aspects: revisitation and tabbed browsing. In the future I will expand on this and begin to share some design ideas.

Revisitation

Revisitation means accessing web sites that have already been seen. Although there are discrepancies on how to measure it, for the sake of design we can say that we have already seen roughly half of the pages that we visit. The article by Obendorf et al. mentions three kinds of revisitation:

short-term revisits (within the hour): these are the most common, often performed by following links, or using the Back button;

mid-term revisits (within the day): the most usual way is to use bookmarks or write the URL (often helped by autocomplete);

long-term revisits: this is related to the rediscovery of information that has already been seen; people re-access these pages mainly through links because they need to re-search (enter the same search terms) and/or re-trace (follow the same steps); history and bookmarks are also employed to some extent, but the current interfaces might not be easy or convenient to use.

Previously-unseen pages are usually visited by directly entering a URL or by following links from search pages (e.g. Google) or other information hubs (e.g. reddit, news sites).

A wider study was carried out by Adar, Teevan and Dumais. Their findings are consistent with those above, as they found that Web page revisitations could be clustered in the same three groups plus another one, which they called hybrid and which contained sites that were popular but infrequently used. They went further in trying to analyse the kind of web sites that typically fell into each group. The fast revisitation pattern often corresponded to “hub&spoke” behaviour, where users move back and forth between a set of promising results and each individual item. The mid-term one tended to refer to pages that act as starting points where the user can carry out a task (e.g. communication, banking) or access new information (e.g. news, forums). The infrequently-accessed group comprised pages that provide specialised search (e.g. travel) or related to weekend activities; as in the previous paper, the researchers also note that external search engines are often used for revisitation. There was a fourth, hybrid group of pages that caused “hub&spoke” movement but that were infrequently accessed, such as craigslist, eBay, shopping, games, etc.

With these results, the researchers mention a number of implications for design. The most interesting for me is that “there may be value in providing awareness of, and grouping by, a broader range of revisitation patterns. For example, users may want to quickly sort previously visited pages into groups corresponding to a working stack (recently accessed fast pages), a frequent stack (medium and hybrid pages), and a searchable stack (slow pages).”
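To make the idea of stacks more concrete, here is a rough Python sketch that sorts a page into one of them from the time gaps between its visits. The thresholds are an assumption on my side: they borrow the "within the hour" and "within the day" limits from the three kinds of revisitation listed earlier, which is not necessarily how the researchers defined their clusters. The `revisit_stack` function is hypothetical, and it makes no attempt at detecting hybrid pages.

```python
from datetime import timedelta
from statistics import median


def revisit_stack(intervals):
    """Pick a stack for one page from the gaps between its visits."""
    # Use the median gap, in seconds, as the "typical" revisit interval.
    typical = median(gap.total_seconds() for gap in intervals)
    if typical <= timedelta(hours=1).total_seconds():
        return "working"      # fast pages, revisited within the hour
    if typical <= timedelta(days=1).total_seconds():
        return "frequent"     # medium pages, revisited within the day
    return "searchable"       # slow pages, revisited rarely


search_results = [timedelta(minutes=2), timedelta(minutes=5), timedelta(minutes=20)]
news_site = [timedelta(hours=3), timedelta(hours=8), timedelta(hours=20)]
travel_search = [timedelta(days=30), timedelta(days=90)]

for gaps in (search_results, news_site, travel_search):
    print(revisit_stack(gaps))
```

A real browser would need more than one number per page (for instance, how bursty the visits are), but even a crude grouping like this one could be enough to try the idea out in a sketch.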

Research on tabs

In recent years the usage of tabs has made the Back button less prevalent. For instance, a common behaviour is to perform a search and then open different results in their own tabs, attempting to find the desired information through exploration of the candidates without losing track of the result set for further refinement. This often causes problems because the Back button does not work as expected (local history only applies to the current tab) and it might be complicated to find the originating document in the case of large tab trees. Problems with the Back button also arise when entering information through web forms and when using web applications.

A study of tab usage on Firefox showed that tabs are mostly used for immediate revisitation and task-switching. They serve as reminders or short-term bookmarks, they allow users to open links in the background and are a convenient way to keep frequently-accessed pages open. Visually, they are cleaner, less cluttered and easier to access than separate browser windows.

Many participants used tabs for revisitation more often than the back button, up to the point where, for frequent tab users, tab switching was the second most frequent thing they did in the browser (after following links). The reasons reported were that tabs were more efficient, more convenient and more predictable (you can see the target right away). Another factor that helps to ease multitasking is that tabs leave the page in the same state, which is not always true with the back button.

The study found marked differences between regular and power tab users. The median number of open tabs was reported at around 6, but the maximum number of open tabs at one point in time could get much higher than that, past 20 and beyond for some users. As the participants were using regular Firefox, it could be that for some a limiting factor to the number of open tabs might simply be lack of space.

A bit of personal experience

As a user of the Tree-style Tabs extension for Firefox, I often find myself creating long trees of tabs where the tree itself marks a trail that is coherent and useful. I do not use the Back button often, and I think that the reason might be that opening a new branch in the tree somehow makes that part of the trail useful, clear and important, whereas pages that can only be accessed by going back soon fade out of memory. For a given task, it might well be that there is value in having a clear structure of related sites: the combination of tree-style tabs and the Back button helps create and navigate this structure.

Opening a link in a new tab actually marks the previous one as interesting and worth keeping around for a while, whereas closing a tab or following a link signals that the previous page was not that interesting after all (and it will fade from memory soon). This way of looking for information is probably related to orienteering, an information seeking strategy in which users take small steps towards their target using partial information and contextual knowledge as a guide. Making said set of steps visible and semi-permanent also acts as a very convenient reminder: my tab structure is kept between sessions, which makes it very easy to resume work or reading (for instance, there is a small subtree hanging from my RSS reader tab with articles that I will read later, and another one hanging from Bugzilla with bugs that I am working on).
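The difference between the two actions can be written down as a small tree structure. The following Python sketch is only an illustration of the idea, not how the Tree-style Tabs extension is implemented; the `Tab` class and the `outline` function are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Tab:
    title: str
    children: list = field(default_factory=list)
    back_stack: list = field(default_factory=list)

    def open_in_new_tab(self, title):
        """The current page stays visible; the new one hangs from it."""
        child = Tab(title)
        self.children.append(child)
        return child

    def follow_link(self, title):
        """The current page leaves the UI and is only reachable via Back."""
        self.back_stack.append(self.title)
        self.title = title
        return self


def outline(tab, depth=0):
    """Return the visible trail as indented lines, one per open tab."""
    lines = ["  " * depth + tab.title]
    for child in tab.children:
        lines.extend(outline(child, depth + 1))
    return lines


rss = Tab("RSS reader")
article = rss.open_in_new_tab("Article A")
rss.open_in_new_tab("Article B")
article.follow_link("Article A, page 2")
print("\n".join(outline(rss)))
```

Only the pages that were opened in a new tab show up in the outline: "Article A" itself has moved to the `back_stack` of its tab, which matches the feeling that such pages soon fade out of memory.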

Longer-term revisits

Regarding mid- and long-term revisits, I propose to contemplate three kinds of sites: web applications, information hubs and the personal archive. Web applications are self-contained and focused mainly on one task; their goal is in most cases to replace local applications for e.g. email, calendar, project planning, music, etc.

It might make sense to separate frequently visited websites that periodically provide new content from concrete and interesting information items; you can think of it as the difference between reading the newspaper every day and cutting out a news item that mentions your amateur football team. Information hubs are pages that are visited often because they lead to the discovery of new information, either on the same site (e.g. guardian.co.uk) or on others (e.g. reddit.com). On the other hand, the personal archive is a collection of information items that are relevant for the user because of the information that they already contain. There are many motivations behind the construction of personal archives: not just simply storing things for later retrieval, but also creating a legacy, making it easier to share resources, reducing fear of loss, self-expression and self-identity.

A bit of reading on music recommendations

Posted in Design, Igalia, Planet Igalia on March 29th, 2011 by femorandeira

Today I have been reading about the work done by the Distributed Computing group at the Zurich Institute of Technology. These guys are developing complex techniques for combining “audio social information” (tags and listening habits, taken from Audioscrobbler) to discover relationships, thus generating a social audio space where songs/artists are placed and that the user can navigate. An opposite (or maybe complementary?) approach to this one would be the use of audio features to establish relationships between songs/artists, but this might ignore some of the semantic and social information that is so relevant to our understanding of music. They have developed a set of prototypes for Android (with video) and Amarok. The science looks interesting and this kind of approach could well be the next step in the evolution of music players now that the average collection is often too big to be dealt with using only long text lists.

Today I want to think about a possible mobile application to fulfil the goal of getting “a varied stream of my music that suits a general theme and that requires little interaction from my side”. For the particular case of mobile music:

Music can be in the background, allowing the user’s attention to be directed elsewhere.

Portable music devices are often pocketed away. Direct interaction and user input are therefore sparse.

Mobile music has usually only one listener.

Fine management of the specific songs being played and their sequence might not be required.

These are all assumptions, but I will go with them for now. Bear in mind that this is just a bit of exploratory design and so far hasn’t been validated.

The main idea is that the user should be able to just select the musical “theme” and get music along it. Since both tags and genres have a subjective component, I thought that using music artists as the “seed” for the radio stream might work well. If this sounds similar to Last.fm‘s radios, yes, it is similar.

Last.fm uses streamed content and relies on crowdsourcing to generate the relationships between artists, tags, songs, etc… One of the constraints that I had in mind when I thought about this application was that the music should come from the user’s personal collection and not depend on having an Internet connection. More importantly, Last.fm’s radios offer a wide range of choices and therefore require complex UIs; one of the questions that I am wondering here is if one could strike a compromise between providing music that satisfies the user’s needs and doing so with an interface that is simple, quick and pleasant to use.

This is a quick sketch. In this hypothetical app, the user is presented with a list of the artists and bands in their collection.

An image of the band could be used to make the application more visually attractive and easier to recognize. It is worth noting that the list layout shown in the image is not necessarily the best one for this particular case. The proper thing to do if we were serious about developing this would be to prototype, test and evaluate different possible list layouts in a systematic way: one column, two columns, images, no images… This might make a good starting point for a future post about testing.

It is worth repeating that this is supposed to work on the user’s music collection on a mobile device, which on average [citation needed] would be limited to dozens/hundreds of artists, most of which the user can recall or at least recognize [citation needed]. A different problem altogether would be to explore a vast collection that has been compiled by somebody else, as would be the case with e.g. Jamendo.

When the user selects one of the items, the application begins to play a stream of music that is related to the artist selected. The definition and calculation of similarity in music is a complex engineering topic that I will cover in future posts. Determining the “suitability” of the algorithm selected would not be trivial since what we are looking for is to provide the selection of music that best fits subjective desires and expectations, which change from person to person.
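Just to have something to play with in a sketch, a very naive stand-in for the similarity calculation could compare the tags attached to each artist in the local collection. To be clear, this is an assumption for illustration and not the algorithm that the application would end up using: the Jaccard index, the `related_artists` function, the `minimum` threshold and the artists and tags below are all made up.

```python
def jaccard(a, b):
    """Share of tags that two artists have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_artists(seed, tags_by_artist, minimum=0.25):
    """Rank the other artists in the collection by tag overlap with the seed."""
    seed_tags = tags_by_artist[seed]
    scored = [
        (jaccard(seed_tags, tags), artist)
        for artist, tags in tags_by_artist.items()
        if artist != seed
    ]
    # Highest score first; artists below the threshold are left out.
    return [artist for score, artist in sorted(scored, reverse=True) if score >= minimum]


# Hypothetical collection: artist names and tags are placeholders.
collection = {
    "Artist A": {"rock", "indie"},
    "Artist B": {"rock", "indie", "pop"},
    "Artist C": {"jazz"},
    "Artist D": {"rock", "metal"},
}

print(related_artists("Artist A", collection))
```

The stream would then mix songs by the seed artist with songs by the artists returned, which works offline as long as the tags are stored together with the collection.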

The “now playing” screen is intentionally simple: you can pause the current song, skip to the next one or go back to the list in order to select a new band. I am not even fully convinced that the “settings” item makes sense there, but it could act as a place to keep functionality that does not need to be always present (e.g. play only music from the selected band and not from related artists).

That’s all for today. Stay tuned for more stuff coming in the next days!