Saturday, September 26, 2009

Augmented Reality, Tangible UI, and Activity Theory

Yesterday in the Computers and Cognition psychology class I'm taking this semester, I led the discussion on a paper called Physical and Virtual Tools: Activity Theory Applied to the Design of Groupware. It focuses on the design of a tangible, collaborative augmented reality project that allowed users to plan layouts of, say, building interiors. The paper's main connection to Activity Theory was that virtual tools have typically been viewed as an internalization of activity, but could perhaps be considered an externalization once people become more used to them.

In the context of this work, activity may be thought of as a subject's interaction with his or her surroundings. Human thought and behaviour in this interaction are mediated by artefacts. When the interaction is internalized, it turns into a mental activity. But when it is externalized, thoughts and memory are represented in or by the physical environment.

When laying out the design requirements for the project, it was observed that the way physical and virtual tools were used seemed to follow the three-step development of tool usage laid out by Victor Kaptelinin, as follows:

When users are inexperienced, completing tasks with tools is no more efficient than completing them without.

When physical tools are introduced, the ability to complete tasks improves as the process is externalized.

The introduction of virtual tools can occasionally replace the use of physical tools as the process is internalized again.

This progression motivated the authors to experiment with both physical and virtual tools.

The end result was a system called BUILD-IT. (Video demo available.) It uses tangible bricks to manipulate an overhead plan view of the layout being worked on, while a 3D view is projected on the wall beside it. The thinking was that by using physical objects to interact with the system, there would be a closer connection between the action and the mental reflection. This kind of system could support a wider range of human expression than a standard mouse-based program could.

In terms of activity theory, the authors hoped that their work would stimulate its theoretical development. In particular, they thought the theory could expand on the idea of objectification, arguing that the degree of externalization really depends on the user's familiarity with virtual tools. Can virtual tools ever truly be an externalization, or are they destined to remain a part of a disconnected outer world, making it more difficult for users to understand the interface at hand?

During and after the class discussion, I had a couple of questions pop into my mind.

First, I find it interesting that virtual tools are considered an internalization in the first place. While I am by no means an expert on Activity Theory, we had a chance to talk more about what externalizations and internalizations were in relation to an earlier theoretical paper. My understanding is that it is an operational kind of thing: an externalization is simply the ability to take a physical tool and use it to help with the activity in question. No longer using the tool essentially brings interactions within yourself again as a mental activity. This is the internalization. You can go back and forth between the two, as was suggested in the three steps mentioned earlier.

In the case of virtual tools, you are no longer using something physical, though it is often some kind of representation of a real thing. However, whether it exists in the real world or not, I fail to see how it was ever not an externalization. It's still something that exists outside your head (even if it's just pixels on a screen) and that you use to help you with a task. Is this simply a sign that I am in fact used to virtual tools enough that they have become an externalization? Would most people feel the same way in this digital age?

A second question came up that was rather interesting: what exactly makes a tangible user interface (or augmented reality) different from traditional mouse-based systems?

The answer that I was trying to give, but couldn't quite articulate at the time, was that it reduces the number of indirections standing between you and what you want to do. When using a mouse, you have to think about how the way you move your hand will affect what appears on the screen. But when you can gesture directly on top of a display that changes as you move, you remove that level of indirection. Tangible UIs don't always do this (think of the separate 3D view in BUILD-IT, for example), nor do augmented reality systems. But when they do, I figure the smaller amount of cognitive processing required makes them that much easier to use.

The other answer that caught my attention was that humans apparently have very good musculoskeletal memory. That means that when they move, say, their entire neck, they will remember the motion required to complete a task better than they would if they only had to move a hand or finger. I always assumed that these kinds of user interfaces just made more sense because they had natural mappings from the motions people make to the results in the software, but this memory explanation really makes a lot of sense to me as well. Perhaps it's a combination of the two.

I found this paper really useful in thinking about what makes augmented reality and tangible UIs so useful. I keep talking about wanting to make an educational AR game for my PhD research, but I almost always imagine having a tangible component as well. I think an interesting component of the research could be determining how well children can work with virtual tools, and whether it's an automatic externalization for them, since they are growing up with this sort of technology.