CSE 171: User Interface Design: Social and Technical Issues

6. Some Further Examples and Theory of Semiotic Morphisms

This chapter of the class notes first explores direct manipulation, and in
particular, its relationship to semiotic morphisms; it then gives some notes
on chapter 5 of the text, explaining how this material could have been
enriched using notions of preservation for semiotic morphisms, and it
concludes with some additional remarks, mainly of a mathematical character,
on semiotic morphisms, supplementing what is in the assigned readings.

6.1 Direct Manipulation

Ben Shneiderman is known for his sustained and enthusiastic advocacy of
direct manipulation, although he was not the originator of the idea, which he
attributes to Ted Nelson. Shneiderman says that direct
manipulation is characterized by the following features: (1) analogical
representation; (2) incremental operation; (3) reversibility; (4) physical
action instead of syntax; (5) immediate visibility of results; and (6)
graphic form.

I would especially emphasize points (1) and (4). About the limitations to
visibility in (5) and to graphics in (6), it seems to me that representations
can involve other senses than just sight. That point (1) for direct
manipulation is an analogy or metaphor is very relevant for us, because it
says that direct manipulation involves a semiotic morphism. The physical
nature of this metaphor (in (4)) makes it seem more direct and concrete, and
thus easier for users to grasp and to apply. Leibniz, who was no doubt
thinking of mathematical notation, makes a similar point when he says:

In signs, one sees an advantage for discovery that is greatest
when they express the exact nature of a thing briefly and, as it were, picture
it; then, indeed, the labor of thought is wonderfully diminished.

A good example of this phenomenon is the difference between doing proofs in
plane geometry with diagrams and doing them with axioms; in fact, the
constructions of traditional Euclidean plane geometry rely on a kind of
direct manipulation interface. Insight and creativity are enhanced by using
a more direct and physical notation, due to the greater sense of involvement
and connection that it produces. This in turn is due to the closer
association with one's already existing sensory-motor schemata, which is
closely related to important themes in contemporary cognitive linguistics on
the nature of metaphor, where it is said that the most basic metaphors are
image schemas that are grounded in human embodiment.

Two principles can help deepen our understanding here. The
Principle of Transparency is an important criterion for success: an
interface is good if the user does not notice it but instead only notices the
task at hand; so designers are most successful when users never think about
them or their work! The Principle of Virtuality is Ted Nelson's
brief original formulation of direct manipulation, as a representation of
reality that can be manipulated.

Shneiderman's campaign on behalf of direct manipulation has been so
successful that today, one is perhaps more likely to see it misapplied than
to see it not applied when it should have been. Here are my paraphrasings of
Shneiderman's useful list of potential limitations of direct manipulation
(page 204 of his text):

Spatial or visual representations are not necessarily better than text,
because they may be too spread out, requiring tedious user scrolling over
large displays. For example, flow charts may be useful for algorithms or
small programs, but rapidly become less useful as program size increases.

Users must learn the meaning of components of a visual representation,
and a graphic icon may require much more learning time than a word or
phrase. Many examples can be found in Microsoft products, such as Word.

Users can easily over- or underestimate the functions associated with
some graphical analogy, even if the image itself seems clear.

For users who are experienced at typing, moving their hand off the
keyboard to the mouse and back can consume a great deal more time than simply
typing the relevant command. For example, the keyboard of a real calculator
is much more efficient than any graphical representation that requires use of
a mouse.

The tatami project in my lab ran into
some of these problems with a direct manipulation interface that we built for
proofs; it turned out that displaying the proof tree was useless for large
proofs, because of the size and homogeneity of the display. Instead, we
broke proofs into "mind sized" pieces, each having its own webpage.

Classical semiotics also provides insight into the success of direct
manipulation, by seeing it as an indexicality of motion, often reinforced by
a specific kind of iconicity, called diagrammatic iconicity by Peirce,
where the geometric structure of the sign corresponds to the structure of its
object (geographic information systems are particularly clear examples, since
the structure of a geographic map corresponds to the structure of some part
of the surface of the earth). Slider controls are a simple semiotic morphism
having linear traversal as their source domain, and they could probably be
applied even more widely than they have been. Scrollbars on windows are
perhaps the most familiar special case; they both display and control what
portion of a (possibly very long) "scroll" is actually displayed.

Unfortunately, Shneiderman often confuses the essentially semiotic nature
of direct manipulation with the technologies (or in more semiotic terms, the
"media") that are used to implement it. Our semiotic conception of direct
manipulation allows us to avoid this error, by clearly distinguishing between
what functionality is preserved, and how it is represented. For example, it
is perfectly possible to have a virtual reality interface to plain old 1978
DOS, complete with a haptic clicking keyboard and a virtual ancient VT100
screen with bright glowing green characters floating in space before you!
Despite the fancy technology, this is still just command line DOS. This
confusion is really just one aspect of a larger confusion, between the device
that supports an interface, and the design and software that make the device
actually function as an interface. Journalists often focus on the physical
device (the "box") without giving much thought to the design of the
interfaces of the applications that it supports. This is no doubt due in
part to their receiving press releases from manufacturers and pressure from
the advertising department, but it also reflects a bias in our culture.

Design errors often appear as violations of the underlying metaphor of a
direct manipulation interface, or more generally, for any interface,
violations of consistency of its semiotic morphism. One infamous example is
the Apple Macintosh use of the trashcan for ejecting a floppy disk; it has
confused generations of users, and it violates the trashcan metaphor in that
the floppy is not trash. A more complex example is the use of lemmas in
proofs, which leads to violations of a tree metaphor, but can be patched by
using hypertext links (as in Kumo).
Another example is the little arrows at the top and bottom (or left and
right) of many scrollbars, because the physical motion metaphor does not
suggest that these should be "hot." In fact, a scrollbar with this
capability is actually a blend of two metaphors, and hence is a bit
more difficult to learn; the second metaphor is similar to the "up" and
"down" buttons on elevators.

It is interesting to note that both the term "virtual reality" and the
data glove were invented by Jaron Lanier. Augmented reality has important
industrial applications (e.g., at Boeing) and no doubt will have more.
Situatedly aware shopping carts do not appeal to me, and indeed,
raise significant ethical issues. Again, the main point here is that direct
manipulation is a form of semiotic morphism, and use of algebraic semiotic
ideas can clarify some of the issues surrounding direct manipulation.

6.2 Notes on Chapter 5 of McCracken & Wolfe

The slogan "Content organization drives visual organization" on page 82
can be seen as a corollary of our principle that good designs are semiotic
morphisms, from a content space to a display space, that optimize some
preservation properties. The four principles summarized on page 83 are good,
but it seems to me that something very essential has been left out, namely
that ordering relations in the "content organization" should be reflected (or
"preserved", in our language of morphisms) by the display. It is easy to see
that the (good) examples do exactly this; e.g., the six groups of links in
Figure 5-2 are arranged by the size of the card-sorted groups, and the items
within each group are even (implicitly) ordered by importance;
this is not the case for Figure 5-1. The course lecture on this topic
contained much more information than this paragraph, showing how levels and
priority in the source semiotic space can explain and improve the four
principles in the text, making them both more general and more precise;
you should have been there!

Another comment is that the Alignment principle is less important than the
others; it is just one way to achieve consistency, which can also be
achieved, for example, with color or with size; although alignment is very
basic to the way that most browsers display many HTML commands, it is not
necessarily basic for more creative graphical layouts, or even for all the
natural ways to present HTML, for example, by speech generation (for blind
users).

First, recall that the composition (f;g) of semiotic morphisms f:
T -> T' and g: T' -> T'' is defined, for x an
element of T, by (f;g)(x) = g(f(x)). This means first apply f to x,
and then apply g to the result; the semicolon notation is borrowed from
programming languages, where it again indicates first do one statement, then
the next.
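
The order of application in f;g can be illustrated with a minimal sketch on ordinary functions. This is only an illustration: a genuine semiotic morphism also maps sorts, constructors and so on, and the helper names compose, to_pair and to_text are hypothetical, chosen just for this example.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    """Diagrammatic composition f;g: first apply f, then apply g."""
    return lambda x: g(f(x))


# Hypothetical maps, used only to show the order of application.
def to_pair(n: int) -> Tuple[int, int]:
    """Elapsed minutes -> (hour, minute)."""
    return (n // 60, n % 60)


def to_text(t: Tuple[int, int]) -> str:
    """(hour, minute) -> a string of the form 'hour:minute'."""
    return f"{t[0]}:{t[1]}"


minutes_to_text = compose(to_pair, to_text)  # to_pair ; to_text
print(minutes_to_text(125))  # 125 = 2*60 + 5, so this prints 2:5
```

Note that compose(f, g) takes its arguments in the same left-to-right order as f;g, which is the reverse of the usual mathematical notation for g after f.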

Definition: A binary relation > on a set P is a partial
ordering if it is transitive (i.e., a > b and b > c imply a > c,
for all a,b,c in P), and is anti-reflexive (i.e., a > a does not hold
for any element a in P). A partial ordering > is a total ordering if
for all a,b in P, either a = b or a > b or b > a.
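
For finite sets, the two definitions above can be checked by brute force. The following is a small sketch under that assumption; the function names are hypothetical, and the relation is passed in as a two-argument predicate gt.

```python
from itertools import product


def is_partial_ordering(elements, gt):
    """Check transitivity and anti-reflexivity of gt on a finite collection."""
    elems = list(elements)
    anti_reflexive = not any(gt(a, a) for a in elems)
    transitive = all(
        gt(a, c)
        for a, b, c in product(elems, repeat=3)
        if gt(a, b) and gt(b, c)
    )
    return anti_reflexive and transitive


def is_total_ordering(elements, gt):
    """A partial ordering in which any two elements are equal or comparable."""
    elems = list(elements)
    return is_partial_ordering(elems, gt) and all(
        a == b or gt(a, b) or gt(b, a) for a, b in product(elems, repeat=2)
    )


numbers = range(5)
subsets = [frozenset(s) for s in ([], [1], [2], [1, 2])]

# The usual order on integers is total.
print(is_total_ordering(numbers, lambda a, b: a > b))
# Proper inclusion of sets is a partial ordering ...
print(is_partial_ordering(subsets, lambda a, b: a > b))
# ... but not a total one, since {1} and {2} are incomparable.
print(is_total_ordering(subsets, lambda a, b: a > b))
```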

Notice that the so-called "unordered list" of HTML actually produces
graphic elements that display a total order in a natural way (since for each
pair of distinct list elements, one is necessarily above the other); HTML
"unordered lists" differ from "ordered lists" in being unenumerated, not in
being unordered.

Definition: Given two partially ordered sets, P with > and P' with
>', their lexicographic product consists of the set of pairs (a,a')
with a in P and a' in P', ordered by (a,a') > (b,b') if a > b or (a = b and
a' >' b').

Theorem: If P and P' are both totally ordered, then so is their
lexicographic product.
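
The theorem can be spot-checked on small finite examples. The sketch below reads the lexicographic ordering as "compare first components, and break ties on the second components"; the names lex_gt, is_total and int_gt are hypothetical, and a finite check like this is of course an illustration, not a proof.

```python
from itertools import product


def lex_gt(gt1, gt2):
    """Ordering on pairs: compare first components, then second on a tie."""
    def gt(p, q):
        (a, a1), (b, b1) = p, q
        return gt1(a, b) or (a == b and gt2(a1, b1))
    return gt


def is_total(elements, gt):
    """Any two elements are equal or comparable (checked on a finite list)."""
    return all(x == y or gt(x, y) or gt(y, x)
               for x, y in product(elements, repeat=2))


def int_gt(a, b):
    return a > b


P = range(3)    # totally ordered by int_gt
P1 = range(4)   # totally ordered by int_gt
pairs = list(product(P, P1))
pair_gt = lex_gt(int_gt, int_gt)

print(is_total(pairs, pair_gt))   # totality carries over to the product
print(pair_gt((1, 0), (0, 3)))    # decided by the first components
print(pair_gt((1, 2), (1, 3)))    # tie on the first, decided by the second
```

Python's built-in comparison of tuples is itself lexicographic, so on these pairs pair_gt agrees with the ordinary > on tuples.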

The reason that the TOD ("time of day") semiotic space has some odd
looking representations that appear to be good mathematically is that this
particular theory of time is very basic, and does not include certain social
conventions which we expect to see preserved in our representations of time.
The most important of these is that the 1440 minutes of a day are enumerated
using two counters, one that goes up to 24 and the other up to 60; these are
combined by the constructor (_,_), to create pairs of counters. Here are the
axioms for this more detailed source space, where h, m are variables for the
hour and minute counters, respectively, and s denotes the unary "next" (or
"tick") function on time (i.e., on the pairs of counter values), and also
denotes the successor function on integers:

s(h, m) = (h, s(m)) if s(m) < 60 .
s(h, m) = (s(h), 0) if s(m) = 60 and s(h) < 24 .
s(23, 59) = (0, 0) .

It is interesting to notice that the usual ordering on time is exactly the
lexicographic product of the two counters, that is, (h, m) > (h', m') if h >
h' or (h=h' and m> m'). With this additional structure on the source space,
the allowable semiotic morphisms are what we would expect, and in particular,
both the strange unary representation, and the decimal number of elapsed
minutes, fail to preserve the structure created by the constructor (_,_).
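
A minimal sketch of the two counter "tick" function may help make the axioms concrete. It assumes that the carry case resets the minute counter to 0, as the wrap-around axiom s(23, 59) = (0, 0) suggests; the name tick is hypothetical.

```python
def tick(t):
    """The 'next' function s on (hour, minute) pairs."""
    h, m = t
    if m + 1 < 60:
        return (h, m + 1)    # ordinary case: only the minute counter moves
    if h + 1 < 24:
        return (h + 1, 0)    # carry: minute counter resets, hour advances
    return (0, 0)            # s(23, 59) = (0, 0)


# Walk through one full day starting at midnight.
day = [(0, 0)]
for _ in range(1439):
    day.append(tick(day[-1]))

print(len(set(day)))    # 1440 distinct times are visited
print(tick(day[-1]))    # after (23, 59) the day wraps around to (0, 0)

# Python compares tuples lexicographically, so this checks that every tick
# before the wrap moves strictly upward in the lexicographic ordering.
print(all(later > earlier for earlier, later in zip(day, day[1:])))
```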

Definition: A projection M on a semiotic space S is a
semiotic morphism with source and target S that is idempotent, i.e.,
that satisfies the equation

M ; M = M .

A simple example maps a list of numbers to their sum, given as a
one-element list. For example, this morphism maps (1,2,3) to (6), (4,5,6) to
(15), and (7,8,9) to (24), and it maps each of the results (6), (15) and (24)
to itself. A similar example computes the average of a list of numbers. A
slightly more complex example maps lists of lists of numbers to lists of the
sums of the component lists.
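
The sum example can be sketched as follows, representing lists as Python tuples; the names sum_projection and is_idempotent_on are hypothetical, and idempotence is only checked pointwise on a few samples.

```python
def sum_projection(xs):
    """Map a tuple of numbers to the one-element tuple holding their sum."""
    return (sum(xs),)


def is_idempotent_on(m, samples):
    """Check M;M = M pointwise on the given samples."""
    return all(m(m(x)) == m(x) for x in samples)


samples = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (6,), ()]
for xs in samples[:3]:
    print(xs, "->", sum_projection(xs))

# Applying the map twice gives the same result as applying it once.
print(is_idempotent_on(sum_projection, samples))
```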

In general, a projection can be undefined on many elements of its semiotic
space. A simple example is mapping numbers to their remainder modulo (say)
60; it is defined on all numbers, but not on the non-numerical character
strings in W. A more complex example is the morphism on W to
itself that takes total elapsed minutes to military time; more precisely, if
N is a string of decimal digits, then

M(N) = Q : R
M(Q : R) = Q : R

where "Q : R" is the quotient Q of N by 60, as a string of digits, followed
by the colon, followed by the remainder R of N by 60, again as a string of
digits.
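
Here is a rough sketch of this partial projection on character strings. Several details are assumptions made only for the example: undefinedness is modeled by returning None, the output is written as "Q:R" without surrounding spaces, and the remainder is not padded to two digits. The name to_military is hypothetical.

```python
import re
from typing import Optional


def to_military(w: str) -> Optional[str]:
    """Partial map on strings; None means 'undefined on this string'."""
    if re.fullmatch(r"[0-9]+", w):
        # M(N) = Q : R, with Q the quotient and R the remainder of N by 60.
        q, r = divmod(int(w), 60)
        return f"{q}:{r}"
    if re.fullmatch(r"[0-9]+:[0-9]+", w):
        # M(Q : R) = Q : R, so strings already in this form are fixed points.
        return w
    return None


print(to_military("125"))     # 125 = 2*60 + 5, so this prints 2:5
print(to_military("2:5"))     # already in Q:R form, mapped to itself
print(to_military("hello"))   # undefined on non-numerical strings: None
```

Wherever the map is defined, applying it a second time changes nothing, which is the idempotence required of a projection.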

Definition: A semiotic theory T' is a refinement of a
semiotic theory T if there is a semiotic morphism f: T ->
T' which preserves all relevant properties of T and which
induces an isomorphism of the algebras of terms of T and T'.
[[More technically, if G, G' are the signatures of T, T', and
if T(G) denotes the algebra of G-terms, then there must be a view f: G ->
Der(G') from T to T' (where Der(G') is the derived term
signature of G') that induces a G-isomorphism T(G) -> T(G')|G, the
reduction of T(G') to a G-algebra via f.]]

For example, the two counter theory for time in minutes is a refinement of
the one counter (with cycle 1440) theory. Similarly, the three counter
theory of time in seconds is a refinement of the one counter theory with
cycle 86,400. In these two examples, the simple theory is refined by
encoding some additional social conventions as constructors and axioms, in a
way that is consistent with the original theory.
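
A small consistency check for the minutes example can be sketched as below. It verifies only that a bijection between the values of the two theories commutes with the "tick" operation, which is much weaker than the full definition of refinement given above; the names tick_one, tick_two, encode and decode are hypothetical, and the carry case of tick_two is assumed to reset the minute counter to 0.

```python
def tick_one(n):
    """'Next' in the one counter theory, with cycle 1440."""
    return (n + 1) % 1440


def tick_two(t):
    """'Next' in the two counter theory, on (hour, minute) pairs."""
    h, m = t
    if m + 1 < 60:
        return (h, m + 1)
    if h + 1 < 24:
        return (h + 1, 0)
    return (0, 0)


def encode(n):
    """One counter value -> (hour, minute) pair."""
    return divmod(n, 60)


def decode(t):
    """(hour, minute) pair -> one counter value."""
    h, m = t
    return 60 * h + m


# encode and decode are mutually inverse on the 1440 values ...
print(all(decode(encode(n)) == n for n in range(1440)))
# ... and ticking before encoding agrees with ticking after encoding.
print(all(encode(tick_one(n)) == tick_two(encode(n)) for n in range(1440)))
```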

Exercise: Consider the same points that are discussed above for
time of day in minutes, but now for time of day measured in seconds,
including the three corresponding clocks. Hint: The more refined
version of the theory should have three counters instead of just two.