There are many resources out there that provide accessibility guidelines - hints on making content easy for a screen-reader to parse, or reminders about content that blind users cannot glean. That's pretty common.

However, something I rarely see is guidelines for audio usability: guidelines that detail the different ways non-sighted users process information, and the ways to use sound, pitch, speed and pauses to signify meaning.

Some specifics I'd like to learn:

In visual interfaces, I can exploit the innate meaning of certain colours - red for errors, for example. Can I do anything similar with audio?

Likewise, in a sighted UI, I can use colour, shape and grouping to signify kinship and relationships between items. Can this be done with audio alone?

We've all seen interfaces that use movement and positioning to hint at physical analogies, or to invoke a real-world event. Think of a deleted item fading into nothingness, or an inactive element appearing to recede. How can I use sound to invoke similar physical concepts?

In real-world scenarios, how quickly can users without the benefit of sight navigate through an interface or sitemap?

Do non-sighted users pick up browsing habits or research strategies that differ from those of their sighted counterparts?

Does anyone have any resources or advice on the specific challenges of designing interfaces that rely either completely or mostly on audio output?

I am going to mark this as a favorite and try to get back to this later. One of my professors worked on an application for making smartphones accessible to users using only audio cues.
– Mervin Johnsingh, Mar 23 '12 at 20:52


I would imagine this would be applicable to certain out-of-home interfaces, too... for instance, audio control of in-car computers. Might even apply to future Siri-powered devices.
– Daniel Newman, Mar 23 '12 at 20:53


The RNIB (Royal National Institute of Blind People) recently sued a UK airline for not providing an accessible website, so this issue can be very important! econsultancy.com/uk/blog/…
– JonW♦, Mar 23 '12 at 21:24

The Eyes-Free project for Android provides several applications that are based on audio and gestures. Their YouTube channel showcases some interesting examples: youtube.com/user/EyesFreeAndroid
– Pau Giner, Mar 24 '12 at 0:30

1 Answer

One of the questions raised was how audio interfaces differ from visual interfaces. I believe this passage explains it very well:

Audio interfaces present content linearly to users, one item at a
time. This contrasts with the way in which most people use visual
interfaces. Sighted users can scan an entire screen almost
instantaneously, comprehending the overall layout, the artistic style,
and other macro-level aspects of the content. Screen reader users
cannot comprehend these macro-level aspects as quickly. The linear
progression through the content from beginning to end is somewhat like
automated telephone menu systems which do not reveal all of the
options at once. Users must progress through such systems in a
step-wise manner.

Reading speed: Most people relying on audio cues use screen readers to access content. Screen readers can read content at speeds of 300 words per minute or more. The reading speed is not fixed; it can be varied, and the comfortable rate depends on the listener's experience, as highlighted below:

Screen readers do not read web content quite like human beings do. The
voice sounds somewhat robotic and monotone. In addition, experienced
users often like to speed up the reading rate to 300 words per minute
or more, which is more than the inexperienced listener can easily
understand. In fact, when many people hear a screen reader for the
first time, at the normal rate of about 180 words per minute, they
complain that it reads too quickly. It takes time to get used to a
screen reader, but the interesting thing is that once users get used
to it, they can race through content at speeds that can amaze sighted
individuals.

This discussion covers how many screen readers let users set the reading speed on a scale to acclimatize themselves, and the challenges involved in determining those scales.
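As a rough illustration of this kind of adjustable rate, here is a minimal TypeScript sketch using the browser's Web Speech API (not a screen reader; a screen reader's speed is set by the user, not by the page, but the rate control is analogous). The sample sentence and the 1.7 multiplier are my own placeholders:

```typescript
// Minimal sketch: the Web Speech API exposes a rate control analogous to a
// screen reader's speed setting. A rate of 1.0 is the engine default
// (roughly the ~180 wpm "normal" rate quoted above); experienced listeners
// push far higher.
function speak(text: string, rate: number = 1.0): void {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = rate; // valid range is 0.1 to 10; 1.0 is the default
  window.speechSynthesis.speak(utterance);
}

speak("Settings. Notifications. On.", 1.0); // pace a first-time listener can follow
speak("Settings. Notifications. On.", 1.7); // closer to an experienced user's ~300 wpm
```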

How screen readers read content, and how that should influence our design/content decisions: Since most of the content read by screen readers is on the web, most of the guidelines are web-oriented, but I guess they could be extended to software-based interfaces too. Here are some of the guidelines with regard to how content is read and rendered into audio cues:

Screen readers try to pronounce acronyms and nonsensical words if they have sufficient vowels/consonants to be pronounceable; otherwise,
they spell out the letters. For example, NASA is pronounced as a word,
whereas NSF is pronounced as "N. S. F." The acronym URL is pronounced
"earl," even though most humans say "U. R. L." The acronym SQL is not
pronounced "sequel" by screen readers even though some humans
pronounce it that way; screen readers say "S. Q. L."

Screen reader users can pause if they didn't understand a word, and go back to listen to it; they can even have the screen reader read
words letter by letter. When reading words letter by letter, JAWS
distinguishes between upper case and lower case letters by
shouting/emphasizing the upper case letters.

Screen readers read letters out loud as you type them, but say "star" or "asterisk" for password fields.

Screen readers announce the page title (the <title> element in the HTML markup).

Screen readers will read the alt text of images, if alt text is present. JAWS precedes the alt text with the word "graphic." If the
image is a link, JAWS precedes the alt text with "graphic link."
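For concreteness, here is a minimal TypeScript/DOM sketch of the markup properties behind those announcements; the file names and label text are invented for illustration, and the announcement strings are the JAWS behaviour quoted above:

```typescript
// The page title: screen readers announce this when a page loads.
document.title = "Order confirmation";

// An informative image: JAWS reads "graphic" followed by the alt text.
const logo = document.createElement("img");
logo.src = "logo.png";   // hypothetical asset
logo.alt = "Acme Corp";  // announced as: "graphic Acme Corp"

// An image inside a link: JAWS reads "graphic link" plus the alt text.
const homeLink = document.createElement("a");
homeLink.href = "/home";
const homeIcon = document.createElement("img");
homeIcon.src = "home.png"; // hypothetical asset
homeIcon.alt = "Home";     // announced as: "graphic link Home"
homeLink.appendChild(homeIcon);

document.body.append(logo, homeLink);
```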

With regard to the challenges of designing accessible solutions with audio-specific output for devices such as smartphones, which no longer provide tactile feedback, I would strongly recommend reading this excellent CHI paper, Usable Gestures for Blind People: Understanding Preference and Performance, by my professor and one of my TAs :). To quote what the paper says about the challenges involved in designing for accessibility on smartphones:

Though screen readers are now included on Android and Apple phones,
accessible touch screens still present challenges to both users and
designers.

Users must be able to learn new touch screen applications quickly
and effectively, while designers must be able to implement
accessible touch screen interaction techniques for a diverse range of
devices and applications. Because most user interface designers are
sighted, they may have a limited understanding of how blind
people experience technology. A designer who wishes to create a new
accessible touch screen-based application currently faces several
challenges.

First, while touch screen interfaces for sighted users are largely consistent due to now-familiar gestures such as
tapping, swiping, and pinching, touch screen interfaces for blind
users vary widely across platforms.

There exist very few examples of how to extend accessible touch screen interfaces to devices other than smartphones.

A designer who wishes to provide gestures in their application must consider whether the gestures will be
appropriate for a blind user. Although blind people may use the same
hardware as their sighted peers, it is possible that they will
prefer to use different gestures, or that they will perform
the same gestures differently than a sighted person. Sighted people
perform gestures differently when they lack visual feedback, and it
is reasonable to assume that a blind person may also perform
gestures differently than a sighted person.

To summarize his findings with regard to designing smartphone interfaces for blind users (note that not all of the findings are applicable from an audio-cue point of view, but they do highlight factors that would help in generating better audio for ease of use):

Avoid symbols used in print writing. Blind users may have limited knowledge of symbols used in print writing, such as letters, numbers,
or punctuation. Even when these symbols are known, users may not
be used to them or may not be comfortable performing them.

Favor edges, corners, and other landmarks. Locating precise spots on the touch screen surface can be very difficult for a user who cannot see the screen. The physical edges and corners of a touch screen are useful landmarks for a blind person. Placing critical functions in these areas will improve accessibility and reduce the likelihood that the user will trigger these functions accidentally.

Reduce demand for location accuracy. Blind users may be less precise in targeting specific areas of the screen, including edges and corners. This problem can be reduced by increasing target size or by allowing approximate targeting methods, such as allowing a user to touch near a target and then explore with their finger to locate it more precisely (the first sketch after this list illustrates the idea).

Limit time-based gesture processing. Blind people may perform gestures at a different pace than sighted people. Thus, using the gesture's speed as a recognition feature or as a parameter (as in kinetic scrolling) may result in increased errors for blind users (the second sketch after this list shows a speed-independent alternative).

Reproduce traditional spatial layouts when possible. Objects with familiar spatial and tactile layouts, such as a
QWERTY keyboard or telephone keypad, are instantly familiar to
many blind people. Reproducing these layouts may make it easier
for a blind person to learn and use a new interface.
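Here is a minimal sketch of the approximate-targeting guideline, in TypeScript against the browser touch APIs; the Target shape, the 60-pixel tolerance and the console "announcement" are my own placeholder assumptions, not anything the paper prescribes:

```typescript
// Approximate targeting: rather than demanding a precise tap, find the
// nearest control within a generous radius of the touch point, so the
// user can land nearby and then explore toward the intended target.
interface Target {
  label: string; // what a speech engine would announce
  x: number;     // target centre, in page coordinates
  y: number;
}

const TOLERANCE_PX = 60; // assumed radius; tune per device and target density

function nearestTarget(touchX: number, touchY: number, targets: Target[]): Target | null {
  let best: Target | null = null;
  let bestDistance = TOLERANCE_PX;
  for (const t of targets) {
    const distance = Math.hypot(touchX - t.x, touchY - t.y);
    if (distance < bestDistance) {
      best = t;
      bestDistance = distance;
    }
  }
  return best; // null: nothing close enough to announce
}

// Usage: announce whatever the finger is nearest to as it moves.
const targets: Target[] = [
  { label: "Play", x: 60, y: 580 },
  { label: "Next", x: 260, y: 580 },
];
document.addEventListener("touchmove", (e) => {
  const hit = nearestTarget(e.touches[0].clientX, e.touches[0].clientY, targets);
  if (hit) console.log(`Announce: ${hit.label}`); // stand-in for speech output
});
```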
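And a second sketch, for the time-based-processing guideline: a swipe recognizer that classifies by direction and distance only, so a slow, deliberate gesture is recognized exactly like a fast one. The 50-pixel threshold is an assumed value:

```typescript
// Speed-independent swipe recognition: elapsed time plays no part, so
// gestures performed slowly are classified the same as fast ones.
type SwipeDirection = "left" | "right" | "up" | "down" | null;

const MIN_DISTANCE_PX = 50; // assumed threshold; tune per device

let startX = 0;
let startY = 0;

document.addEventListener("touchstart", (e) => {
  startX = e.touches[0].clientX;
  startY = e.touches[0].clientY;
});

function classify(endX: number, endY: number): SwipeDirection {
  const dx = endX - startX;
  const dy = endY - startY;
  // Note: no velocity or duration check anywhere.
  if (Math.max(Math.abs(dx), Math.abs(dy)) < MIN_DISTANCE_PX) return null;
  if (Math.abs(dx) > Math.abs(dy)) return dx > 0 ? "right" : "left";
  return dy > 0 ? "down" : "up";
}

document.addEventListener("touchend", (e) => {
  const direction = classify(e.changedTouches[0].clientX, e.changedTouches[0].clientY);
  if (direction) console.log(`Swipe ${direction}`); // e.g. move focus, read next item
});
```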