FROM FOLEY TO FUNCTION: A PEDAGOGICAL APPROACH TO SOUND DESIGN FOR NOVEL INTERACTIONS

Abstract

Increasingly, everyday devices employ computing technology. Due to small or absent screens and their ubiquity in the environment, these interactive commodities might benefit from the consideration of sound in their design and use. However, the criteria for appropriate, enjoyable, and useful sonic interactions, and suitable pedagogical methods for educating future designers in this area, remain to be explored.

We want to encourage design students to create “sounds for tomorrow” in an explorative way, inspired by the sound pioneers of “New Hollywood”, who employed techniques of the avant-garde to establish new sonic identities. We implemented a design process in a workshop setting, continuously scrutinizing sound design and interpretational strategies. The process evolved over three connected stages: Foley-based and electroacoustic Wizard-of-Oz mockups and functional prototypes. Real time sound making, performance of interactions, and critical reflection on the aforementioned are central to our approach. In this article, we elaborate on the method, discuss design cases, and present pedagogical insights.

Introduction

Mark Weiser's vision of the ubiquitous computer (Weiser 1995) has become a reality. From smartphones, which disappear in our pockets, to the computers hidden in our cars, appliances, and houses, to recent gadgets such as Google’s Glass or Samsung’s smart watch, we are surrounded by interactive commodities. Many of these technologies share limited possibilities for visual display, be it due to small or nonexistent screens, because we need our eyes for other tasks during a typical use situation, or because the need for consulting a display may be disruptive. Furthermore, a display may be undesirable for aesthetic reasons. Thus, this literally “disappearing” technology offers several opportunities for using sound in its design. However, due to the relative novelty of such applications, there is a lack of experience and best practice in terms of how and when to use sound.

In our view, it is important to understand that sound designers dealing with interactive experiences, or - perhaps more commonly - (interaction) designers who use sound, are contributing to a process of exploration. At the same time, they are creating standard-setting definitions of what possible future experiences of sound in interactive artifacts can be. Today’s design decisions will define, to a significant extent, how sound in interactive applications will be judged in the near future. From an optimistic standpoint, the situation can be compared to the time of the "New Hollywood" movement in cinema. The (freshly graduated) "movie brats" (Pye and Myles 1979) managed to create and establish a new mainstream sonic aesthetic, which relied on subversion and avant-garde thinking and practice, bringing film sound to a whole new level (Flückiger 2001). This demonstrates that such times of transition offer a real chance for innovative design to set new standards. We believe that we are in a similar situation today in relation to sonic interaction design, sonification, and auditory display. Therefore, the pressing question is: what are our guiding principles, ideals, and aesthetic values, and how will today's design decisions (or the absence of "design" in many of these decisions) impact the soundscape of tomorrow? And how can young designers be educated to embrace the mindset of the New Hollywood “movie brats”?

Background

Despite the fact that sound design of interactive applications is a relatively young field, several relevant aspects have been investigated. First, there is the large body of research from the area of auditory display and sonification. Concepts and guidelines in relation to notification and warning sounds, design concepts such as earcons and auditory icons, and sonification strategies for the representation of data through sound have been widely discussed and investigated (Kramer 1994; Hermann, Hunt and Neuhoff 2011).

In relation to the use of sound for physical interactive products, how the relationship between object, action, and sound is subject to re-configurations has been studied. Hug investigated the emergence of meaning from a complex interplay between an artifact’s characteristics, the social practices around it, and the sounds it produces (Hug 2008).

Other relevant domains of research deal with the emergence of meaning in sonic interactions. Here, actions and events are not only interpreted as meaningful signs and part of a functional (sonic) narrative (Back 1996); the quality of their execution, the performance itself, also comments on, modifies, strengthens, or weakens the function and meaning of their referential aspects (Fischer-Lichte 2001). Goffman describes how we perform the self in everyday social interactions and how everyday artifacts - and their sounds, we might add - become important means in this process (Goffman 1959). And last but not least, (acoustic) sound is an inherently performative medium, dependent on movement and agency: Chion designates this sonic manifestation of the self with the term ergo audition (Chion 1998).

Sound design in this context can be seen as an expressive channel for interactive artifacts and the micro-narratives of interactions. Sound tells us something both about the temporal and dynamic development of processes and about the inherent semantic structure of the interaction with this “genie in a bottle” (Hug 2008). This forms the foundation for an understanding of narrativity and performativity as two central pillars on which sonic interaction design can be built, conceptually as well as in the design process (Hug 2010a; Hug 2010b). Based on this, a conceptual framework has been formulated and evaluated with a focus group of sound design experts (Hug and Misdariis 2011).

A bit outside of academic research, audio branding and product sound communities show increasing interest in the new interactive commodities as opportunities for applying sound design (Bronner, Hirt and Ringe 2012; Spehr 2009). It seems that two relatively distinct groups are approaching the field of interaction design with sound, and it is very important to find ways to better integrate these two worlds (Hug and Misdariis 2011), not only in professional design, but also in educational practice.

The Emergence of Sonic Interaction Design Pedagogics

The areas of knowledge described above provide the basis of a systematic approach to designing sounds for interactive applications. But design education also relies on experience and practical application for examples and learning material, and these are scarce to nonexistent outside academic experimentation. There is a need to involve future designers in the process of exploring and defining the opportunities, challenges, and best practice in sound design for interactive commodities, because they will be the ones who contribute significantly to the standard-setting postulations we discussed above. This leads to the question of how sound can be integrated into interaction design education.

Luckily, we do not have to start from scratch. There is an abundance of textbooks available, in particular in the area of film sound and music or general sound design for media (Sonnenschein 2001; Raffaseder 2002; Viers 2008; Katz 2007; Rose 2008). In relation to interactive media, there is a smaller, but increasing, amount of material available. Many educational books focus on the technical aspects of interactive sound production (Cook 2002; Cancellaro 2006; Farnell 2010), and some focus specifically on audio for computer games, which in some way are the “killer apps” of interactive sound (Marks 2001; Collins 2008). In media and design education, we can observe that sound is becoming increasingly established, with several educational institutions offering bachelor or master degrees in sound design (for media, or in general), or at least devoting a reasonable amount of their curriculum to sound. However, educational programs specifically directed toward interactive sound design are rather rare.

Yet, over the last five to ten years, and in particular in the context of the COST action on Sonic Interaction Design, several proposals regarding pedagogical approaches to sound design for interactive applications have been made and also implemented in workshops and courses. Rocchesso, Serafin and Rinott (Rocchesso, Serafin and Rinott 2013) provide an overview of more recent pedagogical efforts in the area of SID. These efforts are directed on the one hand towards sensitizing participants to sonic interactions, for instance through sound walks or through the production of audio dramas. On the other hand, they are directed at sketching and prototyping with particular toolsets, for instance using vocal sketching (Ekman and Rinott 2010) and sonic overlay of video, but also more complex tools such as the Sound Design Toolkit (Delle Monache, Polotti and Rocchesso 2010).

Franinovic, Hug and Visell (Franinovic, Hug and Visell 2007) have presented conceptual considerations and their implementation in a workshop setting that emphasizes the exploration of basic action-sound relationships. Through methods such as morphological analysis and bodystorming, they encouraged the development of concepts, which were then explored in short video scenarios or through building functional prototypes. An exploration of basic sound design methods in pedagogics, which are inspired by methods of the Bauhaus, can be found for instance in Franinovic and Visell (Franinovic and Visell 2008).

Other efforts focus more on hermeneutics, interpretational phenomena, and dialogical communication processes. Hug has conducted a moderated group analysis of a systematic collection of filmic representations of interaction scenarios and derived “narrative metatopics” as conceptual and formal building elements for sonic interaction narratives (Hug 2010a).

Hug (Hug 2010b) also discussed the need to enable real-time improvisation of sounds in order to tackle the genuinely performative nature of sonic interaction, which we have described above. This conceptual approach has certain implications in design pedagogics, which are also of central importance in the work presented here.

In some cases the pedagogical efforts are connected to practice-based research. Hug has described a way to establish sound in a game design bachelor education, not only as a topic of teaching, but also as classroom-driven research (Hug 2007). Core elements of this approach are the training of listening, an understanding of the functions of sound in computer games, and a structured design process of prototyping and implementing sound in interactive games. An important aspect is the continuous encouragement of experimental and unconventional approaches, which can lead to new insights about possibilities for interactive (game) sound, an approach which is also adopted in the work described here.

Last, but not least, software tools and technical frameworks are important elements in Sonic Interaction Design education as they offer the necessary functionality for realizing interactively controllable sounds. For instance, Delle Monache et al. (Delle Monache et al. 2010) have presented a software toolkit that provides a relatively straightforward way to use procedural, physics-inspired sound synthesis methods for real time applications. Other examples are the “Musical Modules” and the Gesture Follower toolkit (Schnell, Bevilacqua, Rasamimana, Blois, Guedy and Flety 2011).

Challenges in Sonic Interaction Design Education

The pedagogical approaches outlined above cover a wide range of relevant topics and competencies, but there are some specific challenges that are particularly tied to the treatment of sound in the context of an interaction design education. In the following we will outline these challenges.

The Stigma of “Functional” Sound

Sound in interaction design is often understood as design element that has the right to exist if, and only if, it offers a specific functional benefit – otherwise it is superfluous. Usually, sound is used to notify, inform, or warn us, or, as is the case in sonification, to represent some kind of data flow. “Used” is to be taken literally, as the sounds are normally based on existing sounds, be it identifiable everyday sounds (in the case of auditory icons) or tones and musical sounds respectively (as in the case of earcons and most warning sounds).[1]

The (Undesired) Dominance of Technology

A second challenge is that in interaction design education, despite all opposing efforts, technology is often a key concern. At least in later stages of the curriculum, artifacts will have to be functionally implemented. In our observation, this leads to a design (and learning) strategy that tries to make do with what seems technically feasible. This fixation on “making things work” technically is a general problem for the aesthetic development of the students. Each day and hour spent on technological implementation is lost for efforts that deal with aesthetic definition, inspiration, experimentation, and ultimately self-discovery as an “artist-designer”.

The Dialectic of Tools

We have mentioned above that tools play an important role in sonic interaction design. Provided they are simple enough to use, these tools can indeed offer a starting point for interactive sound generation. At the same time, every tool affords a certain functional and aesthetic direction, thus biasing the design and potentially misleading students into accepting the underlying technologies as the only approach. This prevents the invention of new methods and tools for novel contexts and applications. The methods we are proposing here are no exception, but we claim that they are much more open to change and modification, much like a sketch in a visual design process (Buxton 2008).

The Aesthetic and Technical Complexity of Interactive Sound

Interactive sound builds on the already technology-heavy field of digital audio production and demands a broad skill set, from knowledge of synthesis, acoustics, interface design, sensor technology, and electronics, up to skills in analytic listening and music. These topics fill the curricula of full-time educational programs, for instance in computer music or composition for new media, and as such are out of the scope of a curriculum in interaction design. In addition, we are dealing with the fact that sound will not play a central role in the professional life of most of our students.

The Mythical Sound Designer

Partially related to this complexity, sound design is often considered to be some obscure art form, performed by highly specialized eccentrics in esoteric and expensive high-end studios. This myth about sound design is particularly hard to battle. Interestingly, the vast majority of “making-ofs” and other sources of information on sound design (for film, mainly) usually serve to reinforce the stereotype. One reason for this lies in the nature of film sound, which – as with other cinematographic elements – ultimately aims to hide the artifice in favor of the “suspension of disbelief”.

The result of all these factors is that only a small minority of students will even attempt to tackle interactive sound: Why bother with a “superfluous” modality which demands extensive technical skills usually taught at specialized institutions when dealing with such a complex interdisciplinary field as interaction design?

Pedagogical Framework and Implementation in Workshop Setting

One strategy to mitigate these issues is to approach sound not from a technical, but from a "sound studies" perspective, understanding its role and relevance both for the sensory being of the individual as well as on a socio-cultural scale. Second, we need to put the “design” back into “sound design”: We need to motivate our students to treat sound as an equivalent design material.

Building on this, and the understanding of the educational challenges described above, we have devised methods that allow students of interaction design to explore and implement interaction design scenarios, using sound from the very beginning, in a dialogical setting. In the following we describe our pedagogical principles and the resulting educational framework.

Pedagogical Principles

The Classroom as Dialogical Research Lab

This point may sound surprising, as it is not directly connected to a teaching outcome, but to a research question. We are "teaching" about artifacts, which may become future products, but are not (yet) part of our everyday life experience. Thus, in these workshops, we co-create the reality about which we teach. The consequence is that these workshops are also part of a general research setting where we investigate the emerging phenomena (see also [Hug 2007] and [Hug 2010a]). More precisely, we investigate the discourse and phenomena occurring in the design and interpretation of interactive everyday artifacts that do not exist yet as "real commodities", but are projections of possible futures in the form of prototypes and mockups. Thus, we are trying to contribute to a discourse of sonic studies that deals specifically with an emerging field of (potential) everyday sound experience.

As a consequence, we aim for an exploration of sonic interaction design that is not “expertise driven”, building on what (little) could be known already, but rather, on a setup that enables a dialogical, discursive approach. For this, we need a responsive, flexible design approach, which is able to deal with insights emerging from the interaction process. This process, and by consequence also all tools, have to be open to improvisation and variation, also when used by non-experts. This flexibility is also needed, because we understand the experience of technology as an inherently dynamic, dialogical process (McCarthy and Wright 2004).

Appropriate Scope and Tools for “Non-Sound Designers”

In order to reach our goal, a radical reduction of conceptual and theoretical aspects and the related empirical research is necessary. Auditory display and sonification, and their related design principles, are not discussed in depth, because the goals we aim to reach do not depend on this knowledge, and our approach to education as experimental laboratory works better if a design bias is avoided.[2]

On the other hand, in response to the related pedagogical issue mentioned earlier, there is the need to demystify sound design, as a prerequisite for reducing inhibitions and increasing self-confidence in students, independent of their background and experience with sound. This is partially achieved by using “quick’n’dirty”[3] low-tech methods that are readily available and do not require extensive practice or experience with technical tools, but offer access to a vast, nearly unlimited range of sonic possibilities, as well as the possibility to increase the impact of the design demonstration through practice and experience. Such methods allow for a process that begins with drafting the interactive sound experience and gradually works towards a final, technically elaborate implementation. In so doing, we try to make sure that no compromises on sound design are made.

Maintaining a “Sound” Focus

In order to fulfill the promise of investigating sound in its full potential, it is critical that sound remains at the center of the creative and reflective process. As outlined above, interaction designers usually deal with many different aspects in design; therefore, the methodical framework has to encourage, and sometimes even enforce, a maintained focus on sound.

The Resulting Pedagogical Framework

Required Preliminary Competencies

For the overall research strategy to yield the desired results it has to be ascertained that all participants share a comparable and sufficient level of sound-related competence. While not required for the initial stage, later development stages require some basic knowledge of sound editing and multitrack composition with simple effects. Here we focus on a range of methods that originate in tape-based sound editing and basic signal processing methods, which can be understood and reproduced with relative ease. And, as we are also dealing with functional prototypes, another precondition is a basic knowledge of electronics and programming.

Stages of the Process

For developing the sound concepts, we used a range of prototyping methods, which we arranged in a three-stage process. They fulfill the relevant pedagogical requirements described above:

They allow us to deal with performativity and are open for improvisation and dialogical exploration, by employing live, ad-hoc sound making techniques

They rely on tools which are accessible, easy to learn, but offer a potential for complexity and iterative refinement

The sound making techniques and tools are integrated in such a way that continuous development from open exploration to functional implementation is possible

For the exploration of the “possible futures” through design, and to make sure that the resulting artifacts also can serve as cases for further study, we need to produce cases of “plausible experiences” that can be analyzed in the group. This is a common practice in design called “experience prototyping”, which is described as “any kind of representation, in any medium, that is designed to understand, explore or communicate what it might be like to engage with the product, space or system we are designing” (Buchenau and Suri 2000: 425). In our case, we used a variation of the Wizard-of-Oz prototyping method,[4] where a computational system is simulated by an invisible human, triggering events in real time, while another person acts out the scenario or uses the prototype. This satisfies the requirement of providing performative, improvisational real time prototypes and provides the basis for the final stage, the implementation of a functional prototype.

In order to encourage the dialogical process, during all demonstrations and at the end of each stage, the experience is discussed with the whole class. The demonstrations and the discussions are also recorded on video, which is not only important for the ongoing reflection, but also for the underlying research effort described above.

In the following we will describe the resulting stages in more detail.

Stage 1: Foley Mockup

In this stage, we focus on exploring the richness of sound as material and make it possible to develop an initial concept that can be implemented in the form of a live demonstration of a Wizard-of-Oz mockup.

As mentioned above, our methods need to be accessible to someone with very little previous experience and open for heterogeneous groups with different backgrounds in terms of sound and music production. Foley, the technique of creating sound effects synchronous to film, is ideal for this and can be easily learned through a set of simple exercises (e.g. “Semantic Foley”, see [Hug 2010b]). Combined with the possibility of making sounds with the voice and the body, it offers a vast amount of sonic possibilities with a very simple technical setup (Ament 2009). One technical element, however, is essential: Just as in actual Foleying, the sounds made must be captured with a microphone. The microphone is the first element of an electroacoustic sonic transformation and provides an aesthetic link to the following step in the design process, the electroacoustic mockup. The microphone, and the projection of sound through a loudspeaker, provide an acousmatic listening condition, which further supports the understanding of sound transformation in the electroacoustic condition.

As the temporal and technical investment in creating the sounds for the Foley stage is relatively low, another requirement is fulfilled, which is the openness to change and the creation of design alternatives. Also, it makes it easier to abandon design directions, or to “kill your darlings”[5] if necessary.

Finally, this method is particularly open to improvisation, which supports a dialogical form of design: a sonic idea can be immediately adapted based on the reactions of users and the audience, which, being unexpected, often challenge assumptions about the interaction.

Figure 1: Performing the Foley mockup

In the workshop, after some introductory exercises, the actual development of the Foley mockup commences right from the beginning. The students go through the usual steps of research and concept development and are asked to develop an application scenario, which needs to be plausible within possible everyday settings. Also, the narrative and performative quality of the interaction element and how it could be manifested in sound should be considered. The Foley mockup is then performed, using the Wizard-of-Oz method. Ideally, somebody from the audience performs the interaction. Afterwards, the demonstration is discussed with all participants, and ad-hoc suggestions might be tried out immediately, using the Foley technique. Thus, the Foley mockup stage serves as both proof of concept and playful inspirational session.

Stage 2: Electroacoustic Mockup

Building on the results of the Foley mockup, the next step uses recorded sounds that are played back via a multisampler setup. In didactical terms, and also in terms of design iteration, this stage serves to connect the Foley stage with the final stage, where the interactive systems are implemented as functional prototypes.

The use of multitrack editing allows students to create complex sounds in a more controlled way than is possible with Foley. Still, the experience and sound ideas from the Foley stage can be used as a starting point. The resulting sounds can then be triggered with the multisampler software, and real time mappings can be explored and performed by the “wizard” with a MIDI keyboard. These tools require a more structured approach already, in terms of defining mappings and producing assets, but are still easy to understand and handle and relatively flexible to modify.
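To illustrate the kind of note-to-asset mapping a “wizard” performs at this stage, the core logic can be sketched as a simple lookup from MIDI note and velocity to a sample and a gain. The sample names and the linear velocity-to-gain curve below are hypothetical placeholders, not part of any specific workshop toolset:

```python
# Sketch of a "wizard" mapping: MIDI note-on events trigger sound assets,
# and key velocity scales playback gain. All sample names are illustrative.

SAMPLE_MAP = {
    60: "door_creak.wav",  # C4: artifact opens
    62: "hum_loop.wav",    # D4: idle state
    64: "chime_up.wav",    # E4: confirmation
}

def map_note(note, velocity):
    """Return (sample, gain) for a MIDI note-on, or None if unmapped."""
    sample = SAMPLE_MAP.get(note)
    if sample is None:
        return None
    gain = velocity / 127.0  # assumed linear velocity-to-gain mapping
    return sample, round(gain, 2)
```

In a real setup, a multisampler handles this lookup internally; the sketch only makes explicit that the “wizard” is performing a mapping that the functional prototype will later automate.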

The overall process of this stage is the same as in the Foley stage: The participants are asked to refine the initial ideas into a specific application scenario, considering the design requirements of the situational context, and to work towards a systematic approach to defining sound assets, modification parameters, and mappings to specific interaction elements. They are asked to perform their interaction scenario, using MIDI controllers to explore the dynamic relationships of sound properties, action and artifact.

Finally, again, the interactive scenarios are performed in a Wizard-of-Oz setting, and participants can try the interactions themselves. Also at this stage, the participants are asked to consider designing for performative and expressive variation and adaptation and to stay open to improvisation.

Figure 2: Two wizards using a MIDI keyboard to perform the electroacoustic mockup

Stage 3: Functional Prototype

The exploration of the context and the possibilities of sounding interactive artifacts developed in the previous stages leads us to the third stage, where the students are asked to develop a functional prototype of the envisioned system. This stage is crucial because it requires, on one hand, a clear vision of the concept and offers, on the other hand, the possibility for a broader audience to interact with the designed artifact or system. The electroacoustic mockup forms an ideal basis for the functional prototype, as the students are able to reuse their digital sounds implemented in the multisampler environment and simply replace the “wizard” with a combination of sensor technology and programming.

The key challenge in this stage is to abstract from real-world situations to single parameters and eventually arrive at simple yet effective solutions that also take into account the chosen context. In our case of interactive sound design, this means using appropriate interfaces between input data and the sound production environment. Therefore, students use established prototyping tools (e.g. Processing, Arduino) together with improvised combinations of different sensor platforms (Microsoft Kinect, distance sensors, microphones).
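The abstraction from a real-world situation to single parameters often reduces to clamping and normalizing a raw sensor reading before mapping it to sound parameters. The sketch below assumes a hypothetical distance sensor and an arbitrary 220-880 Hz pitch range; the specific ranges are illustrative, not values from the workshop:

```python
def distance_to_params(distance_cm, d_min=5.0, d_max=150.0):
    """Map a distance reading (cm) to normalized sound parameters.

    Closer objects produce a louder, higher-pitched sound; readings
    are clamped to the sensor's assumed usable range.
    """
    d = max(d_min, min(d_max, distance_cm))
    proximity = 1.0 - (d - d_min) / (d_max - d_min)  # 1.0 = closest
    volume = round(proximity, 3)
    pitch_hz = round(220.0 + proximity * 660.0, 1)  # sweep 220-880 Hz
    return volume, pitch_hz
```

The design decision here is which perceptual quality the raw parameter should drive; in the workshop this decision has typically already been made intuitively at the MIDI-controller stage.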

Figure 3: Working on the electronics for the functional prototype

The students benefit from the previous stages in the process: both the situation and the direction of the sound design have been defined. Likewise, the students have already developed the basic mapping intuitively by exploring parameter controls with the MIDI keyboards. From this basis, an initial decision about the technology to be used can be made. Guidance in the early stages is useful for focusing on the key events in the concept that could be realized within the setting of a functional prototype, as it is hard to fully implement a whole system in the short time of the workshop. This step also means scaling down the participants’ initial expectations regarding the possibilities of the technology.

Case Studies

In the following, we will present and discuss three cases that emerged during a workshop held at Zurich University of the Arts in November 2012. The workshop lasted twelve days in three consecutive weeks. Fourteen bachelor students participated, eleven of whom were male. None of them had a previous background in sound design, except for the courses providing the fundamental knowledge described above.

In order to provide a suitable starting point and a fast progression from the course start to production, four initial scenarios were given to the students, and groups were formed around them. These scenarios also functioned as design constraints, helping to keep the focus on a specific application context, motivated by core characteristics of use, and the overall situation. To assure consistency with previous research, they were based on the conceptual framework for interactive commodities described in (Hug 2010a) and (Hug 2013).

The following assignments/design scenarios were given:

Doc-o-Mat: A wearable or implant with a body-related purpose (“quantified self”): e.g. health monitor for specific genetic dispositions, nutritional values, fitness. It is designed to work in private and public settings.

Matchmaker: A wearable or implant that manages social relationships in public environments or works in relation to the experience of social situations. It may offer means for wearers to express themselves and communicate in a social setting. It may work at various levels of intimacy.

Toyz: This scenario is less defined by a specific situational context, but focuses instead on interactions that afford playful behavior, or that specifically address children.

Project 1: “Dipalu”

Maria Antonieta Diaz, Patrick Müller, Marc Schneider

This project was based on the assignment of creating a toy. The initial idea of the group was to create something that could be used in the kindergarten. There was a strong relation to a real-world application, because one participant’s sister worked in childcare. Two directions were followed in the beginning: One was a device that would help to collect and share sounds; the other proposed a pet-like artifact that could mediate interactions between children and teacher.

Figure 4: Using simple materials to present the Foley mockup

Foley Mockup

In the Foley mockup, the group acted out both of their scenarios. The implementation of the Foley mockup was very simple. The participants used only their voice, both for mimicking the sound collector and for the character’s voice. In order to represent the pet-like artifact, a simple toilet roll was used. This approach allowed them to prepare the prototype with little effort. On the other hand, the sonic elaboration and variability in interaction were rather limited, and there were some insecurities during the demonstration due to a lack of practice, as the group had spent quite a lot of time sketching storyboards and discussing ideas.

During the demonstration, it became noticeable that the group enjoyed their idea, and producing sounds in real time in front of an audience was no trouble for them. Several sonic ideas emerged from ad-hoc improvisation. However, the actor needed almost constant eye contact with the wizard, which disrupted the presentation somewhat. Also, he behaved like an actor on a stage rather than a casual “user”. Another issue was the audibility of the sounds, which led to follow-up explanations about design aspects.

Electroacoustic Mockup

For the development of the electroacoustic mockup, a refined scenario was developed, focusing on the idea of a pet-like character that could be placed inside a kindergarten. It was meant to be able to react to sonic situations as well as influence the children’s behavior by various vocal utterances, depending on the noise level. In this way nursery teachers could use it as a mediator for getting children to be quiet and listen or to stimulate their interest in a specific situation.

The group discussed situations that could occur when a large number of children are in one room and created visualizations of these situations. This helped the group define key interaction states between the object and the children (e.g. children being noisy, quiet, talking, singing, etc.). By doing so, they were then able to systematize the expressive reactions of their object and the accompanying sounding qualities. Starting with simple sounds, the group refined the sound design through several iterations. Building on the Foley mockup, they worked with their voices, developing a systematic catalogue of vocal utterances. They edited the voices to make them more abstract and explored various expressive real time controls.

The demonstration performance still contained some improvisation, but when the audience challenged the interaction, e.g. by reacting differently than expected, the prototype turned out to be less flexible than before.

Functional Prototype

In the final step, the group found not only a way to integrate the functions of their toy into a single object, but they even designed an entire narrative around the object. They created a furry creature, gave it the name “Dipalu”, and created an illustrated background story for it.

The tracking of the loudness was realized with a simple microphone, attached to a notebook computer. They developed a program that scanned for sonic patterns based on noise level (e.g. children talking or screaming). These situations were then mapped onto MIDI notes that were sent to the multisampler software where the predefined sonic feedback was triggered.
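The noise-level classification described above can be sketched as a minimal loudness-to-MIDI mapping. The thresholds, state names, and note numbers below are illustrative assumptions only; the group’s actual values and pattern-scanning logic are not documented:

```python
import math

# Illustrative loudness thresholds (RMS, 0.0-1.0) and MIDI notes;
# the actual values used by the Dipalu group are not documented.
STATES = [
    (0.05, "quiet", 60),     # -> calm utterance sample
    (0.25, "talking", 62),   # -> attentive utterance sample
    (0.60, "noisy", 64),     # -> agitated utterance sample
]
SCREAMING = ("screaming", 65)  # anything louder

def rms(window):
    """Root-mean-square loudness of one analysis window of samples."""
    return math.sqrt(sum(s * s for s in window) / len(window))

def classify(level):
    """Map a loudness level to an (interaction state, MIDI note) pair.

    The note would then be sent to the multisampler, which triggers
    the predefined sonic feedback for that state.
    """
    for threshold, state, note in STATES:
        if level < threshold:
            return state, note
    return SCREAMING
```

In the actual setup, the resulting note would be sent to the multisampler software via a MIDI connection; the mapping is kept as a pure function here for clarity.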

Figure 5: The final object and hand-drawn storyboard from the Dipalu group

With this setup, the group managed to create a tangible experience of the product and situation. The final result was still very close to the initial approach and demonstrates how the method proposed here can work ideally: exploring interactive sound through simple tools at the beginning of the process and finalizing the concept by building a working prototype at the end.

Concluding Remarks

As mentioned, this group managed to follow the proposed method closely. At the beginning, their Foley mockup helped them improvise and explore several ideas. The live performance of the electroacoustic mockup led to a more systematic yet still flexible approach. The group reports in their project diary:

“Now we were no longer experimenting, and we had to record certain emotions for this conceptual creature to come alive. We made a basic interaction board.”

This statement shows, on the one hand, the successful transition to a more systematic approach, but on the other that experimentation, in particular with interactive processes, was abandoned somewhat early.

Despite the simple drafting methods, the actual development of sounds was a core challenge to many students. The “Dipalu” group reported:

“It was still a difficult process because we didn’t have a clear picture in our heads of what this creature was; we only had a sketch, around which we have constructed our story.”

It turned out that this storytelling was an important catalyst for making sound design decisions.

Furthermore, this group demonstrated that the methods we are describing are suitable and even enjoyable for non-sound designers for developing sounds in a heterogeneous team. They produced their sounds, “everyone contributing with unique sounds; it was a long but fun process.” The quality of the sounds produced was surprisingly high, considering their limited experience. During the different stages they discovered the value of creating rich, varied sonic material and also took the opportunity to explore sounds outside of the narrow confines of the interaction script or the requirement specifications.

Project 2: Wash and Play

Basil Schmid, Nils Solanki, Ramun Rinklin

For this group, the starting assignment was the “Doc-o-Mat”. They soon focused on the process of hand washing, which has again become widely discussed in the context of swine flu. The group’s aim was to make washing one’s hands according to the WHO guidelines more enjoyable through “gamification”.[6]

Foley Mockup

The concept of gamification led the group to the conclusion that “game-like sounds” should be used, in the form of the synthetic 8-bit sounds known from old video game consoles, with the aim of transferring the aesthetic of their gaming experiences into an everyday context (washing hands). A further argument was that synthetic beeps were already part of the soundscape in a hospital, so their design approach would work because the basic sonic aesthetics were familiar while still offering a novel, playful quality to the chore of washing one’s hands. While these are valid positions, the result was that the “Foley stage” in this project was effectively nonexistent.

Electroacoustic Mockup

Because the first prototype turned out to be quite linear, with game sounds being triggered at each step of the process of washing hands, we encouraged the team to further explore the motivational and gestural aspects. In particular they were asked to investigate potentials for interactive sound-action relationships, to help increase the efficiency of washing hands, which, according to the WHO guidelines, depends on hand positions and the correct execution of specific motions.

The group thus investigated the habits of people when washing their hands and created a step-by-step overview of the process. They decided to work with spatial position, the overall temporal development, and timing and strength of hand movement. Still, the design remained rather linear. Further qualitative research would have been necessary in order to find interesting action-sound couplings.

Sonically, the decision to recreate the 8-bit video game aesthetics was paramount. When asked to explore variations and interactive relationships, they resorted to a piece of software that generates 8-bit game-style sounds based on the settings of a few parameters. With this software they could create a set of sound variations without departing from the aesthetic framework; on the other hand, the opportunity for a more open sonic exploration was lost. This was also a consequence of their interaction concept, which focused on signal sounds for each consecutive stage of hand washing and did not consider more expressive and interactive potentials, or how sound could be used to motivate people to wash their hands more thoroughly.

Functional Prototype

The “Wash and Play” team was very motivated to implement their prototype in the final phase, and they stated in a discussion that for them the whole project depended on a successful functional implementation. During this stage, the group was still limited in terms of sound design by their chosen style. Also, their goal of installing the prototype in an actual restroom led to long experimentation and development with the sensors and electronics, which had to be fitted into the faucet and soap dispenser.

Figure 6: Sensor being used by the Wash and Play group to track running water

On the one hand, the team profited from this effort, because it was very rewarding to experience the interaction in a realistic setting. Also, the challenge led to some innovative technical solutions from which the group learned a lot.

VideoObject 1: Wash and Play final presentation

In terms of the workshop goals, the consequence of this technical absorption was that even the few remaining opportunities to investigate and develop truly interactive aspects and the corresponding sounds, for instance during the rubbing of the hands, could not be explored further, nor could the topic of hygiene be investigated more thoroughly.

Concluding Remarks

Although the 8-bit game aesthetic exerted its usual charm, the style-based approach chosen by this group did not correspond well with the proposed methodology; maintaining the style restricts the creation of sounds and the sonic flexibility required for modulations in line with interactions. There is also the danger of producing sonic stereotypes that can soon become tiresome. The discussions with the group revealed a certain degree of distrust that any sound could possibly be interesting enough, which resulted in an orientation towards sounds that were considered “cool” – in this case a certain type of game sound. The argument that synthetic beeps are already common in hospital and health center contexts seemed valid, but it also leads to the reinforcement of a (problematic) sonic condition rather than offering alternative approaches.

Last but not least, the absence of an actual Foley mockup precluded innovative insights from an improvisational setting, as well as the mental readiness to abandon some design ideas if necessary. A related issue was the focus on using sounds to notify and represent certain stages rather than to explore expressive, performative, and narrative qualities.

Project 3: “Sonotag”

This project emerged from the assignment “Matchmaker”. The group’s initial intention was to focus on the slightly awkward social situation that comes into being when two or more people are in an elevator and the resulting issues of interpersonal communication.

In their initial research the group investigated typical elevator scenes from movies in order to better understand the narrative dimensions associated with the setting. The existing use of sound in the form of “elevator music” was judged to be insufficient or even counterproductive. Thus, the group’s approach was ultimately directed at stimulating movement through sound, in order to relieve the awkwardness of standing in a certain position.

Figure 7: Acting out the Foley mockup of the Sonotag group

Foley Mockup

The Foley mockup employed two pitched tones, representing two people standing in an imaginary elevator. The pitch of the tones was mapped to the distance between them. The goal was to make the two pitches match by moving around in the elevator.

The intention of the group was to confuse both participants by using strange and alien tones, requiring them to cooperate in resolving the confusion and thus reducing the inhibition to interact with each other. The sounds were produced by voice. This had the advantage that the resulting sounds had a complex texture and were not too static, despite being pitched. The disadvantage was that there were interruptions of the sustained tone caused by breathing, which could be misunderstood as the beginning of a new sound event. Also, the voice was not strange or alien enough for the desired effect of confusion, as it was recognizable as being the voice of a human being. Moreover, this recognizability made the sound rather funny.

The mockup managed to clearly demonstrate the principle of interaction. Nevertheless, the group could not explore the full potential of sound as a socializing element, as their focus on representing distance alone was too technical and obvious.

Electroacoustic Mockup

For the electroacoustic mockup, the group used a heavily pitched version of the recorded voice that could be looped continuously. This provided two advantages: First, the voice was defamiliarized and indeed turned into a strange, alien sound that had an identity of its own, without revealing the human origin. Second, it could be played continuously and more precisely, as the setup offered better control of duration and pitch.

During the demonstration they experienced difficulties finding the right match between action and the triggering of the tones. Also, while this stage was very similar to the Foley mockup, it actually reduced the complexity of sonic expressions and interactions.

Functional Prototype

As the previous stage resulted in even more limitations for interaction, the group decided to turn the basic principle of two people interacting through distance into a narrative game, which was named “Sonotag”. The two players took the roles of a rabbit and a wolf who meet each other in the middle of an imaginary playfield. They are blindfolded, and the only information from the game is provided through sounds delivered over headphones, which are wirelessly connected to a computer. This computer calculates the distance between the two protagonists as well as their distances to the border of the playfield. Based on these distances, different sonic feedback is played back to the participants: both protagonists hear the distance to their opponent as a pitched voice sample, complemented by breathing and growling sounds when they are very near; being eaten is heard as a crunchy bite, and touching the border is audible as an electric discharge. This latter sound event turned out to be very prominent and almost dominated the sonic experience. The reason was that the tracking system would break down if a participant stepped outside a certain area. Thus, a technical limitation forced the group to resort to a rather stereotypical warning sound, which did not fit into the overall narrative.
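The game’s distance-to-sound logic can be sketched as follows. The field size, thresholds, and cue names are hypothetical, since the group’s actual parameters are not documented:

```python
import math

# Hypothetical parameters (in meters); not the group's actual values.
CAUGHT = 0.3        # wolf catches rabbit below this distance
NEAR = 1.5          # breathing/growling becomes audible
FIELD_HALF = 3.0    # half-width of the (assumed square) playfield
MARGIN = 0.5        # border warning zone

def distance(a, b):
    """Euclidean distance between two tracked 2D positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def feedback(rabbit, wolf):
    """Select the sound cues for one tracking frame."""
    cues = []
    d = distance(rabbit, wolf)
    if d <= CAUGHT:
        cues.append("bite")  # the rabbit is eaten
    else:
        # Opponent distance rendered as a pitched voice sample:
        # the nearer the opponent, the higher the pitch factor.
        cues.append(("voice", round(1.0 / d, 2)))
        if d <= NEAR:
            cues.append("growl")
    # Electric-discharge warning near the playfield border, where
    # the tracking would otherwise break down.
    for pos in (rabbit, wolf):
        if max(abs(pos[0]), abs(pos[1])) > FIELD_HALF - MARGIN:
            cues.append("discharge")
            break
    return cues
```

In the actual prototype, the positions came from the Kinect-based tracking and the cues were rendered to the players’ wireless headphones; this sketch only illustrates the selection logic.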

VideoObject 2: Sonotag final presentation

The functional mockup was quite consistent with the initial approach but abandoned the original situational context in favor of a dramatic game. The group still used vocal sounds but added some illustrative or metaphorical sounds. The advanced tracking technology, based on a Microsoft Kinect camera, together with the tight restrictions imposed by the gameplay as well as issues associated with the height of the room, led to a somewhat more challenging programming task. Furthermore, they needed to implement a wireless sound transmission solution, and the narrative nature of the game required a rather cinematic sound design. By splitting the group and distributing the work across three domains (tracking of the players, the gameplay, and the sound design), they were able to meet these challenges. But an additional effort was needed to ensure aesthetic coherence.

Figure 8: Camera-based system for tracking the movement of the players for the functional prototype of Sonotag

Concluding Remarks

The initial idea of creating disturbing and alien sounds to break social barriers turned into a simple representation of the distance between people. This example demonstrates that if the mockup is too reductionist and simplified, an otherwise good concept is endangered, leading to aesthetic dead ends. In this case, resorting to a narrative approach helped overcome the issue. The narrative, together with the gameplay requirements, clearly shaped the possible interactions and bridged the gap to the sound making process by defining a common aesthetic. In the end, however, the technology left its mark by forcing the group to implement a warning sound that they found inappropriate to the chosen sound aesthetic.

Discussion and Recommendations

In this section we will discuss the observations that we have made during the workshop and present the resulting recommendations in relation to sound pedagogics for interaction design.

General Recommendations

Enabling a Performance-Driven, Dialogical Process

We have described, above, the benefits of using a pedagogical approach that emphasizes real time sound making and live demonstration of interaction scenarios, while also offering the possibility to challenge an interaction process with unexpected behaviors. In principle, both Foley and electroacoustic mockups can encourage exploration and ad-hoc ideation, especially at the beginning of the design process. These stages are akin to sketching and moodboards in visual design. But it is important to acknowledge that this method requires the participants to overcome inhibitions related to being on “stage” and, in addition, inhibitions related to making sounds in a playful way.[7]

From our observations, the demonstrations were often a bit too controlled, and improvisation and ad-hoc ideation were not as common as we desired, as the participants followed their scripts, even in the Foley stage. This also has to do with how the educational system in general encourages the presentation of final concepts rather than the collective exploration of open ideas.

There are two strategies that can help resolve these issues. First, the urge to “script” interactions in a restricting manner must be resisted; the sonic experiences should always be able to influence the concept, and vice versa. Second, audience interaction, with its resulting moments of serendipity and surprise, should be encouraged or even enforced. The presenters need to understand themselves as part of a rule-based, generative, and dialogical system rather than as actors acting out a previously prepared script.

Another concern is the (self-inflicted) urge of many students to “build” a functional prototype, as they consider this the only valid form of prototyping. The appreciation of non-functional prototypes and quick’n’dirty, low-tech methods is essential.

Enabling Sound-Driven Interaction Design

Providing methods that make use of the performance-driven sound design approach is the basis for enabling truly sound-driven interaction design. The “ease of use” of Foley and the seamless transition to the electroacoustic mockup allow the participants to use sound as material to design with from the beginning. The process facilitates the participation of all members in heterogeneous groups and enables non-sound designers to develop and finally implement convincing sonic ideas as well.

We see these sound-driven methods as an equivalent to conventional drafting and prototyping methods in interaction design. But they are not yet established and are not always taken seriously. Therefore, “designing through sound making” needs to be advocated, and other more established methods for developing design ideas, such as sketching and storyboarding, need to be deliberately restricted, at least in the initial stage of development.

Avoiding a Purely Representational Use of Sound

In demanding an open and explorational approach towards using sound in interaction design, we also argue against reducing sound to a purely representational medium (i.e. indicating “error” or the linear change of some value). In our view, this reduction is a central issue that prevents sound from being used successfully in many cases, as it leads to simplistic beeps and potentially annoying stereotypes that fail to stimulate interaction. The issue relates to the “stigma” of functional sound mentioned in section 4. In the case of the elevator scenario from Group 3, strangeness and surprising sonic qualities would have been promising paths to explore. We need to understand that expression can also be considered a function, and the character of a sound is part of its meaning.

Avoiding Stylistic Bias, Stereotypes and Trivial Mappings

From the case analysis we have seen that a style-driven design approach restricts the exploration of alternatives that might serve the design’s function more effectively. It usually leads to static and unresponsive solutions, because possibilities for modulation and variation are limited by the style’s formal requirements. In general, the flexibility and openness of the sound decline as aesthetic pre-determinations increase. Similarly, the simplification of sonic interactions leads to stereotypes and trivial mappings, which may clarify the interactions but lead to dead ends in the pursuit of novel, fitting “sounds for tomorrow”.

A special case of stereotype is the human voice. While on the one hand it is a powerful means of sonic drafting, on the other, it requires extensive work to defamiliarize it, to detach it from its obvious human origins. In the case of Dipalu, this was not an issue, because the voice was meant to be the voice of the creature and could, as such, be sufficiently defamiliarized.

In some cases, the recourse to predefined styles and stereotypes originated in a lack of sonic ideas, arising from an inhibition of exploration and a lack of experience in sound design, but possibly also from the representational approaches mentioned above. As it turned out, narrativity could be of great help here. Creating stories around artifacts as characters helped define the sonic aesthetics and also provided a useful basis for ensuring aesthetic coherence without being too restricted by a style.

Recommendations Related to Specific Stages

Establishing Foley as Demonstration Method

Using Foley in the design process required practice and at least minimal microphone skills from the students. Otherwise, this stage can lead to an oversimplification of sounds, which results in an unsatisfying demonstration, which in turn can lead to the abandonment of an otherwise good approach. Live sound making as a means of sonic interaction design needs to be taken seriously, and it is important to push participants to really explore the vast universe of sonic possibilities provided by objects.

Multisampling as Mediating Technology

With multisampling and MIDI mapping, this stage allows students to prototype their ideas in a more structured and controlled way than with the Foley method. It therefore requires a clear concept and, ideally, some existing sounds from the Foley stage to be effective. In the worst case, this leads to a rigid scripting of the interaction process and an emphasis on sequential triggering; in the best case, the improved control over the sound production increases the expressive and narrative potential.

Most importantly, the electroacoustic live-mockup bridges the gap between Foley exploration and functional prototype, as it delineates in advance the relevant parts for the implementation as functional prototype, in particular concerning systematization of the concept and mapping of sound parameters to interactions and processes. The restriction on working with samples is useful for our target group, interaction design students, as it is consistent with the Foley stage. But in principle, the multisampler stage can be extended with other methods of digital sound generation.
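Such a systematization of concept and mapping can be illustrated as a simple table linking interaction states to sample slots and expressive parameters that the wizard performs live. All names and values here are hypothetical, not taken from the workshop:

```python
# Hypothetical Wizard-of-Oz mapping: each interaction state names a
# multisampler slot (a MIDI note) plus an expressive parameter that
# the wizard can modulate in real time.
MAPPING = {
    "idle":     {"note": 48, "velocity": 40},
    "approach": {"note": 50, "velocity": 70},
    "touch":    {"note": 52, "velocity": 110},
}

def trigger(state, intensity=1.0):
    """Build a MIDI-style event for an observed interaction state.

    `intensity` (0.0-1.0) lets the wizard scale the velocity live,
    keeping the response continuous rather than a fixed one-shot
    trigger.
    """
    slot = MAPPING[state]
    return {
        "note": slot["note"],
        "velocity": min(127, int(slot["velocity"] * intensity)),
    }
```

Keeping the mapping explicit like this is precisely what delineates, in advance, the parts that a later functional prototype must implement.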

Sensor Technology as Boundary

A prominent insight from the cases is the impact of the various technologies on the development of the respective projects. It is remarkable that the difference between continuous and discrete acquisition of sensor data in particular led the students either to explore possibilities more openly or to make restrictive decisions in a “yes” or “no” manner. The Dipalu case, for example, was based on a microphone as a relatively imprecise input device, which motivated the students to think about situations as ambivalent and continuously changing. This ongoing exploration, coping with the technology and using it as malleable material, also contributed to the development of their concept. The opposite can be seen in the camera used for the realization of Sonotag. The existing functionality of the Microsoft Kinect camera was used to trigger events directly, which limited the range of possible interactions. The group became more oriented towards functional implementation, which resulted in a much more defined and inflexible prototype. Also, the danger of the tracking breaking down motivated the use of a stereotypical and prominent warning sound, which broke the narrative coherence.

We also observed that as soon as sensor technology is considered, be it conceptually or in actual implementation, it takes time and energy away from sound-driven exploration. We strongly recommend starting with open methods for both sound generation and sensor technology and maintaining this openness as long as possible. This makes the process less defined but sustains extensive engagement until the end.

Conclusion: A Sound Approach to Sonic Interaction Pedagogics

In this article, we have described our approach to sound design education in the context of interaction design. We are motivated by the recognition that we are currently in a phase where important decisions are made in terms of how our world may sound in the future and that our interaction design students will be part of this process of standard-setting postulations.

We have described our understanding of the challenges associated with designing sounds for interactions in addition to the educational challenges related to interaction design specifically. Based on this analysis, we have proposed a pedagogical framework that builds on theoretical background from literature as well as previous experiences with teaching sound design to interaction design students. The framework emphasizes the role of sound as equivalent material for designing interactions rather than using it in a purely functional way. This stands in opposition to the tendency to use sound rather than designing sound.

The method fundamentally builds on an improvisational and dialogical process, enabled by performative, easy-to-use, low-tech tools and a design process that allows for ad-hoc ideation and exploration. Up to the final stage, where a functional prototype is implemented, the technological system is replaced by a “Wizard-of-Oz” mockup, which affords ongoing dialogue and flexibility.

We reflect on and discuss the method through the analysis of three design cases that illustrate the strengths and weaknesses of the proposed framework. Based on this analysis, we have made several recommendations, both for sound design in general and in direct connection to the stages of the proposed design process. In particular, we emphasize the need to counter the tendency toward scripting interaction sequences early on by emphasizing an open, dialogical exploration through ad-hoc improvisation, which also stays open to changes in aesthetic direction. We also point out the need to appreciate quick’n’dirty low-tech solutions to avoid a technical bias and its impact on aesthetic and conceptual decisions. Finally, we demonstrate that these methods need to be practiced and implemented correctly in order to prevent students from resorting to reductionist approaches.

To exhaust the full potential of the method it is necessary that the authors of the design, acting as sound making “wizards”, understand themselves as a “generative, rule based system” confronting the spontaneous behaviors of users of the mockups. This way, we have a good chance of realizing a truly sound approach to sonic interaction pedagogics and to systematically explore “possible futures” in an area where “best practice” is virtually inexistent.

Notes

1. Game sound can be excluded from these limitations, as in this application area it is evident that sound should and will also contribute to a narrative and an aesthetic experience.

2. It is our job, as educators, to provide the knowledge in a problem-oriented manner, when it can be useful.

3. The “dirty” in this case is not meant to be understood as “bad quality” or “ugly”, but rather as “unpolished”.

4. The Wizard-of-Oz method was developed in the eighties by John F. Kelley to simulate natural language computing. He describes it as an “experimental simulation which I call the OZ paradigm, in which experimental participants are given the impression that they are interacting with a program that understands English as well as another human would” (Kelley 1984: 26).

5. Kill (or murder) your darlings is a popular recommendation in design education and practice. The phrase is originally attributed to Sir Arthur Quiller-Couch, who addressed it to authors: “Whenever you feel an impulse to perpetrate a piece of exceptionally fine writing, obey it - whole-heartedly - and delete it before sending your manuscript to press. Murder your darlings” (Quiller-Couch 1914).

6. “Gamification” is a term denoting the introduction of game-like elements such as rules, target goals, and win-lose conditions to non-game applications.

7. In our society this skill unfortunately seems to be reserved for children.

Daniel Hug has a background in music, sound design, interaction design, and project management in applied research. Since 1999 he has been investigating questions related to sound and interaction design through artistic installations, design work, research, and theoretical publications. Since 2005 he has been teaching sound studies and sound design for interactive media and games at the Interaction and Game Design departments of the Zurich University of the Arts, Switzerland. In addition, Hug is a lecturer and researcher at the chair for music pedagogics at the School for Teacher Training of the University of Applied Sciences of Northwestern Switzerland, and a visiting lecturer for interaction and game sound at the University of the Arts Bern, the University of the Arts and Industrial Design Linz, and the Aalto University in Helsinki. Hug is pursuing a PhD on sound design for interactive commodities at the University of the Arts and Industrial Design Linz. As a member of the steering committee of the Audio Mostly conference and founder of the sound design and consulting company Hear Me Interact! he aims to promote design and research in the area of sonic interaction design.

Moritz Kemper is an interaction designer working in the fields of physical and tangible interaction, investigating how our everyday life is affected by technology. He holds a Bachelor’s degree in Industrial Design as well as a Master’s in Interaction Design. Since 2011 he has been an assistant at the Interaction Design Department of the Zurich University of the Arts, where he teaches courses in physical computing, computer vision, and related topics. Furthermore, he heads the Physical Computing Lab at ZHdK. His research interests include embodied interaction, ubiquitous computing, and DIY technologies.

Rose, Jay (2008). Producing Great Sound for Film and Video. Third edition. Waltham: Focal Press.

Schnell, Norbert, Frederic Bevilacqua, Nicolas Rasamimana, Julien Blois, Fabrice Guedy, Emmanuel Flety (2011). “Playing the Mo - Gestural Control and Re-Embodiment of Recorded Sound and Music.” In Proceedings of the International Conference on New Interfaces for Musical Expression (NIME-2011) (pp. 535-536). Oslo: University of Oslo.