Is This Thing On: Creating Musical Conversation with SoundSelf

It’s the sci-fi writers’ fault, you know. After tantalizing us with glimpses of a glorious future replete with personal datapads (check), instantaneous communication (check) and a worldwide repository of the sum total of human knowledge (check), we’ve yet to create computer systems controlled entirely by voice. Sure, Siri is a step in the right direction – but Apple’s task-oriented AI assistant is still a ways away from being a distinct, recognizable character in her own right.

So if the technology is not quite there yet, should we even bother with voice-controlled games?

The concept has certainly been dabbled with before. Vivarium’s Seaman tasked players with raising and caring for a cantankerous fish-person primarily through voice commands, while Sony’s Lifeline attempted the same, except instead of breeding unholy hybrids, you guided a young woman to safety through a space station teeming with monsters. In both cases, gameplay consisted largely of repeating the same phrase into the mic until the damn computer decided to parse your voice correctly.

While neither game was a commercial hit, the Karaoke Revolution series (and later, the microphone-enabled Rock Band titles) integrated sung vocals successfully, and Ubisoft’s ambitious EndWar finally saw voice commands more or less succeed in a fast-paced gameplay context.

Still, aside from music games that simply gauge the player’s pitch and war games that take input in the form of curt military commands, games that attempt to use the voice as a controller seem doomed to be regarded as oddities, with Lionhead’s cancelled Project Milo as the high-water mark that never was – and might never be.

Yet the problem here isn’t with the concepts – it’s with the technology. Conversations aren’t digital constructs that function on clear action/reaction cycles; they’re messy, analog, back-and-forth exchanges of data. Barking “deploy gunships” is fine when you’re in the middle of a firefight and need that action completed. But asking “how was your day?” is a different beast. Subtext, vocabulary choices, body language, inflection, emotion, even the silences – all of these shape real conversations.

Ever talk to your dog? While the odds are he won’t pick up on most of your grammatical nuances, that fact doesn’t stop you from talking and looking for a sign, any sign, of a reaction on his part. This is especially true if there are certain words you know will elicit a response – his name, “treat,” “walk,” or what have you. On some level an exchange happens, even if it’s just a soulful look from your pup directly into your eyes, as if he’s begging you to just tell him what you want in barks already.

This is just one example of the anthropomorphization that happens when we willfully engage with other entities in this way, whether it’s a pet, a game character or a chatbot.

Which brings us to SoundSelf.

SoundSelf is a voice-controlled meditation curiosity from Robin Arnott, creator of the claustrophobic, terror-soaked experience Deep Sea and sound designer on Antichamber. The game is still in development, and at present it’s billed as a meditation-focused experience played by chanting in front of your computer. As your voice holds a steady drone and shifts in pitch, the visuals adjust in kind, in the hopes of inducing a zen-like state of euphoria.

“Our aim is to make it feel like meditation or a psychedelic trip, something that you don’t remember that well but after you feel different, you feel more grounded.”
–Robin Arnott

In a bold move, Arnott has put up a playable prototype for the game on the project’s Kickstarter for anyone to try. While the core conceit is certainly there – singing or even humming consistent tones will prompt a wild array of intricate shapes across the complete color spectrum – the experience is pretty barebones. Which, for a prototype, is understandable.

Although Arnott’s tools grant the game access to a tremendous amount of data in the human voice, the development team is still searching for the best way to utilize it alongside the simple pitch recognition seen in the prototype. Once the Kickstarter concludes, Arnott said in a recent conversation, finding ways to add more structure to the experience will become a priority – as the prototype’s sole gameplay instruction is “sustain long tones.”
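For the curious, the kind of "simple pitch recognition" the prototype relies on can be approximated with a textbook autocorrelation estimator. The sketch below is a generic illustration under my own assumptions – the function name, parameters and frequency range are invented, and this is not Arnott's actual code – but it shows the core idea: find the lag at which a short audio frame best matches a shifted copy of itself, and the reciprocal of that lag is the pitch of your hum.

```python
import numpy as np

def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of a short mono audio
    frame via autocorrelation, searching a plausible vocal range."""
    frame = frame - frame.mean()                      # remove DC offset
    corr = np.correlate(frame, frame, mode="full")
    corr = corr[len(corr) // 2:]                      # keep non-negative lags
    lag_min = int(sample_rate / fmax)                 # shortest plausible period
    lag_max = int(sample_rate / fmin)                 # longest plausible period
    best = lag_min + np.argmax(corr[lag_min:lag_max])
    return sample_rate / best

# A 4096-sample frame of a pure 220 Hz tone (roughly a hummed "A").
sr = 44100
t = np.arange(4096) / sr
frame = np.sin(2 * np.pi * 220.0 * t)
print(f"{estimate_pitch(frame, sr):.1f} Hz")  # close to 220 Hz
```

A real implementation would run this continuously over overlapping microphone frames and smooth the result – and, as Arnott notes, pitch is only one sliver of the data buried in a human voice.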

Yet even in its rough early form, there is the kernel of something interesting happening. As you hit certain pitches, the game’s aural accompaniment swells to accommodate you; as you land in certain keys, a disembodied choir fades in to envelop your lonely note in a musical embrace. Once you gradually shed your inhibitions around roommates, pets, spouses and neighbors and start chanting with abandon, the effect of the soundscape combines with the steady vibration in your chest to hint at something that may not have been seen in a game before: the embryonic makings of a musical conversation between human and machine.

But is this truly the start of a conversation? Or merely an echo chamber? Is your dog really furrowing his brow to communicate, or are you just projecting onto an oblivious animal? The debate is a canard – in the end, as with meditation, if you come away from the experience with some new insight, it doesn’t matter. Even if it’s all thanks to the sound of your own voice.