The age of voice assistance and a new input paradigm

NEW YORK – On the second day of SMX East, the opening keynote from Google’s Naomi Makofsky, entitled “The Age of Assistance,” reminded us that people are connected to more devices than ever before, and that a growing share of their interactions with those devices are conversational.

We’ve gotten here as a result of a long evolution. In the 1970s, we had the revolution of personal computers, where input was driven by typing. By the 1980s, more and more people had home computers, and we had the mouse, so a new input format was clicking.

The next major shift came with the advent of smartphones, which brought another new input paradigm: tapping.

Now we’re moving into a new environment where voice becomes a major input paradigm. With that comes the concept of proactive assistance provided by the machine, and the idea of helping people get things done. This involves anticipating the consumer need or intent, to get assistance to the right person at the right time.

Looking at the evolution of voice search itself, we can see that voice search from Google launched in 2009.

The Knowledge Graph debuted around 2012, and today has over one billion entities. Google Now, the first assistant, followed later that year, and it tried to anticipate user needs.

Today, personal assistants need to work across a widening landscape of different devices, and help users with a wide variety of needs, from helping them get from point A to point B to booking a restaurant.

This involves more than running on mobile devices and transitioning seamlessly between them. Users may also be connecting via their refrigerators, vacuums or thermostats.

So how fast will all this happen? According to eMarketer, 87 percent of B2C marketers believe that personal assistants will play a big role before 2021. Can it really move that fast?

Consider the rate at which the smartphone became a core part of our world. In 2008, almost no one had a smartphone. Naomi takes a quick poll of the audience, and demonstrates that by the time of her keynote (at 9 a.m.), nearly everyone in the audience had already used an app-based ride service such as Uber or Lyft, checked their social media, or sent one or more texts.

The bottom line is that this can happen fast, so get ready for it!

Today, Google Assistant is installed on more than 500 million devices. Its focus is to help you get things done. For business owners, our opportunity is to create “Actions,” which is how we can offer our own customers opportunities for conversational experiences with our brand.

One of the driving factors for this is that the accuracy of voice technologies has improved dramatically, as shown in this chart:

In fact, speech recognition software has recently begun to outperform humans in recognition tests. Speech generation is also getting much better. These are key drivers for voice adoption.

Next, Naomi shares five key insights with us:

1. Voice Is About Action: Interactions in voice are 40 times more likely to be about actions than traditional search queries.

2. People Expect Conversations: Voice interactions between humans and devices are much more conversational.

As a result, we’re evolving from the keyword-based query to something more dynamic. For example, there are over 5,000 ways to ask to set an alarm.

3. Screens Will Change Everything: Nearly half of all voice sessions use a combination of voice and touch input.

The world of voice will be multi-modal, and will involve a mix of tapping, typing, and talking.

4. Daily Routines Matter: People will use their personal assistants to support their daily routines.

They will naturally take action based on the context they’re in, not the device they’re on.

5. Voice Is Universal: No manual is needed. We all (well, nearly all of us) learn how to speak, and it’s intrinsic to human interaction. We learn a common language of communication, and our personal assistants will be tasked with understanding us as we are.

Naomi’s final words of advice are for us to “show up,” “speed up” and “wise up.” We show up by creating great content, using schema markup for our content, and by posting videos on YouTube. We speed up by creating voice experiences that make working with our voice apps faster than the alternatives available to users.

Finally, we wise up by getting our hands dirty and learning to create conversational interfaces now. It will take work and experience to get good (or great) at it. We’re still in the early days, but this shift is likely to arrive quickly.

Last, but not least, think about what your brand is going to sound like in the world of voice. You need to learn to project your brand persona through this environment, as voice is inherently social. How your brand sounds will become a core area of marketing focus. Make sure you sound good!
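For readers who want a concrete starting point on the schema markup part of “showing up,” here is a minimal sketch in Python that builds schema.org FAQPage markup as JSON-LD. The keynote did not say which schema types to use, so the choice of FAQPage is an assumption, and the function name and sample question are hypothetical placeholders.

```python
import json


def build_faq_jsonld(qa_pairs):
    """Return a schema.org FAQPage document as a JSON-LD string.

    qa_pairs is a list of (question, answer) tuples.
    """
    document = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    return json.dumps(document, indent=2)


# Hypothetical content; replace with questions your customers actually ask.
EXAMPLE_PAIRS = [
    ("What time do you open?", "We open at 9 a.m. on weekdays."),
]

if __name__ == "__main__":
    # The printed JSON would go inside a <script type="application/ld+json"> tag.
    print(build_faq_jsonld(EXAMPLE_PAIRS))
```

Writing answers as short, complete sentences is a reasonable habit here, since an answer may be read aloud rather than displayed.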

Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.

About The Author

Eric Enge is General Manager of Perficient Digital, a full-service, award-winning digital agency. Previously Eric was the founder and CEO of Stone Temple, also an award-winning digital marketing agency, which was acquired by Perficient in July 2018. He is the lead co-author of The Art of SEO, a 900+ page book that’s known in the industry as “the bible of SEO.” In 2016, Enge was awarded Search Engine Land’s Landy Award for Search Marketer of the Year, and US Search Awards Search Personality of the Year. He is a prolific writer, researcher, teacher and a sought-after keynote speaker and panelist at major industry conferences.