Amazon's Alexa Presentation Language and the future of the smart home

Amazon just bought its first house. Its first house company, to be more specific. Plant Prefab, a California-based custom modular home company, received a significant investment from the tech behemoth last month. The idea behind this investment is that Plant Prefab will now allow buyers to customize their homes with Alexa-enabled devices.

This is merely the most obvious example of Amazon’s ongoing plan for smart home domination, but it’s hardly the most significant. At its September hardware event, Amazon unveiled 15 new Alexa-enabled devices, ranging from smart plugs to microwaves.

In addition to this onslaught of new hardware, Amazon introduced a more subtle but no less exciting piece of its voice strategy. The Alexa Presentation Language (APL) will allow third-party developers to integrate visual content into voice apps for use on screen-based Alexa products like the Echo Show, Fire TV, Fire Tablet, Alexa alarm clock, and Echo Spot, as well as on third-party Alexa devices soon to come with the release of the Alexa Smart Screen and TV SDK.
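To make this concrete, an APL experience is defined as a JSON document that a skill sends alongside its spoken response. Below is a minimal sketch in Python, assuming the APL 1.0 document schema and the `Alexa.Presentation.APL.RenderDocument` directive; the movie-times content and data shape are hypothetical, and a real skill response carries additional fields.

```python
import json

# A minimal APL document (sketch): a single Text component rendered on
# screen-equipped devices. Field names follow the APL 1.0 document schema;
# "payload" is the conventional name bound to the directive's datasources.
apl_document = {
    "type": "APL",
    "version": "1.0",
    "mainTemplate": {
        "parameters": ["payload"],
        "items": [
            {
                "type": "Text",
                "text": "${payload.movie.title}",  # data-bound expression
                "fontSize": "40dp",
                "textAlign": "center",
            }
        ],
    },
}

# The skill response pairs spoken output with a RenderDocument directive.
# Devices without a screen simply ignore the visual portion.
response = {
    "outputSpeech": {"type": "PlainText", "text": "Now showing movie times."},
    "directives": [
        {
            "type": "Alexa.Presentation.APL.RenderDocument",
            "document": apl_document,
            "datasources": {"movie": {"title": "Movie Times"}},
        }
    ],
}

print(json.dumps(response, indent=2))
```

The same spoken response works everywhere; only devices that support APL render the document, which is what lets one skill span the whole device lineup.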

So why is this so significant? It opens the door to voice experiences that are richer not only visually but also contextually: brands will now be able to bring wider utility and specificity to their customer interactions, based on where a person is, what they’re doing, and what device they’re using at that moment.

By not only diversifying its own lineup of Alexa-first products but also expanding Alexa’s domain to a growing array of third-party products, Amazon is doing everything it can to extend its device leadership over competitors. The Alexa Presentation Language is the glue that will connect all of these devices and experiences.

Whereas the web let people view content across different-sized screens, APL (and likely future competitors) aims to support interactions across an even wider range of contexts, including those where there is no screen at all. On a laptop? Great - we’ll show you movie times. In the car? We’ll read them to you - but just the few we think will be most relevant.
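The laptop-versus-car example above can be sketched as a single response-shaping function. This is purely illustrative: the `respond` helper and its parameters are hypothetical, not part of any Alexa API.

```python
def respond(showtimes, has_screen, driving=False):
    """Shape the same content for different device contexts (sketch).

    A device with a screen gets the full list to display; a voice-only
    or in-car context gets a short spoken answer instead.
    """
    if has_screen and not driving:
        return {"display": showtimes,
                "speech": "Here are today's showtimes."}
    # No screen (or eyes on the road): read out only the most relevant times.
    top = showtimes[:2]
    return {"display": None,
            "speech": "The next showings are at " + " and ".join(top) + "."}


laptop = respond(["4:30 PM", "7:00 PM", "9:45 PM"], has_screen=True)
car = respond(["4:30 PM", "7:00 PM", "9:45 PM"], has_screen=False, driving=True)
```

The point is that the content is authored once; only the presentation logic branches per context.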

Getting your brand ready for multi-modal voice proficiency

2018 so far has shown that a multi-device ecosystem utilizing screens and voice, with mobile as the hub, is the future of personal and enterprise technology. The primary challenge this presents to brands is scalability. We see two key areas to prepare in:

Content Strategy. Does your content strategy take both voice and screen into consideration? Do you have a plan for how, and where, to respond to voice queries in a multi-modal environment? Is your brand voice clearly defined internally to ensure continuity when new devices and contexts come to market?
API Services that let you move quickly. As connected devices grow in adoption, so does the variety of front-end contexts calling on your existing services. You’ll need to talk to Siri on smartphones, and Alexa on microwaves. The key is having robust, flexible APIs that let you interact with your customers in different contexts. A comprehensive, dedicated API layer in your product infrastructure can take this burden off your backend, allowing for increased flexibility and efficiency, meaning you’ll be able to get to market faster and more reliably when introducing a new device into your brand’s ecosystem.
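A dedicated API layer like the one described above can be sketched in a few lines. All names here are hypothetical: the point is that every channel calls the same layer, and only the layer knows how to shape the backend’s data per device.

```python
class ShowtimesService:
    """Stand-in for an existing backend service (hypothetical)."""

    def showtimes(self, city):
        return ["4:30 PM", "7:00 PM", "9:45 PM"]


class ApiLayer:
    """A dedicated API layer in front of the backend (sketch).

    Every front-end context (an Alexa skill, a Siri shortcut, a web app)
    calls the same method; the layer adapts the payload per channel, so
    the backend never has to change when a new device type appears.
    """

    def __init__(self, service):
        self.service = service

    def handle(self, channel, city):
        times = self.service.showtimes(city)
        if channel == "screen":
            # Screen contexts get structured data to render.
            return {"city": city, "times": times}
        # Voice-only contexts get a ready-to-speak sentence.
        return {"speech": f"The next showing in {city} is at {times[0]}."}


api = ApiLayer(ShowtimesService())
```

Adding support for, say, an in-car assistant then means adding one branch in the layer rather than touching the backend at all.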

Companies who have chosen to jump into voice in 2018 with both feet have found great success so far. Perhaps the most publicized success story in recent months is Erica, Bank of America’s virtual assistant, which launched in June and amassed 3 million users in its first three months. (It’s worth noting that actual user reviews of Erica are mixed, though that seems due largely to how it was rolled out.)

The key piece Amazon is missing, of course, is an Alexa-native mobile device to control all of these peripherals. (The failed Fire Phone was pulled over three years ago, with no replacement in sight.) It’s important to remember that the smartphone is still the most widely-owned, widely-used voice-enabled device there is, and Google and Apple are the world leaders in smart assistant penetration (thanks to all the Android and iOS devices out there with Google Assistant or Siri installed).

Considering the continued primacy of mobile devices as the hubs of our digital lives, it’s difficult to imagine a strategy built exclusively on standalone voice assistants winning the coming “voice wars” against ecosystems with a phone at the center. Can Amazon overcome the major disadvantage it faces compared to Google and Apple: not having a foothold in the smartphone OS market?

Time will tell. If 2018 was the year voice became a mainstream topic of conversation, 2019 will be the year it becomes a foregone conclusion, when it will become clear which brands have taken the lead and which got caught flat-footed.

If you want to talk further on how to bring your brand into the age of voice, don’t hesitate to reach out!