Going Big with a Consumer Voice Skill Startup?

In the past few weeks there have been numerous product announcements in the voice computing space from big players like Amazon and Google. A key part of the strategy for these platform companies appears to be fostering a rich third-party developer ecosystem that builds consumer voice Skill “applications,” with an eye towards making the end-user experience increasingly vibrant. But can you actually build a successful consumer-facing business based on voice Skills? Not a hobby, or a side project. A consumer startup on a #VoiceFirst platform with venture-scale potential. Today?

This combination of a new computing paradigm and third-party developers has prompted a logical analogy, one which many have said “maps pretty neatly,” between the Amazon Skill “Store” and the Apple App and Google Play Stores. So are Skill Stores the App Stores of the Voice Computing era? The latter managed marketplaces have provided a safe, vetted area for consumers on mobile smartphone platforms after the wild west of the PC generation. This deliberate fostering of a rich mobile app developer ecosystem has created entire app industries, millions of consumer apps, and an incredibly profitable line of business for their creators. I’d argue, though, that three components were integral to the success of both mobile stores: native discovery, clear monetization paths, and transparency/consistency.

Unfortunately, these components are currently lacking in today’s voice computing environment. At the beginning of this year, I penned a blog post arguing that the two biggest inhibitors to serious (Alexa) Skills development are distribution and monetization. I think it’s becoming increasingly clear that the distribution methods for discovering a new Skill are most often outside the voice paradigm and involve a separate screen experience. This isn’t necessarily intentional, but rather a reflection of one of the weaknesses of voice as a user interface: it cannot communicate a suite of options as efficiently as a screen can.

And while mobile app store developers can lament the fact that both Google and Apple charge a hefty 30% tax on revenue for the privilege of distribution in their stores, the good news is that for the past decade the rules have been extremely clear and consistent over time. There is some risk with Amazon that the rules will change (as has happened with partnership terms in other areas like its Associates program, Fulfillment by Amazon (FBA), or conditions with its third-party sellers). All of that being said, it’s hard to argue against receiving a check in the mail from Amazon as a reward for delighting consumers by developing an engaging Skill.

And as much as the above critiques are about Amazon, it’s clear that they’re making progress on these issues and are surely the furthest ahead. While alternative voice platforms exist, to date they’ve come up short. Google isn’t doing enough to win over developers, and it shows in that most of the Skills for the Home are merely mirror copies of those in the Alexa store without any innovation. Even more paradoxical, Apple seems almost hostile: SiriKit requires a companion iOS app first.

So what’s a Skill developer or a #VoiceFirst startup to do given these constraints, and given that Amazon has the clear lead in the space for the time being? Build a business that leverages Amazon’s voice lead but doesn’t rely solely on it:

First, view those checks Amazon is sending as non-dilutive capital to fund starting a business, not as revenue from a business. Monthly checks are nice, but I believe there’s risk they won’t last (or be as large) forever. The business models for leveraging voice have yet to play out, and I’m not convinced Amazon Skills checks will prove as solid and consistent as revenue from the mobile app stores has been. Of course, take them while they last, but don’t increase the cost structure of a startup based on that income.

Second, recognize the good news: even if a Skill is #VoiceFirst, it’s not voice only. Instead, consider voice computing development to be closer to #VoicePlus. We’re soon going to be in a multimodal world where data and interactions (plus monetization) are shared across (many) platforms. Practically speaking, there is only a short list of truly voice-only applications. Music, one of the most adopted categories of Skills, plays over a speaker, which is an inherently non-visual experience. But it’s likely that in most cases, users will seamlessly blend a voice interaction with one on their phone and one on a large screen in the room. Even Amazon recognizes this dynamic and is already making product pushes in this direction. I believe that a multi-modal approach simultaneously serves the end-user and supports startups’ business model innovation outside the Skill store. To that end, I have noticed some developers already using SMS or other mobile messaging to take the interaction “off-platform” for monetization purposes. While there are certainly some applications that can exist exclusively as an Alexa Skill or Google Action, the majority will require a cross-platform, multi-modal experience.

Third, pursue aligned incentives with unique value. What does Amazon care about? Incredible customer experience and selling (Amazon) product. I believe it’s less likely that Amazon will create a way for consumers to “purchase” a Skill in the same way people purchase apps, so the mobile app store paradigm isn’t going to emerge directly. Instead, if a third-party Skill furthers either (or better yet both) of Amazon’s goals of experience and product, there is a higher likelihood that Amazon will look more favorably on it for rewards. And if a Skill employs a unique structural advantage (data, brand, network effect) that Amazon cannot easily replicate, it is all the more likely to stay out of the target scope of Amazon’s sometimes fickle partnership interests.

So for the time being, the best option for creating a consumer voice computing business is to play in the Amazon Alexa sandbox. And that’s OK for now… as long as you play nicely and contribute to it.

David Beisel

David Beisel is a co-founder and Partner at NextView Ventures. He has been focused on early stage Internet startups his entire career, both as an entrepreneur and venture capitalist.

As an investor in the digital media space, David was most recently a Vice President at Venrock and previously a Principal at Masthead Venture Partners. Prior to becoming a venture capitalist, David co-founded Sombasa Media, an e-mail marketing company best known for its flagship product BargainDog. Sombasa was successfully acquired by About.com where David served as Vice President of Marketing.

David holds an MBA from the Stanford Graduate School of Business and an AB in Economics, magna cum laude and Phi Beta Kappa, from Duke University. He also founded and leads the Boston Innovators Group, an organization which holds quarterly entrepreneur events drawing a thousand attendees.

+1. Right now the Alexa Skill ecosystem, like the Facebook Messenger bot ecosystem, has more in common with mobile apps on a carrier deck pre-iPhone than with the App Store / Google Play ecosystem we know today. Ineffective discovery, opaque business models, no effective payment / billing, little UX guidance…
Furthermore it’s not clear that Amazon has made a bet on 3rd party skills being the future for Alexa – this looks more like throwing some money to developers to see what can be learned from their ideas. Early mover is probably not an advantage in this environment.

Stuart Crane

Totally agree about the multi-modal experience becoming more prevalent, and that there really are not many “voice-only” applications, at least not ones that have legs. Unfortunately it is not ideal for a startup to have to “further” Amazon’s goals (either/both of them) just to get recognition. The great thing is that all of this is so early and there is lots of runway when it comes to all of the other platforms as well: Google, Siri, Cortana, Bixby, etc. For now though, Amazon is at the top.