A more detailed exploration of how we built travel voice search for Amazon Alexa

We're very proud to share some of the thinking behind our new Amazon Alexa service, and the process that goes into building something rather different from usual. Firstly, have a look at what we've been up to:

You might recall that a while back, we published a report called "The Future of Travel" - a view into the crystal ball, forecasting how travellers would search and book their journeys in ten years' time.

The primary focus was the notion of "travel buddies" - assistants who know your preferences, your travel history and your upcoming schedule, and who proactively make suggestions or even make reservations for you, to save you the pain of searching.

That goal of booking travel before you know you need it is some way off at the moment - but we work hard to spot ways to inch ourselves forward and bring 2024 to life a lot earlier.

With this in mind we have recently taken up the challenge of developing 'conversational search', to let our users look for travel in an environment they already know. In other words, we are looking for innovative ways to expose our product inside familiar interfaces.

This type of search is nothing particularly new - free text search interfaces have existed for years. Think of how Google revolutionised search, to the point where it almost expects people to ask 'human' questions rather than game the system with keywords. Indeed, Skyscanner had its first stab at voice and free-text search with its Windows apps in 2012, so we have some previous experience here.

However, with the advent of our Skyscanner Partners team, we are keen to use our well-established API to push these types of service and really advance the whole 'assistant' model - showing the art of the possible to our development partners, and giving users an early taste of how it can benefit them.

This started in the summer with a 'chat bot' installed into the fast-growing Telegram app. Telegram - a rival to WhatsApp trading on the security of its conversations - makes it possible to build 'bots' which, although clearly not real people, can hold conversations that give users quick answers on a platform they already use routinely.

Leaning on the API, my colleague Richard Keen put together a bot very quickly to prove that it is simple to build, yet quick and powerful for both sides of the conversation - providing simple "low, medium and high" prices for hotels and flights in seconds. The important thing, though, was jumping on a platform rapidly gaining adoption, and using it to lower the friction of travel search.
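To give a flavour of what a "low, medium and high" answer might involve, here is a minimal Python sketch. It is not the bot's actual code: we are assuming, purely for illustration, that "low" is the cheapest quote, "medium" the median and "high" the most expensive. The function name `price_bands` and the sample prices are made up.

```python
from statistics import median


def price_bands(prices):
    """Summarise a list of quotes as low / medium / high.

    Assumption for illustration only: low = cheapest quote,
    medium = median quote, high = most expensive quote.
    """
    if not prices:
        raise ValueError("need at least one price")
    return {"low": min(prices), "medium": median(prices), "high": max(prices)}


# Made-up quotes for a single route.
print(price_bands([420.0, 389.0, 515.0, 460.0, 610.0]))
# prints {'low': 389.0, 'medium': 460.0, 'high': 610.0}
```

Three numbers are short enough to read out in a chat message or speak aloud, which is the point of summarising rather than listing every quote.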

Then in the summer, a friendly architect at Amazon showed us how simple it could be to enable something similar - using voice search - on the Amazon Echo device. We jumped on this because it was clearly another way to get people thinking about travel early, and another step closer to that 'assistant' goal; searching when it comes into your head, without even picking up a device.

And to be honest, building for "Alexa", the conversation engine deployed by Amazon, is not particularly difficult. What you're effectively doing is building up a miniature state machine, keeping careful note of what the user has already told you, and how much more information you need before you can give them a meaningful answer.

Whilst it's entirely possible to deliver an answer from a "one-hit" question - such as "find me flights between London and New York on January 21st" - the longer a phrase becomes, the less accurate the recognition and parsing become, and the more frustrated the user will get in trying to find their information.

Building up conversations piece-by-piece, asking short but friendly questions which each drive a particular response, gives a far better experience and a trust in Skyscanner that we wouldn't get if the product were more difficult to use.
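The "miniature state machine" idea can be sketched in a few lines of Python. This is an illustration under our own assumptions rather than the code behind the service: the slot names (`origin`, `destination`, `date`), the prompts and the `Conversation` class are all hypothetical, and speech recognition is assumed to have already turned each utterance into a small dictionary.

```python
from dataclasses import dataclass, field

# Hypothetical slots and prompts, in the order we would ask for them.
PROMPTS = {
    "origin": "Where are you flying from?",
    "destination": "Where would you like to go?",
    "date": "When do you want to travel?",
}


@dataclass
class Conversation:
    # Everything the user has told us so far.
    slots: dict = field(default_factory=dict)

    def next_missing(self):
        """Return the first slot we still need, or None if we have them all."""
        for name in PROMPTS:
            if name not in self.slots:
                return name
        return None

    def handle(self, heard):
        """Record what was just heard, then ask the next short question
        or, once nothing is missing, give the answer."""
        self.slots.update({k: v for k, v in heard.items() if k in PROMPTS and v})
        missing = self.next_missing()
        if missing is not None:
            return PROMPTS[missing]
        return "Searching flights from {origin} to {destination} on {date}.".format(
            **self.slots
        )


chat = Conversation()
print(chat.handle({"origin": "London"}))         # Where would you like to go?
print(chat.handle({"destination": "New York"}))  # When do you want to travel?
print(chat.handle({"date": "January 21st"}))
# Searching flights from London to New York on January 21st.
```

The same `handle` call copes with a "one-hit" question too: if all three slots arrive at once, nothing is missing and the answer comes straight back. If only part of a long phrase is recognised, the conversation simply carries on from whatever it did catch.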

The real challenge comes later - in making the conversation a little bit more human and natural. People don't like talking to robots any more than 'pressing 1, 2, 3, hash' when phoning their bank.

You soon come up against challenges, such as the basic fact that, bizarrely enough, people can be polite. Every phrase you train Alexa to hear, such as "ask Skyscanner for a flight", has to be tuned to allow for the word "please" at either end. Alexa then has to reciprocate by being polite in response, thanking the user for their input at each stage - but without sounding repetitive (and yes, robotic).
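Both halves of that politeness problem can be sketched in Python. Again, this is an illustration rather than our production code: `polite_variants`, `Acknowledger` and the list of thank-you phrases are hypothetical names and wording. The only rule we assume for "not sounding repetitive" is never giving the same acknowledgement twice in a row.

```python
import random


def polite_variants(phrase):
    """Return the phrase on its own and with 'please' at either end."""
    return [phrase, f"please {phrase}", f"{phrase} please"]


# Hypothetical acknowledgements; this is where the thesaurus comes in.
THANKS = ["Thanks.", "Great, thank you.", "Lovely.", "Got it, thanks."]


class Acknowledger:
    """Pick a thank-you at random, never the same one twice in a row."""

    def __init__(self, phrases, rng=None):
        self.phrases = list(phrases)
        self.rng = rng or random.Random()
        self.last = None

    def next(self):
        # Exclude the previous reply; fall back to the full list if
        # only one phrase was supplied.
        choices = [p for p in self.phrases if p != self.last] or self.phrases
        self.last = self.rng.choice(choices)
        return self.last


for variant in polite_variants("ask Skyscanner for a flight"):
    print(variant)

ack = Acknowledger(THANKS)
print(ack.next())
print(ack.next())  # always differs from the line before
```

Generating the variants keeps the list of trained phrases from being maintained by hand, and remembering only the last reply is the cheapest way to stop Alexa saying "thanks" identically at every step.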

As a developer you're not looking up the JavaScript manuals, you're suddenly reaching for a thesaurus. As a UX designer you're not drawing anything, you're mapping users' emotional reactions and observing how people never quite ask the questions the way you expected.

Technical challenges then balloon as a result, but ultimately the user reaction when it all falls into place is something to behold. Because users aren't staring at a screen or concentrating on what to click, tap or type, you can see every step of their journey etched in their faces.

The belief is that having bought into the service, they will be that bit closer to Skyscanner as a company and use our other services. In time, we will surely take the voice engine through to booking when technology allows, and remove the need for those other platforms.

Our Alexa search will soon be deployed in the US, where Amazon currently operates the service and sells compatible devices, but we hope some day soon it will be made available in the UK so we can let more people use and improve it.

But more than that, we've proved a concept and opened up a new paradigm of search for Skyscanner, which suddenly brings that 2024 goal a lot nearer, thanks to Moore's Law and the flexible platform we've worked so hard to put together. As a result, we can spin up many more of these services quickly.

To disagree with Elvis Presley, it really could end up as a world where 'a little more conversation' results in 'a little less action' - for our users, if not for Skyscanner!

Interested in finding out more about Skyscanner's journey with bots and voice travel search?