Bixby's creator explains his vision

Siri, Alexa, Google Assistant, Cortana… every huge brand is diving into the robotic world of digital assistants. Facebook and Amazon are standing shoulder to shoulder with Google and Apple in making artificial intelligence a key part of their future plans, yet Samsung has been notable by its absence.

But while it made the start line with its own digital assistant, Bixby, in March (alongside the launch of the Galaxy S8 and S8 Plus), it didn’t arrive to much fanfare.

Coded onto a hardware button on the side of the phones, Bixby merely opened up a single screen that had information about your upcoming appointments, health info or a smattering of news.

“We’re not late to the game, we’re new to the game... because we’re playing a different game”

Injong Rhee

In our review, we noted that Bixby was “average at best, and pretty much useless at worst”, largely because it didn’t really do… anything. The Bixby Home was just another portal, and Bixby Vision, embedded in the camera, was inaccurate and too basic to warrant a place on a high-end smartphone like this.

As Bixby Voice is finally rolling out worldwide, and as we await the launch of the Galaxy Note 8, we looked deeper to try and work out what Samsung is actually trying to do.

A work in progress

It would be easy just to dismiss Bixby as a half-baked idea, one that launched too early and will likely be shelved silently in a few months’ time.

But - as I said earlier in the year - there’s something about Samsung’s attitude behind the scenes with Bixby that makes me feel like there’s more to this than meets the eye. Everyone I speak to is far too confident that Bixby will be a success, despite launching into an incredibly feature-rich smartphone market where Siri and others have been working for years.

Injong Rhee is Executive VP and Head of R&D, Software and Services for Samsung Mobile, and in charge of Bixby’s creation. In a wide-ranging interview, he’s so relaxed about the Bixby program he created that it’s almost alarming when you consider the importance a project like this must have for a multi-billion dollar corporation.

Injong Rhee, Executive VP and Head of R&D, Software and Services for Samsung Mobile

“Bixby’s more like an interface, that allows you to do more things with your phone,” Rhee tells me. “It’s, you know, we say intelligent because it provides a lot of natural way to interact with people.”

That’s the key difference that Samsung is banking on, that Bixby is going to offer more than Apple, Google and Amazon are bringing to the table with their voice assistants. Bixby Voice is meant to be just a part of the story, working seamlessly with the camera and the touchscreen so the user can interact with their phone more naturally.

"If you watch the movie Iron Man, Tony Stark is sitting there interacting with Jarvis, the agent or the assistant, and he uses all the means available to him," Rhee explained to us. "He uses the keyboard and he talks to him, he uses his hands, spinning, and whatever he feels comfortable using at that point and in that context and the object he’s actually manipulating."

An impressive demo

The reason behind this move is partly the complexity of smartphones. The days of a single menu to help you call, text or play Snake are gone - user interface designers now have to try and find a way to make it easy to perform tens of thousands of tasks.

“This problem isn’t just for smartphones,” says Rhee. “It’s also all the appliances and devices we make: refrigerators, TVs, washing machines, air conditioners, and even robot vacuum cleaners – there’s so many things this vacuum cleaner can do if you look at the remote control and you’re like, ‘What to do?’”

That's the idea for Bixby Voice - mimicking the idea of handing your phone over to a person next to you and asking them to do the things you don’t want to stab at the phone trying to do.

Bixby can be woken with a button or a voice command

If you're cooking, Bixby can scroll up and down for you, play music, stop the screen from timing out and search the web all without needing a single touch.

Rhee is enthusiastic about this, demonstrating multiple scenarios to show how well Bixby can understand things logically and sort them for you. It ranges from taking a picture and posting it on Facebook with a single command to asking the phone to change its language.

(There's a strong irony with the latter command, given Bixby can only understand Korean and US English).

The impressive thing is not that a phone can recognise voice (other handsets can do that too) but that Bixby can do so more contextually than others.

In the demonstrations, there were misheard words, although the result was still correct as the software applied context to the situation. Not only that, it'll learn your intentions over time - if you're in an app that you're changing the language of a lot, it'll know not to ask if you want to change the language of the phone itself.

Head in the clouds

Rhee’s vision for Bixby might be predictably bold (“It will be de facto interface for everything. Every device we produce, every devices others produce. That’s the vision that we see”), but the next few months will be the proving ground.

The project leader claims the company even has a lot of interest pouring in for absorbing Bixby into other brands' technology.

"We’re being approached by many different OEMs," Rhee confirms to us. "Not only smartphones, but we’re talking about car manufacturers, and others like audio makers are coming to us [too]."

He's not really worried about that right now, preferring more to focus on Samsung's range of devices, from smartphones to tablets to TVs to white goods - all it takes is a microphone and an internet connection and anything can be Bixby-enabled, and the range of Samsung products on the market is rather large already.

Therein lies the issue: Bixby is cloud-based. If you want to do anything on your smartphone through the assistant, you'll need an internet connection, and if that fails when using voice you'll be left looking foolish.

"EVERYONE, I AM SENDING A TWEET"

One of the main issues people have with voice control of their devices at the moment is that when it fails, they feel idiotic for trying something new in the first place - so it really needs to be flawless first time to create the right impression.

Rhee agrees. “[This] friction point is why today people are not using [voice assistants]. If they have to second-guess which function works and which function doesn’t work then it’s not going to work.”

He points to the fact Bixby will learn over time as a reason to feel Samsung’s voice assistant is offering something new, thus negating the foolishness. But when Rhee explains the way Samsung is thinking about improving Bixby, it sounds like a dangerous plan:

“I look at Bixby as like ten years old, and its ability to understand us. I consider our users are like Bixby parents, they’re trying to grow their baby, it’s not as if [Bixby] make a few mistakes they’re not going to kick them out on the street.”

Except that’s exactly how users will see it - Bixby Home and Vision were powerful tools in theory, although in reality their use case was limited and sometimes irritating, so users quickly switched off.

Even Siri, the most popular voice assistant on the market, needs a current marketing campaign with The Rock to show what it can do, rather than just being seen as a place to set timers easily. That shows there’s a real need to highlight to users how good a personal voice assistant can actually be.

So why launch Bixby so early on without the voice recognition, when it could have waited until the announcement of the Note 8 and had the service fully ready to go, coming into the market in a blaze of impressive voice recognition that other services can’t manage?

“I like drama, so…” Rhee says, laughing. “When we announced Bixby we were [using] 10 applications initially and we got a lot of feedback saying that that’s not enough for people to use.”

Bixby Home - it needs more relevant information

But there are still only a few apps supported by the service, so you can't ask it to search for a movie on Netflix and start playing it - that'll need more finger-based fun to achieve.

Rhee says that such things are coming, and that's part of the reason for the delay with the service, as the consumer feedback was clear that they wanted more apps - so the drawing board was revisited.

The result was support for things like YouTube, achieved by tapping into the features usually designed for accessibility and using them to remove the need to touch the screen. In fact, Samsung has found that it can control most elements of the Android ecosystem in this way, giving it a strong ability to control the operating system through voice.

Samsung is going to have to do an awful lot of work to convince the market that its system is the best out there. It bought Viv last year to help create a software developer kit that developers can use to integrate Bixby throughout their own frameworks, and encouraging them to help out is going to be crucial.

A long way to go

Here’s the bigger question: how good is Bixby compared to its rivals? Samsung is focused on proving it’s better than Apple at the moment in a bid to claw market share away from its great competitor, and that was the focus of another demo.

Rhee says the company has only been working on Bixby for the last couple of years, but is at pains to show demonstrations where it can outperform Siri.

“[Bixby] is a completely different concept. S-Voice tried to copy what Siri had been trying to do,” he says. “There’s some components that we used [in Bixby] like the ability to understand speech and all those things we can make use of, but in terms of all the natural language understanding, the action planning, integrations with all the different applications, that’s all completely new.”

In the side by side demonstration, the differences are subtle - asking the phone to take a selfie on both will see the front-facing camera opened, although only the Samsung phone will actually take the snap.

However, you could argue that some people just want the camera live at this point, allowing them to compose a shot. When asking both phones to open the last photo taken, Bixby does just that, whereas the iPhone conducts a random keyword-based search through the handset.

Bixby Vision still needs a lot of work

Rhee then makes a big deal about being able to say ‘post this photo to Facebook’ or other social networks, but I’m not convinced people want that level of autonomy given to a phone for their social networking.

And let’s not get ahead of ourselves here. While there’s clearly a huge amount of work gone into the creation of Bixby (with Rhee telling me that 3000 people were working on the project) and the demonstrations are impressive, they’re still just pre-rehearsed demos.

Admittedly, the voice demonstration I saw was pretty much perfect, with the only failures down to the demo phones on show not having the right apps set up - posting things to Facebook isn't possible if you're not logged in, understandably.

For all Samsung’s relaxed cockiness about Bixby, it’s a service that is still far from flawless. Bixby Vision still gets in the way of using the camera and has very limited functionality - its image recognition could rarely work out what it was being shown at launch.

This desire to put an unfinished product on the market is testament to the fact Samsung needed a player in the digital assistant game, and despite being impressed by the Bixby demonstration I can’t shake a huge wave of skepticism swirling because Samsung took so long to get involved.

Amazon's Alexa has a huge headstart - not just in technology, where it's being used by brands like Ford, HTC and Huawei - but also in terms of user perception. The idea of asking Alexa to do things is no longer alien for the average user, and that's a massive boon.

One of the things that impressed me most about Bixby was the sheer number of commands you could say to it, all of them seeming like very natural, non-robotic sentences.

An iconic logo in waiting?

At the same time, we’ve not seen what the list looks like for Google’s Assistant or Apple’s Siri, and with the Pixel 2 and iPhone 8 coming out, that could be even more impressive. Neither company will stand still, and both are working hard on offering more natural language context.

Whether Samsung can succeed with its own assistant depends on how well it can roll out the service.

"[Bixby] is a good starting point. And as I said, we’re new to the…," Rhee says, catching himself before restarting his sentence. “We’re the first to the game, and I think the others, you know definitely Siri, will do a lot of catch-up.”

He claims that Google can’t manage the same thing Samsung is because “they don’t have access to the lower level of things that a device manufacturer currently has” - but in reality that’s precisely what the brand is doing with the Pixel range, and possibly explains why the new set of phones was created by Google in the first place.

If the Galaxy Note 8 and improved Galaxy S8 duo show a real retention in usage, and users start to really get on board with what Bixby can do, Samsung will have pulled a masterstroke in making itself relevant in an assistant game that it looked to have missed the boat on.

Perhaps Samsung doesn’t care about that. Maybe it’s seeing Bixby as a long-term play and it’s willing to keep investing in it until it works, as part of a wider vision powering all its devices.

The thing is, time is ticking - Alexa is running away as the de facto voice control on devices, Apple is investing heavily in both the power and reputation of Siri, and Google Assistant is drawing on huge levels of data to keep improving.

Samsung needs to improve Bixby - fast - to make sure its customers keep using it regularly, to give the data needed to make the voice recognition more accurate, the image searches more useful and the whole experience an embedded part of daily lives, supporting all the major apps you’d need - otherwise this could end up being a costly project that joins S Voice quietly on the shelf.