voice recognition – Gigaom (http://gigaom.com)
The industry leader in emerging technology research

Watson-powered toy blows past Kickstarter goal in a day
Tue, 17 Feb 2015

First it was Jeopardy!, then it was cancer, e-commerce and cooking. Now, IBM’s Watson artificial intelligence system is powering a line of connected toys.

And it looks as if people are impressed with the idea: A company called Elemental Path launched a Kickstarter campaign on Monday for a line of toy dinosaurs, called CogniToys, and had surpassed its initial goal as of Tuesday morning. The company was aiming for $50,000 and had raised more than $70,000 as of 11:40 a.m. Tuesday.

Essentially, the dinosaurs are connected toys that speak to IBM’s Watson cloud APIs, which the company began rolling out last year. According to the Kickstarter page, the CogniToys will allow children to engage with them by talking — asking questions, telling jokes, sharing stories and the like. In addition, the page states, “The technology allows toys to listen, speak and simultaneously evolve, learn and grow with your child; bringing a new element of personalized, educational play to children.”

Elemental Path is not the first company focused on building natural language and artificial intelligence into toys. Possibly the best-known example so far is a startup called ToyTalk, which is building natural language iPad apps and was founded by former Pixar CTO Oren Jacob.

The evolution of artificial intelligence, and the ability to easily train toys, robots, apps or anything, really, is going to be a major focus of Gigaom’s Structure Intelligence conference September 22–23 in San Francisco. We’ll also talk a lot about machine learning and AI at our Structure Data conference March 18–19 in New York, where speakers from Facebook, Yahoo, Spotify and elsewhere will discuss how data in the form of images, text, and even sounds are allowing them to build new products and discover new insights about their users.

Facebook acquires speech-recognition IoT startup Wit.AI
Mon, 05 Jan 2015

Facebook has acquired Wit.AI, a San Francisco-based startup building a speech-recognition platform for the internet of things. The company launched early in 2014 and raised a $3 million seed round in October from a group of investors including Andreessen Horowitz, Ignition Partners and NEA.

Wit.AI has about 6,000 developers on its platform, which allows users to program speech-recognition controls into their devices and deliver the capabilities via API. When I spoke with co-founder Alex Lebrun in May, he explained that his ultimate goal is to power artificially intelligent personalities like those in the move Her, but the company’s present focus is on helping power devices that can respond to simple voice commands. At the time, he said, Wit.AI was working with SmartThings on its line of connected devices, and was in talks with Nest before the Google acquisition.

The Wit.AI team framed the deal this way:

[blockquote person="" attribution=""]Facebook has the resources and talent to help us take the next step. Facebook’s mission is to connect everyone and build amazing experiences for the over 1.3 billion people on the platform – technology that understands natural language is a big part of that, and we think we can help.

The platform will remain open and become entirely free for everyone.[/blockquote]

For Facebook, acquiring Wit.AI gives it another opportunity to expand its platform into the world of connected devices and even smart homes without relying on speech-recognition technology developed by often-competitive companies. Much like Amazon has its Echo device, and Google has both the Android ecosystem and the Nest division, Facebook, too, likely wants a way to let users reach it when neither a keyboard nor a conventional computing device is around.

Amazon’s Echo is a good listener but a wretched assistant
Fri, 12 Dec 2014

Never has the gap between a flawless technology experience and a closed ecosystem loomed as large as the gap between the Amazon Echo and the Ubi personal computer. While Amazon’s Echo works beautifully and is a gorgeous cylinder that is ready to hear and (attempt to) obey my every command from pretty much anywhere in the room, it fails because its ability to connect with a variety of web services is very limited.

Meanwhile, the Ubi, a voice-activated computer that is older and, yes, much more painful to use, wants to do the same thing. Like a teenager, though, it isn’t adept at listening to my commands, sometimes awkwardly interrupting my conversations, and its music playback is not nearly as graceful as the Echo’s.

But what the Ubi lacks in technical ability it makes up in a democratic willingness to try to control a variety of web services via If This Then That, SmartThings and others. If you combined Ubi’s openness with Amazon’s grace and technical acumen — provided by the powerful speakers inside the Echo and the seven-mic array that picks up your voice from across the room — you’d have the perfect voice-activated digital assistant.

Instead, I paid the [company]Amazon[/company] Prime member price of $99 (it’s $199 for non-Prime members) for what is basically a voice-activated timer, task list and way to access my Amazon Prime music library. The Echo also answers questions via a Bing search about 70 percent of the time it’s asked, although some basics — such as my requests to convert a temperature from Fahrenheit to Celsius — proved unintelligible to the Echo (you can see that in the screenshot below).

Echo recognized requests from half a dozen people, including two children, although my daughter is having a hard time with Echo because she can’t always say “Alexa,” the wake word we use for the device. (Sadly, you only get two options for your wake word: Alexa or Amazon, but a spokesperson from Amazon says it will add more wake words over time.) You can’t change the search engine, so you’d better love Bing.

How it works

Before I get too deeply into my review of the Echo, let me pause to explain how it works and what it can do. The device is a little under nine inches tall and is about the diameter of a wine bottle. It has a ring of lights at the top that acts as an indicator, showing it has heard your command, and can be turned to raise or lower the volume. As electronics go, it’s elegant enough to sit in a visible place in your home. Mine’s on my kitchen counter and I can talk to it from just about anywhere in my downstairs living room, dining or kitchen area.

The Echo also comes with a remote control that you can stick to a surface via a magnet or double-sided tape. The remote reportedly comes in handy for communicating with the Echo in noisy environments, but I’ve found it’s most helpful for fast-forwarding songs that I dislike since my home is rarely noisy enough that the Echo can’t hear us. Using the remote for this is faster than saying “Alexa, skip this song.”

When you open the package, setup takes less than 10 minutes and requires you to go to Amazon’s site to download the Amazon Echo app to your Android phone. The app lets you customize things such as your zip code (so Echo can get weather and news for your area), plus specify important elements such as purchasing preferences.

The app uses the card format you may recognize from Google Now.

For example, the automatic purchasing was turned on when I got the device, which meant that I could just tell the Echo to buy Taylor Swift’s album and could have it in my music library immediately. Thankfully, I don’t have one-click ordering turned on, and as added protection against my eight-year-old’s love of Top 40 songs and instant gratification, I also added a PIN I need to enter before making a purchase.

You can use the app to see what the Echo heard when it misses your verbal request, and to check your Shopping or To Do list. It’s also where you can go to create multiple profiles so you or your partner can share a single Echo but have multiple To Do lists. I’ve found the app to be a good place to troubleshoot when the Echo gets things wrong. It is also popular in our house since we occasionally ask the device for the Spanish translation of an English word, and the translation goes into the app because Echo doesn’t know how to speak it.

What works, what doesn’t

Like any new tool, the Echo takes a bit of getting used to. I imagine in a few weeks it will have completely supplanted a few items in my home, such as the egg timer. I like the timer setting, although the alarm is a bit quiet. I use it for cooking, but also for keeping track of time — “Alexa, set a timer for 4 pm.” I also like asking it to tell me the time since we can’t see the clock from our living room couch.

We’ve been adding things to a shopping list through Echo, although we don’t use that as our master list yet for the weekly run to the grocery store. We also haven’t tried out the feature to let our daughter bring up webpages on her Kindle Fire tablet for schoolwork, but I’m looking forward to that. The idea is that she can say “Alexa, search tornadoes on Wikipedia,” and the page will come up on her Fire tablet.

We dislike the Echo’s Amazon-centric worldview. My family spends plenty of cash on Amazon each month, but our lives are managed via Google calendars and our entertainment is on Netflix, Spotify (mostly consumed over a Sonos) and Hulu. Our home automation runs the gamut from Philips Hue bulbs to a Nest and Chamberlain garage door opener integrated with SmartThings and IFTTT. I’d like to bring those devices and services into the Echo. Amazon does have integrations with TuneIn and iHeartRadio, and I would expect to see more integrations coming since the brains of the Echo are hosted in the cloud and can be updated over time. Amazon’s spokesman does note that other Amazon products have an SDK and that Amazon does want to hear from developers about what they want to do with the Echo, so there’s some hope out there, although Amazon is a company that is known to build proprietary versions of existing open source platforms.

Is the Echo a Jambox rival?

The Echo also acts as a Bluetooth speaker, playing music from Spotify, Pandora and other services via your phone, so people considering a Jawbone Jambox might consider the Echo instead. It sounds as good as the Mini Jambox, although it’s not as adept with the bass. You have to control playback via the phone, not via your voice.

Why not add…

Since Amazon is clearly thinking about ways to build devices that push it deeper into the home and gather more data, there are ways to make the Echo more robust and yet unique enough that Amazon can truly make it an essential gadget in many homes. Here are some things I’d like to see, beyond the ability to control more web services:

Expansion packs for languages, so I can query for vocabulary or questions in other languages.

Home automation control so I can integrate it with my Nest, Hue lights or SmartThings.

A safe mode for my kid, so she can’t purchase anything, but also so she can’t inadvertently play music or send websites to her Kindle Fire that she shouldn’t. Ideally, this would be based on voice recognition.

A way to link two Echoes together and sync them, the way you can with a Sonos music player in party mode. As speakers go, these sound pretty good, so some people might use them all over their house instead of investing in multiple Bluetooth or Sonos speakers.

The end game

As a consumer and Amazon Prime member already, I’d pay $100 for the Echo because I’m old and having voice-controlled access to Amazon’s Prime Music streaming service fits my musical tastes (long live the 90s!). Plus, I like the list features and have a chunk of my music on Amazon’s cloud already. If I weren’t a Prime member this would be a much tougher sell.

If I hated the existing Prime Music offerings, I wouldn’t buy the Echo unless I already bought my music on Amazon and kept it in Amazon’s cloud, or if I were a big user of TuneIn or iHeartRadio. But I do see the device as a unique, well-done offering from a technological perspective that could put Amazon deeper into people’s homes if it manages to open it up a bit more.

I like the Echo and find myself looking for more ways to interact with it, and I hope those come along in the near future. If Amazon is really hell-bent on gathering more data so it can sell me more physical and digital goods, then letting the Echo control more things would let it gather a lot more data about a lot more devices. So both from a selling-more-Echoes perspective and a business-strategy perspective, my hope is that we see Amazon open Alexa up.

And yes, I did say that again.

Why Her’s production designer thinks about function before design
Wed, 19 Nov 2014

K.K. Barrett has worked with Spike Jonze on many films, and helped design Her, a film that famously centers on the relationship between its geeky protagonist and what seems to be a post-Siri talking operating system. At Gigaom’s Roadmap conference on Wednesday, he naturally talked about how the design process for this semi-futuristic technology worked – and also about his own attitudes towards gadget design.

“I think of technology as a tool, something that’s a means to an end,” Barrett said. “I don’t care about the design of the tool unless it functions really well. Design as function is different in my mind to aesthetic design – I do want it to be pleasing if it works very well as a tool.”

On the husky voice of Scarlett Johansson, who played the “Samantha” OS, Barrett said it had been a good choice to go with someone with an unusual voice rather than the “flat voice or common voice” that is often used for virtual assistants today. He said such systems would “eventually need to recognize our hesitancies” and other human quirks.

Barrett explained that, when he and Jonze began designing the film, [company]Apple[/company] hadn’t brought out Siri yet – that only happened during the process, leading the team to keep revising how far in the future the film’s aesthetic was supposed to sit. “At the end we decided we were one release away from now,” he said. “The script had nothing to do with technology. It was a human story about human connection, or lack thereof.”

That said, it turns out the most challenging thing Barrett had to design for the film was the protagonist Theodore’s smartphone-like device. “The first thing we started designing on day one was the device and it was the last thing ready on the day we started shooting,” he said, explaining that this was over a three-and-a-half month period.

Amazon’s Echo device already exists. It’s called Ubi
Fri, 07 Nov 2014

Amazon surprised the tech world Thursday with the launch of Echo, a cylindrical device the size of a whiskey bottle that lets you ask questions, order it to play music, spell words or add items to your grocery list. Unfortunately for those of us who believe voice commands are an essential way to communicate with a smart home, Amazon is using an invite-only process to distribute the Echo. I signed up, and as an Amazon Prime member will only pay $100 for the $199.99 device, but who knows if I’ll get picked to try it out.

But in the meantime I have a device that does much the same thing. It’s called Ubi, and it’s a $299 connected speaker that offers many of the same functions Amazon shows off in its advertisement, as well as hooks into If This Then That and an open development environment. I last played with the Ubi in the spring, when it was a bit of an iffy experience. Sometimes it worked, and other times it offered completely random and often hilarious responses.

In my home Ubi became a joke as we tried to get it to respond to our questions only to be met with something completely different. I’m happy to say that several updates later Ubi is much better at deciphering what I’m asking and delivering more value. But it’s still an experience that requires a user to interact with Ubi like it’s a computer rather than speaking normally, unlike how the Amazon Echo is portrayed.

While Amazon’s Echo is only shown right now in an advertisement (which you would expect to show optimal conditions), I’ve taken many of the same questions that the family asks Amazon’s Echo and asked them of Ubi so you can get a sense of how it sounds. Play the audio file below to hear how Ubi and I interact, or you can continue reading the review for an overview.

In general Ubi takes about two or three seconds to respond to its “wake up” command and then another few seconds to take action. If I’m directing an action over IFTTT it takes longer. For example, when I asked Ubi to “turn living room lights on” via an IFTTT recipe, it took seven seconds for the lights to flip on. That feels like an eternity. I would expect that over time Amazon might build in options that let you control your connected devices, although for now it appears to have its eyes on something a bit more tied to Amazon’s retail operation — letting you add things to your shopping list.

That’s a pretty handy feature, as is the ability to talk to the Amazon Echo from anywhere in the room. Ubi definitely suffers if you get too far away from it, and the speaker quality is pretty low. So when you ask Ubi to play Katy Perry’s “Firework,” as my daughter does, it doesn’t sound great. I’m sure Amazon’s product will sound better.

Ubi does have a fun intercom feature that lets you call any Ubi from an app running on Android phones, which means I can open the app, hit a button and have the Ubi speaker in my kitchen repeat what I’ve said. I used it to scare my daughter while I was testing this. Ultimately, though, you have no way to know whether your message was heard or Ubi is talking to an empty room, although with the aid of an IP camera that could change.

All in all, for $299 Ubi is an expensive toy for the tech set who want to play with voice as a UI. When the Echo hits homes I’ll be on the lookout to see how it fares with the speech recognition and natural language processing as well as how reactive the experience is. Those long pauses with Ubi are horribly awkward.

A better sounding speaker and details on how the privacy policy works round out my list of features that I’ll be eyeing in the Echo. Hopefully I’ll get a chance to experience it soon. In the meantime, I’ve got Ubi.

Speech-recognition platform Wit.AI raises a $3M seed round
Wed, 15 Oct 2014

Wit.AI, a startup building an API platform for speech recognition, has raised a $3 million seed round, led by Andreessen Horowitz. Ignition Partners, NEA, A-Grade, SVAngel, Eric Hahn, Alven Capital, and TenOneTen also contributed. We covered Wit.AI in May, detailing its plans to build a machine-learning-powered API service that developers can use to bring voice commands to their applications or connected devices. We will, of course, be talking all about the devices that could benefit from such a platform at our Structure Connect conference next week in San Francisco.
An AI anthology: Tracking the rise of self-learning computers
Mon, 02 Jun 2014

Over the years, Gigaom has covered many attempts to improve the way that computers respond to our voices, movements or other visual cues, and identify the words we type and the pictures we take. These technologies have changed, and certainly will continue to change, the way we interact with computers and consume the incredible amount of digital data we’re producing. The work being done in universities and corporate research labs right now to build self-learning vision, voice and language models will only make our experiences better.

Here are some timelines tracking Gigaom’s AI coverage over the years, specifically around [technology]deep learning[/technology] research and applications, other types of learning systems and applications, and cognitive computing (really, just [company]IBM[/company] Watson). The second timeline gathers discussions of advanced AI at our various conferences.

We will update them regularly as new product launches, research advances and industry news occur.

Computers that learn what they’re seeing, hearing and reading


Watson: IBM’s big bet on cognitive computing

Talking AI at Gigaom events

Someday, Her will be real. But first, an internet of things that obey our commands
Tue, 20 May 2014

Wit.AI co-founder Alex Lebrun has just one small dream: He wants to make artificially intelligent personalities available to every device we own. But even artificial intelligence systems have to crawl before they can walk.

Any plan to create a system like Joaquin Phoenix’s love interest in the movie Her “has to be grounded in reality,” Lebrun says. Right now, reality is not smooth dialog with a smartphone-based avatar that understands us and the world around us. A more realistic scenario for the next couple years will be empowering your TV to turn down the volume when you ask it to, by turning speech into machine-readable JSON lines.
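To make that "machine-readable JSON" concrete, here is a rough sketch of the kind of structured result such a system might hand back for a volume command. The field names and values are illustrative assumptions for this article, not Wit.AI's actual response schema.

```python
import json

# Hypothetical parsed result for the utterance "turn the volume down".
# The schema here is invented for illustration; it is not Wit.AI's API.
raw = """
{
  "text": "turn the volume down",
  "intent": "set_volume",
  "entities": {"direction": "down", "amount": 2},
  "confidence": 0.93
}
"""

command = json.loads(raw)

# A TV would dispatch on the intent once confidence is high enough.
if command["intent"] == "set_volume" and command["confidence"] >= 0.8:
    print("volume", command["entities"]["direction"],
          "by", command["entities"]["amount"])
```

The point of the JSON middle layer is that the device never has to understand language at all; it only has to act on a small, fixed set of intents.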

“If you want to teach something to an AI,” Lebrun explained, “it first has to understand simple voice commands.” (In a recent Ask Me Anything on Reddit, Facebook AI director Yann LeCun also weighed in on Her, writing that “Something like the intelligent agent in ‘Her’ is totally out of reach of current technology.”)

Scaling speech recognition means embracing machine learning

Lebrun is no neophyte when it comes to interactions between humans and computers. He (along with fellow Wit employee Laurent Landowski) previously founded an online customer service company called VirtuOz that Nuance bought in 2013. So he’s trying not to get ahead of himself — that goes for the technology as well as the business model. Because the vocabulary at Symantec, for example, doesn’t mean a whole lot at Nestle, VirtuOz, which provided a virtual customer service agent for websites, cost its users about $100,000 to deploy, and it took months to hard-code the systems to know everything they needed to know for each individual company.

In trying to take this type of technology mainstream, neither the cost nor the static language model would work. That’s why Lebrun, Landowski and co-founder and CTO Willy Blandin decided to do things differently with Wit. It’s delivered as a free API that developers can use to build voice-command capabilities into their connected devices. Because it’s a cloud-based service, Wit can use machine learning to expand its knowledge base with each developer who adds commands to the system, rather than forcing everyone to hardwire their own sets of words and actions.

Currently, Wit has signed up about 3,500 developers, mostly in the world of connected devices and the internet of things. It was working with Nest before the Google acquisition (it’s not anymore) and is working with SmartThings and various devices with which its connected-home hub interacts. Ideally, someone sitting in his lounge chair will be able to say “Turn the temperature to 75 degrees” or “Set an alarm for 8:15 a.m.,” and the appropriate device will recognize the command, send it to the Wit servers for processing, and then perform the command when it receives its JSON instructions.
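That round trip (capture a command, ship it to a cloud service, act on the JSON that comes back) can be sketched as a small client loop. Everything below, from the stubbed service call to the intent names and handler table, is invented to illustrate the pattern; it is not Wit.AI's real API.

```python
# Sketch of the device-side pattern described above: send audio to a
# cloud speech service, get structured JSON back, run the matching action.
# The recognizer is stubbed out and every name here is hypothetical.

def recognize(audio_bytes: bytes) -> dict:
    """Stand-in for an HTTP POST to the speech service."""
    return {"intent": "set_temperature", "entities": {"degrees": 75}}

HANDLERS = {
    "set_temperature": lambda e: f"thermostat -> {e['degrees']}F",
    "set_alarm": lambda e: f"alarm set for {e['time']}",
}

def handle_utterance(audio_bytes: bytes) -> str:
    result = recognize(audio_bytes)
    handler = HANDLERS.get(result["intent"])
    if handler is None:
        return "sorry, I didn't catch that"
    return handler(result["entities"])

print(handle_utterance(b"...mic samples..."))  # thermostat -> 75F
```

Because the heavy lifting happens server-side, shipping a smarter model improves every device in the field without a firmware update, which is the economic argument for the cloud API model.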

Because it’s dealing with a relatively small vocabulary (there are only so many connected devices right now and, therefore, only so many commands), Wit is able to hit accuracy rates up to 95 percent in some situations. “By connecting all these dots, we have a very good coverage of language,” Lebrun explained.

Provided it sticks to a discernible set of rules, he added, “If you invent your own language … it will work.”

The Wit.AI experience.

To Siri and beyond

If voice commands are the first step toward a fully realized AI experience, the next step is a version of Apple’s Siri that works better (Microsoft would argue its Cortana virtual assistant fits this bill) and that’s omnipresent. Talking to a phone isn’t a particularly intuitive experience (using the keyboard is probably easier, Lebrun suggested) but engaging in some sort of dialog with other devices or appliances might feel a lot more natural. Lebrun thinks it will be at least three years before Wit’s API can enable full dialog between people and their devices.

There are technical hurdles — adequately connecting various APIs or other systems for speech, language and vision, for example — as well as the difficulty of teaching systems to perceive things beyond pattern recognition. They’ll need to be able to figure out that tables or lamps, for example, don’t always look like tables or lamps. They’ll need to go beyond mere recognition and begin to understand what it means for people or objects to move from Point A to Point B, and to predict the future based on all the sensory experiences they’ve already ingested.

Humans might need to make a few adjustments, too — including recognizing that no matter how good an AI system is or what they’ve been promised, it’s still a machine. Lebrun said about 25 percent of people who interacted with a VirtuOz agent thought it was a person — and that technology was relatively rudimentary. Many people felt obliged to type “Thank you” at the end of a chat session; about 15 percent tried to go off-topic and pick up “female” agents.

Amid all this talk about artificial intelligence, that last bit of info might actually be a comforting thought to some people. The more things change, the more they do, indeed, stay the same.

How this startup knows who’s in your meeting, what they’re saying and whether it matters
Fri, 11 Apr 2014

In the business world, the voice is a powerful thing. In meeting rooms, offices and conference calls, it’s how ideas are generated, mandates given and gauntlets thrown down. Yet, somehow, the record of all these discussions doesn’t quite do them justice: messy handwritten (and probably incomplete) notes, typed meeting minutes that don’t distinguish idle chatter from meaningful business or, worse, no record at all. Thanks to advances taking place in computing and machine learning, that’s all about to change.

Take, for example, a startup called Gridspace that wants to make meetings more productive by outsourcing note-taking to a machine. It’s a challenging problem to solve — any solution must provide a seamless experience, as well as be accurate — but the company is trying to do it right. It has built a product that bundles smart hardware and applications with several flavors of speech recognition, voice recognition and natural language processing.

The most noticeable piece of the puzzle is the hardware — a simple, small recording device called the Memo M1 that sits on a desk or table. It’s always on, although its ambient light and motion sensors let it kick in only when someone is actually in the room. It has radio sensors to help determine who’s in the room based on their mobile phone fingerprints, although voice recognition helps make this more accurate, as does pre-planning the meeting using the Memo app and listing the participants.

The Memo service works with conference lines, as well (it can be set up to automatically call participants) and there’s a mobile app available for recording conversations on the road.

After a meeting is done, Memo will email everyone the highlights of the meeting and provide them an opportunity to go through and comment on or flag certain parts. The next day they’ll receive a fuller digest, complete with that post-facto information. At any time, participants can listen to the highlights of the meeting, which presumably are important points or action items, or they can hear the whole thing. They can search for specific parts by word or person.

The Memo mobile app. Source: Gridspace

Gridspace CTO Anthony Scodary described the user experience design as being focused on minimizing changes to how we go about our days in the office. Set up to its fullest potential, Memo users don’t have to press a button, set up something in an app, or even speak a command at something to take advantage of the service. “It’s really just [about] designing interfaces … that make something that you don’t have to change your natural behaviors much,” he said.

Getting it right means getting NLP right

As seamless as the experience might be, though, it’s Gridspace’s work on natural language processing and speech recognition that could make or break the company. All the automation and search capabilities in the world don’t mean much if a system designed to capture meetings can’t understand what’s happening or what’s being said. And after all, as Scodary acknowledged, “The end goal [of Memo] is to generate what is essentially the highlight reel of a meeting.”

Memo has several methods for deeming what might be important, ranging from certain keywords being spoken (e.g., “This is important.”) to someone manually pressing a button on the M1 device to flag it as important. Even changes in volume or lots of people talking over each other might indicate a key part of the conversation.

However, as with many machine learning systems today, it’s the input of humans that will help train Memo to be as accurate as it can be, Scodary explained. The more that people go through afterward and verify the system was correct, or flag important parts it missed, the smarter it gets. When someone “inputs unambiguously that something is important,” he said, Memo analyzes the context around those sections and readjusts the weights in its algorithms accordingly.

Pressing to flag content or mute the recorder. Source: Gridspace
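As a toy illustration of that human-in-the-loop retraining, here is a minimal importance scorer whose feature weights get nudged whenever a person confirms or removes a flag. This is a generic perceptron-style sketch under invented feature names; it assumes nothing about Gridspace's actual models.

```python
# Toy sketch of human-in-the-loop importance scoring: score a meeting
# segment from simple signals, then nudge the weights toward the human
# label. A generic perceptron-style update, not Gridspace's algorithm.

FEATURES = ["has_keyword", "volume_spike", "crosstalk"]
weights = {f: 0.5 for f in FEATURES}

def score(segment: dict) -> float:
    """Sum the weights of the signals present in this segment."""
    return sum(weights[f] for f in FEATURES if segment.get(f))

def feedback(segment: dict, is_important: bool, lr: float = 0.1) -> None:
    """Move active weights up if a human flagged the segment, down if not."""
    direction = 1.0 if is_important else -1.0
    for f in FEATURES:
        if segment.get(f):
            weights[f] += lr * direction

seg = {"has_keyword": True, "volume_spike": True}
before = score(seg)
feedback(seg, is_important=True)  # a user confirms this segment mattered
after = score(seg)
print(before, "->", after)
```

The interesting design choice is that the correction signal is nearly free: participants were going to skim the digest anyway, and each flag or un-flag doubles as a labeled training example.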

Out of the boardroom and into the hallway

If Gridspace, which is still in the process of closed pilot projects and taking reservations for its M1 devices and mobile app, can pull this off, it could have promise even beyond the conference room. Scodary envisions a future where people have Memo devices sitting on their desks, ready to capture an impromptu brainstorming session or maybe just a short chat about the all-hands meeting earlier in the day.

“We’re very interested in those three-minute meetings between your other meetings,” Scodary said. (And don’t worry: there’s a mute button if you’re going to complain about the boss, and Scodary said the company is working on voice commands to strike previous comments and to delete parts of a meeting that has already happened.)

Frankly, this vision is the kind of thing one can see a company like Microsoft or Google chasing, too, as they strive to own productivity by owning the crossroads of collaboration, communication and devices. This type of technology could find its way into an already sensor-packed smartphone, tablet, desktop or even wearable — Intel recently showed off a new mobile processor designed with voice recognition in mind — and integrate with existing office suites and meeting applications.

At home or in the office, our voices could soon be just as important an input to our computers as our keystrokes. Once we figure out how to avoid putting our collective foot in our mouth, we’ll probably be thankful for it.

How to get the most out of Apple’s Siri on your iPhone or iPad
http://gigaom.com/2014/01/11/how-to-get-the-most-out-of-apples-siri-on-your-iphone-or-ipad/
Sat, 11 Jan 2014 17:00:31 +0000

For a feature that has so many uses across all of iOS, it is amazing how many people have never used Siri before. A survey taken in the second half of last year found that as many as 84 percent of users polled were not using Siri following the launch of iOS 7.

That means there are still quite a few individuals who, for some strange reason, find it awkward to speak into their phones. What may make those device owners more comfortable trying out Siri is knowing that it also works with a set of headphones that includes a microphone, and even with Bluetooth headsets.

The following will help those who have not used Siri before get started, and show off some of the many situations where Siri can be used on iOS for those of us who already use it.

Teaching Siri

What to call you – Siri can be configured under the General settings on your iOS device. One of the first things you set up on your device is which card in your contacts list holds your own information; Siri uses this to know who you are. Using the nickname field of your contact card, you can tell Siri what you prefer to be called. It is of course much easier to just tell Siri: “Siri, call me ‘your majesty.'”

Correct pronunciations – Siri does not always get the pronunciation of certain names right. But that’s OK: you can teach Siri how best to say each name. Just tell Siri to “Pronounce Geoffrey Goetz” and you will be guided through a series of tuning options that help get the pronunciation just right. This is much better than trying to manually enter a phonetic spelling in the nickname field of your contacts.

Your family tree – Your contacts list is something Siri can master. The more information you have in your contacts list, the more Siri will know about you and your family. In Contacts you can add several different relationships by editing your contact information and adding a new “Related Name.” The label specifies the relationship between the two contacts: mother, father, brother, sister, spouse, child, friend. You can even enter a custom label of your choosing. When used in conjunction with Find My Friends, it makes searching for your contacts by your association to them much easier: “Siri, where is my son (or daughter)?”

Working with Text

Take dictation – Using Siri to perform speech-to-text translations is not limited to the questions you can ask Siri. You can also tap the microphone button on the keyboard to awaken Siri and speak the text you would like Siri to type for you. Sometimes saying what you are thinking can help you refine your thoughts more clearly, and allows you to avoid committing to paper many of the things in life you probably shouldn’t.

Read selected text – There are many hidden gems inside the iOS Accessibility settings that almost everyone can take advantage of. One such setting turns on Siri’s text-to-speech abilities. Hidden under General, Accessibility, Speak Selection, you can change the voice, adjust the speaking rate, and even have the words highlighted as they are spoken. Enabling it adds a “Speak” option to the menu that appears when you select text. Simply select a section of text as if you wanted to copy it and tap “Speak” to have Siri read the selection back to you. Unfortunately this does not work on books in your Kindle library; for that you will have to turn on the VoiceOver feature.

Reading ebooks – The VoiceOver setting can turn virtually any ebook into an audiobook. To make it easier to switch VoiceOver on and off, configure the Accessibility Shortcut feature located at the bottom of the Accessibility settings. With it, a quick triple-click of the Home button turns VoiceOver on and off. Once set, open one of the books in your library, even in the Kindle app, and triple-click to enable VoiceOver. Once activated, use a two-finger swipe up to signify that you want the book read to you. Now, Siri’s voice is not nearly as nice as some of the winners of this year’s Audie awards for audiobooks, but it will do in a pinch.

Speak notifications – Another hidden gem in the Accessibility settings is the ability to have Siri speak notifications as they pop up. This feature is great to use in conjunction with your car audio system, so you don’t feel you have to take your eyes off the road when a new notification arrives. Turn on the Speak Notifications option that is part of VoiceOver, triple-click the Home button once your iPhone is paired with your car, and Siri will read your notifications as they pop up while you are driving.

Controlling your device

Change device settings – This is a great feature when you are on a plane, listening to your favorite music, and you want to switch airplane mode on. You can speak a command to Siri by pressing and holding the pause button on the remote attached to your headphones, and instruct Siri to modify the setting without missing a beat. It can be something as simple as saying “Turn on Bluetooth” as you get into your car and “Turn off Bluetooth” when you get out. And if there is a setting that you frequently change but don’t like navigating to, just tell Siri you want to change that setting and the proper configuration screen will instantly appear.

Launch apps – Similar to the Spotlight feature in iOS, Siri has the ability to find and quickly launch apps you have installed on your device. By saying something like “Launch Spotify,” Siri will search your list of installed apps and launch the app matching the name you have spoken.

Play iTunes Radio – Of course iTunes is an app, but Siri can take you specifically to iTunes Radio, which is a tab within an app. This trick of launching a specific tab does not work with every app (Spotify and Pandora, for example), but it is a handy way to instantly play your favorite iTunes Radio station with a simple command. You can even get more specific with your music library and ask Siri to play a particular album, song or artist.

Sending and receiving messages

Review missed calls – Placing calls to specific people in your contacts list by saying something as simple as “Call Home” is certainly easy enough to do with Siri. But did you know that Siri can also list your most recently missed calls when you say “Do I have any missed calls?” or check your voicemail when you say “Do I have any new voicemail?” Again, this is a great hands-free feature for the many situations where you are waiting for that important call to come in.

Check email – In a similar manner, Siri also has access to all of your mail: “Do I have any new email?” You can get more specific by asking “Any new email from Tom today?” and Siri will look at any emails you have received from Tom today. It is even possible to have Siri check the content of messages by asking something like “Show new mail about the contract.”

Tweet someone – Since Siri is now more tightly integrated with external services like Twitter, you can use Siri for tweets much as you do for messages. Rather than saying “Tell Susie message,” you say “Tweet message” instead. You can even specify a hashtag for your message, or ask Siri to include your location information in the tweet. Siri can search beyond the bounds of search engines as well: it can find trending topics on Twitter when you ask “What is trending on Twitter?” or handle something more specific like “Find tweets with the hashtag Siri.”