Voice Recognition enables users to control their Smart TVs with voice commands. Two recognition modes are supported: embedded mode and server mode. In embedded mode, the TV tries to match voice samples to one of the predefined commands. This operation is performed locally by the TV. In server mode, voice samples are sent to an external server, which converts them into text and sends the results back to the TV. In this tutorial you will learn how to develop an application that the user can control by voice.

Figure 1. Initialized tutorial application on the virtual machine based version of the Emulator

Before performing any action related to Voice Recognition, you should check whether this feature is supported by the device your application is running on. Use the IsRecognitionSupported function for this purpose. You should also check whether voice recognition is enabled on the device, using the IsVoiceRecognitionEnabled function.

You can find the code responsible for those actions in the Setup scene’s handleShow method, in /app/scenes/Setup.js.
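The two checks can be combined into a single guard. The following is a minimal sketch, not the tutorial's actual handleShow code: the helper name canUseVoiceRecognition is hypothetical, the recognition object is passed in as a parameter (on a device it is assumed to be webapis.recognition), and both functions are assumed to return a truthy value when the check passes.

```javascript
// Sketch only. `recognition` stands for the platform recognition object
// (assumed to be webapis.recognition on a real device).
function canUseVoiceRecognition(recognition) {
    if (!recognition ||
            typeof recognition.IsRecognitionSupported !== "function" ||
            typeof recognition.IsVoiceRecognitionEnabled !== "function") {
        return false; // The API is not available at all.
    }
    // Assumption: both functions return a truthy value on success.
    if (!recognition.IsRecognitionSupported()) {
        return false; // The device does not support voice recognition.
    }
    return Boolean(recognition.IsVoiceRecognitionEnabled());
}
```

Passing the object in as a parameter keeps the guard testable outside the TV environment, for example with a stub object that exposes the same two functions.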

When the application is closed, you should unsubscribe from voice events with the UnsubscribeExEvent function. If your application uses the Application Framework, unsubscribe in the onDestroy function, which can be found in the /app/init.js file. Otherwise, you can unsubscribe when the window.onunload event occurs.
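Because the cleanup may be reachable from more than one place (onDestroy or window.onunload), it can help to make it safe to call repeatedly. The following is a minimal sketch under that assumption; createVoiceCleanup is a hypothetical helper, and the actual UnsubscribeExEvent call, whose arguments are not shown in this tutorial, is supplied by the caller.

```javascript
// Sketch only. `unsubscribe` is a caller-supplied function that performs
// the real UnsubscribeExEvent call with whatever arguments the API requires.
function createVoiceCleanup(unsubscribe) {
    var done = false;
    return function () {
        if (done) {
            return false; // Already unsubscribed, nothing to do.
        }
        done = true;
        unsubscribe();
        return true;
    };
}
```

The returned function can then be called from onDestroy or assigned to window.onunload, and the underlying unsubscribe call runs only once.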

When a recognition event occurs, the callback function that was set up during event subscription is invoked. The function receives an event object as a parameter. Only one voice event callback can be registered. For this reason, the application introduces the VoiceDispatcher object, which routes received voice events to the currently focused scene.
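The routing idea can be sketched as follows. This is not the tutorial's VoiceDispatcher implementation: createVoiceDispatcher, onVoiceEvent and getFocusedScene are hypothetical names, and the only thing taken from the application is that a scene exposes a handleVoice method.

```javascript
// Sketch only. `getFocusedScene` is an application-supplied function that
// returns the scene object that currently has focus (or null).
function createVoiceDispatcher(getFocusedScene) {
    return {
        // Register this single function as the voice event callback.
        onVoiceEvent: function (event) {
            var scene = getFocusedScene();
            if (scene && typeof scene.handleVoice === "function") {
                scene.handleVoice(event); // Forward the event unchanged.
                return true;
            }
            return false; // No focused scene handles voice events.
        }
    };
}
```

With this shape, each scene only needs to implement handleVoice, and the single registered callback never has to change when focus moves between scenes.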

The voice event object passed by the VoiceDispatcher to the scene’s handler contains properties specifying the type of the event and the recognition result. Based on this information, the appropriate action can be performed.

File: /app/scenes/PhotoView.js

(...)
this.handleVoice = function (e) {
    log("ScenePhotoView.handleVoice, type: " + e.eventtype);
    switch (e.result.toLowerCase()) {
        case "left":
            Grid.left();
            break;
        case "right":
            Grid.right();
            break;
        case "exit":
            sf.core.exit();
            break;
        case "close":
            sf.scene.hide("PhotoView");
            sf.scene.show("Gallery");
            sf.scene.focus("Gallery");
            sf.key.preventDefault();
            break;
        case "describe":
            (...)
            isDescriptionOn = true;
            break;
        case "return":
            if (isDescriptionOn) {
                (...)
                isDescriptionOn = false;
            }
            break;
        default:
            if (isDescriptionOn) {
                // Recognize any text description of a photo in gallery.
                Grid.setDescription(e.result);
                (...)
                isDescriptionOn = false;
            }
            break;
    }
};
(...)

The event.result property contains the recognized text. The preceding code snippet shows a handler for both the embedded mode commands set by the user (left, right, exit, close, describe) and the server mode command (return).

In this application, the user can set a text description for each photo in the gallery. If event.result is not recognized as a keyword (left, right, exit, close, describe or return), it is treated as a description, provided that description mode is on (its state is held in the isDescriptionOn variable).

The voice helpbar shows the available voice commands and/or a guide text at the top of the screen. It tells the user which voice commands the application currently supports. The application can change the helpbar contents at any time.

The SetVoiceHelpbarInfo function is used to set up the voice helpbar type and its items. The voice helpbar configuration must be passed as a stringified JSON object. The following code snippet illustrates the format and possible properties of the configuration object.

Important

Recognition mode is determined by the type of the voice helpbar.

Tip

You can use the JSON.stringify method to convert a JavaScript object into a string.

In server mode, the TV sends recorded voice samples to an external server which performs recognition. When the operation is completed, the server returns the recognized string. In addition to the guide text, the server mode helpbar can also display one of the predefined items such as OK, Cancel or Return:

HELPBAR_TYPE_VOICE_SERVER_GUIDE

default helpbar for server recognition mode

HELPBAR_TYPE_VOICE_SERVER_GUIDE_RETURN

helpbar for server recognition mode with special Return command emulation

HELPBAR_TYPE_VOICE_SERVER_GUIDE_OK

helpbar for server recognition mode with special OK command emulation

HELPBAR_TYPE_VOICE_SERVER_GUIDE_CANCEL

helpbar for server recognition mode with special Cancel command emulation
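A server mode configuration can be assembled from one of the four types listed above and a guide text. The following is a minimal sketch; buildServerHelpbarInfo and SERVER_HELPBAR_TYPES are hypothetical names, and only the helpbarType and guideText properties used elsewhere in this tutorial are assumed.

```javascript
// The four server mode helpbar types listed above.
var SERVER_HELPBAR_TYPES = [
    "HELPBAR_TYPE_VOICE_SERVER_GUIDE",
    "HELPBAR_TYPE_VOICE_SERVER_GUIDE_RETURN",
    "HELPBAR_TYPE_VOICE_SERVER_GUIDE_OK",
    "HELPBAR_TYPE_VOICE_SERVER_GUIDE_CANCEL"
];

// Sketch only. Returns the stringified configuration for a server mode
// helpbar, rejecting type names that are not in the list above.
function buildServerHelpbarInfo(helpbarType, guideText) {
    if (SERVER_HELPBAR_TYPES.indexOf(helpbarType) === -1) {
        throw new Error("Unknown server helpbar type: " + helpbarType);
    }
    return JSON.stringify({
        helpbarType: helpbarType,
        guideText: guideText
    });
}
```

Validating the type name up front catches typos in the long constant names before the string is handed to SetVoiceHelpbarInfo.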

File: /app/scenes/PhotoView.js

helpBarInfoDescribe = {
    helpbarType: "HELPBAR_TYPE_VOICE_SERVER_GUIDE_RETURN",
    guideText: "Say the word or phrase you wish to add as the image description"
};
webapis.recognition.SetVoiceHelpbarInfo(JSON.stringify(helpBarInfoDescribe));

Tip

To test voice recognition in server mode using the SDK, use the input field shown in Figure 2 to enter the phrase to be recognized.