README

==============================================
Speech recognition script for Asterisk
==============================================
This script makes use of Google's speech recognition engine
in order to render speech to text and return it back to the dialplan
as an asterisk channel variable.
------------
Requirements
------------
Perl The Perl Programming Language
perl-libwww The World-Wide Web library for Perl
flac Free Lossless Audio Codec
Internet access in order to contact google and get the speech data.
If you plan to use SSL you will need to install the 'IO-Socket-SSL' Perl module
that implements an interface to SSL sockets.
------------
Installation
------------
To install copy speech-recog.agi to your agi-bin directory.
Usually this is /var/lib/asterisk/agi-bin/
To make sure check your /etc/asterisk/asterisk.conf file
-----
Usage
-----
agi(speech-recog.agi,[lang],[timeout],[intkey],[NOBEEP])
Records from the current channel untill the timeout (set to 10 seconds by default,
-1 for no timeout) is reached or the interrupt key (# by default) is pressed.
If NOBEEP is set, no beep sound is played back to the user to indicate the
start of the recording.
The recorded sound is send over to googles speech recognition service and the
returned text string is assigned as the value of the channel variable 'utterance'.
The scripts sets the following channel variables:
status : Return status. 0 means success, non zero values indicating different errors.
id : Some id string that googles engine returns, not very useful(?).
utterance : The generated text string.
confidence : A value between 0 and 1 indicating the probability of a correct recognition.
Values bigger than 0.95 usually mean that the resulted text is correct.
--------
Examples
--------
sample dialplan code for your extensions.conf
;Simple speech recognition
exten => 1234,1,Answer()
exten => 1234,n,agi(speech-recog.agi,en-US)
exten => 1234,n,Verbose(1,The text you just said is: ${utterance})
exten => 1234,n,Verbose(1,The probability to be right is: ${confidence})
exten => 1234,n,Hangup()
;Speech recognition demo also using googletts.agi for text to speech synthesis:
exten => 1235,1,Answer()
exten => 1235,n,agi(googletts.agi,"Say something in English, when done press the pound key.",en)
exten => 1235,n(record),agi(speech-recog.agi,en-US)
exten => 1235,n,Verbose(1,Script returned: ${status} , ${id} , ${confidence} , ${utterance})
;Check return status:
exten => 1235,n,GotoIf($["${status}" = "0"]?success:fail)
;Check the probability of a successful recognition:
exten => 1235,n(success),GotoIf($["${confidence}" > "0.8"]?playback:retry)
;Playback the text
exten => 1235,n(playback),agi(googletts.agi,"The text you just said was...",en)
exten => 1235,n,agi(googletts.agi,"${utterance}",en)
exten => 1235,n,goto(end)
;Retry in case speech recognition wasn't successful:
exten => 1235,n(retry),agi(googletts.agi,"Can you please repeat more clearly?",en)
exten => 1235,n,goto(record)
exten => 1235,n(fail),agi(googletts.agi,"Failed to get speech data.",en)
exten => 1235,n(end),Hangup()
;Voice dialing example
exten => 1236,1,Answer()
exten => 1236,n,agi(googletts.agi,"PLease say the number you want to dial.",en)
exten => 1236,n(record),agi(speech-recog.agi,en-US)
exten => 1236,n,GotoIf($[$["${status}" = "0"] & $["${confidence}" > "0.8"]]?success:retry)
exten => 1236,n(success),goto(${utterance},1)
exten => 1236,n(retry),agi(googletts.agi,"Can you please repeat?",en)
exten => 1236,n,goto(record)
Under the folder wolfram you can find a sample agi script that in combination with speech-recog.agi
sends queries to WolframAlpha and returs the answers as a dialplan variable. See wolfram/README for
details and dialplan examples.
-------------------
Supported Languages
-------------------
English
Afrikaans
Arabic
Chinese
Czech
Dutch
French
German
Hebrew
Italian
Indonesian
Japanese
Korean
Latin
Malaysian
Polish
Portuguese
Russian
Spanish
Turkish
Yue Chinese (Traditional Hong Kong)
Zulu
-----------------------
Security Considerations
-----------------------
This script contacts google's servers in order send the recorded voice data and get back
the resulted text. If you are using voice recognition for sensitive data or you want to
make sure that no 3rd party can eavesdrop your communication with the google service it is
advised to enable the use of SSL in the script. This will encrypt all the traffic between
your pbx and google servers making it much safer.
To enable the use of SSL set the variable '$use_ssl' to 1 in the script. Keep in mind that
security comes with a price, the initialisation of the SSL session adds a small delay in
communication (usually under 1 sec).
-------
License
-------
The speech-recog script for asterisk is distributed under the GNU General Public
License v2. See COPYING for details.