Monday, 7 September 2015

The development of smartphone web applications poses many challenges. For me personally the largest challenge is efficient and comfortable data input. Entering data through a small keyboard is time-consuming and unpleasant. For this reason I have been designing and developing alternative ways to enter data. My latest investigations concern speech input. It started with the release of iOS 8, which supports speech-to-text for many more languages than before. Among them is Dutch, my native language, and you know what? It does a great job!

Speech recognition on iPhone

Recognition is fast and accurate. I am able to ‘speak’ a text message or email with at most one correction. It has a hard time recognizing names and jargon, so entering this blog post by speech would need a lot of correcting. You do need an active internet connection, as the spoken words are sent to a server where the detailed interpretation is done. As you speak you will see the result of the recognition by the iPhone itself. A fraction of a second later this is replaced by the server's interpretation.
The dictation function is available in the keyboard.

Anywhere you can use the keyboard, you can press the microphone button and dictate.
The words you speak are then converted to text and entered into the current input item.
You can have more than one language keyboard installed. Use the Globe button to switch keyboards (and language).
In iOS 8 you can turn on speech recognition under Settings > General > Keyboards.

I have done some tests with speech recognition on Android devices. It is pretty good, although it cannot match the iPhone at the moment.

Speech recognition within Apex applications

You can use dictation with any Apex application out of the box. Once you have activated dictation, the microphone is available in the keyboard and keyboard entry can be replaced by speech-to-text conversion.
There are a few limitations however.
Some data types need conversion:

Numbers below 10 are converted to words, not digits

Amounts may contain a currency sign

Spoken times might not be in the right format

Dates will most likely not be in the format expected by Apex. If the spoken text is recognized as a date the month name will be used.

Items of these types need a conversion step for spoken input, so the form has to be changed.
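As a rough illustration of such a conversion, here is a minimal TypeScript sketch for an amount item. It assumes English number words and a small set of currency signs; the helper name normalizeAmount and the word list are hypothetical and not taken from my application.

```typescript
// Illustrative only: the names and the word list are assumptions,
// not code from the sample application.
const NUMBER_WORDS: { [word: string]: string } = {
  zero: "0", one: "1", two: "2", three: "3", four: "4",
  five: "5", six: "6", seven: "7", eight: "8", nine: "9",
};

// Turns dictated text such as "seven", "$4.65" or "€ 2,50" into a plain
// decimal string, or returns null when the text is not a recognisable amount.
function normalizeAmount(spoken: string): string | null {
  // Remove currency signs, then surrounding whitespace.
  const cleaned = spoken.replace(/[$€£]/g, "").trim().toLowerCase();

  // Numbers below 10 arrive as words, so map them back to digits.
  if (Object.prototype.hasOwnProperty.call(NUMBER_WORDS, cleaned)) {
    return NUMBER_WORDS[cleaned] || null;
  }

  // Accept a decimal comma as well as a decimal point.
  const numeric = cleaned.replace(",", ".");
  return /^\d+(\.\d+)?$/.test(numeric) ? numeric : null;
}
```

With this sketch, normalizeAmount("$4.65") gives "4.65" and normalizeAmount("seven") gives "7". Dates and times would need their own conversion along the same lines.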
Furthermore, it is not possible to fill in controls that do not expose the keyboard. These include check boxes, radio groups, select lists and HTML5 date and time controls.

Apart from the above, entering each item separately by speech is not practical. You need 3 taps per item, so in most cases keyboard entry will be faster. You will only benefit when entering large texts like remarks or descriptions.

Input of multiple items at once

As mentioned above the entry of separate items is not efficient.
Another way to use speech recognition is by creating a new input item especially for speech entry. The spoken content of this item is split into item content using stop words.
This way multiple items can be input by speaking one sentence. Another advantage of this approach is that the necessary processing can be performed while analysing the sentence, before the values are placed into the items. So the form items themselves need not be changed.
The processing can also be extended to special cases. It enables for example the use of relative dates, like yesterday or Monday last week.
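To give an idea of how relative dates could be resolved, here is a small TypeScript sketch. It is not the code of my application: the function name resolveRelativeDate is hypothetical, it only knows English phrases, and it assumes that a week starts on Monday.

```typescript
// Illustrative only: names and supported phrases are assumptions.
const WEEKDAYS = [
  "sunday", "monday", "tuesday", "wednesday",
  "thursday", "friday", "saturday",
];

// Resolves "today", "yesterday" or "<weekday> last week" relative to `today`.
// Returns null for phrases it does not understand.
function resolveRelativeDate(phrase: string, today: Date): Date | null {
  const text = phrase.trim().toLowerCase();
  // Work on a copy at local midnight so the caller's date is untouched.
  const result = new Date(today.getFullYear(), today.getMonth(), today.getDate());

  if (text === "today") {
    return result;
  }
  if (text === "yesterday") {
    result.setDate(result.getDate() - 1);
    return result;
  }

  const match = /^([a-z]+) last week$/.exec(text);
  if (match) {
    const target = WEEKDAYS.indexOf(match[1] || "");
    if (target === -1) {
      return null;
    }
    // Offsets counted from Monday (Monday = 0 ... Sunday = 6).
    const sinceMonday = (result.getDay() + 6) % 7;
    const targetFromMonday = (target + 6) % 7;
    // Go back to Monday of this week, one more week, then forward to the target.
    result.setDate(result.getDate() - sinceMonday - 7 + targetFromMonday);
    return result;
  }
  return null;
}
```

Called on Monday 7 September 2015, "yesterday" resolves to 6 September 2015 and "Monday last week" to 31 August 2015.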

Example form

I have built a sample Apex application for speech input. It is based on the mobile Apex application that I use on a daily basis to register my expenses. To enter an expense, the date, the amount spent, the name of the shop and a description of the purchase have to be input. A separate item is available for the speech input.
The values can be entered in one sentence of the following structure:
the first item is always the description

the shop name follows after the stop word at

the amount is preceded by the stop word for. The amount is preferably entered with currency, for example two euro fifty. This results in the most accurate recognition and formatting

the date can be entered using the stop word on or as a relative date (yesterday, Monday last week)

Except for the description there is no prescribed input order.
An example input would be:

Bread and milk at Tesca for $4.65 yesterday

When leaving the speech input item the sentence is analysed and the content is written to the various input items. The example would result in:

date: one day before today

shop: Tesca

amount: 4.65

description: Bread and milk
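The splitting on stop words can be sketched in TypeScript as follows. This is a simplified illustration under my own assumptions, not the code behind the sample application: the names Expense and parseExpense are hypothetical, only English stop words are handled, and a description that itself contains at, for or on would be split in the wrong place.

```typescript
// Illustrative sketch: type and function names are assumptions.
type Field = "description" | "shop" | "amount" | "date";

interface Expense {
  description: string;
  shop: string;
  amount: string;
  date: string; // raw spoken phrase, e.g. "yesterday"
}

// Words that start a date without the stop word "on".
const DATE_STARTERS = [
  "today", "yesterday", "monday", "tuesday", "wednesday",
  "thursday", "friday", "saturday", "sunday",
];

function fieldForStopWord(word: string): Field | null {
  switch (word) {
    case "at": return "shop";
    case "for": return "amount";
    case "on": return "date";
    default: return null;
  }
}

function parseExpense(sentence: string): Expense {
  const parts = {
    description: [] as string[],
    shop: [] as string[],
    amount: [] as string[],
    date: [] as string[],
  };
  let current: Field = "description";

  const words = sentence.trim().split(/\s+/);
  for (let i = 0; i < words.length; i++) {
    const word = words[i] || "";
    const lower = word.toLowerCase();
    const next = fieldForStopWord(lower);
    // The first word always belongs to the description.
    if (i > 0 && next !== null) {
      current = next; // stop word: switch field, do not store the word itself
      continue;
    }
    if (i > 0 && DATE_STARTERS.indexOf(lower) !== -1) {
      current = "date"; // relative date: switch field and keep the word
    }
    parts[current].push(word);
  }

  return {
    description: parts.description.join(" "),
    shop: parts.shop.join(" "),
    // Strip a currency sign so that "$4.65" becomes "4.65".
    amount: parts.amount.join(" ").replace(/[$€£]/g, "").trim(),
    date: parts.date.join(" "),
  };
}
```

For the example sentence this returns the description "Bread and milk", the shop "Tesca", the amount "4.65" and the date phrase "yesterday", which still has to be turned into an actual date.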

The user can check whether all the items are filled in correctly. If everything is all right, the user can submit the data and a new record is created.
Now try it yourself by opening this link on your mobile phone.

You can log in with the username/password combination guest/welcome.
Behind the link you will find an application to enter expenses with speech input. The instructions can be found under the menu item Manual.

When is speech input useful?

This way of entering data is not really suited to occasional use. You have to be used to the way the sentence is formed and to the possibilities that are available. Once you are used to filling in the form by speech, you will not want to use the keyboard any more.
Mobile users who have to record their actions regularly could use this kind of entry, for example a salesman, a service engineer or a nurse. For all these workers, entering the data directly after the process or as part of the process frees them from filling in paper forms that need to be input on a desktop computer at a later time.

Future developments

I plan to create a plugin for speech input supporting several languages. With this plugin it will be easy to enable existing Apex forms for speech input.
The current software does not support check boxes or lists of values; I will be working on that too.

Please let me know whether you would find such a plugin useful and what use you see for it.