Can Speech Adapt to My Mistakes?

For Newcomers

Surprisingly, yes. Indeed, to a certain extent, Speech will try to understand
your command even if you do not get it right immediately. For example,
"Get my mails" and "Get my mail" will work the same way.

However, you should not expect Speech to understand sentences that are
too different from what the developer intended. If you think that a
command is so unnatural that you won't be able to learn it, you may
want to create a custom command that will be more natural to you.

For the Cutting-Edge Addicts

If you're ready to explore the latest developments in Speech technology,
you can turn on Panther's Semantic Inference feature. Behind this
strange-sounding name hides a technology that allows Speech to understand
what you say, even if you do not speak the predefined command exactly.

When this is turned on, you can replace "What time is it?" with "What
is the time?", "Tell me the time," or even "How late is it?"

Since this technology is still in its early stages of development, Apple
chose to turn it off by default. Its accuracy may not be perfect
(yet) and it may slow the speech-recognition engine down a bit. In my
experience, however, everything worked perfectly well, so I would
encourage you to give it a try.

To do so, follow these steps:

Open the Speech Preferences pane by saying "Open the Speech Preferences."

Go to the Speech Recognition tab.

Select the Commands sub-tab.

Highlight "Global speakable items" and click on Configure.

In the sheet that appears, uncheck the box to turn the feature on
(I know, I know...).

To test it, read the sentences suggested by the activation sheet and
be amazed.

Going One Step Further

Now that you have discovered the joy of Speech, it's time to go one
step further and learn how to almost completely get rid of your keyboard
and mouse.

Front Window and Menu Bar Control

By now, you may have noticed that many commands are still out of your
reach, including menu items, toolbar buttons, etc. The good news is
that you can control them with Speech too, making your keyboard and
mouse almost obsolete.

In order to turn this option on, follow these steps:

In the Universal Access preferences pane, click on "Enable access
for assistive devices."

In the Speech preferences pane, click on Commands.

Select Front Window and Menu Bar.

Now a whole new world is open to you. Try to say the following commands
to show or hide the volume in the menu bar:

Switch to System Preferences.

Show all.

Sound.

Show volume in menu bar.

This gives you a lot of power over your applications and dialog boxes.
Unfortunately, some nonstandard controls will not work with this method.
Also, you probably will not be able to pick items in complex lists by
using Speech. However, most of the functionality of most applications
will be available via voice commands.

Even more powerful and more universal is the menu bar, which you can
also control by voice. Since almost all menus are standard, you can
reach most of your applications' menu commands without any issue.

To shut your Mac down, you would say:

Switch to Finder.

Apple menu.

Shut down.

Shut down.

Define Keyboard Shortcuts

This is all very nice but, sometimes, giving a menu and a menu-item
name to perform a simple action can be a bit bothersome. That's why
the Speech development team introduced a very nifty command that allows
you to enter any keyboard shortcut simply by saying "Define new keyboard
shortcut."

A palette will then pop up, allowing you to enter the keyboard shortcut
and the voice command you wish to associate with it. You can use such
a command to, for example, create a "Close tab" command in Safari or
a "New chat with" feature in iChat. Users with disabilities could create
custom commands for "Zoom in" and "Zoom out."

Of course, since Panther allows you to define custom shortcuts through
the Keyboard preferences pane, this feature is even more powerful
than one could think at first sight.

Better Interactions with Your Mac

Spending your day in front of your screen isn't always fun, as enjoyable
as using a Mac can be. Therefore, you may, from time to time, wish to
be able to step away from your computer -- when a long task is running,
for example -- but without losing contact with your Mac in case something
important happens.

That's pretty simple. Indeed, Mac OS X now features "talking alerts"
-- this feature will cause your Mac to read the alert messages that
may pop up on your screen if you do not respond to them after a
predefined delay.

This feature can also be very useful in an environment where multiple
computers run at the same time -- a print shop or a computer lab in
a school. Wouldn't it be nice to hear in a clear, distinctive voice
"The PowerMac G5 next to the window needs your attention. The printer
is out of paper," instead of a "Bong!" that you would need to track down?

In order to benefit from this feature, use the "Spoken User Interface"
tab of the "Speech" preferences.

You can then define what the computer will say and after how long it
will talk. I wouldn't recommend that you set a short delay, since having
the Mac read the alert while you are already reading and reacting to
it may be annoying. Setting it to 10 seconds gives you the time to react
if you are already in front of the screen.

Your Mac can also read alert windows that, for any reason, would pop
up behind your current application or working document.

The "Announce when an application requires your attention" option can
also be a time saver. Indeed, while you are working, you may not notice
the icons furiously bouncing in your Dock but will certainly hear "Safari
needs your attention."

Adding Commands, Folders, or Files

Like many users, your workflow may require you to access documents that
are buried in your folder hierarchy. Luckily, you can easily create
a "command" that tells Speech to open them in the blink of an eye.

In order to do that, simply create an alias of the folders that you
commonly use in the following folder:

[Home] -> Library -> Speech -> Speakable Items

Now, wherever you are, you simply need to say the name of the folder
to open it. To make the alias creation process easier, remember that
holding the Option and Command (Apple) keys while dragging an icon
creates an alias.

You can even make your own items speakable by using speech itself.
Merely click on the item in the Finder and say, "Make this
speakable." Speech will take care of making the alias, putting it in
the Speakable Items folder, and removing the word "alias" from its name.
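
If you create and rename aliases by hand instead, the clean-up that
"Make this speakable" performs on the name can be sketched in the Terminal.
This is only an illustration, not Apple's implementation: speakable_name is
a hypothetical helper, and the rule of also dropping the extension and
replacing underscores and hyphens is my own assumption about what makes a
name easy to say.

```shell
#!/bin/sh
# speakable_name: hypothetical helper that turns a file name into
# something easier to say aloud. It only prints the suggested name;
# it does not rename or create anything.
speakable_name() {
    name=$(basename "$1")
    # Drop the " alias" suffix that the Finder appends to new aliases.
    name=${name% alias}
    # Drop everything after the last dot; names that start with a dot
    # are left alone.
    case $name in
        .*) ;;
        *.*) name=${name%.*} ;;
    esac
    # Turn underscores and hyphens into spaces so the name reads as words.
    printf '%s\n' "$name" | sed 's/[_-]/ /g'
}
```

For example, speakable_name "Q3_sales-report.pdf alias" prints
"Q3 sales report", which you could then type as the new name of the alias.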

Of course, you have to be careful not to drop in any alias whose name
matches that of an existing command too closely. Otherwise, you may
end up opening this folder unintentionally. To avoid this, simply
change the name of the alias and all will be well again.

Even cooler, you can put aliases in there to documents that you open
often, or to the web location files that Mac OS X creates when you drag
a URL from a browser's address bar onto the desktop. Just make sure that
you give these files names that are relatively easy to pronounce -- for
example, remove the extensions if possible or you will have to pronounce
"filename dot extension."

When All This Is Not Enough

When adding aliases and interacting with buttons or menu items simply
is not enough, keep in mind that both AppleScript and the Terminal can
work closely with the Speech technology.

For example, here is how to write a script that will read a string of
text ...

... in AppleScript:

-- Speak the sentence aloud with the "Cellos" voice.
say "This is something very cool very cool very cool this is something very cool that every Mac can do!" using "Cellos"

... in the Panther Terminal:

# Speak the sentence aloud with the Cellos voice.
say -v Cellos "This is something very cool very cool very cool this is something very cool that every Mac can do"

Note that the voice you pick will be ignored by AppleScript if speech
recognition is turned on. This is a feature that allows users to enjoy
consistency in the dialog they have with their computer.

When using the "saving to" option, however, the voice you pick
is used, since the consistency of the interaction with the user is no
longer a concern.

The ability to interact with the Speech Synthesizer even if you are
not a developer will allow you to add speech capabilities to the Terminal
scripts or AppleScripts that you already use in your daily workflow
without having to learn a whole new set of commands or language.
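
As an illustration, here is a minimal sketch of a Terminal helper that runs
a long task and then speaks the outcome. The function names and the wording
of the messages are hypothetical; the only real command involved is "say",
and the script falls back to printing the message when "say" is not
available.

```shell
#!/bin/sh
# announce_message: hypothetical helper that builds the sentence to
# speak from a task label and the task's exit code.
announce_message() {
    label=$1
    exit_code=$2
    if [ "$exit_code" -eq 0 ]; then
        printf '%s has finished.\n' "$label"
    else
        printf '%s failed with status %s.\n' "$label" "$exit_code"
    fi
}

# run_and_announce: run a command, then speak the outcome
# (or print it when the say command cannot be found).
run_and_announce() {
    label=$1
    shift
    "$@"
    exit_code=$?
    message=$(announce_message "$label" "$exit_code")
    if command -v say >/dev/null 2>&1; then
        say "$message"
    else
        echo "$message"
    fi
    return "$exit_code"
}
```

You could then type something like: run_and_announce "The backup" followed
by the command you already use, and walk away until your Mac calls you back.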

Getting Your Mac to Listen

Now that your existing scripts can speak to you, wouldn't it be even
better if they could listen? Well, Apple already thought of it, and all
the information that you need to create complex listen-and-tell scripts
can be found on this page.

That way, you can create even more complex speakable items that will
start a true dialog with you and react depending on your needs and answers.
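
To give a rough idea of what such a dialog can look like, here is a small
sketch built on the "listen for" command of the SpeechRecognitionServer
scripting application, called from the Terminal through osascript. I am
assuming that this application and its "with prompt" and "giving up after"
parameters are available on your system; the function names, the prompt,
and the reactions are placeholders of my own.

```shell
#!/bin/sh
# react_to_answer: map a recognized answer to an action.
# The actions are placeholders; replace them with your own commands.
react_to_answer() {
    case $1 in
        yes) echo "Continuing." ;;
        no)  echo "Stopping." ;;
        *)   echo "No usable answer." ;;
    esac
}

# ask_by_voice: ask a yes/no question and print what was heard.
# Assumes the SpeechRecognitionServer application is present; prints
# nothing when osascript is missing or recognition fails or times out.
ask_by_voice() {
    if command -v osascript >/dev/null 2>&1; then
        osascript -e 'tell application "SpeechRecognitionServer" to listen for {"yes", "no"} with prompt "Shall I continue?" giving up after 10' 2>/dev/null
    fi
}
```

Calling react_to_answer "$(ask_by_voice)" ties the two together: the script
asks, waits up to ten seconds, and reacts to what it heard.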

Have Some Suggestions to Make It All More Exciting?

Indeed, I do! The first thing to do is to over-use the "Show me what
to say" command and to try to do as much as you can with Speech. At
first, it may look like you are actually losing time since you need
to learn the commands and sometimes learn to speak into the microphone.

However, very quickly, you will see that you can do almost everything
with Speech and get completely rid of meaningless alert sounds, creating
a true dialog with your computer.

Many applications are speech-ready -- iChat, for example, can read aloud
the names of the people who invite you to a chat, but this option is
turned off by default. It is worth taking the time to learn what each
one can, and cannot, do.

After a few days of practice, I am glad to say that I now can use my
Mac without a keyboard or mouse for most of the day, except when typing,
of course.

I Want to Create Sounds from Speech Synthesis

On some occasions, you may want to create a sound file from the speech
generated by the speech engine. The easiest way to do so is to use an
AppleScript command like this:

say "This is something very cool very cool very cool this is something very cool that every Mac can do!" using "Cellos" saving to "Cool.aiff"

When you run this script, it creates a file at the root level of your
hard drive, containing the sound that you would hear if the synthesis
had happened on-the-fly.
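
The same can be sketched from the Terminal, since the "say" command accepts
an -o option naming the output file (check the say manual page on your
system to confirm that your version offers it). The two helper functions
below are hypothetical, and saving into your home folder is my own choice;
passing a full path simply removes any doubt about where the file lands.

```shell
#!/bin/sh
# speech_file_path: hypothetical helper that builds an output path in
# the home folder, for example ~/Cool.aiff for the name "Cool".
speech_file_path() {
    printf '%s/%s.aiff\n' "${HOME%/}" "$1"
}

# save_speech: write the spoken text to an AIFF file with the Cellos
# voice instead of playing it. Returns 1 when say cannot be found.
save_speech() {
    out=$(speech_file_path "$1")
    if command -v say >/dev/null 2>&1; then
        say -v Cellos -o "$out" "$2"
    else
        echo "say not found; would have written $out" >&2
        return 1
    fi
}
```

For example, save_speech Cool "This is something very cool" should leave a
Cool.aiff file in your home folder.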

Other Technologies

To achieve the same effect, you can also use the demo pages of the AT&T
"Natural voices" technologies. Indeed, to demonstrate their system,
AT&T allows you to type text into a web form and to download the resulting
file. Its main advantage is that it allows you to read text in
many languages.

Here is the demo page. Of course, since there are certain limitations
and copyrights that apply, I encourage you to read the terms and
conditions first. You should also keep in mind that this system is
targeted at professional frameworks and that it runs on powerful servers.

Author's Note

During the preparation of this article, I had the opportunity to talk
with Kim Silverman, principal research scientist, manager, spoken language
technologies at Apple. May he find here the expression of my gratitude
for the information he so kindly provided.

Needless to say, any errors or inaccuracies in the preceding pages remain
entirely my responsibility.

FJ de Kermadec is an author, stylist and entrepreneur in Paris, France.