Microsoft Speech Server 2004 Standard Edition

Pros: Simple administration; standout tools; simulator for running speech applications on a desktop PC
Cons: Does not currently support VoiceXML
Bottom Line: With the debut of Microsoft Speech Server 2004 Standard Edition, speech-based computing is now within reach of a much wider range of businesses.

Speak to Your Server

Setting up interactive voice-response systems (like those used by large financial companies and movie directory hotlines) used to require a significant investment in proprietary software. But with the debut of Microsoft Speech Server 2004 Standard Edition ($7,999 per CPU), speech-based computing is now within reach of a much wider range of businesses.

With Speech Server, you can develop several kinds of speech applications: touch-tone interfaces, voice-driven menus, and multimodal interfaces (where voice supplements a standard visual Web interface). Microsoft's multimodal application style is new and lets callers interact with Web pages via voice as never before. Speech Server piggybacks on ASP.NET, adding speech to standard Web applications via its Speech Application Language Tags (SALT). It offers voice input recognition as well as text-to-speech conversion, with technology licensed from ScanSoft. However, it does not currently support VoiceXML.

We installed Speech Server under Windows Server 2003 Enterprise along with an Intel Dialogic D/41JCT-LS telephony card with four phone ports. (Your hardware choices for telephony include Intel cards with up to 96 ports.) We used the third-party software from Intel to configure the telephony card, and with the default options we were up and running in under an hour. If you want more than the default options, you'll likely need to dig in here. Configuration can involve setting jumpers on multiple boards. Microsoft is leaving this configuration layer up to the hardware vendors and their software, though any admin who faces this chore will probably wish for more integration with Speech Server itself.

On the other hand, administration of Speech Server is simple. If you can handle standard ASP.NET Web applications, you'll be able to administer speech-enabled ones, too. Basic admin is handled with a bare-bones MMC snap-in for configuring speech-enabled applications. Additional counters are available in Performance Monitor for pinpointing processing bottlenecks. For debugging on the server, Speech Server offers a trace utility and the ability to view server events within Windows Event Viewer.

The real power behind speech processing is the freely downloadable Microsoft Speech Server SDK 1.0. We installed this add-on to Visual Studio .NET and were pleasantly surprised by its powerful suite of speech tools. By contrast, in the VoiceXML space, development often means using a text editor to write and tweak XML. Microsoft offers a component-based model for building speech apps, with some two dozen Visual Basic-style components for handling speech dialogs and managing phone calls, all without delving into the details of XML. In testing, we used these components to model several voice dialogs in C# and SALT from a legacy travel alert application built in VoiceXML.

Other standout tools make creating your first speech application relatively painless. A visual editor let us define and tweak speech recognition grammars (sets of phrases that are valid at particular points within a speech-enabled application). Professional speech developers record all valid prompts ahead of time in sound files; a handy prompt editor in the SDK let us manage a list of text prompts along with recorded WAV files. Speech Server SDK includes support for recording, playing back, and editing sampled speech.
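For readers new to the idea, a recognition grammar can be pictured as a lookup of the phrases that count as valid answers at one step of a dialog. The sketch below is a toy illustration in Python, not the Speech Server SDK or its grammar format; the grammar contents and the names CITY_GRAMMAR, normalize, and match are all hypothetical.

```python
# Toy illustration only: this is NOT the Speech Server SDK or its grammar format.
import re

# Hypothetical grammar for one dialog step ("Which city?").
# Each valid phrase maps to the value the application would act on.
CITY_GRAMMAR = {
    "new york": "NYC",
    "new york city": "NYC",
    "boston": "BOS",
    "san francisco": "SFO",
}


def normalize(utterance: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^a-z\s]", "", utterance.lower())
    return " ".join(cleaned.split())


def match(utterance: str, grammar: dict):
    """Return the value for an in-grammar phrase, or None if it is not valid here."""
    return grammar.get(normalize(utterance))
```

An out-of-grammar answer returns None, which is the point at which a real application would typically re-prompt the caller.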

A final noteworthy feature is a simulator for running speech applications on a desktop PC. After typing a starting URL, we were able to simulate a complete telephone session using a standard PC microphone, while viewing a detailed trace log of events. In testing, this feature proved effective, though we missed the color-coding available in VoiceXML solutions like Voxeo, which can make slogging through a trace of phone and speech activity a little easier.

Microsoft's opening gambit in IVR systems is promising, though it's not likely that businesses that have already invested in traditional voice software will jump onboard with this first release. If you are new to voice development, however, and haven't already invested in VoiceXML, the component-based style of programming in Speech Server lets you tackle speech applications with ease. It also expands the kinds of voice applications you can create on the Windows platform.

Richard V. Dragan, a contributing editor of PC Magazine, has written over 250 articles and reviews for the magazine and other Ziff Davis publications since 1992. From 1994 to 1998 he authored a programming column for Computer Shopper. He has taught C++ and Windows programming at Columbia University since 1990, and Java since 1997.