The telephone is following television into high-definition territory, with HD voice (also called wideband audio) promising to make callers sound like they're practically in the same room.

Surprisingly, though, upstart Voice over Internet Protocol (VoIP) players, and not the long-standing telephone companies and wireless service providers, are the ones turning consumers on to HD voice. VoIP claims more than 900 million users, according to a report by ABI Research released last year.

Skype (which was formed in 2003 and acquired by Microsoft in May 2011 for $8.5 billion) was one of the first to tout the value of HD voice. Google also recognized its value, and in May 2010, it paid $68.2 million to acquire Global IP Solutions (GIPS), a Norwegian firm specializing in VoIP and video processing platforms.

"The Web is evolving quickly as a development platform, and real-time video and audio communication over the Internet are becoming important new tools for users," said Rian Liebenberg, engineering director at Google, at the time of the acquisition.

As consumer interest in VoIP and the HD voice capabilities it enables grows, the same technology is becoming the foundation for voice calls into and out of the contact center. IP telephony delivers lower costs, greater flexibility, and higher voice quality than traditional telephone services, which is why several analyst firms predict that the vast majority of contact centers could have the technology in place within two years.

Paul Stockford, an analyst at Saddletree Research, estimates that 61 percent of contact centers currently employ VoIP technology, while 13 percent more are evaluating it. Two percent have already earmarked it for purchase this year.

That means that the majority of the contact center industry is ready for HD voice. But is HD voice ready for the industry? Opinions are mixed in that regard, but no one is discounting the benefits that HD voice could bring to the contact center.

The Business Case

With the growing number of call centers using voice recognition technology to manage and route inbound calls, the increased clarity of HD voice will improve the interaction with interactive voice response (IVR) systems and their supporting speech recognition engines. "HD audio definitely improves speech recognition," notes R.J. Auburn, chief technology officer at Voxeo, a provider of IVR and VoIP platforms.

The same can be said for calls that go to contact center agents. "HD voice will eliminate a lot of the misunderstandings and make the conversation go faster. The agent wouldn't have to ask the caller to repeat himself as often," says Jim Machi, senior vice president of marketing at Dialogic, an advanced communications systems provider.

"With [standard-definition] audio, the customer often has to repeat himself, and that costs money in the contact center," adds Chris Thorson, director of product marketing at Polycom, a provider of enterprise-class communications systems.

In the same way, voice biometrics, dictation, transcription, call recording, and speech analytics engines can also yield more accurate results with HD voice.

That's because the improved sound quality makes it easier to distinguish between similar-sounding words, numbers, or, if the customer has to spell something out, letters.

Beyond that, other business benefits abound. The clarity and reduced interference with HD voice will allow mobile workers to make calls in noisy environments, such as airports or subway stations, where normal calls would simply not be feasible. The technology also enables crystal-clear voicemail playback, eliminating the need for the user to listen multiple times to understand a message. More effective conference calls are possible as well.

During conference calls over traditional telephone lines, participants often struggle to figure out who is talking or to understand speakers with accents. Misunderstandings are common, faint talkers are lost, and double-talk (when more than one person is speaking at the same time), static, and background noise can all render parts of the conversation unintelligible. Those problems are almost nonexistent with HD voice.

"Conferencing is a great use of HD audio," Auburn says. "The voice quality on conferencing can be painful, but wideband makes it much better."

Some estimates put the voice quality with HD voice at between six and eight times better than standard-definition audio.

"The savings can be huge," says Auburn. "A one percent efficiency gain in a 2,000-agent call center…could be a couple million dollars a year."
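The arithmetic behind an estimate like that can be sketched in a few lines. This is only an illustration: the $100,000 fully loaded annual cost per agent is a hypothetical figure chosen to land in the "couple million" range, not a number from Auburn or Voxeo, and the function name is invented for the example.

```python
def annual_savings(agents, cost_per_agent, efficiency_gain_percent):
    """Estimate yearly savings from an efficiency gain across a contact center.

    agents: number of agents on staff
    cost_per_agent: fully loaded yearly cost per agent (hypothetical input)
    efficiency_gain_percent: gain expressed in percent, e.g. 1 for one percent
    """
    total_labor_cost = agents * cost_per_agent
    return total_labor_cost * efficiency_gain_percent / 100


if __name__ == "__main__":
    # Assumed inputs: 2,000 agents (from the quote), $100,000 per agent (assumption).
    savings = annual_savings(agents=2_000, cost_per_agent=100_000,
                             efficiency_gain_percent=1)
    print(f"Estimated savings: ${savings:,.0f} per year")
```

A center with lower per-agent costs would see proportionally smaller savings, so the result is only as good as the cost figure fed into it.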

How It Works

Voice and video conferencing is still one of the most common uses of HD audio today. And while services like Skype and Google Voice have become widely popular in consumer circles, they've yet to take off as business applications.

But many technology experts maintain that HD voice will go viral—that once people have tried it and seen how clear and natural-sounding it is, they won't want to go back to standard-definition audio, in their business or personal lives. It's a trend that has already begun.

"One of the most unexpected practical bonuses has been the audio quality," said Laurie Douglass-Wilson, senior vice president of the American Chiropractic Association, which uses Audio Presence HD from Siemens Enterprise Communications across its operations. "Even staff in other states sound like they are in the same room with you. It's just that clear and that different from anything we've experienced before."

Douglass-Wilson is not alone in her favorable evaluation. In an interactive multimedia study, Siemens found that 97 percent of people who were exposed to both HD and standard audio found a noticeable difference in sound quality with HD, and 91 percent expressed a preference for HD voice capabilities on their desktop phones. A full 94 percent felt that improving voice quality with HD voice would have a positive impact on their businesses.

HD voice accomplishes this clarity by using digital signal processing technology to capture and transmit higher-quality sound through Session Initiation Protocol (SIP) trunks over a broadband Internet connection or on traditional data T1 circuits, which are significantly less expensive than voice lines.

Standard telephone calls carry audio in a frequency band of roughly 300 Hz to 3.4 kHz. HD voice, on the other hand, extends that band from about 50 Hz on the low end to 7 kHz or higher on the high end.

That's closer to the true range of human speech, which includes sounds well above 3.4 kHz. As a result, tones at the very low and high ends of the voice spectrum can be lost with traditional, standard-definition audio.

Because HD voice works with a wider frequency range, more of the speech signal can be carried in a single channel, with the codec stripping out redundant information. The wider range in turn makes speech clearer and crisper, capturing the natural inflections in voices that often fall above or below the traditional telephone band.
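To put numbers on that comparison, the sketch below computes the width of each band and the minimum sampling rate implied by the Nyquist criterion, which calls for sampling at no less than twice the highest frequency to be captured. The band edges are the figures quoted above, with 7 kHz taken as the wideband ceiling. The function names are illustrative, and the 8 kHz and 16 kHz rates mentioned in the comments are the rates telephony codecs commonly use rather than figures from this article.

```python
# Band edges in Hz, taken from the figures quoted in the text.
NARROWBAND = (300, 3_400)   # standard-definition telephone audio
WIDEBAND = (50, 7_000)      # HD voice, using 7 kHz as the upper edge


def bandwidth_hz(band):
    """Width of the band: highest frequency minus lowest."""
    low, high = band
    return high - low


def min_sample_rate_hz(band):
    """Nyquist criterion: sample at least twice the highest frequency."""
    _, high = band
    return 2 * high


if __name__ == "__main__":
    for name, band in (("Narrowband", NARROWBAND), ("Wideband", WIDEBAND)):
        print(f"{name}: {bandwidth_hz(band)} Hz wide, "
              f"needs at least {min_sample_rate_hz(band)} samples per second")
    # In practice, narrowband codecs commonly sample at 8 kHz and
    # wideband codecs at 16 kHz, leaving headroom above these minimums.
    ratio = bandwidth_hz(WIDEBAND) / bandwidth_hz(NARROWBAND)
    print(f"Wideband carries about {ratio:.1f} times the bandwidth")
```

By this measure the wideband channel carries a little more than twice the audio bandwidth of a standard call.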

Such voice capabilities work with a dozen or so codecs (pieces of computer code that convert sound into compressed digital bits, and back again, for use by computers and networks). These codecs vary in many ways, including sampling rate, bit rate, computational needs, required memory, latency, and resilience. Some are open-source and free, but in most cases, the codecs used to send voice communications from one endpoint to another have long been, and continue to be, a proprietary battleground. Some of the leading codecs right now are G.722, G.722.2 (also called Adaptive Multi-Rate Wideband, or AMR-WB), RT Audio, Speex, EVRC, MPEG-AAC, and SILK (which is used by Skype).

The existence of these different codecs has created problems for the voice communications industry. To have a successful HD voice call, both parties need to operate on the same codec. If the two sides use different HD codecs, either one side's audio has to be transcoded (translated) into the other codec, or both have to shift to a mutually agreeable codec. Coming to some kind of agreement on that issue hasn't been easy.
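The "mutually agreeable codec" option can be sketched as a simple preference match. This is a simplified illustration and not how any particular vendor or signaling protocol implements negotiation. The function name, the set of codecs treated as HD, and the assumption that every endpoint can fall back to the standard-definition G.711 codec are all choices made for the example.

```python
# Assumption for this sketch: every endpoint can fall back to G.711,
# the long-standing standard-definition telephony codec.
FALLBACK_CODEC = "G.711"

# Simplification: treat these codec names as wideband-capable.
HD_CODECS = {"G.722", "G.722.2", "SILK", "Speex"}


def negotiate_codec(caller_preferences, callee_supported):
    """Pick the caller's most-preferred codec that the callee also supports.

    Returns a (codec_name, is_hd) tuple. If the two sides share no codec,
    the call drops to the standard-definition fallback.
    """
    callee_set = set(callee_supported)
    for codec in caller_preferences:  # ordered from most to least preferred
        if codec in callee_set:
            return codec, codec in HD_CODECS
    return FALLBACK_CODEC, False


if __name__ == "__main__":
    # The caller prefers SILK, but the callee only shares G.722 and G.711.
    print(negotiate_codec(["SILK", "G.722", "G.711"], ["G.722", "G.711"]))
    # No HD codec in common, so the call falls back to standard definition.
    print(negotiate_codec(["SILK"], ["G.722.2"]))
```

When no HD codec is shared, the alternative to falling back is transcoding at a gateway between the two networks, which adds processing and delay.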

In fact, most deployments up to this time have been limited to calls between users on the same wireless network, creating silos of HD voice capability.

"There are not a lot of connections between networks and carriers right now," Auburn says. "That's kind of how SMS started. You couldn't exchange messages with other networks when that first started either."

The telecommunications industry overcame that obstacle, and today Verizon subscribers can send text messages to AT&T subscribers and vice versa.

Optimism is high that the telecom industry will come together in a similar fashion regarding HD voice. Efforts are already under way, with the XConnect High-Definition Voice Peering Federation pulling together operators to work on the necessary interoperability to start breaking down the silos. And in Europe, where HD voice is more widely available than in the United States, the telecommunications industry is moving toward adopting the G.722.2 codec as the de facto standard.

A Standard Upgrade

As business telephone systems have adopted VoIP technology, support for HD voice has followed. Telephone equipment from all the major manufacturers now incorporates varying degrees of HD voice components. Likewise, suppliers of integrated circuits for telephony equipment now include wideband audio in their feature portfolios.

Any call center system bought within the past five years is likely to have HD voice capabilities built in, according to Dialogic's Machi. "And as you buy a new phone system, it will be included," he states.

That means that many contact centers are already HD voice-enabled. For those that aren't, making the move to HD voice shouldn't be too much of a problem.

"Superior audio quality does not have to be an expensive solution that is only available in high-end or executive phones," said Rick Puskar, senior vice president of portfolio management at Siemens, in a statement.

"For most companies, we're not talking about a wholesale rip-and-replace of all equipment," says Mark Mortensen, principal analyst at Analysys Mason, a firm that specializes in telecommunications technology. "It's probably more like some minor equipment upgrades."

Upgrading to HD voice can become more difficult, and consume more time and budget, when it comes to peripheral equipment. Each device connected to the network (not just the phone lines and PBXs but also the headsets, handsets, microphones, and speakers) must support HD voice and the relevant codecs.

"You have to pay for each device that is capable of wideband," Auburn adds. "It does add some incremental costs per agent, but it's not a huge cost. You can do a Web-based client with a relatively small investment."

That's largely because of how system upgrades are budgeted and sold. "Very rarely do companies do an upgrade just for [HD voice]," Polycom's Thorson points out. "Most often, as you do a routine system upgrade, you get it free as part of the upgrade."

At Dialogic, for example, "we're not charging more for HD voice," Machi says. "It's being built right into the product that we're shipping right now."

The same can be said of Siemens, which includes Audio Presence HD with every OpenStage IP phone model as well as its mobile, softphone, and voice conferencing applications, at no additional cost.

Customer Challenge

The challenge with HD audio, though, is that all ends of the phone conversation have to support it for any real benefit to be achieved. And right now, especially in the United States, chances are good that the customer calling into a contact center is not connected to an HD voice line. Currently, no U.S. consumer landline or cellular network is HD voice-capable. In April, Sprint announced plans for a nationwide HD voice rollout beginning later this year. Verizon, AT&T, and T-Mobile have all laid out plans to add HD voice to their wireless networks by 2013.

HD voice over landlines isn't even being discussed just yet, and the U.S. cable industry has yet to announce a residential HD voice deployment, though Comcast has come close. HD voice is part of Comcast's managed business cloud service, and in June, Comcast began offering Skype to its Xfinity TV customers in Houston. Comcast's efforts aside, "if the consumer is calling from home, it's probably not going to be in HD," Thorson says. "That's the biggest challenge [for the contact center]."

That's in stark contrast to Europe, which is largely ahead of the rest of the world in its adoption of HD voice telephony. Globally at the end of 2011, there were 39 commercial mobile HD voice networks in 31 countries, mostly in Europe, according to data released in February by the Global Mobile Suppliers Association. France Telecom's Orange was the first to support HD voice, rolling it out to some subscribers as early as 2009, but at least 10 carriers now offer HD services in Europe.

In Africa and the Middle East, six carriers reportedly offer the service.

The Global Mobile Suppliers Association also reported that the device ecosystem is growing; there were 60 HD voice-enabled mobile devices on the market in 2011.

But until the big U.S. carriers, like Verizon and AT&T, support HD voice across their entire networks, adoption in America is likely to be fragmented at best.

"Some carriers are looking at it, but there aren't a lot of deployments going on," Mortensen laments. "HD sounds better, but the networks people are using to access it have to be connected. We need the Verizons to upgrade their networks first."

That carrier adoption is so slow is surprising, especially considering that research conducted by Skype revealed that average person-to-person call duration can grow from 21 minutes for standard-definition calls to 32 minutes for HD voice calls. That works out to roughly 50 percent more talk time per call, which could mean a sizable increase in usage revenue for the carriers. Because the sound quality is so much better, people don't mind talking longer; they're not in as much of a rush to get off the phone.

Also slowing the deployment of HD voice is what many see as a lack of urgency. "There's no Y2K urgency or government mandate like there was with HD TV," Thorson says, referring to the U.S. government's 2009 mandate that full-power TV stations switch from analog to digital-only broadcasting.

HD voice is "a low priority right now," even among contact center operators, Mortensen continues. "There are a lot of other more important things to do to improve the customer experience and create a better customer interface before putting in HD voice."

Denise Culver, a research analyst at Heavy Reading, an independent research firm specializing in emerging telecom trends, agrees. "It generally seems that HD voice is a feature that some vendors offer, but I'm not hearing a lot about it being a huge driver one way or the other," she says.

But that will only continue for so long. Research by BT found that customers would welcome HD voice calling and would likely be willing to pay a premium for it. Additionally, 72 percent of users want their next mobile device to be compatible with HD voice, and 50 percent of VoIP customers would be willing to change operators to get better-quality voice services.

Among the businesses that are leading the transition to HD voice are telephone companies and cable TV providers, according to Auburn. "They have a distinct advantage in that they can control the networks and devices," he says.

Government, law enforcement, military, and security are other verticals that have already embraced HD voice, Mortensen says, particularly because making proper voice identifications is essential to those fields.

Auburn sees great potential, though, for HD voice to have a real impact in other businesses. "Any vertical with live communications, conferencing, and Web collaboration should get in on it," he says. "Also, high-touch verticals. We're also going to see it among verticals where they have a lot of mobile apps, like banking and travel."

And though there is some disagreement about the timeframe, technology experts agree that the switch from standard-definition audio to HD voice is inevitable.