Applied Technology: HD Voice

It seems we live in an HD world - the prefix has already been applied to radio and TV. In most cases it's used to promote a better user experience, including a significant bump in quality. Why aren't we demanding the same for on-air phone calls? Phones have sounded the same for over 100 years. Our expectations of quality in audio and video have risen, yet phone calls still sound thin and fatiguing, especially when put on the radio. What other technologies have taken so long to evolve?

Telephones remain low-fidelity because of choices made by telco network engineers decades ago. Figure 1 shows the approximate audio bandwidth from an 8kHz sampled voice call. When listening to a call on a tiny handset, an increase in audio bandwidth doesn't sound very dramatic. This is why efforts at HD Voice - the term used for the process of delivering wideband phone calls - have lagged. But when these calls are broadcast on the radio, the lack of bandwidth becomes obvious and grating. And we finally have a tool to fix that.

Figure 1

The tool is the now-ubiquitous smartphone. Since telephony has made the progression to voice-over-IP, it's finally possible to leave the old-fashioned techniques of transmitting digital voice behind and move into the 21st century. As shown in Figure 2, once you move to an architecture where your entire phone call exists in the IP domain, it's no longer necessary to stick with traditional narrowband voice codecs (the algorithms used to digitize and compress the voice audio). As long as the hardware on each end supports it, an HD Voice codec can be negotiated and used. And with more studios moving to VoIP for their call-in lines (either by choice or necessity), these end-to-end VoIP calls are becoming much more likely.

Figure 2

To qualify as HD Voice, the codec used must sample the voice at a minimum of 16kHz, at least doubling the audio bandwidth available. In addition to adding in high frequencies, HD Voice restores the low end (below 300Hz) that gets filtered by most phone networks. The result is a crisper, more intelligible conversation with a lot more "punch" and presence, and one that is much more listenable to a radio audience.
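The arithmetic behind that doubling can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the figure it computes is the theoretical ceiling (half the sampling rate). The usable audio bandwidth of a real codec is somewhat narrower because of filtering.

```python
# Illustrative only: constant and function names are hypothetical.
NARROWBAND_RATE_HZ = 8_000   # traditional telephone sampling rate
HD_MIN_RATE_HZ = 16_000      # minimum sampling rate for HD Voice, per the text


def nyquist_limit_hz(sample_rate_hz: int) -> float:
    """Return the highest audio frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def is_hd_voice(sample_rate_hz: int) -> bool:
    """True when the sampling rate meets the 16 kHz HD Voice minimum."""
    return sample_rate_hz >= HD_MIN_RATE_HZ


if __name__ == "__main__":
    for rate in (NARROWBAND_RATE_HZ, HD_MIN_RATE_HZ):
        label = "HD Voice" if is_hd_voice(rate) else "narrowband"
        print(f"{rate} Hz sampling: ceiling {nyquist_limit_hz(rate):.0f} Hz ({label})")
```

An 8 kHz call tops out at a 4 kHz ceiling, while a 16 kHz call reaches 8 kHz, which is where the "at least doubling" comes from.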

In the VoIP world, a range of HD Voice codecs has emerged that delivers good quality at low network bit rates. Algorithms like SILK (used by Skype), G.722.1, and iSAC compete in this space. But the industry has adopted a lowest common denominator of G.722, the old familiar friend of radio from ISDN days. Its strengths are ease of implementation, low delay, and lack of current patent claims.
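To see why a lowest common denominator matters, here is a simplified sketch of codec negotiation in Python. It is a toy model, not how any particular product works: real SIP endpoints exchange codec lists through a more involved offer/answer process, the codec labels are informal rather than exact protocol names, and the G.711 narrowband fallback is an assumption that the article does not name.

```python
from typing import Optional, Sequence, Set

# Hypothetical labels for this sketch; real protocol payload names differ in detail.
WIDEBAND_CODECS = {"SILK", "G.722.1", "iSAC", "G.722"}
NARROWBAND_FALLBACK = "G.711"  # assumed legacy codec, not named in the article


def negotiate(offer: Sequence[str], supported: Set[str]) -> Optional[str]:
    """Pick the first codec in the caller's preference order that the studio supports."""
    for codec in offer:
        if codec in supported:
            return codec
    return None  # no codec in common: the call cannot be set up


def is_hd(codec: Optional[str]) -> bool:
    """True when the negotiated codec is one of the wideband codecs listed above."""
    return codec in WIDEBAND_CODECS


if __name__ == "__main__":
    studio = {"G.722", NARROWBAND_FALLBACK}  # hypothetical studio system
    app_offer = ["SILK", "G.722", NARROWBAND_FALLBACK]
    legacy_offer = [NARROWBAND_FALLBACK]
    for name, offer in (("app", app_offer), ("legacy phone", legacy_offer)):
        chosen = negotiate(offer, studio)
        print(f"{name}: {chosen} ({'HD' if is_hd(chosen) else 'not HD'})")
```

In this model an app that prefers SILK still lands on G.722 when that is the only wideband codec both ends share, which is the practical value of a widely implemented common codec.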

The mobile phone industry also supports HD Voice today, but the tech has moved in a different direction. Many modern smartphones can negotiate HD Voice within the phone's main voice channel, replacing the thin-sounding mobile algorithms of the past. This is an advantage because the codec can be engineered to degrade gracefully under adverse signal conditions, much like narrowband codecs do (rather than producing the dropouts you might experience on data channels). But there are significant limitations.

The first is the codec. The mobile phone industry has chosen AMR-WB (also known as G.722.2) as its primary HD codec, an algorithm completely unsupported in most VoIP systems. This incompatibility hardly matters, though, because the mobile phone networks don't provide any way to bridge these calls over to wired data networks. This means HD Voice is only available on calls that stay entirely within the provider's voice network, and all calls that leave the network (either to the PSTN or to competing wireless providers) are reduced to narrowband. So carrier-grade HD Voice has very limited application to radio broadcasting for now.
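One way to picture this limitation is to treat a call as a chain of legs, where the narrowest leg sets the quality of the whole call. The Python sketch below is a deliberately simplified model with hypothetical names. It considers only sampling rate and ignores other effects, such as transcoding between incompatible wideband codecs.

```python
from typing import Sequence

HD_MIN_RATE_HZ = 16_000  # minimum sampling rate for HD Voice, per the article


def end_to_end_rate_hz(leg_rates_hz: Sequence[int]) -> int:
    """Return the effective sampling rate of a call: the lowest rate of any leg."""
    if not leg_rates_hz:
        raise ValueError("a call needs at least one leg")
    return min(leg_rates_hz)


def call_is_hd(leg_rates_hz: Sequence[int]) -> bool:
    """True only when every leg of the call carries at least 16 kHz audio."""
    return end_to_end_rate_hz(leg_rates_hz) >= HD_MIN_RATE_HZ


if __name__ == "__main__":
    in_network = [16_000, 16_000]            # both ends on the same carrier
    via_pstn = [16_000, 8_000, 16_000]       # one narrowband leg in the middle
    print("in-network call is HD:", call_is_hd(in_network))
    print("call crossing the PSTN is HD:", call_is_hd(via_pstn))
```

A single 8 kHz leg anywhere in the path is enough to make the whole call narrowband, no matter how capable the phones at each end are.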

But with the advent of 3G and 4G service to smartphones, a more compatible version of HD Voice can be provided. The mobile industry refers to this as "Over the Top" (OTT) voice, where a VoIP app uses the data channel to deliver calls. These apps can be built to be completely compatible with wired VoIP systems, and thereby deliver HD Voice to studio-based call-in systems. Most of these apps are based on the common SIP standard, so they work with a wide range of VoIP systems. OTT options also include the popular Skype app, which provides an HD experience on voice calls when connected to another Skype user.

Comrex STAC VIP

As mentioned, so far HD Voice has not generated excitement from general telephone users. But radio is uniquely suited not only to leverage the benefits of HD Voice (with higher quality programming), but to promote the adoption of the tech on-air. Shows with call-ins can start by equipping scheduled guests (pundits, politicians, athletes) with HD Voice apps to do their calls to the studio. Eventually, stations can enlist listeners to use the apps by offering preferred access to callers who use them. And it all moves toward a worthy goal, which is the eventual banishment of telephone audio from the radio dial.

Of course, proper equipment is required on the studio side to make this happen. An easy way to integrate HD Voice into a studio is via Comrex STAC-VIP, a full-featured talkshow system that works exclusively on VoIP phone lines. STAC-VIP can handle multiple calls from a variety of users, including those on "old-fashioned" phones, HD Voice apps, and Skype. All the different types of callers can be processed and conferenced together like normal phone calls. The Web-based call management system is delivered to a browser from the system's internal Web server, and gives a visual indication of which incoming calls are HD before they are answered.

While STAC-VIP can accept HD Voice calls from most VoIP softphone apps (and VoIP hardphones for that matter), there's also a companion app called VIP-QC (Quick Connect). Available on both iTunes and Google Play, the app is designed for simplicity: setup and station choice are streamlined so it can be given to anybody. Once configured, one button push connects to the STAC-VIP in HD Voice mode.

Radio continues to use the call-in show as the easiest way to connect with listeners and create compelling, locally oriented programming. It's time to turn that programming into something that's actually pleasant to listen to, and to keep listeners from changing stations out of annoyance with telephone audio quality. Good stations already create great call-in content. HD Voice now gives radio the tools to make those shows sound good from a technical standpoint as well.