I've written before about why the Opus codec is so incredibly important if we want to truly deliver a richer and better communications experience than we've had with the traditional PSTN, and so it is great to see this support coming to Linphone. Linphone is certainly not the first SIP softphone to support Opus - there are a number of others out there, including Jitsi and CounterPath's Bria (and X-Lite) - but it's definitely great to see another softphone added to the mix. Hopefully we'll also see this Opus support move to the desktop versions of Linphone (for Windows, OS X and Linux) as well.

Want to learn more about the Opus codec and why it is so important? As I mentioned at the end of my last post about why Opus matters, there will be a special presentation about Opus as part of the IETF 87 Technical Plenary happening in about 2 hours, starting at around 17:45-18:00 in Berlin, Germany (Central European Summer Time, UTC+2, 6 hours ahead of US Eastern time).

What makes the Opus codec so interesting? Why is there such a buzz about Opus right now? If you are not in telecom or doing anything with audio, why should you even remotely care about Opus?

In a word...

Innovation!

And because Opus has the potential to let us communicate with each other across the Internet with a richer and more natural sound. You will be able to hear people or music or presenters with much more clarity and more like you are right there with them.

Opus can help build a better user experience across the Internet.

You see, the reality is that today "real-time communication" using voice and video is increasingly being based on top of the Internet Protocol (IP), whether that communication is happening across the actual Internet or whether it is happening within private networks. If you've used Skype, Google+ Hangouts, any voice-over-IP (VoIP) softphones, any of the new WebRTC apps or any of the mobile smartphone apps that do voice or video, you've already been using IP-based real-time communication.

Dropping The Shackles Of The Legacy PSTN

Part of the beauty of the move to IP is that we no longer have to worry about the constraints imposed upon telecom by the legacy Public Switched Telephone Network (PSTN). Chief among those constraints is the requirement to use only part of the sound frequencies we can hear. You all know the "sound" of the telephone - and you hear it in any movie or TV show when someone is using the phone. It's that certain "sound" that we are all used to... that's what the "phone" sounds like.

In technical terms, we call this "narrowband" audio and it has a frequency range of only 300-3400 Hz.

There are historical reasons for this limitation in telecom, but moving to IP-based communications removes those limits. With VoIP we can use what is called "wideband" audio to have a full rich sound to our voice or video call.
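A rough way to see the difference is through sample rates. Narrowband telephony is classically sampled at 8 kHz and wideband at 16 kHz, and the Nyquist theorem says a sampled signal can only represent frequencies up to half its sample rate. This little sketch (the sample rates are the classic conventions, not anything specific to one product) shows the ceiling each one imposes:

```python
def max_audio_bandwidth_hz(sample_rate_hz):
    """Nyquist limit: a sampled signal can carry frequencies up to
    half its sample rate."""
    return sample_rate_hz / 2

# Narrowband telephony samples at 8 kHz -> at most 4 kHz of audio
# (and filtering narrows that further, to the familiar 300-3400 Hz).
print(max_audio_bandwidth_hz(8000))   # 4000.0

# Wideband audio samples at 16 kHz -> up to 8 kHz of audio,
# which is where the "right there in the room" richness comes from.
print(max_audio_bandwidth_hz(16000))  # 8000.0
```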

Have you had a really good Skype connection with someone where it sounded like they were almost right there in the room with you?

That is wideband audio.

The Codec Problem

Now, for voice or video over IP to work, you need to use something called a "codec" to translate the sound of your voice to digital bits and carry them across the network (and to do the opposite for whomever you are speaking with). There are MANY audio codecs out there and they come in all sorts of flavors and with all different kinds of capabilities. The problem has been that there hasn't been a codec that:

is optimized for interactive Internet applications;

is published by a recognized standards organization; and

can be widely implemented and easily distributed at little or no cost.

In particular that last point about the cost of licensing, especially for wideband codecs, often caused developers to shy away from giving us the rich voice quality that we can now have with IP. Or, in the case of companies like Skype or Google, they went out and bought companies who created wideband codecs so that they could use those codecs in their products. (See my story from 2010 about Google buying GIPS.)

Now there are free codecs out there that developers can use. For narrowband, there has been the ubiquitous G.711 which provides an IP version of "PSTN audio". There have been many others, including notably Speex.
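For the curious, the heart of G.711 is "companding": it squeezes each audio sample onto a logarithmic scale so that quiet sounds keep more detail, then stores the result in 8 bits at 8,000 samples per second (hence G.711's 64 kbit/s). Here is a minimal sketch of the continuous mu-law curve used in North America and Japan - just the companding math, not the full quantized 8-bit encoder:

```python
import math

MU = 255  # mu-law parameter used by G.711 in North America and Japan

def mulaw_compress(x):
    """Continuous mu-law companding of a sample x in [-1, 1].
    Quiet samples are boosted relative to loud ones, preserving
    detail at low levels when the result is later quantized."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# A quiet sample at 10% of full scale maps to roughly 59% of the
# companded scale - that's the logarithmic boost in action.
print(round(mulaw_compress(0.1), 3))  # 0.591
```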

But the struggle has been that there hasn't been a widely accepted "G.711 for wideband" equivalent that developers can just bake into their products and start using. Instead there have been a number of different, incompatible codecs used in different products.

Enter Opus...

So to address these points, back in 2010, engineers within the IETF got together and formed the CODEC Working Group to come up with a codec that could meet these requirements and become the ubiquitous wideband codec used across the Internet. Skype was involved early on through contributing their SILK codec. The folks at Xiph.org contributed their CELT codec. People from many other companies got involved and there were huge technical discussions on the mailing lists and at IETF meetings.

Opus is a totally open, royalty-free, highly versatile audio codec. Opus is unmatched for interactive speech and music transmission over the Internet, but is also intended for storage and streaming applications.

So Why Does Opus Matter?

Opus matters because it lets developers focus on creating a high quality user experience and not having to worry about codec incompatibilities and licensing issues.

Opus matters because it lets developers easily create applications with high quality audio. They can just start using available libraries and communicating with other applications and devices using a common wideband codec.

Opus matters because it can work in very low-bandwidth environments enabling real-time communications across Internet connections that might not previously have supported such communications. As we start to get more Internet connectivity out to the 5 billion people not yet on the Internet, the ability to work over different kinds of connections is critical.
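To get a feel for how little bandwidth that can mean: Opus supports bitrates from roughly 6 kbit/s up to 510 kbit/s. A quick back-of-the-envelope sketch (the 12 kbit/s and 20 ms figures below are just illustrative choices, and the per-packet IP/UDP/RTP header overhead of roughly 40 bytes is on top of this):

```python
def payload_bytes_per_packet(bitrate_bps, frame_ms):
    """Approximate codec payload per packet at a constant bitrate:
    bits per second * seconds per frame, converted to bytes."""
    return bitrate_bps * frame_ms / 1000 / 8

# At 12 kbit/s with 20 ms frames, each packet carries only ~30 bytes
# of codec payload - small enough for very constrained connections.
print(payload_bytes_per_packet(12000, 20))  # 30.0

# Compare with G.711's fixed 64 kbit/s: 160 bytes per 20 ms packet.
print(payload_bytes_per_packet(64000, 20))  # 160.0
```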

Opus matters because it can help foster innovation in applications and the user experience. Opus is the default audio codec for WebRTC, and so all the zillion new WebRTC-based apps and startups are already beginning with a far superior audio experience than we've had before.

Opus matters because it will enable even more ways that we can connect with family members or friends and have the experience of being "right there". It can help musicians collaborate better across the Internet. It can help podcasters and journalists deliver higher quality interviews across the Internet. It can, in the best conditions, give us that rich audio experience we get when we are right with someone - even though we may be thousands of miles away.

Opus can help us deliver on the potential of the Internet to create more powerful user experiences and to help us better communicate.

THAT is why Opus matters.

Learn More At Monday's IETF 87 Technical Plenary

To understand more about the current status of Opus, who is using it and where it is going, the IETF 87 Technical Plenary on this coming Monday evening in Berlin, Germany, will have a special segment focused on Opus that will include a number of people involved with the Opus work. The agenda for the session can be found at:

It is happening from 17:40-19:40 Berlin time (Central European Summer Time, currently UTC+2 and 6 hours ahead of where I live in US Eastern time). If you can't be there in person, there are several remote options:

If you are unable to watch the meeting in real time it will be archived for later viewing.

The first option above - listening to the session using the Opus codec (and WebRTC!) - is a very cool one. The panel also includes people who have actually implemented Opus, including people from Google and also Emil Ivov from the Jitsi softphone. Their insight into what they did will be great to hear.

If you are a developer of communications apps or services (or a product manager), you can look at how to incorporate Opus into your application or service. There is documentation and software available to help with the process, and many people are out there who can help.

If you are a user of IP-based communications apps or services, ask the company or vendor behind those services when they will support Opus. See if you can get it on their radar as something to implement.

And regardless of what you do with audio, let people know that this new way of communicating exists - help spread the word about Opus - let people know that audio across the Internet can be even better than it has been to date.

As you can tell, I'm excited about the potential - and very much looking forward to seeing what happens as Opus gets more widely deployed.

What do you think? If you are a telecom developer, or a vendor of such services, have you implemented Opus already? Are you thinking about it? (and if not, why not?)

This is an excellent step forward, even with the caveat that it only works on T-Mobile's 4G network and only with specific smartphones. As more and more people get used to the richer and better quality of wideband audio, expectations will rise and continue to push the ongoing migration of all telecom over to IP-based solutions.

I'll be talking about why you should care about wideband audio, what you can do with it, and how you can get started. Here's the abstract:

What is “wideband” or “HD” audio? What are the benefits of wideband audio? What are the advantages and disadvantages of using wideband? With all the buzz out there, what does wideband or “HD” audio really do for you in a business setting?

In this Developer Jam Session, Dan York, Director of Conversations at Voxeo, will explain the basics of wideband audio, discuss the various versions of wideband audio deployed in the industry, and explain why it is important in terms of business value. Additionally, he will talk about how wideband audio is implemented in Voxeo’s Prophecy and PRISM products.

To me, wideband audio is one of the truly compelling advantages of voice-over-IP and I'm looking forward to sharing that passion with the attendees in a few hours... why not join us and listen in live?

WHY might you want to do this? Well, primarily if you want better audio quality when using VoIP on your iPad... and if you are like me and always find Bluetooth headsets sucking up too much battery power, it's nice to have a wired option.

Next up, figure out what else can be plugged into that USB connector... ;-)

To put this in more normal language: you know how good a Skype conversation can sound... how rich the audio can be... how it can sound like the person on the other end is right there in the room with you? The quality of that audio connection comes from Skype using a "wideband codec" to send the audio from one end to the other. Up until 2007, GIPS provided the primary wideband codec that Skype used.

At some point in there, Skype realized that, particularly since it was giving away a free product, it needed to control more of its technology stack and stop paying licensing fees to GIPS, and so it bought a company, Camino Networks, that had its own wideband audio codec. Skype then moved away from using GIPS and used its own codec technology.

GOOGLE OWNS ITS STACK

This would seem to be the exact same move that Google is making. Through their purchase of Gizmo last year, Google acquired client-side technology and SIP technology for the "control channel" side of the communications path. With their 2007 purchase of GrandCentral, Google acquired a SIP-based backend infrastructure (which evolved into Google Voice). They have also had their GoogleTalk product out for some time as well.

What they haven't had until now is control over the "media channel".

IP COMMUNICATIONS 101

To understand why this matters, let's back up and review "IP Communications 101". When you have two "endpoints" (softphones, "hard" phones, applications, whatever), they communicate over IP using two different channels.

The first channel is the "control channel" where commands are passed such as "I want to invite you to a call with me". These days that control channel is increasingly using the Session Initiation Protocol (SIP) although many other protocols exist (both proprietary and standards-based). The control channel typically passes through one or more "proxy servers" (in SIP lingo) that may be IP-PBXs, call servers, hosted servers, "clouds", etc.

The second channel is the "media channel" where the actual audio or video is sent between the endpoints. Depending upon the exact configuration, this media channel may go directly between the two endpoints, as pictured. Or it may go through media proxy servers, or through Session Border Controllers (SBCs). It is typically transmitted using the Real-time Transport Protocol (RTP), but inside the RTP stream the actual audio or video is encoded using a "codec".
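To make the split between the two channels concrete, here is a minimal sketch of how the control channel ties into the media channel: the SIP signaling carries a small text description (SDP) listing which codecs each endpoint can use, and the media channel then carries RTP packets encoded with the agreed codec. The SDP snippet and the payload type 111 below are purely illustrative - payload types for codecs like Opus are assigned dynamically, not fixed:

```python
# A hypothetical SDP audio section such as an endpoint might offer.
# Payload type 111 for Opus is a common convention, not a standard value.
sdp_audio = """\
m=audio 49170 RTP/AVP 111 0
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
"""

def find_codec_payload_type(sdp, codec_name):
    """Scan the a=rtpmap lines of an SDP body for the named codec
    and return the RTP payload type that the media channel will use."""
    for line in sdp.splitlines():
        if line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            if encoding.lower().startswith(codec_name.lower() + "/"):
                return int(pt)
    return None  # codec not offered

print(find_codec_payload_type(sdp_audio, "opus"))  # 111
```

Once both sides agree on a payload type, every RTP packet in the media channel is marked with it - which is why the two channels can take completely different paths through the network and still stay in sync about the codec.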

The point is that it is separate and distinct from the control channel.

The issue is that while the control channel is increasingly built around the open standard of SIP, which anyone can implement, the "codecs" used in sending media from one endpoint to another have long been a proprietary battleground, particularly with regard to wideband (or "HD audio" as some call it). Yes, there have been and are standards, but usually there have been intellectual property or licensing issues. Now, there is work within the IETF to create a standard wideband codec (and as I wrote earlier, Skype is involved with this effort) but that may take some time and the outcome is not known right now.

The easiest way to solve all these issues is to own your own codec. This is what Skype did back in 2007... and what Google seems to be doing now.

CONFERENCING

It's also worth noting that GIPS has conferencing engines for both audio and video... and recent events highlight increasing interest in video conferencing:

Put some of the pieces together like this and we could indeed see renewed interest in video conferencing, particularly from mobile devices. (Or perhaps Google might add audio conference calling into Google Voice.)

TO WHAT END?

The question of course is what will Google do now. Naturally neither the Google news release nor the GIPS "letter to customers" says anything. Typical Google style is for new acquisitions to go silent for some extended period of time and then to pop out in some new offering. In this case, of course, GIPS is providing underlying technology that Google could use in many of its other offerings.

Some of the speculation (and it is only that) I've seen so far is that Google could be taking on:

Skype - As mentioned earlier, with Gizmo and other acquisitions, Google does have the tools to try to create a competitor to Skype.

Apple - Naturally with the Android/iPhone war going on, Google could use this technology to offer new services on the Android platform.

Microsoft - With their now-rebranded "Communications Server", Microsoft is challenging the incumbents in the enterprise communication space... perhaps Google will put some of the pieces together to start doing something there.

Or perhaps Google will open source some of the technology to further try to disrupt the industry... for instance, will Google offer one of the GIPS codecs to the IETF CODEC working group as an open standard?

Time will tell.... in the meantime congrats to Google and GIPS on this acquisition.

One interesting development in the world of Skype last week which I've seen little mention of is the fact that the folks at Highspeedconferencing.com have rolled out a Skype Extra that lets Skype users have large-scale conference calls. Like most such large conference bridges, they have moderation/"hand-raising", call recording, email invites, etc.
However, the key point to me is that their conferencing bridge uses the wideband audio supported by Skype! That is the key. You now have conference calling with audio quality that is far better than the PSTN! This is where we start to get into the space where VoIP can offer a truly different - and better - user experience than traditional telephony. The Skype blog touches on this:

HighSpeed Conferencing is the only audio conferencing service available to Skype users that offers high-definition (HD) voice quality. There’s no degradation of audio quality, no matter how many Skype users participate in a conference call. And with unlimited usage during a conference call, you can talk as much as you want. Some people stay on the conference bridge all day.

I've not yet used the service, as right now I'm not involved with large conference calls, but when I am I will very definitely check it out. (On a tangent, I wonder if Polycom has a trademark on "HD Voice"?)
Have any of you tried it out?