EVS — Enabling the A in AQM

AQM stands for Audio Quality Measurement. Audio means voice and music, but until now there hasn't been much music in plain old telephony systems. That might change with the new audio codec called EVS, Enhanced Voice Services.

Among other things, this new codec promises to let you transmit music with good quality, for example from a concert.

One use case is to hold up your phone at the concert and let your friend listen in to the music.

This is probably not a killer application and of course not the driving force behind EVS...

The driving force is to provide better quality than OTT services. As one example, to compete for the ear of the consumer, EVS needs to spank the Opus codec used in Skype.

Simulation tests are being performed and here are two examples from Ericsson and Qualcomm.

But simulations are never enough. We see a large demand for comparisons between the service provided by operators and the OTT services.

We have test activities ongoing with Ericsson and VFE for comparing Skype to AMR WB. I guess this will be even more important when EVS is released.

Audio Quality — a time journey

Here are two strong points for the EVS codec:

It is designed for Music, enabling a richer experience with better audio quality.

It is designed for packet-based voice, especially for VoLTE networks, and provides higher quality using no more capacity.

How about we listen to some music simulated on the common types of codecs? It is a small time journey from the early nineties to the future: HR → FR → EFR → AMR WB 12.65 → EVS 13.2 → EVS 24.4

For VoLTE, the jitter buffer and time scaling (called PWSOLA in this sketch) were not standardized. This means that device vendors could implement them differently, and thus the same radio environment would give different speech quality. With EVS all of the audio transmission is standardized.

Time scaling is used to increase the tolerance to jitter. When delay increases in the network, the audio can be played out at a slower pace so that the jitter buffer does not run empty. When the delay decreases, the playout speed is increased. You can listen to some jitter simulations with EVS that show how it can sound when the jitter is really bad.
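To make the idea concrete, here is a minimal sketch of how a receiver could pick a playout speed from the amount of audio waiting in the jitter buffer. It is not the algorithm standardized for EVS. The function name, the 60 ms target, the gain and the 20 % limit are all assumptions chosen for illustration, and the actual stretching or compressing of the audio (the PWSOLA part) is left out.

```python
def playout_rate(buffered_ms: float,
                 target_ms: float = 60.0,   # assumed target buffer depth
                 gain: float = 0.005,       # assumed speed change per ms of error
                 max_stretch: float = 0.2   # assumed limit: at most 20 % faster or slower
                 ) -> float:
    """Return a playout speed factor, where 1.0 means normal speed."""
    # Positive error: more audio is buffered than needed (delay went down),
    # so playout can speed up. Negative error: the buffer is draining
    # (delay went up), so playout slows down to avoid running empty.
    error_ms = buffered_ms - target_ms
    rate = 1.0 + gain * error_ms
    # Clamp the factor so the audio is never stretched or compressed too hard.
    return max(1.0 - max_stretch, min(1.0 + max_stretch, rate))


for level in (0.0, 40.0, 60.0, 80.0, 200.0):
    print(f"{level:6.1f} ms buffered -> playout speed x{playout_rate(level):.2f}")
```

A real implementation would also smooth the buffer measurement over time and prefer to change speed during pauses or stationary parts of the signal, where time scaling is least audible.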

Improved voice quality taking no more capacity

With each codec generation the audio bandwidth has basically doubled while the transmission bit rate has stayed about the same. The commonly used AMR NB 12.2 gives 3400 Hz, which sounds a bit like talking in a can. AMR WB gives up to 7000 Hz and is a big improvement, while EVS handles 14000 Hz, and stereo is planned. Even full-band audio reaching up to 20000 Hz can be used at 16.4 kbit/s.

AMR NB 12.2 kbit/s

AMR WB 12.65 kbit/s

EVS SWB 13.2 kbit/s
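The "doubled bandwidth at about the same bit rate" claim can be checked with some quick arithmetic on the three codec modes listed above. The snippet below is only a back-of-the-envelope sketch. The bit rates and bandwidths are the figures quoted in this article, and the helper names are made up for the example.

```python
# (name, bit rate in kbit/s, audio bandwidth in Hz), figures as quoted in the text
CODECS = [
    ("AMR NB", 12.2, 3400),
    ("AMR WB", 12.65, 7000),
    ("EVS SWB", 13.2, 14000),
]


def hz_per_kbit(bitrate_kbps: float, bandwidth_hz: float) -> float:
    """Audio bandwidth delivered per kbit/s of transmission rate."""
    return bandwidth_hz / bitrate_kbps


def generation_gains(codecs):
    """Ratio of Hz-per-kbit/s between each codec and its predecessor."""
    values = [hz_per_kbit(rate, bw) for _, rate, bw in codecs]
    return [later / earlier for earlier, later in zip(values, values[1:])]


for name, rate, bw in CODECS:
    print(f"{name:8s} {rate:6.2f} kbit/s {bw:6d} Hz "
          f"{hz_per_kbit(rate, bw):7.1f} Hz per kbit/s")

print("gain per generation:", [round(g, 2) for g in generation_gains(CODECS)])
```

Each step comes out at a little under a factor of two, because the bit rate creeps up slightly from one generation to the next.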

Availability

EVS was standardized as part of Rel 12, with work items in Rel 13, which means it's available now. If you scan the news you'll see operators announcing EVS support in their networks, e.g. Vodafone Germany and T-Mobile US, with more to come as operators feel comfortable moving this out of the labs into commercial networks.

For mobile operators and network vendors looking to deploy this new codec and optimize the user experience, it's important to secure test & measurement solutions that are capable of handling audio sample rates up to 48 kHz and generating KPIs such as MOS and SIT (Speech Interruption Time). This will allow a comparison between OTT, CS and emerging VoLTE services, ensuring a competitive and superior VoLTE service deployment.
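Why 48 kHz? By the sampling theorem, the sample rate has to be at least twice the highest audio frequency you want to capture. The sketch below applies that rule to the bandwidths mentioned earlier in this article. The list of candidate sample rates and the function name are assumptions made for the example.

```python
# Sample rates commonly used for narrowband, wideband, super-wideband and fullband audio
STANDARD_RATES_HZ = [8000, 16000, 32000, 48000]


def min_standard_rate(bandwidth_hz: float) -> int:
    """Smallest candidate sample rate whose Nyquist frequency covers the bandwidth."""
    for rate in STANDARD_RATES_HZ:
        if rate / 2 >= bandwidth_hz:
            return rate
    raise ValueError(f"no candidate rate covers {bandwidth_hz} Hz")


for label, bandwidth in [("AMR NB", 3400), ("AMR WB", 7000),
                         ("EVS SWB", 14000), ("EVS full band", 20000)]:
    print(f"{label:14s} {bandwidth:6d} Hz -> {min_standard_rate(bandwidth)} Hz sampling")
```

Test equipment limited to 16 kHz sampling would therefore miss everything EVS adds above wideband.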