Processing IP Media with MSML

By Martyn Davis

As we move into Next Generation Networks and the IMS, many of the fundamental building blocks of the network are changing, and having an IP core to the network brings more power and flexibility. Media processing is an essential part of telco services: for example, a conferencing service needs to mix audio streams together and feed back the right audio mix to each conference participant. Interactive Voice Response (IVR) is often an important part of a service; for example, mobile prepaid customers often use self-service systems to top up account balances. Media Servers can also be used to provide automated announcements, or even services such as Color Ringback Tone (popular in Asia today), where the caller hears music of the called party’s choice instead of the normal “ring ring” tones.

Media Gateways are long-established devices that can both terminate traditional trunk lines (such as T1/E1) and process the audio (media) streams. In NGN networks, or even in today’s transitional networks, the media is already in RTP (IP) format, and so the legacy telephony interfaces are no longer needed. A new generation of IP Media Servers has emerged to address this space, and these media servers bring with them new capabilities. RadiSys (Nasdaq:RSYS), through its recent acquisition of Convedia, is one of the companies with long experience in this area of media processing, and Garland Sharratt is a Senior Director of Product Marketing there: “Media Servers are pure servers, they’re only IP attached, they can sit anywhere in the network, you can add them, scale them at will. Media Gateways are not pure servers, they’re tied to physical interfaces, at the edge of the IP network, so they’re tied to lines or access gateways. The Media Server really is this pure entity in the network that can serve any master at all; it can be seated anywhere in the network, and be controlled by any entity, and this is a really powerful concept.”

In the world of media gateways, the Media Gateway Controller (MGC) and the Media Gateway (MG) itself would typically communicate using MGCP (Media Gateway Control Protocol), or using the more recent H.248 standard (also known as Megaco). The relationship between the two components is strictly master-slave, with both in the same network domain and the MG always controlled by the same MGC. Media Servers (MS), by contrast, have a relationship with application servers (AS) that is more peer-to-peer (which fits SIP well), and an AS can have a relationship with more than one MS. Likewise, the MS can be controlled by a number of different application servers at the same time, even reaching across different network domains. This many-to-many relationship is the power of the MS, and helps to achieve one of the basic IMS goals, which is to allow network core services to be exposed to applications both inside and outside of the operator domain.
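The difference between the two control relationships can be sketched in a few lines of Python. This is only an illustrative model, not part of any protocol: the class names, the attach method and the server names are all hypothetical, chosen to show that a gateway is bound to a single controller while a media server accepts any number of application servers.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class MediaGateway:
    """Hypothetical model: an MG is controlled by exactly one MGC."""
    name: str
    controller: Optional[str] = None

    def attach(self, mgc_name: str) -> None:
        # Strict master-slave: refuse a second, different controller.
        if self.controller not in (None, mgc_name):
            raise ValueError(f"{self.name} is already controlled by {self.controller}")
        self.controller = mgc_name


@dataclass
class MediaServer:
    """Hypothetical model: an MS can serve many application servers."""
    name: str
    controllers: Set[str] = field(default_factory=set)

    def attach(self, as_name: str) -> None:
        # Many-to-many: any number of application servers may attach.
        self.controllers.add(as_name)


if __name__ == "__main__":
    ms = MediaServer("ms-1")
    ms.attach("conference-as")   # inside the operator domain
    ms.attach("partner-ivr-as")  # outside the operator domain
    print(sorted(ms.controllers))
```

An application server holding references to several MediaServer objects would complete the many-to-many picture; the gateway model, by contrast, raises an error as soon as a second controller tries to attach.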

The Evolution of the Media Server

A key driving force behind the architecture of media processing has been the nature of the application servers themselves. Because the AS often uses SIP in order to achieve other functions like call control, there has been customer demand to use SIP also at the interface for talking to the MS, rather than have SIP for one set of functions, and H.248 for another.

The earliest SIP-based control protocol for the MS was Netann (now RFC 4240), which is still used today. Later came a protocol called MSCML (Media Server Control Markup Language) from SnowShore (now Cantata). Building on some of those ideas, Convedia created the MSML protocol, Media Server Markup Language, which over the last three years has gained remarkable traction as an open de facto standard. Garland Sharratt, who came from Convedia, sums up: “Netann did some things well, but it couldn’t support the kind of powerful features we had developed in our media servers. We looked at MSCML, but it didn’t meet our needs in terms of the power and it needed to be licensed. We decided to create our own protocol [MSML] that met our needs and would be easily extensible.”

Of MSCML, Sharratt says: “SnowShore set the way, they set the direction, and they should be praised for creating the protocol. When we were moving from MGCP to SIP several years ago, we would have liked to have used MSCML just not to fragment the industry, but unfortunately because there’s IPR, needing to be licensed, it would have been difficult for us to extend the protocol, which was necessary because our media servers have generally a richer capability than MSCML can support.”

These considerations led Convedia to design their own extensible, SIP-based control protocol, MSML. They also decided to promote it as an open multi-vendor protocol, issuing the specification free of IPR, thus freeing competitors and partners from the need to license the technology from them. Sharratt explains: “It really wasn’t a hard decision at all, that’s the Internet way; that’s what is expected in the market. We really believed that we would compete very nicely based on the quality of our product, not on the basis of who ran to the patent office first. We expected that other media processing vendors would use it as well, and it would be extremely popular with service providers. Service providers very strongly push vendors to develop standards, and MSML is the de-facto standard in the industry: Intel [now Dialogic Corp] and NMS [Communications] are two big media processing vendors that are using it. So we created this as an industry thing, and we’re really pleased to see that that is happening.”

Evolution Towards NGN Architecture

Networks today are in transition, but no-one is ready to rip out all of the components and start again with a completely new architecture. Stepwise evolution is the key, and this is also true of media processing. If you look at services today such as prepaid, call center, and voicemail, you will find IVR systems based on VoiceXML technology, so it is important for media servers to acknowledge and support that use. Sharratt says that MSML can do “any and all of the media processing that VoiceXML can”, and much more to boot, including conferencing and lightweight IVR, i.e. simple IVR, perhaps collecting two DTMF digits from the caller, without the need to fire up a full VoiceXML interpreter. However, MSML is designed to embrace VoiceXML, so that customers can choose whether to implement media servers that contain embedded VoiceXML functionality, or whether to use a media server in conjunction with a separate VoiceXML server. Operators might choose to minimize disruption to services by simply switching out a legacy media gateway for a media server, leaving existing VoiceXML applications and servers in place.
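To give a feel for what “lightweight IVR” might look like from the application server side, the Python sketch below builds a small MSML-style XML request that plays a prompt and collects two DTMF digits. It is a rough illustration under stated assumptions: the element and attribute names (msml, dialogstart, play, audio, dtmf, pattern) follow the general shape of the published MSML specification but should be checked against it, and the function name, connection identifier and prompt URI are hypothetical. How the document is carried over SIP is not shown.

```python
import xml.etree.ElementTree as ET


def build_collect_request(connection_id, prompt_uri, num_digits=2):
    """Return an MSML-style XML string that plays a prompt and collects digits.

    Element and attribute names are assumptions modelled on the MSML
    specification; verify them before using this against a real media server.
    """
    root = ET.Element("msml", {"version": "1.1"})
    # Start a dialog on one existing connection (the caller's media stream).
    dialog = ET.SubElement(root, "dialogstart", {
        "target": f"conn:{connection_id}",
        "type": "application/moml+xml",
    })
    # Play the prompt to the caller.
    play = ET.SubElement(dialog, "play")
    ET.SubElement(play, "audio", {"uri": prompt_uri})
    # Collect exactly num_digits DTMF digits.
    dtmf = ET.SubElement(dialog, "dtmf")
    ET.SubElement(dtmf, "pattern", {
        "digits": f"min={num_digits};max={num_digits}",
    })
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical connection id and prompt location.
    print(build_collect_request("abc123", "file://prompts/enter_choice.wav"))
```

The point of the sketch is that a short XML fragment describes the whole interaction, so no VoiceXML interpreter needs to be started for a two-digit prompt.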

MSML Rollout

So where are the MSML-based media servers today? Sharratt: “We [RadiSys] are deployed in about half of the top 50 service providers across the world. Today it’s probably more wireline than wireless; wireless has been a bit slower getting into IP communications. Wireline carriers have been faster to deploy IP cores, but wireless is now taking off, for instance 3G licences are about to be awarded in China, so it’s a pretty exciting time now.”

Standards Acceptance

Sharratt describes media servers as being “the first converged components” supporting multiple types of network architectures and accesses, since they combine different control protocols (MGCP, H.248 and SIP) and support the ubiquitous RTP protocol. However, standards for media servers have lagged behind those for other components of NGN/IMS.

The 3GPP group have been standardising protocols for use in wireless IMS, but generally the takeup of IMS amongst wireless operators has been slow. The enthusiasm for IMS among the fixed line operators (for example British Telecom’s 21CN project) has in many ways overtaken the wireless operators, and so the ETSI TISPAN group have been adapting and augmenting the 3GPP standards to bring IMS to fixed line services.

However, these groups are only just coming to the point of talking about standards for the SIP-based control language for media servers. The 3GPP group have so far specified H.248 as the control language for the MRFP (Multimedia Resource Function Processor, the IMS name for the media server), and have a Study Item effort underway to define the architectural framework for the Multimedia Resource Function Controller (MRFC), the MRFP’s master. The IETF are only just creating an official working group (unofficially called MediaCtrl to date) to define a SIP-based control language. The IETF and 3GPP often work together to evolve standards, and this will likely happen for a media server SIP language too. Sharratt believes that MSML is a leading contender for the eventual SIP-based media server language standard, if only on the grounds of being the most-deployed de-facto standard for SIP.

So how will SIP generally, and MSML specifically, fare against H.248, in the coming months and years? Although current IMS documents talk about H.248 being the standard, many vendors and service providers in the market believe that SIP-based protocols can do this better. Sharratt again: “The most important thing is that MSML is a tool, it’s not a religion, there are many other ways of doing media processing too. Of course we think it’s a great tool and our decision to create it three years ago was bang on, but ultimately it’s one of the many tools in the toolbox. H.248 is what IMS specifies for media server control, so I would expect the two [SIP with syntaxes such as MSML, and H.248] to co-exist for a long while to come, at least until the market eventually decides which it prefers.”

Martyn Davis is Principal Consultant at Dialogic Corporation. For more information, visit the company online at www.dialogic.com.