Microsoft proposes alternate spec for Web audio/video chat standard

Redmond is clearly eyeing the potential for a Web-based Skype implementation.

Microsoft published a blog post today outlining its position on WebRTC, an emerging Web standard that aims to enable real-time audio and video conferencing on the Web without requiring any plugins. The company raised a number of concerns about the current specification and published a draft of its own alternative, called CU-RTC-Web.

The current WebRTC specification is incomplete: it's still in development and undergoing revision through the W3C WebRTC working group. It is based largely on technology that Google obtained in its 2010 acquisition of Global IP Solutions. Google released the underlying software under an open source license and drafted the original proposal. The standard has since attracted the support of Mozilla, Opera, Ericsson, Cisco, and a number of other parties.

Microsoft claims that the standard is too prescriptive and requires too much of the network transport logic to be implemented by the browser. As a result, the company says that it doesn’t offer enough flexibility for Web developers who want to customize how their real-time communication services respond to changes in network quality.

Bandwidth constraints often force real-time communication software to make trade-offs, choosing how to throttle or degrade the stream so that the end-user experience remains acceptable. The point raised by Microsoft is that different usage scenarios could require different approaches.

A lower-level API would make it possible for Web developers who consume the standard to make implementation choices that are best suited for their application. More flexibility would also, says Microsoft, expand the range of applications that developers could build on top of the standard and make it easier to build Web-based real-time communication services that will interoperate well with existing teleconferencing and VoIP solutions.
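
To make the trade-off concrete, here is a minimal sketch of the kind of application-specific policy Microsoft has in mind. It is not taken from WebRTC or from Microsoft's draft: the profile names, the chooseSettings function, and the bitrate thresholds are all hypothetical, chosen only to show two applications reacting differently to the same drop in bandwidth.

```typescript
// Hypothetical illustration: none of these names come from WebRTC or CU-RTC-Web.
type Profile = "talking-head" | "screen-share";

interface StreamSettings {
  width: number;
  height: number;
  frameRate: number;
}

interface Rung extends StreamSettings {
  // Lowest available bandwidth (kbps) at which this rung is used.
  minKbps: number;
}

// Candidate settings per profile, ordered from most to least demanding.
// The thresholds are made-up round numbers, not measurements.
const LADDERS: Record<Profile, Rung[]> = {
  // Video chat: keep motion smooth and give up resolution first.
  "talking-head": [
    { width: 1280, height: 720, frameRate: 30, minKbps: 1500 },
    { width: 640, height: 360, frameRate: 30, minKbps: 600 },
    { width: 320, height: 180, frameRate: 30, minKbps: 0 },
  ],
  // Screen sharing: keep text legible and give up frame rate first.
  "screen-share": [
    { width: 1280, height: 720, frameRate: 30, minKbps: 1500 },
    { width: 1280, height: 720, frameRate: 10, minKbps: 600 },
    { width: 1280, height: 720, frameRate: 3, minKbps: 0 },
  ],
};

// Pick the first (most demanding) rung the available bandwidth can sustain.
// If no rung matches (for example, a NaN estimate), fall back to the lowest rung.
function chooseSettings(profile: Profile, availableKbps: number): StreamSettings {
  const ladder = LADDERS[profile];
  let chosen = ladder[ladder.length - 1];
  for (const rung of ladder) {
    if (availableKbps >= rung.minKbps) {
      chosen = rung;
      break;
    }
  }
  return { width: chosen.width, height: chosen.height, frameRate: chosen.frameRate };
}
```

With 800 kbps available, the video chat profile drops to a smaller picture at full frame rate, while the screen-sharing profile keeps its full resolution and slows the frame rate instead. A high-level API that bakes in a single policy would force both applications down the same path.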

After acquiring Skype for $8 billion last year, Microsoft is now a major stakeholder in the VoIP space. It’s not entirely clear yet how Microsoft is going to put its Skype assets to work, so it is hard to say how the rise of ubiquitous standards-based Web teleconferencing promised by WebRTC would affect the company.

It could offer a big opportunity for making Skype more ubiquitous, but it also gives a big boost to rivals. Google’s interest in WebRTC is obviously focused on making services like Google Talk voice and video chat and Google+ Hangouts more seamless in the user’s Web browser. Skype would also benefit from more seamless Web experiences, particularly in areas like Facebook integration.

Microsoft’s CU-RTC-Web proposal, which has been submitted to the WebRTC working group, aims to address the issues that the company sees in the current specification. Microsoft’s draft is authored by Jonathan Rosenberg (the inventor of SIP) and Bernard Aboba (a principal architect for Microsoft Lync). Both are signatories of Microsoft’s statement about the limitations of WebRTC along with a number of Microsoft personnel, including several prominent names from Skype.

It’s worth noting that Microsoft’s specification doesn’t completely reinvent the wheel. It’s designed to integrate neatly with several facets of the existing WebRTC stack, including the getUserMedia JavaScript APIs, which enable Web applications to access audio and video streams from the user’s microphone and webcam.

Much of the WebRTC implementation work that the browser vendors have already delivered to users has been focused on getUserMedia rather than the networking parts. Other aspects of WebRTC have seen major changes this year.

One wrinkle in Microsoft’s proposal is the company’s desire to avoid tying the specification to particular media codecs. Google is pushing for the adoption of its own VP8 codec for video and Xiph.org’s Opus for audio. Opus is an open format for high-quality streaming audio that incorporates technology from Xiph.org’s previous CELT effort and Skype’s SILK codec.

VP8 and Opus are both good choices because they are available under royalty-free terms and aren’t encumbered by known patents. Opening the door for browser vendors to choose which codecs they support could pose compatibility challenges, much like the issues that have arisen around the HTML5 video element. Several browser vendors, including Mozilla and Opera, oppose using royalty-bearing codecs because doing so would create a patent paywall, raising the barrier to entry for innovation around the technology.

The patent issue is bound to be controversial, but the other aspects of Microsoft’s position seem relatively well-considered. With some recent standards, particularly WebGL, we’ve seen a strong tilt toward low-level APIs on the Web that offer developers more flexibility to create whatever they can imagine. As the Web continues to grow into its role as an application platform, that approach seems sensible. Independent developers can always create libraries that provide convenient high-level constructs on top of the underlying APIs, ensuring that the features are suitably easy for Web application developers to consume. Microsoft’s vision for WebRTC seems consistent with that pattern.