MPEG Column: 106th MPEG Meeting

Christian Timmerer is a researcher, entrepreneur, and teacher on immersive multimedia communication, streaming, adaptation, and Quality of Experience. He is an Assistant Professor at Alpen-Adria-Universität Klagenfurt, Austria and CIO at bitmovin.net, Austria. Follow him on Twitter at http://twitter.com/timse7 and subscribe to his blog at http://blog.timmerer.com.

November, 2013, Geneva, Switzerland. Here comes a news report from the 106th MPEG in Geneva, Switzerland which was actually during the Austrian national day but Austrian Airlines had a nice present (see picture) for their guests.

In this meeting, ISO/IEC 23008-1 (i.e., MPEG-H Part 1) MPEG Media Transport (MMT) reached Final Draft International Standard (FDIS). Looking back when this project was started with the aim to supersede the widely adopted MPEG-2 Transport Stream (M2TS) — which receives the Technology &amp; Engineering Emmy®Award in Jan’14 — and what we have now, the following features are supported within MMT:

Self-contained multiplexing structure

Strict timing model

Reference buffer model

Flexible splicing of content

Name based access of data

AL-FEC (application layer forward error correction)

Multiple Qualities of Service within one packet flow

ITU-T Tower Building, Geneva.

Interestingly, MMT supports the carriage of MPEG-DASH segments and MPD for uni-directional environments such as broadcasting.

MPEG-H now comprises three major technologies, part 1 is about transport (MMT; at FDIS stage), part 2 deals with video coding (HEVC; at FDIS stage), and part 3 will be about audio coding, specifically 3D audio coding (but it’s still in its infancy for which technical responses have been evaluated only recently). Other parts of MPEG-H are currently related to these three parts.

In terms of research, it is important to determine the efficiency, overhead, and — in general — the use cases enabled by MMT. From a business point of view, it will be interesting to see whether MMT will actually supersede M2TS and how it will evolve compared or in relation to DASH.

On another topic, MPEG-7 visual reached an important milestone at this meeting. The Committee Draft (CD) for Part 13 (ISO/IEC 15938-13) has been approved and is entitled Compact Descriptors for Visual Search (CDVS). This image description enables comparing and finding pictures that include similar content, e.g., when showing the same object from different viewpoints. CDVS mainly deals with images but MPEG also started work for compact descriptors for video search.

The CDVS standard truly helps to reduce the semantic gap. However, research in this domain is already well developed and it is unclear whether the research community will adopt CDVS, specifically because the interest in MPEG-7 descriptors has decreased lately. On the other hand, such a standard will enable interoperability among vendors and services (e.g., Google Goggles) reducing the number of proprietary formats and, hopefully, APIs. However, the most important question is whether CDVS will be adopted by the industry (and research).

Finally, what about MPEG-DASH?

The 2nd edition of part 1 (MPD and segment formats) and the 1st edition of part 2 (conformance and reference software) have been finalized at the 105th MPEG meeting (FDIS). Additionally, we had a public/open workshop at that meeting which was about session management and control for DASH. This and other new topics are further developed within so-called core experiments for which I’d like to give a brief overview:

Server and Network assisted DASH Operation (SAND) which is the immediate result of the workshop at the 105th MPEG meeting and introduces a DASH-Aware Media Element (DANE) as depicted in the Figure below. Parameters from this element — as well as others — may support the DASH client within its operations, i.e., downloading the “best” segments for its context. SAND parameters are typically coming from the network itself whereas Parameters for enhancing delivery by DANE (PED) are coming from the content author.

Baseline Architecture for Server and Network assisted DASH.

Spatial Relationship Description is about delivering (tiled) ultra-high-resolution content towards heterogeneous clients while at the same time providing interactivity (e.g., zooming). Thus, not only the temporal but also spatial relationship of representations needs to be described.

The outcome of these CEs is potentially interesting for future amendments. One CE closed at this meeting which was about including quality information within DASH, e.g., as part of an additional track within ISOBMFF and an additional representation within the MPD. Clients may access this quality information in advance to assist the adaptation logic in order to make informed decisions about which segment to download next.