Scalable Compression and Transmission of Internet Multicast Video

Steven Ray McCanne

In just a few years the "Internet Multicast Backbone", or MBone, has risen from a small research curiosity to a large-scale, widely used communications infrastructure. A driving force behind this growth was our development of multipoint audio, video, and shared whiteboard conferencing applications that are now used daily by the large and growing MBone community. Because these real-time media are transmitted at a uniform rate to all the receivers in the network, the source must either run below the bottleneck rate or overload portions of the multicast distribution tree. In this dissertation, we propose a solution to this problem by moving the burden of rate adaptation from the source to the receivers with a scheme we call Receiver-driven Layered Multicast, or RLM. In RLM, a source distributes a hierarchical signal by striping the constituent layers across multiple multicast groups. Receivers then adjust their reception rate by simply joining and leaving multicast groups.
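As a rough illustration of the receiver-side idea (not taken from the dissertation), the sketch below shows how a receiver's subscription level maps to a reception rate when layers are striped across groups. The layer rates, the group labels, and the assumption that a receiver already knows its bottleneck rate are all hypothetical; the abstract does not say how a receiver decides when to join or leave.

```python
from itertools import accumulate

def layers_to_join(layer_rates_kbps, bottleneck_kbps):
    """Count how many layers, starting from the base layer, fit under the bottleneck.

    Layers are cumulative: layer k is only useful if layers 0..k-1 are also received.
    """
    count = 0
    for cumulative_rate in accumulate(layer_rates_kbps):
        if cumulative_rate > bottleneck_kbps:
            break
        count += 1
    return count

def subscription(groups, layer_rates_kbps, bottleneck_kbps):
    """Return the prefix of multicast groups this receiver would stay joined to."""
    return groups[:layers_to_join(layer_rates_kbps, bottleneck_kbps)]

# Hypothetical configuration: one multicast group per layer.
GROUPS = ["layer-0", "layer-1", "layer-2", "layer-3"]
RATES_KBPS = [32, 64, 128, 256]

if __name__ == "__main__":
    for bottleneck in (20, 100, 250, 1000):
        print(bottleneck, subscription(GROUPS, RATES_KBPS, bottleneck))
```

On a real host, joining and leaving would correspond to IP multicast group membership operations (for example, the `IP_ADD_MEMBERSHIP` and `IP_DROP_MEMBERSHIP` socket options); the sketch models only the rate arithmetic.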

But RLM solves only half of the problem. To distribute a multi-rate flow to heterogeneous receivers using RLM, the underlying signal must be encoded in a hierarchical or layered format. To this end, we developed and present herein a layered video compression algorithm which, when combined with RLM, provides a comprehensive solution for scalable multicast video transmission in heterogeneous networks. In addition to a layered representation, our coder has low complexity (admitting an efficient software implementation) and high error resilience (admitting robust operation in loosely controlled environments like the Internet). Even with these constraints, our hybrid DCT/wavelet-based coder, which we call "Progressive Video with Hybrid Transform" or PVH, exhibits good compression performance -- comparable to wavelet zerotree coding (i.e., EZW) at low rates and near the performance of traditional DCT-based schemes at high rates. It also outperforms all (publicly available) Internet video codecs while achieving comparable run-time performance.
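To make "layered format" concrete, here is a minimal sketch of successive refinement by scalar quantization: each layer encodes the residual left by the layers before it, so any received prefix of layers decodes to a usable approximation. This is not the PVH algorithm, which the abstract describes only as a hybrid DCT/wavelet coder; the function names, step sizes, and sample values are assumptions chosen for illustration.

```python
def encode_layers(samples, steps):
    """Encode samples as one list of quantization indices per layer.

    Layer k quantizes, with step size steps[k], whatever residual the
    earlier layers left behind.
    """
    layers = []
    residual = list(samples)
    for step in steps:
        indices = [round(r / step) for r in residual]
        layers.append(indices)
        residual = [r - q * step for r, q in zip(residual, indices)]
    return layers

def decode_layers(layers, steps):
    """Reconstruct samples from whichever prefix of layers was received."""
    if not layers:
        return []
    out = [0.0] * len(layers[0])
    for indices, step in zip(layers, steps):
        out = [o + q * step for o, q in zip(out, indices)]
    return out

if __name__ == "__main__":
    samples = [10.0, -3.0, 7.5, 0.25]   # hypothetical signal values
    steps = [8, 2, 0.5]                 # coarse base layer, finer enhancements
    layers = encode_layers(samples, steps)
    for k in range(1, len(layers) + 1):
        approx = decode_layers(layers[:k], steps)
        worst = max(abs(s - a) for s, a in zip(samples, approx))
        print(f"{k} layer(s): max error {worst}")
```

With rounding to the nearest index, the worst-case error after decoding k layers is at most half of the k-th step size, which is why a receiver that joins more layers sees a better reconstruction.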

Our RLM/PVH framework leverages two design methodologies from two related yet often segregated fields: joint source/channel coding (JSCC) from traditional communications theory and application-level framing (ALF) from computer network design. In accordance with JSCC, we combine the design of the source-coding algorithm (i.e., PVH) with the channel-coding algorithm (i.e., RLM), while in accordance with ALF, we reflect application semantics (i.e., PVH) in the design of the network protocol (i.e., RLM). In this thesis, we posit that JSCC and ALF are two manifestations of the same underlying design principle. We explore the ALF/JSCC design space with a discussion of our "Intra-H.261" video coder, which we developed specifically for MBone video transmission, and compare its performance to that of traditional designs based on independent source and channel coding.

Finally, we bring all of the pieces of our design together into a comprehensive system architecture realized in a flexible software toolkit that underlies our widely used video application -- the UCB/LBL video conferencing tool vic. Our system architecture not only integrates RLM and PVH into an autonomous video application but also provides the functionality requisite to a complete multimedia communication system, including user-interface elements and companion applications like audio and shared whiteboard. In this framework, we craft "media agents" from a common multimedia toolkit and control and configure them over a software interprocess communication bus that we call the Coordination Bus. By composing an arbitrary arrangement of media agents over the Coordination Bus and complementing the arrangement with an appropriate user interface, we can induce an arbitrary multimedia collaboration style. Unlike previous work on layered video compression and transmission, we have implemented RLM, PVH, and our coordination framework in a "real" application and are deploying a fully operational system on a very large scale over the MBone.
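The composition of media agents over a shared bus can be pictured with a small publish/subscribe sketch. Everything here is hypothetical: the class names, the "focus" message, and the in-process dispatch are stand-ins, whereas the Coordination Bus described above is an interprocess mechanism whose message format the abstract does not specify.

```python
from collections import defaultdict

class CoordinationBus:
    """In-process stand-in for a bus that relays named messages to agents."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, **message):
        # Copy the handler list so handlers may subscribe during dispatch.
        for handler in list(self._handlers[topic]):
            handler(**message)

class MediaAgent:
    """Hypothetical agent that tracks which participant currently has focus."""

    def __init__(self, name, bus):
        self.name = name
        self.focus = None
        bus.subscribe("focus", self.on_focus)

    def on_focus(self, speaker):
        self.focus = speaker

if __name__ == "__main__":
    bus = CoordinationBus()
    agents = [MediaAgent("audio", bus), MediaAgent("video", bus)]
    bus.publish("focus", speaker="alice")
    print({agent.name: agent.focus for agent in agents})
```

The point of the arrangement is that agents never reference each other directly; adding a whiteboard agent, for example, would only require subscribing it to the same bus.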

Advisor: Martin Vetterli

BibTeX citation:

@phdthesis{McCanne:CSD-96-928,
Author = {McCanne, Steven Ray},
Title = {Scalable Compression and Transmission of Internet Multicast Video},
School = {EECS Department, University of California, Berkeley},
Year = {1996},
Month = {Dec},
URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/1996/6211.html},
Number = {UCB/CSD-96-928},
}