Minutes of the RTP Media Congestion Avoidance Techniques (RMCAT) BoF
IETF#84
Reported by Colin Perkins
The RMCAT BoF was held from 13:00-15:00 on 2 August 2012 at the IETF 84
meeting in Vancouver, Canada. The chairs were Michael Welzl (University
of Oslo) and Colin Perkins (University of Glasgow). The BoF grew out of
discussions around the RTCWEB working group, and was coordinated on the
mailing list.
Introduction
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-1.pdf
The chairs introduced the BoF. The IETF and W3C are currently developing
standards for video conferencing in web browsers, using an RTP-based
media layer. The resulting systems are expected to see wide deployment.
This is potentially problematic, since standards for RTP congestion
control are not well developed. This risks causing congestion collapse in
the Internet in the worst case, and in the short term may disrupt the
quality of experience due to interactions between flows. The main goal of
this BoF is to understand the problem, and agree on a process for finding
a solution.
Problem Statement
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-5.pdf
Harald Alvestrand gave a more detailed outline of the problem statement.
He noted that the Internet is dominated by TCP traffic, and that real-
time interactive traffic is a niche, comprising mostly low-bandwidth VoIP
and gaming flows, generally running on managed or high capacity networks.
RTCWEB is aiming to change this, deploying interactive and high-bandwidth
video on unmanaged and variable quality networks. Harald re-iterated the
potential for congestion collapse due to these flows, and noted also
that a consequence of the interactive nature of these applications is
that low delay is as important as a sufficient bandwidth allocation. He
also noted that these applications, while being relatively high bandwidth,
have both upper- and lower-bounds on the bandwidth they can consume.
Several mechanisms exist that might be considered for congestion control
of interactive real-time applications. TCP can be said to deliver obsolete
data reliably, encouraging full queues at bottlenecks, and so cannot meet
the delay bounds. TFRC has smoother behaviour than TCP and is unreliable,
but does not focus on minimizing delay, and has not seen wide deployment
for a range of reasons. There are also proprietary mechanisms, which are
by definition not interoperable. None of these are suitable. What we need
from a future working group is a well-defined problem statement, one or
more fully specified mechanisms for solving the congestion problem for
interactive real-time applications, and metrics against which the success
of those mechanisms can be evaluated.
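For reference, the smoother behaviour attributed to TFRC above comes from
setting the send rate with a TCP throughput equation rather than reacting
to each loss. A minimal sketch of that equation, as given in the TFRC
specification (RFC 5348), is shown here. The function name and the example
inputs are assumptions made for illustration, and t_RTO = 4 * RTT and b = 1
are the simplifications that specification recommends.

```python
from math import sqrt


def tfrc_rate_bytes_per_s(s: float, rtt: float, p: float, b: int = 1) -> float:
    """TCP throughput equation used by TFRC (RFC 5348, Section 3.1).

    s   -- segment size in bytes
    rtt -- round-trip time in seconds
    p   -- loss event rate, 0 < p <= 1
    b   -- packets acknowledged per ACK (1 is the recommended value)
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if rtt <= 0:
        raise ValueError("round-trip time must be positive")
    t_rto = 4 * rtt  # recommended simplification for the retransmit timeout
    denom = rtt * sqrt(2 * b * p / 3) + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p)
    return s / denom


# Illustrative inputs only: 1200-byte packets, 100 ms RTT, 1% loss event rate.
rate = tfrc_rate_bytes_per_s(1200, 0.1, 0.01)
print(f"{rate * 8 / 1e6:.2f} Mbit/s")
```

Note that nothing in the equation depends on queuing delay, which is
consistent with the observation that TFRC does not focus on minimizing
delay.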
Context: IAB/IRTF Congestion Control Workshop
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-6.pdf
Michael Welzl summarised the outcomes of the workshop on congestion
control for interactive real-time applications that was held on 28 July
2012 in Vancouver. The workshop asked 1) if, absent changes to the
network, we can develop a useful congestion control algorithm for
interactive media; 2) if there is work in the area of measurements that
we can use to create incentives to make updates to the network happen;
and 3) if it is useful to develop a congestion control mechanism that
assumes the problem is in the end host only, where there is idle capacity
in the network. The consensus of the workshop participants was to answer
yes to all these questions, but to note a wide range of normal delay
variation in non-congested networks, which will significantly complicate
the problem due to the delay bounds of interactive applications.
The workshop participants noted that there are both short- and long-term
paths to the solution. In the long term, we should consider ECN, active
queue management, and ways to segregate non-delay sensitive TCP traffic
from interactive traffic. In the short term, there is potential to avoid
self-inflicted queuing and ensure that e.g. browsers sending interactive
flows don't cause excessive congestion and delay. Properties and limitations
of the media codecs and feedback channel were noted, as was the trade-off
between loss- and delay-based algorithms.
Context: Buffer Bloat and AQM
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-4.pdf
Jim Gettys summarised issues relating to buffer bloat and active queue
management (AQM) in the network. Numerous middleboxes, especially home
gateway devices, and hosts have excess buffering. This can lead to large
standing queues in the presence of TCP traffic, since TCP tries to fill
the pipe, but doesn't try to minimise delay. This is problematic for
interactive media flows that do care about delay. There are numerous
pieces to the solution including reducing buffering in the network,
deploying active queue management (where the new CoDel algorithm has
promise), and developing new congestion control algorithms. None are
likely to solve the problem in the short term, so RMCAT will need to
cope with over-buffered networks when designing a solution.
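To give a sense of scale, the delay added by a full buffer is simply the
buffer size divided by the bottleneck link rate. The sketch below works
through that arithmetic. The function name and the example figures (a
256 KiB buffer in front of a 1 Mbit/s uplink) are assumptions chosen for
illustration and were not presented in the BoF.

```python
def worst_case_queue_delay_s(buffer_bytes: int, link_rate_bps: float) -> float:
    """Time to drain a full FIFO buffer at the bottleneck link rate, in seconds."""
    if link_rate_bps <= 0:
        raise ValueError("link rate must be positive")
    return buffer_bytes * 8 / link_rate_bps


# Illustrative figures only: 256 KiB of buffering on a 1 Mbit/s uplink.
delay = worst_case_queue_delay_s(256 * 1024, 1_000_000)
print(f"{delay:.2f} s of queuing delay when the buffer is full")  # prints 2.10 s ...
```

A standing queue of that size, once filled by a TCP flow, is far beyond
what an interactive media flow can tolerate.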
Context: Competing Traffic
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-3.pdf
Matt Mathis outlined the problems with interactive media flows competing
with TCP traffic ("TCP Friendly is an oxymoron"). The fundamental problem
is that TCP requires queues for correct operation, but interactive media
flows cannot tolerate queueing. Modern TCP stacks are sufficiently well
tuned that they will cause queuing on any network path, if they don't run
out of data to send first. Matt believes the solution is a combination
of traffic segregation, coupled with metrics to expose the problem. For
interactive media congestion control, he recommended the group assume a
(mostly) empty queue, and design the congestion control to avoid causing
self-inflicted queuing delay and to provide "fair" sharing between media
flows. Queuing delay is unavoidable in the presence of TCP cross-traffic,
and a circuit breaker is needed to avoid serious congestion, but he
advised against over-thinking interactions in failure cases, and ended with
a quote from Andrew McGregor: "protecting TCP is unnecessary, it will not
return the favour".
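The circuit-breaker idea might be sketched as follows. This is only an
illustration: the class name, the use of a per-interval loss fraction as
the input, and both thresholds are assumptions made for the example, not
values proposed in the BoF.

```python
class CircuitBreaker:
    """Hypothetical last-resort check: stop sending after sustained heavy loss."""

    def __init__(self, loss_threshold: float = 0.1, max_bad_intervals: int = 3):
        self.loss_threshold = loss_threshold        # assumed value
        self.max_bad_intervals = max_bad_intervals  # assumed value
        self.bad_intervals = 0
        self.tripped = False

    def on_report(self, loss_fraction: float) -> bool:
        """Feed one feedback interval; return True once the flow should stop."""
        if loss_fraction > self.loss_threshold:
            self.bad_intervals += 1
        else:
            self.bad_intervals = 0  # a good interval resets the count
        if self.bad_intervals >= self.max_bad_intervals:
            self.tripped = True     # stays tripped until the flow is restarted
        return self.tripped


breaker = CircuitBreaker()
for loss in [0.2, 0.2, 0.0, 0.2, 0.2, 0.2]:
    stop = breaker.on_report(loss)
print("stop sending:", stop)
```

The breaker is deliberately separate from the congestion controller: it
does not try to find a fair rate, it only ends a flow that is clearly
causing or suffering serious congestion.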
Outline Proposals for Potential Solutions
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-2.pdf
and http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-0.pdf
Harald Alvestrand and Piers O'Hanlon presented candidate congestion
control algorithms for interactive media applications. The proposal from
Harald is a delay-based algorithm, based on analysis of filtered packet
inter-arrival times; Piers suggested a variant of the TFRC algorithm that
also includes a delay-based component. Neither algorithm was discussed in
detail.
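As background on what "filtered packet inter-arrival times" can mean, the
following is a minimal sketch of one common approach: compare the spacing
of packets at the receiver with their spacing at the sender, and smooth the
difference. It is not Harald's proposal or Piers's variant, neither of
which was described in detail. The function name, the exponentially
weighted filter, and the value of alpha are all assumptions.

```python
from typing import Iterable, List, Tuple


def filtered_delay_variation(
    packets: Iterable[Tuple[float, float]], alpha: float = 0.1
) -> List[float]:
    """Smooth the per-packet one-way delay variation.

    packets -- (send_time_s, arrival_time_s) pairs, in send order
    alpha   -- weight of the newest sample in the moving average (assumed)

    Returns the smoothed value after each packet from the second onward.
    A sustained positive value suggests a queue is building on the path.
    """
    smoothed = 0.0
    out: List[float] = []
    prev = None
    for send, arrive in packets:
        if prev is not None:
            # Receiver-side spacing minus sender-side spacing.
            d = (arrive - prev[1]) - (send - prev[0])
            smoothed = (1 - alpha) * smoothed + alpha * d
            out.append(smoothed)
        prev = (send, arrive)
    return out


# Packets sent every 20 ms but arriving every 25 ms: the queue is growing.
building = [(i * 0.020, 0.050 + i * 0.025) for i in range(10)]
print(filtered_delay_variation(building)[-1] > 0)
```

Only differences of timestamps are used, so the sender and receiver clocks
do not need to be synchronised.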
Proposed Charter and Discussion
Slides: http://www.ietf.org/proceedings/84/slides/slides-84-rmcat-1.pdf
Draft charter: http://rtp-congestion.alvestrand.com/bof-planning-page/wg-charter---input-to-vancouver
The chairs outlined the proposed charter and invited discussion.
The following points were raised:
- Matt Mathis suggested that a missing item on the charter is defining a
clear metric that can be used to report congestion-related problems to
the users of interactive media applications.
- Magnus Westerlund asked if congestion control for calls using a mixer
or other application-layer middlebox should be in scope for the group.
It is possible to run a separate control loop on each side of the
middlebox, but this can require media transcoding for rate adaptation
if the two sides of the call achieve different bandwidth allocations.
Passing congestion signals through the middlebox can allow end-to-end
congestion control, avoiding transcoding, so the group should possibly
consider coupled congestion control for this scenario. Cullen Jennings
disagreed, wanting to limit the scope to point-to-point flows only and
to remove anything to do with middleboxes.
- Stephan Wenger suggested that the discussions have been behind the
state of the art in joint source-channel coding. The group should
consider that not all packets are equal in importance when designing
congestion control algorithms.
- Toerless Eckert suggested that the group may be too constrained by not
being able to change TCP.
- Eric Rescorla noted that the pace of browser development is fast, and
would allow us to rapidly roll out new versions of congestion control
provided it is scoped to running in a single application-level process.
This suggests that trying to change operating system or network layer
components should be outside the scope.
- Harald Alvestrand and Spencer Dawkins agreed to keep the scope small,
and suggested that the Transport Area could consider the wider issues,
possibly creating other working groups if necessary. Limiting the scope
of a possible RMCAT working group should not limit the wider discussion
of the issues.
- Randell Jesup wanted clarification that AQM is out of scope. The chairs
agreed that defining AQM would be out of scope for RMCAT, but the group
might decide to develop proposals that could use AQM if available.
- Van Jacobson commented that the problem to be tackled is difficult. He
recalled some of the history of developing TCP congestion control, and
suggested that solving the self-interference problem for interactive
media flows would be a sufficient challenge. Van recommended that the
group consider impact on TCP as a second-order problem.
- Jonathan Lennox, Spencer Dawkins, and others wondered why the charter
proposed initial publication of the algorithms as experimental RFCs.
The process is unusual, and it is not clear how deploying the
algorithms in browsers with hundreds of millions of users can be
classed as an experiment. Ted Hardie suggested that an important
point is how easily and quickly the algorithm can be updated.
- On a related point, Lars Eggert noted that defining evaluation criteria
is important: what denotes success for the group? Lars also suggested
that we need to be clearer on how we evaluate candidate algorithms, noting
that we should have experiments that can be replicated, and competing
proposals evaluated against the same criteria to ensure fairness and
clarity in the evaluation. Toerless Eckert echoed these statements.
- Harald Alvestrand suggested that we need to identify how multiple
flows work, and how flows can be grouped together. To what extent
should congestion control of aggregates of independent flows sharing
a 5-tuple be in scope? Michael Welzl suggested that defined APIs may
be a part of this.
- Bob Briscoe cautioned the group that congestion control is a difficult
problem, and suggested that we may want to develop a framework, rather
than a single algorithm.
- ??? suggested that a requirements document is a missing deliverable in
the charter. He also noted that media congestion control is broader
than RTCWEB, and suggested that the group should try to avoid limiting
the applicability of the algorithms developed to just RTCWEB.
The chairs wrapped up the discussion by asking three questions:
- Do you think that the problem is clear, well-scoped, solvable, and
worth solving? There was a hum in favour.
- Do you support forming a WG with the charter outlined? There was a
strong hum in favour.
- Would you be willing to work on one or more of the drafts outlined?
There was a significant constituency of people willing to work on the
drafts.
The charter will be updated based on the discussion and circulated on the
mailing list for review. The chairs and the area directors will then work
to form a working group.