Payload Working Group J. Lennox
Internet-Draft D. Hong
Intended status: Standards Track Vidyo
Expires: December 31, 2017 J. Uberti
S. Holmer
M. Flodman
Google
June 29, 2017
The Layer Refresh Request (LRR) RTCP Feedback Message
draft-ietf-avtext-lrr-07
Abstract
This memo describes the RTCP Payload-Specific Feedback Message "Layer
Refresh Request" (LRR), which can be used to request a state refresh
of one or more substreams of a layered media stream. It also defines
its use with several RTP payloads for scalable media formats.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 31, 2017.
Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Lennox, et al. Expires December 31, 2017 [Page 1]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Conventions, Definitions and Acronyms . . . . . . . . . . . . 2
2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3
3. Layer Refresh Request . . . . . . . . . . . . . . . . . . . . 5
3.1. Message Format . . . . . . . . . . . . . . . . . . . . . 6
3.2. Semantics . . . . . . . . . . . . . . . . . . . . . . . . 7
4. Usage with specific codecs . . . . . . . . . . . . . . . . . 8
4.1. H264 SVC . . . . . . . . . . . . . . . . . . . . . . . . 8
4.2. VP8 . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.3. H265 . . . . . . . . . . . . . . . . . . . . . . . . . . 10
5. Usage with different scalability transmission mechanisms . . 11
6. SDP Definitions . . . . . . . . . . . . . . . . . . . . . . . 11
7. Security Considerations . . . . . . . . . . . . . . . . . . . 12
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 12
9.1. Normative References . . . . . . . . . . . . . . . . . . 12
9.2. Informative References . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14
1. Introduction
This memo describes an RTCP [RFC3550] Payload-Specific Feedback
Message [RFC4585] "Layer Refresh Request" (LRR). It is designed to
allow a receiver of a layered media stream to request that one or
more of its substreams be refreshed, such that it can then be decoded
by an endpoint which previously was not receiving those layers,
without requiring that the entire stream be refreshed (as it would be
if the receiver sent a Full Intra Request (FIR); [RFC5104] see also
[RFC8082]).
The feedback message is applicable both to temporally and spatially
scaled streams, and to both single-stream and multi-stream
scalability modes.
2. Conventions, Definitions and Acronyms
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Lennox, et al. Expires December 31, 2017 [Page 2]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
2.1. Terminology
A "Layer Refresh Point" is a point in a scalable stream after which a
decoder, which previously had been able to decode only some (possibly
none) of the available layers of stream, is able to decode a greater
number of the layers.
For spatial (or quality) layers, in normal encoding, a subpicture can
depend both on earlier pictures of that spatial layer and also on
lower-layer pictures of the current picture. A layer refresh,
however, typically requires that a spatial layer picture be encoded
in a way that references only the lower-layer subpictures of the
current picture, not any earlier pictures of that spatial layer.
Additionally, the encoder must promise that no earlier pictures of
that spatial layer will be used as reference in the future.
However, even in a layer refresh, layers other than the ones being
refreshed may still maintain dependency on earlier content of the
stream. This is the difference between a layer refresh and a Full
Intra Request [RFC5104]. This minimizes the coding overhead of
refresh to only those parts of the stream that actually need to be
refreshed at any given time.
An illustration of spatial layer refresh of an enhancement layer is
shown below.
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
An illustration of spatial layer refresh of a base layer is shown
below.
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
An illustration of an inherently temporally nested stream is shown
below.
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
3.1. Message Format
The Feedback Control Information (FCI) for the Layer Refresh Request
consists of one or more FCI entries, the content of which is depicted
in Figure 5. The length of the LRR feedback message MUST be set to
2+3*N 32-bit words, where N is the number of FCI entries.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seq nr. |C| Payload Type| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RES | TTID| TLID | RES | CTID| CLID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5
SSRC (32 bits) The SSRC value of the media sender that is requested
to send a layer refresh point.
Seq nr. (8 bits) Command sequence number. The sequence number space
is unique for each pairing of the SSRC of command source and the
SSRC of the command target. The sequence number SHALL be
increased by 1 for each new command (modulo 256, so the value
after 255 is 0). A repetition SHALL NOT increase the sequence
number. The initial value is arbitrary.
C (1 bit) A flag bit indicating whether the "Current Temporal Layer
ID (CTID)" and "Current Layer ID (CLID)" fields are present in the
FCI. If this bit is 0, the sender of the LRR message is
requesting refresh of all layers up to and including the target
layer.
Payload Type (7 bits) The RTP payload type for which the LRR is
being requested. This gives the context in which the target layer
index is to be interpreted.
Reserved (RES) (three separate fields, 16 bits / 5 bits / 5 bits)
All bits SHALL be set to 0 by the sender and SHALL be ignored on
reception.
Target Temporal Layer ID (TTID) (3 bits) The temporal ID of the
target layer for which the receiver wishes a refresh point.
Lennox, et al. Expires December 31, 2017 [Page 6]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
Target Layer ID (TLID) (8 bits) The layer ID of the target spatial
or quality layer for which the receiver wishes a refresh point.
Its format is dependent on the payload type field.
Current Temporal Layer ID (CTID) (3 bits) If C is 1, the ID of the
current temporal layer being decoded by the receiver. This
message is not requesting refresh of layers at or below this
layer. If C is 0, this field SHALL be set to 0 by the sender and
SHALL be ignored on reception.
Current Layer ID (CLID) (8 bits) If C is 1, the layer ID of the
current spatial or quality layer being decoded by the receiver.
This message is not requesting refresh of layers at or below this
layer. If C is 0, this field SHALL be set to 0 by the sender and
SHALL be ignored on reception.
When C is 1, TTID MUST NOT be less than CTID, and TLID MUST NOT be
less than CLID; at least one of TTID or TLID MUST be greater than
CTID or CLID respectively. That is to say, the target layer index
MUST be a layer upgrade from the current layer index
. A sender MAY request an upgrade in both temporal and
spatial/quality layers simultaneously.
A receiver receiving an LRR feedback packet which does not satisfy
the requirements of the previous paragraph, i.e. one where the C bit
is present but TTID is less than CTID or TLID is less than CLID, MUST
discard the request.
Note: the syntax of the TTID, TLID, CTID, and CLID fields match, by
design, the TID and LID fields in [I-D.ietf-avtext-framemarking].
3.2. Semantics
Within the common packet header for feedback messages (as defined in
section 6.1 of [RFC4585]), the "SSRC of packet sender" field
indicates the source of the request, and the "SSRC of media source"
is not used and SHALL be set to 0. The SSRCs of the media senders to
which the LRR command applies are in the corresponding FCI entries.
A LRR message MAY contain requests to multiple media senders, using
one FCI entry per target media sender.
Upon reception of LRR, the encoder MUST send a decoder refresh point
(see Section 2.1) as soon as possible.
The sender MUST respect bandwidth limits provided by the application
of congestion control, as described in Section 5 of [RFC5104]. As
layer refresh points will often be larger than non-refreshing frames,
Lennox, et al. Expires December 31, 2017 [Page 7]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
this may restrict a sender's ability to send a layer refresh point
quickly.
LRR MUST NOT be sent as a reaction to picture losses due to packet
loss or corruption -- it is RECOMMENDED to use PLI [RFC4585] instead.
LRR SHOULD be used only in situations where there is an explicit
change in decoders' behavior, for example when a receiver will start
decoding a layer which it previously had been discarding.
4. Usage with specific codecs
In order for LRR to be used with a scalable codec, the format of the
temporal and layer ID fields (for both the target and current layer
indices) needs to be specified for that codec's RTP packetization.
New RTP packetization specifications for scalable codecs SHOULD
define how this is done. (The VP9 payload [I-D.ietf-payload-vp9],
for instance, has done so.) If the payload also specifies how it is
used with the Frame Marking RTP Header Extension
[I-D.ietf-avtext-framemarking], the syntax MUST be defined in the
same manner as the TID and LID fields in that header.
4.1. H264 SVC
H.264 SVC [RFC6190] defines temporal, dependency (spatial), and
quality scalability modes.
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RES | TID |R| DID | QID |
+---------------+---------------+
Figure 6
Figure 6 shows the format of the layer index fields for H.264 SVC
streams. The "R" and "RES" fields MUST be set to 0 on transmission
and ignored on reception. See [RFC6190] Section 1.1.3 for details on
the DID, QID, and TID fields.
A dependency or quality layer refresh of a given layer in H.264 SVC
can be identified by the "I" bit (idr_flag) in the extended NAL unit
header, present in NAL unit types 14 (prefix NAL unit) and 20 (coded
scalable slice). Layer refresh of the base layer can also be
identified by its NAL unit type of its coded slices, which is "5"
rather than "1". A dependency or quality layer refresh is complete
once this bit has been seen on all the appropriate layers (in
decoding order) above the current layer index (if any, or beginning
from the base layer if not) through the target layer index.
Lennox, et al. Expires December 31, 2017 [Page 8]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
Note that as the "I" bit in a PACSI header is set if the
corresponding bit is set in any of the aggregated NAL units it
describes; thus, it is not sufficient to identify layer refresh when
NAL units of multiple dependency or quality layers are aggregated.
In H.264 SVC, temporal layer refresh information can be determined
from various Supplemental Encoding Information (SEI) messages in the
bitstream.
Whether an H.264 SVC stream is scalably nested can be determined from
the Scalability Information SEI message's temporal_id_nesting flag.
If this flag is set in a stream's currently applicable Scalability
Information SEI, receivers SHOULD NOT send temporal LRR messages for
that stream, as every frame is implicitly a temporal layer refresh
point. (The Scalability Information SEI message may also be
available in the signaling negotiation of H.264 SVC, as the sprop-
scalability-info parameter.)
If a stream's temporal_id_nesting flag is not set, the Temporal Level
Switching Point SEI message identifies temporal layer switching
points. A temporal layer refresh is satisfied when this SEI message
is present in a frame with the target layer index, if the message's
delta_frame_num refers to a frame with the requested current layer
index. (Alternately, temporal layer refresh can also be satisfied by
a complete state refresh, such as an IDR.) Senders which support
receiving LRR for non-temporally-nested streams MUST insert Temporal
Level Switching Point SEI messages as appropriate.
4.2. VP8
The VP8 RTP payload format [RFC7741] defines temporal scalability
modes. It does not support spatial scalability.
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RES | TID | RES |
+---------------+---------------+
Figure 7
Figure 7 shows the format of the layer index field for VP8 streams.
The "RES" fields MUST be set to 0 on transmission and be ignored on
reception. See [RFC7741] Section 4.2 for details on the TID field.
A VP8 layer refresh point can be identified by the presence of the
"Y" bit in the VP8 payload header. When this bit is set, this and
all subsequent frames depend only on the current base temporal layer.
Lennox, et al. Expires December 31, 2017 [Page 9]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
On receipt of an LRR for a VP8 stream, A sender which supports LRR
MUST encode the stream so it can set the Y bit in a packet whose
temporal layer is at or below the target layer index.
Note that in VP8, not every layer switch point can be identified by
the Y bit, since the Y bit implies layer switch of all layers, not
just the layer in which it is sent. Thus the use of LRR with VP8 can
result in some inefficiency in transmision. However, this is not
expected to be a major issue for temporal structures in normal use.
4.3. H265
The initial version of the H.265 payload format [RFC7798] defines
temporal scalability, with protocol elements reserved for spatial or
other scalability modes (which are expected to be defined in a future
version of the specification).
+---------------+---------------+
|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RES | TID |RES| LayerId |
+---------------+---------------+
Figure 8
Figure 8 shows the format of the layer index field for H.265 streams.
The "RES" fields MUST be set to 0 on transmission and ignored on
reception. See [RFC7798] Section 1.1.4 for details on the LayerId
and TID fields.
H.265 streams signal whether they are temporally nested, using the
vps_temporal_id_nesting_flag in the Video Parameter Set (VPS), and
the sps_temporal_id_nesting_flag in the Sequence Parameter Set (SPS).
If this flag is set in a stream's currently applicable VPS or SPS,
receivers SHOULD NOT send temporal LRR messages for that stream, as
every frame is implicitly a temporal layer refresh point.
If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit
types 2 to 5 inclusively identify temporal layer switching points. A
layer refresh to any higher target temporal layer is satisfied when a
NAL unit type of 4 or 5 with TID equal to 1 more than current TID is
seen. Alternatively, layer refresh to a target temporal layer can be
incrementally satisfied with NAL unit type of 2 or 3. In this case,
given current TID = TO and target TID = TN, layer refresh to TN is
satisfied when NAL unit type of 2 or 3 is seen for TID = T1, then TID
= T2, all the way up to TID = TN. During this incremental process,
layer refresh to TN can be completely satisfied as soon as a NAL unit
type of 2 or 3 is seen.
Lennox, et al. Expires December 31, 2017 [Page 10]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
Of course, temporal layer refresh can also be satisfied whenever any
Intra Random Access Point (IRAP) NAL unit type (with values 16-23,
inclusively) is seen. An IRAP picture is similar to an IDR picture
in H.264 (NAL unit type of 5 in H.264) where decoding of the picture
can start without any older pictures.
In the (future) H.265 payloads that support spatial scalability, a
spatial layer refresh of a specific layer can be identified by NAL
units with the requested layer ID and NAL unit types between 16 and
21 inclusive. A dependency or quality layer refresh is complete once
NAL units of this type have been seen on all the appropriate layers
(in decoding order) above the current layer index (if any, or
beginning from the base layer if not) through the target layer index.
5. Usage with different scalability transmission mechanisms
Several different mechanisms are defined for how scalable streams can
be transmitted in RTP. The RTP Taxonomy [RFC7656] Section 3.7
defines three mechanisms: Single RTP Stream on a Single Media
Transport (SRST), Multiple RTP Streams on a Single Media Transport
(MRST), and Multiple RTP Streams on Multiple Media Transports (MRMT).
The LRR message is applicable to all these mechanisms. For MRST and
MRMT mechanisms, the "media source" field of the LRR FCI is set to
the SSRC of the RTP stream containing the layer indicated by the
Current Layer Index (if "C" is 1), or the stream containing the base
encoded stream (if "C" is 0). For MRMT, it is sent on the RTP
session on which this stream is sent. On receipt, the sender MUST
refresh all the layers requested in the stream, simultaneously in
decode order.
6. SDP Definitions
Section 7 of [RFC5104] defines SDP procedures for indicating and
negotiating support for codec control messages (CCM) in SDP. This
document extends this with a new codec control command, "lrr", which
indicates support of the Layer Refresh Request (LRR).
Figure 9 gives a formal Augmented Backus-Naur Form (ABNF) [RFC5234]
showing this grammar extension, extending the grammar defined in
[RFC5104].
rtcp-fb-ccm-param =/ SP "lrr" ; Layer Refresh Request
Figure 9: Syntax of the "lrr" ccm
The Offer-Answer considerations defined in [RFC5104] Section 7.2
apply.
Lennox, et al. Expires December 31, 2017 [Page 11]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
7. Security Considerations
All the security considerations of FIR feedback packets [RFC5104]
apply to LRR feedback packets as well. Additionally, media senders
receiving LRR feedback packets MUST validate that the payload types
and layer indices they are receiving are valid for the stream they
are currently sending, and discard the requests if not.
8. IANA Considerations
This document defines a new entry to the "Codec Control Messages"
subregistry of the "Session Description Protocol (SDP) Parameters"
registry, according to the following data:
Value name: lrr
Long name: Layer Refresh Request Command
Usable with: ccm
Mux: IDENTICAL-PER-PT
Reference: RFC XXXX
This document also defines a new entry to the "FMT Values for PSFB
Payload Types" subregistry of the "Real-Time Transport Protocol (RTP)
Parameters" registry, according to the following data:
Name: LRR
Long Name: Layer Refresh Request Command
Value: TBD
Reference: RFC XXXX
9. References
9.1. Normative References
[I-D.ietf-avtext-framemarking]
Berger, E., Nandakumar, S., and M. Zanaty, "Frame Marking
RTP Header Extension", draft-ietf-avtext-framemarking-04
(work in progress), March 2017.
Lennox, et al. Expires December 31, 2017 [Page 12]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/
RFC2119, March 1997,
.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
July 2003, .
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, DOI
10.17487/RFC4585, July 2006,
.
[RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
"Codec Control Messages in the RTP Audio-Visual Profile
with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104,
February 2008, .
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/
RFC5234, January 2008,
.
[RFC6190] Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
"RTP Payload Format for Scalable Video Coding", RFC 6190,
DOI 10.17487/RFC6190, May 2011,
.
[RFC7741] Westin, P., Lundin, H., Glover, M., Uberti, J., and F.
Galligan, "RTP Payload Format for VP8 Video", RFC 7741,
DOI 10.17487/RFC7741, March 2016,
.
[RFC7798] Wang, Y., Sanchez, Y., Schierl, T., Wenger, S., and M.
Hannuksela, "RTP Payload Format for High Efficiency Video
Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, March
2016, .
9.2. Informative References
[I-D.ietf-payload-vp9]
Uberti, J., Holmer, S., Flodman, M., Lennox, J., and D.
Hong, "RTP Payload Format for VP9 Video", draft-ietf-
payload-vp9-03 (work in progress), March 2017.
Lennox, et al. Expires December 31, 2017 [Page 13]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
[RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
DOI 10.17487/RFC7656, November 2015,
.
[RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund,
"Using Codec Control Messages in the RTP Audio-Visual
Profile with Feedback with Layered Codecs", RFC 8082, DOI
10.17487/RFC8082, March 2017,
.
Authors' Addresses
Jonathan Lennox
Vidyo, Inc.
433 Hackensack Avenue
Seventh Floor
Hackensack, NJ 07601
US
Email: jonathan@vidyo.com
Danny Hong
Vidyo, Inc.
433 Hackensack Avenue
Seventh Floor
Hackensack, NJ 07601
US
Email: danny@vidyo.com
Justin Uberti
Google, Inc.
747 6th Street South
Kirkland, WA 98033
USA
Email: justin@uberti.name
Lennox, et al. Expires December 31, 2017 [Page 14]
Internet-Draft Layer Refresh Request RTCP Feedback June 2017
Stefan Holmer
Google, Inc.
Kungsbron 2
Stockholm 111 22
Sweden
Email: holmer@google.com
Magnus Flodman
Google, Inc.
Kungsbron 2
Stockholm 111 22
Sweden
Email: mflodman@google.com
Lennox, et al. Expires December 31, 2017 [Page 15]