Reviewreview-ietf-manet-dlep-13-rtgdir-early-berger-2015-06-11

[Note this is a WG LC related review, not IETF LC.]
Hello,
I have been selected as the Routing Directorate reviewer for this
draft. The Routing Directorate seeks to review all routing or
routing-related drafts as they pass through IETF last call and IESG
review, and sometimes on special request -- or WG Last call as was the
case here . The purpose of the review is to provide assistance to the
Routing ADs. For more information about the Routing Directorate, please
see ​
http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir
Although these comments are primarily for the use of the (chairs and)
Routing ADs, it would be helpful if you could consider them along with
any other Last Call comments that you receive, and strive to resolve
them through discussion or by updating the draft.
Document: draft-ietf-manet-dlep-14
Reviewer: Lou Berger
Review Date: June 8 (later than requested due to scope of comments -- sorry)
WG LC End Date: unknown
Intended Status: Standards track
Summary:
While I think the document is pretty decent for the scope of the
work, I do have concerns about this document and recommend that the
WG Chairs/Routing ADs discuss these issues further with the authors.
I'm also available as/if needed to discuss.
Comments:
I think the document shows significant good work and looks to be a
useful protocol, although I'm not overly familiar in this space.
That said, I have a number of serious concerns about the document,
and its contents from a few of perspectives. These include basic
protocol issues, underspecified details (which could lead to
interoperability issues), and specification/editorial issues. I
think the document / protocol can be modified to address the issues
I raise below. Of course, it is up to the WG, chairs, and ADs to
decide which comments to address and which to ignore.
I don't expect that all comments will result in changes.
Major Issues:
- The length field of the generic data item (i.e., TLV) is only 8
bits. While 255 bytes (assuming that this is the unit of measure,
which BTW isn't specified) is big enough today, allowing for
larger will greatly simplify things when 255 isn't enough. --
We've run into this in RSVP and it's a real pain.
- Version number is currently defined as a data item. This means a
signal (i.e., message) needs to be potentially fully parsed to
discover what version is being used. This precludes basic format
changes to the protocol. Perhaps the Discovery and Init Signals
should be special cased to include version in their formats. (And
shorten version to 8 bits from 32, as mentioned below).
- The document references, but does not define, 'in-session' and
'discovery' states. These either need to be formally defined or
removed. BTW we had exactly the same issue with LMP (RFC4204) and
ended up adding section 11 (FSMs) at a pretty late stage of the
process.
- TCP session management is not defined, nor is the relationship
with TCP and DLEP sessions fully defined. For example:
o Closing the TCP session is only mentioned in one place and in a
way that is inconsistent with the expected protocol behavior
(close TCP before ACK is received).
o What happens when a DLEP session is terminated, can the TCP
session be reused or must it be closed too?
- There is no transaction model defined. For example, it's
completely unclear if only one unacknowledged Signal allowed at a
time, or perhaps just one per signal type is allowed, or perhaps
there are no restrictions. This needs to be explicit.
- What is the purpose of retries and timeouts over TCP? Retries
aren't needed over TCPs and it's unclear whey they are being used.
- The higher level implications of ACKs, over TCP, isn't really
clear. It seems ACKs are defined for multiple purposes: reliable
transport, transaction acknowledgment and transaction results. Of
course the first isn't needed, and implications of the others
should be clear. For example, in section 7.10, why would there be
a retry when receiving a Destination Up ACK signal indicating an
error?
- There is no discussion on scaling considerations. Are there really
none? For example, how often might be appropriate to issue/limit
Peer Updates based to changes in link quality, or how to handle
the case where a large number (all or most) of destinations go
down.
- There are 13 places where the protocol allows implementation to
define their own 'heuristics'. Some of these seem unnecessary due
to the TCP point raised above, but any that remain in the protocol
should be fully specified to ensure predictable/consistent
behavior from implementations.
- Data Items are defined for "Extensions" and "Experimental
Definition" (Sections 8.7 and 8.8). Both seem to support for
optional mechanisms, but the former uses assigned numeric values,
why the latter uses UTF-8 strings.
o What, if any, is the intended distinction/relationship between
these?
o How does an "Experimental Definition" become standardized?
- Sections 8.19 and 8.20 define "Resources" related Data Items. The
definition related to these basically says a resources is "An
8-bit integer percentage, 0-100, representing the amount of
resources allocated to receiving|transmitting data.". If I were
implementing this protocol, I'd have no idea how to produce,
update or use this information. I think there is some missing
informative and normative (RFC 2119) text related to these
formats.
- Sections 8.21 and 8.22 (Relative Link Quality) have a similar
problem of being under described, in particular it's unclear if
there's a meaningful, non-proprietary definition for link quality
that an implementation is to act on or if the passed value is just
passed for as monitoring information. Either way, this needs to
be clarified.
- Section 9 defines a "credit-windowing scheme analogous to the one
documented in [RFC5578]". It describes how credits are exchanged,
but it provides zero definition on the implications or use of
credits relative to the data plane.
- Multiple ways to implement the same function are allowed, e.g.,
optional presence of Status, Interval and TCP port. Generally
allowing such complicates testing and leads to interoperability
issues. The document should pick one way and require it.
- The document doesn't state if there are any ordering requirements
on data items. It should be explicit on this, e.g., there are no
ordering requirements on the placement of Data Items within
Signals.
- The required and optional data items that are permitted on a
signal isn't always clear. For example are 0/1/N copies of a
particular Data Item required/allowed. Using something like ABNF
would really help formalize and clarify this.
- The document doesn't clearly delineate from informative/narrative
text, normative / required processing procedures, and message
formats. This by itself is not necessarily a major issue, it just
makes it harder to (write,) review and implement the protocol.
What is a major issue is that this approach allows for duplicate
(and sometimes contradictory) normative procedures and for
omissions in procedures (particularly related to exception/error
processing). Specific examples are included above and below. It
would be best to ensure that each required processing behavior is
defined just once and in a consistent way.
- The security consideration section is inadequate. This section
should address the security of the DLEP protocol, not user
traffic. It should include an analysis of risks/threats/possible
exploits and how these are mitigated by the protocol. rfc6952,
and the protocols it references can serve as examples.
Minor Issues:
- The data and signal type fields are both 8 bits. This seems
pretty small, particularly the data type field. Given this is a
control protocol, I think a larger (at least data type) field
would provide better "future proofing".
- 2^32 versions are currently allowed (section 8.1). This seems a
bit excessive. I'd opt for max of 8 bits here myself.
- It's probably too late, but it probably would be cleaner to have a
generic ack signal rather than a per signal type ack. I mention
this here as this may come up again when clarifying the
transaction model (as mentioned above.)
- Section 2: Assumptions
This section includes informative and normative text so is more
than just Assumptions. Personally, I'd remove all normative text
from the section.
- There are no specific rules related to UDP header formation.
- Sections 8.10->8.17. Isn't add/drop indicator needed for subnets
in destination update signals?
- The IANA Considerations sections must follow​ RFC2360.
- New registries must include initial values, which are defined in
the document. (The document currently has many TBDs that should
be replaced.)
- New registries need an allocation policy, e.g.:
The registry should be established with registration policies of
"Standards Action" (for Standards Track documents) and
“Specification Required" (for other documents). The designated
expert is any current <fill-in> WG chair.
Nits:
- The document introduces the terms "signals" and "data items" for
what is commonly called "messages" and "TLVs" (or objects) in
other protocols. It's probably too late to change this, but I
think the introduction of unique terminology is counter
productive.
- Use of RFC 2119 conformance language is a bit rough, and there are
words in all caps that are not defined in RFC2119. Take a look at
http://trac.tools.ietf.org/wg/teas/trac/wiki/PSGuideline
for some
suggestions.
- Internal socket operation is mentioned a couple of times. It
really shouldn't be, the spec should define behavior on the wire.
- The Length fields are missing unit of measure (presumably octets)
- The Mnemonics are used basically once and don't really add value,
suggest dropping them.
- How/when is the "Unknown Signal" Status Code sent?
- Section 8.7: Extension List should be shown as a variable length
field.
- Section 8.8: Experiment List should be shown as a variable length
field.
That's it -- for now -- hopefully I didn't miss anything. Look forward
to hearing response to the above (and how I got things hopelessly wrong ;-)
Lou