Hello,
Yesterday, we discussed options for a Linux Plumbers Conference
networking track talk.
Here is a first draft of an abstract. Feel free to comment and react:
===
Multipath TCP (MPTCP) is increasingly popular, but it is not yet in the
upstream Linux kernel. A fork has been maintained out of tree since
March 2009, but it cannot be upstreamed as is: that implementation is
designed around MPTCP, and its impact on the TCP stack is too heavy in
terms of maintainability and, to a lesser extent, performance.
In this presentation, we would like to present the challenges we are
facing. Some are introduced by the MPTCP protocol itself, others by the
objective we set ourselves: minimize the impact on the existing TCP
stack. We would like to have no performance regression, a maintainable
and configurable solution, and an MPTCP implementation that can be used
in a variety of deployments.
The MPTCP upstreaming community is working on an RFC patch set for
net-next. We should be able to send it before the next LPC in September.
In the current situation, a socket can be created with IPPROTO_MPTCP to
initiate and accept an MPTCP connection. This socket remains compatible
with regular TCP, and IPPROTO_TCP socket behavior is unchanged. This
implementation makes use of ULP (Upper Layer Protocol) between the
userspace-facing MPTCP socket and the set of in-kernel TCP sockets it
controls, to minimize the impact on the current TCP stack. ULP has been
extended for use with listening sockets. skb_ext is used to carry MPTCP
metadata.
Both the communication and the code are public and open. You can find
us at mptcp(a)lists.01.org and https://is.gd/mptcp_upstream
===
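To make the socket-level behaviour described in the abstract concrete, here is a minimal userspace sketch, not taken from the patch set. It assumes the value 262 for IPPROTO_MPTCP (the value proposed elsewhere on this list; it was still under discussion at this point), and open_stream_socket() is a hypothetical helper name. It tries IPPROTO_MPTCP first and falls back to plain TCP when the running kernel does not accept it.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Assumption: 262 is the value proposed on this list; headers that
 * predate MPTCP support do not define the constant at all. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/*
 * Hypothetical helper: open a stream socket, preferring MPTCP.
 * Returns a file descriptor, or -1 with errno set by socket().
 * If used_mptcp is non-NULL, it is set to 1 when the MPTCP socket
 * was created and to 0 when the plain TCP fallback was attempted.
 */
int open_stream_socket(int family, int *used_mptcp)
{
	int fd = socket(family, SOCK_STREAM, IPPROTO_MPTCP);

	if (used_mptcp)
		*used_mptcp = (fd >= 0);
	if (fd >= 0)
		return fd;

	/* Kernel without MPTCP support: regular TCP is unchanged. */
	return socket(family, SOCK_STREAM, IPPROTO_TCP);
}
```

A caller would use the returned descriptor exactly like a TCP socket (connect(), send(), recv(), close()), which is the compatibility property the abstract refers to.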
Do not hesitate to improve it, fix typos or restart from scratch if
needed, I don't mind!
Cheers,
Matt
--
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

Hello,
As we previously discussed, it could be nice to go to LPC to present
the current status and have face-to-face meetings. Don't forget that
registration will re-open in a few days:
http://www.cvent.com/d/s6q3j2/4W
Cheers,
Matt

While debugging socket state transitions in MPTCP sockets, I found that
MPTCP sockets were listing IPPROTO_TCP as their protocol in
/sys/kernel/debug/tracing/trace. sock->sk_protocol was only 8 bits wide,
truncating the new value of IPPROTO_MPTCP (0x0106) to IPPROTO_TCP
(0x06).
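The truncation can be reproduced outside the kernel with a small sketch. The two structs below are hypothetical stand-ins for an 8-bit and a 16-bit protocol field, not the real struct sock, and the EX_ constants only mirror the values quoted above.

```c
#include <stdint.h>

#define EX_IPPROTO_TCP   0x06
#define EX_IPPROTO_MPTCP 0x0106	/* value quoted in this cover letter */

/* Hypothetical stand-ins for the field before and after the change. */
struct narrow_sock { uint8_t  protocol; };
struct wide_sock   { uint16_t protocol; };

/* Store a protocol value in an 8-bit field and read it back. */
unsigned int store_narrow(unsigned int proto)
{
	struct narrow_sock sk = { .protocol = (uint8_t)proto };

	return sk.protocol;	/* only the low 8 bits survive */
}

/* Store a protocol value in a 16-bit field and read it back. */
unsigned int store_wide(unsigned int proto)
{
	struct wide_sock sk = { .protocol = (uint16_t)proto };

	return sk.protocol;	/* values up to 0xffff are preserved */
}
```

With these helpers, store_narrow(EX_IPPROTO_MPTCP) yields EX_IPPROTO_TCP, which matches what the tracepoint reported, while store_wide() keeps the two protocols distinct.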
The networking code has varying integer widths for 'protocol' at
different layers:
* POSIX socket API: 32 bits
* sk_buff->protocol: 16 bits
* IP header (on the wire): 8 bits
MPTCP shows a use for protocol values outside those that fit in an IP
header. The change to struct sock fills an 8-bit hole, so there is no
change in the size of the structure.
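As an illustration of why widening a field into an existing hole leaves the structure size unchanged, consider the sketch below. The layouts are hypothetical and much smaller than struct sock; they only assume the usual 4-byte alignment of uint32_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with an 8-bit protocol field. */
struct layout_before {
	uint16_t flags;
	uint8_t  protocol;	/* followed by a 1-byte hole */
	uint32_t next_field;	/* needs 4-byte alignment */
};

/* Same layout with the protocol field widened to 16 bits. */
struct layout_after {
	uint16_t flags;
	uint16_t protocol;	/* now occupies the former hole */
	uint32_t next_field;
};

/* Compile-time checks: size and following offsets are unchanged. */
_Static_assert(sizeof(struct layout_before) == sizeof(struct layout_after),
	       "widening into the hole must not grow the struct");
_Static_assert(offsetof(struct layout_before, next_field) ==
	       offsetof(struct layout_after, next_field),
	       "fields after the hole must not move");
```

For the real structure, a tool such as pahole can show the holes before and after the patch.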
Given that we are currently discussing the appropriate value for
IPPROTO_MPTCP, I'm sending this as an RFC to inform those discussions. I
had previously thought that the 16-bit value for IPPROTO_MPTCP was
compatible with the existing code base.
Mat Martineau (3):
  net: Make sock protocol value checks more specific
  sock: Make sk_protocol a 16-bit value
  net: Add IPPROTO_MPTCP to inet_sock_set_state tracepoint output
 include/net/sock.h          | 5 ++---
 include/trace/events/sock.h | 5 +++--
 net/ax25/af_ax25.c          | 2 +-
 net/decnet/af_decnet.c      | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)
--
2.22.0

Hello,
We just had our 56th meeting with Mat, Peter and Ossama (Intel OTC),
Christoph (Apple), Davide and Florian (Red Hat) and myself (Tessares).
Thanks again for another good meeting!
Here are the minutes of the meeting:
Accepted patches:
    - mptcp: fix remaining checkpatch issue:
        - by Matth
        - reviewed by Mat
        - "squashed" in "mptcp: Write MPTCP DSS headers to outgoing data
          packets"
        - no signed-off added for this fix
    - mptcp: move MPTCP option bits to internal header:
        - by Matth
        - reviewed by Mat
        - "squashed" in 3 different commits, no signed-off
    - mptcp: Re-factor mptcp_create_subflow():
        - by Peter
        - reviewed by Matth
        - "squashed" in "mptcp: Associate MPTCP context with TCP socket"
Pending patches:
    - mptcp: simplify crypto.c:
        - by Davide
        - reviewed by Mat, Florian and Christoph
        - we can have something just random
        - maybe later we can switch to a hash as an optimisation
    - Change sock->sk_protocol to a 16-bit value:
        - by Mat
        - for the discussions with IPPROTO_MPTCP, see below
        - decision: we apply this
    - mptcp: Make MPTCP socket block/wakeup ignore sk_receive_queue:
        - by Mat
        - linked to Mat's work on the Data FIN.
        - it was blocked while it should not be.
        - we cannot simply check the end of the received queue with
          MPTCP. That's why the behaviour needs to be different with MPTCP
        - feel free to review
        - can be squashed or added at the end:
            - if it is a fix for a bug introduced in a previous commit,
              better to squash (except if it is to explicitly show
              something particular to MPTCP of course)
            - we can squash
IPPROTO_MPTCP:
    - Mat: sock->sk_protocol to a 16-bit value (increases a handful of
      array sizes)
    - Hoang: #define SOL_X25 262
    - Mat: we might want to set SOL_MPTCP at some point:
        - could be good to avoid collisions.
    - Maybe good to merge the patch and wait for feedback later.
Feedback from netconf:
    - slides at http://vger.kernel.org/netconf2019.html
    - no show stopper foreseen by anyone
    - got one question wrt. using kTLS with MPTCP (both use ULP
      infrastructure), we should have a look at this but not a major
      issue for now (stacked ulp...?)
    - Eric asked about diag support, Davide is already working on this
        - diag ulp infra should be upstreamed independently (for ktls)
    - one question was about path management, no objections to us adding
      something very simple plus the genetlink based one to place
      decision making in userspace
    - one concern is wrt. local security holes, we can ask syzkaller
      people to start also running on the mptcp tree once we get ready
      to upstream, or initially restrict IPPROTO_MPTCP to init_user root
      to limit impact (or both).
    - no need to implement mptcp-level (coupled) congestion control on
      top of subflows (i.e., it's fine to use more bandwidth than one
      standard tcp flow)
Send LPC proposal:
    - maybe going more into initial features and roadmap, comparison
      with mptcp.org, client/server view, etc.
    - could be good to send it early next week.
    - Mat will look at the draft
mptcp.org:
    - supporting MPTCP v1 seems problematic (when using v0, default
      behaviour) but almost there
Next meeting:
    - We propose to *skip* the next one (4th of July). Next one would
      then be the 11th of July.
    - Usual time: 16:00 UTC (9am PDT, 6pm CEST)
    - Still open to everyone!
    - https://annuel2.framapad.org/p/mptcp_upstreaming_20190711
Feel free to comment on these points and propose new ones for the next
meeting!
Talk to you next week,
Matthieu

Instead of using 99 for IPPROTO_MPTCP (which IANA defines as "any
private encryption scheme"), use 262 (0x100 | IPPROTO_TCP). The MPTCP
self tests continue to run successfully with these changes.
Earlier in development we used 262 for IPPROTO_SUBFLOW as it would get
truncated to 0x06 (IPPROTO_TCP) in the IP header. Now that
IPPROTO_SUBFLOW has been removed in favor of using ULP, 262 is freed up
for MPTCP.
Note that this does change the value of IPPROTO_MAX. In reviewing its
occurrences around the kernel, most uses look fine;
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c has one instance of IPPROTO_MAX
where its meaning is not immediately obvious.
Mat Martineau (2):
  squash-to: Define IPPROTO_MPTCP
  squash-to: add basic kselftest program
 include/uapi/linux/in.h                           | 4 ++--
 tools/include/uapi/linux/in.h                     | 2 ++
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)
--
2.22.0