I've noticed that issue with Steam too. Steam uses lots of ECN, which can
be nice for saving bandwidth with large games. The problem is that Steam
is the one application that can cause me to have ping spikes of over
100ms, even though I have thoroughly tested my network using both flent
and dslreports.
I also get large sparse delays in the cake stats during Steam downloads.
The highest I can remember right now is around 22ms.
On 6/4/2016 9:55 AM, Jonathan Morton wrote:
> On 4 Jun, 2016, at 04:01, Andrew McGregor <andrewmcgr at gmail.com> wrote:
>> ...servers with ECN response turned off even though they negotiate ECN.
> It appears that I’m looking at precisely that scenario.
> A random selection of connections from a packet dump show very high marking rates, which are apparently acknowledged using CWR, but a subsequent dropped packet (probably due to queue overflow) takes many seconds to be retransmitted (I’m using a rather high memory limit for observation purposes).
> Overall the TCP behaviour is approximately normal for NewReno on a dumb FIFO, and the ECN signalling is completely ignored. This doesn’t rule out the possibility that it’s a different Reno relative, such as Westwood+ or Compound.
> There’s often more than one CWR per RTT. This isn’t a consistent characteristic; some connections have normal-looking CWRs while others issue them every three packets, as if they’re fishing for “more accurate” ECN feedback. It might vary by host; I didn’t keep track of that. But this can’t be DCTCP; even that should back off in the face of a 100% marking rate, which is often achieved at my low bandwidth and with very persistent queues.
> Other servers respond normally to ECN signals, ruling out interference by my ISP. It’s possible the ECE flag is wiped and the CWRs are faked, but there’s no legitimate reason to do that. The CWRs ultimately make no difference, since at 100% CE marks, every ack has ECE set anyway.
> Turning off ECN negotiation at the client results in a much better managed queue with similar throughput. It’s not immediately obvious whether that’s due to a functioning congestion response or simply the AQM clearing out the queue the hard way. It’ll be interesting to see what effect COBALT has here, when I get it to actually work.
> As for who these servers are: Valve Software’s Steam platform. I did say they were large and popular.
> - Jonathan Morton
>
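
For anyone who wants to check the marking rates described above on their own capture, here is a rough sketch of the bookkeeping. It only does the counting: it assumes you have already pulled the IP TOS byte and the TCP flags byte out of each packet with whatever tool you prefer. The tuple layout and the name ecn_summary are made up for illustration; the bit values are the ones defined in RFC 3168.

```python
# ECN codepoints: low two bits of the IP TOS / traffic-class byte (RFC 3168)
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

# ECN-related bits in the TCP flags byte (RFC 3168)
TCP_ECE = 0x40
TCP_CWR = 0x80


def ecn_summary(packets):
    """Summarise ECN signalling for one connection.

    packets is an iterable of (direction, tos_byte, tcp_flags) tuples.
    This layout is hypothetical; direction is "down" for server -> client
    and "up" for client -> server.
    """
    down = up = ce = cwr = ece = 0
    for direction, tos, flags in packets:
        if direction == "down":
            down += 1
            if tos & 0b11 == CE:      # marked by the AQM on the way in
                ce += 1
            if flags & TCP_CWR:       # server claims it reduced cwnd
                cwr += 1
        else:
            up += 1
            if flags & TCP_ECE:       # client echoing congestion back
                ece += 1
    return {
        "ce_rate": ce / down if down else 0.0,
        "cwr_rate": cwr / down if down else 0.0,
        "ece_rate": ece / up if up else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample values, not taken from a real capture.
    sample = [
        ("down", 0x03, 0x10),
        ("down", 0x03, 0x90),
        ("down", 0x02, 0x10),
        ("down", 0x03, 0x10),
        ("up", 0x00, 0x50),
        ("up", 0x00, 0x50),
    ]
    print(ecn_summary(sample))
```

A ce_rate near 1.0 together with frequent CWRs and no visible drop in sending rate would match the pattern in the quoted message, but this only counts flags; it says nothing about what the sender's congestion window is actually doing.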