By pure coincidence I read a 2009 posting about Windows network performance that referred to the AFD ("ancillary function driver") registry setting "DefaultSendWindow" (along with its partner, "DefaultReceiveWindow"). It's what controls the system-wide default for socket buffers. And the MSDN docs even say, "Applications can modify this value on a per-socket basis with the SO_RCVBUF socket option".

So yeah, I was looking for that damn thing for 2+ years. Changing the registry setting is an alternative way to apply the same modification (and it also affects other apps).

martin

2010-10-01

Thanks a lot for sharing your findings!
This issue has been added to tracker.
I'll include your proposed change in the next beta release.

ultramage

2010-09-30 22:18

Hello again :)
It's been two years since I noticed that all PuTTY-based software seems to exhibit a very poor TCP network pattern when doing bulk uploads: not using the sliding window algorithm and not aggregating data into full-sized TCP packets.

Last year I took a peek at the PuTTY source code, to try and see if I could spot any obvious problem. I did learn that PuTTY does its crypto in 512-byte blocks, and then places these blocks into a linked list, scheduled for sending. The sends are done using the send() function, individually block by block.

I made a screenshot, and just for the heck of it, wrote a small test case that imitated the networking approach. It was just a TCP socket doing 100 sends in a loop, sending 512 bytes each time. Funnily enough, this was enough to reproduce the issue on Windows (also, the same code performed perfectly on FreeBSD). I mailed the test case to Martin and to the PuTTY contact e-mail; Martin didn't know what to do about it, and the PuTTY team never replied.
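A minimal sketch of a test case of that shape, written in Python rather than whatever the original was written in. The function names and the loopback sink are assumptions for illustration; over loopback it only shows the send pattern and will not reproduce the slowdown, which needs a Windows sender and a remote peer.

```python
import socket
import threading

BLOCK_SIZE = 512    # bytes per send, as described in the post
BLOCK_COUNT = 100   # number of sends in the loop


def _drain(listener, result):
    """Accept one connection and count every byte until the peer closes."""
    conn, _ = listener.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result.append(total)


def send_in_small_blocks(block_size=BLOCK_SIZE, block_count=BLOCK_COUNT):
    """Send block_count blocks, one send call per block; return bytes received."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # hypothetical local sink on a free port
    listener.listen(1)
    result = []
    reader = threading.Thread(target=_drain, args=(listener, result))
    reader.start()

    sender = socket.create_connection(listener.getsockname())
    block = b"\0" * block_size
    for _ in range(block_count):
        sender.sendall(block)  # no application-level queueing or coalescing
    sender.close()
    reader.join()
    listener.close()
    return result[0]


if __name__ == "__main__":
    print(send_in_small_blocks(), "bytes delivered")
```

To look for the pattern itself, point the sender at a remote host and watch the segment sizes in a packet capture.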

Today I looked at the test case once more, trying to understand what could be going wrong. The test case was so generic that it applied to any Windows program that didn't use its own send queue buffers, which was kinda disturbing. So anyways, as an experiment I gave the socket a larger internal send buffer (setsockopt(SO_SNDBUF)) and the issue disappeared.
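For illustration, this is roughly what that experiment looks like through Python's socket module, which passes the option straight to the platform's setsockopt. The helper name is hypothetical, and the size the operating system reports back may differ from the request (some systems round it, double it, or cap it).

```python
import socket


def effective_send_buffer(requested_bytes):
    """Request a send buffer size and return what the OS reports afterwards."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Set the option before connecting so it applies from the first byte.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    finally:
        sock.close()


if __name__ == "__main__":
    print("effective SO_SNDBUF:", effective_send_buffer(64 * 1024))
```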

So I had the culprit, now it was all a matter of fine-tuning the buffer size. I wrote a simple testing tool that tried various buffer sizes and measured the overall transfer time. It showed that against a machine 5ms away, 256kB was the ideal buffer size; anything less took longer, and anything more actually took longer too (going too low actually turned off the sliding window algorithm). Using 256kB instead of the default 8kB halved the transfer time. Even more drastic, when testing a machine 180ms away, the difference was like 30 seconds vs. 3 seconds.
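A tool of that kind could be sketched as below. This is not the original tool: the names, the list of sizes, and the loopback sink are assumptions, and timings over loopback say nothing about the 5ms and 180ms cases, so a real measurement would connect to a remote host instead.

```python
import socket
import threading
import time


def _sink(listener):
    """Accept one connection and discard everything it sends."""
    conn, _ = listener.accept()
    with conn:
        while conn.recv(65536):
            pass


def time_transfer(sndbuf_bytes, block_size=512, block_count=2000):
    """Time one transfer of small blocks using the given SO_SNDBUF size."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # hypothetical local sink
    listener.listen(1)
    reader = threading.Thread(target=_sink, args=(listener,))
    reader.start()

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    sender.connect(listener.getsockname())

    block = b"\0" * block_size
    start = time.perf_counter()
    for _ in range(block_count):
        sender.sendall(block)
    sender.close()
    reader.join()  # stop the clock only once the sink has seen everything
    elapsed = time.perf_counter() - start
    listener.close()
    return elapsed


def sweep(sizes_kb=(8, 16, 32, 64, 128, 256, 512)):
    """Return {buffer size in kB: elapsed seconds} for each candidate size."""
    return {kb: time_transfer(kb * 1024) for kb in sizes_kb}


if __name__ == "__main__":
    for kb, seconds in sweep().items():
        print(f"{kb:4d} kB send buffer: {seconds:.3f} s")
```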

I then repeated these tests with 'pscp' and an ssh server, and the results matched what I saw earlier - against the 180ms machine, I got 40kB/s with unpatched 8kB buffer pscp, and 330kB/s with a patched version. In this case, a buffer size of 64kB was enough to reach the max. transfer speed. A huge, mind-blowing improvement. So people have been using PuTTY for a decade, and no one ever questioned why ssh clients on *nix systems transfer data several times faster?
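These figures are close to what a back-of-the-envelope estimate gives, under the assumption (not stated in the post) that the sender can keep at most one send buffer of unacknowledged data in flight per round trip, so throughput is capped at buffer size divided by round-trip time. The function name is hypothetical.

```python
def window_limited_throughput(window_bytes, rtt_seconds):
    """Upper bound in bytes/s when one window is in flight per round trip."""
    return window_bytes / rtt_seconds


RTT_SECONDS = 0.180  # the 180ms machine from the post

for buffer_kb in (8, 64):
    rate = window_limited_throughput(buffer_kb * 1024, RTT_SECONDS)
    print(f"{buffer_kb} kB buffer: about {rate / 1024:.0f} kB/s")

# 8 kB buffer: about 44 kB/s
# 64 kB buffer: about 356 kB/s
```

The estimates of about 44 kB/s and 356 kB/s sit just above the measured 40kB/s and 330kB/s, which is consistent with the send buffer being the limit in both runs.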

Alright, my e-mail address is now in my profile :)
Good luck with the testing.

martin

2008-12-29

Thanks. I'll try to reproduce it.

In case I can't, can you give me your email address, so I can send you a debug version of WinSCP to track the problem? If you do not want to post the address here, you can send me an email. You will find my address (if you log in) in my forum profile. Please include a link to this topic. Thanks.

ultramage

2008-12-29 10:34

I tried both SFTP (WinSCP shows "SFTP-3") and SCP (WinSCP shows "SCP"). Both gave the same result. I also went over the 'transfer settings' menu but none of the options I tried changed anything.

I made a Wireshark recording of this previously. Can you at least reproduce this 1460+76 fragment pattern?

martin

Re: packet fragmentation

2008-12-29

What protocol are you using with WinSCP?

ultramage

packet fragmentation

2008-12-24 22:23

Hello. During a network performance test I noticed that when uploading data, I can observe a repeating sequence of send:1460, send:76, recv:ACK. When I do the same transfer from a FreeBSD machine using the 'scp' tool, I get a clean stream of send:1460.

So it seems WinSCP is sending data using some internal block size instead of streaming it, and this size is above the MTU/MSS of the network. This leads to communication inefficiency, where 2 packets are sent instead of one each time.

I couldn't analyze this further due to lack of a build environment, and I couldn't find any direct explanation in the source code (besides one occurrence of '1536' inside RC4 crypto code). Any feedback is appreciated.
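As a quick check on the arithmetic: 1460 + 76 = 1536, so the observed pattern is what you would get if each 1536-byte write went out on its own and was cut at a 1460-byte MSS. The sketch below only illustrates that split; the function name is hypothetical, and whether 1536 really is the block size in play is an assumption.

```python
def split_into_segments(write_size, mss=1460):
    """Split one write into MSS-sized segments, assuming no coalescing."""
    full_segments, remainder = divmod(write_size, mss)
    return [mss] * full_segments + ([remainder] if remainder else [])


print(split_into_segments(1536))  # [1460, 76]
```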