Speeding up Your FTP Transfers: Part II

Last time we looked at a simple way of increasing ftp throughput. We saw that Unix systems' default settings often specify small transmit and receive buffers, which translate into small transmit and receive windows, so data is sent across a wide-area network in unfortunately small chunks. This produces dead time between chunks as the sender waits for acknowledgments to come back from the receiver.

We used this simple formula,

<window size in bytes>/window * windows/sec = bytes/sec

to determine what throughput rate we could get over a wide-area network where the windows/sec was calculated with,

1 window / <time it takes, in seconds, to send one window of data>
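The two formulas above can be sketched in a few lines of Python (the function name here is our own, for illustration only):

```python
# Sketch of the formula above: throughput is one window of data per
# round trip, i.e. bytes/window * windows/sec = bytes/sec.

def throughput_bytes_per_sec(window_bytes, round_trip_secs):
    """Windows/sec is 1 / round-trip-time, so throughput is window/RTT."""
    windows_per_sec = 1.0 / round_trip_secs
    return window_bytes * windows_per_sec

# A 64-Kbyte window with a 100 ms round trip moves only ~640 Kbytes/sec.
print(throughput_bytes_per_sec(65536, 0.1))  # 655360.0
```

Note how a small default window caps throughput no matter how fast the link is, which is exactly the problem we saw last time.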

While this is useful for determining how much information can be transmitted based on how many windows can be sent per second, it does not, as one reader pointed out, take into account the bandwidth limitation of the connection. Thus, with our formula above, we could increase the window size to infinity and get, basically,

<infinite window size>/window * windows/sec = infinite bytes/sec

which would be excellent--if we lived in a perfect world. However, we are usually limited by a certain bandwidth, like the 100 Mbits/sec of a 10/100Base-T connection, 1 Gbit/sec for Gigabit Ethernet, or 155 Mbits/sec for an OC3 ATM link.

How does one include this limitation? As always, our goal is to send as much data as we can as quickly as we can. If we have a 100 Mbits/sec connection, we want to be able to send at that rate, right? We could try to simply shove data across the network at the full 100 Mbits/sec, but the link may not be reliable and we might lose some of it. With TCP, which applications like ftp, the web, and email use to reasonably assure delivery, the sender transmits a window's worth of data and then waits for the acknowledgment of the first part of that data before sliding its 'send window' forward to send more. As we noted last time, the delay between sending data and receiving its acknowledgment is the round-trip-time.

We would like to keep the pipe full all the time, but if our TCP window is too small and this round-trip-time is large, then there will be a gap in transmission while the sender waits for the acknowledgment to come back. If we are able to send at the highest data rate the whole time it takes for the initial data to get to the receiver and an acknowledgment to come back, we would get,

bandwidth * round-trip-time =
amount of data that can be sent in that round-trip-time

which would keep the pipe full because the sender gets the acknowledgment back from the receiver just at the time it has reached the end of its send window and moves the window forward to send more data.
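This product is simple enough to compute directly; a minimal sketch (the function name is illustrative, not from any library):

```python
# Hedged sketch of the bandwidth*delay product: the amount of data
# that can be 'in flight' during one round trip, which is the window
# size needed to keep the pipe full.

def bdp_bits(bandwidth_bits_per_sec, round_trip_secs):
    """Bits in flight during one round trip (bandwidth * delay)."""
    return bandwidth_bits_per_sec * round_trip_secs

# 100 Mbits/sec link with a 5-second round trip: 500 Mbits in flight.
print(bdp_bits(100e6, 5) / 1e6)  # 500.0
```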

For example, if it took five seconds for the acknowledgment to come back from the receiver to the sender, and the connection is a 100 Mbits/sec connection, we would be able to send,

100 Mbits/sec * 5 seconds = 500 Mbits or 62.5 Mbytes

of data during that time. Note that this is the 'window' of information that can be sent during the five seconds of delay between the first data sent and the first acknowledgment received, and is what we would try to set for our window or buffer size on the system. This result is called the bandwidth*delay product (pronounced 'bandwidth-delay product' rather than 'bandwidth-times-delay product' :) ).
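Turning the product into a byte count for the ftp buffer commands is just arithmetic; a sketch (the 2**20-bytes-per-Mbyte convention below is an assumption on our part, chosen to match the 65536000 figure used for the buffer setting):

```python
# From bandwidth*delay in bits to an ftp buffer size in bytes.

bandwidth_bits_per_sec = 100e6   # 100 Mbits/sec link
round_trip_secs = 5.0            # five-second round trip

bdp_bits = bandwidth_bits_per_sec * round_trip_secs  # 500 Mbits
bdp_mbytes = bdp_bits / 8 / 1e6                      # 62.5 Mbytes
buffer_bytes = int(62.5 * 2**20)                     # value for lbufsize/rbufsize

print(bdp_mbytes, buffer_bytes)  # 62.5 65536000
```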

In Kerberos ftp, this means we would set the buffer sizes to 62.5 Mbytes:

ftp> lbufsize 65536000
ftp> rbufsize 65536000

Five seconds is quite a large delay and it is more common to see a delay between 50-200 milliseconds on the Internet. If we have a 100 Mbit/sec connection and the round trip time is 100 milliseconds, or 0.1 seconds, our bandwidth*delay product would be:

100 Mbits/sec * 0.1 seconds = 10 Mbits or 1.25 Mbytes

and we would set our window size in ftp accordingly:

ftp> lbufsize 1310720
ftp> rbufsize 1310720

to try to send data at the full 100 Mbits/sec, or 12.5 Mbytes/sec. If we could keep that rate going we could transfer a 100-Gbyte file in,

100 Gbytes / (12.5 Mbytes/sec) = 8192 seconds or 2.28 hours.
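A quick back-of-the-envelope check of that transfer time (plain arithmetic, using 1 Gbyte = 1024 Mbytes as the 8192-second figure implies):

```python
# Time to move a 100-Gbyte file at the full 100 Mbits/sec line rate.

file_mbytes = 100 * 1024         # 100-Gbyte file, in Mbytes
rate_mbytes_per_sec = 12.5       # 100 Mbits/sec = 12.5 Mbytes/sec

seconds = file_mbytes / rate_mbytes_per_sec
print(seconds, round(seconds / 3600, 2))  # 8192.0 2.28
```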

Not bad, eh? Note that our selection of a 1-Mbyte window the last time was close to this size.

This covers the simple aspects of sending data at, or nearly at, the full data rate. There are still many other things that can affect your transfer rates, such as the quality of the communications links between you and the remote end, the effect of data transmission errors, system resource issues at either end, and the effect of other people's transfers on yours. But this is a good start at speeding up those ftp transfers.

For further information about increasing the performance of data transfers, check out these URLs:

ARSC Advanced Display Environments Workshop

As staff and researchers gain experience with ARSC's new four-walled immersive environment, the Discovery Lab, we continue looking to the future of visualization as an aid for analysis and expression of computational results.

To this end, ARSC is sponsoring an "Advanced Display Environments Workshop," here at UAF. The schedule (still subject to minor changes) is posted below. Sessions are open to UAF, ARSC, and HPCMP researchers. Please contact Jon Genetti (ffjdg@uaf.edu) in advance if you are interested in attending.
