
Large file transfer over VPN is a problem for many companies, for a few reasons: the transfer is unreliable, the VPN traffic can kill the Internet connection, and failed transfers waste everyone's time. This article will show you how to copy large files over VPN or other unreliable network connections, some of the best software to deal with this, best practices for large file transfers, and how to ensure file integrity. These are, in my experience, the best ways to do it, but you have to evaluate for yourself whether they work in your environment, and test a lot.

When copying files over VPN, there are a few problems that need to be addressed:

the file transfer can be easily interrupted,

the VPN connection can be over-saturated,

all the available traffic can be redirected to the VPN connection,

the transferred file may arrive damaged.

Let’s talk a bit about all of these, why they are important, and how they affect the success of your file copy. This will give you, hopefully, a clearer picture of the process.

Network Interruption – File Transfer Failed

When we talk about large file transfers, an interruption after a few hours is a serious problem: you lose all the time already spent, only to realize you have to start over because the transfer failed. There are many reasons for the network to drop, and even a one-second outage is enough to corrupt your file. VPN is especially prone to interruptions with large files, because the transfer saturates the Internet bandwidth, and when other VPN clients try to use the tunnel, the file transfer can be cut off.
The fix for this is some kind of transfer resuming, supported at both the server level and the client level. A few client-server protocols that support file resuming are: SMB, FTP, HTTP, rsync.
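
To see what resuming amounts to mechanically, here is a small sketch in plain shell. The local files and names are stand-ins for the network transfer; the point is only to show the principle that all the protocols above build on.

```shell
# Simulate an interrupted copy and a resume: copy part of a file,
# then continue from the size of the partial file -- the same idea
# behind SMB/FTP/HTTP/rsync resume support.
printf 'ABCDEFGHIJ' > source.dat
dd if=source.dat of=partial.dat bs=1 count=4 2>/dev/null   # "interrupted" after 4 bytes
offset=$(wc -c < partial.dat)                              # how much already arrived
offset=$((offset))                                         # normalize wc output
dd if=source.dat of=partial.dat bs=1 skip="$offset" seek="$offset" \
   conv=notrunc 2>/dev/null                                # continue where it left off
cmp -s source.dat partial.dat && echo "resume complete"
```

The key detail is that the receiver only needs to know how many bytes it already has; the sender then skips that many bytes and appends the rest.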

The easiest way to implement file transfer resuming in a corporate environment is through SMB and robocopy. Robocopy has a feature to copy in restartable network mode, so if the network goes down, it will automatically resume the transfer from where it left off. The robocopy option to copy in restartable mode is /Z, as with the regular copy command.

robocopy X:\source-folder \\RemoteServer\RemoteFolder /MIR /Z

Rsync also has a resume option, and it works great, but make sure you transfer over the rsync/SSH protocol and not to a locally mounted network directory, otherwise the partial-transfer logic cannot do its job. The command to transfer will look like this:

rsync -aP juser@server:/RemoteServer/Directory /Home/Local-Directory

FileZilla has an option to resume files after an interruption, but a timeout is set by default. Make sure you set the timeout to 0, so that you can resume even after a few hours with no connection between server and client. Note that this is not a good option if you have many clients, since it will keep connections open indefinitely.

Apache also supports file resuming, and it is enabled by default. The disadvantage with Apache is that, by default, it doesn’t support file upload. If you are determined to use Apache for this, though, there are some ways to do it; you can start your research here: File Upload plugin for Apache

Over Saturation of the VPN Connection

The VPN connection is there for many users; don’t think that your file is the most important thing in the world. If you take all of the bandwidth for your file transfer, other users might not be able to perform important daily tasks.

The over saturation of the VPN connection can be avoided by implementing bandwidth limiting at the software level.

You can do that with SMB by using robocopy, at the client level. The command will look like this:

robocopy X:\source-folder \\RemoteServer\RemoteFolder /MIR /IPG:250

The /IPG parameter is the one that controls the bandwidth; it’s an acronym for Inter-Packet Gap, and it tells robocopy to wait that many milliseconds between blocks of data. The lower the IPG, the higher the bandwidth saturation; with /IPG:250 the transfer is throttled down to a few hundred KB/s, and the effective rate also depends on the network speed.
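
The throttled rate is easy to sanity-check. Assuming robocopy inserts the /IPG delay between 64 KB blocks (the commonly reported behavior; Microsoft's documentation does not spell out the block size), the math, ignoring transmit time, works out to:

```shell
# Rough throughput for robocopy /IPG:n, assuming one 64 KB block
# is sent every n milliseconds (transmit time ignored).
ipg_ms=250
block_kb=64
kb_per_s=$(( block_kb * 1000 / ipg_ms ))
echo "/IPG:${ipg_ms} -> ~${kb_per_s} KB/s"   # -> ~256 KB/s
```

Plugging in other /IPG values gives you a quick way to pick a limit that leaves headroom for the rest of the VPN users.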

With FTP, the limit is set at the server level, and it’s very simple if you use FileZilla Server: just restrict the speed limit to a safe value in the server settings.

Control VPN Traffic

It looks like a simple decision to allow as much traffic as possible through the VPN, and most companies will decide that VPN traffic has the highest priority. In real life, however, many non-VPN applications run over the Internet and are critical for a business: booking a flight ticket, using a hosted web application, getting your email from a hosted email server, and so on. So it might make sense to control the maximum bandwidth used by the VPN, and this is especially needed in environments where large file transfers over VPN are very common. The best way to control this is through firewall policies that limit the bandwidth for the VPN destination. On some devices, like the Fortigate firewalls I am using, this is called traffic shaping; on other devices it might be named differently.
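
If your gateway is a Linux box rather than a dedicated firewall, the same idea can be sketched with tc. This is only an illustration of the concept; the interface name (eth0) and the 50 Mbit cap are assumptions, and in practice you would shape only the VPN traffic class, not the whole interface.

```shell
# Cap outbound traffic on eth0 to 50 Mbit/s with a token bucket filter
# (run as root on the gateway; adjust device and rate to your setup).
tc qdisc add dev eth0 root tbf rate 50mbit burst 32kbit latency 400ms

# Remove the cap again:
tc qdisc del dev eth0 root
```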

File Integrity Verification

Rsync verifies each transferred file with a checksum, and robocopy will retry failed copies, but for peace of mind you should verify the result yourself.

A simple way to make sure your local file is identical to the remote one is to compare MD5 hashes. On Linux, md5sum comes by default in most distributions; on Windows you can use WinMD5, which you can download here: WinMD5
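
On the Linux side the check is a one-liner. A minimal sketch, with two local copies standing in for the source file and the transferred file (the names are placeholders; in practice you run md5sum on each end of the transfer and compare the digests):

```shell
# Identical MD5 digests mean the transfer is intact.
printf 'payload' > original.bin
cp original.bin transferred.bin             # stands in for the copied file
src=$(md5sum original.bin    | awk '{print $1}')
dst=$(md5sum transferred.bin | awk '{print $1}')
[ "$src" = "$dst" ] && echo "file intact" || echo "file corrupted"
```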

Slow Server

How do we know the server is slow and the problem is not elsewhere?
Make a file transfer between any two other computers on the network, and compare the measurements with the server’s transfer rates.

What are the reasons for a slow server?
There are many reasons for a slow server; the server is often the bottleneck of a network. Here are a few:

An average, or below average, network card. You need good-quality network cards for a server.

Server network card underutilized. Connect your server on the backbone or on 1 Gb switch ports to make use of the high-speed network card. You probably want to limit all your clients to 100 Mb so that there is no traffic discrimination. If your server and switches support higher transfer rates (10 Gb ports), make sure you make use of them.

Slow disks. Poor hardware is often the main reason. Improper configuration, such as choosing the wrong RAID type or not using write caching, can be another.

Too many clients on a server. If too many clients make requests to the same server, it can overload the server, and users will perceive it as a slow network. Measure your server’s performance under load using Performance Logs and Alerts and the System Monitor in Windows. Usually performance changes over the course of a day based on the number of users who access the server at the same time. Sometimes adding another network card is sufficient. Enabling write caching on the SCSI card can help a lot (make sure you install a cache battery), and adding a new SCSI card and additional disks to offload the existing ones can also help. Sometimes adding another CPU makes a difference (if you have free CPU sockets). Memory is very often the first upgrade people reach for, but most of the time it is not the needed solution. Use Performance Logs and Alerts and the System Monitor, and compare against the recommended thresholds, to determine what your bottleneck is.

Slow server response. Use a packet sniffer to determine the handshake time, and adjust the server’s configuration to optimize it (this is a fairly advanced optimization task).
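
A quick way to get a comparable number for the first step is to time a copy of a known-size file. A rough sketch (the destination path is a placeholder; point it at the share or server you want to measure, and run the same test between two other machines for comparison):

```shell
# Time a 32 MB copy and print the rate in MB/s. The destination here is
# local only for illustration; aim it at e.g. a mounted server share.
dd if=/dev/zero of=testfile.bin bs=1M count=32 2>/dev/null
start=$(date +%s%N)
cp testfile.bin testfile.copy        # e.g. /mnt/servershare/testfile.bin
end=$(date +%s%N)
ms=$(( (end - start) / 1000000 ))
[ "$ms" -gt 0 ] || ms=1              # guard against sub-millisecond copies
echo "$(( 32 * 1000 / ms )) MB/s"
rm -f testfile.bin testfile.copy
```

If the two-computer baseline is much faster than the server, the server (or its disks, card, or load) is the likely bottleneck.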
An average, or below average network card, (you need good quality network cards for a server).Server Network Card Underutilized. Connect your server on the backbone or on 1GB switch ports to make use of the high speed network card. You probably want to limit all your clients to transfer at 100Mb so that there is no traffic discrimination. If your server and switches support higher transfer rates, (10GB ports), make sure you make use of it.Slow disks. Poor hardware is many times the main reason. Improper configuration, such as choosing the wrong RAID type, or not using write caching can be another reason.Too many clients on a server. If too many clients make requests to the same single server this could overload the server and it will perceived as a slow network by the users. Measure your server’s performance on load using the performance logs and alerts and the system monitor in Windows. Usually the performance is changing over the course of a day based on the number of users who access the server at the same time. Sometimes adding another network card would be sufficient. Enabling cache writing on the SCSI card can help a lot, (make sure you install a cache battery), adding a new SCSI card and additional disks to offload the existing ones could be of help. Sometimes adding another CPU can make a difference, (if you have free CPU slots). Memory is very often the most used method of upgrading, but most of the times it is not the needed solution. Use the performance logs and alerts and the system monitor and compare with the recommended thresholds to determine what your bottleneck is.Slow server response, (packet sniffer to determine the handshake time), Adjust the server’s configuration to optimize the handshaking time; (this is a fairly advanced optimization task).