What is packet loss?

Packet loss describes an error condition in which data packets appear to be transmitted correctly at one end of a connection, but never arrive at the other. This might be because:

network conditions are poor and the packet became damaged in transit;

the packet was deliberately dropped at a router because of internet congestion.

What is the effect of packet loss?

The main observable effect of packet loss is poor data throughput performance. But packet loss is not the only cause of poor performance, so some care is needed in diagnosing whether packet loss is being experienced.

Diagnosing packet loss

Packet loss exhibits itself in different ways according to the protocol that is being observed or tested. Certain error conditions might affect one protocol but not others.

With UDP (User Datagram Protocol), used by many online games and streaming media applications, standard system tools cannot tell you whether UDP packets are being lost: UDP provides no delivery feedback, so only the applications themselves have any knowledge of their network usage. The games or applications might tell you whether they consider they are losing packets.

The ICMP (Internet Control Message Protocol) is used by the ping and traceroute commands. The output of a ping command normally finishes with an analysis of packets sent, returned, and lost. Ideally there should be no losses; a loss rate exceeding 5% indicates a serious problem. The output of a traceroute command will show an asterisk in place of a dropped packet. The ping command is a fair way of establishing whether there is reasonable connectivity with a specific remote host, but it does not indicate at which point in the journey things might be going wrong. The traceroute command can sometimes indicate the point in the route at which packet loss occurs, or begins to occur.
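If you capture the summary line that ping prints, the loss figure can be extracted programmatically. A minimal sketch, assuming the common Linux/macOS summary format (the sample line is illustrative, not output from a real host):

```python
import re

def ping_loss_percent(summary_line: str) -> float:
    """Extract the packet-loss percentage from a ping summary line.

    The summary format varies slightly between operating systems; this
    sketch assumes the common Linux/macOS style, e.g.
    '20 packets transmitted, 19 received, 5% packet loss, time 19023ms'.
    """
    match = re.search(r"([\d.]+)% packet loss", summary_line)
    if match is None:
        raise ValueError("no packet-loss figure found in summary line")
    return float(match.group(1))

line = "20 packets transmitted, 19 received, 5% packet loss, time 19023ms"
print(ping_loss_percent(line))  # 5.0
```

On Windows the wording differs ("Lost = 1 (5% loss)"), so the pattern would need adjusting there.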

The TCP (Transmission Control Protocol) is used to carry most of the important data on the internet, including web pages, e-mail, news posts, and file transfers. TCP guarantees end-to-end delivery and is self-healing: it detects errors and recovers, which might involve re-transmitting packets after packet loss. This happens automatically, and the applications using TCP are entirely unaware that errors have occurred and been fixed. The applications therefore cannot inform you of any such error conditions. The only thing you might observe is poor upload or download performance: poorer than your cable modem rate cap would suggest.

Although we can't count lost packets directly, we can infer packet loss by examining the TCP error-recovery statistics kept by the operating system, using netstat -s -p tcp.

For example, netstat output showing that, of 1617 packets sent, only 2 had to be retransmitted indicates a very healthy connection. A retransmission rate higher than a few percent would be poor. Retransmission might occur because:

a packet that this PC sent was lost and never received at the far end; or

the packet was received at the far end, but the ACK acknowledgement packet from the far end back to this PC was lost.

These figures are cumulative totals since the last restart, and might therefore include the effects of some problem long since cured. What matters is how the counters are changing now, so when investigating an upload performance problem it is better to:

use netstat to record the values of the Sent and Retransmitted counts;

perform a large upload, e.g. FTP to some remote FTP server;

use netstat to record the new values of the Sent and Retransmitted counts;

subtract the earlier counts from the later counts to derive the counts for the upload itself;

if the Retransmitted count increase is more than a few percent of the Sent count increase, then there must be some packet loss problem associated with that upload.
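The arithmetic in the steps above is simple enough to sketch. The counter values here are hypothetical, standing in for the Sent and Retransmitted figures read from netstat before and after the test upload:

```python
def retransmit_rate(sent_before, retrans_before, sent_after, retrans_after):
    """Percentage of segments retransmitted during the test upload,
    derived from before/after netstat counter snapshots."""
    sent = sent_after - sent_before
    retrans = retrans_after - retrans_before
    if sent <= 0:
        raise ValueError("no segments were sent between the two snapshots")
    return 100.0 * retrans / sent

# Hypothetical counter values read from netstat before and after an upload:
rate = retransmit_rate(10_000, 25, 14_000, 29)
print(f"{rate:.1f}% retransmitted")  # 0.1% retransmitted
```

Here 4 retransmissions out of 4000 segments sent is 0.1%, well under the "few percent" threshold, so this hypothetical upload shows no packet loss problem.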

Systems other than Windows (e.g. Linux, Mac OS X) produce more detailed output from netstat -s -p tcp, including a count of Duplicates Received as well as Segments Received. A duplicate packet might be received because:

An ACK acknowledgement packet that this PC sent (in response to receiving a data packet from the far end) was lost and not received at the far end, so the far end retransmitted the data packet.

Again, the strategy for investigating a download performance problem should be:

use netstat to record the values of the Received and Duplicate counts;

perform a large download, preferably by FTP to avoid web proxy caches;

use netstat to record the new values of the Received and Duplicate counts;

subtract the earlier counts from the later counts to derive the counts for the download itself;

if the Duplicate count increase is more than a few percent of the Received count increase, then there must be some packet loss problem associated with that download.
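If you want to automate reading the counters, the figures can be scraped from the netstat output. The sample text below is hypothetical, since the exact labels and layout of netstat -s -p tcp vary between operating systems:

```python
import re

# Hypothetical excerpt of "netstat -s -p tcp" output; the exact labels
# and counter names differ between operating systems.
SAMPLE = """\
tcp:
        51234 segments received
        87 duplicate segments received
"""

def tcp_counter(output: str, label: str) -> int:
    """Return the integer that precedes the given label in netstat output."""
    match = re.search(r"(\d+)\s+" + re.escape(label), output)
    if match is None:
        raise ValueError(f"counter '{label}' not found")
    return int(match.group(1))

received = tcp_counter(SAMPLE, "segments received")
duplicates = tcp_counter(SAMPLE, "duplicate segments received")
print(received, duplicates)  # 51234 87
```

Take one snapshot before the download and one after, and apply the same subtraction as in the steps above to get the counts for the download itself.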

Troubleshooting packet loss

Only a few causes of packet loss are under a user's control and therefore fixable:

Ensure that P2P applications have an upload throttle set below that of the cable modem upstream rate cap;

Ensure that games have a client rate set below that of the cable modem upstream rate cap;

For FTP uploads, use an FTP application (such as LeechFTP) that is capable of throttling the upload rate.

For some obscure routing problems on uploads to distant hosts, packet loss can sometimes be lessened by reducing the PC's MTU setting from the normal 1500 to a smaller value, but this cannot be considered a permanent solution, as small MTUs lead to generally poor performance.

If the PC has an excessively large TCP Receive Window (RWIN) value, then even trivial packet loss on downloads can cause large retransmissions from the remote server (up to a complete RWIN's worth of data might have to be retransmitted), with a consequent reduction in performance. The RWIN setting should be only slightly larger than the minimum RWIN needed for optimal performance.
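The usual formula for that minimum RWIN is the bandwidth-delay product: link speed multiplied by round-trip time. A small sketch (the 2 Mbit/s rate cap and 80 ms round trip are illustrative figures, not values from this document):

```python
def min_rwin_bytes(downstream_bits_per_sec: float, rtt_seconds: float) -> int:
    """Minimum TCP receive window (in bytes) needed to keep a link of the
    given speed busy at the given round-trip time: the bandwidth-delay
    product, bandwidth x RTT, converted from bits to bytes."""
    return int(downstream_bits_per_sec * rtt_seconds / 8)

# Example: a 2 Mbit/s downstream rate cap with an 80 ms round trip
print(min_rwin_bytes(2_000_000, 0.080))  # 20000
```

So an RWIN a little over 20000 bytes would suffice for that hypothetical connection; setting it to hundreds of kilobytes only magnifies the cost of each lost packet.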

Most other packet loss occurs within the internet itself: you might be able to diagnose it, but it is beyond your control, and only the ISP (or an upstream network) can remedy it.

Packet loss in Pace digital TV set top box models 2000 or 4000

A user has reported experiencing packet loss with the Pace model 2000 or 4000 set top boxes (but not the model 1000) when performing uploads. In his case, he found that a cure was:

either: throttling the upload rate to be just less than the upstream rate cap;

or: setting an MTU of 242 or less (instead of the default 1500) on the network interface connected to the STB. This is a very drastic cure, and not usually to be recommended, because it would normally cripple upload performance on a properly functioning broadband connection. However, if it works for others in these circumstances, it might be worth trying.