Optimize TCP to Speed Through the Digital Freeway

TCP (Transmission Control Protocol) is a common and reliable transmission protocol on the Internet. TCP was introduced in the 70s by Stanford University for US Defense to establish connectivity between distributed systems to maintain a backup of defense information. At the time, TCP was introduced to communicate amongst a selected set of devices for a smaller dataset over shorter distances.

As the Internet evolved, however, the number of applications and users, and the types of data accessed and stored also evolved. The original intent of TCP was to communicate data in the form of text across computers; today’s data transfer is more complex including high pixel images, audio files, and video delivery.

The aim of the modern web is to provide consumers with an excellent user experience by loading web content quickly and seamlessly. This can be achieved by faster transmission of content over the Internet. TCP has evolved over the years and the protocol enhancements have made it possible to transmit several types of data content with optimal performance for all users.

In this article, we talk about three protocol enhancements made to TCP to have content delivered at a better rate:

Multipath TCP

TCP Connection and Session reuse

TCP Slow Start

To appreciate the benefits of these optimizations, it’s necessary to understand the design and some of the drawbacks of the older version.

TCP Connection

For data to be communicated reliably and in the right sequence between two systems, TCP must initially establish a connection between them. Once the connection is established, TCP determines how to break down the data that needs to be sent over the network.

TCP was initially designed to establish communication after a successful three-way handshake between the client and server (See Diagram below). Along with the three-way handshake, the client and server also exchange security information in the form of certificates. These certificates authorize the client to set up a secure communication, consuming additional network time during reconnection tries in cases of connection breakdowns.

Drawback #1: Network Switching

When a mobile device streaming live content over a Wi-Fi network goes out of range, the device switches to its mobile carrier. In this scenario, TCP needs to repeat the three-way handshake and the security information exchange once again; in the process, the device drops the existing Wi-Fi connection. The multipath feature was enabled to overcome the drawback of network switching.

Multipath TCP

Multipath TCP has capabilities to enable multiple sub-flows in a single TCP session. For example, consider two separate networks – Network 1 and Network 2. The server can send data over wireless Network 1 and re-establish the connection with wireless Network 2 if Network 1 is unavailable. The network switch is done without dropping the existing connection, which eliminates the additional overhead of TCP connection time.

To understand multipath TCP better, let’s look at connection and session reuse, which is used to speed up TCP time.

Connection and Session Reuse

Once a connection is established between the client and server using the three-way handshake, the connection can be used to make multiple HTTP requests to retrieve contents from the same server. This is known as connection reuse.

During the handshake, the server shares a set of session information with the client. This includes the time the connection will be alive and SSL information containing the encryption keys. If the session ends, both the three-way handshake and the SSL negotiation need to be reestablished. Session reuse ensures the data streaming for the client is not disturbed so the web content or stream data does not have to be reloaded.

How Does Multipath TCP Work?

Let’s use an example of a smartphone with access to both a 4G connection and Wi-Fi interface. Say, the smartphone uses the 4G network to establish the connection to the server. To establish the connection:

The smartphone sends a SYN along with the MP_CAPABLE TCP option enabled (indicates the smartphone supports Multipath TCP)

The server responds back with SYN+ACK with MP_CAPABLE, after which the connection is established

This connection is established between the server and the Smartphone 4G carrier

Now, if the smartphone needs to send data over the Wi-Fi, then it sends a SYN with MP_JOIN TCP option. This option contains all the information required to authenticate and authorize the device and the Wi-Fi. For this, the server responds with SYN+ACK MP_JOIN and the new communication path is established.

By using the multipath TCP, the original connection established will not be closed, which does not affect the data stream and helps smartphone users have data stream consistent while moving from one wireless network to another.

Drawback #2: Flow Control

TCP controls the rate at which data is transmitted between the client and server. The internet infrastructure design is not the same across all regions, which means the amount of traffic handled at one location will not be the same as other.

Say, there are three Internet infrastructures over which the data can be transmitted from server to the client. We see A and C are completely capable of handling the rate at which the data is sent from the server, whereas B has a capacity issue. B can handle only 20Mbps of data, whereas the server sends at a rate of 30Mbps. This would lead to packet loss which in turn would result in retransmission of the lost packets. Retransmission forces the server to wait further for the acknowledgment, resulting in higher load time of the content on the wire.

TCP slow start was introduced to overcome this drawback.

TCP slow start works based on the windowing technique. This helps TCP control the rate at which data is sent over the network, and to understand the maximum capacity of data that can be sent over the wire.

TCP slow start exponentially increases the rate at which data is transmitted. Below we see the flow of data from the server to the client. Slow start initiates with one packet, on an acknowledgment from the client, it increases transmission rates by 2^n (1, 2, 4, 8 …). If any packet loss is observed over the network Slow Start retransmits only the lost packet, rather than sending all the packets in the window.

Slow start increases the window size exponentially until the maximum window size of the receiver is reached or when there’s packet loss due to congestion. In such scenarios, TCP slow start adjusts to the previous window size for which all the packets were acknowledged.

Using TCP slow start, the congestion over the network can be identified early hence reducing the packet loss and the overall retransmission of the lost packets. It also identifies the network capacity enabling the server to send data at a consistent rate over the network. This improves the TCP time which helps users experience a faster and more reliable data transfer.

Concluding with a real-world scenario on how network latency can affect the overall page load time and end user experience of a website.

In the scatterplot graph shown below, we see a lot of outlier data points for the response time or the time taken to load the base HTML page for a website in China. The base HTML page loaded the fastest with a response time of 1.17 seconds while the slowest response time was at 35 seconds as highlighted in the graph. Latency in loading the base request affects the end-user experience as it delays loading the required content on the page.

Scatterplot graph showing the distribution of data for Response Time

So, what could be causing the issue? Is it a slow server or a high network latency which has resulted in the inconsistent response to incoming requests? Catchpoint TCP and TraceRoute monitor types can help with the root cause analysis.

Catchpoint’s TCP and traceroute monitors can identify the network path chosen between the client and the server. Information about each hop in the network (Latency, IP, ISP, and etc.) can provide valuable insights into the impact of network performance on your overall response times. The TCP tests help monitor the time taken for requested content to be routed to the desired destination.

In the current digital landscape, speed and reliable data delivery are critical to a positive user experience. It’s equally pertinent to monitor the routes taken through the Internet to deliver content. This helps us isolate performance degradations caused by network failures or route inefficiencies.