Networking 101: Understanding TCP, the Protocol

TCP In Plain English

September 17, 2008

TCP is used everywhere, and understanding how TCP operates enables network and systems administrators to properly troubleshoot network communication issues.

TCP is wonderfully complex, but don't worry: we aren't going to tell you to go read RFC 793. This is a gentle introduction, or demystification, if you will. In this edition of Networking 101, we'll cover TCP in only as much detail as is necessary to understand the second part. You'll learn enough of the terminology and the components of the TCP header, and then next week we'll discuss "TCP in the wild," which will focus on examining some common issues with TCP, including window scaling problems, congestion, and of course the mechanics of a TCP connection.

We sometimes hear people call it "the TCP/IP protocol suite," which means that they're talking about layers 1-4 plus 7, similar to how we presented the layers. TCP lives at layer 4, along with its unreliable friend UDP. TCP stands for Transmission Control Protocol, by the way. Remember the header picture from the IP article? When a packet is encapsulated, we'll of course have the IP header at layer 3, and immediately following is the TCP header, which becomes part of the "data" that the IP header carries. TCP includes its own jargon, just like everything else. There were Ethernet frames, IP datagrams, and now TCP segments. You can think of them all as packets, but be sure to use the correct terms when communicating with others.

While we were trying to think of other things people say about TCP, it seemed apropos to spend some time explaining what those people are trying to tell you. There's nothing worse than asking a guru a question and getting a response like "well, it's end-to-end." If you knew TCP you'd know what this meant, but then you wouldn't have asked the question in the first place. Let's see what we can do about that.

Yes, TCP is end-to-end. There is no concept of broadcast, or anything like it. To speak TCP with another computer, you must be connected, like a telephone call, so each end is prepared to talk. "Stream delivery" is another phrase you'll hear. This simply means that TCP works with data streams, and out-of-order packets are o.k.: the receiver puts them back in order. In fact, TCP is even o.k. with lost or corrupted packets; they will eventually be sent again. More likely you'll hear a programmer talking about streams, referring to the fact that it's hard to tell when data is actually going to be sent, and that you can send unstructured data down a TCP stream. TCP can buffer things, in weird ways that sometimes don't make sense, but neither programmers nor users usually need to worry about that.
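To make "stream delivery" a little more concrete, here is a minimal sketch of how a receiver could rebuild a byte stream from pieces that arrive out of order or more than once. The function name `reassemble` and the (offset, payload) representation are assumptions made for illustration; a real TCP stack does this inside the kernel with sequence numbers and far more bookkeeping.

```python
def reassemble(segments):
    """Rebuild a byte stream from (offset, payload) pairs.

    Hypothetical helper: segments may arrive out of order or be
    duplicated. Returns the contiguous bytes starting at offset 0.
    """
    received = {}
    for offset, payload in segments:
        if payload:  # ignore empty payloads so the loop below always advances
            received[offset] = payload  # a duplicate simply overwrites itself

    stream = bytearray()
    next_offset = 0
    # Walk forward only while the next expected piece has arrived.
    while next_offset in received:
        payload = received[next_offset]
        stream += payload
        next_offset += len(payload)
    return bytes(stream)


# Pieces arrive out of order, and one is delivered twice.
arrivals = [(5, b" worl"), (0, b"hello"), (10, b"d"), (5, b" worl")]
print(reassemble(arrivals))  # b'hello world'
```

If a piece in the middle is missing, the function stops at the gap and returns only what is contiguous so far, which is roughly why an application reading from a TCP socket never sees data past a hole.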

Whenever a TCP packet is sent, an acknowledgment, or ACK, is returned. This is really the only way to provide a reliable protocol: you must let the other side know what you have received. Of course, people will want to improve on an inefficient system like this. Enter "piggybacking ACKs." TCP is "full duplex," meaning both sides can send data at the same time, and piggybacking takes advantage of that by carrying the ACK for the previous packet received inside the current outgoing packet. In terms of network utilization, this is much better than sending an entirely separate packet just to say "got it." Finally, there's the concept of a cumulative ACK: ACKing more than one packet at a time, to say "I got all the others, including this one."
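A cumulative ACK is easier to see in a small simulation. The sketch below is illustrative only: the `Receiver` class and its method names are hypothetical, sequence numbers count bytes (as real TCP does), and wraparound and partially overlapping segments are ignored.

```python
class Receiver:
    """Toy receiver that answers every segment with a cumulative ACK."""

    def __init__(self, initial_seq=0):
        self.next_expected = initial_seq  # next byte we are waiting for
        self.buffered = {}                # seq -> length of early arrivals

    def on_segment(self, seq, length):
        """Record a segment and return the ACK number to send back."""
        if length > 0 and seq >= self.next_expected:
            self.buffered[seq] = length
        # Slide forward over every segment that is now contiguous.
        while self.next_expected in self.buffered:
            self.next_expected += self.buffered.pop(self.next_expected)
        return self.next_expected


rx = Receiver()
# Four 100-byte segments; the one starting at byte 100 arrives last.
acks = [rx.on_segment(seq, 100) for seq in (0, 200, 300, 100)]
print(acks)  # [100, 100, 100, 400]
```

The repeated 100 tells the sender "I am still waiting for byte 100," and the final 400 acknowledges three segments at once, which is the "I got all the others, including this one" behavior described above.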

In IP we dealt with individual packets being part of a larger IP datagram. Remember, a TCP segment is an individual TCP packet. TCP is a stream, so there isn't really any other concept to worry about aside from a "connection." Maximum Segment Size, or MSS, is negotiated at connection time, but almost always changes. The default MSS is 536 bytes, which is 576 (the datagram size every IP host is guaranteed to accept) minus 20 bytes for the IP header and 20 bytes for the TCP header. TCP tries to avoid causing IP-level fragmentation, so it will almost always start with 536.
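The MSS arithmetic above can be written out as a quick check. This is only a sketch: the function name `mss_for` is made up for illustration, both headers are assumed to carry no options, and the 1500-byte figure is a common Ethernet MTU added here as an extra example rather than something from the text above.

```python
IP_MIN_DATAGRAM = 576  # datagram size every IP host must accept
IP_HEADER = 20         # bytes, assuming no IP options
TCP_HEADER = 20        # bytes, assuming no TCP options


def mss_for(datagram_size, ip_header=IP_HEADER, tcp_header=TCP_HEADER):
    """Largest TCP payload that fits in one datagram of the given size."""
    return datagram_size - ip_header - tcp_header


print(mss_for(IP_MIN_DATAGRAM))  # 536, the default MSS
print(mss_for(1500))             # 1460, for a 1500-byte Ethernet MTU
```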

The sexiest feature of TCP still remains; this is the Sliding Window Protocol. The window is essentially the amount of un-ACKed data that has been sent, and it can grow and shrink at will. This gets really interesting, and will be covered next time.

The header of a TCP packet is 20 bytes, just like an IP header. Both IP and TCP headers can get larger if options are used. The TCP header does not include an IP address; it only needs to know about the ports used for the connection. Don't let this confuse you, though: TCP keeps track of end-to-end connections in a state table that includes IP addresses and ports. It's just that the TCP header doesn't need the IP information, since that comes from the IP header.
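One way to picture that state table is as a dictionary keyed by both addresses and both ports. The sketch below is a simplification under stated assumptions: `ConnKey`, `connections`, and `lookup` are hypothetical names, the addresses come from the documentation-only ranges, and a real stack stores much more per connection than a state string.

```python
from typing import NamedTuple


class ConnKey(NamedTuple):
    """The four values that together identify one TCP connection."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


# Two connections from the same client to the same server port stay
# distinct because the client's source ports differ.
connections = {
    ConnKey("192.0.2.10", 80, "198.51.100.7", 51000): "ESTABLISHED",
    ConnKey("192.0.2.10", 80, "198.51.100.7", 51001): "ESTABLISHED",
}


def lookup(dst_ip, dst_port, src_ip, src_port):
    """Find the connection for an incoming segment.

    The IP addresses come from the IP header and the ports from the
    TCP header. Returns None when no connection matches.
    """
    return connections.get(ConnKey(dst_ip, dst_port, src_ip, src_port))


print(lookup("192.0.2.10", 80, "198.51.100.7", 51001))  # ESTABLISHED
print(lookup("192.0.2.10", 80, "198.51.100.7", 40000))  # None
```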

It is easier to think of a packet as a stream, one byte after the next. Everyone always wants to show a table for the header, but this can confuse matters more. The TCP header, starting with the first bit, is:

Source port, 16 bits: my local TCP port that's used for this connection