Hi guys, some of you know that I'm implementing a TCP/IP stack. I've completed ARP, ICMP, UDP and IP, and have now turned my attention to TCP. I can connect to a host, send and receive packets, and handle sequence/acknowledgment numbers. However, I'm lost as to how you detect a lost packet. Say I send 3 packets of the same size and only 1 is acknowledged: how do I know which one was acknowledged?

A TCP receiver will acknowledge a sequence number only when it has received all bytes up to (but not including) that sequence number.

Say a host receives two packets with sequence numbers 1-1000 and 2001-3000. The receiver will acknowledge sequence number 1001 for both packets because sequence numbers 1001-2000 are missing. As soon as the receiver receives those missing sequence numbers (possibly because the sender re-sends the missing sequence numbers after it notices that the receiver hasn't acknowledged them yet), the receiver can acknowledge sequence number 3001.
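The cumulative-ACK rule described above can be sketched in a few lines of Python. This is only an illustration, not a real receiver: the function name `cumulative_ack` and the representation of received data as inclusive (first, last) sequence ranges are assumptions made for the example, and 32-bit sequence wraparound is ignored.

```python
def cumulative_ack(next_expected, received_ranges):
    """Return the ACK number: the first sequence number not yet received.

    next_expected   -- first byte the receiver is still waiting for
    received_ranges -- iterable of inclusive (first, last) sequence ranges
    Hypothetical helper; ignores 32-bit wraparound for clarity.
    """
    ack = next_expected
    for first, last in sorted(received_ranges):
        if first > ack:
            break  # gap: bytes ack..first-1 are still missing
        ack = max(ack, last + 1)
    return ack


# The example from the post: 1-1000 and 2001-3000 arrive, 1001-2000 is missing.
print(cumulative_ack(1, [(1, 1000), (2001, 3000)]))                # 1001
# Once the missing bytes arrive, the ACK jumps past everything.
print(cumulative_ack(1, [(1, 1000), (2001, 3000), (1001, 2000)]))  # 3001
```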

To detect lost packets, you have to use a retransmission timer. If you don't receive an acknowledgment for a sequence number before the retransmission timer expires, send the missing packets again and update the retransmission timer.

You can also detect lost packets if you receive three duplicate acknowledgments (same sequence number) because that usually means the receiver received some packets after a lost packet, so you can start sending packets starting at the last acknowledged sequence number. This is called "fast retransmit". This is not necessary but it can speed up retransmission in some cases.
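A rough sketch of the duplicate-ACK counting behind fast retransmit might look like the following. The class name `DupAckDetector` is made up for this example, wraparound is ignored, and a real stack applies extra conditions before counting an ACK as a duplicate (for instance, that the segment carries no data), so treat it as an outline only.

```python
DUP_ACK_THRESHOLD = 3  # three duplicates trigger fast retransmit


class DupAckDetector:
    """Hypothetical sender-side helper that counts duplicate ACKs."""

    def __init__(self, initial_ack):
        self.last_ack = initial_ack
        self.dup_count = 0

    def on_ack(self, ack):
        """Return the sequence number to fast-retransmit, or None."""
        if ack > self.last_ack:      # new data acknowledged
            self.last_ack = ack
            self.dup_count = 0
            return None
        if ack == self.last_ack:     # same ACK number seen again
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                return ack           # resend the segment starting here
        return None


detector = DupAckDetector(initial_ack=1)
detector.on_ack(1001)                               # normal ACK
print([detector.on_ack(1001) for _ in range(3)])    # [None, None, 1001]
```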

Hi christop, could you be a bit clearer about how the sequence numbers work? Am I right in thinking that if I send three packets at the same time, before any acknowledgments, starting at sequence 1, I'll have the following sequence numbers sent if sending 10 bytes each?

I'm lost as to how you can detect a lost packet? ... how do I know which one was acknowledged?

Ah. This used to be pretty simple. The transmitter would send seqno 1000, 2000, 3000, 4000 with a timeout attached (in sender-side data structures) to each packet. 3000 gets lost. The sender would get acks relatively promptly for the first two packets (ack numbers 2000 and 3000), but nothing after that. The timeout would expire for 3000 and the sender would retransmit 3000. Then the timeout for 4000 would expire and it would retransmit 4000. Then it would get two acks covering 4000, because 4000 was already there: when the receiver received 3000 it would ack everything through 4000, and then send the same ack again when it received its second copy of 4000 (because that's what you're supposed to do when you receive dups, IIRC). But at least the receiver had all the data. Note that the sender has to keep a copy of the data until after it's acked.

Most networks didn't "lose" packets except to router congestion, so a lot of effort got put into avoiding congestion rather than recovering from lost packets. That meant picking the timeouts more accurately, sending packets more slowly, and re-computing the timeout of 4000 when you discovered you had to retransmit 3000 (otherwise you might mistakenly resend nearly a whole window's worth of packets when only one was lost).

This got refined over the years, and a modern TCP will implement things like "selective ack" and "fast retransmit." This means that if you transmit packets 1000 - 20000 and start receiving multiple ACKs that only cover data through packet 9000, you should probably retransmit 10000 even before its timeout occurs. Or something like that.

Van Jacobson's papers/talks from the early 90s are unusually approachable and pretty brilliant, and you can learn a lot by reading them. SOME of their conclusions have been distilled into RFC1122, which you SHOULD be looking at as you do your implementation. It has a lot of stuff distilled into hints (well, "requirements") and lots of references.

IMNSHO, not a lot of research has been done on non-bulk data transfer or lossy TCP networks (or perhaps, not a lot of the research that has been done has reached mainstream internet users). Even though I (as "Mr Terminal Server") felt Van was neglecting some important (?) cases, I wasn't intellectually credentialed, capable, or inclined to come up with alternative suggestions...

Sorry. It's probably best to start out with the simple algorithms from the original RFC793.

TCP doesn't acknowledge individual packets; it acks points in a bytestream, and it only acks the points at which it has received ALL previous data.

If you send 10000 bytes in 10 separate packets with sequence 1000, 2000, etc, and the 3rd packet is lost, but all other packets are received, the receiver will only send ACKs that cover the first two packets.

Each packet sent has a timeout assigned when it's transmitted, and if you get no ACK for the 3rd packet by the time its timeout expires, you retransmit data starting at the sequence number of the first un-acked packet.

The "obvious implementation" is that each transmitted packet is assigned a timestamp based on its original transmission time and placed on a "retransmission queue." When a packet is acked (you receive an ACK for a sequence number larger than the last byte of data in the packet) it is removed from the retransmission queue. Some sort of "TCP timer process" comes along and scans the retransmission queues, retransmitting each packet that has timed out (and resetting the timeout to a larger value, just in case the packet was delayed rather than lost).
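As a rough sketch of that retransmission queue, assuming a made-up `RetransmissionQueue` class, timestamps passed in by the caller as seconds, a simple doubling backoff, and no handling of sequence wraparound:

```python
from collections import deque


class RetransmissionQueue:
    """Hypothetical per-connection queue of sent but un-acked segments."""

    def __init__(self, initial_timeout=1.0):
        self.initial_timeout = initial_timeout
        self.queue = deque()  # oldest segment first

    def on_send(self, seq, data, now):
        """Remember a segment and the time at which it should be resent."""
        self.queue.append({"seq": seq, "data": data,
                           "timeout": self.initial_timeout,
                           "deadline": now + self.initial_timeout})

    def on_ack(self, ack):
        """Drop every segment that lies entirely below the ACK number."""
        while self.queue:
            head = self.queue[0]
            if head["seq"] + len(head["data"]) > ack:
                break
            self.queue.popleft()

    def on_timer(self, now):
        """Return (seq, data) pairs that timed out and must be resent."""
        resend = []
        for seg in self.queue:
            if now >= seg["deadline"]:
                seg["timeout"] *= 2  # back off in case it was only delayed
                seg["deadline"] = now + seg["timeout"]
                resend.append((seg["seq"], seg["data"]))
        return resend
```

The "TCP timer process" would then call `on_timer()` periodically and hand each returned pair back to the transmit path.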

RFC793 pretty much has a suggested implementation - what parts of it are causing you problems? (It IS very bytestream-centric, so it talks about keeping SND.UNA (unacknowledged data) as a sequence number, rather than a packet.)

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here.

No guarantees, but if we don't report problems they won't get much of a chance to be fixed! Details/discussions at link given just above.

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

Whew. As a network guy, I'll echo Johan's comments - TCP is far from simple. It's a tall order to implement a minimal version on a microcontroller that's still compatible with the rest of the world.

I recommend a look at the book TCP/IP Lean by Jeremy Bentham. It relies heavily on his own libraries to accomplish what you're attempting, but will give you great insight into the challenges of the protocol.

And if you aren't already, you need to become very familiar with Wireshark (http://wireshark.org). Use it to study the behavior of TCP on working systems, combined with the TCP/IP Illustrated book Johan recommended - which is one of my favorite texts, but be aware that it is quite old and there have been new TCP options and behaviors added over time. This is probably not significant to your implementation, but you may see some things in Wireshark that the book doesn't cover; Google can fill in the gaps.

Quick question. If I send 7 packets and the first 3 are acknowledged and the 4th packet is lost, will the next 3 packets be acknowledged using sequence and acknowledgment numbers indicating these 3 packets?

Does that also mean "without reading"? The mechanism for packet sequencing, sliding windows, acknowledgement etc. should be amply explained in good literature. I am writing this note to argue that implementing TCP with a "shot-in-the-dark" approach will definitely fail.

It's been many years since I was working with networking, so the below might be wrong in details, but I'm trying to describe a "general picture":

The basic acknowledgement mechanism is: sender and receiver have a "sliding window" of "packets in transfer". The sender can send several packets without having got any ACK from the receiver. The receiver acknowledges (and "consumes") packets as far as the sequence is complete and error free. Example (with a sliding window size of 5, and using packet numbers rather than the actual byte number in the "stream"):

1. Sender sends packet #1

2. Sender sends packet #2

3. Sender sends packet #3

4. Sender sends packet #4

Because of the nature of IP these packets are not guaranteed to arrive at the receiver at all, and not guaranteed to arrive in sequence if they arrive. Let's play nice and let all packets arrive (but in a different order).

6. Packet #3 arrives. Receiver can not acknowledge this since it has not seen #2.

7. Packet #2 arrives. Receiver acknowledges #3. I.e. an acknowledgement of #n is saying "I've seen everything up to #n". No explicit acknowledgement of #2 is done. Receiver can now deliver #2-3 to the upper layer.

When the sliding window size is 1 the system "degenerates" (for lack of a better word) into a stop-and-wait protocol. I.e. no outstanding non-acknowledged data will occur except for "the one in transit".
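The sender side of such a window can be sketched as below. It uses packet numbers, like the example above, rather than byte sequence numbers, and the class name `SlidingWindowSender` and its methods are invented for illustration. With `window_size=1` it behaves as stop-and-wait.

```python
class SlidingWindowSender:
    """Hypothetical sender that tracks a window of un-acked packet numbers."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.base = 1       # oldest packet number not yet acknowledged
        self.next_num = 1   # next packet number to send

    def can_send(self):
        """True while fewer than window_size packets are outstanding."""
        return self.next_num < self.base + self.window_size

    def send(self):
        if not self.can_send():
            raise RuntimeError("window exhausted, wait for an ACK")
        num = self.next_num
        self.next_num += 1
        return num

    def on_ack(self, n):
        """Cumulative ACK: the receiver has seen everything up to #n."""
        if n >= self.base:
            self.base = n + 1   # slide the window forward


sender = SlidingWindowSender(window_size=1)
sender.send()
print(sender.can_send())   # False: one packet in transit, must wait
sender.on_ack(1)
print(sender.can_send())   # True
```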

If a packet goes missing the receiver can wait for it while later packets arrive, but at some point the sender has to find out that it is missing. If not at any other time, then when the sliding window is exhausted.

Also: Whenever you think of TCP over IP you should have this mental picture in mind:

IP is primarily about routing. It is a protocol that tries to deliver individual IP packets (datagrams) to a receiver. Neither performance (optimal route) nor delivery at all is guaranteed. IP is concerned with the "individual hops" of a packet from sender to receiver. IP is active in all nodes along the route from sender to receiver.

TCP is not at all about routing. Its original rationale was completely about "guaranteed delivery in sequence". It is an end-to-end protocol. I.e. even if the traffic from sender to receiver goes through n (n>2) nodes, the only nodes where TCP will be active are the sender and the receiver. Nodes in between are only doing IP. (In principle... ;-) Things have evolved, someone invented firewalls, someone invented "NAT", etc...)

Without this insight the division of "the stack" into IP and TCP makes little sense, and neither does what each of those protocols does. Actually, this is just the tip of the insight iceberg. Read the literature. It took a bunch of really brilliant minds to design the TCP/IP protocol suite. While it was actually technically less intricate than other contemporary protocol suites (it was a "design goal", AFAIK), it is not trivial. Read. Brace yourself. Code the absolutely necessary parts of TCP first. Test A LOT.

And do study code of other implementations!

The "obvious implementation" is that each transmitted packet is assigned a timestamp based on its original transmission time and placed on a "retransmission queue." When a packet is acked (you receive an ACK for a sequence number larger than the last byte of data in the packet) it is removed from the retransmission queue. Some sort of "TCP timer process" comes along and scans the retransmission queues, retransmitting each packet that has timed out (and resetting the timeout to a larger value, just in case the packet was delayed rather than lost).

May I challenge the word "obvious" in the above statement ?

The implementation described above is just an old and inefficient one, and there are reasons for that. At that time hardware support for CRC and IP checksums was scarce, so when you built a packet you did these calculations in pure software. Now, to retransmit, wouldn't it be easier to have a copy of the said segment and just retransmit it? However, this does not come for free. Maintaining a timestamp for every un-acked segment, and wasting memory, is not always the best choice. Another reason for this implementation is that at that time more packets were lost than today. Yet another reason was that the congestion avoidance algorithms were not as well developed as they are today.

Today every new MAC controller offers CRC and support for IP checksum calculation. Plus, in embedded systems RAM is at a premium. A modern implementation would have only one tx circular buffer and use the mentioned SND.UNA (unacknowledged data). There is no tracking of individual transmitted segments. Once a received ACK packet acks new data, the SND.UNA pointer is advanced. So there is almost no penalty to repack a bunch of bytes, build a new TCP packet and send it. How much do you send after a packet has been lost? As much as the newly calculated congestion window allows.
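A minimal sketch of that "no retransmission queue" design follows. The class `TxStream` and its method names are assumptions for illustration, a plain `bytearray` stands in for the circular buffer, and sequence wraparound and the congestion window are left out.

```python
class TxStream:
    """Hypothetical byte-stream send buffer with no per-segment bookkeeping."""

    def __init__(self, initial_seq):
        self.snd_una = initial_seq  # oldest unacknowledged sequence number
        self.snd_nxt = initial_seq  # next sequence number to send
        self.buf = bytearray()      # holds the bytes from snd_una onward

    def write(self, data):
        """Application hands over more data to send."""
        self.buf += data

    def next_segment(self, mss):
        """Packetize new data at send time; returns (seq, payload)."""
        offset = self.snd_nxt - self.snd_una
        payload = bytes(self.buf[offset:offset + mss])
        seq = self.snd_nxt
        self.snd_nxt += len(payload)
        return seq, payload

    def on_ack(self, ack):
        """Advance SND.UNA and release acknowledged bytes."""
        if self.snd_una < ack <= self.snd_nxt:
            del self.buf[:ack - self.snd_una]
            self.snd_una = ack

    def retransmit(self, mss):
        """Repacketize from SND.UNA; boundaries may differ from the original."""
        length = min(mss, self.snd_nxt - self.snd_una)
        return self.snd_una, bytes(self.buf[:length])
```

Note that `retransmit()` may build a segment that is larger or smaller than any segment originally sent, which is exactly why no per-segment record is needed.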

Extract from RFC1122:

IMPLEMENTATION:
Some TCP implementors have chosen to "packetize" the
data stream, i.e., to pick segment boundaries when
segments are originally sent and to queue these
segments in a "retransmission queue" until they are
acknowledged. Another design (which may be simpler) is
to defer packetizing until each time data is
transmitted or retransmitted, so there will be no
segment retransmission queue.

The reason the mentioned implementation is still in use today is that it needs less hardware support (kind of a shame) and makes it easier to port to various platforms with almost no change to the library (if it's not broken, don't fix it).

Kudos to those who wrote those RFCs decades ago. Their work still stands up today.

While those books are good, they may be outdated by the new hardware advances.

Thanks guys, I have the TCP/IP Lean and a number of other books. I'm more or less just developing from scratch.

Coding from scratch is fine, and a great in-depth learning experience if that's the goal (as opposed to getting to completion fast, in which case it's not the best choice). But I have to politely suggest that if you have the books, the answers are in there but it'll take some studying to pull them out. In particular, TCP/IP Illustrated has some excellent packet sequence diagrams that show the interaction between the hosts (thus, the "Illustrated") - i.e., it's a great reference to start with. The value in TCP/IP Lean is mostly in picking out some clever tricks to make a large protocol minimally functional on a tiny microcontroller - I didn't want to use his library, but enjoyed his creative solutions.

The other respondents have provided good technical answers. I'll only add an observation that even the timeout values in TCP are elastic and you need logic to determine what's a good value. If I were in your shoes, adding sliding window support would come well down the line, after actually getting a basic single-packet TCP session working.
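For the "elastic" timeout, one commonly used calculation is the smoothed round-trip estimator standardized in RFC 6298 (it is not mentioned earlier in this thread, so take it as one option rather than the only way). A sketch, with the class name `RtoEstimator` and the clock granularity value chosen arbitrarily for the example:

```python
class RtoEstimator:
    """Retransmission timeout estimator following the RFC 6298 formulas."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4
    MIN_RTO = 1.0   # seconds, lower bound recommended by RFC 6298

    def __init__(self, granularity=0.1):
        self.g = granularity   # clock granularity in seconds (assumed value)
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0         # initial RTO before any measurement

    def on_measurement(self, r):
        """Feed one RTT sample (seconds); returns the new RTO."""
        if self.srtt is None:  # first sample
            self.srtt = r
            self.rttvar = r / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        self.rto = max(self.MIN_RTO,
                       self.srtt + max(self.g, self.K * self.rttvar))
        return self.rto

    def on_timeout(self):
        """Back off: double the RTO after a retransmission timeout."""
        self.rto *= 2
        return self.rto
```

RTT samples should only be taken from segments that were not retransmitted, otherwise you cannot tell which transmission the ACK belongs to.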

Can someone answer the following simple question: I send 7 packets, the first three are acknowledged but the fourth is lost. Will the server acknowledge the remaining three packets? If not, then how do I go about retransmission? Do I send packet 4? What about the rest of the packets, packets 5, 6 and 7?

If I send 7 packets and the first 3 are acknowledged and the 4th packet is lost, will the next 3 packets be acknowledged using sequence and acknowledgment numbers indicating these 3 packets?

No. At best, you would get duplicate ACKs for the first three packets.

Note that you don't actually get ACKs per packet. You get an ACK saying that the receiver has received all the bytes up through the last byte of packet 3. TCP as a protocol definition is very militant about being byte-stream oriented, one of the things that makes it complex.

Although, I'm not sure I agree that it's THAT complex, if all you want is an operational and marginally compatible implementation. Optimization, conformance, and large-scale cooperation add complexity.

In particular, a simple implementation is likely to use a lot of memory, and do a lot of copying. "Small" systems are not where most modern TCP research has been aimed, so you may have trouble finding implementation hints that are applicable.

Nitpicking: Don't mix up "server" with "receiver". In a server-client situation both sides send, both sides receive and both sides acknowledge. I.e. acknowledgement is not solely specific for a server.

So let us get this straight! If I send 7 packets and the first three are acknowledged, the fourth is lost, and the last three are received, the sender is going to resend the last four packets because they have not been acknowledged!!! This seems stupid. Am I missing something?

If it seems stupid to you, then please look at SACK (selective acknowledgment). Since you have implemented the SYN packets, you may be aware that there is an option called SACK. If both sides advertise it during connection establishment, then the sender in the case you describe will only retransmit the fourth segment. If SACK is not implemented, a receiver is not required to buffer data that is not contiguous.
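To illustrate what SACK information buys the sender, here is a small sketch that works out which byte ranges are still missing. The function name `missing_ranges` is made up; the block format (left edge, right edge one past the last byte received) follows the SACK option, and wraparound is ignored.

```python
def missing_ranges(cum_ack, sack_blocks):
    """Return half-open (start, end) ranges the receiver is known to lack.

    cum_ack     -- cumulative ACK number (next byte the receiver expects)
    sack_blocks -- list of (left_edge, right_edge) pairs from the SACK option
    Hypothetical helper; ignores 32-bit wraparound.
    """
    holes = []
    pos = cum_ack
    for left, right in sorted(sack_blocks):
        if left > pos:
            holes.append((pos, left))  # gap between acked data and this block
        pos = max(pos, right)
    return holes


# 7 packets of 100 bytes starting at sequence 0; the 4th (300-399) is lost.
# The receiver ACKs 300 and reports one SACK block covering packets 5-7.
print(missing_ranges(300, [(400, 700)]))   # [(300, 400)]
```

Only the hole needs to be retransmitted; packets 5 to 7 stay buffered at the receiver.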

If I send 7 packets and the first three are acknowledged, the fourth is lost, and the last three are received, the sender is going to resend the last four packets because they have not been acknowledged!!! This seems stupid. Am I missing something?

You mean you send 700 bytes of data and get an ACK for the first 300 bytes. Eventually you time out and do a retransmission of the data starting after the 300 bytes that have been ACKed.

Exactly how much you retransmit is subject to further thought. I think that there were suggestions in between "original TCP" and "Modern TCP" involving "cleverness" like retransmitting only one packet, but not backing off the timeout for the packets after that. But yeah, there are some assumptions/ideas in the original specifications that are not ideal in modern networks. And lossy links tend to really cause poor performance in typical TCPs. "TCP over lossy networks" is a topic keeping the networking Masters and PhD students in thesis material...

(Note that the common solution for lossy links is to implement some sort of link-level reliability protocol. LAPB is one example. A link-level protocol knows much more about the characteristics of the individual link (expected bandwidth and delay, possibly even NAKing or forward error correction), and can implement much more creative fixes than you can above the IP layer, which assumes nothing...)

Okay, another hypothetical question! If I receive 3 packets that are out of order how the hell do I know that??

I've tested sending three packets, setting the sequence number for each packet sent. It appears to work. If I deliberately send them out of order, Wireshark shows this, and then I receive the ACKs for each packet after the final packet is received.

So I have to set the sequence number for each packet I send! That clears up everything.

One of the core things in TCP is the sequencing of packets. Each packet is assigned a sequence number by the sender. This is stored in a specific SEQ# field in the TCP header.
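On the receiving side, spotting an out-of-order packet comes down to comparing the incoming SEQ# with the next sequence number you expect. A sketch, where `seq_lt`, `classify` and the returned labels are names invented for this example, and the comparison is done modulo 2^32 so it keeps working when sequence numbers wrap:

```python
MOD = 1 << 32  # sequence numbers are 32 bits and wrap around


def seq_lt(a, b):
    """True if sequence number a comes before b, allowing for wraparound."""
    return 0 < ((b - a) % MOD) < (1 << 31)


def classify(seg_seq, seg_len, rcv_nxt):
    """Compare an incoming segment with the next expected sequence number."""
    seg_end = (seg_seq + seg_len) % MOD   # one past the last byte
    if seg_seq == rcv_nxt:
        return "in-order"
    if seq_lt(rcv_nxt, seg_seq):
        return "out-of-order"             # a gap precedes this segment
    if seq_lt(rcv_nxt, seg_end):
        return "overlap"                  # partly old data, partly new
    return "duplicate"                    # everything already received


print(classify(1001, 1000, rcv_nxt=1001))   # in-order
print(classify(2001, 1000, rcv_nxt=1001))   # out-of-order
print(classify(1, 1000, rcv_nxt=1001))      # duplicate
```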

SEQ# is bits 32 to 63 in the TCP header, as shown e.g. here: https://en.wikipedia.org/wiki/Tr... . From that same Wikipedia article, which would certainly be wise to read, I quote this:

Sequence number (32 bits)

Has a dual role:

If the SYN flag is set (1), then this is the initial sequence number. The sequence number of the actual first data byte and the acknowledged number in the corresponding ACK are then this sequence number plus 1.

If the SYN flag is clear (0), then this is the accumulated sequence number of the first data byte of this segment for the current session.
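Pulling the sequence number out of a raw segment can be sketched with Python's `struct` module. The function names and the dictionary keys are my own choices for the example; the field layout is the standard fixed 20-byte TCP header in network byte order.

```python
import struct


def parse_tcp_header(segment):
    """Unpack the fixed 20-byte TCP header from raw bytes."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,                               # bits 32-63 of the header
        "ack": ack,                               # bits 64-95
        "data_offset": (offset_flags >> 12) * 4,  # header length in bytes
        "fin": bool(offset_flags & 0x001),
        "syn": bool(offset_flags & 0x002),
        "ack_flag": bool(offset_flags & 0x010),
        "window": window,
    }


def first_data_seq(hdr):
    """Sequence number of the first data byte: ISN + 1 when SYN is set."""
    if hdr["syn"]:
        return (hdr["seq"] + 1) % (1 << 32)
    return hdr["seq"]


# A hand-built SYN header: ports 1234 -> 80, initial sequence number 1000.
raw = struct.pack("!HHIIHHHH", 1234, 80, 1000, 0, (5 << 12) | 0x002, 8192, 0, 0)
hdr = parse_tcp_header(raw)
print(hdr["seq"], hdr["syn"], first_data_seq(hdr))   # 1000 True 1001
```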

You need to read the specs. And it is highly recommended to read the "Illustrated" books. At least volume I.
