I am reimplementing an old network layer library, this time using Boost.Asio. Our software communicates over TCP/IP with a third-party application. Most messages are handled fine on both sides, but there is one case I don't understand:

The third party sends two messages (A and B) in quick succession, but I receive only part of message A in TCP packet 1, then the end of message A plus the whole of message B in TCP packet 2 (I checked with Wireshark).

I had not considered this case. Is it common with TCP, and should my layer adapt to it? Or should I ask the third party to check what they do on their side so that I receive each message in its own packet?

Do you mean that you never get the complete message A?
– Default, Nov 19 '12 at 15:41

I use the same socket, yes. I receive the complete message A, but split across the two packets. All the data is there. My problem is with message B, that I don't see because it's after the end of message one in the same packet.
– Stephane Rolland, Nov 19 '12 at 15:43

Wait, are you implementing an application or a TCP stack? If an application, you never see TCP packets. If a stack, you never deal with messages. You talk about "tcp-packet 1" and you talk about "message A". You can never have both of those things at the same layer. Packets are at the network layer. Messages are at the application layer. If you're trying to do both at once, don't.
– David Schwartz, Nov 19 '12 at 16:56

3 Answers

Packets can be fragmented and arrive out-of-sequence. The TCP stack which receives them should buffer and reorder them, before presenting the data as an incoming stream to the application layer.

My problem is with message B, that I don't see because it's after the end of message one in the same packet.

You can't rely on "messages" having a one-to-one mapping to "packets": to the application, TCP (not UDP) looks like a "streaming" protocol.

An application which sends via TCP needs another way to separate messages. Sometimes that's done by marking the end of each message. For example SMTP marks the end-of-message as follows:

The transmission of the body of the mail message is initiated with a
DATA command after which it is transmitted verbatim line by line and
is terminated with an end-of-data sequence. This sequence consists of
a new-line (<CR><LF>), a single full stop (period), followed by
another new-line. Since a message body can contain a line with just a
period as part of the text, the client sends two periods every time a
line starts with a period; correspondingly, the server replaces every
sequence of two periods at the beginning of a line with a single one.
Such escaping method is called dot-stuffing.
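To make the dot-stuffing rule in the quote concrete, here is a minimal sketch in C++. The function names (`stuff_line`, `unstuff_line`, `encode_body`, `decode_body`) are made up for illustration, and it assumes the body is already split into lines that contain no CR or LF; a real SMTP implementation has more to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sender side: double a leading '.' so that no body line is a lone ".".
std::string stuff_line(const std::string& line) {
    // Assumption of this sketch: callers pass single lines without CR/LF.
    assert(line.find_first_of("\r\n") == std::string::npos);
    return (!line.empty() && line[0] == '.') ? "." + line : line;
}

// Receiver side: remove the one leading '.' the sender added.
std::string unstuff_line(const std::string& line) {
    return (!line.empty() && line[0] == '.') ? line.substr(1) : line;
}

// Wire form of a body: stuffed lines, each ended by CRLF, then ".\r\n".
std::string encode_body(const std::vector<std::string>& lines) {
    std::string wire;
    for (const auto& line : lines) {
        wire += stuff_line(line) + "\r\n";
    }
    return wire + ".\r\n";
}

// Split on CRLF, stop at the lone "." terminator, un-stuff everything else.
// If the terminator has not arrived yet, returns the lines seen so far.
std::vector<std::string> decode_body(const std::string& wire) {
    std::vector<std::string> lines;
    std::size_t pos = 0;
    for (;;) {
        const std::size_t end = wire.find("\r\n", pos);
        if (end == std::string::npos) break;  // incomplete input
        const std::string line = wire.substr(pos, end - pos);
        pos = end + 2;
        if (line == ".") break;  // end-of-data marker
        lines.push_back(unstuff_line(line));
    }
    return lines;
}
```

A body line that is just `.` goes out as `..`, so the receiver can always tell it apart from the terminator.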

Alternatively, the protocol might specify a prefix at the start of each message, which will indicate the message-length in bytes.
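As a rough sketch of the length-prefix approach, assuming a hypothetical protocol where every message starts with a 4-byte big-endian length (your third party's format may well differ), the receiver can buffer whatever arrives and only hand out complete messages. `LengthPrefixedDecoder` and `frame` are illustrative names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical framing: a 4-byte big-endian length, then that many payload bytes.
class LengthPrefixedDecoder {
public:
    // Append the bytes one read delivered and return every message that is
    // now complete. Leftover bytes stay buffered for the next call.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> messages;
        while (buffer_.size() >= kHeaderSize) {
            std::uint32_t length = 0;
            for (std::size_t i = 0; i < kHeaderSize; ++i) {
                length = (length << 8) | static_cast<unsigned char>(buffer_[i]);
            }
            if (buffer_.size() - kHeaderSize < length) {
                break;  // body not fully received yet
            }
            messages.push_back(buffer_.substr(kHeaderSize, length));
            buffer_.erase(0, kHeaderSize + length);
        }
        return messages;
    }

private:
    static constexpr std::size_t kHeaderSize = 4;
    std::string buffer_;
};

// Helper for experimenting: build one framed message.
std::string frame(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);  // must fit in the 4-byte prefix
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((n >> shift) & 0xFF));
    }
    return out + payload;
}
```

With this, the situation from the question is harmless: feeding the first few bytes of `frame("hello A") + frame("B")` returns nothing, and feeding the rest returns both messages in order, regardless of where the stream was cut.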

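If you are coding the TCP stack itself, then you'll have access to the TCP segment header: the "Data offset" field tells you where the payload starts in each segment (it is the header length in 32-bit words), and the payload length follows from the IP total length. Neither tells you where an application-level message begins or ends.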

Yes, this is common. TCP is a stream protocol, and your "logical" packet (an application message) may be split across several "physical" packets or share one with another message, so the receiving application is responsible for reassembling the higher-level messages. TCP does guarantee ordering, so you don't have to worry about reassembling out-of-order packets.

Your problem has nothing to do with TCP itself. Your problem is that you expected Asio to do the message parsing for you. It does not; you have to implement it.

If your messages are all the same size, do an async read for that size.

If they are of different lengths, do an async read for the header size, analyze the header, then do an async read for the rest of the message as described by the header.
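The read-header-then-read-body idea can be sketched without any Asio code. This is a synchronous illustration only: `ReadSome` is a made-up stand-in for a socket read that may return fewer bytes than requested, and the 2-byte big-endian length header is an assumption, not your actual protocol. With Asio you would express the same two steps as two reads of an exact size.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a socket: copies up to `max` bytes into `out`
// and returns how many it copied (0 means the peer closed the connection).
using ReadSome = std::function<std::size_t(char* out, std::size_t max)>;

// Keep reading until exactly n bytes have arrived, however they were split.
std::string read_exact(const ReadSome& read_some, std::size_t n) {
    assert(static_cast<bool>(read_some));  // an empty std::function cannot be called
    std::string data(n, '\0');
    std::size_t got = 0;
    while (got < n) {
        const std::size_t k = read_some(&data[got], n - got);
        if (k == 0) {
            throw std::runtime_error("connection closed mid-message");
        }
        got += k;
    }
    return data;
}

// Assumed header for this sketch: a 2-byte big-endian body length.
std::string read_message(const ReadSome& read_some) {
    const std::string header = read_exact(read_some, 2);
    const std::size_t body_len =
        (static_cast<std::size_t>(static_cast<unsigned char>(header[0])) << 8) |
        static_cast<unsigned char>(header[1]);
    return read_exact(read_some, body_len);
}
```

Because each step asks for an exact byte count, it does not matter whether the end of one message and the start of the next arrived in the same packet.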

If your messages are of variable length and the size is not known in advance, but there is a defined end character or sequence, then you have to keep any bytes that follow the end sequence and append the next read to that remainder.
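A minimal sketch of that last case, assuming the end sequence is a fixed string such as `"\r\n"` (the class name `DelimitedDecoder` is made up for illustration): every read is appended to a remainder buffer, complete messages are cut out, and whatever follows the last end sequence is kept for the next read.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class DelimitedDecoder {
public:
    explicit DelimitedDecoder(std::string delimiter)
        : delimiter_(std::move(delimiter)) {
        assert(!delimiter_.empty());  // an empty delimiter would never advance
    }

    // Append one read's worth of bytes; return all messages completed so far.
    std::vector<std::string> feed(const std::string& chunk) {
        remainder_ += chunk;
        std::vector<std::string> messages;
        std::size_t start = 0;
        for (;;) {
            const std::size_t end = remainder_.find(delimiter_, start);
            if (end == std::string::npos) break;  // no complete message left
            messages.push_back(remainder_.substr(start, end - start));
            start = end + delimiter_.size();
        }
        remainder_.erase(0, start);  // keep only the unfinished tail
        return messages;
    }

private:
    std::string delimiter_;
    std::string remainder_;
};
```

Searching the whole remainder rather than just the new chunk also covers the case where the end sequence itself is split across two reads.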