A checksum on the header only. Since some header fields change
(e.g., time to live), this is recomputed and verified at each point
that the internet header is processed.

The checksum algorithm is:

The checksum field is the 16 bit one's complement of the one's
complement sum of all 16 bit words in the header. For purposes of
computing the checksum, the value of the checksum field is zero.

This is a simple to compute checksum and experimental evidence
indicates it is adequate, but it is provisional and may be replaced
by a CRC procedure, depending on further experience.

Cool bear's hot tip

Yeah, it was never replaced with a Cyclic Redundancy Check. Note that this RFC was published in 1981 - the
CRC had been invented some 20 years prior!

That said, CRC-32 is used in Ethernet - it's the 4 bytes right after the payload,
the field is named “Frame check sequence”.

Note also that different polynomials can be used to compute CRC-32s - the
IEEE one (used in Ethernet) is “not the best”. That's all I'll say about it
for now; one could spend an entire article talking about CRC polynomials.

The idea behind checksumming is to include a hash alongside the actual data,
to detect transmission errors. We're interested in detection only - if we needed
to be able to correct those errors, we'd need an error correction code.

A hash function needs several properties to be suitable for error detection.
The most important one is that when the input changes even slightly, the
resulting hash should be completely different. A typical transmission error
is a single bit flip - and even that should yield a completely different sum:
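Here's a quick, dependency-free sketch of that (not the code we're building, just an illustration): a bitwise CRC-32 using the reflected IEEE polynomial - a table-driven version would be much faster - with a single bit of the input flipped between two runs:

```rust
/// Bitwise CRC-32 (reflected IEEE polynomial 0xEDB88320) - slow but tiny.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFF_u32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // if the low bit is set, shift and xor in the polynomial
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn main() {
    let original = b"The quick brown fox";
    let mut flipped = original.to_vec();
    flipped[0] ^= 0b0000_0001; // flip one bit: 'T' (0x54) becomes 'U' (0x55)

    println!("original: {:08x}", crc32(original));
    println!("flipped:  {:08x}", crc32(&flipped));
}
```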

One thing that isn't so important is for the hash function to be collision-resistant.
It is not particularly hard to find two inputs that have the same CRC-32:
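For instance, a brute-force search - reusing the crc32 function from the sketch above - finds a collision in well under a second, since the birthday bound for a 32-bit hash is only about 2^16 attempts:

```rust
use std::collections::HashMap;

fn main() {
    let mut seen: HashMap<u32, String> = HashMap::new();
    for i in 0u64.. {
        let input = format!("input-{}", i);
        // `crc32` is the function from the previous sketch
        let sum = crc32(input.as_bytes());
        if let Some(other) = seen.get(&sum) {
            println!("{:?} and {:?} both hash to {:08x}", other, input, sum);
            return;
        }
        seen.insert(sum, input);
    }
}
```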

What did we learn?

Checksums are meant to detect accidental data corruption. They are useless
against intentional data corruption.

Transmute the slice to a slice of another type, ensuring alignment of the
types is maintained.

This method splits the slice into three distinct slices: prefix, correctly
aligned middle slice of a new type, and the suffix slice. The method may make
the middle slice the greatest length possible for a given type and input
slice, but only your algorithm's performance should depend on that, not its
correctness. It is permissible for all of the input data to be returned as
the prefix or suffix slice.

This method has no purpose when either input element T or output element U
are zero-sized and will return the original slice without splitting anything.

Well, tail is pretty self-explanatory, right? If we have five u8s, we can only make
two u16s:
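Here's a sketch of that - exactly which bytes land in the head and tail depends on the address the slice happens to live at, which is why only performance, not correctness, may depend on it:

```rust
fn main() {
    let bytes: [u8; 5] = [0xAA, 0xBB, 0xCC, 0xDD, 0xEE];

    // SAFETY: every bit pattern is a valid u16, so reinterpreting
    // pairs of bytes as u16 values is sound
    let (head, middle, tail) = unsafe { bytes.align_to::<u16>() };

    println!("head:   {:02x?}", head);   // likely []
    println!("middle: {:04x?}", middle); // likely [bbaa, ddcc] on little-endian
    println!("tail:   {:02x?}", tail);   // likely [ee]
}
```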

What about the head though?

Well, what if the start of our u8 slice is not on a 16-bit (2-byte) boundary?

In our case:

We expect the input to be 16-bit aligned. Since it'll often be a
freshly-allocated slice, it'll in fact often be 64-bit aligned, which is more than enough for us.

We also expect the input's size to be a multiple of 16 bits. This is at
least true for IPv4 headers, whose size is expressed in “32-bit words”
(that's the unit of the ihl field).

However, stranger things have happened, so we'll enforce our assumptions by panicking
if they're broken. (And if we ever hit that panic, we'll need a plan B).
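Something like this, then - a minimal sketch: assert our assumptions, sum the 16-bit words, fold the carries back in, and complement the result:

```rust
fn checksum(slice: &[u8]) -> u16 {
    let (head, words, tail) = unsafe { slice.align_to::<u16>() };
    assert!(head.is_empty(), "input is not 16-bit aligned");
    assert!(tail.is_empty(), "input size is not a multiple of 16 bits");

    // u32 gives us room for the carries (plenty for header-sized inputs)
    let mut sum: u32 = 0;
    for &word in words {
        sum += word as u32;
    }
    // one's-complement addition: fold the carries back into the low 16 bits
    while sum >= 0x1_00_00 {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    // ...and finally, the one's complement of the one's-complement sum
    !(sum as u16)
}
```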

0x1_00_00 is the first value that doesn't fit in a u16 - each 00
represents a byte. In Rust, you can use underscores (_) as visual
separators in literals, which we use to our advantage here.

We could just as well have written 0x10000, 65536, or
0b1_00000000_00000000, but 0x1_00_00 just seemed like a good balance.

Let's give our code a go, shall we?

Since we're capturing live network traffic, the checksums we receive and the ones
we compute should match up. So in theory, we'd just have to zero out the checksum
field, call checksum() on the whole header, and compare the result with the
checksum we received. Makes sense!

But! There's an even nicer way to check. If we checksum the header without zeroing
out the checksum field, the result should be zero: the checksum field holds the
complement of the sum of every other word, so adding it back in gives an all-ones
sum, whose complement is zero. Let's give it a shot.
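Here's the arithmetic on a toy two-word “header”, just to convince ourselves:

```rust
fn main() {
    // a toy "header" of two 16-bit words: one data word, one checksum
    let word: u16 = 0x4500;
    let cksum: u16 = !word; // the complement of the one's-complement sum

    // verification sums *everything*, checksum field included
    let sum = word as u32 + cksum as u32;
    let folded = ((sum & 0xFFFF) + (sum >> 16)) as u16;
    assert_eq!(folded, 0xFFFF); // data + its complement = all ones...
    assert_eq!(!folded, 0);     // ...whose complement is zero
    println!("verifies to zero, as expected");
}
```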

Some of the captured packets don't verify, though - the outgoing ones. I was
curious what could cause that, so I went hunting for an explanation,
and here's what I found:

There are lots of NICs that can compute the checksum on chip.

Thus, if libpcap is loaded on a machine that is sending/receiving packets
itself, the checksum will validate correctly going in one direction, but not
the other (inbound good, outbound bad).

That's because it can sniff the packet contents BEFORE it makes it to the
wire, and before the hardware can compute the checksum. The only way to
guarantee a proper checksum is to sniff packets that have already made it to
the wire (e.g. a mirror port on a switch).

Okay, good! Now for the code field. Looks like it's unused for “Echo Reply” and “Echo Request”,
but it has a meaning for “Destination Unreachable” and “Time Exceeded”.

For example, for “Destination Unreachable”, a code of 1 means “Destination host unreachable”,
whereas a code of 4 means: “Fragmentation required, and DF flag set”. That's interesting! The
DF flag means “don't fragment”, which means we want our IP datagrams to arrive unfragmented.
If we send a datagram large enough, it might make a few hops and eventually reach a host that
doesn't support datagrams that large, and we might get back that type+code.
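If we wanted to surface those codes in our types, it could look something like this - a sketch, where the enum and variant names are made up, not our actual types:

```rust
// ICMP "Destination Unreachable" codes we care about (names assumed)
#[derive(Debug)]
enum DestinationUnreachable {
    NetUnreachable,        // code 0
    HostUnreachable,       // code 1
    FragmentationRequired, // code 4: fragmentation needed, but DF was set
    Other(u8),
}

impl From<u8> for DestinationUnreachable {
    fn from(code: u8) -> Self {
        match code {
            0 => Self::NetUnreachable,
            1 => Self::HostUnreachable,
            4 => Self::FragmentationRequired,
            other => Self::Other(other),
        }
    }
}

fn main() {
    println!("{:?}", DestinationUnreachable::from(4)); // FragmentationRequired
}
```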

Cool bear's hot tip

Turns out you can use that to find the MTU (maximum transmission unit, i.e.
the maximum size of an IP packet) of a path between two hosts - a technique
known as Path MTU Discovery.

At this point, our ipv4::Packet contains redundant information - the
constraints of an IPv4 packet are not fully encoded via the type system. When
serializing (generating bytes to send on the wire), we'll probably ignore fields
like protocol, and use only payload. Likewise, the checksum field is only
meaningful when we read an IPv4 packet, not when we generate one.

This was obtained by running ping -i 3 8.8.8.8. Note that the payload we send
is still a bunch of letters from the alphabet. The payload we get back, though,
does not seem like a string at all.

Cool bear's hot tip

The Unicode replacement character ‘�’ that you see is not because your
device is missing some fonts. It's been inserted by from_utf8_lossy because
parts of the input were not valid UTF-8.

In fact, 45 00 00 looks really familiar. It looks like an IPv4 packet, where
4 is the version, 5 is the IHL (5 32-bit words = 20 bytes), then dscp
and ecn are both zero, then total length is 00 3c - read as a 16-bit
big-endian integer, that's 0x3c, or 60.

They're not exactly the same though - what changed? We know the IPv4
header structure, let's review:

Everything is the same up until the TTL - that makes sense! At every hop,
the TTL is decreased by one. And it makes complete sense that the packet
expired when its TTL was about to drop to zero.

As for the checksum, we've already established that with npcap, outgoing
packets have a zero checksum. Even if that weren't the case, it would still be
different, as changing the TTL changes the packet's checksum. Everything
checks out.

Another interesting thing to notice is that we're trying to ping 8.8.8.8, but
it's 78.255.77.126 that's replying to us. That also makes sense - we never
made it all the way to 8.8.8.8, so some node in the middle replied to us.

The input is borrowed. It does not need to be copied. It's just a slice.

The output is owned. It cannot refer to the input in any way. We're
making u16 and u12 and u3 out of the input, but everything needs to
be owned. If we want some part of the input as-is, we need to make a Vec
out of it, which is part of the reason we made Blob.

The error can reference the input.

So the input is borrowed until we either:

discard the error, or

succeed in parsing it.
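Here's a minimal, self-contained illustration of that shape - the types are stand-ins, not our actual ipv4 ones:

```rust
// the Ok variant owns its data...
#[derive(Debug)]
struct Packet {
    version: u8,
}

// ...while the Err variant borrows the input
#[derive(Debug)]
struct ParseError<'a> {
    input: &'a [u8],
}

fn parse(input: &[u8]) -> Result<Packet, ParseError<'_>> {
    match input.first() {
        Some(&b) if b >> 4 == 4 => Ok(Packet { version: b >> 4 }),
        _ => Err(ParseError { input }),
    }
}

fn main() {
    let data = vec![0x45, 0x00];
    match parse(&data) {
        // on success, `data` is no longer borrowed - Packet is owned
        Ok(packet) => println!("parsed: {:?}", packet),
        // on failure, the error still borrows `data`
        Err(e) => println!("failed on input: {:02x?}", e.input),
    }
}
```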

When serializing though, we're generating bytes. Do we still need to worry
about borrowing / ownership? That all depends.

However, we're not going to do that. Among the many reasons: why the
heck are we dealing with I/O errors here? We're just generating bytes; we're not
supposed to care whether they're successfully written out or not.

But the point is: if we were doing that, then we wouldn't have to worry
about ownership. By the time serialize returns, we can throw away the
icmp::Packet completely, because it's all written out.

Instead, we're going to do something similar to what we did for parsing with
nom - except the other way around.

knows how to serialize &self for a given W, which should be valid for the lifetime 'a (see W: Write + 'a)

Also, this serializer is purely declarative. It's just a tuple of 16-bit
big-endian numbers!
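Something in this spirit, say - a toy stand-in (assuming cookie-factory 0.3's gen, tuple, and be_u16; the fields are the echo header's identifier and sequence number, but this isn't our actual code):

```rust
use cookie_factory::{bytes::be_u16, gen, sequence::tuple, SerializeFn};
use std::io::Write;

// a declarative serializer: just a tuple of big-endian u16 serializers
fn echo_words<'a, W: Write + 'a>(identifier: u16, sequence: u16) -> impl SerializeFn<W> + 'a {
    tuple((be_u16(identifier), be_u16(sequence)))
}

fn main() {
    let (buf, _len) = gen(echo_words(0xf00d, 0x0001), Vec::new()).unwrap();
    println!("{:02x?}", buf); // [f0, 0d, 00, 01]
}
```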

What did we learn?

The same way nom lets us combine parsers, cookie-factory lets us combine
serializers.

Serializers are just functions. They borrow (or copy) their input, and can
serialize it any number of times to a compatible output type. They're entirely
uninterested in I/O errors, which are handled on another level.

Alright, that works! Having a match here will ensure that we have exhaustive
coverage of all the variants, so if we handle other kinds of ICMP request headers,
we'll get a nice compile-time error here to remind us to serialize that, too.

Now, we said earlier that, when serializing, we'd disregard the typ field
because it's redundant with Header.

Also, we're going to need our ICMP packet to have a valid checksum,
and some of the information we can serialize from Header will be before
the checksum, while the rest will be after the checksum.

So let's make another function to serialize what goes before the checksum.

Heyyyyy. They do look similar. The only differences are at byte offsets 2 and 3,
which… that's the checksum!

Marty! You've gotta come back with me!

Where? Back to the buffer!

So we've got a little pickle. We need to compute a checksum for our
freshly-generated ICMP packet, but some of what we serialize goes before
the checksum and some goes after it - so by the time we need to write
the checksum, we haven't yet generated everything it covers.

Let's say you're writing a binary format that has messages of variable length,
and when serialized, they're prefixed by their length, as a u32.

back_to_the_buffer would let you reserve 4 bytes, then write out
the message itself:

Then it would go back and invoke a closure with a write context
at the position we reserved, passing along the result of the message
payload's serialization. Now that that's done, we know the message is
12 bytes long, so we can write a u32 with value 12:
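The mechanism - in plain Rust rather than cookie-factory's actual API - is just “remember a position, keep writing, then go back and patch”:

```rust
fn length_prefixed(payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::new();
    let reserved_at = buf.len();
    buf.extend_from_slice(&[0u8; 4]); // reserve 4 bytes for the length
    buf.extend_from_slice(payload);   // write out the message itself
    // go back and patch the length in, big-endian
    let len = (buf.len() - reserved_at - 4) as u32;
    buf[reserved_at..reserved_at + 4].copy_from_slice(&len.to_be_bytes());
    buf
}

fn main() {
    let buf = length_prefixed(b"hello, world"); // 12 bytes of payload
    println!("{:02x?}", buf); // starts with [00, 00, 00, 0c] - 0x0c is 12
}
```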

And voilà!

Unfortunately, we can't use that. We'd need to reserve space in the middle.

We could technically come up with our own riff on BackToTheBuffer (it's all
just a bunch of seeks anyway - I know, I looked), but let's not worry too much
about it for now. It's nice to know we have the option, if we ever need to
squeeze some extra performance out of our custom network stack.

Cool bear's hot tip

We won't.

We won't need to.

So for now, let's do it the old-fashioned way: first generating the entire
ICMP packet in a buffer, then computing the checksum, then overwriting the
checksum part of the buffer with our u16 result.
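In sketch form - reusing the checksum function from earlier, and assuming the serialized packet arrives as a Vec<u8>:

```rust
fn finalize_icmp(mut packet: Vec<u8>) -> Vec<u8> {
    // the ICMP checksum field lives at byte offsets 2 and 3 - zero it
    // out first, so it doesn't contribute to the sum
    packet[2] = 0;
    packet[3] = 0;
    let sum = checksum(&packet); // the function from earlier
    // the sum was computed over native-endian words, so we store it
    // back in native byte order
    packet[2..4].copy_from_slice(&sum.to_ne_bytes());
    packet
}
```

Why is it okay to store the native-endian sum directly? RFC 1071 explains: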

The sum of 16-bit integers can be computed in either byte order.
Thus, if we calculate the swapped sum:

[B,A] +’ [D,C] +’ … +’ [Z,Y]

the result is the same as [before], except the bytes are swapped in
the sum! To see why this is so, observe that in both orders the
carries are the same: from bit 15 to bit 0 and from bit 7 to bit
8. In other words, consistently swapping bytes simply rotates
the bits within the sum, but does not affect their internal
ordering.

Therefore, the sum may be calculated in exactly the same way
regardless of the byte order (“big-endian” or “little-endian”)
of the underlaying hardware. For example, assume a “little-
endian” machine summing data that is stored in memory in network
(“big-endian”) order. Fetching each 16-bit word will swap
bytes, resulting in the sum [above]; however, storing the result
back into memory will swap the sum back into network byte order.

Note that our code will now only work on little-endian machines,
because we're explicitly using le_u16 when serializing, which would
do an extra swap on big-endian machines. So, you know, don't go
run this on OpenRISC with no modifications.

It does! And we don't even have to worry about those extra 20 bytes that
Blob doesn't print, because we sneaked an assert_eq! in there, so our
code would crash if it didn't work.

And just like that, we're seemingly real close to sending our own, hand-crafted
network traffic… We just have to also serialize IPv4 packets and, oh, Ethernet
frames. Well, that shouldn't be too hard, right?