Path MTU Discovery on the U-Verse RG

I broke this out into a separate thread from where it originally started, since it is a separate issue.

It started with a discussion of using a VPN router behind the RG, in the DMZ, and doing IPSec/ESP through the RG: 1) whether that will work properly or at all, and 2) what problems might arise in this configuration.

I can also confirm that IPSec/ESP and IPSec/AH both work properly through the RG when the VPN router is set in the DMZ. However, there is a caveat: The RG does not properly send ICMP packet-too-big responses back to the VPN router, which breaks PMTU discovery. You will have to set the IP MTU manually on your IPSec tunnel to avoid fragmentation. I used an MTU of 1350, and it was working fine. This was on a Cisco 2811 with a tunnel to a Cisco 2851.

Thanks for that information about path MTU discovery. But why would the RG have to generate too-large messages? Is the MTU anything less than 1500?

OK, well the issue is the way PMTU Discovery works. First, remember that MTU = Maximum Transmission Unit, the maximum frame size that a network can handle due to its architecture. Note that MTU is generally a layer 2 limitation.

In the old days, different networks had different MTU sizes, and there was no way for the end stations to know which networks their packets might traverse or what the MTU of each of those networks was. If a station sent a 1500 byte IP packet, but there was a link between the two stations that had a 576 byte limit, the IP packet would be fragmented into three smaller IP packets, each carried in its own layer 2 frame across that network, and would then get put back together.
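To make the 1500-over-576 example concrete, here is a minimal Python sketch of the fragment arithmetic. It assumes a plain 20-byte IPv4 header with no options; the function name is just for illustration.

```python
def fragment_sizes(packet_len, mtu, header_len=20):
    """Return the total length (header + data) of each IPv4 fragment."""
    if packet_len <= mtu:
        return [packet_len]
    payload = packet_len - header_len
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the fragment offset field counts in 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    sizes = []
    while payload > 0:
        data = min(max_data, payload)
        sizes.append(header_len + data)
        payload -= data
    return sizes

print(fragment_sizes(1500, 576))  # [572, 572, 396]
```

Note that each fragment repeats the IP header, so the three fragments together carry 40 more bytes than the original packet.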

There are several problems with fragmentation. First, it uses a lot of CPU/memory resources on the routers that have to fragment and reassemble. On the receiving router, fragments have to be held in memory until all the pieces have been received before they can be reassembled. Second, if the link is dropping a small percentage of frames, it drastically increases the likelihood that a given IP packet will have to be retransmitted by TCP, because if one fragment is lost, all the fragments must be resent.
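The effect of loss on fragmented packets can be estimated with a short calculation. This sketch assumes each frame is lost independently with the same probability, which real links only approximate.

```python
def packet_loss_probability(frame_loss, fragments):
    """Chance that at least one of the fragments is lost, so the whole
    IP packet has to be retransmitted."""
    return 1 - (1 - frame_loss) ** fragments

for n in (1, 2, 3):
    print(n, round(packet_loss_probability(0.01, n), 6))
```

With a 1% frame loss rate, a packet split into three fragments is lost roughly three times as often as an unfragmented one.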

To avoid fragmentation, you used to have to be able to see all the network segments (or know their MTU), and manually adjust your MTU on your computer downward until the packet sizes your computer was sending out were small enough so that they wouldn't get fragmented.

Starting around Windows 95, end stations began using a mechanism to avoid IP fragmentation called path MTU discovery, or PMTUD (specified in RFC 1191). The end station simply sends all IP packets out with the DF (Don't Fragment) bit set in the IP header. If such a packet hits a router whose outgoing network has an MTU smaller than the packet size (which would require fragmentation), the router is supposed to drop the packet and send an ICMP "Packet Too Big" message back to the computer. (In IPv4 this is formally an ICMP Destination Unreachable message, type 3 code 4, "fragmentation needed and DF set".)
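For anyone looking at packet captures, this is roughly where the DF bit sits in the IPv4 header. The sketch builds a bare 20-byte header by hand (checksum left at zero, addresses taken from the documentation ranges) purely to show the flag test; the helper names are made up for this example.

```python
import struct

DF_FLAG = 0x4000  # "Don't Fragment" bit in the 16-bit flags/fragment-offset field
MF_FLAG = 0x2000  # "More Fragments" bit, set on all but the last fragment

def build_header(total_len, df):
    """Build a minimal IPv4 header (no options, checksum not computed)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len,          # version/IHL, TOS, total length
        0, DF_FLAG if df else 0,     # identification, flags + fragment offset
        64, 6, 0,                    # TTL, protocol (TCP), checksum placeholder
        bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]),
    )

def has_df(ipv4_header):
    """True if the DF bit is set in the given IPv4 header bytes."""
    (flags_frag,) = struct.unpack("!H", ipv4_header[6:8])
    return bool(flags_frag & DF_FLAG)

print(has_df(build_header(1500, df=True)))   # True
print(has_df(build_header(1500, df=False)))  # False
```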

When the sending computer receives an ICMP Packet-Too-Big message, it lowers the IP datagram size and sends the data out again. Eventually a packet will make it all the way through to the target station without being fragmented, and the computer just remembers the correct IP datagram size for that TCP session.
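The retry loop described above can be sketched as a toy simulation. It assumes every router reports its next-hop MTU in the ICMP message, as RFC 1191 asks; the list of link MTUs is invented for the example.

```python
def discover_path_mtu(link_mtus, start_size=1500):
    """Simulate a sender probing a path with DF set.

    link_mtus lists the MTU of each link in order from sender to receiver.
    Returns the size that finally got through and every size that was tried.
    """
    size = start_size
    attempts = []
    while True:
        attempts.append(size)
        # The first link too small for the packet is where it gets dropped.
        blocking = next((mtu for mtu in link_mtus if mtu < size), None)
        if blocking is None:
            return size, attempts
        # The router drops the packet and reports the MTU of that link.
        size = blocking

print(discover_path_mtu([1500, 1492, 1400, 1500]))
# (1400, [1500, 1492, 1400])
```

If a device on the path fragments the packet or silently drops it instead of sending the ICMP message, the loop never gets its feedback, which is the failure mode this thread is about.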

MTU issues show up a lot in IPSec implementations because of the repeated encapsulations for all the carrier protocols. For example, on an IPSec/ESP link coupled with GRE, a source IP packet is encapsulated in GRE, which is then encapsulated in IPSec/ESP. Each encapsulation reduces the available MTU to the computer.

The issue with the U-Verse RG is that it seems to ignore the DF bit: instead of dropping the packet and sending ICMP Packet-Too-Big, it just goes ahead and fragments the packet anyway. This obviously breaks PMTUD, requiring you to set the MTU manually on your router.

For a Cisco IPSec/ESP with GRE tunnel, the manually set MTU should be 1438, but I typically set it a little lower to avoid any unexpected issues, say to 1400.
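As a rough sketch of where numbers like these come from, the following Python estimates the largest client packet that fits in one 1500-byte GRE-over-ESP packet. Everything here is an assumption for illustration: a 4-byte GRE header with no key, a 20-byte outer IP header, and ESP with the given cipher block size, IV length, and ICV length. The exact figure depends on the cipher, the IPSec mode, and the GRE options in use, so this is not meant to reproduce the 1438 quoted above.

```python
def max_client_packet(link_mtu=1500, tunnel_mode=False, block=8, iv=8, icv=12):
    """Largest client IP packet that fits in one GRE-over-ESP packet."""
    outer_ip = 20      # outer IPv4 header, no options
    esp_header = 8     # SPI + sequence number
    esp_trailer = 2    # pad length + next header bytes
    gre_header = 4     # basic GRE header, no key or sequence number
    # Bytes left for the encrypted part, rounded down to whole cipher blocks.
    room = link_mtu - outer_ip - esp_header - iv - icv
    room -= room % block
    inside = room - esp_trailer - gre_header
    if tunnel_mode:
        inside -= 20   # tunnel mode also carries the GRE delivery IP header inside ESP
    return inside

# 3DES-CBC with HMAC-SHA1-96 (8-byte blocks, 8-byte IV, 12-byte ICV)
print(max_client_packet())                                        # 1442
print(max_client_packet(tunnel_mode=True))                        # 1422
# AES-CBC with HMAC-SHA1-96 (16-byte blocks, 16-byte IV, 12-byte ICV)
print(max_client_packet(block=16, iv=16))                         # 1434
print(max_client_packet(tunnel_mode=True, block=16, iv=16))       # 1414
```

A conservative setting such as 1400 sits below all four of these results, which is one reason to round down rather than chase the exact value.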

Re: Path MTU Discovery on the U-Verse RG

That's a good thumbnail sketch of PMTU. But I understand how it works, and I know that the headers added by IPSEC reduce the effective MTU as seen by the applications. So my point is this: under what circumstances should the Uverse RG ever have to return a datagram-too-large ICMP message? I presume it can handle 1500 byte packets, and since you talk to it with standard Ethernet with an MTU of 1500 bytes, how could you ever send it a packet that would be too large? Are any of your computers configured to send jumbograms?

The only entities that should ever have to return an ICMP datagram-too-big message over Ethernet are those with MTUs less than 1500 bytes, notably tunnels like your VPN router. It's up to your VPN router, the thing that's doing the encapsulation into ESP or AH, to limit the size of the packets it can accommodate and to return the path MTU discovery messages to its clients to tell them that the MTU is something less than 1500 bytes.

Re: Path MTU Discovery on the U-Verse RG

It is true that the RG shouldn't have to send an ICMP PTB as long as the VDSL link can handle a 1500 byte packet. But I'm not sure it can.

I believe someone has posted that the upstream packets on the broadband port have an 802.1q VLAN header, which would add 8 bytes, leaving the MTU at 1492.

I've also seen people over at DSL reports post that after trial-and-error, their MTU was indeed 1492. Although some people also posted that theirs was 1500. You probably can't really find out unless you have an ability to see what the packet fragmentation might be at the other end of the link (e.g. a router under your control).

Re: Path MTU Discovery on the U-Verse RG

I just ran some tests. From one of my static addresses, i.e., NOT through the NAT in the 3800, I had no trouble getting responses to 1500 byte pings to several remote sites. I.e., the ethernet frames leaving my computer were 1514 bytes long (1500 byte IP datagram plus 14-byte Ethernet header) and the Ethernet frames leaving the 3800 over the VDSL link were 1518 bytes after the addition of the 4-byte 802.1q tag.
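The sizes in that test work out as follows. The header lengths are the standard ones (20-byte IPv4 header without options, 8-byte ICMP echo header, 14-byte Ethernet header, 4-byte 802.1q tag), and the frame sizes exclude the Ethernet FCS.

```python
IP_HEADER = 20     # IPv4 header without options
ICMP_HEADER = 8    # ICMP echo request header
ETH_HEADER = 14    # destination MAC + source MAC + EtherType
DOT1Q_TAG = 4      # 802.1q VLAN tag

def ping_payload_for(ip_datagram_len):
    """ICMP data size to ask ping for to get a datagram of the given size."""
    return ip_datagram_len - IP_HEADER - ICMP_HEADER

def frame_len(ip_datagram_len, tagged=False):
    """Ethernet frame length (without FCS) carrying the datagram."""
    return ip_datagram_len + ETH_HEADER + (DOT1Q_TAG if tagged else 0)

print(ping_payload_for(1500))         # 1472
print(frame_len(1500))                # 1514
print(frame_len(1500, tagged=True))   # 1518
```

To repeat the test with DF set, something like `ping -M do -s 1472 <host>` on Linux or `ping -f -l 1472 <host>` on Windows should produce a 1500-byte datagram.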

I got the same results for three separate sites: 74.53.67.2, the commercial server that hosts my personal website; Google's public DNS resolver 8.8.8.8; and the next-hop router in the Uverse network beyond my 3800, 99.71.136.1. I can still see my IP fragments leaving the 3800 over the VDSL link, complete with 802.1q tag, so the 3800 is definitely not blocking them. At the moment I have no root access to a remote ping target, nor can I see the downstream link from the VDSL modem to the 3800, so I cannot tell if my IP fragments aren't reaching their destination, or if they are and the reply (which would also consist of two IP fragments) is being blocked somewhere along the way.

So perhaps your problem is in the way your VPN box interacts with AT&T's network beyond the 3800, i.e., this could be one case where the 3800 is not at fault. If your VPN is configured in such a way as to generate IPv4 fragments instead of returning an ICMP message when a client packet plus ESP headers would exceed 1500 bytes, those packet fragments would not get through.

Since an IPSEC tunnel knows the MTU of its outbound interface, it should know how big a client packet it can tunnel without having to generate fragments. So if a client packet comes in with the DF bit set that exceeds this limit, the IPSEC tunnel should obviously return an ICMP unreachable with the proper MTU. But what if the client doesn't set the DF bit? In this case, I think the IPSEC tunnel might well go ahead and generate those fragments, and then you'll have problems over the Uverse path. Can you check that all the clients of your VPN box are setting the DF bit, i.e., they have path MTU discovery enabled?
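The decision being described can be written down in a few lines. This is only a sketch of the expected behavior, not of any particular vendor's implementation; `max_client_len` stands for whatever the tunnel works out as the largest client packet it can carry without fragmenting.

```python
def tunnel_ingress(packet_len, df, max_client_len):
    """Decide what a tunnel endpoint does with an arriving client packet.

    Returns (action, reported_mtu); reported_mtu is only set when an
    ICMP "fragmentation needed" message goes back to the client.
    """
    if packet_len <= max_client_len:
        return ("forward", None)
    if df:
        # Client is doing PMTUD: drop and tell it the usable MTU.
        return ("icmp_frag_needed", max_client_len)
    # No DF bit: the tunnel is free to fragment, which is the risky case here.
    return ("fragment", None)

print(tunnel_ingress(1500, df=True, max_client_len=1400))   # ('icmp_frag_needed', 1400)
print(tunnel_ingress(1500, df=False, max_client_len=1400))  # ('fragment', None)
print(tunnel_ingress(1300, df=True, max_client_len=1400))   # ('forward', None)
```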

Re: Path MTU Discovery on the U-Verse RG

All Windows machines have PMTUD enabled by default, although this can be turned off with a registry entry. However, I did not have that registry entry on the machines that I was testing from.

On a Cisco router with a GRE to IPSec tunnel, you use the tunnel path-mtu-discovery command to cause the router to copy the DF bit from the IP packet header to the GRE packet. I did have that enabled on the GRE tunnel. I believe the IPSec implementation does this automatically when the GRE packet is encrypted by IPSec.

The Cisco document says that the most common cause of PMTUD failure is a firewall that improperly blocks the ICMP Packet Too Big response. I don't think this was a problem, but I didn't specifically test it in my setup. Next time I do this tunnel (which I do periodically as part of my job), I will specifically test it against ICMP.

Re: Path MTU Discovery on the U-Verse RG

A couple of items to think about. First, 2Wire only supports the following VPN configuration:

Verify that the VPN is not using IPSEC protocol 50 ESP in transport mode and IPSEC protocol 51 as they are incompatible with 2Wire gateways. Only IPSEC Protocol 50 ESP in tunnel mode will work behind 2Wire gateways.

Second, you can force the upstream MTU for the RG to be lower than the default of 1500 using this page in the RG.

Re: Path MTU Discovery on the U-Verse RG

Can anyone explain that cryptic piece of prose from 2WIRE? What do they mean, "incompatible with 2Wire gateways"? If it functions as a conventional NAT DMZ, and if you've designated your VPN box as the DMZ host for the primary public address assigned to the 3800, then all protocols not otherwise handled by the 3800 (TCP, UDP, ICMP) or diverted to other host applications ought to be sent to the DMZ host by default, and that should include the IPSEC protocols 50 and 51. And if you have an optional block of static IP addresses, then ALL packets to the address you've assigned to your VPN box ought to be sent to it, regardless of protocol.

This is what's so frustrating about the 3800, trying to figure out what the hell they actually mean by a particular phrase like that, or by "Stealth mode", or "Strict UDP session control" or "miscellaneous attack" or ...

It would be interesting to do a study of all tech support forums related to home routers and their configuration and see what fraction is dominated by discussions about struggles with kludges like NAT port forwarding and DMZ. If home router makers would just implement IPv6 6to4 tunneling, we could leave all this in the past even before the ISPs implement IPv6 themselves.

Message Edited by ka9q on 12-23-2009 04:25 AM

Re: Path MTU Discovery on the U-Verse RG

The explanation said it was a matter of compatibility with transport mode versus tunnel mode; i.e., you cannot originate or terminate a tunnel at the RG proper, but it does support pass-through.

Doing 6-to-4 is stupid for anything other than necessary compatibility issues ... which do not yet exist. Do you have some IPv6-only equipment at home? Any unnecessary encaps, conversion, or translation just adds to the processing load and end-to-end propagation delay.

The "kludges" are usually only desired by an occasional user who wants to do something well beyond the vast majority of users. If the frustration of trying to get a ten pound bag to hold twenty-five pounds of "stuff" is really bringing you down, try a commercial account (non-U-Verse) and a nice commercial router (I use a Cisco 2821) that will do all that stuff without having to kludge. To build an all-inclusive geek-satisfying unit would raise the service costs (and rates for everyone else) unnecessarily.

The alternative is the DMZ of the RG and an auxiliary router (as long as it permits DHCP on the "outside" port), which permits virtually all activity without interference.

Re: Path MTU Discovery on the U-Verse RG

Well, it certainly makes sense that the 3800 can't originate or terminate an IPSEC tunnel since it doesn't support those protocols, but I don't see why the device you've designated as the DMZ host can't do either. As long as the 3800 passes protocols 50 and 51, you should be fine.

I beg to differ on 6to4. It is by far the best "NAT penetration" tool currently available to those with only a single publicly routable IPv4 address. Until I got a block of extra static IPv4 addresses, the computers on my network at home (all of which have supported dual stack for years) each had a private IPv4 address and a fully routable IPv6 address. The IPv4 addresses were subject to the usual NAT limitation to outbound-only connections. But their IPv6 addresses can be reached from the outside in a totally transparent fashion without any NAT port forwarding kludges of any kind. The only cost is the presence of both an IPv4 and an IPv6 header over the IPv4 Internet, i.e., the packets are 20 bytes larger than if I were running native IPv6. I don't even notice it. The important thing for performance, latency, etc, is that the route taken by my packets between two IPv6 hosts with 6to4 addresses is exactly the same as if I had native IPv6 addressing; there are no extra hops to anybody's tunnel.

Only when a 6to4 host talks to a non-6to4 IPv6 host do you need to tunnel through a third party and likely use a route that is physically longer than it otherwise could be. I don't seem to do that very often as my primary use of IPv6 is to reach back into my own network from the outside. I do think sites on the "real" IPv6 Internet should also maintain 6to4 addresses through a local 6to4 tunnel and use them when talking to a remote 6to4 site.

All I need is a single public IPv4 address from whoever is providing me with service and I can talk directly to any port on any of my home computers despite having only one public IPv4 address at home. And that's the only drawback to 6to4 that I've encountered so far. Most airports, coffee shops and hotels now stick you behind a NAT, and 6to4 won't work when the tunnel has a private IPv4 address. (There's Teredo, but I haven't tried it.) The solution to this is to lobby the vendors of the commodity NATs to include a 6to4 tunnel alongside the IPv4 NAT. Then traveling users with dual-stack hosts (which is virtually everybody) can have a private IPv4 address and a public IPv6 address. I've found it a very workable arrangement.

Re: Path MTU Discovery on the U-Verse RG

Right, there's no encryption when you just wrap an IPv6 packet inside an IPv4 packet. The outer IPv4 header, the one the Internet sees, has the value '41' in the protocol field, meaning that what follows is an unencrypted IPv6 header.

The purpose of 6to4 tunneling is simply to give you IPv6 networking functionality when all you have is an IPv4 network. A single routable IPv4 address is automatically associated with a /48 IPv6 address block, meaning you have 80 bits on the right hand side to play with. In usual practice, the rightmost 64 bits are used with stateless autoconfiguration for the host part, constructed by expanding the 48-bit Ethernet address of the interface to 64 bits by inserting the constant FFFE in the middle. That leaves 16 bits for a subnetwork ID within an organization. So you can have up to 65536 subnets, each with an essentially unlimited number of hosts, all hung off a single solitary IPv4 address, and no prior arrangements or assignments from IANA or anyone else are necessary. It might be a little extreme to build a network this big with just a single point of attachment to the outside world, but at least you won't run out of addresses if you try.
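Here is a small sketch of how the /48 falls out of the IPv4 address, using Python's standard `ipaddress` module. The 6to4 prefix is 2002::/16 followed by the 32 bits of the public IPv4 address; the address 192.0.2.1 is just a documentation example, and the function names are made up.

```python
import ipaddress

def sixto4_prefix(public_v4):
    """Return the 6to4 /48 block that belongs to a public IPv4 address."""
    v4_int = int(ipaddress.IPv4Address(public_v4))
    # 16 bits of 0x2002, then the 32-bit IPv4 address, then 80 zero bits.
    return ipaddress.IPv6Network(((0x2002 << 112) | (v4_int << 80), 48))

def subnet(prefix48, subnet_id):
    """Return the /64 with the given 16-bit subnet ID inside a /48."""
    base = int(prefix48.network_address) | (subnet_id << 64)
    return ipaddress.IPv6Network((base, 64))

prefix = sixto4_prefix("192.0.2.1")
print(prefix)               # 2002:c000:201::/48
print(subnet(prefix, 1))    # 2002:c000:201:1::/64
print(2 ** (64 - 48))       # 65536 possible /64 subnets
```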

Of course you can always run IPSEC on top of IPv6 - technically, it's mandated to be available, if not used - and secure your traffic that way. Or you can do as I do and encrypt at the application layer, ssh for remote logins and file transfers (including rsync) and SSL/TLS for mail (IMAPS and SMTP/TLS).

Re: Path MTU Discovery on the U-Verse RG

You can certainly use an ordinary VPN to reach into your home network even when all you have is a single IPv4 public address. That can work just fine. You don't even have to encrypt; you could use direct tunneling, e.g. with IP protocol #4 indicating that what follows the outer IPv4 header is another IPv4 header. I did that for years with a Linux box on my company's DMZ so I could 'extrude' a block of their static address space over my cable modem to my home network.

Or you could use Cisco GRE and save a few bytes.

The only drawback to that approach happens if you want to keep multiple VPNs up to multiple private networks and they happen to use the same private addresses. How do you say that you want to reach the 192.168.0.10 through VPN tunnel A as opposed to the 192.168.0.10 through VPN tunnel B?

192.168.0.0/24 and 192.168.1.0/24 are used a lot more heavily than, say, 10.56.30.0/24, so these address collisions actually happen quite often. If you're in a position to renumber one or both networks to resolve the conflict, then fine; but sometimes, as in large companies, you're not. My company avoids the low 192.168 blocks internally precisely because they are so popular on home networks.
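A quick way to spot such collisions before bringing up several tunnels is to compare the remote prefixes pairwise. This sketch uses Python's `ipaddress` module; the tunnel names and prefixes are hypothetical.

```python
import ipaddress
from itertools import combinations

def find_collisions(tunnels):
    """Return pairs of tunnel names whose remote networks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in tunnels.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]

tunnels = {
    "A": "192.168.0.0/24",
    "B": "192.168.0.0/24",
    "C": "10.56.30.0/24",
}
print(find_collisions(tunnels))  # [('A', 'B')]
```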

There are some advantages to using IPv6, even on small networks. Probably the nicest is "stateless autoconfiguration", which does away with the DHCP server. Each router (you can have more than one) issues a "router advertisement", giving the prefix (usually 64 bits) for the subnetwork that it is routing for. Hosts pick up these multicasts and fill in the lower 64 bits with a host-specific part. This can be a random number if desired for privacy purposes, but it is usually the interface Ethernet MAC address, expanded from 48 to 64 bits by inserting the constant FFFE in the middle.
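The MAC-to-host-part expansion looks roughly like this in Python. One detail the description above leaves out: the standard "modified EUI-64" procedure also flips the universal/local bit (0x02) in the first octet of the MAC. The prefix and MAC address below are examples only.

```python
import ipaddress

def eui64_interface_id(mac):
    """Expand a 48-bit MAC into a 64-bit modified EUI-64 interface ID."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    eui = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return int.from_bytes(eui, "big")

def slaac_address(prefix64, mac):
    """Combine an advertised /64 prefix with the interface ID."""
    net = ipaddress.IPv6Network(prefix64)
    return ipaddress.IPv6Address(int(net.network_address) | eui64_interface_id(mac))

print(slaac_address("2002:c000:201:1::/64", "00:11:22:33:44:55"))
# 2002:c000:201:1:211:22ff:fe33:4455
```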

When you combine stateless autoconfiguration with multicast DNS, you can also get rid of the need for the local DNS server, so there's one less thing to break and wreak havoc on your network. It's true that there's an autoconfiguration procedure for IPv4 as well (that's where those mysterious 169.254.xxx.xxx addresses come from) but the IPv6 mechanisms are a lot cleaner because you have so much more room.

Re: Path MTU Discovery on the U-Verse RG

(Quote ka9q)

The only drawback to that approach happens if you want to keep multiple VPNs up to multiple private networks and they happen to use the same private addresses. How do you say that you want to reach the 192.168.0.10 through VPN tunnel A as opposed to the 192.168.0.10 through VPN tunnel B?

192.168.0.0/24 and 192.168.1.0/24 are used a lot more heavily than, say, 10.56.30.0/24, so these address collisions actually happen quite often. If you're in a position to renumber one or both networks to resolve the conflict, then fine; but sometimes, as in large companies, you're not. My company avoids the low 192.168 blocks internally precisely because they are so popular on home networks.

(end quote)

Interconnecting multiple networks with the same address blocks can be done without too much effort with NAT. It's basically the same NAT, but operating from the inside out instead of the outside in.
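One way to read that suggestion is a 1:1 prefix translation: each colliding remote network is presented locally under its own alias prefix, and the NAT swaps the prefix while keeping the host part. The sketch below only shows the address mapping, under that assumption; the alias prefixes are hypothetical.

```python
import ipaddress

def remap(addr, real_net, alias_net):
    """Translate an address from its real prefix to an alias prefix of the
    same size, keeping the host part unchanged."""
    real = ipaddress.ip_network(real_net)
    alias = ipaddress.ip_network(alias_net)
    if real.prefixlen != alias.prefixlen:
        raise ValueError("real and alias networks must be the same size")
    address = ipaddress.ip_address(addr)
    if address not in real:
        raise ValueError(f"{addr} is not inside {real_net}")
    offset = int(address) - int(real.network_address)
    return ipaddress.ip_address(int(alias.network_address) + offset)

# The same 192.168.0.10 seen through two tunnels under different aliases.
print(remap("192.168.0.10", "192.168.0.0/24", "10.200.1.0/24"))  # 10.200.1.10
print(remap("192.168.0.10", "192.168.0.0/24", "10.200.2.0/24"))  # 10.200.2.10
```

With a mapping like this, the host behind tunnel A and the host behind tunnel B get distinct local addresses even though both are really 192.168.0.10.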