Configuring adjust-mss + ip mtu on the same tunnel: does it affect HSRP?

I am using a GRE-IPSEC tunnel configuration + EIGRP routing protocol and I would like to confirm if I am right with the following:

Because I am configuring IP MTU = 1400 on the tunnel interface, I am avoiding additional fragmentation after GRE encapsulation, since there is enough room left for the IPsec encapsulation (whichever mode is used; I am considering the worst case, tunnel mode). However, I would like to know what happens if I ALSO add the command ip tcp adjust-mss 1360 to the tunnel configuration, which clearly operates at layer 4 during the three-way handshake that establishes the TCP connection/session between the opposite hosts (in this case the interaction is with the respective end routers). By adding this MSS command, I understand that I could also eliminate the initial fragmentation of the 1500-byte LAN packets (before GRE encapsulation), because the hosts are told to send 1360-byte segments to the router; based on that, I would be able to transfer packets without "theoretical" fragmentation between both ends.

One more question: how would adding this command (ip tcp adjust-mss) affect the performance (CPU + memory) of a 3845 or 7200 router? Could it, for example, crash the device entirely? I understand that this TCP MSS adjustment is process intensive for the router, but less so than IPsec encryption/decryption.

Replies

Sorry, I forgot to ask: is there any significant impact on my HSRP operation after adding adjust-mss on both routers (primary/backup), such that the TCP session MSS needs to be recalculated after a failure of the primary HSRP router?

Because I am configuring IP MTU = 1400 on the tunnel interface, I am avoiding additional fragmentation after GRE encapsulation, since there is enough room left for the IPsec encapsulation (whichever mode is used; I am considering the worst case, tunnel mode).

Yes, you are correct. I assume you are familiar with the following document:

It directly recommends lowering the IP MTU on tunnel interfaces to 1400 bytes exactly as you did.

By adding this MSS command, I understand that I could also eliminate the initial fragmentation of the 1500-byte LAN packets (before GRE encapsulation), because the hosts are told to send 1360-byte segments to the router; based on that, I would be able to transfer packets without "theoretical" fragmentation between both ends.

Yes, this is absolutely correct. In fact, whenever you lower the IP MTU to a value less than 1500 bytes, you should always accompany it with the TCP MSS adjustment (another 40 bytes below the configured IP MTU) to prevent end hosts from sending overly large TCP segments that would require fragmentation.
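The 40-byte decrement is just the option-less IPv4 and TCP headers. A quick sketch of the arithmetic (the values are the ones from this thread, not output from any Cisco tool):

```python
# Headroom arithmetic behind "ip mtu 1400" + "ip tcp adjust-mss 1360".
# Assumes option-less IPv4 and TCP headers (20 bytes each).
IP_HEADER = 20
TCP_HEADER = 20

def mss_for_tunnel(ip_mtu: int) -> int:
    """Largest TCP payload that fits in one packet at the given IP MTU."""
    return ip_mtu - IP_HEADER - TCP_HEADER

print(mss_for_tunnel(1400))  # 1360, the value configured in this thread
```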

How would adding this command (ip tcp adjust-mss) affect the performance (CPU + memory) of a 3845 or 7200 router? Could it, for example, crash the device entirely? I understand that this TCP MSS adjustment is process intensive for the router, but less so than IPsec encryption/decryption.

This is in general difficult to say. The ip tcp adjust-mss command applies only to the first and second TCP segments exchanged during connection establishment (i.e. segments with the SYN flag set). Also, the modification merely involves setting the MSS in the TCP header to min(segment MSS, configured MSS), plus the usual checksum recalculation. Other segments are not inspected or modified. Therefore, I would personally believe that the MSS adjustment overhead is in fact quite negligible - someone please correct me if I am wrong here.
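Peter's min(segment-MSS, configured-MSS) rewrite can be illustrated with a small sketch that walks the TCP options of a SYN and clamps the MSS option in place. This is a toy model of the behavior, not Cisco's actual implementation; it assumes well-formed options laid out as in RFC 793 (MSS option kind 2, length 4):

```python
import struct

MSS_KIND = 2  # TCP option kind for Maximum Segment Size

def clamp_mss_option(options: bytes, configured_mss: int) -> bytes:
    """Walk the TCP options of a SYN and lower the MSS value if it
    exceeds the configured limit; other options pass through untouched.
    Assumes a well-formed option list."""
    out = bytearray()
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                 # End of option list
            out += options[i:]
            break
        if kind == 1:                 # NOP: single-byte option
            out.append(kind)
            i += 1
            continue
        length = options[i + 1]
        if kind == MSS_KIND and length == 4:
            (mss,) = struct.unpack("!H", options[i + 2:i + 4])
            mss = min(mss, configured_mss)   # never increased, only lowered
            out += bytes([kind, length]) + struct.pack("!H", mss)
        else:
            out += options[i:i + length]
        i += length
    return bytes(out)

# A SYN advertising MSS 1460 gets clamped to 1360:
syn_opts = bytes([MSS_KIND, 4]) + struct.pack("!H", 1460)
print(struct.unpack("!H", clamp_mss_option(syn_opts, 1360)[2:4])[0])  # 1360
```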

Is there any significant impact on my HSRP operation after adding adjust-mss on both routers (primary/backup), such that the TCP session MSS needs to be recalculated after a failure of the primary HSRP router?

TCP MSS cannot be recalculated once negotiated at the beginning of the session. The only stage at which the MSS is negotiated is the initial TCP three-way handshake; after that, it stays the same for the entire session. Mismatched IP MTU and TCP MSS settings on the primary and backup paths could therefore be quite unpleasant. But if you configure both of your HSRP routers identically with regard to IP MTU and MSS, you should have no issues and there should be no impact on your HSRP at all.

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Peter Paluch wrote:

This is in general difficult to say. The ip tcp adjust-mss command applies only to the first and second TCP segments exchanged during connection establishment (i.e. segments with the SYN flag set). Also, the modification merely involves setting the MSS in the TCP header to min(segment MSS, configured MSS), plus the usual checksum recalculation. Other segments are not inspected or modified. Therefore, I would personally believe that the MSS adjustment overhead is in fact quite negligible - someone please correct me if I am wrong here.

I've used ip tcp adjust-mss since the feature was first offered and have never noticed any significant impact on the devices it was configured on. (What I did notice the first time I used it was a much quicker initial TCP transfer rate through GRE tunnels.)

I would like to add the following: I am planning to use adjust-mss because I have both types of traffic (UDP + TCP), and at least the TCP traffic can be kept from fragmenting, leaving the IP MTU basically to handle UDP fragmentation before GRE encapsulation. However, one more consideration (comments from both of you are welcome): I am going to check with the application/server teams on the impact of this MSS change, because I do not know whether limiting the MSS could cause any failure or degradation of the service provided by those applications/servers.

Peter, I am going to verify MTU/MSS on the backup router, as you suggested, in order to avoid additional inconveniences after implementing the MSS adjustment.

Finally, if you have any other recommendation, please let me know.

It might already be in the document Peter referenced, but enable tunnel path-mtu-discovery on the tunnel interface.
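For reference, a sketch of how the three knobs discussed in this thread would sit together on a tunnel interface (the interface name and values are taken from the configuration posted later in this thread; verify the commands against your IOS version's command reference):

```
interface Tunnel0
 ip mtu 1400
 ip tcp adjust-mss 1360
 tunnel path-mtu-discovery
```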

Before closing this post, I would like to check the following with you:

Given the following diagram for this case (there is a WCCP module in the router): do you think the best place to apply adjust-mss is the Ethernet interface connecting to the LAN? Based on the technical information I have, this command should be applied on the Tunnel0 interface. Thanks in advance for your final comments.

TCP adjust-mss should work anywhere along the transit path where it is supported. I normally prefer placing it on tunnel interfaces, as that limits the MSS reduction to just the traffic that needs it.

I configured the following on the remote branch, but I am still getting fragmentation (see below):

Remote router:

interface Tunnel0
 description ** Tunnel Central Site **
 bandwidth 3000
 ip address 10.1.1.1 255.255.255.252
 ip accounting output-packets
 ip mtu 1400
 ip flow egress
 ip wccp 62 redirect in
 ip tcp adjust-mss 1360
 keepalive 10 3
 tunnel source 10.1.2.2
 tunnel destination 10.1.2.124

Remote-Router#show ip traffic | in fragmented
  2062 fragmented, 4124 fragments, 0 couldn't fragment
Remote-Router#show ip traffic | in fragmented
  2066 fragmented, 4132 fragments, 0 couldn't fragment
Remote-Router#show ip traffic | in fragmented
  2068 fragmented, 4136 fragments, 0 couldn't fragment
Remote-Router#

When I analyzed the ip route-cache flow output for UDP packets (protocol 0x11), there was an increment which I assume is related to the fragmentation shown above, because the MTU = 1400 is less than the 1500-byte UDP packets, which are not reduced by adjust-mss and must be fragmented.
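The counters above are consistent with that reading: 4124 fragments for 2062 fragmented packets is exactly two fragments per packet, which is what a 1500-byte UDP datagram produces at an IP MTU of 1400. A sketch of the IPv4 fragment arithmetic (assuming a 20-byte option-less IP header):

```python
def fragment_count(packet_len: int, mtu: int, ip_header: int = 20) -> int:
    """IPv4 fragments needed to carry one packet over a link with the
    given MTU; non-final fragments must carry a multiple of 8 payload bytes."""
    if packet_len <= mtu:
        return 1
    payload = packet_len - ip_header           # 1500 -> 1480 payload bytes
    per_fragment = (mtu - ip_header) // 8 * 8  # 1400 -> 1376 bytes per fragment
    return -(-payload // per_fragment)         # ceiling division

print(fragment_count(1500, 1400))  # 2, matching the 2:1 counter ratio above
```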

I apologize for replying somewhat late, but Joseph has stepped in more than adequately. Joe, thank you!

Regarding the applicability of the ip tcp adjust-mss command on tunnel interfaces: the document you referenced stating that it will not work applies to an old IOS version. I do not believe this limitation exists anymore in the IOS version you are running. I had a quick look at the command reference for IOS 12.4:

It does not mention any similar limit. So using this command on a GRE tunnel should be fine.

Regarding the remaining fragmentation still reported by the router, I would personally assume that it is the UDP traffic being fragmented. Whether the MSS reduction works can be verified relatively simply by installing Wireshark on a client computer and accessing some TCP services across the GRE tunnel. The MSS observed in the TCP responses coming from across the tunnel must be reduced to 1360 or less.

The tunnel path-mtu-discovery command suggested by Joseph is a good idea. However, it is not going to prevent fragmentation, rather, it is going to dynamically discover the maximum MTU that can be used by a tunnel - provided that ICMP packet-too-big messages are permitted to be delivered to your tunnel endpoint. Sadly, many firewall policies inadvertently filter out ICMP communication and break the Path MTU Discovery process.

Peter / Joseph, I am very grateful to you both, thank you so much. Additionally, I agree that the UDP traffic is the one being fragmented. However, I would like to share with you the latest tests I made using PMTUD + IP MTU = 1400 on both ends (without adjust-mss). These are the results:

When I made an SSH connection (TCP port 22) from a PC on the internal LAN at the central site to the IP address of the router's LAN interface at the remote site, passing through the GRE tunnel connecting both sites, I got an MSS of 536 bytes (similar to the explanation in the following link). Why does this happen if I have already activated PMTUD? Shouldn't the MSS be 1400 - 40 = 1360?

In addition, I am going to test my SSH connection using Wireshark while accessing the LAN interface of the remote router. By doing this, I will check whether the MSS (1360) is effectively negotiated; however, I am not sure this is the best scenario for testing the MSS negotiation process.

When I made an SSH connection (TCP port 22) from a PC on the internal LAN at the central site to the IP address of the router's LAN interface at the remote site, passing through the GRE tunnel connecting both sites, I got an MSS of 536 bytes (similar to the explanation in the following link). Why does this happen if I have already activated PMTUD? Shouldn't the MSS be 1400 - 40 = 1360?

The MSS is first set by the parties opening the TCP connection. The MSS is their own decision, and it depends on their operating system settings and the application that opens the TCP socket. If the MSS is already lower than the value of ip tcp adjust-mss, the router is not going to modify it in any way (i.e. it is certainly not going to increase it).

So what you saw is completely okay.

My final question is: how is the MSS for TCP traffic going through the GRE tunnel actually set to the IP MTU configured on both ends of the tunnel minus 40 bytes of headers?

The ip tcp adjust-mss command only decreases the MSS in traversing TCP segments if the value exceeds the configured limit. If the MSS advertised by the TCP peers is already lower than the configured value, the router will not interfere - it is the peers' decision to use such a small MSS.

That is why this MSS manipulation is called "MSS clamping" in the Linux world. The MSS is only made smaller by the router if necessary; it is never increased, as doing so could lead to various problems with delivering or processing the TCP segments by the party whose MSS was increased this way.
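Since the Linux term comes up, the netfilter equivalent looks like this (a sketch: the TCPMSS target is part of stock iptables, but where the rule belongs depends on your firewall setup):

```
# Lower the MSS of forwarded SYNs to fit the discovered path MTU
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
         -j TCPMSS --clamp-mss-to-pmtu

# Or pin it to a fixed value, analogous to "ip tcp adjust-mss 1360"
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
         -j TCPMSS --set-mss 1360
```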

Hi Peter, thank you so much for your explanation. I would like to mention that I am only using IP MTU + PMTUD between both ends instead of ip tcp adjust-mss, and I got the following results using Wireshark on my PC while generating a TCP connection from my PC (10.1.55.38) to a server located on the remote site LAN (10.14.32.37). Now I see that the IP MTU = 1400 configured on the GRE tunnel interface is responsible for setting the MSS = 1360, and PMTUD is also useful in my case because it allows the internal IP MTU for GRE to be adjusted in case any router in the path from origin to destination requires a smaller MTU.

Based on my test below, the basic difference I see between IP MTU/PMTUD and ip tcp adjust-mss is that adjust-mss fixes the MSS value, so if a smaller MSS is required somewhere along the route from origin to destination, packets will be fragmented in the path because of the higher MSS previously configured. Adjust-mss looks like more trial and error than MTU/PMTUD (except when you have firewalls/ACLs blocking ICMP messages, which breaks MTU/PMTUD).

Peter, one more question: why do you think my PC sent an initial MSS value of 1260 during the TCP handshake? Is this perhaps the value stored dynamically in the internal IP MTU (as I explained before) after interacting with the routers on the ISP network, which reduced the MTU value and notified me via my default gateway (the GRE tunnel router)? On the other hand, the destination sent me an MSS value of 1360, which could mean that any packet I send will be fragmented on the path from origin to destination.

Is the IP MTU parameter responsible for limiting the MSS during the three-way TCP handshake to 1360 bytes ((IP MTU configured on the tunnel interface = 1400) - (40 bytes of headers)), as you can see in the next image? How does PMTUD participate in this process?

No, the IP MTU parameter configured on a router is not responsible for the initial MSS size as advertised by end hosts. The end hosts initially derive their starting MSS value by looking at their own interface MTU (not the router's MTU - they do not know it at all) and decreasing it by at least 40 bytes. This starting MSS may subsequently be lowered thanks to ip tcp adjust-mss on a router. During the TCP session, the real MSS used to talk to the other party may dynamically change according to the current MTU of the end host's interface, or it may be influenced by PMTUD during the session, but it may never exceed the negotiated MSS.

Quoting RFC 1122, Section 4.2.2.6:

The maximum size of a segment that TCP really sends, the
"effective send MSS," MUST be the smaller of the send MSS
(which reflects the available reassembly buffer size at the
remote host) and the largest size permitted by the IP layer:
Eff.snd.MSS =
min(SendMSS+20, MMS_S) - TCPhdrsize - IPoptionsize
where:
* SendMSS is the MSS value received from the remote host,
or the default 536 if no MSS option is received.
* MMS_S is the maximum size for a transport-layer message
that TCP may send.
* TCPhdrsize is the size of the TCP header; this is
normally 20, but may be larger if TCP options are to be
sent.
* IPoptionsize is the size of any IP options that TCP
will pass to the IP layer with the current message.

The PMTUD - if run by the end host engaged in the TCP communication - may influence the MMS_S element of the equation. Please do not confuse this Effective Send MSS with the value of the MSS option indicated in the TCP header during the 3-way handshake. That MSS is here represented as SendMSS and it may never be exceeded.
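The RFC 1122 equation can be checked numerically; a small sketch, with example numbers chosen to match this thread (they are ours, not the RFC's):

```python
def effective_send_mss(send_mss: int, mms_s: int,
                       tcp_hdr: int = 20, ip_opts: int = 0) -> int:
    """Eff.snd.MSS = min(SendMSS + 20, MMS_S) - TCPhdrsize - IPoptionsize
    (RFC 1122, Section 4.2.2.6)."""
    return min(send_mss + 20, mms_s) - tcp_hdr - ip_opts

# Peer advertised MSS 1360; PMTUD found a 1400-byte path MTU, so the IP
# layer allows MMS_S = 1400 - 20 = 1380 bytes per transport message:
print(effective_send_mss(1360, 1380))  # 1360 - full advertised MSS is usable

# If PMTUD later discovers a 1300-byte path MTU (MMS_S = 1280), the
# effective MSS drops below the advertised one but never exceeds it:
print(effective_send_mss(1360, 1280))  # 1260
```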

Based on my test below, I don't see any significant difference between using IP MTU and ip tcp adjust-mss if the IP MTU is already responsible for limiting TCP traffic through the GRE tunnel connection.

The difference is rather strong. As I just explained, the IP MTU configured on a router is not directly influencing the MSS choice at the end host simply because the end host has no idea about the configuration of the router. In fact, the only way to find out is to use the PMTUD that has to be run by the end host again - not by the router. If the PMTUD is not performed, end hosts may end up negotiating much larger MSS than what would be necessary to prevent fragmentation, and the resulting IP packets sent by end hosts will have to be fragmented by the router.

The ip tcp adjust-mss is therefore indispensable.

Peter, if I am not wrong, the initial MSS value of 1260 sent by my PC depends on the operating system and the application, based on your orientation?

I would like to say that I am not a native English speaker and probably used the wrong word/expression (sorry about that). By "orientation" I meant that I followed your instructions/guidance to better understand this case and apply improvements to the current device configuration.

On the other hand, I decided to use adjust-mss + IP MTU because the end hosts do not participate in the PMTUD process between the routers, so they can send packets larger than the MTU negotiated by the routers using PMTUD.

Additionally, an extended ping from site to site was made with sizes varying from 1400 to 1505 bytes, and all packets up to 1500 bytes were received by the carrier router on both ends (just before the GRE tunnel starts and ends). Now we can confirm that the packet size allowed by the transport network (carrier) can reach 1500 bytes, so there is no reason for fragmentation outside the tunnel connection.

I definitely believe the fragmentation problem is concentrated in the UDP traffic. So the only improvement I can implement is to configure MTU + adjust-mss, as I mentioned previously, in order to at least eliminate TCP fragmentation. I think the MTU value for this GRE tunnel should be configured as 1476 instead of the Cisco-recommended 1400, because there are only 24 bytes of additional headers (20 for the outer IP header + 4 for the GRE header - I am not using IPsec), which uses the entire 1500-byte maximum size allowed by the network. Correspondingly, the TCP adjust-mss value would be 1476 - 40 = 1436.
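A quick check of that arithmetic (plain GRE with no GRE options and no IPsec; the 1400/1360 recommendation exists to also leave room for IPsec overhead, which does not apply here):

```python
IP_HEADER, GRE_HEADER, TCP_HEADER = 20, 4, 20  # bytes, no options

def gre_only_mtu(physical_mtu: int = 1500) -> int:
    """Tunnel ip mtu for plain GRE: outer IP header + 4-byte GRE header."""
    return physical_mtu - IP_HEADER - GRE_HEADER

mtu = gre_only_mtu()                 # 1476
mss = mtu - IP_HEADER - TCP_HEADER   # 1436
print(mtu, mss)                      # 1476 1436
```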