Idle TCP sessions timing out before timer expires

We've had this problem since upgrading an SSG520 from 5.1 code to recent 5.4 code. SSH, MAPI, and Telnet connections all hang after random periods of idleness. The timeouts happen anywhere from 5 minutes to 50 minutes or more, but our timeout value is 4 hours. This is true for all traffic that maintains an open but idle TCP connection. (Some traffic is VPN'd, some isn't; some is NAT'ed, most isn't.)

Just for sanity's sake, I'm wondering if anyone else has noticed this. If you want to try, just SSH through the SSG and don't touch your keyboard (and make sure both client and server have keepalives turned off). See if the session disappears from the SSG by doing a get session | inc x.x.x.x. This works for Telnet as well, and for any other connection that will stay open and idle.

What I get is a RST when we try to type into our SSH session, because the SSG no longer has any info about that session (you can configure whether it sends RSTs or not). Again, you can monitor your session with get session... Sessions are just disappearing before they time out... Very odd...
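If you'd rather script the idle test than sit on an SSH prompt, something like the rough Python sketch below should do it. It assumes a plain TCP listener behind the firewall that stays silent until it receives data and then answers (an echo-style service); the function name, the return labels, and the address in the usage note are all made up for illustration. It opens a connection with keepalives off, idles, then sends one line and reports what came back.

```python
import socket
import time


def probe_idle_connection(host, port, idle_seconds, payload=b"\n"):
    """Open a TCP connection, stay idle, then send once and report the outcome.

    Hypothetical helper: assumes the far end is silent until it gets data
    and then replies (for example an echo service).
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        # Keep client keepalives off so nothing refreshes the firewall session.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 0)
        time.sleep(idle_seconds)
        try:
            sock.sendall(payload)
            # A reset often only surfaces on the read that follows the send.
            data = sock.recv(1024)
        except (ConnectionResetError, BrokenPipeError):
            return "reset"      # something answered with a RST
        except socket.timeout:
            return "no-reply"   # packet silently dropped, or server never answered
        return "alive" if data else "closed"
```

For example, probe_idle_connection("192.0.2.10", 7, 600) would idle for ten minutes against a placeholder address. Running it with a few different idle times while watching get session on the SSG should show roughly when the session vanishes.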

Re: Idle TCP sessions timing out before timer expires

I've had a report of this happening with two clients using SSG20s (both OS ver 6.0.0, one is having the same issue with SSH on r4, the other is having Sage client timeouts on r3). I've done some mucking around with session timeouts to see if this helps, just waiting on confirmation of this before I do realtime monitoring of the session cache...

Do you get consistent timeouts per protocol (e.g. SSH always times out after 12 minutes or whatever) or does it always seem to be random?

Re: Idle TCP sessions timing out before timer expires

We got this resolved. It had to do with differences in how timeouts are processed between 5.1 and 5.4 code. Basically we had to add rules to each of our nat pools, whereas previously we could do this via a different mechanism.

The new OS accepted the old config without complaint, but it had no operational effect that we could tell.

Re: Idle TCP sessions timing out before timer expires

I figured out a workaround for the problem my client was having with Sage. Debugs showed that there were two ports in use with every connection (TCP 38113 and 20005), but other ports are later opened throughout the connection. I haven't got enough data to find a pattern in these ports opening (also waiting on Sage to get back to me about it), so the sledgehammer approach is to set up a custom service that extends the timeout for all TCP connections to several hours and use this in the policy that allows traffic from the client to the server. A problem I ran into at first was that when I tried to specify both TCP and UDP (any dest port) in the service, it somehow overrode the system 'Any' timeout. So ALL sessions were getting held open for several hours, causing the session cache to fill and causing problems with the firewall. When I specified only TCP, and only a range of about 10,000 dest ports, the problem went away. I eventually extended it to include all ports from 1024-65535, without any problems.
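To put a rough number on why the catch-all service filled the session cache: in the worst case, where every session just sits there until it idles out, the steady-state session count is about the new-session rate multiplied by the timeout (Little's law). The sketch below is only that arithmetic; the function name and the traffic figures are invented for illustration and are not measurements from this firewall.

```python
def steady_state_sessions(new_sessions_per_second, timeout_seconds):
    """Worst-case concurrent sessions if every session lives until it idles out.

    Little's law: average number in the system = arrival rate * time in system.
    """
    return new_sessions_per_second * timeout_seconds


# Hypothetical load of 50 new sessions per second.
short_timeout = steady_state_sessions(50, 60)          # 60-second timeout
long_timeout = steady_state_sessions(50, 4 * 60 * 60)  # 4-hour timeout
print(short_timeout, long_timeout)
```

With those made-up numbers the table goes from 3,000 entries to 720,000, which is why it pays to restrict the long timeout to the one policy and port range that needs it.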

The reason you need the long timeout in the first place is that whatever moron designed the Sage client protocol didn't think to use keepalive packets, which would hold the session open through firewalls. Without them, if the client is left idle for longer than the session timeout, the firewall will remove the session from its session cache, and the next packet that tries to pass will get dropped, as it won't have the 'SYN' flag set. This is correct behaviour for a firewall; the problem is with the Sage protocol.
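For anyone who does control the client or server code, this is roughly what turning on TCP keepalives looks like in Python. It's a sketch, not anything Sage actually does: the helper name and the default timings are my own, and the three tuning options are guarded because they are platform-specific (they exist on Linux but not everywhere). The idle value needs to be shorter than the firewall's session timeout for the probes to do any good.

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, count=5):
    """Turn on TCP keepalive probes for an existing socket (hypothetical helper).

    idle:     seconds of silence before the first probe is sent
    interval: seconds between probes once probing has started
    count:    unanswered probes before the connection is declared dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options below are not available on every platform, so each
    # one is applied only if this Python build exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

Call it on the socket before or right after connecting. With idle=300 the stack sends a probe after five minutes of silence, which refreshes the firewall's session entry well inside a multi-hour timeout.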