I've got two IIS web servers behind a Cisco 11503 load balancer, and I'm using the health check feature (a custom web page that displays 'ok' if the machine should be in the rotation, and anything else if it should be taken out).

When we change the health check page to take a server offline, we still get some connections (although most are sent to the other, online server). It seems these connections remain because they belong to TCP sessions that were already established and open. How do you stop all traffic from reaching the server you want offline?

We could restart the web service to kill all open TCP sessions, but that would send some users nasty error messages, and we want them to have a seamless experience while we are patching machines. Our hosting company tells us we just need to wait for these sessions to expire, but that can take up to 30 minutes in our particular situation.

Maybe crank down your TCP KeepAlive timeouts? Not a solution, but at least it would decrease the frequency of the problem.
– Shane Madden Apr 14 '11 at 18:55

@Shane No, I think that really is the solution -- load balancers aren't going to change an existing TCP connection, so you do have to wait for them to expire. If the KeepAlive is set to a lower value -- like, say, 2 minutes -- you'll still get much of the same benefit of the feature, without having to wait ages for users to stop hitting your "offlined" server before you can switch it off.
– Kromey Apr 14 '11 at 19:10
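To make the point in the comment above concrete, here is a toy model in Python of why flipping the health check does not move connections that are already open. It is only a sketch under stated assumptions: `ToyBalancer` and its methods are hypothetical names, the round-robin choice is an assumption, and none of this is meant to describe how the Cisco 11503 is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBalancer:
    """Toy model (hypothetical): health only affects where NEW connections go."""
    healthy: dict                                     # server name -> last health check result
    connections: dict = field(default_factory=dict)   # connection id -> server name
    _next: int = 0                                    # round-robin counter (assumed policy)

    def open_connection(self, conn_id):
        # New connections are only assigned to servers passing the health check.
        candidates = sorted(name for name, ok in self.healthy.items() if ok)
        if not candidates:
            raise RuntimeError("no healthy servers")
        server = candidates[self._next % len(candidates)]
        self._next += 1
        self.connections[conn_id] = server
        return server

    def send_request(self, conn_id):
        # Requests on an established connection stay on the server it was
        # bound to, regardless of that server's current health state.
        return self.connections[conn_id]

    def close_connection(self, conn_id):
        # Models a keep-alive timeout expiring or the client disconnecting.
        del self.connections[conn_id]
```

In this model, if connection "a" is opened to `web1` and `web1` is then marked unhealthy, `send_request("a")` still returns `web1` until `close_connection("a")` is called, while every new connection lands on the other server. That matches the behaviour described in the question, and it is why a shorter keep-alive timeout drains the server sooner.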

1 Answer

Not sure about Cisco, but with our F5, we perform a health check every 3-5 seconds. If a device fails the check, it is considered removed from the pool. I suspect Cisco does something similar, so if the health check page is reporting what it is supposed to, the Cisco device should not be sending any new requests to that server at all.
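For illustration, a single probe of this kind could look roughly like the Python sketch below. The function names are hypothetical, and the pass rule (HTTP 200 with a body that is exactly 'ok' once whitespace is trimmed) is my assumption about how the page from the question would be judged, not the F5's or Cisco's actual logic.

```python
import http.client
import urllib.request


def body_means_in_rotation(status: int, body: str) -> bool:
    """Assumed rule: pass only on HTTP 200 with a body of exactly 'ok'."""
    return status == 200 and body.strip() == "ok"


def probe(url: str, timeout: float = 2.0) -> bool:
    """Fetch the health page once; network and HTTP errors count as a failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Assumes the page is the bare string, so a short read is enough.
            body = resp.read(64).decode("utf-8", errors="replace")
            return body_means_in_rotation(resp.status, body)
    except (OSError, http.client.HTTPException):
        return False
```

Running `probe()` from another machine against each web server directly (not through the balanced address) is a quick way to confirm the page really flips when you take a server offline.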

You may want to check your IIS logs and confirm that the health check is doing what is expected when you change it to "offline".
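One way to do that check, sketched in Python: count the requests logged after the moment you flipped the page, grouped by client IP. This assumes the default W3C extended log format with a `#Fields:` header that includes `date`, `time`, `c-ip` and `cs-uri-stem`, and timestamps in UTC. The function name `requests_after` and the `ignore_path` parameter are hypothetical, and depending on the balancer configuration `c-ip` may show the balancer's address rather than the real client.

```python
from collections import Counter
from datetime import datetime


def requests_after(log_lines, cutoff, ignore_path=None):
    """Count requests per client IP logged at or after `cutoff` (UTC assumed)."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        try:
            stamp = datetime.strptime(f"{row['date']} {row['time']}",
                                      "%Y-%m-%d %H:%M:%S")
        except (KeyError, ValueError):
            continue  # skip rows without a usable timestamp
        if ignore_path is not None and row.get("cs-uri-stem") == ignore_path:
            continue  # skip the balancer's own health check hits
        if stamp >= cutoff:
            counts[row.get("c-ip", "unknown")] += 1
    return counts
```

Pass it the lines of the day's log file, the time of the change, and the path of your health check page as `ignore_path`. If the same few client IPs keep showing up and then tail off, that points to lingering keep-alive connections rather than new traffic being sent to the server.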

Hey, yeah, that's what I thought, so I started investigating with our network admins. I can see that the health check is working, because almost all connections are now being routed to the other server (including my requests from Firefox). The admins say that the Cisco logs do not show any traffic being switched to the dead server, but I can still see requests in the logs on my dead server using the IP that is being load balanced.
– Trev Apr 15 '11 at 21:00