squirrel has asked for the
wisdom of the Perl Monks concerning the following question:

How can I make a TCP server process reject new connections when it is busy?

Starting with a basic server such as the first server here: http://perldoc.perl.org/perlipc.html#Sockets%3a-Client%2fServer-Communication

The server accepts an incoming connection and processes it. New connections are queued and the next accept picks up the next connection - all good. But I want to reject new incoming connections while it is busy (so that the client process can try a different box).

The backlog parameter defines the maximum length the queue of pending
connections may grow to. If a connection request arrives with the
queue full the client may receive an error with an indication of
ECONNREFUSED or, if the underlying protocol supports retransmission,
the request may be ignored so that retries succeed.

Don't rely on this part. Different implementations behave differently. For example, Linux treats a backlog parameter of 0 as 3. In either case, when the queue is full, the system simply stops answering SYN packets, which triggers their resending. As a result, the peer gets no indication that your server is busy; it simply keeps waiting for the connection to be established.

So, why didn't you immediately type "man listen" at the shell prompt
on your operating system of choice?

When I do that on a similar system, I immediately notice:

If a connection request arrives when the queue is full, the client may
receive an error with an indication of ECONNREFUSED or, if the
underlying protocol supports retransmission, the request may be
ignored so that a later reattempt at connection succeeds.

Seems quite a stupid feature: it lets you reduce how many connections get
queued up, but in a way that is designed to cause the client side to just
queue up connections in a different way. *shrug*

Given that, you'll probably want two streams of execution, where one stream
does the work and the other actively rejects connections. There are lots
of choices when it comes to having multiple streams of execution in Perl, but
each of them usually has drawbacks that can become significant. For this
simple case, though, you can probably use most of them without problems.

If you can't guarantee one stream has "priority" so that the other stream
only receives connections when the first is busy, then you'll probably want
a non-blocking semaphore / lock / critical section so each stream can
know when it should just reject.

For this case, I'd probably just use threads.pm with a shared variable
(despite usually finding myself in situations where Perl "threads" aren't the
best solution). Note that you'll need to explicitly lock the shared variable
around your 'test and set' code. See also: threads::shared.
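A minimal sketch of that approach (the flag and sub names are mine, not from the thread): a shared flag with lock() held across the whole test-and-set, so two threads can't both claim the worker role.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

my $busy : shared = 0;

# Returns true if we claimed the "worker" role; false means "reject".
sub try_claim {
    lock($busy);          # hold the lock across the test AND the set
    return 0 if $busy;
    $busy = 1;
    return 1;
}

sub release {
    lock($busy);
    $busy = 0;
}

# Two threads race for the flag; exactly one should win.
my @winners = map { $_->join } map { threads->create(\&try_claim) } 1 .. 2;
my $won = grep { $_ } @winners;
print "winners: $won\n";
release();
```

Without the lock() around the test-and-set, both threads could read $busy as 0 before either sets it.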

The purpose is to return control to the client if the server is busy, so that it can determine what to do.

The server code referenced previously is a stripped-down case. The actual server uses a pre-fork model: open the socket, fork multiple child processes, each calling accept, with the OS deciding which child gets the connection. Would you use semaphores?

Whatever proves to be a good measure of "server is busy". If requests are relatively constant in resource cost, then you might have a hard ceiling of "N max simultaneous requests" and a semaphore would be a good fit (and you'd need to pre-fork up to N+1 children, of course).

But you might have a different "busy" metric you want to use like "system load average" or "idle CPU percentage" or "virtual memory in use" or whatever.

But, even if you do implement some "busy" metric other than "N simultaneous requests", you'll probably still have some "max N children" configuration and so a N-1 semaphore is probably still a good idea.
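For the "hard ceiling of N" case, a SysV semaphore sketch (the slot count and sub names are my assumptions, not from the thread); note that SysV semaphores outlive a crashed process unless explicitly removed:

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_NOWAIT S_IRUSR S_IWUSR);
use IPC::Semaphore;

my $N = 4;    # assumed ceiling: N simultaneous requests

my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT)
    or die "semget: $!";
$sem->setval(0, $N) or die "setval: $!";    # N free slots to start

# Non-blocking "down": true if we got a slot, false means "busy".
sub try_claim_slot { $sem->op(0, -1, IPC_NOWAIT) }

# "Up": give the slot back when the request is done.
sub release_slot   { $sem->op(0, +1, 0) }

print try_claim_slot() ? "got slot\n" : "busy\n" for 1 .. $N + 1;

# SysV semaphores survive process death, so remove explicitly on exit.
END { $sem->remove if $sem }
```

The last of the N+1 attempts above prints "busy", which is exactly the signal the rejecting stream needs.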

If it's worthwhile for the client to reconnect to another box when the server is busy, I suppose the jobs take relatively long to process; otherwise the client could expect a server process to become free soon after it enters the listen queue.

Are you sure you want preforked servers for that? Preforking is usually done to avoid the fork() overhead when it takes a significant fraction of the whole request-processing time.

Otherwise you could have one master process that accepts all connections and can optionally have a little dialog with the client. If it can down a nonblocking semaphore, it just forks and lets the child do the rest, otherwise it tells the client to check another box (you could even have the servers inform each other of their current load so it could give a hint to the client) and hangs up.
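A rough sketch of that master-process shape (the port, the "busy" message, and using a child count instead of a real semaphore are my assumptions; the accept loop only runs when RUN_SERVER is set in the environment):

```perl
use strict;
use warnings;
use IO::Socket::INET;
use POSIX ':sys_wait_h';

my $MAX_KIDS = 4;    # assumed capacity
my $kids     = 0;
$SIG{CHLD} = sub { $kids-- while waitpid(-1, WNOHANG) > 0 };

# True when every worker slot is taken.
sub too_busy { my ($busy, $max) = @_; return $busy >= $max }

sub run_master {
    my $server = IO::Socket::INET->new(
        LocalPort => 9000, Listen => 5, ReuseAddr => 1,
    ) or die "listen: $!";

    while (my $client = $server->accept) {
        if (too_busy($kids, $MAX_KIDS)) {
            print {$client} "BUSY try another box\r\n";  # hypothetical dialog
            close $client;
            next;
        }
        my $pid = fork;
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {         # child: handle the request, then exit
            close $server;
            # ... real work goes here ...
            close $client;
            exit 0;
        }
        $kids++;                 # parent: count the child, drop the socket
        close $client;
    }
}

run_master() if $ENV{RUN_SERVER};
```

The load-hint dialog mentioned above would go in the reject branch, before the close.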

I don't think that using semaphores is a particularly good idea. There are situations where these things can get left behind in memory when crashes happen, and part of a system resource can, without much ado, just go away.

See my other post with "option 2". If you want to share a bit of "go/no-go" information between processes, consider using the file lock mechanism. This is a common way to do it. The "busy/not busy decider" holds a write lock, more specifically an exclusive lock on a zero-length file. Other processes test this file to see: "if I wanted to get a write lock, could I get it?" If the answer is "no", close the active connection.

The file lock control table is memory-resident, so checking it is fast; you do have to open the file first, but that is usually fast compared with the network delays just to get to your box in the first place. Have the "decider" function be the only one that actually locks this "I_am_too_busy" file.

If the "decider" process crashes for some reason, its lock is released - no clean-up required. A semaphore can potentially have problems.
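A sketch of the flock idea, assuming native flock(2) semantics and a lock-file path of my own choosing: the decider holds LOCK_EX while busy; everyone else probes with a non-blocking attempt and treats failure as "busy".

```perl
use strict;
use warnings;
use Fcntl ':flock';

my $LOCKFILE = '/tmp/I_am_too_busy';    # assumed path

# Decider: take the exclusive lock while busy; the returned handle
# must stay open -- closing it (or crashing) releases the lock.
sub mark_busy {
    open my $fh, '>', $LOCKFILE or die "open: $!";
    flock($fh, LOCK_EX) or die "flock: $!";
    return $fh;
}

# Worker: "could I get a write lock?" -- if not, the decider is busy.
# (flock locks are per open file description, so even a probe from the
# same process via a fresh filehandle will be denied on Linux.)
sub server_is_busy {
    open my $fh, '>', $LOCKFILE or die "open: $!";
    my $got = flock($fh, LOCK_EX | LOCK_NB);
    flock($fh, LOCK_UN) if $got;
    close $fh;
    return !$got;
}

my $guard = mark_busy();
print server_is_busy() ? "busy\n" : "free\n";
close $guard;
print server_is_busy() ? "busy\n" : "free\n";
```

That final close is the whole cleanup story: no stale semaphore to remove.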

There are other ways to share information between processes on *nix systems. I would think about the easiest ways and only get more complicated when needed. Stay with the *nix forking server model if you can. I would also consider the post by mbethke.

I don't see any specs on how fast this has to be, nor benchmarks that show why a particular implementation is too slow. The fastest implementation would be a "super server", which is the most complicated model because it involves both select-based and fork-based code. I would write that in C if this level of complexity and performance is needed.

Anyway, this thread started with a fairly simple question and it appears that things are getting more and more complicated.

If you want to stop taking new clients when you are "busy", where "busy"
is some status that you decide upon (maybe it is the number of active
connections, etc.), I see 2 options:

(1). Close your listen socket. This will not affect currently active
sockets. When you decide that you are "not busy", restart your listen
socket. In this case, the client can't tell the difference between
"server down" and "server busy". The OS will know that no listen
socket is bound to the port and should send back to the client
"connection refused". Although simple, I don't think this is
the best for your situation.
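A small, self-contained sketch of option (1) on loopback (the OS picks an ephemeral port; the refusal shows up as the client's connect error):

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Bind a listener on an ephemeral loopback port.
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1', LocalPort => 0, Listen => 5,
) or die "listen: $!";
my $port = $server->sockport;

# "Busy": drop the listener; a new connect should now be refused.
close $server;
my $c = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1', PeerPort => $port,
    Proto    => 'tcp',       Timeout  => 1,
);
print defined $c ? "connected\n" : "refused: $!\n";

# "Not busy" again: rebind the same port and carry on accepting.
$server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1', LocalPort => $port,
    Listen    => 5,           ReuseAddr => 1,
) or die "re-listen: $!";
```

As noted, the client can't distinguish this from "server down", which is why option (2) is usually preferable.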

(2). Accept the connection, send back a protocol-defined error message,
and then hang up (close the active socket). In this case, I would
not fork a child; just have the main server process send the "not
now" message and hang up.

Things to consider: (a) think about how to handle SIGPIPE. Normally the
main server process doesn't talk to clients, but in this case it would,
and you should allow for the fact that the write to the active socket
could, in a degenerate case, trigger a SIGPIPE. (b) Think about what
happens if some rogue client keeps trying to connect at very short
intervals (like 1 ms). (c) Of course, you have to decide what "busy"
means.
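Consideration (a) can be handled by ignoring SIGPIPE, so a write to an already-dead client fails with EPIPE instead of killing the server. A sketch (the message wording is made up, as the thread says it can be anything):

```perl
use strict;
use warnings;

$SIG{PIPE} = 'IGNORE';    # a write to a closed peer now yields EPIPE

sub reject_client {
    my ($sock, $reason) = @_;
    # Send whatever "not now" message your protocol defines, then hang up.
    my $ok = print {$sock} "BUSY $reason\r\n";
    warn "client already gone: $!\n" if !$ok && $!{EPIPE};
    close $sock;
}
```

Called from the main accept loop as, say, reject_client($client, 'try another box').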

If you are writing both ends (the client and the server), this is the way
to go. A very clearly defined and performant action happens when you
are "busy". You can put anything you want in that "I'm busy, go away"
message: maybe how many connections you can handle on a "good day", or
other URLs to go to; it can be anything that you want!

(3). I don't think that shrinking the listen queue to zero will work.
A typical value for this would be something like 5. In any case, the
queue holds the other connections beyond the one currently pending, so
a value of zero would still allow a "pending" one, and you don't want
connections left hanging unhandled. I don't think the behavior of this
option is very well defined, and therefore I believe it is far inferior
to option 2.

Summary: go with option 2 above if you have control of both the server and
the client. If you do not have control over the client, then consider just
accepting the connection and proceeding directly to the "hang up", i.e.
closing the active socket. The client will get a SIGPIPE and know right away
that you rejected this connection, or at least that something didn't work,
and it will learn that bit of information very fast.

Here's a possibly stupid question - if you have multiple boxes accepting the same type of connection, why not just have the client randomly choose between them? This will balance the load. Is there some reason to dump everything on one until it maxes out?

The client does choose randomly among the boxes, but they may be busy.

Think SMTP & MX records with multiple primary and secondary servers. The client connects to one of the primary servers. All is good if it can be processed quickly. If not, I want the connection to be rejected so the client tries another primary box. If none of the primaries can take it, then it goes to the secondaries. It should be handled by the primaries if possible, but it should not queue for them: it is more important that it is handled by something quickly, whether primary or secondary.

Don't know the original poster's exact situation, but for example if you have a relatively small load of long-running transactions, random isn't good enough; if you happen to hit a busy system you really want to at least consider trying another system. (A real load-balancer is the brute force solution of course; one that understands the present load and can accurately predict the future load. But that's often beyond the budget of this kind of setup; and the commercial ones are optimized for large numbers of small transactions rather than the reverse.)

Backlog is the maximum number of established connections (i.e. those whose handshake has completed) that may be placed in the accept(2) queue. On BSD (and Linux) a value of 0 actually means 1, so here you already have a problem: there's always one place in the queue. But even when all places in the queue are taken, TCP continues accepting connection requests; it doesn't reject them. Consider the following example server:
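The example server itself didn't survive in this copy of the thread; a minimal stand-in that reproduces the described behavior (ephemeral loopback port of my choosing, accept exactly one connection, then stall) might look like:

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listen with a backlog of 1 (what 0 effectively means here) and
# then never drain the queue.
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1', LocalPort => 0, Listen => 1,
) or die "listen: $!";
printf "listening on port %d\n", $server->sockport;

if ($ENV{RUN_DEMO}) {
    my $first = $server->accept;    # accept one connection ...
    sleep 60;                       # ... then stall; run netstat meanwhile
}
```

With four clients connected you would then expect netstat on the server side to show one connection accepted, one ESTABLISHED but sitting in the accept queue, and the rest stuck in SYN_RECV.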

As you can see, on the client side all connections are in the ESTABLISHED state. From the server side, one connection (from port 44390) was ESTABLISHED and accepted; another connection (from port 44392) is ESTABLISHED but not yet accepted, sitting in the accept queue; and two more connections are in the SYN_RECV state, meaning the server sent SYN-ACK to the client but is ignoring the client's ACKs until there is a place in the accept queue (just wait a minute).