Let's say there are two backend servers with the maxconn option set to 1 and a round-robin routing strategy. This means that at any time only two requests can be processed, and the rest are queued.
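For reference, such a setup might look roughly like this (server names and addresses are made up for illustration):

```haproxy
backend web_pool
    balance roundrobin
    # each server accepts only one concurrent connection;
    # further requests wait in the backend queue
    server s1 192.0.2.10:8080 maxconn 1
    server s2 192.0.2.11:8080 maxconn 1
```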

Let's say the web app makes two separate requests within a 60-second time frame to fetch the data needed to render the page with a good user experience. What happens when, between these two requests, many other clients' requests arrive, get queued, and later time out? Does this mean the first user's second request has a higher chance of timing out because of the other users' already-queued requests? Do I understand correctly that HAProxy request queuing works like a FIFO? If so, this could lead to a bad experience for all visitors under very high load, because ALL users will see some requests complete and some time out. I would prefer that some users get a faultless experience without any timeouts, while other clients get a timeout on their first request if there aren't enough slots left.

Under very high load I know I will not be able to handle all requests, but I want to at least provide a good experience for a limited number of users (even better if I could prioritise clients by a custom header). Maybe there could be some queue keyed by a user's header, or a way to hold a connection per client while it is active, e.g. for 1 minute between requests, so that all of that client's requests are processed without being queued, while requests from new clients are queued as long as the limited number of clients is being served.

[quote=“norcis, post:1, topic:847”]
Let's say the web app makes two separate requests within a 60-second time frame to fetch the data needed to render the page with a good user experience. What happens when, between these two requests, many other clients' requests arrive, get queued, and later time out? Does this mean the first user's second request has a higher chance of timing out because of the other users' already-queued requests?[/quote]

No, because that is not how FIFO works. First In, First Out means the oldest request is served first.

Newer requests are either queued (when the queue is not full) or dropped (when the queue is full).
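In HAProxy terms, how long a request may sit in the queue before being rejected with a 503 is governed by `timeout queue` (which falls back to `timeout connect` when unset). A minimal sketch:

```haproxy
defaults
    mode http
    timeout connect 5s
    # a request waiting in a server/backend queue is rejected
    # with a 503 after this long
    timeout queue   30s
```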

FIFO is an acronym for first in, first out, a method for organizing and manipulating a data buffer, where the oldest (first) entry, or 'head' of the queue, is processed first. It is analogous to processing a queue with first-come, first-served (FCFS) behaviour, where people leave the queue in the order in which they arrive.

[quote="norcis, post:1, topic:847"]
Under very high load I know I will not be able to handle all requests, but I want to at least provide a good experience for a limited number of users (even better if I could prioritise clients by a custom header). Maybe there could be some queue keyed by a user's header, or a way to hold a connection per client while it is active, e.g. for 1 minute between requests, so that all of that client's requests are processed without being queued, while requests from new clients are queued as long as the limited number of clients is being served.
[/quote]

My suggestion is to create dedicated backends for your priority users, so they have their own dedicated backend/server queues. You can then use whatever you like on the frontend to route users to the priority backend, for example a custom header.
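A sketch of that idea, assuming a hypothetical `X-Priority: high` header as the routing criterion (header name, backend names, and addresses are all placeholders):

```haproxy
frontend www
    bind :80
    # route requests carrying the (hypothetical) priority header
    acl is_priority req.hdr(X-Priority) -m str high
    use_backend be_priority if is_priority
    default_backend be_default

backend be_priority
    balance roundrobin
    # priority users have this queue to themselves
    server p1 192.0.2.10:8080 maxconn 1

backend be_default
    balance roundrobin
    # everyone else queues here and may time out under load
    timeout queue 10s
    server d1 192.0.2.11:8080 maxconn 1
```

With this split, congestion among regular users cannot delay priority users, because each backend keeps its own per-server queue. Note that a client-supplied header is trivially spoofable, so in practice you would want to set or validate it somewhere you trust.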