I’ve been doing some load testing in anticipation of our busy season. I’m able to push through 800 connections per second to a single backend without much problem. When I add additional
servers to the backend pool, and increase the load on the haproxy machine to 3000-4000 connections per second, haproxy starts marking my backends as down, even though I’ve increased my backend capacity 10x, and only 4x-5x’ed my load.

Shows backends going from L7OK -> *L7OK (what does the * mean?) -> L4TOUT -> Healthy. The downstream server is always healthy, and tcp connections from another machine succeed without any issue.

CPU on the haproxy machine I’m using is fine (an r4.large EC2 instance), as is memory usage. The machine is maxing out at 50MBps bandwidth usage, which is a 1/10th of what we push in production, so I’m not bandwidth limited.

Any ideas why haproxy isn’t able to successfully complete a healthcheck? Any suggestions on debugging this?

@lukastribus sorry for the delay. I turned on option dontlog-normal, but still get pretty noisy logs. I also don’t see healthchecks. Here’s an example of a log line I see (which seems normal to me. 200 status code).