[ANNOUNCE] haproxy 1.5-dev3

finally we managed to merge all the stuff ! Haproxy 1.5-dev3 was released
with everything that went into 1.4.9, plus some added bonuses that were
mainly developed at Exceliance :

- support for binding to UNIX socket on the accept side. Haproxy can
now receive connections over a UNIX socket. This is particularly
useful when combined with stunnel (we also have a patch for that
in the 'patches' directory).

- support for a new "PROXY" protocol that was designed to forward
transport-level information between proxies. The idea is to permit a
component like stunnel to inform haproxy about the protocol, source
and destinations of an incoming connection, so that haproxy can make
use of that everywhere internally (acls, logs, transparent, ...)
instead of stunnel's address. The main advantage over the x-forwarded-for
patch is that it now supports keep-alive and is not limited to HTTP
anymore. When combined with the UNIX socket, it can make haproxy and
stunnel integrate seamlessly and reliably. Obviously, we have a patch
for stunnel ready too ;-)

- tcp-response filtering : it's possible to wait for some ACLs to match in
the response before forwarding (or blocking).

- stick-table learning from responses. It's now possible to learn some
patterns from responses and match them again in requests. Doing so
allows haproxy to learn SSL IDs in order to offer SSL-based stickiness
to SSL reverse-proxy farms.

- stick-table synchronization : the stickiness information in stick-tables
can now be synchronized over the network between as many other haproxies
as you like in a multi-master fashion. Also, during soft-restarts, the
new process learns the table from the old one so that restarts do not
lose that precious information anymore. Designing this was quite tough
work (Aleks might recall we started talking about such a protocol about
6 years ago now), and is the second half of the large work co-sponsored
by Exceliance[1] and LoadBalancer.org[2]. Now it's completely advisable
to simply rely on source IP for some protocols such as RDP in certain
environments, since restarts will not kill user connections.

For those interested in the last point, the protocol is very cheap over the
wire and is designed with a large window and ACKs, so that it can sync over
high latency networks and even recover from network outages. The sync is fast
enough so that even people using a round-robin L4 LB in front of two haproxies
should not experience any issues under moderate loads (thousands of new entries
per second).
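
To make the PROXY protocol item above a bit more concrete, here is a minimal
sketch of how a receiver might decode the header line, assuming the
human-readable form "PROXY TCP4 <src> <dst> <sport> <dport>\r\n" sent once at
the start of the connection. The struct and function names are hypothetical and
are not taken from haproxy's sources, and the whitespace handling is more
lenient than a strict parser would be.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the fields carried by one PROXY line. */
struct proxy_hdr {
    char proto[8];          /* "TCP4" or "TCP6" */
    char src[46];           /* source address, textual form */
    char dst[46];           /* destination address, textual form */
    unsigned int src_port;
    unsigned int dst_port;
};

/* Parse a line such as "PROXY TCP4 192.0.2.1 192.0.2.2 56324 443\r\n".
 * Returns 0 on success, -1 if the line is incomplete or does not match.
 */
int parse_proxy_line(const char *line, struct proxy_hdr *h)
{
    /* the header must be terminated by CRLF before we trust it */
    if (strstr(line, "\r\n") == NULL)
        return -1;

    /* field widths keep sscanf() within the buffers declared above */
    if (sscanf(line, "PROXY %7s %45s %45s %u %u",
               h->proto, h->src, h->dst, &h->src_port, &h->dst_port) != 5)
        return -1;

    if (strcmp(h->proto, "TCP4") != 0 && strcmp(h->proto, "TCP6") != 0)
        return -1;

    if (h->src_port > 65535 || h->dst_port > 65535)
        return -1;

    return 0;
}
```

Since the line is sent only once per connection, the receiver can keep using
the decoded addresses for every request that follows on that connection, which
is what makes this approach compatible with keep-alive.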

A few typos, minor bugs and error reporting issues were fixed (including the
ones contributed by Cyril a few days ago).

Minor optimizations were performed in order to avoid a few useless operations
in process_session(). Acute observers may notice a tiny drop in CPU usage
(around 5% of user time) compared to previous versions.

For the next versions, I'd really like to be able to concentrate on the core
to try to finish the end-to-end keep-alive support. After that there are
less intrusive changes to work on. I'm still hoping for a 1.5 release by
the beginning of next year.

On Friday 12 November 2010 00:34:29, Willy Tarreau wrote:
> Hi,
>
> finally we managed to merge all the stuff ! Haproxy 1.5-dev3 was released
> with everything that went into 1.4.9, plus some added bonuses that were
> mainly developed at Exceliance :

I've quickly tested the first two features :

> - support for binding to UNIX socket on the accept side. Haproxy can
> now receive connections over a UNIX socket. This is particularly
> useful when combined with stunnel (we also have a patch for that
> in the 'patches' directory).

First of all, it works :-) But using ab to stress stunnel+haproxy, I got some
"SSL read failed" errors (with at least 10 concurrent connections on a
laptop). I suspect it comes from ab and not from stunnel or haproxy, but as
soon as I go back to TCP instead of a UNIX socket, I don't have these errors.
I also tested stunnel+nginx with UNIX sockets, still no error.
And replacing ab with httperf, it always works.

> - support for a new "PROXY" protocol that was designed to forward
> transport-level information between proxies. The idea is to permit a
> component like stunnel to inform haproxy about the protocol, source
> and destinations of an incoming connection, so that haproxy can make
> use of that everywhere internally (acls, logs, transparent, ...)
> instead of stunnel's address. The main advantage over the
> x-forwarded-for patch is that it now supports keep-alive and is not
> limited to HTTP anymore. When combined with the UNIX socket, it can make
> haproxy and stunnel integrate seamlessly and reliably. Obviously, we have
> a patch for stunnel ready too ;-)

It didn't work with "option http-server-close". My guess is that the
AN_REQ_DECODE_PROXY bit is re-enabled after the first transaction.
I don't provide a full patch because I don't know if it's the best solution,
but applying this fixes the issue :
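--- haproxy-1.5-dev3/src/proto_http.c 2010-11-11 23:29:35.000000000 +0100
+++ /home/cbonte/Public/haproxy/haproxy-1.5-dev3/src/proto_http.c 2010-11-12 13:53:14.154398641 +0100
@@ -3949,6 +3949,7 @@
s->rep->lr -= s->req->size;

s->req->analysers |= s->listener->analysers;
+ s->req->analysers &= ~AN_REQ_DECODE_PROXY;
s->rep->analysers = 0;

http_silent_debug(__LINE__, s);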

On Fri, Nov 12, 2010 at 02:07:22PM +0100, Cyril Bonté wrote:
> > - support for binding to UNIX socket on the accept side. Haproxy can
> > now receive connections over a UNIX socket. This is particularly
> > useful when combined with stunnel (we also have a patch for that
> > in the 'patches' directory).
>
> First of all, it works :-) But using ab to stress stunnel+haproxy, I got some
> "SSL read failed" errors (with at least 10 concurrent connections on a
> laptop). I suspect it comes from ab and not from stunnel or haproxy, but as
> soon as I go back to TCP instead of a UNIX socket, I don't have these errors.
> I also tested stunnel+nginx with UNIX sockets, still no error.
> And replacing ab with httperf, it always works.

Do you know if keep-alive was involved in any of these tests ?

> > - support for a new "PROXY" protocol that was designed to forward
> > transport-level information between proxies. The idea is to permit a
> > component like stunnel to inform haproxy about the protocol, source
> > and destinations of an incoming connection, so that haproxy can make
> > use of that everywhere internally (acls, logs, transparent, ...)
> > instead of stunnel's address. The main advantage over the
> > x-forwarded-for patch is that it now supports keep-alive and is not
> > limited to HTTP anymore. When combined with the UNIX socket, it can make
> > haproxy and stunnel integrate seamlessly and reliably. Obviously, we have
> > a patch for stunnel ready too ;-)
>
> It didn't work with "option http-server-close". My guess is that the
> AN_REQ_DECODE_PROXY bit is re-enabled after the first transaction.
> I don't provide a full patch because I don't know if it's the best solution,
> but applying this fixes the issue :
> --- haproxy-1.5-dev3/src/proto_http.c 2010-11-11 23:29:35.000000000 +0100
> +++ /home/cbonte/Public/haproxy/haproxy-1.5-dev3/src/proto_http.c 2010-11-12
> 13:53:14.154398641 +0100
> @@ -3949,6 +3949,7 @@
> s->rep->lr -= s->req->size;
>
> s->req->analysers |= s->listener->analysers;
> + s->req->analysers &= ~AN_REQ_DECODE_PROXY;
> s->rep->analysers = 0;
>
> http_silent_debug(__LINE__, s);
>

Good catch, you're perfectly right, I did not think about this case !
Right now we should apply your fix as-is. Later we'd probably try to
split analysers between connection-based and transaction-based.
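
The effect of the one-line fix discussed above can be illustrated in isolation
with plain bit operations. This is only a sketch: the bit values below are
made up for the example (the real ones live in haproxy's headers), and
reset_req_analysers() is a hypothetical helper, not a function from the source
tree.

```c
/* Hypothetical bit values, chosen only for this illustration. */
#define AN_REQ_DECODE_PROXY 0x01u   /* decode the PROXY line (once per connection) */
#define AN_REQ_WAIT_HTTP    0x02u   /* wait for an HTTP request (once per transaction) */

/* Compute the request analysers to use for the next transaction on a
 * kept-alive connection. The listener's analysers are re-armed, then the
 * PROXY decoder is masked out because the PROXY line only precedes the
 * very first request of the connection.
 */
unsigned int reset_req_analysers(unsigned int current,
                                 unsigned int listener_analysers)
{
    current |= listener_analysers;      /* re-arm everything the listener asks for */
    current &= ~AN_REQ_DECODE_PROXY;    /* ...except the connection-level decoder */
    return current;
}
```

Without the masking step, the second request would be handed to the PROXY
decoder again and rejected, since it starts with an HTTP method rather than a
PROXY line. Splitting analysers into connection-based and transaction-based
sets would make that mask unnecessary.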

On Friday 12 November 2010 15:05:40, Willy Tarreau wrote:
> On Fri, Nov 12, 2010 at 02:07:22PM +0100, Cyril Bonté wrote:
> > > - support for binding to UNIX socket on the accept side. Haproxy can
> > > now receive connections over a UNIX socket. This is particularly
> > > useful when combined with stunnel (we also have a patch for that
> > > in the 'patches' directory).
> >
> > First of all, it works :-) But using ab to stress stunnel+haproxy, I got
> > some "SSL read failed" errors (with at least 10 concurrent connections
> > on a laptop). I suspect it comes from ab and not from stunnel or
> > haproxy, but as soon as I go back to TCP instead of a UNIX socket, I
> > don't have these errors. I also tested stunnel+nginx with UNIX sockets,
> > still no error.
> > And replacing ab with httperf, it always works.
>
> Do you know if keep-alive was involved in any of these tests ?

I tried both. It's easier to reproduce without keep-alive.
Actually, I also met the issue with httperf.

Thank you Cyril, I'll forward all that material to Emeric in case
he finds a clue about that. I hope we're not hitting buffer size
limits or things like this on the unix sockets :-/

TCP_NODELAY should not be set because it does not exist on the UNIX
sockets, but I don't think there is any relation. More likely it's
a matter of a connection limit or too fast reuse somewhere, and I'm
not used to tuning for that !

I see, it's probably just a matter of including <stdint.h>. However, this one
is not present on all machines. It's interesting to note that the uint32_t at
the line just before did not cause any trouble. Since those types are rarely
used in haproxy, I'd rather replace "uint32_t" with "unsigned int" and "int32_t"
with "int".

Could you please try with this minuscule non-invasive patch ? If it works, I'll
merge it as-is for now.

On Friday 12 November 2010 16:45:54, Willy Tarreau wrote:
> Thank you Cyril, I'll forward all that material to Emeric in case
> he finds a clue about that. I hope we're not hitting buffer size
> limits or things like this on the unix sockets :-/

OK, it took me some time and a lot of tests/modifications in stunnel and
haproxy but I've found the limit.
In proto_uxst.c there's a call to listen(sock, 0)

On Sun, Nov 14, 2010 at 02:57:42PM +0100, Cyril Bonté wrote:
> Hi Willy,
>
> On Friday 12 November 2010 16:45:54, Willy Tarreau wrote:
> > Thank you Cyril, I'll forward all that material to Emeric in case
> > he finds a clue about that. I hope we're not hitting buffer size
> > limits or things like this on the unix sockets :-/
>
> OK, it took me some time and a lot of tests/modifications in stunnel and
> haproxy but I've found the limit.
> In proto_uxst.c there's a call to listen(sock, 0)

Ah cool, thank you for chasing this one down !

> I tried with listen(sock, 2000) and could run
> $ ab -n10000 -c500 https://localhost:8443/
> $ httperf --server localhost --port 8443 --uri / --rate 200 \
> --num-conn 10000 --ssl --num-call 1
> without any problem, which was not the case before.
>
> Now, as it's shared with the stats, I don't know what to do.
> Should we use the listener backlog value for both or should we keep 0 for the
> stats ?

In my opinion, we should use the listener's backlog. This will require
some code changes in order to be able to pass the backlog's size to
create_uxst_socket(). On the other hand, this function is quite old now
and is only used by uxst_bind_listener(). It will probably be easier
to move its code there and get rid of the function.
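
For illustration, here is a rough sketch of what binding a UNIX stream socket
with a caller-supplied backlog could look like once the socket creation and
the listen() call live in the same place. bind_unix_listener() is a
hypothetical name and this is not the code from proto_uxst.c; it only uses
standard POSIX calls.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a UNIX stream socket bound to <path> and start listening with the
 * given backlog. Returns the listening fd, or -1 on error.
 */
int bind_unix_listener(const char *path, int backlog)
{
    struct sockaddr_un addr;
    int fd;

    /* the path must fit in sun_path, including the trailing zero */
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    unlink(path);   /* drop a stale socket file left by a previous run */

    /* the backlog comes from the caller instead of being hard-coded to 0 */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With a backlog of 0, connections arriving in a burst can be refused before the
process gets a chance to accept() them, which is consistent with the errors
seen under ab and httperf. A caller would pass the listener's configured
backlog here, e.g. bind_unix_listener("/var/run/haproxy.sock", 2000), where
the path is only an example.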

On Sunday 14 November 2010 15:06:50, Willy Tarreau wrote:
> In my opinion, we should use the listener's backlog. This will require
> some code changes in order to be able to pass the backlog's size to
> create_uxst_socket(). On the other hand, this function is quite old now
> and is only used by uxst_bind_listener(). It will probably be easier
> to move its code there and get rid of the function.
>
> Do you want to send a patch with that ?

OK, I'll send a patch. I just need some time to merge create_uxst_socket() into
uxst_bind_listener() and run some tests ;-)