Hello,
There is a bug in lib/util_sock.c:open_socket_out() that causes timeouts to
be incorrectly handled when attempting to open a socket. I ran across this
problem when a timeout that was supposed to be ten seconds was taking over
two minutes to time out. We're using version 3.0RC3 in our build but the
logic looks to be the same in HEAD.
Here's a snippet of the code as it exists now. The intention is to generate
a series of increasingly longer timeouts until the total requested timeout
has been exhausted. However, the loop condition does not check the
accumulated total time; it compares only the length of the current sleep
interval against the timeout. The loop therefore keeps going until a single
interval reaches the requested timeout, and the actual elapsed time is the
sum of all the intervals slept before that point.

    if (ret < 0 && (errno == EINPROGRESS || errno == EALREADY ||
                    errno == EAGAIN) && (connect_loop < timeout) ) {
            smb_msleep(connect_loop);
            connect_loop += increment;
            if (increment < 250) {
                    /* After 8 rounds we end up at a max of 255 msec */
                    increment *= 1.5;
            }
            goto connect_again;
    }

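To illustrate the effect, here is a minimal sketch that replays the loop above without opening a socket or sleeping, and simply adds up the intervals it would have slept. It assumes that every connect() attempt keeps reporting EINPROGRESS, and that connect_loop and increment both start at 10 msec; those starting values are not shown in the snippet, so treat them as an assumption. The function name is hypothetical.

```c
#include <assert.h>

/* Replays the original retry loop and returns the total number of
 * milliseconds it would sleep. No socket is opened and nothing sleeps.
 * ASSUMPTION: connect_loop and increment both start at 10 msec. */
long original_total_sleep_ms(int timeout_ms)
{
    int connect_loop = 10;  /* assumed initial sleep interval */
    int increment = 10;     /* assumed initial step */
    long total = 0;

    assert(timeout_ms >= 0);

    /* ASSUMPTION: every connect() attempt reports EINPROGRESS, so the
     * only exit is the (connect_loop < timeout) test. */
    while (connect_loop < timeout_ms) {
        total += connect_loop;   /* stands in for smb_msleep(connect_loop) */
        connect_loop += increment;
        if (increment < 250) {
            increment *= 1.5;    /* result is truncated back to int */
        }
    }
    return total;
}
```

Under these assumptions, original_total_sleep_ms(10000) comes out well above 120000, which is in line with the delay of over two minutes described above for a ten-second timeout.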
I've generated a patch for our build to change the code to this (below).
Note the addition of the 'totalTime' variable, declared above and
initialized to zero.

    if (ret < 0 && (errno == EINPROGRESS || errno == EALREADY ||
                    errno == EAGAIN) && (totalTime < timeout) ) {
            totalTime += connect_loop;
            msleep(connect_loop);
            connect_loop += increment;
            if (connect_loop + totalTime > timeout) {
                    connect_loop = timeout - totalTime;
            }

            if (increment < 250) {
                    /* After 8 rounds we end up at a max of 255 msec */
                    increment *= 1.5;
            }
            goto connect_again;
    }

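As a sanity check on the patched logic, the following sketch replays it the same way: no socket, no real sleep, just a running total of the intervals. It again assumes that every connect() attempt reports EINPROGRESS and that connect_loop and increment start at 10 msec (an assumption, since the declarations are not part of the snippet). The function and variable names are hypothetical.

```c
#include <assert.h>

/* Replays the patched retry loop and returns the total number of
 * milliseconds it would sleep. No socket is opened and nothing sleeps.
 * ASSUMPTION: connect_loop and increment both start at 10 msec. */
long patched_total_sleep_ms(int timeout_ms)
{
    int connect_loop = 10;  /* assumed initial sleep interval */
    int increment = 10;     /* assumed initial step */
    int total_time = 0;     /* mirrors the new 'totalTime' variable */

    assert(timeout_ms >= 0);

    /* ASSUMPTION: every connect() attempt reports EINPROGRESS, so the
     * only exit is the (totalTime < timeout) test. */
    while (total_time < timeout_ms) {
        total_time += connect_loop;  /* stands in for msleep(connect_loop) */
        connect_loop += increment;
        /* Clamp the next interval so the running total cannot pass
         * the requested timeout. */
        if (connect_loop + total_time > timeout_ms) {
            connect_loop = timeout_ms - total_time;
        }
        if (increment < 250) {
            increment *= 1.5;        /* result is truncated back to int */
        }
    }
    return total_time;
}
```

Under the same assumptions, patched_total_sleep_ms(10000) returns exactly 10000, because the clamp shortens the last interval to whatever time remains. One edge case this sketch exposes: a timeout shorter than the initial interval still sleeps for the full initial interval once.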
After this patch, the connect attempt does time out at the specified time,
i.e. it seems to work, but please let me know if you spot any errors or
potential side-effects of this change. Otherwise I think this is a
legitimate bug that should be fixed in HEAD.
Thanks,
Joe Meadows
Snap Appliance