<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi!<div><br></div><div>I want to add some clues to this puzzle.</div><div><br></div><div>First of all, the "right way" (send, shutdown, wait for close) does not work here, although that would have been </div><div>my first answer :-) </div><div>Then I changed the subscription code to just wait for the empty-queue event and NOT apply the timeout. </div><div>This worked well. My interpretation is that the reader is so slow that, on the write side, </div><div>"write ready" is not signaled and no more data goes down to kernel space (probably because of some threshold).</div><div>This keeps the pending data at the same level, and hence the timeout fires.</div><div><br></div><div>My scheme looked like this:</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>fill output buffer 1K at a time until send_pend > 0 </div><div><span class="Apple-tab-span" style="white-space:pre"> </span>start reader</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>subscribe for empty_q (code from prim_inet)</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>wait for empty_q</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>send shutdown</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>wait for close</div><div><br></div><div>So what would a workaround look like? </div><div><br></div><div>One way is to make the kernel buffer smaller (right now I can send 160K). 
Use whatever size</div><div>makes sense given inet's built-in timeout timer (5 secs).</div><div><br></div><div>Another way to fix this could be to try to add the kernel buffer space to send_pend, but that is </div><div>pretty OS-dependent :-)</div><div><br></div><div>/Tony</div><div><br></div><div><br></div><div><br><div><div>On 5 apr 2012, at 00:13, Matthias Lang wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div>On Tuesday, April 03, Andreas Schultz wrote:<br><br><blockquote type="cite">In my case the receiver is too slow to process all the data, sender<br></blockquote><blockquote type="cite">does 10k packets of 1k size, the receiver only gets the first 2000<br></blockquote><blockquote type="cite">packets. It might well be that I hit the close timeout and inet<br></blockquote><blockquote type="cite">discards the rest of the send queue.<br></blockquote><br>Ok, that's something to go on.<br><br>I had a go at reproducing what you're doing based on that description.<br>I see something unexpected which seems similar to what you reported.<br><br>The output of my program (at the bottom of this mail) running on R15B is:<br><br> 30> as_tcp:go().<br> calling close on TX at {23,35,35}<br> read 1000 octets<br> read 1000 octets<br> read 1000 octets<br> read 1000 octets<br> read 1000 octets<br> read 1000 octets<br> read 1000 octets<br> read 1000 octets<br> read 1000 octets<br> close returned at {23,35,45}, 10002ms later<br> ok<br> read 1000 octets<br> 31><br> =ERROR REPORT==== 4-Apr-2012::23:35:45 ===<br> Error in process <0.120.0> with exit value: {{badmatch,{error,closed}},[{as_tcp,slow_read,1,[{file,"as_tcp.erl"},{line,31}]}]}<br><br>I had a bit of a dig in prim_inet.erl. It sounds like you've looked<br>there too. That code looks like it's intended to loop 'forever' trying<br>to send the queued data, as long as some progress is made every so often<br>(always sends at least something in every 5s timeout period). 
But running<br>my program suggests that isn't happening as intended.<br><br>Instrumenting prim_inet, it looks like the {subs_empty_out_q,0 } message<br>is coming up from inet_drv.c even though the queue isn't empty. Looking<br>in there, I noticed this:<br><br> static void tcp_inet_flush(ErlDrvData e)<br> {<br> tcp_descriptor* desc = (tcp_descriptor*)e;<br> if (!(desc->inet.event_mask & FD_WRITE)) {<br> <span class="Apple-tab-span" style="white-space:pre"> </span>/* Discard send queue to avoid hanging port (OTP-7615) */<br> <span class="Apple-tab-span" style="white-space:pre"> </span>tcp_clear_output(desc);<br> }<br> }<br><br>but I think that change was introduced sometime after R11B-5, and<br>R11B-5 fails the same way. So probably a red herring.<br><br>Out of time looking at this for now.<br><br><blockquote type="cite">The fix should be simple, limit the send queue size.<br></blockquote><br>To what?<br><br>Zero seems to be the only value that will work even for arbitrarily slow<br>clients. And that defeats the point of having a send queue.<br><br>---<br><br>A likely _workaround_ is to call<br><br> inet:getstat(Tx, [send_pend])<br><br>if the answer is zero, then you know that you can call close() without<br>_erlang_ tossing data. I haven't tried this.<br><br>---<br><br>It's late, I might have outsmarted myself, but my current feeling is<br>that erlang is quietly tossing data and it shouldn't be.<br><br>Waiting for as long as it takes in close() seems like the right thing,<br>though Per might disagree. Waiting for N seconds in close() and then<br>returning an error if the queue didn't empty would also be better than<br>just quietly tossing it.<br><br>(And: yes, I know, application-level ACKs would avoid this<br>problem. 
But I'm not quite ready to say that this problem can't be<br>fixed.)<br><br>Matt<br><br>----------------------------------------------------------------------<br>%% Throwaway module: attempt to reproduce TCP problem reported to<br>%% erlang-questions 2012-04-02 by Andreas Schultz.<br>%%<br>%% He elaborated on 2012-04-04: gen_tcp drops data when close is called<br>%% with data buffered if the receiver is sufficiently slow.<br>-module(as_tcp).<br>-export([go/0]).<br><br>go() -><br> {ok, L} = gen_tcp:listen(0, [{active, false}, binary]),<br> {ok, Portno} = inet:port(L),<br> {ok, Tx} = gen_tcp:connect(localhost, Portno, []),<br> {ok, Rx} = gen_tcp:accept(L),<br> ok = gen_tcp:close(L),<br><br> One_hundred_k = list_to_binary(lists:duplicate(100000, 0)),<br> ok = gen_tcp:send(Tx, [lists:duplicate(40, One_hundred_k), "end of data"]),<br><br> spawn(fun() -> slow_read(Rx) end),<br><br> io:fwrite("calling close on TX at ~p\n", [time()]),<br> Before = now(),<br> ok = gen_tcp:close(Tx),<br> After = now(),<br> io:fwrite("close returned at ~p, ~pms later\n",<br><span class="Apple-tab-span" style="white-space:pre"> </span> [time(), timer:now_diff(After, Before) div 1000]),<br><br> ok = gen_tcp:close(Rx).<br><br>slow_read(Rx) -><br> {ok, _Bin} = gen_tcp:recv(Rx, 10000),<br> timer:sleep(1000),<br> io:fwrite("read 1000 octets\n"),<br> slow_read(Rx).<br><br>_______________________________________________<br>erlang-questions mailing list<br><a href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a><br>http://erlang.org/mailman/listinfo/erlang-questions<br></div></blockquote></div><br><div>
</div>
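<div><br></div><div>A minimal sketch of the send_pend workaround suggested in the quoted mail: poll inet:getstat/2 and only call close once the queue reports zero. The module name drain_close, the function close_when_drained/2 and the PollMs argument are made up for illustration. This has not been verified against the slow-reader case, and send_pend only covers the Erlang-side queue, not data already sitting in the kernel buffer.</div><div><br></div><pre>-module(drain_close).
-export([close_when_drained/2]).

%% Hypothetical helper: wait until the driver's send queue is empty,
%% then close the socket. PollMs is the polling interval in milliseconds.
%% There is deliberately no upper bound on the wait; with a reader that
%% never reads, this loops forever.
close_when_drained(Sock, PollMs) ->
    case inet:getstat(Sock, [send_pend]) of
        {ok, [{send_pend, 0}]} ->
            %% Nothing queued on the Erlang side, so close should not
            %% have anything left to discard.
            gen_tcp:close(Sock);
        {ok, [{send_pend, _Pending}]} ->
            %% Still queued data: back off and check again.
            timer:sleep(PollMs),
            close_when_drained(Sock, PollMs);
        {error, _Reason} = Error ->
            %% For example {error, einval} if the socket is already closed.
            Error
    end.
</pre><div>In the as_tcp test module this could stand in for the plain close on the sending side, e.g. drain_close:close_when_drained(Tx, 100) instead of gen_tcp:close(Tx).</div>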
<br></div></body></html>