RE: [Webware-discuss] Stability Problems on FreeBSD

Ian Maurer wrote:
> Hello All,
>
> Thanks for all of your help so far. I did indeed have a mutex
> problem in my code that caused a deadlock. I replaced the
> fcntl.flock calls with a lock from the threading module and that
> seemed to do the trick.
>
> Now that I have that problem out of the way, I am pretty confident
> that there is some sort of socket problem with the code in the
> flush method of the TASASStreamOut class (or at least that's where
> the socket problem is showing up).
Is this problem still causing your appserver to lock up? Or is it just
resulting in unusual messages in your logfile that you want to track down?
My understanding of the code is that if the client is no longer listening,
flush() will cause the "StreamOut Error" warning message to appear but will
otherwise just be a no-op. Your servlet should go on processing, blissfully
unaware that the client has disconnected.
(It might make more sense for flush() to "raise EndResponse" at that point
to keep the servlet from doing unnecessary additional processing. That would
perhaps be a useful option to add to flush(). It would also be useful for it
to log a less alarming-looking error message.)
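A minimal sketch of that option, not WebKit code: send_all and ClientDisconnected are hypothetical names (ClientDisconnected stands in for EndResponse), and it is written for a current Python 3 interpreter rather than the Python 2 used elsewhere in this thread. By default a dead client just stops the output; with raise_on_disconnect=True the caller is interrupted instead.

```python
import socket


class ClientDisconnected(Exception):
    """Hypothetical stand-in for WebKit's EndResponse."""


def send_all(sock, data, chunk_size=8192, raise_on_disconnect=False):
    """Send data in chunks and return the number of bytes actually sent."""
    sent = 0
    while sent < len(data):
        try:
            sent += sock.send(data[sent:sent + chunk_size])
        except OSError as exc:
            if raise_on_disconnect:
                # Let the caller stop doing work nobody will read.
                raise ClientDisconnected(str(exc)) from exc
            # Quiet, non-alarming note instead of an "Error" message.
            print("client went away, dropping remaining output:", exc)
            break
    return sent
```

Whether a servlet should be interrupted this way is a design choice: it saves work, but any cleanup the servlet does after writing then has to tolerate the exception.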
- Geoff

I have no FreeBSD experience but I do have a couple of ideas.
Ian Maurer wrote:
> Basically Webware just stops accepting requests. The process is still
> running with the state of 'lockf' (important?) but it just doesn't
> respond.
>
> Usually Webware just stops answering and the 'verbose' output
> shows the last request but no response is given:
>
> 363 2004-02-08 13:31:54 /WK/Context/Example
>
> (no 363 response)
>
> I am getting the following message at the end of the output LOG
> I am keeping:
>
> StreamOut Error: (54, 'Connection reset by peer')
This isn't _necessarily_ an error. It can happen when a client presses the
Stop button in their browser while a servlet is still processing the
request, if the servlet is using flush() to send partial responses. Try
running the "PushServlet" example servlet, then press Stop in your browser
before the page is done rendering. You'll probably get the same message,
and it's not an error.
>
> Which I believe comes from the ThreadedAppServer.py module at line
> 446.
>
> I am really at a loss for where to start chasing down this problem.
> I have been running Webware for over 2 years without any problems,
> so I guess I am a little bit spoiled.
>
> Any thoughts or suggestions? Any more information needed?
I'm wondering if the server grinds to a halt all at once, or if the threads
get locked up one by one until they are all wedged (which has happened to me
through no fault of WebKit -- a 3rd party library was locking up
occasionally, and as soon as all of the threads in the pool were wedged, the
appserver was dead).
You can help figure that out by putting this in ThreadedAppServer.py at the
top of RequestHandler.handleRequest():
    print '%5i thread is %s' % (self._number,
        threading.currentThread().getName())
By examining the messages this produces you should be able to figure out if
the pool of available threads is getting smaller and smaller as your
appserver runs. Ordinarily it seems to cycle through the threads in
round-robin fashion (at least this is how it apparently works on Windows;
I'm not sure about other OS's) so it's easy to tell when a thread gets
wedged.
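If the log gets long, a small script can do the examining. This is only a sketch, written for current Python 3: possibly_wedged and last_seen_by_thread are hypothetical helper names, and the heuristic assumes the round-robin behaviour described above, so a thread that has not picked up any of the last pool_size requests is flagged as a suspect.

```python
import re

# Matches the lines produced by the print statement, e.g. "    3 thread is Thread-5".
THREAD_LINE = re.compile(r"^\s*(\d+) thread is (\S+)\s*$")


def last_seen_by_thread(log_lines):
    """Map each thread name to the number of the last request it started."""
    last_seen = {}
    for line in log_lines:
        match = THREAD_LINE.match(line)
        if match:
            last_seen[match.group(2)] = int(match.group(1))
    return last_seen


def possibly_wedged(log_lines, pool_size):
    """Return threads that have not started a request for a full cycle."""
    last_seen = last_seen_by_thread(log_lines)
    if not last_seen:
        return []
    newest = max(last_seen.values())
    return sorted(name for name, number in last_seen.items()
                  if newest - number >= pool_size)
```

Threads that never appear in the log at all are not reported, so compare the result against the configured pool size as well.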
In AppServer.config you may want to set StartServerThreads,
MaxServerThreads, and MinServerThreads to the same value so there's no
confusion about how large your thread pool is.
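For illustration, the three settings might look like this, assuming AppServer.config uses the usual Python-dictionary syntax. Only the thread-pool keys are shown (in the real file they sit alongside the other settings), the dictionary name is just for this example, and the value 10 is an arbitrary placeholder rather than a recommendation.

```python
# Thread-pool keys only; the value 10 is an assumed placeholder.
thread_pool_settings = {
    'StartServerThreads': 10,
    'MinServerThreads': 10,
    'MaxServerThreads': 10,
}
```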
Good luck!
- Geoff

> > Now that I have that problem out of the way, I am pretty confident
> > that there is some sort of socket problem with the code in the
> > flush method of the TASASStreamOut class (or at least that's where
> > the socket problem is showing up).
>
> Is this problem still causing your appserver to lock up? Or is it just
> resulting in unusual messages in your logfile that you want to track down?
I am sorry I wasn't clearer. The server is indeed locked up. It won't
respond to any more requests and it won't even shut down normally.

> I'm wondering if the server grinds to a halt all at once, or if the
> threads get locked up one by one until they are all wedged (which
> has happened to me through no fault of WebKit -- a 3rd party library
> was locking up occasionally, and as soon as all of the threads in
> the pool were wedged, the appserver was dead).
That is it. So I need to do some debugging on my code, which is a
relief since I am pretty sure it will be fixable and not an unfixable
limitation of my host machine's networking capabilities.
If anyone has any tips or experience as to why an application ported
from Windows and Linux would now lock up on FreeBSD, I would
appreciate any suggestions since I am new to FreeBSD. Otherwise, I
think I have enough info to track down this issue on my own.
Thank you very much for your help...
Ian Maurer

Ian Maurer wrote:
>> I'm wondering if the server grinds to a halt all at once, or if the
>> threads get locked up one by one until they are all wedged (which
>> has happened to me through no fault of WebKit -- a 3rd party library
>> was locking up occasionally, and as soon as all of the threads in
>> the pool were wedged, the appserver was dead).
>
> That is it. So I need to do some debugging on my code, which is a
> relief since I am pretty sure it will be fixable and not an unfixable
> limitation of my host machine's networking capabilities.
>
> If anyone has any tips or experience as to why an application ported
> from Windows and Linux would now lock up on FreeBSD, I would
> appreciate any suggestions since I am new to FreeBSD. Otherwise, I
> think I have enough info to track down this issue on my own.
Some people have had problems with threads on FreeBSD before, for
specific versions. I think it had something to do with the socket
handling. I thought we applied some fix related to this, but maybe we
never did, or maybe it wasn't sufficient.
Ian

I had an issue on FreeBSD where the postgres (pyPGSQL) connections were
being dropped for some reason. I think that I had to update the python
version installed and the postgres driver.
-Aaron
Ian Bicking wrote:
> Ian Maurer wrote:
>
>>> I'm wondering if the server grinds to a halt all at once, or if the
>>> threads get locked up one by one until they are all wedged (which
>>> has happened to me through no fault of WebKit -- a 3rd party library
>>> was locking up occasionally, and as soon as all of the threads in
>>> the pool were wedged, the appserver was dead).
>>
>> That is it. So I need to do some debugging on my code, which is a
>> relief since I am pretty sure it will be fixable and not an unfixable
>> limitation of my host machine's networking capabilities.
>>
>> If anyone has any tips or experience as to why an application ported
>> from Windows and Linux would now lock up on FreeBSD, I would
>> appreciate any suggestions since I am new to FreeBSD. Otherwise, I
>> think I have enough info to track down this issue on my own.
>
> Some people have had problems with threads on FreeBSD before, for
> specific versions. I think it had something to do with the socket
> handling. I thought we applied some fix related to this, but maybe we
> never did, or maybe it wasn't sufficient.
>
> Ian
--
-Aaron
http://www.MetroNY.com/
"I don't know what's wrong with my television set. I was getting
C-Span and the Home Shopping Network on the same station.
I actually bought a congressman."
- Bruce Baum

Hello All,
Thanks for all of your help so far. I did indeed have a mutex
problem in my code that caused a deadlock. I replaced the
fcntl.flock calls with a lock from the threading module and
that seemed to do the trick.
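For anyone hitting the same thing, the replacement looks roughly like the sketch below. It is not my actual application code: the counter and the names update_shared_state and run_workers are made up for illustration, and it is written for current Python 3. One caveat I am sure of: a threading.Lock only serializes threads inside a single process, so unlike flock it does nothing for separate processes sharing a file.

```python
import threading

# One lock per shared resource, created once and shared by every thread
# in the process (hypothetical example state: a simple counter).
_resource_lock = threading.Lock()
_counter = {"value": 0}


def update_shared_state():
    """Increment the shared counter while holding the lock."""
    with _resource_lock:  # released automatically, even if an exception occurs
        _counter["value"] += 1
        return _counter["value"]


def run_workers(thread_count=5, calls_per_thread=1000):
    """Hammer update_shared_state from several threads; return the total added."""
    start = _counter["value"]

    def worker():
        for _ in range(calls_per_thread):
            update_shared_state()

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return _counter["value"] - start
```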
Now that I have that problem out of the way, I am pretty confident
that there is some sort of socket problem with that code in the
flush method of the TASASStreamOut class (or at least that's where
the socket problem is showing up).
I also figured out that I could replicate the problem by stopping
a request using Internet Explorer (of course :). So now that I can
actually replicate the problem, it should be only a matter of time
to figure out its source. I started by putting some print statements
in the code, since I don't know how else to tackle debugging a web
server...
Here is my modified code for debugging. I added one debug statement
and made debug a variable holding the time and a random int:
    def flush(self):
        from time import time
        from sys import stdout
        from random import randint
        # (timestamp, random int): tags every line printed by this call
        debug=(time(), randint(1,100000))
        result = ASStreamOut.flush(self)
        if result: ##a true return value means we can send
            reslen = len(self._buffer)
            if debug: print debug, "TASASStreamout is sending %s bytes" % reslen
            sent = 0
            while sent < reslen:
                try:
                    sent = sent + self._socket.send(self._buffer[sent:sent+8192])
                except socket.error, e:
                    if e[0]==errno.EPIPE: #broken pipe
                        pass
                    else:
                        print "StreamOut Error: ", e
                    break
            # added debug statement: how much actually went out
            if debug: print debug, "TASASStreamout has sent %s bytes" % sent
            self.pop(sent)
        stdout.flush()
Here are the results...
Request 1 and 2 are successful requests. Number 3 is the broken one.
Creating 5 threads.....
Ready (0.67 seconds after launch)
1 thread is Thread-3
1 2004-02-12 12:45:03 /WK/M/Page
(1076618708.494272, 32645) TASASStreamout is sending 20567 bytes
(1076618708.494272, 32645) TASASStreamout has sent 20567 bytes
(1076618708.496495, 59979) TASASStreamout is sending 0 bytes
(1076618708.496495, 59979) TASASStreamout has sent 0 bytes
1 5.01 secs /WK/M/Page
2 thread is Thread-4
2 2004-02-12 12:45:57 /WK/M/Page
(1076618757.3941059, 65872) TASASStreamout is sending 20697 bytes
(1076618757.3941059, 65872) TASASStreamout has sent 20697 bytes
(1076618757.396776, 79609) TASASStreamout is sending 0 bytes
(1076618757.396776, 79609) TASASStreamout has sent 0 bytes
2 0.12 secs /WK/M/Page
3 thread is Thread-5
3 2004-02-12 12:46:06 /WK/M/Page
(1076618768.8915739, 41809) TASASStreamout is sending 15068 bytes
StreamOut Error: (54, 'Connection reset by peer')
(1076618768.8915739, 41809) TASASStreamout has sent 8192 bytes
(1076618768.8936629, 69418) TASASStreamout is sending 6876 bytes
(1076618768.8936629, 69418) TASASStreamout has sent 0 bytes
As you can see, the flush command is being called twice (the random
integer shows that). I am, of course, unsure if this is a cause,
a result, or just what's supposed to happen.
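The byte counts for request 3 at least add up: 15068 - 8192 = 6876, so the second call is trying to send exactly what the first one left in the buffer. Here is a small simulation of that, written for current Python 3. FakeSocket and BufferedOut are made-up stand-ins, not WebKit classes, and the idea that the second send fails with EPIPE (which the loop ignores silently) is only my assumption, chosen because the log shows no error on the second call.

```python
import errno


class FakeSocket:
    """Accepts a fixed number of chunks, then fails like a closed peer."""

    def __init__(self, chunks_before_reset):
        self._chunks_left = chunks_before_reset
        self._reset_reported = False

    def send(self, data):
        if self._chunks_left > 0:
            self._chunks_left -= 1
            return len(data)
        if not self._reset_reported:
            self._reset_reported = True
            raise OSError(errno.ECONNRESET, "Connection reset by peer")
        # Assumption: later sends fail with EPIPE, which is not printed.
        raise OSError(errno.EPIPE, "Broken pipe")


class BufferedOut:
    """Mimics the send loop in flush() with an 8192-byte chunk size."""

    def __init__(self, sock, data):
        self._socket = sock
        self._buffer = data

    def flush(self):
        """Return (bytes attempted, bytes sent) and drop the sent bytes."""
        reslen = len(self._buffer)
        sent = 0
        while sent < reslen:
            try:
                sent += self._socket.send(self._buffer[sent:sent + 8192])
            except OSError as exc:
                if exc.errno != errno.EPIPE:
                    print("StreamOut Error:", exc)
                break
        self._buffer = self._buffer[sent:]
        return reslen, sent
```

With FakeSocket(1) and a 15068-byte buffer, the first flush() reports 8192 bytes sent and the second tries the remaining 6876 and sends none, which matches the numbers in my log. It says nothing about why the thread then hangs.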
If anyone with more experience with socket programming has any ideas
then I would be glad to hear them.
thanks again for your help,
ian