Is this thread dump indicating Tomcat is low on memory?

Norman P Rozental

Greenhorn

Posts: 12

posted 8 years ago

Hi Tomcatters,

One of our thread dumps (which I've included below) appears to be a concern, and I was wondering if you might know more about what's going on. Line 30 in the thread dump below shows the line number in the generated servlet that appears to be causing a problem (which is headingsList_jsp.java:135). Line number 135 of headingsList_jsp.java contains the following:

Also from the thread dump, lines 19 and 20 are interesting. It appears that Tomcat cannot flush the output. Could this be related to Tomcat running low on memory? If not, I'd love to know your thoughts.

The version of Tomcat is 5.5.20
OS is Solaris 10

The thread dump:

Kind regards,
Norm.

William Brogden

Author and all-around good cowpoke
Rancher

Posts: 13078

6

posted 8 years ago

I see nothing to indicate a low-memory problem; after all, Tomcat managed to generate a stack dump. Why are you suspecting a memory problem?

Has this code run ok before? What was changed recently?

What is the actual Throwable that generated this?

Bill

Bauke Scholtz

Ranch Hand

Posts: 2458

posted 8 years ago

This looks more like a deadlock/synchronization issue.

That said, you shouldn't be writing raw Java code in JSP files. Use JSP for presentation only, and if necessary use taglibs/EL to control the presentation flow. If there is any need for raw Java code, it belongs in a Java class, such as a servlet or a JavaBean.

Norman P Rozental

Greenhorn

Posts: 12

posted 8 years ago

I now agree that it doesn't appear to be a memory issue. There hasn't been a change at all over the past 6 months. What has changed is possibly the size of the data being returned in the response, and the number of users hitting this page at the same time has increased.

This application is an internal app used by the employees to save information for clients. It gets very busy around 10:30am for 15 minutes until all the allocated bookings have been saved for clients. It works on a commission basis so the employees really really want to make bookings for their clients.

Around this busy time, for some users, the page doesn't completely finish rendering. For example, instead of 2000 rows in a table, there are only 200 returned. During a quiet period of the day, all 2000 rows are returned without a problem.

From the thread dump and the generated servlet class, it looks like Tomcat isn't completely rendering the page - it reaches a certain line in the class (e.g., headingsList_jsp.java:135, as seen in my original post) and tries to flush but can't.

Note: The thread dump that I have included in this forum appears many times when I view all the thread dumps - always the same generated class (headingsList_jsp.java) but different line numbers.

Some other possibly important notes:

* The generated HTML when all the data is returned is around 1MB. Could this be too large for Tomcat to handle?
* The page in question makes a db query via Hibernate; however, since Tomcat is already in the process of rendering the .jsp, I'm pretty sure all 2000 rows have been returned. Our DBA also mentioned that he couldn't see any deadlocks or anything of that nature while analyzing the Oracle DB.
* The application uses DOJO (enhanced display effects, like a progress bar, using JavaScript) - I don't think this is a cause for concern.

I'm happy to provide any more info if you need it.

Kind regards,
Norm.

William Brogden

Author and all-around good cowpoke
Rancher

Posts: 13078

posted 8 years ago

Well, this is an interesting problem; I can see why you might suspect memory or other capacity limits.

1. Tomcat has no intrinsic limit on output size since each response gets its own output stream which writes to the connection on the fly.

2. Maybe you need to restrict the number of requests accepted at one time?

3. Did I miss where you state the exact Exception being thrown??

4. I'm not familiar with Solaris - could you be hitting some sort of operating system limit on sockets?

Bill

Norman P Rozental

Greenhorn

Posts: 12

posted 8 years ago

Hi Bill,

The only reason I initially thought it might be a memory problem was because the problem disappears after we restart Tomcat.

It only starts behaving the way I've been describing after a 4- or 5-day period.

I just want to go over your points:

1. Ok, thanks. So this means it doesn't really matter what size the page is - I'm pretty sure I've seen bigger pages rendered by Tomcat.
2. Hmmm, maybe restricting requests might be an answer. The more requests Tomcat accepts, the more chance of bottlenecks, perhaps?
3. That's the thing: there is no Exception being thrown in the logs. Possibly the logging level is not fine-grained enough, and it's very hard for me to change in a production environment, as I don't have full control over it.
4. Interesting idea. Maybe. I might have to speak to our Unix people.
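For reference, the knobs for restricting concurrent requests in Tomcat 5.5 live on the HTTP Connector in conf/server.xml; the values below are illustrative, not recommendations:

```xml
<!-- Sketch only: tune these to your own load. -->
<Connector port="8080"
           maxThreads="150"
           acceptCount="100"
           connectionTimeout="20000" />
```

maxThreads caps the number of concurrent request-processing threads, acceptCount is the queue of connections accepted while all threads are busy, and connectionTimeout (in milliseconds) drops connections that stall.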

Would it help if there were two Tomcat instances running instead of the one? The load balancer (if one is used) would ensure that requests are pretty much shared between the two instances.
Also, back to the original thread dump... Line 19 specifically. It says that another thread has caused the lock (tid: 0x9b64b970). Is this what the dump is saying? I can't seem to find that tid anywhere in the thread dumps. Shouldn't it be there?

Once again thanks for your time in answering my question. I really appreciate it.

Norm.

Norman P Rozental

Greenhorn

Posts: 12

posted 8 years ago

I also forgot to mention that there are about 6 scheduled Quartz jobs running within this Tomcat instance. Does this fact raise an eyebrow?

William Brogden

Author and all-around good cowpoke
Rancher

Posts: 13078

posted 8 years ago

I don't understand why this Thread dump has no associated Throwable message. Where does the dump appear?

The "not appearing until 4 or 5 days" after restart has the flavor of some system resource not being properly disposed of and finally running out.

What kind of monitoring do you have available? The Tomcat management app has helped me spot memory and Thread problems. I just routinely check memory and thread usage every day with the management app.

Bill

Norman P Rozental

Greenhorn

Posts: 12

posted 8 years ago

Hi Bill,

Unfortunately I don't have access to the Tomcat management app. Our Unix guys don't give the support people (me) access to such things, and it would take weeks before I'd be granted access.

The thread dump I have attached is hard for me to really understand. Firstly, is it bad? Or is it indicating that the thread is locked? The thread is in a runnable state. What does this thread dump really mean? Is this thread locked by something else, preventing it from returning to an Object.wait state?
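One detail worth knowing here (a hedged illustration; the class name is my own invention): a thread blocked inside a native call, such as a socket accept or write, is still reported as RUNNABLE, because the JVM cannot see inside native code. A runnable sketch:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Illustrative only: shows that a thread stuck in native I/O still
// reports Thread.State.RUNNABLE, just like the http processor threads
// stuck in a socket write would in a thread dump.
public class RunnableButBlocked {
    public static Thread.State stateWhileBlocked() throws Exception {
        final ServerSocket server = new ServerSocket(0); // ephemeral port
        Thread t = new Thread() {
            public void run() {
                try {
                    server.accept();      // blocks inside a native method
                } catch (IOException e) {
                    // socket closed from outside; just exit
                }
            }
        };
        t.start();
        Thread.sleep(500);                // let it reach the native call
        Thread.State s = t.getState();    // RUNNABLE despite being stuck
        server.close();                   // unblock and clean up
        t.join();
        return s;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("state while blocked: " + stateWhileBlocked());
    }
}
```

So a pile of RUNNABLE threads sitting in a native socket write can indeed be stuck, even though the dump never says BLOCKED or WAITING.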

Also I cannot see any Exception in the catalina.out log files at the time I execute the thread dump.

Tim Holloway

posted 8 years ago

The most important thing to know is the thing that isn't shown: what kind of Exception is at the head of this trace?

The ultimate indicator that Tomcat is low on memory would be an OutOfMemoryError. On the other hand, I could just as easily see this sort of stack trace coming from an interrupted connection. So it's critical to know the reported cause of failure.

One alarm bell is that there are supposed to be 2000 records processed in the page. That's always a concern. As I've mentioned before, my eyeballs throw an exception when there's that much on one page.

A hint as to possible causes is that it takes a few days after restart to manifest. Most likely some resource is slowly leaking.
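To make the "slowly leaking resource" idea concrete, here is an illustrative sketch with a made-up Resource class (not from the application under discussion); the same pattern applies to Hibernate sessions, JDBC connections, and sockets:

```java
// Illustrative only: Resource stands in for any per-request handle
// (socket, db session). A handle acquired but never released
// accumulates slowly until the process runs out days later;
// try/finally guarantees release even when an exception is thrown.
public class Resource {
    static int openCount = 0;            // stand-in for the OS-level count

    Resource() { openCount++; }          // acquire
    void close() { openCount--; }        // release
    void use() { /* work with the resource */ }

    static void leakyHandler() {
        Resource r = new Resource();
        r.use();                         // if use() throws, r never closes
        r.close();
    }

    static void safeHandler() {
        Resource r = new Resource();
        try {
            r.use();
        } finally {
            r.close();                   // runs even on exception
        }
    }

    public static void main(String[] args) {
        safeHandler();
        System.out.println("open after safeHandler: " + openCount); // 0
    }
}
```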

An IDE is no substitute for an Intelligent Developer.

William Brogden

Author and all-around good cowpoke
Rancher

Posts: 13078

posted 8 years ago

Eeeek - I see I have been operating under a misconception!

"at the time I execute the thread dump."

The thread dump is not JVM output due to an exception but triggered by you - presumably because output had stopped??

Is that right? The response thread is hung trying to write ?

I don't normally deal with Java on Unix systems, so I didn't understand the thread dump.

Tim Holloway

posted 8 years ago

Well, that would explain the lack of an exception. It's still a stack trace, Bill - the system thread dump doesn't know enough about JVM internals to attach source line numbers.

The problem with a single snapshot of a running-but-uncooperative process is that it can be somewhat misleading - kind of like trying to map out an elephant by grabbing its tail. It's preferable to take a set of samples to see what range of functions is involved.
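One way to take such a set of samples from inside the JVM (a sketch; the class name and interval are my own choices) is Thread.getAllStackTraces(), which avoids sending repeated kill -3 signals:

```java
import java.util.Map;

// Sketch of in-process thread sampling: each snapshot captures the
// stack of every live thread, so repeated snapshots show which
// methods threads keep sitting in.
public class ThreadSampler {
    /** One in-process snapshot of every live thread's stack. */
    public static Map<Thread, StackTraceElement[]> snapshot() {
        return Thread.getAllStackTraces();
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 3; i++) {          // a few samples, spaced out
            System.out.println("--- sample " + i + " ---");
            for (Map.Entry<Thread, StackTraceElement[]> e
                    : snapshot().entrySet()) {
                System.out.println(e.getKey().getName()
                        + " state=" + e.getKey().getState());
                for (StackTraceElement frame : e.getValue()) {
                    System.out.println("    at " + frame);
                }
            }
            Thread.sleep(1000);                // sample interval
        }
    }
}
```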

Even better is to connect via a remote debugger, set a breakpoint when it starts misbehaving, then walk around it until you have mapped out whatever cul-de-sac it's hung up in.

An IDE is no substitute for an Intelligent Developer.

Norman P Rozental

Greenhorn

Posts: 12

posted 8 years ago

Hi guys,

That's right. The thread dump I initially attached is NOT JVM output due to an exception; it was produced when I issued the kill signal to generate the dump. I'm glad that clears things up.

And yes it appears that the response thread is hung trying to write.

From the produced thread dump (and when we have issues) I can see MANY threads of type "http-22421-ProcessorXX" that seem to be locked trying to write - all in a runnable state. (The XX is just an http processor thread number). One example is the original thread dump I attached to this topic.

Tim, the hint that it takes a few days after a restart to manifest does suggest some resource that is slowly leaking, which Bill also pointed out earlier. The question is: when memory starts to become an issue, why (if that's what the problem is) do I see many thread dumps showing that the response is hung trying to write to output?

Tim Holloway

posted 8 years ago

The lock is a serialization mechanism, and depending on how things are designed, it's perfectly reasonable for a number of concurrent threads to be holding locks. There's a big difference between a process lock and being "locked up".

Assuming that you've determined definitively that once the process enters socketWrite0, it never comes back, you've left the Java world behind. Since you're on Solaris 10, this might be a good time to learn the wonders of dtrace.

There does seem to be something odd there, since it certainly appears that you're getting stuck on HttpResponse writes. That doesn't seem right - an HttpResponse should simply be pushing the response back towards the requester's listening port, and about the worst that should happen would be that the port isn't listening and that the write request should time out after waiting for an ACK/NAK.

You would run low on other resources if a lot of responses were waiting to complete their communications, since the resources used to send the response can't be released until something either goes through or times out. So if neither one is happening, you'd get a backlog.

It's possible you have a network problem of some sort.

An IDE is no substitute for an Intelligent Developer.

William Brogden

Author and all-around good cowpoke
Rancher

Posts: 13078

posted 8 years ago

"Question is, when memory starts to become an issue, why (if that's what the problem is) do I see many thread dumps showing that the response is hung trying to write to output."

I missed the place where you monitor/log memory use and see available memory steadily decreasing.

Bill

Tim Holloway

posted 8 years ago

If I'm reading it right, the number of free sockets is also decreasing and the thread count is rising. All consequences of creating short-term objects and having them hang around forever.

An IDE is no substitute for an Intelligent Developer.

William Brogden

Author and all-around good cowpoke
Rancher

Posts: 13078

posted 8 years ago

Am I correct in assuming that your application only uses the servlet API and does not attempt to manipulate sockets directly OR create Threads?

Bill

Norman P Rozental

Greenhorn

Posts: 12

posted 8 years ago

Hi Bill,

I know it's a late reply and I apologize. To answer your question, our application uses the servlet API / Struts, and there is no manipulation of sockets.

The application uses DOJO on the front end. I'm not sure how much you know about it, but it's an AJAX framework. Basically, where this problem happens in our application is when a user clicks a radio button: a call is made through DOJO, which subsequently runs a database query that returns what we're pretty sure is a complete set of data (the 2000 records I mentioned previously). For some reason, rendering of the HTML dies partway through (which is the thread dump in my original post). I reckon the culprit is this DOJO framework, but it's very hard to reproduce, as sometimes users report an error and other times everything works great. We have not restarted the application for 14 days now, and I'm pretty sure this issue will happen again for a lot of users.

I hope I haven't confused you.

Norm.

William Brogden

Author and all-around good cowpoke
Rancher

Posts: 13078

posted 8 years ago

You say it's an AJAX framework using DOJO. Are these 2000 records written directly into the containing HTML, or are they supposed to be filled in by a separate AJAX-style request? If it's an AJAX request, what handles it?

If the containing page is plain HTML and it is a JavaScript request that is failing partway through, then the hangup in the JSP is not related to the incomplete set of records.

I have been hinting that you should actually be monitoring memory; now I am going to be direct - we need to eliminate or confirm the possible memory correlation that your original post hinted at.

Kindly find some way to record free and total memory use - see java.lang.Runtime methods.
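A minimal sketch of what such logging could look like, using only java.lang.Runtime (the class name, sample count, and interval are illustrative):

```java
// Periodically logs heap usage via java.lang.Runtime, as suggested
// above. In production you would run this on a schedule until
// shutdown and write to a log file rather than stdout.
public class MemoryLogger {
    /** Currently used heap in bytes. */
    public static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 3; i++) {          // a few samples for the demo
            System.out.println("used=" + usedMemory()
                    + " free=" + rt.freeMemory()
                    + " total=" + rt.totalMemory()
                    + " max=" + rt.maxMemory());
            Thread.sleep(1000);                // sample interval
        }
    }
}
```

A steadily climbing "used" figure across days, surviving full garbage collections, would confirm the leak hypothesis; a flat one would rule memory out.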