Running Out of Memory

I'm working on a utility that takes in a file full of ID numbers, checks each ID number against a database to see if it has been modified, and then mails a CSV file back to the requester with the results. The uploaded file can be quite large and, once it contains more than roughly 20,000 ID numbers, I consistently run into OutOfMemoryErrors.

I've tried all sorts of things to reduce memory consumption but, no matter what I do, I always run out of memory at the same stage. Even when I make changes that should result in less memory usage, I don't see any change in behavior, which I find very confusing. Here's a snippet of some of the relevant code I'm using. Any thoughts on what I could do to make this more efficient would be much appreciated.

I've tried periodically calling System.gc(), as well, but it made no impact on the result and degraded my performance considerably, so I've removed it.

You can use the -Xmx and -Xms flags to increase your heap size. This may not be a permanent fix, but it might let you limp through your initial struggles.

Type "java -X" to see how to use them.
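For reference, the flags look like this on the command line (the sizes and the jar name here are just placeholders; pick values that fit the host):

```shell
# -Xms sets the initial heap size, -Xmx the maximum size the heap may grow to.
# 256m / 1024m and id-checker.jar are example values only.
java -Xms256m -Xmx1024m -jar id-checker.jar
```

In a J2EE container these would typically go into the server's startup options rather than a direct `java` invocation.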

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors

Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271

posted Nov 02, 2009 10:34:45

fred rosenberger wrote:you can use the -Xmx and -Xms to increase your heap size. This may not be a permanent fix, but it might let you limp through your initial struggles.

I appreciate that but, unfortunately, that's not an option. This code is running as a J2EE application and is hosted on a box which I do not have control over. My chances of getting them to raise the heap size for my application on that box are...well...let's say I'd be better off playing the lottery.

It isn't obvious to me why you have to read all 20,000 strings into memory before processing them. Couldn't you just do something like "read a string, process a string, until end of file"?

Although a mere 20,000 strings shouldn't blow out your memory unless they are extraordinarily long. So there's probably more to it than just that. I would recommend profiling the application but I suspect you might have problems arranging that too.
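A minimal sketch of the "read a string, process a string" approach, assuming one ID per line (the database check is a placeholder, since the original code isn't shown in the thread):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class StreamingIdReader {
    // Reads one ID per line and "processes" it immediately, so only a
    // single line is ever held in memory at a time.
    public static long processAll(Reader source) throws IOException {
        long processed = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String id;
            while ((id = in.readLine()) != null) { // read a string...
                // checkAgainstDatabase(id);       // ...process it, then forget it
                processed++;
            }
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(processAll(new StringReader("123456\n7654321\n898989\n"))); // 3
    }
}
```

With this shape, memory usage stays flat no matter how many IDs the file contains.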

Corey McGlone

posted Nov 02, 2009 11:09:41

Paul Clapham wrote:It isn't obvious to me why you have to read all 20,000 strings into memory before processing them. Couldn't you just do something like "read a string, process a string, until end of file"?

I've considered this, as well. The reason I read all the Strings (which are numeric values from 6-8 digits long, so they're not horribly large) into memory is so that I can get an accurate count of how many there are. This allows me to provide progress statistics. I read the numbers into a list and then set the original file to null so the "net memory usage increase" should be negligible.

Even still, like you said, 20,000 Strings of that size shouldn't really be causing that much of an issue.

Corey McGlone wrote:I've considered this, as well. The reason I read all the Strings (which are numeric values from 6-8 digits long, so they're not horribly large) into memory is so that I can get an accurate count of how many there are. This allows me to provide progress statistics. I read the numbers into a list and then set the original file to null so the "net memory usage increase" should be negligible.

Even still, like you said, 20,000 Strings of that size shouldn't really be causing that much of an issue.

It would seem to me that keeping a running count would be just as accurate, and a much better use of resources. What happens when your file gets to 200,000 lines, or 2,000,000?
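If the total really is needed up front for the percentage, a cheap first pass can count the lines without retaining any of them. A sketch (the temp file in `main` just stands in for the uploaded file):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class LineCounter {
    // First pass: count the lines without keeping them, so the progress
    // denominator is known before processing starts.
    public static long countLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.count(); // lazy stream; no line is retained
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("ids", ".txt");
        Files.write(tmp, List.of("123456", "7654321", "898989"));
        System.out.println(countLines(tmp)); // 3
        Files.delete(tmp);
    }
}
```

Processing can then stream the file a second time, reporting `totalRowsProcessed * 100 / total` as it goes.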

John.

Corey McGlone

posted Nov 02, 2009 12:00:26

John de Michele wrote:It would seem to me that keeping a running count would be just as accurate, and a much better use of resources. What happens when your file gets to 200,000 lines, or 2,000,000?

I do keep a running count of the number of lines I've processed (that's the variable "totalRowsProcessed"). But, in order to know my progress, which is expressed as a percentage, I need to know how many rows were in the original file as well.

I didn't really figure this was where my issue was - after all, once the numbers are loaded into that list, the list doesn't grow. If I'm getting past this portion of my code (which I know I am), then this shouldn't be an issue.

That said, this is using memory that isn't absolutely necessary, so I went ahead and tried it. I read from my file one byte at a time and process each number as I get it. This got me all the way from 15% to 16% before dying with an OutOfMemoryError.

Given that, I'm pretty sure this isn't where my problem lies.

Corey McGlone

posted Nov 03, 2009 12:42:55

Just to close up this thread - I found my issue today. As usual, the problem lay in code that I wasn't even looking at. It was inside this method:

Inside that method, I was creating a CallableStatement and using it to invoke a stored procedure on the database I'm connected to. After processing, I was properly closing the ResultSet, but I had forgotten to close the CallableStatement. Adding cStmt.close() fixed all my memory issues.
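On Java 7 and later, try-with-resources makes this class of leak much harder to write, because close() is guaranteed to run even if processing throws. A sketch using a stand-in class (a fake AutoCloseable instead of a real JDBC CallableStatement, since no database is available here; with real JDBC the resource in the try header would be `conn.prepareCall(...)`):

```java
public class StatementCloseDemo {
    // Stand-in for a JDBC CallableStatement; only the AutoCloseable
    // behavior matters for this illustration.
    static class FakeStatement implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // try-with-resources calls close() automatically when the block exits,
    // normally or via an exception -- the same effect as an explicit
    // cStmt.close() in a finally block.
    public static boolean runOnce() {
        FakeStatement cStmt = new FakeStatement();
        try (cStmt) {
            // executeQuery(), read the ResultSet, etc. (placeholder)
        }
        return cStmt.closed;
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // true
    }
}
```

Closing the Statement also closes its current ResultSet, so one try-with-resources block can cover both.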

Corey McGlone wrote:Just to close up this thread - I found my issue today. As usual, the problem lay in code that I wasn't even looking at. It was inside this method:

Inside that method, I was creating a CallableStatement and using it to invoke a stored procedure on the database I'm connected to. After processing, I was properly closing the ResultSet, but I had forgotten to close the CallableStatement. Adding cStmt.close() fixed all my memory issues.

Thanks for all the help, everyone.

Does anyone have an explanation for that? How does not closing a CallableStatement cause an out-of-memory error?

"Releases this Statement object's database and JDBC resources immediately instead of waiting for this to happen when it is automatically closed. It is generally good practice to release resources as soon as you are finished with them to avoid tying up database resources."

That's from the javadoc of the close() method. I'm tempted to believe them; you never know what's under the hood ;)