Sunday, November 26, 2006

Two and a half weeks ago I published a post about problems with random VM crashes using Java 5 on Linux with a 2.4 kernel.

Most of the feedback I got suggested upgrading to a more recent kernel version. Because this represents a major undertaking for our application (several thousand clients deployed with Java 1.4.2 on RH9) we needed to be sure this would work.

Because all the crash reports - see the original post - seemed to point in the direction of the garbage collector, I wrote a little test application to stress it. It creates a configurable number of threads, each of which just keeps allocating a byte[] of variable size. If an OutOfMemoryError occurs, the thread is replaced with a new one. You can find the code at the bottom of this post.

I started 4 instances of this tool under Ubuntu 6.06, each configured with 20 threads (first parameter) and up to 40MB of memory per thread (second parameter, in bytes).
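For readers who want to try something similar, here is a minimal sketch of such a stress tool, reconstructed only from the description above. It is not the original code: the class name GcStress, the helper names and the random size distribution are assumptions made for illustration.

```java
import java.util.Random;

/**
 * Sketch of a GC stress tool: several threads that keep allocating
 * byte arrays of random size. All names here are illustrative assumptions.
 */
class GcStress {

    /** Allocates one array of random size in [1, maxBytes] and returns its length. */
    static int allocateOnce(Random random, int maxBytes) {
        byte[] block = new byte[1 + random.nextInt(maxBytes)];
        block[block.length - 1] = 1; // touch the array so it is really used
        return block.length;         // the array becomes garbage right after this
    }

    /** Starts one worker thread that allocates until it is interrupted. */
    static Thread startWorker(final int maxBytes) {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                Random random = new Random();
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        allocateOnce(random, maxBytes);
                    }
                } catch (OutOfMemoryError e) {
                    // This thread gives up; a fresh one takes its place.
                    startWorker(maxBytes);
                }
            }
        });
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        int threads = Integer.parseInt(args[0]);  // first parameter: number of threads
        int maxBytes = Integer.parseInt(args[1]); // second parameter: max bytes per allocation
        for (int i = 0; i < threads; i++) {
            startWorker(maxBytes);
        }
    }
}
```

With this sketch, a run comparable to the one described would be started as `java GcStress 20 41943040`, assuming 40MB is taken as 40 * 1024 * 1024 bytes. The program never exits on its own and has to be killed manually.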

After 4 days of continuous running - we had just started to feel a little safer - one of the processes crashed, leaving a hs_err file behind, again telling us the current activity was a full garbage collection.

We have now filed a bug report with Sun, which they have yet to review. They say that it currently takes around 3 weeks before you get an official bug id. However, I do not think any fix will be in time for us to use Java 5 for our next application release.

Does anyone know if there is a way of getting the crash files with debug symbols? Maybe this would allow us to do some more testing on our own. Is there some sort of debug-VM version available for download?

Tuesday, November 21, 2006

On Todd Huss' blog I just came across a very simple way of initializing a collection with a set of predefined values. It is so simple that it is amazing people do not use it far more often. For my part, this is the first time I have seen instance initializers used this way, although they are nothing all that special...
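The trick presumably looks something like the following sketch. The class name, method name and values are made up for illustration and are not taken from Todd Huss' post.

```java
import java.util.ArrayList;
import java.util.List;

class InitializerDemo {

    /** Builds a list whose values are added by an instance initializer block. */
    static List<String> weekendDays() {
        // Outer braces: an anonymous subclass of ArrayList.
        // Inner braces: an instance initializer that runs when the object is constructed.
        return new ArrayList<String>() {
            {
                add("Saturday");
                add("Sunday");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(weekendDays()); // prints [Saturday, Sunday]
    }
}
```

One thing to keep in mind: every use of this pattern creates an extra anonymous class, and when used inside an instance method the resulting object keeps a reference to the enclosing instance.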

Saturday, November 18, 2006

Reading about Peter Zaitsev's feature idea about Finding columns which a query needs to access - which I would really like to see implemented - reminded me of a bug report I filed in 2004 and which bit me again only a few days ago. You can find it under Bug #7074 in the MySQL bug tracking tool. Although it is filed as a feature request, I think one should be aware of this, as it may cause problems in your applications (it did in ours).

Basically it is about explicitly specifying which columns you need in a result set, instead of just using SELECT *. This is generally a good idea, however if the table contains BLOB columns, it becomes even more important, as it may affect performance heavily in an unexpected manner.

From the bug report:

MySQL first reads all the selected columns, and only after that
checks the WHERE.

This may lead to long running queries, even if you do not use the BLOB column in the WHERE clause and even if there is no data to retrieve based on the query conditions.
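On the application side this simply means naming the columns you need. Here is a small JDBC sketch of the idea; the table "documents" and its columns (id, owner_id, title and a BLOB column called content) are hypothetical and only serve as an example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class DocumentTitles {

    // Hypothetical schema: documents(id, owner_id, title, content BLOB).
    // The BLOB column "content" is deliberately left out of the select list.
    static final String TITLES_BY_OWNER =
            "SELECT id, title FROM documents WHERE owner_id = ?";

    /** Reads only the titles for one owner, never touching the BLOB column. */
    static List<String> titlesByOwner(Connection connection, long ownerId) throws SQLException {
        List<String> titles = new ArrayList<String>();
        PreparedStatement statement = connection.prepareStatement(TITLES_BY_OWNER);
        try {
            statement.setLong(1, ownerId);
            ResultSet rows = statement.executeQuery();
            while (rows.next()) {
                titles.add(rows.getString("title"));
            }
        } finally {
            statement.close(); // closing the statement also closes its ResultSet
        }
        return titles;
    }
}
```

If the BLOB content is needed for a particular row, it can be fetched in a second query by primary key once you know the row actually exists.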

Monday, November 13, 2006

Recently I received a notification about F-Secure Anti-Virus 2007 being available. As an F-Secure customer you are entitled to upgrade from the 2006 version if your subscription is valid. So I downloaded the installation package and performed the upgrade.

After the obligatory reboot things started to fall apart. My computer would not respond for more than about 30 seconds after I had logged in. Opening the Start menu would work, maybe even opening e. g. the Control Panel sub menu. However nothing else would work after this point. Using Ctrl-Alt-Del to get the Task Manager just allowed me to "wipe" the start menu from the screen, no more action would be possible.

What made me suspicious was a little dialog I had to dismiss right after logging in that informed me about my Kerio Personal Firewall not being found by the system-tray GUI. Because conflicting firewalls are known to cause lockup problems like this, I originally bought F-Secure Anti-Virus instead of the whole Internet Security package. Anti-Virus 2006 had been working fine in conjunction with the separate personal firewall.

I rebooted to see if this was some sort of transient problem with the first reboot after the install. This time I did not even get an Explorer to launch and show me my desktop. Apart from the wallpaper and a mouse pointer I could not see anything. Hitting Ctrl-Alt-Del again let me launch the Task Manager. I tried to start explorer.exe from there, to no avail.

I decided to uninstall the personal firewall. I tried to boot into Safe Mode, only to see that it would not come up and instead died with a blue screen. To be fair, I have to say that I had not tried Safe Mode for a long time, so I do not know if it would have worked before my problems started.

My only way to resolve this was to boot into the Vista RC installation I luckily had not deleted yet and to disable the startup of the firewall service in the XP install. To do so I loaded the windows\system32\config\system registry hive into the Vista regedit and set the startup type (ControlSet00x\Services\servicename\Start) to 4 - which means disabled - in the firewall service node of the active ControlSet001. You can see which control set is the one for "normal" Windows startup by looking at the SYSTEM\Select\Default value.

Upon restart the situation did not change, the same problem as before. Because I was not sure whether just disabling the Kerio service had been enough, I decided to uninstall it. To do so I had to disable F-Secure Anti-Virus, too. So I loaded Vista again and opened the XP registry. Luckily the F-Secure services all have human readable key names, all starting with "F-Secure", so it was very easy to disable them as well.

Back in XP I was for the first time able to do more than wait for the lock-up. I uninstalled Kerio using the Control Panel's "Add/Remove Programs" applet and rebooted, after I had set the F-Secure services back to their original startup settings.

Guess what... It still did not work... I came to the conclusion that there must be some sort of bug in the 2007 version of F-Secure Anti-Virus. In the meantime my father had called, complaining about the same problem, which at the time seemed to support my theory. At that point, however, I did not yet know that he used the Sunbelt Personal Firewall, too.

After going through the whole boot Vista - load XP registry - disable services - reboot to XP hassle I finally uninstalled Anti-Virus 2007, rebooted and re-installed 2006. At this point I had restored the situation where I had originally left off - minus the Kerio Personal Firewall.

For some reason I did not want to believe that F-Secure would ship such a lousy product. I fired up regedit again and opened the services subtree. There I reviewed every one of them, not knowing what exactly to look for. Finally I found these two entries:

The ImagePath in the second node reads "%systemroot%\system32\drivers\khips.sys" when viewed with regedit. Searching the net for that name reveals that it is the "Kerio Host Intrusion Prevention Service". Obviously this is a remainder of the Kerio Personal Firewall that I thought I had removed.

In the Device Manager one can also see this service when the "View/Show Hidden Devices" option is enabled. It will show up under "Non-PnP-Drivers" (sorry if the option names are a little off, I am trying to guess their names, because I use a German Windows).

As soon as I had removed both of the registry keys above (kerio.uk.com contains a reference to fwdrv) and rebooted, I could use F-Secure Anti-Virus 2007 without any problems. I will file this with F-Secure now...

The guys of the JavaPosse have just released a special issue of their podcast in which they interview Mark Reinhold (chief engineer for Java SE), Rich Sands (community marketing manager for Java SE) and Eric Chu (senior director of the Client Systems Group and head of its Java ME initiatives).

Saturday, November 11, 2006

Very cool behaviour of QuickTime on Vista (ok, RC1, but I do not think this will become better):

This is shown when you start the QuickTime control panel applet. Notice that the publisher information is displayed as "Microsoft Windows Publisher", so you have no idea that this was really the QuickTime applet. It could have been any other process in the background, too.

Tuesday, November 07, 2006

We are currently evaluating the consequences of migrating our application from Java 1.4 to Java 5. While initial tests revealed only simple issues (like variables called enum etc.) we are now seeing a much more severe problem: Random VM crashes.

Currently we see this on Linux (Kernel 2.4) only, however even there we cannot reliably reproduce the problem. On a single machine we have seen two crashes in a week. Notably the application was not being used, it was just started and waiting for user input. Some background threads are running in this situation, however they do not do any work, either. They just poll some database tables for external changes, but there were none.

All of a sudden a VM would crash, leaving a hs_err_pid1234.txt behind. This is what they look like (shortened):

Looking through the Sun bug database I found several reports about similar crashes, however they were all closed as not reproducible. This is our problem, too. Right now the application has been running for 5 days without a problem. Nevertheless this is not too comforting, as we would have several thousand VMs in production use. Should we decide to migrate, even a 0.1% chance of this crash per VM and day would leave us with several problem reports a day, which we cannot accept.

Thursday, November 02, 2006

As announced previously I spent some time getting Beryl to work on my newly upgraded Edgy Eft installation. Although it did not go as smoothly as I would have hoped, it was not too troublesome either.

Dual head still seems to be a major problem in many areas in Linux. This is definitely something the Windows people do not have to worry about nearly as much, but ok, this may partly be related to the hardware vendors not providing some sort of unified and/or open drivers.

Nevertheless it is now working, after some changes to my xorg.conf. Before those I always got an error message from Beryl, complaining about a missing RandR extension.

The effects are really nice, some of them are however too slow for my taste in the default settings. After speeding them up a little (I do not like to wait for a context-menu to wobble into view, if it wobbles for more than a fraction of a second) I really liked it. There are some issues left, but I assume this is because of the ongoing development. E. g. window resizing is a little strange if you grab a window's top edge and move the mouse up and down. One would expect the window to remain in place and gain or lose height from the top, i. e. where you drag. However sometimes windows seem to be resized at the bottom.

Video playback is also choppy, but that seems to depend on the file I play back. Probably due to different codecs, however I have not really looked deeper into it.

From what I have seen so far, I believe there is very much potential in this :)

About me

I am a software developer focused mainly on mobile development on iOS and Android, with CenterDevice GmbH and codecentric AG. Apart from this I spend my time with podcasts, books and technology in general. This is my personal site. Anything posted here is my personal and private view of things. codecentric's blog (with the occasional crosspost) can be found at blog.codecentric.de.