Friday, December 26, 2008

We've discovered a bug in the Solaris JVM that we're using (1.5.0_08-b03). We have a long-running JVM that once a minute forks an executable via java.lang.Runtime.exec and reads its stdout. After a long time, one of the forks doesn't actually make it to the exec, and the thread that asked for the exec never continues. That's why I think it's a JVM bug: the exec call is made and only completes half of its job.
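For anyone wondering what the pattern looks like, here is a rough sketch of the fork-and-read loop body plus a watchdog around it. This is not our production code and it does not fix the JVM bug: the class and method names (`ExecWatchdog`, `runAndReadStdout`, `callWithTimeout`) are made up for illustration, and a worker thread that is stuck inside exec stays stuck. All the watchdog buys you is that the caller gets a TimeoutException instead of hanging forever.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a command via Runtime.exec and guards the call with a timeout.
final class ExecWatchdog {
    private ExecWatchdog() {}

    /** Forks the command, reads all of its stdout, and waits for it to exit. */
    static String runAndReadStdout(String[] command) throws IOException, InterruptedException {
        Process child = Runtime.getRuntime().exec(command);
        StringBuilder output = new StringBuilder();
        // Note: stderr is not drained here; a child that writes a lot to stderr could block.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(child.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        child.waitFor();
        return output.toString();
    }

    /** Runs the task on a daemon thread; throws TimeoutException if it takes too long. */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "exec-watchdog-worker");
            thread.setDaemon(true); // a stuck worker must not keep the JVM alive
            return thread;
        });
        try {
            Future<T> result = worker.submit(task);
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            worker.shutdownNow(); // interrupts the worker if it is still running
        }
    }
}
```

Usage would be something like `ExecWatchdog.callWithTimeout(() -> ExecWatchdog.runAndReadStdout(new String[] {"/path/to/tool"}), 30_000L)`, where the path and the 30 second limit are placeholders.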

Tuesday, December 23, 2008

This is the second time this has bitten me: I want to use Hibernate with standard JPA to persist my entities. And I want my Entities autodetected, as Hibernate is capable of, even outside of a JEE container. So in my persistence.xml I have this bit of code:

<property name="hibernate.archive.autodetection" value="class,hbm"/>

But I do not package this persistence.xml into my .jar file; instead I just make it part of my classpath for my unit tests, thinking this will make things more flexible. And of course, this doesn’t work. The autodetection only works if the persistence.xml is located in the META-INF directory of the jar file that contains the entities to be detected. See my post in the Hibernate forum.

If you’re a JSF newbie (like me), and you’re using Seam, you might be tempted to take one of the examples and hack away at it. For instance, in the booking example, the first page is called home.xhtml. After you type up some tags, you want it to run, so you point your browser at:

http://localhost:8080/myapp/home.xhtml

What you’ll find is not a JSF-rendered page, but your JSF tag source! Then you think that you’ve misconfigured something, so you look over everything. But what’s really going on is that the Seam example isn’t configured to render that URL; instead you should go to:

http://localhost:8080/myapp/home.seam

That will render the home.xhtml as JSF. Look in your war file’s WEB-INF/web.xml, and find this bit of code:

If you're running on OS X, disable the unnecessary dock icon from JBoss by adding -Djava.awt.headless=true. This might also solve problems on Linux/Solaris boxes that you're ssh'd onto and you don't have a DISPLAY environment variable set for X Windows.

Use your own jboss-log4j.xml file, which by default is in ${jboss.home}/server/default/conf/jboss-log4j.xml. You probably don't need DEBUG output emitted to your console...

In order to get Java SE 6 running on your Mac, open the Java Preferences (at /Applications/Utilities/Java/Java Preferences) and drag Java SE 6 to the top of the list for Java applet versions and/or Java application versions. Now I get:

Friday, December 12, 2008

I'm working on a new project and using git for the first time; I really like the distributed source control model, it makes a lot of sense. I come from using p4 for quite a long time, so git feels a lot like CVS/Subversion, but obviously a lot more powerful than CVS. Also, I want to make sure that I do not say that git is hard to understand and use.

The way I work with my projects is by submitting not only all my source code for my project, but also all tools that are used to generate my project. (Obviously I draw the line somewhere, I don't submit my Linux distribution that I use on the build machine.) That way I can guarantee reproducibility and no one on the team has to install their own tools before being productive.

A short time into my new project I already have more than 500,000 files in my git repository. Now every time I use any git command, it just sits there for many seconds... even when I open a file using Emacs, I have a severe delay because git is invoked to check its status. Using dtrace I've determined that it is doing an lstat on every single file in the repository, to make sure that it's not out of date.

Therefore beware: git is not good for large repositories. And when I say large, I don't mean "sorta large" like 20,000 files. I mean more on the order of 500,000 or 2.5 million files (the current size of our p4 repository). This is because git is just like CVS: it automatically determines for you which files you have edited/deleted/added, and which you have not, but it does this by doing an lstat on every single file in your tree. p4 does not make this assumption, and requires you to tell it what files you have edited/deleted/added, therefore it never does an lstat of the files in your tree (unless you invoke some special commands to ask it to look).
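To get a feel for the cost, here is a rough sketch that does the bare minimum a status scan has to do: visit every file under a directory and read its attributes without following symlinks, which is roughly one lstat per file. This is not how git is implemented; `TreeStatCost` and `countFiles` are names I made up, and the timing will depend heavily on your filesystem and cache.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical illustration: walk a tree and touch the attributes of every file.
final class TreeStatCost {
    private TreeStatCost() {}

    /** Counts the non-directory entries under root, reading each one's attributes. */
    static long countFiles(Path root) throws IOException {
        final long[] count = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                // attrs was read without following symlinks, much like lstat
                attrs.lastModifiedTime();
                count[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        long start = System.nanoTime();
        long files = countFiles(root);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(files + " files examined in " + millis + " ms");
    }
}
```

Point it at the top of a large working tree and compare the time with how long a git status takes there.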

It seems there are attempts to optimize the lstat operations, but it is part of the design of git, and would likely be unnatural to avoid it or suppress it.

I found the git benchmarks, where it describes how efficient it is for the size of its repositories, which is great. However, it does not mention the number of files in a repository. And I couldn't find the current size of the linux kernel repository, but I found a mention that the pull of an entire tree was 5,000 files, which included all of git's metadata (if I understood it correctly).

In order to work around git's slowness with large numbers of files, I decided to split my git repository into two halves: the half with the tools, and the other half with my project source. I found a tool called git split that seems to do the job (see http://people.freedesktop.org/~jamey/git-split), but it didn't work on my git 1.6.0 repository. It got stuck because it couldn't read my .git/info/grafts. So I gave up there and just deleted all the tools out of my tree, and added them into a separate git repository. That made things much better.

Sunday, November 23, 2008

I have a Java process that is forked from a Windows service; the service process monitors the child Java process to make sure it's OK.

However, I was seeing in my logs that sometimes I would get the exit code 143. My code does not generate this exit code; further, there are no .dmp files or Java crash reports or exceptions logged or any other indication in my code that the process exited. The explanation is simple: my service and all other processes were receiving CTRL_SHUTDOWN_EVENT. My service process ignores this, but Java does not (see IBM's excellent document on Java Signal Handling).

The result is that my Java process would exit, my monitor would not know why, and restart it, at which time it would again get a shutdown event, etc, until the machine actually completed its shutdown.
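For what it's worth, 143 is 128 + 15, the code the JVM exits with when it shuts down in response to a SIGTERM-style request, which is how it treats the shutdown event here. A monitor could check for that value before deciding to restart the child. The sketch below is only an illustration; `ExitCodes` and its methods are hypothetical names, not code from my service.

```java
// Hypothetical helper for a process monitor: recognise the JVM's
// "terminated by shutdown request" exit code.
final class ExitCodes {
    private ExitCodes() {}

    static final int SIGNAL_BASE = 128; // conventional offset for signal-driven exits
    static final int SIGTERM = 15;      // termination request

    /** True if the exit code is 143, i.e. 128 + SIGTERM. */
    static boolean looksLikeTermination(int exitCode) {
        return exitCode == SIGNAL_BASE + SIGTERM;
    }

    /** Short text suitable for the monitor's log. */
    static String describe(int exitCode) {
        if (exitCode == 0) {
            return "normal exit";
        }
        if (looksLikeTermination(exitCode)) {
            return "terminated by shutdown request (exit code 143)";
        }
        return "exit code " + exitCode;
    }
}
```

Keep in mind that 143 also shows up when someone deliberately terminates the process, so the monitor still has to decide what that means in its own context.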

Wednesday, November 12, 2008

In all our Windows executables we AddVectoredExceptionHandler so that we can get .dmp files when things crash. However, I recently discovered that doing so prevents Purify from working correctly on Windows. This should probably be expected, and I'm probably not chaining to the next handler correctly, but it's something to watch out for.

Subtopic #1: In all other operating systems, if you want a core dump file, it's very straightforward; you ulimit -c unlimited, and you get core files. In Windows, you have to write the dump file from your own code. You could depend on Dr. Watson, or Windows Error Reporting, but if you want your own dump file in a place where you or your customers can find it, you have to write it yourself.

Subtopic #2: Fifteen years ago, Purify was the most important development tool you could have. It's still the most important development tool you could have, except that it hasn't really changed in 15 years. I have several friends that are ex-Pure, and they are also dismayed at how it's just not keeping up. For instance:

It wasn't until 2004 (or was it 2006?) that you could really use Purify on Linux. Wow.

We still can't Purify JNI C code in a Java JVM on Linux. The JVM does magic things that Purify can't handle. Oh well.

Tuesday, November 11, 2008

I spent some time determining that the sample code for ndisprot that is included in the 6001.18001 WDK has a bug in the IRP cancel code. What's worse is that the bug only exists in the directory labeled 5x, not the directory labeled 60.

For some (at least 2) of the samples in the latest WDKs, they made a clean break from the NDIS 5.x and earlier sample code when they wrote the samples for NDIS 6.0. In the NDIS 6.0 sample they wrote the cancel code according to the pseudocode in MSDN. Unfortunately, they didn't back-port those changes/fixes to the sample code in the 5x directory (sample code intended for NDIS 5.x).

Thursday, August 14, 2008

When you call InetAddress.getLocalHost(), a reverse DNS lookup for your hostname is done. In the worst case, you’ve specified a DNS server that isn’t reachable, and so you have to wait for the DNS timeout, which can be quite long, like 30 seconds or 2 minutes. The reason the crypto code in JCE is doing this is for a random seed generator. Seems you could find something else more random than your hostname…

Below I’ve replicated the sample code that I created for this fix, in case it’s of any use to anyone:

I’ve found what I believe is a workaround to this problem, that seems to work against Java6. It works by setting the system property impl.prefix, and using implementations derived from the following classes:

The override implementations of Inet4AddressImpl and Inet6AddressImpl are designed to make sure that InetAddress.getLocalHost() returns an answer without causing any network access. That means that SSL connections, when constructing their random seed that includes the local hostname, will not hang when DNS cannot be reached.

The reason PlainDatagramSocketImpl is overridden is that the system property impl.prefix is also used to construct it; if impl.prefix is not specified, then a prefix of “Plain” is assumed, and thus PlainDatagramSocketImpl is loaded. Therefore we must provide an implementation with our own matching prefix.

The main class, DefeatGetLocalHost, sets the system property impl.prefix to “DefeatGetLocalHost”. This will cause the following classes to be loaded when they are needed:

The reason that these derived classes are placed in the same package, java.net, is that the constructors and methods they need are package-private; therefore placing them in the same package provides the highest level of compatibility.

Also, in order to get our derived classes in package java.net to load in the Java runtime, we have to append to the boot classpath. This is done with: -Xbootclasspath/a: after which we specify the directory with our class files.

In the next comment are the source files that I wrote to demonstrate. Compile it and execute DefeatGetLocalHost using -Xbootclasspath/a: to include the overridden classes.

Tuesday, January 1, 2008

Just filed a bug against log4j; it can swallow InterruptedException delivered to a thread that uses logging. This is because it catches IOException and ignores it, but IOException can also be InterruptedIOException, which can be generated on some platforms when a thread is interrupted.
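The shape of the problem, and one way to avoid it, can be sketched like this. It is not the log4j code or the actual fix; `InterruptAwareWriter` and `writeQuietly` are hypothetical names. The idea is simply that code which ignores IOException should restore the thread's interrupt flag when the exception turns out to be an InterruptedIOException.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;

// Hypothetical helper, not part of log4j: writes bytes and ignores I/O
// failures the way a logger might, but keeps an interrupt visible.
final class InterruptAwareWriter {
    private InterruptAwareWriter() {}

    /** Returns true if the write succeeded, false if it failed and was ignored. */
    static boolean writeQuietly(OutputStream out, byte[] data) {
        try {
            out.write(data);
            out.flush();
            return true;
        } catch (InterruptedIOException e) {
            // The thread was interrupted during I/O: restore the flag so
            // the caller can still see the interrupt.
            Thread.currentThread().interrupt();
            return false;
        } catch (IOException e) {
            // Ordinary I/O failure: a logger may reasonably ignore this.
            return false;
        }
    }
}
```

With a plain catch of IOException that does nothing, the first branch is missing and the interrupt is lost.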

Update: As of 8/14/2008, the log4j developer Curt Arnold has bug 44157 fixed, and the current plan is to ship the fix with log4j 1.2.16.