Search This Blog

PoLA and HttpURLConnection

If you are a developer like me you probably heard of 'PoLA' or the Principle of Least Astonishment. The principle is fairly simple. Never surprise a user. Or in this case even more important, never surprise a developer. Unfortunately I got more then I bargained for last week when one our web service clients was generating rogue requests.

Rogue requests you say? Yes, like in; we have no freaking idea where they're coming from. It's one of those moments where managers start running around in circles, panicking and yelling "we're hacked!!" or "please someone turn off Skynet!!".

After a long day of debugging and going through too many logs it turned out that the client was in fact disconnected before processing ended on the endpoint. So there requests weren't simultaneous after all. But why did that took us a day to find out? Did we play starcraft2 again instead of working?

Well no, it all started with the HTTP read timeout on the endpoint's container being unexpectedly set lower than we thought. The logging on the endpoint indicated that a reply was generated but the client was actually disconnected before that event because of the read timeout. This was of course not logged by our aspects on the endpoint side as this is decided on a lower level (the HTTP stack) rather then our endpoint itself.

Ok, true, I hear you say, but what about the web service client log? The web service client should have thrown a "ReadTimeoutException" or something similar and that should have been written to the log, right? Well, true, but it didn't. And now it comes, as it turned out the real surprises is inside HttpURLConnection (more specifically the default Oracle internal handler sun.net.www.protocol.http.HttpURLConnection)

Did you know that this default impl of HttpURLConnection has a special "feature" which does HTTP retries in "certain situations"? Yes? No? Well, I for once didn't. So what happened was that the timeout exception was indeed triggered on the web service client but silently catched by HttpURLConnection itself by which it decided to do an internal retry on its own. This means that the read() method called the web service on HttpURLConnection remains blocked, like you are still waiting for the response of the first request. But internally HttpURLConnection is retrying the request more then once and thus generating multiple connections. This explained why it took us so long to discover this as the second call was never logged by our code as it is in fact never triggered by our code but by HttpURLConnection internally.

Putting all this aside, who the hell builds a retry mechanism that isn't really documented nor configurable? Why am I after 15years of Java development (and a network fetish) not aware of this feature? But more over, why the hell is it doing retries on a freaking HTTP freaking POST? PoLA breach detected!

As you probably guessed by now it's a bug (http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6382788). Not the retry mechanism of course, that's just crap. The bug is that it also happens for a POST (which is by default not idem potent per HTTP RFC). But don't worry, Bill fixed that bug a long time ago. Bill fixed it by introducing a toggle. Bill learned about backward compatibility. Bill decided it's better to leave the toggle 'on' by default, because that would make it bug-backward-compatible. Bill smiles. He can already see the faces of surprised developers around the globe running into this. Please don't be like Bill?

So after some exciting days of debugging the solution was kinda lame. Merely setting the property to false fixed it. Anyway it surprised me enough to write a blog entry on it, so there you have it.

For the sake of completeness: if you run this code inside a container your results may vary. Depending on libraries used and/or your container, other implementations could have been registered which are then used rather than Oracle's internal one (see URL.setURLStreamHandlerFactory()). So now you might be thinking; why is that guy using the default HttpURLConnection then? Does he also drive to work in a wooden soapbox and cut's his grass with scissors? He could better start a fire and use smoke signals instead! Well, I can't blame you for thinking that, but we never deliberate decided on doings this. Our web service in question is a bit special and uses SAAJ SOAPConnectionFactory which on it's turn uses HttpURLConnection, which reverts to the default impl if no one registered another one.
If you use a more managed WS implementation (like Spring WS, CXF or JAX-WS impls) they will probably use something like Apache HTTP client. And of course if you, yourself would make HTTP connections you would opt for the same. Yes, I'm promoting Apache commons HTTP client, that little critter which changes public API more often then an average female changes shoes. But don't worry, I'll stop ranting now.

Comments

Post a Comment

Popular Posts

If you need to process large database result sets from Java you can opt for JDBC to give you the low level control required. On the other hand if you are already using an ORM in your application falling back to JDBC might imply some extra pain. You would be losing features such as optimistic locking, caching, automatic fetching when navigating the domain model and so forth. Fortunately most ORMs, like Hibernate, have some options to help you with that. While these techniques are not new, there are a couple of possibilities to choose from.
A simplified example; let's assume we have a table (mapped to class "DemoEntity") with 100.000 records. Each record consists of a single column (mapped to the property "property" in DemoEntity) holding some random alphanumerical data of about ~2KB. The JVM is ran with -Xmx250m. Let's assume that 250MB is the overall maximum memory that can be assigned to the JVM on our system. Your job is to read all records currently in …

I'm going to start this one with a bold statement: I always liked Java Swing, or applets for that matter. There, I said it.
If I perform some self analysis, this admiration probably started when I got introduced to Java.
Swing was (practically) the first thing I ever did with Java that gave some statisfactionary result and made me able to do something with the language at the time.
When I was younger we build home-brew fat clients to manage our 3.5" floppy/CD collection (written in VB and before that in basic) this probably also played a role.

Anyway, enough about my personal rareness. Fact is that Swing has helped many build great applications but as we all know Swing has it drawbacks.
For starters it hasn't been evolving since well, a long time. It also requires a lot of boiler plate code if you want to create high quality code.
It comes shipped with some quirky design "flaws", lacks out of the box patterns such as MVC.
Styling is a bit of a limitation …

This week it was time to upgrade our code base to the latest Hibernate 4.x.
We postponed our migration (still being on Hibernate 3.3) since the newer maintenance releases of the 3.x branch required some API changes which were apparently still in flux. An example is the UserType API which was still showing flaws and was going to be finalized in Hibernate 4.

The migration went quite smooth. Adapting the UserType's to the new interface was pretty straightforward. There were some hick ups here and there but nothing painful.

The thing to watch out for is the Spring integration. If you have been using Spring with Hibernate before, you will be using the LocalSessionFactoryBean (or AnnotationSessionFactoryBean) for creating the SessionFactory. For hibernate 4 there is a separate one in its own package: org.springframework.orm.hibernate4 instead of org.springframework.orm.hibernate3. The LocalSessionFactoryBean from the hibernate 4 package will do for both mapping files as well as annotate…