November 4, 2009

How To Unescape HTML in Java

I was writing an HTML unescape algorithm in Java today, and what I came up with is the one below. The algorithm has a problem: in some corner cases it eats up spaces or characters. Can you figure out what the problem is?

I wrote the class so that you can compile and run it from the command prompt and see the output right away. With a little trial and error, you should be able to figure out the issue in the algorithm below.
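
For comparison, here is a minimal sketch of one way an HTML unescape routine can be written so that it does not drop characters. It is not the class discussed in this post, and it does not claim to identify that class's bug. The class name `HtmlUnescapeSketch`, the method names, and the tiny entity table are all assumptions made for illustration; a real entity table is much larger. The idea is that any `&` that does not start a recognizable entity is copied to the output unchanged, along with everything after it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not the original code from this post.
public class HtmlUnescapeSketch {

    // Deliberately small table of named entities (an assumption for this sketch).
    private static final Map<String, Character> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", '&');
        NAMED.put("lt", '<');
        NAMED.put("gt", '>');
        NAMED.put("quot", '"');
        NAMED.put("apos", '\'');
        NAMED.put("nbsp", '\u00A0');
    }

    public static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '&') {
                int semi = input.indexOf(';', i + 1);
                if (semi > i + 1) {
                    // Candidate entity body, e.g. "amp", "#65" or "#x41".
                    String decoded = decode(input.substring(i + 1, semi));
                    if (decoded != null) {
                        out.append(decoded);
                        i = semi + 1; // skip past the whole entity, including ';'
                        continue;
                    }
                }
            }
            // Not an entity we recognize: copy the character as-is so nothing is lost.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // Returns the decoded text, or null if the body is not a known entity.
    private static String decode(String body) {
        if (body.charAt(0) == '#') {
            try {
                boolean hex = body.length() > 1
                        && (body.charAt(1) == 'x' || body.charAt(1) == 'X');
                int codePoint = hex
                        ? Integer.parseInt(body.substring(2), 16)
                        : Integer.parseInt(body.substring(1));
                if (!Character.isValidCodePoint(codePoint)) {
                    return null;
                }
                return new String(Character.toChars(codePoint));
            } catch (NumberFormatException e) {
                return null; // e.g. "&#;" or "&#xZZ;"
            }
        }
        Character ch = NAMED.get(body);
        return ch == null ? null : ch.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("1 &lt; 2 &amp;&amp; 3 &gt; 2"));
        System.out.println(unescape("Tom & Jerry; still intact"));
        System.out.println(unescape("&#65;&#x42; &bogus; &amp"));
    }
}
```

Like the class in the post, this can be compiled with `javac HtmlUnescapeSketch.java` and run with `java HtmlUnescapeSketch`. Inputs such as a bare `&` followed later by an unrelated `;` are the kind of corner case worth trying when hunting for lost characters.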

Thanks, Nitol. Yes, the Apache open-source libraries are the answer for small algorithms like this. I tried writing it myself just for fun, to play around with some data structure and algorithm problems. For production use, though, I should rely on a library like that.