Just try this in Java, I dare you ...

The problem: he had a ton of HTML that had been written by MS Excel. As a result, all of the tags were upper-case and had MS proprietary style information embedded in them. He needed to clean this up, fast. He didn't have Perl on his box, but five minutes later he was looking at a URL that pointed to this script that I wrote. Paste the HTML into the textarea, click submit and it's instantly cleaned.

#!/usr/bin/perl -T
use strict;
use warnings;
use HTML::TokeParser::Simple 2.1;
use CGI qw(:standard);
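
Only the preamble of the script appears above. As a rough illustration of the kind of token loop that HTML::TokeParser::Simple makes possible, here is a minimal sketch, not the original script. The `clean_html` name and the list of attributes to strip (`style`, `class`) are assumptions made for the example; real Excel output may need a longer list. It relies on `delete_attr` and `rewrite_tag`, which the module documents as lower-casing tag and attribute names and double-quoting attribute values.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple 2.1;

# Hypothetical helper: returns a cleaned copy of the HTML string.
sub clean_html {
    my ($html) = @_;

    # Passing a scalar reference tells the parser to treat it as the document text.
    my $parser = HTML::TokeParser::Simple->new( \$html );
    my $clean  = '';

    while ( my $token = $parser->get_token ) {
        if ( $token->is_start_tag ) {
            # Assumed list of attributes carrying the unwanted style information.
            $token->delete_attr($_) for qw(style class);
        }
        if ( $token->is_start_tag || $token->is_end_tag ) {
            # Lower-case the tag and attribute names and quote the values.
            $token->rewrite_tag;
        }
        $clean .= $token->as_is;
    }
    return $clean;
}

print clean_html(qq{<TD STYLE="mso-number-format:General" ALIGN=right>42</TD>\n});
```

In a CGI setting, the string passed to `clean_html` would come from the submitted textarea, for example via `param()` from the CGI module, and the result would be printed back to the browser.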

Given that the problem is to Get the Job Done(tm), I'd say it's relevant whether HTML::TokeParser functionality is available in Java.

Some or all of the functionality may be available, though probably in a slightly more verbose form, and probably not as easy to find, install and use. As always, having done similar things before is an important factor (me, I wouldn't have had the experience to reach for HTML::TokeParser for this problem, but I might have found it anyway).