Jsoup 1.2.3 processes HTML 5

Jsoup, a free Java library for processing HTML, is available in version 1.2.3 with enhanced HTML 5 support.

The parser has always implicitly supported HTML 5 tags, but it now also knows the element definitions of the new standard. The tool can therefore generate an HTML 5 compliant parse tree of a page for further processing.

The second important new feature is that Jsoup now automatically detects the character set of a fetched document and decodes the input before parsing. There are also new selectors as well as small fixes and improvements.
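To illustrate what detecting the character set before parsing can involve, here is a minimal sketch that uses only the Java standard library on a current JDK. It is not Jsoup's actual implementation: the class name `CharsetSniffer`, its methods, and the 1024-byte lookahead are hypothetical choices made for this example. It looks for a `charset` declaration in a `<meta>` tag near the start of the raw bytes and decodes the input accordingly.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Jsoup API.
final class CharsetSniffer {

    // Matches the charset value in <meta charset="..."> as well as in
    // <meta http-equiv="Content-Type" content="text/html; charset=...">.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9_:.\\-]+)",
            Pattern.CASE_INSENSITIVE);

    // Assumption: the declaration appears within the first 1024 bytes.
    private static final int LOOKAHEAD = 1024;

    /** Returns the charset declared in a meta tag, or the fallback if none is usable. */
    static Charset detect(byte[] raw, Charset fallback) {
        // ISO-8859-1 maps every byte to exactly one char, so the ASCII
        // markup stays readable whatever the real encoding turns out to be.
        int limit = Math.min(raw.length, LOOKAHEAD);
        String head = new String(raw, 0, limit, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: use the fallback below.
            }
        }
        return fallback;
    }

    /** Decodes the raw bytes with the detected charset so they can be parsed as text. */
    static String decode(byte[] raw, Charset fallback) {
        return new String(raw, detect(raw, fallback));
    }
}
```

A caller would pass the downloaded bytes together with a default such as UTF-8, for example `CharsetSniffer.decode(bytes, StandardCharsets.UTF_8)`, and hand the resulting string to the HTML parser. A real implementation would typically also consider the charset given in the HTTP `Content-Type` header.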

Jsoup runs on Java 1.5 and later and is released under the MIT license. The Jsoup homepage offers JAR files for download, cookbook-style instructions, and the API reference.

Back in 1999 when the HTML 4.01 standard first appeared, virtually nobody envisioned video blogs, social networking sites, or Internet office tools. The upcoming HTML 5 standard will remake the web for the new generation of technologies and services.