A look at the state of HTML5 parsing and the Opera 11.60 beta


A beta release of Opera 11.60, which was made available this week, includes a number of significant improvements to the browser’s HTML rendering engine. It also brings a visual overhaul to the built-in e-mail client and a few other nice cosmetic improvements.

Over the past two years, the developers behind Opera have taken major steps to modernize their Web browser and restore its competitiveness. The changes are broad in scope and have touched many different layers of the application.

One of the first major steps was the introduction of Carakan, a modern, high-performance JavaScript engine that uses just-in-time compilation to emit native code. Rendering also got a major overhaul when Opera transitioned to using its Vega vector drawing framework (which was originally implemented to support SVG) for all painting in the renderer. The cycle of modernization has continued in 11.60 with the introduction of a new HTML parser that conforms with the HTML5 specification.

HTML5 parser

Unlike XML, where structural validity is extremely important, HTML tends to be loose and highly forgiving. Very few pages on the Web strictly conform with standards. As such, HTML renderers need to be programmed to gracefully handle malformed markup—such as cases where tags are missing or nested unevenly. The manner in which aberrant HTML should be interpreted isn’t always intuitively obvious, however. Poor markup can sometimes create situations where the intention of the author is ambiguous. This can lead to inconsistent parsing behavior between browsers.

HTML5 is the first version of the standard to comprehensively define explicit parsing rules, even in edge cases that relate to malformed markup. The more specific parsing rules will help to improve interoperability between browsers by ensuring that the document object model (DOM) is assembled with consistent and uniform structure when markup is bad.
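To make the ambiguity concrete, here is a minimal sketch that flags unevenly nested tags using the `html.parser` module from Python's standard library. It is an illustration only: `html.parser` reports tags as it encounters them and does not implement the HTML5 tree-construction rules, and the names `NestingChecker` and `check` are hypothetical helpers invented for this example. They are not part of Opera or of any browser.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Hypothetical helper: tracks open tags and notes out-of-order closes."""

    # Void elements have no closing tag, so they never go on the stack.
    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of element names that are still open
        self.problems = []   # notes about mis-nested or stray closing tags

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.VOID:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            # Well-nested case: the innermost open element is being closed.
            self.open_tags.pop()
        elif tag in self.open_tags:
            # The closing tag matches an ancestor, not the innermost element.
            inner = self.open_tags[-1]
            self.problems.append(f"</{tag}> closed while <{inner}> was still open")
            # Remove the nearest matching ancestor; this is an arbitrary
            # recovery choice for the sketch, not what the HTML5 spec does.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]
        else:
            self.problems.append(f"</{tag}> has no matching open tag")


def check(markup):
    """Return (problems, tags left unclosed) for a markup string."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems, list(checker.open_tags)


if __name__ == "__main__":
    print(check("<p><b><i>text</b></i></p>"))
    print(check("<div><span>text</div>"))
```

The recovery step in `handle_endtag` is exactly the kind of decision that browsers historically made in different ways. The HTML5 specification replaces such ad hoc choices with a single defined procedure, so conforming parsers build the same DOM from the same broken input.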

This is particularly beneficial for Web developers, because complex JavaScript code that uses the DOM APIs to manipulate the content and structure of a page will operate more predictably. There are also a number of other peripheral benefits that emerge from the new parsing rules. For example, a complete implementation adds native support for inline MathML and SVG content in HTML markup.

All of the major browser vendors have been working on new parser implementations that comply with the HTML5 standard. Apple began developing an HTML5 parser in WebKit last year and deployed it to end users in Safari 5.1 earlier this year. Google shipped it in Chrome 7 a few months after it was implemented in WebKit. Mozilla made an experimental HTML5 parser available behind an about:config option in Firefox 3.6, and finally stabilized it earlier this year for Gecko 2, which was incorporated in Firefox 4. Microsoft announced in July that it has also started working on an HTML5 parser, which it expects to ship in Internet Explorer 10.

Opera first unveiled its HTML5-compatible parser in February, when it released an experimental build. The company took the opportunity to greatly overhaul its aging parser implementation, which had become cluttered with complex legacy code. The new parser, codenamed Ragnarok, is finally integrated in Opera 11.60. It is said to be slightly more memory intensive, but the increased footprint is offset by optimizations elsewhere in the browser.

In addition to the HTML5 parser, Opera 11.60 adds a number of other noteworthy features under the hood. It has gained support for HTML5 custom protocol schemes and CSS3 radial gradients. Opera’s scripting engine has also been improved to conform to ECMAScript 5.1, a minor revision of the JavaScript language standard that was published earlier this year to incorporate corrections to the specification that were made when ECMAScript 5.0 went through the ISO standardization process.

Conclusion

In addition to the developer-centric features discussed in this article, Opera 11.60 brings some user interface refinements, major improvements to the browser’s e-mail client, and some Lion compatibility enhancements such as support for full-screen mode. We will take a closer look at the browser’s user-facing enhancements when version 11.60 is officially released.