Fuzzinator, a mutation- and generation-based browser fuzzer

Fuzzers are widely used tools for testing software. They generate random test cases and feed them as input to the software under test. Since the tests have randomly built content, there is no expected result to check for correctness, but they are well suited to catching blunt bugs such as use-after-frees, memory corruption, assertion failures and other crashes. There are many ways to generate these tests, but all of them fall into three main groups: whitebox, blackbox and graybox fuzzers. The first approach, whitebox, relies on the source code of the tested software and tries to cover as many unique control flow paths as possible. Blackbox fuzzing doesn't care about the source code at all. It uses some kind of description of the input and generates new tests based on this knowledge. Graybox fuzzing is a hybrid that combines traditional blackbox testing with insights gained from reverse engineering the code.

A few months ago I started writing a fuzzer framework for browser testing, named Fuzzinator, which aims to support the most popular languages on the web. My goal was to support not just many languages but any browser too, which is why I chose the blackbox approach. The basic idea behind the technique is to learn the basic constructs of a particular language and use them to build new, complex test cases.

For the learning phase I used existing tests from different sources. The initial inputs of the tool were imported from validated sources, such as the layout test collection of WebKit, a local copy of the Alexa Top 100, and random tests collected from the web with language-specific crawlers. These tests were tokenized and saved into databases together with the appropriate information. In the generation phase these code pieces are stuck together randomly.
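
To make the two phases concrete, here is a minimal sketch in Python. It is not Fuzzinator's actual code: the regex "tokenizer", the in-memory dictionary standing in for the database, and the names `learn` and `generate` are all assumptions made for illustration, and only simple CSS declarations are handled.

```python
import random
import re
from collections import defaultdict

# Hypothetical tokenizer: picks "property: value;" pairs out of CSS-like text.
DECL = re.compile(r"([a-z-]+)\s*:\s*([^;{}]+);")

def learn(corpus):
    """Learning phase: record every value seen for each property."""
    db = defaultdict(set)  # stands in for the real database
    for test in corpus:
        for prop, value in DECL.findall(test):
            db[prop].add(value.strip())
    return db

def generate(db, n_decls, rng):
    """Generation phase: stick learned pieces together randomly."""
    props = sorted(db)
    decls = []
    for _ in range(n_decls):
        prop = rng.choice(props)
        value = rng.choice(sorted(db[prop]))
        decls.append(f"{prop}: {value};")
    return "div { " + " ".join(decls) + " }"

# Two tiny stand-ins for the imported layout tests and crawled pages.
corpus = [
    "p { color: red; margin: 0 auto; }",
    "a { color: #00f; display: none; }",
]
db = learn(corpus)
test_case = generate(db, 3, random.Random(0))
```

Every declaration in `test_case` is assembled from pieces that occurred somewhere in the corpus, but the combination itself is new.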

In some respects, Fuzzinator resembles LangFuzz (https://users.own-hero.net/~decoder/holler-mthesis-2011.pdf), because both learn from existing examples. However, Fuzzinator does not only take an existing test and replace each of its constructs with a similar one (as LangFuzz does), it also builds new tests from scratch.
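
The mutation side of this comparison can be sketched as follows. This is only an illustration of the "replace each construct with a similar one" idea from the paragraph above, not LangFuzz's or Fuzzinator's real implementation; the `FRAGMENTS` table and the `mutate` function are hypothetical, and "similar" is simplified to "another value learned for the same CSS property".

```python
import random
import re

DECL = re.compile(r"([a-z-]+)\s*:\s*([^;{}]+);")

# Hypothetical fragment database: property -> values learned from existing tests.
FRAGMENTS = {
    "color": ["red", "#00f", "transparent"],
    "display": ["none", "block"],
}

def mutate(test, rng):
    """Keep the structure of an existing test, swap each known value."""
    def swap(match):
        prop = match.group(1)
        candidates = FRAGMENTS.get(prop)
        if not candidates:
            return match.group(0)  # nothing learned for this property: keep it
        return f"{prop}: {rng.choice(candidates)};"
    return DECL.sub(swap, test)

seed_test = "p { color: red; width: 10px; }"
mutated = mutate(seed_test, random.Random(1))
```

The mutated test keeps the selector and the order of the declarations of `seed_test`; only the values change. Building from scratch drops that constraint, so the structure of the test is random as well.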

Learning from valid (or at least existing) examples ensures that the generated tests are syntactically correct. However, experience showed that these tests can fail at the very beginning of execution because of semantic problems. For this reason every supported language has a semantic checker that handles id matching, object and function existence, etc.
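
As an example of what id matching could look like, the sketch below rewrites references to ids that do not exist in the generated test so that they point at ids that do. How Fuzzinator's checkers really work is not described here, so the function `fix_id_references` and both regular expressions are assumptions, and only `href="#..."` and `url(#...)` references are covered.

```python
import random
import re

ID_DEF = re.compile(r'\bid="([^"]+)"')
ID_REF = re.compile(r'(href="#|url\(#)([\w-]+)')

def fix_id_references(test, rng):
    """Point references to undefined ids at ids that exist in the test."""
    defined = ID_DEF.findall(test)
    if not defined:
        return test  # nothing to point at, leave the test unchanged

    def repair(match):
        prefix, ref = match.groups()
        if ref in defined:
            return match.group(0)  # reference is already valid
        return prefix + rng.choice(defined)

    return ID_REF.sub(repair, test)

# Randomly combined pieces: the rect refers to a gradient that was never defined.
generated = (
    '<svg><linearGradient id="g1"/>'
    '<rect fill="url(#missing)"/><use href="#g1"/></svg>'
)
fixed = fix_id_references(generated, random.Random(0))
```

Without this step the dangling `url(#missing)` reference would simply be ignored by the browser, and the part of the test that depends on it would exercise nothing.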

Now I have a working version of Fuzzinator that supports HTML, CSS, SVG and JavaScript. Although it is still under heavy development, it has already found a bunch of bugs. These bugs are collected in Bugzilla under a meta bug: https://bugs.webkit.org/show_bug.cgi?id=116980. If you browse these bugs you can see the biggest strength of such tools: most of the test cases are so crazy that they would never occur to human testers or developers, but they still expose real errors, some of them even exploitable. A few interesting examples from the collection so far:

1) SVG: What do you think happens if you create a really huge path object with small dashes? Yes, memory is allocated for each of the fragments and we simply run out of memory, and it depends only on the given port's implementation whether this causes evil issues or not. (https://bugs.webkit.org/show_bug.cgi?id=106228)

2) The test tries to animate the display property of a circle, but the given value ("bevel") is not valid for this property. Since it is a valid keyword of the stroke-linejoin property, however, it slipped through every checker and caused a crash in CSSParser. (https://bugs.webkit.org/show_bug.cgi?id=105275)

As you can see, Fuzzinator is a great tool for playing dirty tricks on your browser, and you can find further examples under the meta bug that can drive your browser crazy, with even more to come...

In the future I plan further extensions for the fuzzer:

First of all, I need an automated minimizer tool, since the output of the fuzzer is usually quite large (thousands of lines of code) and reducing it by hand takes a long time. For this reason I have reported only half of the bugs found so far :(

Support more languages such as WebGL, MathML, basic XML, etc.

Combine the current monolingual fuzzers.
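
For the minimizer in the first item, one possible starting point is a greedy line-based reducer in the spirit of delta debugging. The sketch below is a simplification, not the full ddmin algorithm and not an existing Fuzzinator component: `minimize` and the toy `crashes` oracle are hypothetical, and in practice the oracle would launch the browser on the candidate test and check whether the same crash still occurs.

```python
def minimize(lines, still_fails):
    """Greedily drop chunks of lines while the failure keeps reproducing."""
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and still_fails(candidate):
                lines = candidate  # the chunk was irrelevant, keep it removed
            else:
                i += chunk         # the chunk is needed, move past it
        chunk //= 2                # retry with smaller chunks
    return lines

# Toy oracle standing in for "run the browser and look for the crash".
def crashes(lines):
    return "<path d='M0 0'/>" in lines and "stroke-dasharray: 1;" in lines

# A large generated test where only two lines matter for the failure.
big_test = [f"<!-- filler {n} -->" for n in range(50)]
big_test.insert(10, "<path d='M0 0'/>")
big_test.insert(40, "stroke-dasharray: 1;")
reduced = minimize(big_test, crashes)
```

Here `reduced` ends up containing only the two lines the oracle depends on. Each call to `still_fails` means one browser run, so starting with large chunks and halving them keeps the number of runs manageable for tests that are thousands of lines long.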

If you have any experience or ideas to share on this topic, or you simply find this fuzzing technique interesting, drop me an email or comment below.