Discovering CVE-2019-13504, CVE-2019-13503 and the importance of API Fuzzing

Introduction

Exiv2 is a “c++ metadata library and tools…used by many projects including in KDE, Gnome Desktop as well as many other applications including GIMP, Darktable, shotwell, GwenView and Luminance HDR” (cited from their website). The fix for the issue was committed quickly, and overall the repository looks well maintained.

Mongoose is an embedded web server library, probably used by other open-source projects as well as commercial ones. The bug had not been fixed at the time of writing, and the open-source project does not look well maintained even though it is relatively popular, at least judging by its number of GitHub stars. (EDIT: The bug was addressed and should be fixed soon).

Both bugs were found by fuzzing. We will go through the process of choosing and setting up the right fuzzer, compiling, running and finding the bug, and show how a different setup or fuzzer can produce different results.

AFL vs libFuzzer

There are a lot of good fuzzers for native applications. Two of the most notable are AFL and libFuzzer, which work as coverage-guided fuzzers for C/C++ applications. We won’t discuss binary fuzzing (like afl-qemu, afl-unicorn) here, as we are focused on fuzzing as part of the development cycle, where source code is usually available.

Both fuzzers are very good in terms of the fuzzing engine and heuristics, but they differ in a few ways.

AFL is usually easier to set up and it can work with your “main” command line out-of-the-box. This is attractive both for developers and for security researchers who go through a lot of projects and try to find vulnerabilities.

libFuzzer is part of the LLVM compiler infrastructure project and comes built-in with the clang compiler. Though libFuzzer requires a bit more work to set up, it is ideal for fuzzing specific API calls in a library, and it has better support for the different sanitisers. You can see a basic example on their homepage as well as our example for exiv2 and mongoose in this post.

It is also possible to use the AFL engine with a target function, as in libFuzzer. We won’t go through how to do this in the post.
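To make the idea of a target function concrete, here is a minimal sketch of a libFuzzer harness. The parser parse_record and its length-prefixed format are hypothetical stand-ins and are not part of exiv2 or mongoose; only the LLVMFuzzerTestOneInput entry point is libFuzzer's real interface. With clang it would typically be built with something like clang++ -g -fsanitize=fuzzer,address harness.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical library API (an assumption for illustration only).
// Input format: [1 byte length N][N payload bytes].
// Returns the sum of the payload bytes, or -1 if the input is malformed.
int parse_record(const uint8_t *data, size_t size) {
    if (size < 1) return -1;
    size_t len = data[0];
    if (len > size - 1) return -1;  // the kind of bounds check a buggy parser forgets
    int sum = 0;
    for (size_t i = 0; i < len; ++i) sum += data[1 + i];
    return sum;
}

// libFuzzer calls this function repeatedly with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    int result = parse_record(data, size);
    assert(result >= -1);  // a cheap invariant; ASAN catches the memory errors
    return 0;              // libFuzzer expects 0 from the target
}
```

The harness has no main of its own: libFuzzer supplies it and feeds the buffer directly to the API, which is why no command-line wrapper is needed.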

Exiv2 – AFL setup

We will show a simple example of how to set up AFL for exiv2 (this part won’t find any vulnerabilities, at least not quickly; you can skip to the next section if you want to reproduce the bug). The following instructions were tested on Ubuntu 18.04. You can use docker run -it ubuntu:18.04 /bin/bash and walk through the instructions.

This will build an AFL-instrumented exiv2 binary (from the vulnerable version, as the bug is already fixed in master). Once you run the AFL fuzzer, it won’t find any vulnerabilities, at least not quickly, either because someone has already run this exact fuzzing setup or because there are no “low-hanging” bugs on this path.

We also compiled the target with AFL_USE_ASAN, but it gave similar results (the process is the same, just use a 32-bit binary or increase the memory limit). We will also show later that the crash found by libFuzzer didn’t affect the command-line utility.

Exiv2 – libFuzzer setup

As we will see in this section, fuzzing the command-line tool with AFL is not sufficient, especially for a library. A library is not necessarily used through its command-line tool; applications call the APIs it exports, which may run slightly different code. This is also where the power of libFuzzer comes into play.

The fuzzer is available in a PR which we contributed and which will be merged soon. In the meantime you can see the fuzzer implementation here.

Following are the instructions to build and run the libFuzzer target (the “target” or “harness” is the function that you implement and that is called by the fuzzer).

Pretty quickly you will see a heap-buffer-overflow with a READ of size 4 reported by ASAN:

We can also double-check that this crash indeed doesn’t affect the exiv2 command line by running ./bin/exiv2 ./crash. We get the “failed to read image” output and not the ASAN output, because of a minor difference between the code of the command-line tool and the exported API.
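The following toy sketch illustrates how such a difference can hide a bug from a command-line fuzzer. It is not exiv2's code: api_read, cli_read, the magic header and the entry format are all hypothetical, chosen only to show two entry points that reach the same parser through different checks.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class Outcome { RejectedEarly, ParseError, Parsed };

// Hypothetical deep parser: [1 byte count][count entry bytes].
static Outcome parse_entries(const uint8_t *data, size_t size) {
    if (size < 1) return Outcome::ParseError;
    size_t count = data[0];
    return (size - 1 >= count) ? Outcome::Parsed : Outcome::ParseError;
}

// Hypothetical exported API: hands the caller's buffer straight to the parser.
Outcome api_read(const uint8_t *data, size_t size) {
    return parse_entries(data, size);
}

// Hypothetical command-line path: validates a magic header first, so inputs
// without it are rejected before the parser ever runs.
Outcome cli_read(const uint8_t *data, size_t size) {
    static const uint8_t kMagic[4] = {'I', 'M', 'G', '0'};
    if (size < sizeof kMagic || std::memcmp(data, kMagic, sizeof kMagic) != 0)
        return Outcome::RejectedEarly;
    return parse_entries(data + sizeof kMagic, size - sizeof kMagic);
}
```

In this sketch the input {3, 1} reaches the parser through api_read but is turned away by cli_read, so a bug triggered by that input would only be visible to a fuzzer that calls the API directly.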

Hence this is a good example of why fuzzing exported API functions in C/C++ libraries is important. In terms of best practice, it’s also important not to treat fuzzing as a one-time effort but to have continuous fuzzing in place, where new code gets “fuzzed” (a new verb:)) every time it’s introduced into master, exactly like a unit test.
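One simple way to treat fuzzing like a unit test is to keep previously found inputs and replay them through the harness on every change. The sketch below assumes a hypothetical parser (parse_header) and an in-memory corpus; a real setup would more likely keep the inputs as files and pass them to the fuzzer binary, which runs each given file once.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parser under test: [0x4D marker][1 byte length][payload].
bool parse_header(const uint8_t *data, size_t size) {
    if (size < 2) return false;
    size_t declared = data[1];
    return data[0] == 0x4D && declared <= size - 2;
}

// The same entry point the fuzzer uses.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_header(data, size);
    return 0;
}

// Replays saved inputs (for example, past crashes) through the harness.
// Returns the number of inputs executed; under ASAN a regression would abort.
size_t replay_corpus(const std::vector<std::vector<uint8_t>> &corpus) {
    size_t executed = 0;
    for (const auto &input : corpus) {
        LLVMFuzzerTestOneInput(input.data(), input.size());
        ++executed;
    }
    return executed;
}
```

Because the replay goes through the same LLVMFuzzerTestOneInput function, a fixed bug that reappears fails the build in the same way it would crash the fuzzer.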

Mongoose

You can check our pull-request, which had not been addressed at all at the time of writing.

We won’t go through how to set up the fuzzer, as it’s similar to exiv2 and you can look at the instructions in the pull-request.

Mongoose is a good example of a project where it’s hard to set up fuzzing with AFL, as there is no binary per se, unlike exiv2 or other command-line tools. This is where libFuzzer comes into play: it can fuzz exported, potentially dangerous functions, which usually involve some kind of parsing, like mg_parse_http in the case of mongoose.
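As a rough illustration of fuzzing a parsing function that has no command-line front end, the sketch below wires libFuzzer to a parser with a C-style (buffer, length) interface. toy_parse_request_line is a hypothetical stand-in written for this post, not mongoose's mg_parse_http, and the 4096-byte cap is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a library parser with a (buffer, length) API.
// Expects "METHOD SP URI SP VERSION\r\n". Returns the number of bytes
// consumed, 0 if the line is not complete yet, or -1 if it is malformed.
int toy_parse_request_line(const char *s, int n) {
    if (s == nullptr || n < 0) return -1;
    int spaces = 0;
    for (int i = 0; i < n; ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (c == '\r') {
            if (i + 1 >= n) return 0;             // need one more byte
            if (s[i + 1] != '\n') return -1;
            return spaces == 2 ? i + 2 : -1;      // exactly three fields
        }
        if (c == ' ') ++spaces;
        else if (c < 0x20 || c == 0x7f) return -1; // no control characters
    }
    return 0;
}

// Adapts libFuzzer's (uint8_t*, size_t) input to the parser's (char*, int).
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > 4096) return 0;  // keeps the size_t -> int conversion safe
    int n = static_cast<int>(size);
    int consumed = toy_parse_request_line(reinterpret_cast<const char *>(data), n);
    assert(consumed >= -1 && consumed <= n);  // never claims more than it was given
    return 0;
}
```

The fuzzer's buffer is not NUL-terminated, so a parser that reads past the given length will trip ASAN here even if it appears to work on ordinary C strings.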