Observations of a Digitally Enlightened Mind

Dissecting the Automatic Patch-Based Exploit Generator

There has been a lot of recent discussion of the Automatic Patch-Based Exploit Generator (APEG) paper, and although it is compelling, it is far from the mass-exploit-generating digital apocalypse one might be led to believe. It is clear that evolving techniques are automating many aspects of what has been a very manual reverse-engineering process. It is also clear that the time available to protect is decreasing dramatically. From Code Red, which had a 6-month lead time from patch to exploit, to recent 0-day and targeted attacks, we are quickly entering an era where traditional techniques are becoming too slow, too cumbersome, and too prone to error or service disruption to be effective.

Looking at the OODA loop (observe, orient, decide, and act), it becomes even clearer that the attacker has the advantage: their time to reverse-engineer a patch or other protection mechanism will almost always be shorter than a defender's time to reverse-engineer an attack. The cost of lost time also falls far more heavily on the defense.

If one factors in cost (c), which includes some measure of difficulty (d), expense (e), and time (t), together with risk, which is some measure of penalty (p) and likelihood (l) of being caught, the results leave little doubt that automatic malware generation will increase not only in sophistication and speed but also in the size of the population it exposes.

Anyway, back to the APEG paper, which states:

However, it is insufficient to simply locate the instructions which have changed between P and P’. In order for APEG to be feasible, one has to solve the harder problem of automatically constructing real inputs which exploit the vulnerability in the original unpatched program.

They go on to describe what looks like vulnerability checking against input-validation errors, not exploit generation. Security researchers, especially those who have developed vulnerability scanning checks, will note the difference:

Our approach to APEG is based on the observation that input-validation bugs are usually fixed by adding the missing sanitization checks. The added checks in P’ identify a) where the vulnerability exists and b) under what conditions an input may exploit the vulnerability. The intuition for our approach is that an input which fails the added check in P’ is likely an exploit in P. Our goal is to 1) identify the checks added in P’, and 2) automatically generate inputs which fail the added checks.
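The quoted idea can be made concrete with a toy sketch in Python. Everything here is hypothetical and invented for illustration: the buffer size, the stand-in programs `p_unpatched` and `p_patched`, and the brute-force enumeration are not taken from the paper, which works on binaries and uses constraint solving rather than guessing. The sketch only shows the shape of the reasoning: take the check added in P’, produce inputs that fail it, and see whether they misbehave in P.

```python
from typing import List

BUF_SIZE = 8  # hypothetical fixed buffer size in the toy program


def p_unpatched(data: bytes) -> str:
    """Toy P: copies input into a fixed-size buffer with no bounds check."""
    buf = bytearray(BUF_SIZE)
    for i, b in enumerate(data):
        # Writing past BUF_SIZE raises IndexError; this stands in for
        # the memory corruption a real unchecked copy would cause.
        buf[i] = b
    return "ok"


def added_check(data: bytes) -> bool:
    """The sanitization check added in P' (what a patch diff would reveal)."""
    return len(data) <= BUF_SIZE


def p_patched(data: bytes) -> str:
    """Toy P': identical to P except for the added check."""
    if not added_check(data):
        return "rejected"
    return p_unpatched(data)


def candidate_triggers(max_len: int) -> List[bytes]:
    """Enumerate simple inputs and keep those that FAIL the added check."""
    candidates = (b"A" * n for n in range(max_len + 1))
    return [c for c in candidates if not added_check(c)]


def crashes_unpatched(data: bytes) -> bool:
    """Confirm that a candidate actually misbehaves in the unpatched P."""
    try:
        p_unpatched(data)
    except IndexError:
        return True
    return False


triggers = candidate_triggers(12)
confirmed = [t for t in triggers if crashes_unpatched(t)]
print(len(triggers), "candidates,", len(confirmed), "confirmed against P")
```

Note what the sketch yields: inputs that trip the bug, which is what a vulnerability check needs. Nothing in it takes control of the program, and that gap is the distinction drawn in this post.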

The approach the paper describes would have been an extremely useful tool for the vulnerability check writing teams at nCircle, Qualys, and the rest of the VA industry. As for automatically generating exploit code, that is possible only if we bound the statement to automatically generating exploit code against input-validation errors.

The paper is still impressive, and I would welcome the opportunity to better understand what I am missing or what will be done with the next evolutionary leap in automating malware generation. In the meantime, organizations must continue to move away from the traditional reactive, ad-hoc, firefighting mode of information security and towards more agile and effective processes and technologies that decrease attack vectors and dramatically reduce the time to protect.

For more detailed analysis of the paper and the reverse-engineering process I would suggest you read the following excellent posts:

This paper promises “automatic patch-based exploit generation”. The paper is a bit overstated, this isn’t possible. By “exploit” the paper does not mean “working exploit”. That’s an important difference. Generating fully functional exploits by reverse engineering a patch takes a lot of steps, this paper automates only one of them, and only in certain cases.

Anyhow, long post, short summary: The APEG paper is really good, but it uses confusing terminology (exploit ~= vulnerability trigger) which leads to its impact on patch distribution being significantly overstated. It’s good work, but the sky isn’t falling, and we are far away from generating reliable exploits automatically from arbitrary patches. APEG does generate usable vulnerability triggers for vulnerabilities of a certain form. And STP-style solvers are important.

The paper describes a toolset that produces exploits from patches almost instantly, and goes on to discuss the implications of instant exploit generation from patches, raising the specter of worms propagating in the hours while patch distribution is still taking place.

However, the toolset that is actually described in the technical details of the paper does not provide that sort of capability. The tool does not only require a patch diff, but also either an input that reaches the vulnerable code, or an indication by the tool’s user of the specific locations where the attacker controlled data that ultimately exercises the vulnerable code is input into the program. From that information the tool produces a set of inputs that would be rejected by the patched version.
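The dependence on a starting input, described in the excerpt above, can also be sketched with a toy in Python. The message format, the `MAGIC` header, and all function names below are made up for illustration and do not come from the paper or its tool. The assumption is a parser that only runs the patched check on messages carrying the right header, so guesses that never reach that code path tell us nothing, while mutating a seed that does reach it quickly produces inputs the patched version rejects.

```python
from typing import List

MAGIC = b"HDR1"  # hypothetical header the toy parser requires
BUF_SIZE = 8     # hypothetical limit enforced by the added check


def reaches_check(msg: bytes) -> bool:
    """Only messages with the right header reach the patched code path."""
    return msg[:4] == MAGIC


def rejected_by_patch(msg: bytes) -> bool:
    """True when the message reaches the added check and fails it."""
    if not reaches_check(msg):
        return False  # the added check never runs for this message
    body = msg[4:]
    return len(body) > BUF_SIZE


def mutate_from_seed(seed: bytes, max_extra: int) -> List[bytes]:
    """Keep the seed's header and grow only the attacker-controlled body."""
    header, body = seed[:4], seed[4:]
    return [header + body + b"A" * k for k in range(1, max_extra + 1)]


# Without a seed: payloads of many lengths, none carrying the header.
blind = [b"A" * n for n in range(20)]
blind_hits = [m for m in blind if rejected_by_patch(m)]

# With a seed that is known to reach the vulnerable code.
seed = MAGIC + b"hello"
seeded = mutate_from_seed(seed, 8)
seeded_hits = [m for m in seeded if rejected_by_patch(m)]

print("blind hits:", len(blind_hits), "seeded hits:", len(seeded_hits))
```

In this toy the header-less guesses produce no rejected inputs at all, while the seeded mutations produce several. That is a much weaker capability than generating exploits from a patch diff alone.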