We present a novel technique, automatic input rectification, and a
prototype implementation, SOAP. SOAP learns a set of constraints
characterizing typical inputs that an application is highly likely to
process correctly. When given an atypical input that does not satisfy
these constraints, SOAP automatically rectifies the input (i.e.,
changes the input so that it satisfies the learned constraints). The
goal is to automatically convert potentially dangerous inputs into
typical inputs that the program is highly likely to process
correctly. Our experimental results show that, for a set of benchmark
applications (namely, Google Picasa, ImageMagick, VLC, Swfdec, and
Dillo), this approach effectively converts malicious inputs (which
successfully exploit vulnerabilities in the application) into benign
inputs that the application processes correctly. Moreover, a manual
code analysis shows that, if an input does satisfy the learned
constraints, it is incapable of exploiting these vulnerabilities. We
also present the results of a user study designed to evaluate the
subjective perceptual quality of outputs from benign but atypical
inputs that have been automatically rectified by SOAP to conform to
the learned constraints. Specifically, we obtained benign inputs that
violate learned constraints, used our input rectifier to obtain
rectified inputs, then paid Amazon Mechanical Turk users to report
their subjective perception of the difference between the
outputs from the original and rectified inputs. The results indicate
that rectification can often preserve much, and in many cases all, of
the desirable data in the original input.
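
The learn-then-rectify workflow described above can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that do not come from the SOAP implementation: an input is taken to be already parsed into named integer fields, each learned constraint is a simple [low, high] range observed over the training inputs, and rectification clamps a violating field to the nearest in-range value. The names `RangeConstraint`, `learn_constraints`, and `rectify` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Assumption: an input is modeled as a mapping from field name to integer value.
Input = Dict[str, int]


@dataclass(frozen=True)
class RangeConstraint:
    """Hypothetical constraint: a field's value must lie in [low, high]."""
    low: int
    high: int

    def satisfied_by(self, value: int) -> bool:
        return self.low <= value <= self.high

    def rectify(self, value: int) -> int:
        # Clamp the value to the nearest bound of the learned range.
        return max(self.low, min(self.high, value))


def learn_constraints(training_inputs: Iterable[Input]) -> Dict[str, RangeConstraint]:
    """Learn a per-field range from inputs assumed to be processed correctly."""
    bounds: Dict[str, Tuple[int, int]] = {}
    for inp in training_inputs:
        for field, value in inp.items():
            low, high = bounds.get(field, (value, value))
            bounds[field] = (min(low, value), max(high, value))
    return {field: RangeConstraint(lo, hi) for field, (lo, hi) in bounds.items()}


def rectify(inp: Input, constraints: Dict[str, RangeConstraint]) -> Tuple[Input, List[str]]:
    """Return a copy of inp that satisfies the constraints, plus the changed fields."""
    rectified = dict(inp)
    changed: List[str] = []
    for field, constraint in constraints.items():
        if field in rectified and not constraint.satisfied_by(rectified[field]):
            rectified[field] = constraint.rectify(rectified[field])
            changed.append(field)
    return rectified, changed


# Usage: learn from typical inputs, then rectify an atypical one.
training = [
    {"width": 640, "height": 480},
    {"width": 1920, "height": 1080},
    {"width": 800, "height": 600},
]
constraints = learn_constraints(training)

# An atypical input with an implausibly large width field.
atypical = {"width": 4294967295, "height": 768}
rectified, changed = rectify(atypical, constraints)
```

In this sketch only the violating field is modified, so an input that already satisfies every constraint passes through unchanged, and an atypical input keeps all of its in-range data. That mirrors the stated goal of preserving as much of the original input as possible, under the assumptions listed above.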