Description

Many computing systems today are written in weakly typed languages such as C and C++. These languages are known to be "unsafe" because they neither prevent nor detect common memory errors such as array bounds violations and pointer cast errors. The presence of such undetected errors has two major implications. First, it makes systems written in these languages unreliable and vulnerable to security attacks. Second, it prevents sound, sophisticated static analyses from being reliably applied to these programs, a problem that has never been solved for ordinary C. Despite these known problems, increasingly complex software continues to be written in these languages because of performance and backwards-compatibility considerations.
This thesis presents a new compiler and run-time system called SAFECode (Static Analysis For safe Execution of Code) that addresses these two problems. First, SAFECode guarantees memory safety for programs in unsafe languages with very low overhead. Second, SAFECode provides a platform for reliable static analyses by ensuring that an aggressive interprocedural pointer analysis, type information, and call graph are never invalidated at run-time due to memory errors. Finally, SAFECode can detect some hard-to-detect memory errors, such as dangling pointer errors, with low overhead for some classes of applications, and it can be used not only during development but also during deployment. SAFECode requires no source code changes, allows memory to be managed explicitly, and does not use metadata on pointers or individual tag bits for memory (avoiding any external library compatibility issues).
This thesis describes the main ideas, insights, and approach that the SAFECode system uses to provide safety guarantees to software written in unsafe languages. It also evaluates the SAFECode approach on several benchmarks and server applications and shows that its performance overhead is lower than that of any other existing approach.

You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).