Practical secure information flow in programming languages

Abstract

If we classify variables in a program into various security levels, then a secure information flow analysis aims to verify statically that information in a program can flow only in ways consistent with the specified security levels. One well-studied approach is to formulate the rules of the secure information flow analysis as a type system. A major trend of recent research focuses on how to accommodate various sophisticated modern language features. However, this approach often leads to overly complicated and restrictive type systems, making them unfit for practical use. Also, problems essential to practical use, such as type inference and error reporting, have received little attention. This dissertation identified and solved major theoretical and practical hurdles to the application of secure information flow. ^ We adopted a minimalist approach to designing our language to ensure a simple lenient type system. We started out with a small simple imperative language and only added features that we deemed most important for practical use. One language feature we addressed is arrays. Due to the various leaking channels associated with array operations, arrays have received complicated and restrictive typing rules in other secure languages. We presented a novel approach for lenient array operations, which lead to simple and lenient typing of arrays. ^ Type inference is necessary because usually a user is only concerned with the security types for input/output variables of a program and would like to have all types for auxiliary variables inferred automatically. We presented a type inference algorithm B and proved its soundness and completeness. Moreover, algorithm B stays close to the program and the type system and therefore facilitates informative error reporting that is generated in a cascading fashion. Algorithm B and error reporting have been implemented and tested. ^ Lastly, we presented a novel framework for developing applications that ensure user information privacy. In this framework, core computations are defined as code modules that involve input/output data from multiple parties. Incrementally, secure flow policies are refined based on feedback from the type checking/inference. Core computations only interact with code modules from involved parties through well-defined interfaces. All code modules are digitally signed to ensure their authenticity and integrity. ^