Fast and Precise Sanitizer Analysis With BEK

Web applications often use special string-manipulating sanitizers on untrusted user data, but it is difficult to reason manually about the
behavior of these functions, leading to errors. For example, the Internet Explorer cross-site scripting filter turned out to transform some
web pages without JavaScript into web pages with valid JavaScript, enabling attacks. In other cases, sanitizers may fail to commute,
rendering one order of application safe and the other dangerous. BEK is a language and system for writing sanitizers that enables precise
analysis of sanitizer behavior, including checking idempotence, commutativity, and equivalence.

For example, BEK can determine if a target string, such as an entry on the XSS Cheat Sheet, is a valid output of a sanitizer. If so our
analysis synthesizes an input string that yields the target. Our language is expressive enough to capture real web sanitizers used in
ASP.NET, the Internet Explorer XSS Filter, and the Google AutoEscape framework, which we demonstrate by porting these sanitizers to BEK.

Our analyses use a novel symbolic finite automata representation to leverage fast satisfiability modulo theories (SMT) solvers and are
quick in practice, taking fewer than two seconds to check the commutativity of the entire set of Internet Exporer XSS filters, between 36
and 39 seconds to check implementations of HTMLEncode against target strings from the XSS Cheat Sheet, and less than ten seconds to check
equivalence between all pairs of a set of implementations of HTMLEncode. Programs written in BEK can be compiled to traditional languages
such as JavaScript and C#, making it possible for web developers to write sanitizers supported by deep analysis, yet deploy the analyzed
code directly to real applications.