Strings are ubiquitous in software. Tools for veriﬁcation and testing of software rely in various degrees on reasoning about strings. Web applications are particularly important in this context since they tend to be string-heavy and have large number security errors attributable to improper string sanitzation and manipulations. In recent years, many string solvers have been implemented to address the analysis needs of veriﬁcation, testing and security tools aimed at string-heavy applications. These solvers support a basic representation of strings, functions such as concatenation, extraction, and predicates such as equality and membership in regular expressions. However, the syntax and semantics supported by the current crop of string solvers are mutually incompatible. Hence, there is an acute need for a standardized theory of strings (i.e., SMT-LIBization of a theory of strings) that supports a core set of functions, predicates and string representations.
This paper presents a proposal for exactly such a standardization eﬀort, i.e., an SMT-LIBization of strings and regular expressions. It introduces a theory of sequences general-izing strings, and builds a theory of regular expressions on top of sequences. The proposed logic QF BVRE is designed to capture a common substrate among existing tools for string constraint solving.