Alphabet (computer science)

In computer science and mathematical logic, a non-empty set is called an alphabet when it is intended to serve as the source of symbols for string operations.[1][2] Its members are then commonly called symbols or letters, e.g. characters or digits.[1][2] For example, a common alphabet is {0,1}, the binary alphabet. A finite string is a finite sequence of letters from an alphabet; for instance, a binary string is a string drawn from the alphabet {0,1}. An infinite sequence of letters may be constructed from elements of an alphabet as well.

Given an alphabet Σ, we write Σ* to denote the set of all finite strings over the alphabet Σ. Here, the * denotes the Kleene star operator, so Σ* is also called the Kleene closure of Σ. We write Σ^∞ (or occasionally, Σ^ℕ or Σ^ω) to denote the set of all infinite sequences over the alphabet Σ.

For example, using the binary alphabet {0,1}, the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the empty string).
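
The enumeration above can be reproduced with a short Python sketch. The function name `kleene_closure` and the choice to list strings shortest first (and in sorted symbol order within each length) are assumptions made for illustration; the Kleene closure itself is an unordered, infinite set, so the generator below never terminates on its own and is sliced to a finite prefix.

```python
from itertools import count, islice, product


def kleene_closure(alphabet):
    """Yield every finite string over `alphabet`, shortest strings first.

    Hypothetical helper for illustration: symbols are assumed to be
    single-character strings, and the ordering is a presentation choice.
    """
    symbols = sorted(alphabet)
    for length in count(0):  # lengths 0, 1, 2, ...
        # product(..., repeat=0) yields one empty tuple: the empty string.
        for letters in product(symbols, repeat=length):
            yield "".join(letters)


# Take the first eight strings over the binary alphabet {0,1}.
first_eight = list(islice(kleene_closure({"0", "1"}), 8))
print(first_eight)  # ['', '0', '1', '00', '01', '10', '11', '000']
```

The empty string `''` plays the role of ε in the list above.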

If L is a formal language, i.e. a (possibly infinite) set of finite-length strings, the alphabet of L is the set of all symbols that occur in at least one string of L. For example, if L is the set of all variable identifiers in the programming language C, L's alphabet is the set { a, b, c, ..., x, y, z, A, B, C, ..., X, Y, Z, 0, 1, 2, ..., 7, 8, 9, _ }.
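
As a rough illustration, the alphabet of a finite set of strings can be computed by collecting every symbol that appears in it. In the Python sketch below, the helper `alphabet_of` and the sample identifiers are hypothetical; because the sample is only a small finite subset of the C identifiers, its alphabet is a subset of the 63-symbol set given above rather than the whole set.

```python
import string


def alphabet_of(language):
    """Return the set of symbols occurring in at least one string.

    Hypothetical helper: `language` is assumed to be a finite
    collection of Python strings, each character being one symbol.
    """
    return {symbol for word in language for symbol in word}


# The identifier alphabet from the text: letters, digits, underscore.
C_IDENTIFIER_ALPHABET = set(string.ascii_letters + string.digits + "_")

# A made-up finite sample of C-style identifiers.
sample = {"main", "argc", "_tmp1", "MAX_LEN"}
sample_alphabet = alphabet_of(sample)

print(len(C_IDENTIFIER_ALPHABET))                # 63
print(sample_alphabet <= C_IDENTIFIER_ALPHABET)  # True
```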