Definition. Given an alphabet Σ, let S be the set {∅,∪,⋅,*,∩,¬,(,)}, considered to be disjoint from Σ. Let X be the smallest subset of (Σ∪S)* containing the following:

• any regular expression is in X,

• if u,v∈X, then (u∩v),(¬u)∈X.

An element of X is called a generalized regular expression over Σ.

Like regular expressions, generalized regular expressions are designed to represent languages; the symbols ∩ and ¬ are intended to denote set-theoretic intersection and complementation. The language represented by a generalized regular expression is defined as follows:

• if u is a regular expression, then the language represented by u as a generalized regular expression is L(u), the language represented by u as a regular expression;

• if A is represented by u and B is represented by v, then A∩B is represented by (u∩v);

• if A is represented by u, then Σ*-A is represented by (¬u).

By induction, it is easy to see that, given a generalized regular expression u, there is exactly one language represented by u. We denote by L(u) the language represented by u, and by 𝒢ℛ the family of languages represented by generalized regular expressions.
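As an illustration of the inductive definition above, the following minimal sketch models generalized regular expressions as a small syntax tree and decides membership of a word in L(u). It uses Brzozowski derivatives, a technique that is not part of this entry but that happens to extend to ∩ and ¬ without extra machinery. All class and function names (Empty, Sym, Union, Concat, Star, Inter, Not, nullable, deriv, matches) are hypothetical, the empty word is encoded as ∅*, and the input word is assumed to consist only of letters of Σ, so that ¬ is complementation relative to Σ*.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Empty:            # the symbol ∅, representing the empty language
    pass

@dataclass(frozen=True)
class Sym:              # a single letter a ∈ Σ
    a: str

@dataclass(frozen=True)
class Union:            # (u ∪ v)
    u: object
    v: object

@dataclass(frozen=True)
class Concat:           # (u ⋅ v)
    u: object
    v: object

@dataclass(frozen=True)
class Star:             # (u*)
    u: object

@dataclass(frozen=True)
class Inter:            # (u ∩ v)
    u: object
    v: object

@dataclass(frozen=True)
class Not:              # (¬u), complement relative to Σ*
    u: object


def nullable(e):
    """Return True iff the empty word belongs to L(e)."""
    if isinstance(e, (Empty, Sym)):
        return False
    if isinstance(e, Star):
        return True
    if isinstance(e, Union):
        return nullable(e.u) or nullable(e.v)
    if isinstance(e, (Concat, Inter)):
        return nullable(e.u) and nullable(e.v)
    if isinstance(e, Not):
        return not nullable(e.u)
    raise TypeError(e)


def deriv(e, c):
    """Return an expression representing { w : cw ∈ L(e) }."""
    if isinstance(e, Empty):
        return Empty()
    if isinstance(e, Sym):
        # Star(Empty()) represents the language containing only the empty word.
        return Star(Empty()) if e.a == c else Empty()
    if isinstance(e, Union):
        return Union(deriv(e.u, c), deriv(e.v, c))
    if isinstance(e, Inter):
        return Inter(deriv(e.u, c), deriv(e.v, c))
    if isinstance(e, Not):
        return Not(deriv(e.u, c))
    if isinstance(e, Star):
        return Concat(deriv(e.u, c), e)
    if isinstance(e, Concat):
        left = Concat(deriv(e.u, c), e.v)
        return Union(left, deriv(e.v, c)) if nullable(e.u) else left
    raise TypeError(e)


def matches(e, word):
    """Decide whether word ∈ L(e), assuming word uses only letters of Σ."""
    for c in word:
        e = deriv(e, c)
    return nullable(e)


# Example over Σ = {a, b}: words that contain an a but no b.
ANY = Star(Union(Sym("a"), Sym("b")))            # represents Σ*
HAS_A = Concat(Concat(ANY, Sym("a")), ANY)
HAS_B = Concat(Concat(ANY, Sym("b")), ANY)
ONLY_A = Inter(HAS_A, Not(HAS_B))
```

With these definitions, matches(ONLY_A, "aaa") holds while matches(ONLY_A, "ab") and matches(ONLY_A, "") do not. The sketch performs no simplification of intermediate expressions, so it is suitable only for short words.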

Since regular languages are closed under intersection and complementation, generalized regular expressions are, in this regard, no more powerful than regular expressions. The symbols ¬ and ∩ are therefore extraneous. In other words,