First-order definable languages


Volker Diekert (1), Paul Gastin (2)

(1) Institut für Formale Methoden der Informatik, Universität Stuttgart, Universitätsstraße, Stuttgart, Germany
(2) Laboratoire Spécification et Vérification, École Normale Supérieure de Cachan, 61, avenue du Président Wilson, Cachan Cedex, France

Abstract. We give an essentially self-contained presentation of some principal results for first-order definable languages over finite and infinite words. We introduce the notion of a counter-free Büchi automaton; and we relate counter-freeness to aperiodicity and to the notion of very weak alternation. We also show that aperiodicity of a regular ω-language can be decided in polynomial space, if the language is specified by some Büchi automaton.

1 Introduction

The study of regular languages is one of the most important areas in formal language theory. It relates logic, combinatorics, and algebra to automata theory; and it is widely applied in all branches of computer science. Moreover it is the core for generalizations, e.g., to tree automata [26] or to partially ordered structures such as Mazurkiewicz traces [6]. In the present contribution we treat first-order languages over finite and infinite words. First-order definability leads to a subclass of regular languages and again: it relates logic, combinatorics, and algebra to automata theory; and it is also widely applied in all branches of computer science. Let us mention that first-order definability for Mazurkiewicz traces leads essentially to the same picture as for words (see, e.g., [5]), but nice characterizations for first-order definable sets of trees are still missing.

The investigation of first-order languages has been of continuous interest over the past decades and many important results are related to the efforts

We would like to thank the anonymous referee for the detailed report.

Jörg Flum, Erich Grädel, Thomas Wilke (eds.). Logic and Automata: History and Perspectives.
Texts in Logic and Games 2, Amsterdam University Press 2007.

of Wolfgang Thomas [31, 32, 33, 34, 35]. We also refer to his influential contributions in the handbooks of Theoretical Computer Science [36] and of Formal Languages [37]. We do not compete with these surveys. Our plan is more modest. We try to give a self-contained presentation of some of the principal characterizations of first-order definable languages in a single paper. This covers description with star-free expressions, recognizability by aperiodic monoids, and definability in linear temporal logic. We also introduce the notion of a counter-free Büchi automaton, which is somewhat missing in the literature so far. We relate counter-freeness to the aperiodicity of the transformation monoid. We also show that first-order definable languages can be characterized by very weak alternating automata using the concept of aperiodic automata. In some sense the main focus of our paper is the explanation of the following theorem.

Theorem 1.1. Let L be a language of finite or infinite words over a finite alphabet. Then the following assertions are equivalent:
1. L is first-order definable.
2. L is star-free.
3. L is aperiodic.
4. L is definable in the linear temporal logic LTL.
5. L is first-order definable with a sentence using at most 3 names for variables.
6. L is accepted by some counter-free Büchi automaton.
7. L is accepted by some aperiodic Büchi automaton.
8. L is accepted by some very weak alternating automaton.

Besides, the paper covers related results. The translation from first-order to LTL leads in fact to the pure future fragment of LTL, i.e., the fragment without any past tense operators. This leads to the separation theorem for first-order formulae in one free variable as we shall demonstrate in Section 9. We also show that aperiodicity (i.e., first-order definability) of a regular ω-language can be decided in polynomial space, if the language is specified by some Büchi automaton.
Although the paper became much longer than expected, we know that much more could be said. We apologize if the reader's favorite theorem is not covered in our survey. In particular, we do not speak about varieties, and we gave up the project to cover principal results about the fragment

of first-order logic which corresponds to unary temporal logic. These diamonds will continue to shine, but not here, and we refer to [30] for more background. As mentioned above, we use Büchi automata, but we do not discuss deterministic models such as deterministic Muller automata.

The history of Theorem 1.1 is related to some of the most influential scientists in computer science. The general scheme is that the equivalences above have been proved first for finite words. After that, techniques were developed to generalize these results to infinite words. Each time, the generalization to infinite words has been non-trivial and asked for new ideas. Perhaps the underlying reason for this additional difficulty is the fact that the subset construction fails for infinite words. Other people may say that the difficulty arises from the fact that regular ω-languages are not closed in the Cantor topology. The truth is that combinatorics on infinite objects is more complicated.

The equivalence of first-order definability and star-freeness for finite words is due to McNaughton and Papert [19]. The generalization to infinite words is due to Ladner [15] and Thomas [31, 32]. These results have been refined, e.g., by Perrin and Pin in [24]. Based on the logical framework of Ehrenfeucht-Fraïssé games, Thomas also related the quantifier depth to the so-called dot-depth hierarchy [33, 35]. Taking not only the quantifier alternation into account, but also the length of quantifier blocks, one gets even finer results as studied by Blanchet-Sadri in [2]. The equivalence of star-freeness and aperiodicity for finite words is due to Schützenberger [28]. The generalization to infinite words is due to Perrin [23] using the syntactic congruence of Arnold [1]. These results are the basis allowing to decide whether a regular language is first-order definable. Putting these results together one sees that statements 1, 2, and 3 in Theorem 1.1 are equivalent.
From the definition of LTL it is clear that linear temporal logic describes a fragment of FO^3, where the latter means the family of first-order definable languages where the defining sentence uses at most three names for variables. Thus, the implications from 4 to 5 and from 5 to 1 are trivial. The highly non-trivial step is to conclude from 1 (or 2 or 3) to 4. This is usually called Kamp's Theorem and is due to Kamp [13] and Gabbay, Pnueli, Shelah, and Stavi [9]. In this survey we follow the algebraic proof of Wilke which is in his habilitation thesis [38] and which is also published in [39]. Wilke gave the proof for finite words only. In order to generalize it to infinite words we use the techniques from [5], which were developed to handle Mazurkiewicz traces. Cutting down this proof to the special case of finite or infinite words leads to the proof presented here. It is still the most complicated part of the paper, but again some of the technical difficulties lie in the combinatorics of infinite words, which is subtle. Restricting the proof further to finite words,

the reader might hopefully find the simplest way to pass from aperiodic languages to LTL. But this is also a matter of taste, of course.

Every first-order sentence can be translated to a formula in FO^3. This is sharp, because it is known that there are first-order properties which are not expressible in FO^2, which characterizes unary temporal logic [7] over infinite words.

The equivalence between definability in monadic second-order logic, regular languages, and acceptance by Büchi automata is due to Büchi [3]. However, Büchi automata are inherently non-deterministic. In order to have deterministic automata one has to move to other acceptance conditions such as Muller or Rabin-Streett conditions. This important result is due to McNaughton, see [18]. Based on this, Thomas [32] extended the notion of deterministic counter-free automaton to deterministic counter-free automaton with Rabin-Streett condition and obtained thereby another characterization of first-order definable ω-languages. There is no canonical object for a minimal Büchi automaton, which might explain why a notion of counter-free Büchi automaton has not been introduced so far. On the other hand, there is a quite natural notion of counter-freeness as well as of aperiodicity for non-deterministic Büchi automata. (Aperiodic non-deterministic finite automata are defined in [16], too.) For non-deterministic automata, aperiodicity describes a larger class of automata, but both counter-freeness and aperiodicity can be used to characterize first-order definable ω-languages. This is shown in Section 11 and seems to be an original part of the paper. We have also added a section about very weak alternating automata. The notion of weak alternating automaton is due to Muller, Saoudi, and Schupp [21].
A very weak alternating automaton is a special kind of weak alternating automaton, and this notion has been introduced in the PhD thesis of Rhode [27] in the more general context of ordinals. (In the paper by Löding and Thomas [17] these automata are called linear alternating.) Section 13 shows that very weak alternating automata characterize first-order definability as well. More precisely, we have a cycle from 3 to 6 to 7 and back to 3, and we establish a bridge from 4 to 8 and from 8 to 7.

It was shown by Stern [29] that deciding whether a deterministic finite automaton accepts an aperiodic language over finite words can be done in polynomial space, i.e., in PSPACE. Later Cho and Huynh showed in [4] that this problem is actually PSPACE-complete. So, the PSPACE-hardness transfers to (non-deterministic) Büchi automata. It might belong to folklore that the PSPACE upper bound holds for Büchi automata, too; but we did not find any reference. So we prove this result here (see the corresponding proposition below).

As said above, our intention was to give simple proofs for existing results. But simplicity is not a simple notion. Therefore for some results we present two proofs. The proofs are either based on a congruence lemma

established for first-order logic in Section 10.1, or they are based on a splitting lemma established for star-free languages in Section 3.1. Depending on his background, the reader may wish to skip one approach.

2 Words, first-order logic, and basic notations

By P we denote a unary predicate taken from some finite set of atomic propositions, and x, y, ... denote variables which represent positions in finite or infinite words. The syntax of first-order logic uses the symbol ⊥ for false and has atomic formulae of type P(x) and x < y. We allow Boolean connectives and first-order quantification. Thus, if ϕ and ψ are first-order formulae, then ¬ϕ, ϕ ∨ ψ, and ∃x ϕ are first-order formulae, too. As usual we have derived formulae such as x ≤ y, x = y, ϕ ∧ ψ = ¬(¬ϕ ∨ ¬ψ), ∀x ϕ = ¬∃x ¬ϕ, and so on.

We let Σ be a finite alphabet. The relation between Σ and the set of unary predicates is that for each letter a ∈ Σ and each predicate P the truth value P(a) must be well-defined. So, we always assume this. Whenever convenient we include for each letter a a predicate P_a such that P_a(b) is true if and only if a = b. We could assume that all predicates are of the form P_a, but we feel more flexible in not making this assumption. If x is a position in a word with label a ∈ Σ, then P(x) is defined by P(a).

By Σ* (resp. Σ^ω) we mean the set of finite (resp. infinite) words over Σ, and we let Σ^∞ = Σ* ∪ Σ^ω. The length of a word w is denoted by |w|; it is a natural number or ω. A language is a set of finite or infinite words. Formulae without free variables are sentences. A first-order sentence defines a subset of Σ^∞ in a natural way. Let us consider a few examples. We can specify that the first position is labeled by a letter a using ∃x ∀y (P_a(x) ∧ x ≤ y). We can say that each occurrence of a is immediately followed by b with the sentence ∀x (P_a(x) → ∃y (x < y ∧ P_b(y) ∧ ¬∃z (x < z ∧ z < y))). We can also say that the direct successor of each b is the letter a.
Hence the language (ab)^ω is first-order definable. We can also say that a last position in a word exists and that this position is labeled b. For a ≠ b this leads almost directly to a definition of (ab)*. But (aa)* cannot be defined with a first-order sentence. A formal proof of this statement is postponed, but at least it should be clear that we cannot define (aa)* the same way as we did for (ab)*, because we have no control that the length of a word in a* is even.

The set of positions pos(w) is defined by pos(w) = {i ∈ N | 0 ≤ i < |w|}. We think of pos(w) as a linear order where each position i is labeled with λ(i) ∈ Σ, and w = λ(0)λ(1)··· A k-structure means here a pair (w, p), where w ∈ Σ^∞ is a finite or infinite word and p = (p_1, ..., p_k) is a k-tuple of positions in pos(w). The set of all k-structures is denoted by Σ^∞(k), and the subset of finite structures is denoted by Σ*(k). For simplicity we identify Σ^∞ with Σ^∞(0).
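The quantifier structure of the second example sentence can be mirrored directly by nested iterations over positions. The following is a minimal sketch of ours (not from the paper), for finite words only; the function name is our own choice.

```python
# Sketch (ours): brute-force evaluation, on finite words, of the example
# sentence "each occurrence of a is immediately followed by b", i.e.
#   forall x (Pa(x) -> exists y (x < y and Pb(y) and not exists z (x < z < y))).

def a_followed_by_b(w: str) -> bool:
    n = len(w)
    return all(
        any(
            w[y] == "b" and not any(x < z < y for z in range(n))
            for y in range(x + 1, n)
        )
        for x in range(n)
        if w[x] == "a"
    )

print(a_followed_by_b("abab"))   # True: each a has direct successor b
print(a_followed_by_b("aab"))    # False: the first a is followed by a
```

Each quantifier of the sentence becomes one `all`/`any` over `range(n)`, so the code is a direct transcription of the formula rather than an efficient algorithm.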

Let x be a k-tuple (x_1, ..., x_k) of variables and ϕ be a first-order formula where all free variables are in the set {x_1, ..., x_k}. The semantics of (w, (p_1, ..., p_k)) |= ϕ is defined as usual: It is enough to give a semantics to atomic formulae, and (w, (p_1, ..., p_k)) |= P(x_i) means that the label of position p_i satisfies P, and (w, (p_1, ..., p_k)) |= x_i < x_j means that position p_i is before position p_j, i.e., p_i < p_j. With every formula we can associate its language by

L(ϕ) = {(w, p) ∈ Σ^∞(k) | (w, p) |= ϕ}.

In order to be precise we should write L_{Σ,k}(ϕ), but if the context is clear, we omit the subscript Σ, k.

Definition 2.1. By FO(Σ^∞) (resp. FO(Σ*)) we denote the set of first-order definable languages in Σ^∞ (resp. Σ*), and by FO we denote the family of all first-order definable languages. Analogously, we define the families FO^n(Σ^∞), FO^n(Σ*), and FO^n by allowing only those formulae which use at most n different names for variables.

3 Star-free sets

For languages K, L ⊆ Σ^∞ we define the concatenation by K · L = {uv | u ∈ K ∩ Σ*, v ∈ L}. The n-th power of L is defined inductively by L^0 = {ε} and L^{n+1} = L · L^n. The Kleene star of L is defined by L* = ⋃_{n≥0} L^n. Finally, the ω-iteration of L is L^ω = {u_0 u_1 u_2 ··· | u_i ∈ L ∩ Σ* for all i ≥ 0}.

We are interested here in families of regular languages, also called rational languages. In terms of expressions it is the smallest family of languages which contains all finite subsets, which is closed under finite union and concatenation, and which is closed under the Kleene star (and ω-power). The relation to finite automata (resp. Büchi automata) is treated in Section 11. For the main results on first-order languages the notion of a Büchi automaton is actually not needed. The Kleene star and the ω-power do not preserve first-order definability, hence we consider subclasses of regular languages. A language is called star-free if we do not allow the Kleene star, but we allow complementation.
Therefore we have all Boolean operations. In terms of expressions the class of star-free languages is the smallest family of languages in Σ^∞ (resp. Σ*)

which contains Σ^∞ (resp. Σ*), all singletons {a} for a ∈ Σ, and which is closed under finite union, complementation, and concatenation. It is well-known that regular languages are closed under complement (1), hence star-free languages are regular.

As a first example we note that for every A ⊆ Σ the set A* (of finite words containing only letters from A) is also star-free. We have:

A* = Σ* \ (Σ* (Σ \ A) Σ*).

In particular, {ε} = ∅* is star-free. Some other expressions with star are also in fact star-free. For example, for a ≠ b we obtain:

(ab)* = (aΣ* ∩ Σ*b) \ Σ* (Σ^2 \ {ab, ba}) Σ*.

The above equality does not hold if a = b. Actually, (aa)* is not star-free. Probably the best way to see that (aa)* is not star-free is to show (by structural induction) that for all star-free languages L there is a constant n ∈ N such that for all words x we have x^n ∈ L if and only if x^{n+1} ∈ L. This property is essentially aperiodicity, and we shall prove the equivalence between star-free sets and aperiodic languages later. Since (ab)* is star-free (for a ≠ b), but (aa)* is not, we see that a projection of a star-free set is not star-free, in general.

Definition 3.1. By SF(Σ^∞) (resp. SF(Σ*)) we denote the set of star-free languages in Σ^∞ (resp. Σ*), and by SF we denote the family of all star-free languages.

An easy exercise (left to the interested reader) shows that

SF(Σ*) = {L ∩ Σ* | L ∈ SF(Σ^∞)} = {L ⊆ Σ* | L ∈ SF(Σ^∞)}.

3.1 The splitting lemma

A star-free set admits a canonical decomposition given a partition of the alphabet. This will be shown here; it is used to prove that first-order languages are star-free in Section 4 and for the separation theorem in Section 9. The alternative to this section is explained in Section 10, where the standard way of using the congruence lemma is explained. Thus, there is an option to skip this section.

Lemma 3.2. Let A, B ⊆ Σ be disjoint subalphabets. If L ∈ SF(Σ^∞) then we can write

L ∩ B*AB^∞ = ⋃_{1≤i≤n} K_i a_i L_i

where a_i ∈ A, K_i ∈ SF(B*) and L_i ∈ SF(B^∞) for all 1 ≤ i ≤ n.
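The aperiodicity test sketched above can be tried out concretely on the two examples. The following sketch is ours (not from the paper): it checks, for a fixed n and a sample of words x, whether x^n ∈ L if and only if x^{n+1} ∈ L; for (ab)* this holds already with n = 2, while for (aa)* the word x = a is a counterexample for every n.

```python
# Sketch (ours): the condition  x^n in L  <=>  x^(n+1) in L, tested for a
# fixed n on a finite sample of words x.

def in_ab_star(w):          # membership in (ab)*
    return len(w) % 2 == 0 and all(c == "ab"[i % 2] for i, c in enumerate(w))

def in_aa_star(w):          # membership in (aa)*
    return len(w) % 2 == 0 and set(w) <= {"a"}

def aperiodic_at(n, member, words):
    return all(member(x * n) == member(x * (n + 1)) for x in words)

words = ["", "a", "b", "ab", "ba", "aa", "abab"]
print(aperiodic_at(2, in_ab_star, words))   # True for this sample
print(aperiodic_at(2, in_aa_star, words))   # False: x = "a" separates a^2, a^3
```

The witness x = "a" shows why no n can ever work for (aa)*: the words a^n and a^{n+1} always differ in the parity of their length.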
1 We do not need this standard result here.

Proof. Since B*AB^∞ = ⋃_{a∈A} B*aB^∞, it is enough to show the result when A = {a}. The proof is by induction on the star-free expression and also on the alphabet size. (Note that |B| < |Σ|.) The result holds for the basic star-free sets: If L = {a} with a ∈ A then L ∩ B*AB^∞ = {ε}a{ε}. If L = {b} with b ∉ A then L ∩ B*AB^∞ = ∅ (or we let n = 0). If L = Σ^∞ then L ∩ B*AB^∞ = B*AB^∞.

The inductive step is clear for union. For concatenation, the result follows from

(L · L′) ∩ B*AB^∞ = (L ∩ B*AB*) · (L′ ∩ B^∞) ∪ (L ∩ B*) · (L′ ∩ B*AB^∞).

It remains to deal with the complement Σ^∞ \ L of a star-free set. By induction, we have L ∩ B*aB^∞ = ⋃_{1≤i≤n} K_i a L_i. If some K_i and K_j are not disjoint (for i ≠ j), then we can rewrite

K_i a L_i ∪ K_j a L_j = (K_i \ K_j) a L_i ∪ (K_j \ K_i) a L_j ∪ (K_i ∩ K_j) a (L_i ∪ L_j).

We can also add (B* \ ⋃_i K_i) a ∅ in case ⋃_i K_i is strictly contained in B*. Therefore, we may assume that {K_i | 1 ≤ i ≤ n} forms a partition of B*. This yields:

(Σ^∞ \ L) ∩ B*aB^∞ = ⋃_{1≤i≤n} K_i a (B^∞ \ L_i).

q.e.d.

4 From first-order to star-free languages

This section shows that first-order definable languages are star-free languages. The transformation is involved in the sense that the resulting expressions are, in general, much larger than the size of the formula. The proof presented here is based on the splitting lemma. The alternative is again in Section 10.

Remark 4.1. The converse, that star-free languages are first-order definable, can be proved directly. Although strictly speaking we do not use this fact, we give an indication how it works. It is enough to give a sentence for languages of type L = L(ϕ) · a · L(ψ). We may assume that the sentences ϕ and ψ use different variable names. Then we can describe L as a language L(ξ) where ξ = ∃z (P_a(z) ∧ ϕ^{<z} ∧ ψ^{>z}), where ϕ^{<z} and ψ^{>z} relativize all variables with respect to the position of z. We do not go into more details, because, as said above, we do not need this fact.

We have to deal with formulae having free variables. We first provide another semantics of a formula with free variables, in a set of words over an extended alphabet allowing to encode the assignment. This will also be useful to derive the separation theorem in Section 9.

Let V be a finite set of variables. We define Σ_V = Σ × {0, 1}^V. (Do not confuse Σ_V with Σ^∞(k) from above.) Let w ∈ Σ^∞ be a word and σ be an assignment from the variables in V to the positions in w; thus 0 ≤ σ(x) < |w| for all x ∈ V. The pair (w, σ) can be encoded as a word over Σ_V. More precisely, if w = a_0 a_1 a_2 ··· then (w, σ) = (a_0, τ_0)(a_1, τ_1)(a_2, τ_2) ··· where for all 0 ≤ i < |w| we have τ_i(x) = 1 if and only if σ(x) = i. We let N_V ⊆ Σ_V^∞ be the set of words (w, σ) such that w ∈ Σ^∞ and σ is an assignment from V to the positions in w. We show that N_V is star-free. For x ∈ V, let Σ_V^{x=1} be the set of pairs (a, τ) with τ(x) = 1 and let Σ_V^{x=0} = Σ_V \ Σ_V^{x=1} be its complement. Then,

N_V = ⋂_{x∈V} (Σ_V^{x=0})* Σ_V^{x=1} (Σ_V^{x=0})^∞.

Given a first-order formula ϕ and a set V containing all free variables of ϕ, we define the semantics [ϕ]_V ⊆ N_V inductively:

[P_a(x)]_V = {(w, σ) ∈ N_V | w = b_0 b_1 b_2 ··· ∈ Σ^∞ and b_{σ(x)} = a}
[x < y]_V = {(w, σ) ∈ N_V | σ(x) < σ(y)}
[∃x ϕ]_V = {(w, σ) ∈ N_V | ∃i, 0 ≤ i < |w|, (w, σ[x ↦ i]) ∈ [ϕ]_{V∪{x}}}
[ϕ ∨ ψ]_V = [ϕ]_V ∪ [ψ]_V
[¬ϕ]_V = N_V \ [ϕ]_V.

Proposition 4.2. Let ϕ be a first-order formula and V be a set of variables containing the free variables of ϕ. Then, [ϕ]_V ∈ SF(Σ_V^∞).

Proof. The proof is by induction on the formula. We have

[P_a(x)]_V = N_V ∩ (Σ_V* {(a, τ) | τ(x) = 1} Σ_V^∞)
[x < y]_V = N_V ∩ (Σ_V* Σ_V^{x=1} Σ_V* Σ_V^{y=1} Σ_V^∞).

The induction is trivial for disjunction and negation since the star-free sets form a Boolean algebra and N_V is star-free. The interesting case is the existential quantification [∃x ϕ]_V. We assume first that x ∉ V and we let V′ = V ∪ {x}. By induction, [ϕ]_{V′} is star-free and we can apply Lemma 3.2 with the sets A = Σ_{V′}^{x=1} and B = Σ_{V′}^{x=0}. Note that N_{V′} ⊆ B*AB^∞.
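The encoding of a pair (w, σ) as a word over the extended alphabet Σ_V can be written out directly. The following is a small sketch of ours (not from the paper); the function name and the tuple representation of τ are our own choices.

```python
# Sketch (ours): encode a word w together with an assignment sigma of
# variables to positions as a word over Sigma_V = Sigma x {0,1}^V,
# following the definition above: tau_i(x) = 1 iff sigma(x) = i.

def encode(w, sigma, variables):
    assert all(0 <= sigma[x] < len(w) for x in variables)
    return [
        (a, tuple((x, 1 if sigma[x] == i else 0) for x in sorted(variables)))
        for i, a in enumerate(w)
    ]

word = "abba"
enc = encode(word, {"x": 0, "y": 2}, {"x", "y"})
print(enc[0])   # ('a', (('x', 1), ('y', 0)))
print(enc[2])   # ('b', (('x', 0), ('y', 1)))
```

Each letter of the encoded word carries one bit per variable, and a word over Σ_V lies in N_V exactly when each bit column contains the value 1 exactly once.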
Hence, [ϕ]_{V′} = [ϕ]_{V′} ∩ B*AB^∞ and we obtain [ϕ]_{V′} = ⋃_{1≤i≤n} K_i a_i L_i where a_i ∈ A, K_i ∈ SF(B*) and L_i ∈ SF(B^∞) for all i. Let π : B → Σ_V be the bijective renaming defined

by π(a, τ) = (a, τ|_V). Star-free sets are not preserved by projections, but they are indeed preserved by bijective renamings. Hence, K′_i = π(K_i) ∈ SF(Σ_V*) and L′_i = π(L_i) ∈ SF(Σ_V^∞). We also rename a_i = (a, τ) into a′_i = (a, τ|_V). We have [∃x ϕ]_V = ⋃_{1≤i≤n} K′_i a′_i L′_i and we deduce that [∃x ϕ]_V ∈ SF(Σ_V^∞). Finally, if x ∈ V then we choose a new variable y ∉ V and we let U = (V \ {x}) ∪ {y}. From the previous case, we get [∃x ϕ]_U ∈ SF(Σ_U^∞). To conclude, it remains to rename y to x. q.e.d.

Corollary 4.3. We have: FO(Σ*) ⊆ SF(Σ*) and FO(Σ^∞) ⊆ SF(Σ^∞).

5 Aperiodic languages

Recall that a monoid (M, ·) is a non-empty set M together with a binary operation · such that (x · y) · z = x · (y · z), and with a neutral element 1 ∈ M such that x · 1 = 1 · x = x for all x, y, z in M. Frequently we write xy instead of x · y. A morphism (or homomorphism) between monoids M and M′ is a mapping h : M → M′ such that h(1) = 1 and h(x · y) = h(x) · h(y).

We use the algebraic notion of recognizability and the notion of aperiodic languages. Recognizability is defined as follows. Let h : Σ* → M be a morphism to a finite monoid M. Two words u, v ∈ Σ^∞ are said to be h-similar, denoted by u ∼_h v, if for some n ∈ N ∪ {ω} we can write u = ∏_{0≤i<n} u_i and v = ∏_{0≤i<n} v_i with u_i, v_i ∈ Σ^+ and h(u_i) = h(v_i) for all 0 ≤ i < n. The notation u = ∏_{0≤i<n} u_i refers to an ordered product; it means a factorization u = u_0 u_1 ··· In other words, u ∼_h v if either u = v = ε, or u, v ∈ Σ^+ and h(u) = h(v), or u, v ∈ Σ^ω and there are factorizations u = u_0 u_1 ···, v = v_0 v_1 ··· with u_i, v_i ∈ Σ^+ and h(u_i) = h(v_i) for all i ≥ 0. The transitive closure of ∼_h is denoted by ≈_h; it is an equivalence relation. For w ∈ Σ^∞, we denote by [w]_h the equivalence class of w under ≈_h. Thus, [w]_h = {u | u ≈_h w}. In case there is no ambiguity, we simply write [w] instead of [w]_h. Note that there are three cases: [w] = {ε}, [w] ⊆ Σ^+, and [w] ⊆ Σ^ω.

Definition 5.1. We say that a morphism h : Σ* → M recognizes L if w ∈ L implies [w]_h ⊆ L for all w ∈ Σ^∞.
Thus, a language L ⊆ Σ^∞ is recognized by h if and only if L is saturated by ∼_h (or equivalently by ≈_h). Note that we may assume that a recognizing morphism h : Σ* → M is surjective, whenever convenient.
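On finite words, recognizability is easy to make concrete: a morphism into a finite monoid sorts words into finitely many classes, and the language must be a union of such classes. The following sketch is ours (not from the paper); it uses the three-element monoid {1, a, 0} with a·a = 0 and 0 absorbing, which recognizes the singleton language {"a"}.

```python
# Sketch (ours): recognizability by a morphism on finite words.  A finite
# word over {a, b} belongs to the language {"a"} iff its image under h is
# the monoid element "a", so L = h^{-1}("a").

MUL = {  # multiplication table of the aperiodic monoid {1, a, 0}
    ("1", "1"): "1", ("1", "a"): "a", ("1", "0"): "0",
    ("a", "1"): "a", ("a", "a"): "0", ("a", "0"): "0",
    ("0", "1"): "0", ("0", "a"): "0", ("0", "0"): "0",
}

def h(word):                 # morphism h : {a, b}* -> {1, a, 0}
    m = "1"
    for c in word:
        m = MUL[(m, "a" if c == "a" else "0")]  # letters other than a map to 0
    return m

print(h("a"))    # 'a'  -> the only word in h^{-1}('a')
print(h("aa"))   # '0'
print(h(""))     # '1'
```

Any word other than "a" either contains a letter mapping to the zero or contains at least two a's, so its image is 0 (or 1 for the empty word).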

can be written as a finite union of languages of type UV^ω, where U, V ⊆ Σ* are recognized by h and where moreover U = h^{-1}(s) and V = h^{-1}(e) for some s, e ∈ M with se = s and e^2 = e. In particular, we have UV ⊆ U and VV ⊆ V. Since {ε}^ω = {ε}, the statement holds for L ⊆ Σ^∞ and L ⊆ Σ* as well.

A (finite) monoid M is called aperiodic if for all x ∈ M there is some n ∈ N such that x^n = x^{n+1}.

Definition 5.4. A language L ⊆ Σ^∞ is called aperiodic if it is recognized by some morphism to a finite and aperiodic monoid. By AP(Σ^∞) (resp. AP(Σ*)) we denote the set of aperiodic languages in Σ^∞ (resp. Σ*), and by AP we denote the family of aperiodic languages.

6 From star-freeness to aperiodicity

Corollary 4.3 (as well as Proposition 10.3) tells us that all first-order definable languages are star-free. We want to show that all star-free languages are recognized by aperiodic monoids. Note that the trivial monoid recognizes the language Σ^∞; actually it recognizes all eight Boolean combinations of {ε}, Σ^+, and Σ^ω. Consider next a letter a. The smallest recognizing monoid of the singleton {a} is aperiodic; it has just three elements 1, a, 0 with a · a = 0, and 0 is a zero, which means x · y = 0 as soon as 0 ∈ {x, y}. Another very simple observation is that if L_i is recognized by a morphism h_i : Σ* → M_i to some finite (aperiodic) monoid M_i, i = 1, 2, then (the direct product M_1 × M_2 is aperiodic and) the morphism h : Σ* → M_1 × M_2, w ↦ (h_1(w), h_2(w)) recognizes all Boolean combinations of L_1 and L_2.

The proof of the next lemma is rather technical. Its main part shows that the family of recognizable languages is closed under concatenation. Aperiodicity comes into the picture only at the very end, in a few lines. There is an alternative way to prove the following lemma: in Section 11 we introduce non-deterministic counter-free Büchi automata, which can be used to show the closure under concatenation as well.

Lemma 6.1. Let L ⊆ Σ* and K ⊆ Σ^∞ be aperiodic languages.
Then L · K is aperiodic.

Proof. As said above, we may choose a single morphism h : Σ* → M to some finite aperiodic monoid M which recognizes both L and K. The set of pairs (h(u), h(v)) with u, v ∈ Σ* is finite (bounded by |M|^2) and so its power set S is finite, too. We shall see that there is a monoid structure on some subset of S such that this monoid recognizes L · K.
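Aperiodicity of a concrete finite monoid is straightforwardly decidable from its multiplication table. The following sketch is ours (not from the paper): for each element x it follows the powers x, x^2, x^3, ... until they repeat and checks that the cycle entered is a fixed point, i.e., x^n = x^{n+1} for some n.

```python
# Sketch (ours): test whether a finite monoid, given by its multiplication
# table, is aperiodic, i.e. every x satisfies x^n = x^(n+1) for some n.

def is_aperiodic(elements, mul):
    for x in elements:
        power, seen = x, []
        while power not in seen:          # iterate x, x^2, x^3, ...
            seen.append(power)
            power = mul[(power, x)]
        if mul[(power, x)] != power:      # the cycle must be a fixed point
            return False
    return True

# The aperiodic monoid {1, a, 0} with a*a = 0 and 0 absorbing:
M3 = ["1", "a", "0"]
MUL3 = {("1", "1"): "1", ("1", "a"): "a", ("1", "0"): "0",
        ("a", "1"): "a", ("a", "a"): "0", ("a", "0"): "0",
        ("0", "1"): "0", ("0", "a"): "0", ("0", "0"): "0"}
print(is_aperiodic(M3, MUL3))            # True

# The cyclic group Z/2 = {1, g} with g*g = 1 is not aperiodic:
Z2 = ["1", "g"]
MULZ2 = {("1", "1"): "1", ("1", "g"): "g", ("g", "1"): "g", ("g", "g"): "1"}
print(is_aperiodic(Z2, MULZ2))           # False: g has period 2
```

The group example explains the name: a monoid is aperiodic exactly when it contains no non-trivial group, i.e., no element with a genuine period.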

Proposition 6.2. We have SF ⊆ AP, or more explicitly: SF(Σ*) ⊆ AP(Σ*) and SF(Σ^∞) ⊆ AP(Σ^∞).

Proof. Aperiodic languages form a Boolean algebra. We have seen above that AP contains Σ^∞ and all singletons {a}, where a is a letter. Thus, star-free languages are aperiodic by Lemma 6.1. q.e.d.

7 From LTL to FO^3

The syntax of LTL_Σ[XU, YS] is given by

ϕ ::= ⊥ | a | ¬ϕ | ϕ ∨ ϕ | ϕ XU ϕ | ϕ YS ϕ,

where a ranges over Σ. When there is no ambiguity, we simply write LTL for LTL_Σ[XU, YS]. We also write LTL_Σ[XU] for the pure future fragment where only the next-until modality XU is allowed.

In order to give a semantics to an LTL formula we identify each ϕ ∈ LTL with some first-order formula ϕ(x) in at most one free variable. The identification is done by structural induction. ⊥ and ⊤ still denote the truth values false and true, and the formula a becomes a(x) = P_a(x). The formulae next-until and yesterday-since are defined by:

(ϕ XU ψ)(x) = ∃z (x < z ∧ ψ(z) ∧ ∀y (x < y < z → ϕ(y)))
(ϕ YS ψ)(x) = ∃z (x > z ∧ ψ(z) ∧ ∀y (x > y > z → ϕ(y))).

It is clear that under this identification each LTL formula becomes a first-order formula which needs at most three different names for variables. For simplicity let us denote this fragment by FO^3, too. Thus, we can write LTL ⊆ FO^3. As usual, we may use derived formulae such as Xϕ = ⊥ XU ϕ (read: next ϕ), ϕ U ψ = ψ ∨ (ϕ ∧ (ϕ XU ψ)) (read: ϕ until ψ), Fϕ = ⊤ U ϕ (read: future ϕ), etc.

Since LTL ⊆ FO^3, a model of an LTL_Σ formula ϕ is a word v = a_0 a_1 a_2 ··· ∈ A^∞ \ {ε} together with a position 0 ≤ i < |v| (the alphabet A might be different from Σ). For a formula ϕ ∈ LTL_Σ and an alphabet A, we let

L_A(ϕ) = {v ∈ A^∞ \ {ε} | v, 0 |= ϕ}.

We say that a language L ⊆ A^∞ is definable in LTL_Σ if L \ {ε} = L_A(ϕ) for some ϕ ∈ LTL_Σ. Note that the empty word ε cannot be a model of a formula. To include the empty word, it will be convenient to consider, for any letter c (not necessarily in A), the language

L_{c,A}(ϕ) = {v ∈ A^∞ | cv, 0 |= ϕ}.
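The semantics of the pure future fragment can be spelled out as a short recursive evaluator. The following sketch is ours (not from the paper), for finite words only, with formulas represented as nested tuples; the representation and names are our own choices.

```python
# Sketch (ours): semantics of LTL[XU] on finite words.  ("XU", p, q) holds
# at position i iff some strictly later position k satisfies q and every
# position strictly between i and k satisfies p.

def holds(v, i, f):
    if f == "true":
        return True
    if f == "false":
        return False
    op = f[0]
    if op == "letter":
        return v[i] == f[1]
    if op == "not":
        return not holds(v, i, f[1])
    if op == "or":
        return holds(v, i, f[1]) or holds(v, i, f[2])
    if op == "XU":
        return any(
            holds(v, k, f[2]) and all(holds(v, j, f[1]) for j in range(i + 1, k))
            for k in range(i + 1, len(v))
        )
    raise ValueError(op)

# Derived modality:  X q  =  false XU q  (next), as in the text.
next_b = ("XU", "false", ("letter", "b"))
print(holds("ab", 0, next_b))   # True: position 1 carries b
print(holds("aa", 0, next_b))   # False
```

Note how the derived "next" works: since no intermediate position can satisfy ⊥, the witnessing position k must be the direct successor i + 1.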

Remark 7.1. When we restrict to the pure future fragment LTL_Σ[XU], the two approaches define almost the same class of languages. Indeed, for each formula ϕ ∈ LTL_Σ[XU], we have L_A(ϕ) = L_{c,A}(Xϕ) \ {ε}. Conversely, for each formula ϕ there is a formula ϕ̃ such that L_A(ϕ̃) = L_{c,A}(ϕ) \ {ε}. The translation is simply (ϕ XU ψ)~ = ϕ U ψ, c̃ = ⊤ and ã = ⊥ if a ≠ c, and as usual (¬ϕ)~ = ¬ϕ̃ and (ϕ ∨ ψ)~ = ϕ̃ ∨ ψ̃.

8 From AP to LTL

8.1 A construction on monoids

The passage from AP to LTL is perhaps the most difficult step in completing the picture of first-order definable languages. We shall use an induction on the size of the monoid M; for this we first recall a construction due to [5]. For a moment let M be any monoid and m ∈ M an element. Then mM ∩ Mm is obviously a subsemigroup, but it may not have a neutral element. Hence it is not a monoid, in general. Note that, if m ≠ 1 and M is aperiodic, then 1 ∉ mM ∩ Mm. Indeed, assume that 1 ∈ mM and write 1 = mx with x ∈ M. Hence 1 = m^n x^n for all n, and for some n we have m^n = m^{n+1}. Taking this n we see:

1 = m^n x^n = m^{n+1} x^n = m(m^n x^n) = m · 1 = m.

Therefore |mM ∩ Mm| < |M| if M is aperiodic and m ≠ 1.

It is possible to define a new product ∘ such that mM ∩ Mm becomes a monoid where m is the neutral element: We let xm ∘ my = xmy for xm, my ∈ mM ∩ Mm. This is well-defined since xm = x′m and my = my′ imply xmy = x′my′. The operation ∘ is associative and m ∘ z = z ∘ m = z. Hence (mM ∩ Mm, ∘, m) is indeed a monoid. Actually it is a divisor of M. To see this, consider the submonoid N = {x ∈ M | xm ∈ mM}. (Note that N is indeed a submonoid of M.) Clearly, the mapping x ↦ xm yields a surjective morphism from (N, ·, 1) onto (mM ∩ Mm, ∘, m), which is therefore a homomorphic image of the submonoid N of M. In particular, if M is aperiodic, then (mM ∩ Mm, ∘, m) is aperiodic, too. The construction is very similar to a construction of what is known as a local algebra, see [8, 20]. Therefore we call (mM ∩ Mm, ∘, m) the local divisor of M at the element m.
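The local divisor construction can be computed directly from a multiplication table. The following sketch is ours (not the authors' code): it builds the carrier mM ∩ Mm and the new product xm ∘ my = xmy, which by the well-definedness argument above does not depend on the chosen factorization.

```python
# Sketch (ours): the local divisor of a finite monoid M at an element m.
# Carrier: mM ∩ Mm; product: xm ∘ my = xmy; neutral element: m.

def local_divisor(elements, mul, m):
    mM = {mul[(m, x)] for x in elements}
    Mm = {mul[(x, m)] for x in elements}
    D = mM & Mm
    circ = {}
    for u in D:
        # write u = xm for some x; the product below does not depend on x
        x = next(x for x in elements if mul[(x, m)] == u)
        for v in D:                       # v = my, hence x·v = xmy
            circ[(u, v)] = mul[(x, v)]
    return D, circ

# Example: the aperiodic monoid {1, a, 0} with a·a = 0 and 0 absorbing.
M3 = ["1", "a", "0"]
MUL3 = {("1", "1"): "1", ("1", "a"): "a", ("1", "0"): "0",
        ("a", "1"): "a", ("a", "a"): "0", ("a", "0"): "0",
        ("0", "1"): "0", ("0", "a"): "0", ("0", "0"): "0"}

D, circ = local_divisor(M3, MUL3, "a")
print(sorted(D))          # ['0', 'a']  -- carrier, with m = a as neutral element
print(circ[("a", "a")])   # 'a'
```

In the example the local divisor at a is strictly smaller than M, illustrating the size bound |mM ∩ Mm| < |M| used for the induction.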
8.2 Closing the cycle

Proposition 8.1. We have AP ⊆ LTL. More precisely, let L ⊆ Σ^∞ be a language recognized by an aperiodic monoid M.

(1) We can construct a formula ϕ ∈ LTL_Σ[XU] such that L \ {ε} = L_Σ(ϕ).

Finally, let K = K_0 ∪ K_1 ∪ K_2. We have already seen that L = σ^{-1}(K). It remains to show that K is definable in LTL_T[XU]. Let N ∈ T; then, by definition, the language [N]_g is recognized by g, which is a morphism to the aperiodic monoid M′ with |M′| < |M|. By induction on the size of the monoid, we deduce that for all n ∈ T_1 and N ∈ T there exists ϕ ∈ LTL_T[XU] such that [N]_g = L_{n,T}(ϕ). We easily check that nL_{n,T}(ϕ) = L_T(n ∧ ϕ). Therefore, the language n[N]_g is definable in LTL_T[XU]. Moreover, K_0, nT_1*m and nT_1^ω are obviously definable in LTL_T[XU]. Therefore, K is definable in LTL_T[XU]. q.e.d. (Lemma 8.2)

Let b ∈ Σ be a letter. For a nonempty word v = a_0 a_1 a_2 ··· ∈ Σ^∞ \ {ε} and a position 0 ≤ i < |v|, we denote by μ_b(v, i) the largest factor of v starting at position i and not containing the letter b except maybe a_i. Formally, μ_b(v, i) = a_i a_{i+1} ··· a_l where l = max{k | i ≤ k < |v| and a_j ≠ b for all i < j ≤ k}.

Lemma 8.3 (Lifting). For each formula ϕ ∈ LTL_Σ[XU], there exists a formula ϕ^b ∈ LTL_Σ[XU] such that for each v ∈ Σ^∞ \ {ε} and each 0 ≤ i < |v|, we have v, i |= ϕ^b if and only if μ_b(v, i), 0 |= ϕ.

Proof. The construction is by structural induction on ϕ. We let a^b = a. Then, we have (¬ϕ)^b = ¬ϕ^b and (ϕ ∨ ψ)^b = ϕ^b ∨ ψ^b as usual. For next-until, we define

(ϕ XU ψ)^b = (¬b ∧ ϕ^b) XU (¬b ∧ ψ^b).

Assume that v, i |= (ϕ XU ψ)^b. We find i < k < |v| such that v, k |= ¬b ∧ ψ^b and v, j |= ¬b ∧ ϕ^b for all i < j < k. We deduce that μ_b(v, i) = a_i a_{i+1} ··· a_l with l ≥ k, and that μ_b(v, i), k−i |= ψ and μ_b(v, i), j−i |= ϕ for all i < j < k. Therefore, μ_b(v, i), 0 |= ϕ XU ψ as desired. The converse can be shown similarly. q.e.d. (Lemma 8.3)

Lemma 8.4. For all ξ ∈ LTL_T[XU], there exists a formula ξ̃ ∈ LTL_Σ[XU] such that for all v ∈ Σ^∞ we have cv, 0 |= ξ̃ if and only if σ(v), 0 |= ξ.

Proof. The proof is by structural induction on ξ. The difficult cases are the constants m ∈ T_1 or m ∈ T_2. Assume first that ξ = m ∈ T_1. We have σ(v), 0 |= m if and only if v = ucv′ with u ∈ A* ∩ h^{-1}(m).
The language A* ∩ h⁻¹(m) is recognized by the restriction h|_{A^∞}: A^∞ → M. By induction on the size of the alphabet, we find a formula ϕ_m ∈ LTL_A[XU] such that L_{c,A}(ϕ_m) = A* ∩ h⁻¹(m). We let m~ = ⟨ϕ_m⟩_c ∧ XF c. By Lemma 8.3, we have cv, 0 ⊨ m~ if and only if v = ucv′ with u ∈ A* and μ_c(cv, 0), 0 ⊨ ϕ_m. Since μ_c(cv, 0) = cu, we deduce that cv, 0 ⊨ m~ if and only if v = ucv′ with u ∈ L_{c,A}(ϕ_m) = A* ∩ h⁻¹(m).

Next, assume that ξ = m ∈ T₂. We have σ(v) = m if and only if v ∈ A^∞ ∩ m (note that letters from T₂ can also be seen as equivalence classes which are subsets of Σ^∞). The language A^∞ ∩ m is recognized by the restriction h|_{A^∞}. By induction on the size of the alphabet, we find a formula ψ_m ∈ LTL_A[XU] such that L_{c,A}(ψ_m) = A^∞ ∩ m. Then, we let m~ = ⟨ψ_m⟩_c ∧ ¬XF c and we conclude as above.

Finally, we let (¬ξ)~ = ¬ξ~, (ξ₁ ∨ ξ₂)~ = ξ₁~ ∨ ξ₂~, and for the modality next-until we define (ξ₁ XU ξ₂)~ = (¬c ∨ ξ₁~) XU (c ∧ ξ₂~). Assume that σ(v), 0 ⊨ ξ₁ XU ξ₂ and let 0 < k < |σ(v)| be such that σ(v), k ⊨ ξ₂ and σ(v), j ⊨ ξ₁ for all 0 < j < k. Let v = v₀cv₁cv₂⋯ be the c-factorization of v. Since the logics LTL_T[XU] and LTL_Σ[XU] are pure future, we have σ(v), k ⊨ ξ₂ if and only if σ(v_k c v_{k+1} ⋯), 0 ⊨ ξ₂, if and only if (by induction) cv_k cv_{k+1} ⋯, 0 ⊨ ξ₂~, if and only if cv, |cv₀ ⋯ cv_{k−1}| ⊨ ξ₂~. Similarly, σ(v), j ⊨ ξ₁ if and only if cv, |cv₀ ⋯ cv_{j−1}| ⊨ ξ₁~. Therefore, cv, 0 ⊨ (ξ₁ XU ξ₂)~. The converse can be shown similarly. q.e.d. (Lemma 8.4)

We now conclude the proof of Proposition 8.1. We start with a language L ⊆ Σ^∞ recognized by h. By Lemma 8.2, we find a formula ξ ∈ LTL_T[XU] such that L = σ⁻¹(L_T(ξ)). Let ξ~ be the formula given by Lemma 8.4. We claim that L = L_{c,Σ}(ξ~). Indeed, for v ∈ Σ^∞, we have v ∈ L_{c,Σ}(ξ~) if and only if cv, 0 ⊨ ξ~, if and only if (Lemma 8.4) σ(v), 0 ⊨ ξ, if and only if σ(v) ∈ L_T(ξ), if and only if v ∈ σ⁻¹(L_T(ξ)) = L. q.e.d. (Proposition 8.1)

9 The separation theorem

As seen in Section 7, an LTL_Σ[YS, XU] formula ϕ can be viewed as a first-order formula with one free variable. The converse, in a stronger form, is established in this section.

Proposition 9.1. For all first-order formulae ξ with one free variable we find a finite list (K_i, a_i, L_i)_{i=1,…,n} where each K_i ∈ SF(Σ*), each L_i ∈ SF(Σ^∞), and each a_i is a letter, such that for all u ∈ Σ*, a ∈ Σ and v ∈ Σ^∞ we have (uav, |u|) ⊨ ξ if and only if u ∈ K_i, a = a_i and v ∈ L_i for some 1 ≤ i ≤ n.

Proof. By Proposition 4.2, with V = {x} we have [ξ]_V ∈ SF(Σ_V^∞). Hence, we can use Lemma 3.2 with A = Σ_V^{x=1} and B = Σ_V^{x=0}. Note that N_V = B*AB^∞.
Hence, we obtain [ξ]_V = ⋃_{i=1,…,n} K_i a_i L_i with a_i ∈ A, K_i ∈ SF(B*) and L_i ∈ SF(B^∞) for all i. Let π: B → Σ be the bijective renaming defined by π((a, 0)) = a. Star-free sets are preserved by injective renamings. Hence, we can replace K_i by π(K_i) ∈ SF(Σ*) and L_i by π(L_i) ∈ SF(Σ^∞). Note also that a_i = (a′_i, 1) for some a′_i ∈ Σ. q.e.d. (Proposition 9.1)
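The passage between Σ_V and Σ in this proof can be made concrete. The following is an illustrative encoding (not the paper's notation-for-notation construction): a word w with a distinguished position p is a word over B = Σ^{x=0} containing a single letter from A = Σ^{x=1}, i.e., an element of N_V = B*AB^∞, and the renaming π simply drops the marker.

```python
# Sketch: encode (w, p) as a marked word in B* A B*, with x = 1 at
# exactly the distinguished position, and let pi erase the marking.

def encode(w, p):
    return [(a, 1 if i == p else 0) for i, a in enumerate(w)]

def pi(marked):
    # the bijective renaming of the proof, (a, t) -> a
    return ''.join(a for a, _ in marked)

m = encode('abc', 1)
assert m == [('a', 0), ('b', 1), ('c', 0)]   # exactly one marked letter
assert pi(m) == 'abc'                        # pi recovers the word
assert sum(t for _, t in m) == 1             # membership in B* A B*
```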

Theorem 9.2 (Separation). Let ξ(x) ∈ FO_Σ(<) be a first-order formula with one free variable x. Then, ξ(x) ≡ ζ(x) for some LTL formula ζ ∈ LTL_Σ[YS, XU]. Moreover, we can choose for ζ a disjunction of conjunctions of pure past and pure future formulae:

ζ = ⋁_{1≤i≤n} (ψ_i ∧ a_i ∧ ϕ_i)

where ψ_i ∈ LTL_Σ[YS], a_i ∈ Σ and ϕ_i ∈ LTL_Σ[XU]. In particular, every first-order formula with one free variable is equivalent to some formula in FO³.

Note that we have already established a weaker version which applies to first-order sentences. Indeed, if ξ is a first-order sentence, then L(ξ) is star-free by Proposition 10.3, hence aperiodic by Proposition 6.2, and finally definable in LTL by Proposition 8.1. The extension to first-order formulae with one free variable will also use these previous results.

Proof. By Proposition 9.1, we find for ξ a finite list (K_i, a_i, L_i)_{i=1,…,n} where each K_i ∈ SF(Σ*), each L_i ∈ SF(Σ^∞), and each a_i is a letter, such that for all u ∈ Σ*, a ∈ Σ and v ∈ Σ^∞ we have (uav, |u|) ⊨ ξ if and only if u ∈ K_i, a = a_i and v ∈ L_i for some 1 ≤ i ≤ n. For a finite word b₀⋯b_m, where the b_j are letters, we let (b₀⋯b_m)~ = b_m⋯b₀. This means we read words from right to left. For a language K ⊆ Σ* we let K~ = {w~ | w ∈ K}. Clearly, each K_i~ is star-free. Therefore, using Propositions 6.2 and 8.1, for each 1 ≤ i ≤ n we find ψ_i~ and ϕ_i in LTL_Σ[XU] such that L_{a_i}(ψ_i~) = K_i~ and L_{a_i}(ϕ_i) = L_i. Replacing all operators XU by YS, we can transform ψ_i~ ∈ LTL_Σ[XU] into a formula ψ_i ∈ LTL_Σ[YS] such that (aw~, 0) ⊨ ψ_i~ if and only if (wa, |w|) ⊨ ψ_i for all wa ∈ Σ⁺. In particular, K_i = {w ∈ Σ* | wa_i, |w| ⊨ ψ_i}.

It remains to show that ξ(x) ≡ ζ(x) where ζ = ⋁_{1≤i≤n} (ψ_i ∧ a_i ∧ ϕ_i). Let w ∈ Σ^∞ \ {ε} and let p be a position of w. Assume first that (w, p) ⊨ ξ(x) and write w = uav with |u| = p. We have u ∈ K_i, a = a_i and v ∈ L_i for some 1 ≤ i ≤ n. We deduce that ua_i, |u| ⊨ ψ_i and a_iv, 0 ⊨ ϕ_i. Since ψ_i is pure past and ϕ_i is pure future, we deduce that ua_iv, |u| ⊨ ψ_i ∧ a_i ∧ ϕ_i. Hence we get w, p ⊨ ζ.
Conversely, assume that w, p ⊨ ψ_i ∧ a_i ∧ ϕ_i for some i. As above, we write w = ua_iv with |u| = p. Since ψ_i is pure past and ϕ_i is pure future, we deduce that ua_i, |u| ⊨ ψ_i and a_iv, 0 ⊨ ϕ_i. Therefore, u ∈ K_i and v ∈ L_i. We deduce that (w, p) ⊨ ξ(x). q.e.d. (Theorem 9.2)
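The shape of ζ can be illustrated by a brute-force evaluator over finite words; the component formulas below are hypothetical stand-ins, not derived from any particular ξ. Each disjunct checks a pure-past condition on the strict prefix, the letter at the position, and a pure-future condition on the strict suffix, exactly as in the proof above.

```python
# Sketch: evaluate a "separated" formula, given as a list of disjuncts
# (past, letter, future), at position p of a finite word w.

def holds(zeta, w, p):
    u, a, v = w[:p], w[p], w[p + 1:]
    return any(past(u) and a == letter and future(v)
               for past, letter, future in zeta)

# One disjunct: "some b strictly before x" (pure past, YS-definable),
# the letter a, and "some c strictly after x" (pure future, XU-definable).
zeta = [(lambda u: 'b' in u, 'a', lambda v: 'c' in v)]

assert holds(zeta, 'dbac', 2)       # b before position 2, a at 2, c after
assert not holds(zeta, 'dbad', 2)   # no c after position 2
assert not holds(zeta, 'dcab', 2)   # no b before position 2
```

Note how the prefix check and the suffix check never inspect the same positions; this independence is exactly what the separation into pure past and pure future buys.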
