The concept of mixing appears in ergodic theory, the study of stochastic processes and measure-preserving dynamical systems. Several definitions of mixing exist, including strong mixing, weak mixing and topological mixing; the last does not require a measure to be defined. Some of these definitions can be arranged in a hierarchy: strong mixing implies weak mixing, and weak mixing (and thus also strong mixing) implies ergodicity. That is, every system that is weakly mixing is also ergodic, and so one says that mixing is a "stronger" notion than ergodicity.

In the definition of strong mixing for a stochastic process, $P$ is the probability measure on the sigma-algebra. The symbol $X_a^b$, with $-\infty \leq a \leq b \leq \infty$, denotes a sub-sigma-algebra of the sigma-algebra; it is generated by the cylinder sets that are specified between times $a$ and $b$. Given the random variables $X_a, X_{a+1}$, etc., at times $a$, $a+1$, etc., it may be thought of as the sigma-algebra generated by $\{X_a, X_{a+1}, \ldots, X_b\}$.

Informally, strong mixing means that for any two possible states of the system (realizations of the random variable), the occurrences of the two states become asymptotically independent as the amount of time separating them grows.

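Suppose $\{X_t\}$ is a stationary Markov process with stationary distribution $Q$. Let $L^2(Q)$ denote the space of Borel-measurable functions that are square-integrable with respect to the measure $Q$. Also let $E_t\varphi(x) = \operatorname{E}[\varphi(X_t) \mid X_0 = x]$ denote the conditional expectation operator on $L^2(Q)$, and let $Z = \{\varphi \in L^2(Q) : \int \varphi \, dQ = 0\}$ denote the subspace of mean-zero functions. The ρ-mixing coefficients of the process are

$$\rho_t = \sup_{\varphi \in Z,\ \|\varphi\|_2 = 1} \|E_t\varphi\|_2.$$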

The process is called ρ-mixing if the ρ-mixing coefficients $\rho_t$ converge to zero as $t \to \infty$, and “ρ-mixing with exponential decay rate” if $\rho_t < e^{-\delta t}$ for some $\delta > 0$. For a stationary Markov process, the coefficients $\rho_t$ either decay at an exponential rate or are always equal to one.[1]
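For a finite-state chain the ρ-mixing coefficients can be computed directly. The sketch below assumes the usual definition of $\rho_t$ as the $L^2(Q)$ operator norm of $\varphi \mapsto \operatorname{E}[\varphi(X_t) \mid X_0 = x]$ restricted to mean-zero functions. The function names and the two-state transition matrix are illustrative choices, not taken from the cited reference.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue 1, normalised to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(vals - 1.0))
    pi = np.real(vecs[:, k])
    return pi / pi.sum()

def rho_coefficient(P, t):
    """rho_t for a finite-state chain with transition matrix P (assumed definition)."""
    pi = stationary_distribution(P)
    root = np.sqrt(pi)
    Pt = np.linalg.matrix_power(P, t)
    # Express the t-step operator in coordinates where the L^2(Q) norm
    # becomes the Euclidean norm: S = D^{1/2} P^t D^{-1/2}, D = diag(pi).
    S = (root[:, None] * Pt) / root[None, :]
    # Remove the constant direction so that only mean-zero functions remain.
    S_mean_zero = S - np.outer(root, root)
    # The largest singular value is the operator norm.
    return np.linalg.norm(S_mean_zero, 2)

# Hypothetical two-state chain used only for illustration.
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
for t in range(1, 6):
    print(t, rho_coefficient(P, t))
```

The second eigenvalue of this transition matrix is 0.5, and the computed coefficients halve at every step, which is the exponential-decay case.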

The process is called α-mixing if the α-mixing coefficients $\alpha_t$ converge to zero as $t \to \infty$; it is “α-mixing with exponential decay rate” if $\alpha_t < \gamma e^{-\delta t}$ for some $\gamma, \delta > 0$; and it is “α-mixing with sub-exponential decay rate” if $\alpha_t < \xi(t)$ for some non-increasing function $\xi(t)$ satisfying $t^{-1} \ln \xi(t) \to 0$ as $t \to \infty$.[1]

The α-mixing coefficients never exceed the ρ-mixing ones, $\alpha_t \leq \rho_t$; therefore, if the process is ρ-mixing, it is necessarily α-mixing too. However, when $\rho_t = 1$, the process may still be α-mixing, with sub-exponential decay rate.

The process is called β-mixing if the β-mixing coefficients $\beta_t$ converge to zero as $t \to \infty$; it is “β-mixing with exponential decay rate” if $\beta_t < \gamma e^{-\delta t}$ for some $\gamma, \delta > 0$; and it is “β-mixing with sub-exponential decay rate” if $\beta_t \xi(t) \to 0$ as $t \to \infty$ for some non-increasing function $\xi(t)$ satisfying $t^{-1} \ln \xi(t) \to 0$ as $t \to \infty$.[1]

A strictly stationary Markov process is β-mixing if and only if it is an aperiodic recurrent Harris chain. The β-mixing coefficients are always at least as large as the α-mixing ones, so a process that is β-mixing is also α-mixing. There is no direct relationship between β-mixing and ρ-mixing: neither implies the other.
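The β-mixing coefficients of a finite-state chain can be sketched in a few lines. The code assumes the definition $\beta_t = \int \sup_{0 \leq \varphi \leq 1} \left| \operatorname{E}[\varphi(X_t) \mid X_0 = x] - \int \varphi \, dQ \right| dQ(x)$. Under that assumption, the inner supremum is the total variation distance between the $t$-step transition law started at $x$ and $Q$, so $\beta_t$ is a $Q$-weighted average of total variation distances. The chain and the function name are illustrative.

```python
import numpy as np

def beta_coefficient(P, pi, t):
    """beta_t for a finite chain with transition matrix P and stationary law pi."""
    Pt = np.linalg.matrix_power(P, t)
    # Total variation distance between each row of P^t and pi.
    tv = 0.5 * np.abs(Pt - pi[None, :]).sum(axis=1)
    # Average the distances over the stationary distribution.
    return float(pi @ tv)

# Hypothetical two-state chain and its stationary distribution.
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])
pi = np.array([0.4, 0.6])
for t in range(1, 6):
    print(t, beta_coefficient(P, pi, t))
```

A finite, irreducible, aperiodic chain such as this one is an aperiodic recurrent Harris chain, and the printed coefficients shrink geometrically.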

A similar definition can be given using the vocabulary of measure-preserving dynamical systems. Let $(X, \mathcal{A}, \mu, T)$ be a dynamical system, with $T$ being the time-evolution or shift operator. The system is said to be strong mixing if, for any $A, B \in \mathcal{A}$, one has

$$\lim_{n \to \infty} \mu\left(A \cap T^{-n}B\right) = \mu(A)\mu(B).$$

For shifts parametrized by a continuous variable instead of a discrete integer $n$, the same definition applies, with $T^{-n}$ replaced by $T_g$, where $g$ is the continuous-time parameter.

To understand the above definition physically, consider a shaker $M$ full of an incompressible liquid, which consists of 20% wine and 80% water. If $A$ is the region originally occupied by the wine, then, for any part $B$ of the shaker, the percentage of wine in $B$ after $n$ repetitions of the act of stirring is

$$\frac{\mu\left(T^{n}A \cap B\right)}{\mu(B)}.$$

In such a situation, one would expect that after the liquid is sufficiently stirred ($n \to \infty$), every region $B$ of the shaker will contain approximately 20% wine. This leads to

$$\lim_{n \to \infty} \frac{\mu\left(T^{n}A \cap B\right)}{\mu(B)} = \mu(A),$$

where $\mu(A) = 20\%$ is the overall proportion of wine.
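The stirring picture can be imitated numerically. The following sketch uses Arnold's cat map on the unit square as a stand-in for one stir; this choice is an assumption made for illustration, picked because the map preserves area and is known to be strongly mixing. Particles that start in the strip $x < 0.2$ play the role of the wine, and the code estimates the fraction of wine in the quarter-square $B = [0, 0.5) \times [0, 0.5)$ by Monte Carlo sampling.

```python
import numpy as np

def cat_map(points):
    """One 'stir': (x, y) -> (2x + y, x + y) mod 1, applied to an (N, 2) array."""
    x, y = points[:, 0], points[:, 1]
    return np.column_stack(((2.0 * x + y) % 1.0, (x + y) % 1.0))

def wine_fraction_in_b(n_stirs, num_particles=200_000, seed=0):
    """Monte Carlo estimate of mu(T^n A ∩ B) / mu(B)."""
    rng = np.random.default_rng(seed)
    pts = rng.random((num_particles, 2))
    is_wine = pts[:, 0] < 0.2                      # region A, 20% of the shaker
    for _ in range(n_stirs):
        pts = cat_map(pts)
    in_b = (pts[:, 0] < 0.5) & (pts[:, 1] < 0.5)   # region B
    return is_wine[in_b].mean()

for n in (0, 1, 2, 4, 8):
    print(n, round(wine_fraction_in_b(n), 3))
```

Before any stirring the estimate is near 40%, because half of $B$ lies inside the wine strip; after a handful of iterations it settles near the overall proportion of 20%, up to sampling noise.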

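In other words, for all measurable sets $A$ and $B$, $T$ is strong mixing if $\mu\left(A \cap T^{-n}B\right) - \mu(A)\mu(B) \to 0$ in the usual sense, weak mixing if $\left|\mu\left(A \cap T^{-n}B\right) - \mu(A)\mu(B)\right| \to 0$ in the Cesàro sense, and ergodic if $\mu\left(A \cap T^{-n}B\right) \to \mu(A)\mu(B)$ in the Cesàro sense. Hence, strong mixing implies weak mixing, which implies ergodicity. The converse implications fail: there exist ergodic dynamical systems that are not weakly mixing, and weakly mixing dynamical systems that are not strongly mixing.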
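The gap between ergodicity and weak mixing shows up in a simple example. The sketch below takes an irrational rotation of the circle, $x \mapsto x + \alpha \bmod 1$, which is ergodic but not weakly mixing, with $A = B = [0, 1/2)$. For this choice $\mu(A \cap T^{-n}A) = |1/2 - d_n|$ with $d_n = n\alpha \bmod 1$, so the deviations from $\mu(A)^2 = 1/4$ can be evaluated without sampling. The rotation angle and the function names are illustrative assumptions.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation angle (illustrative)

def overlap_deviation(n):
    """mu(A ∩ T^{-n}A) - mu(A)^2 for A = [0, 1/2) under x -> x + ALPHA mod 1."""
    d = (n * ALPHA) % 1.0
    return abs(0.5 - d) - 0.25

def cesaro_averages(num_terms):
    """Cesàro averages of the deviations and of their absolute values."""
    devs = [overlap_deviation(n) for n in range(num_terms)]
    signed = sum(devs) / num_terms
    absolute = sum(abs(c) for c in devs) / num_terms
    return signed, absolute

signed, absolute = cesaro_averages(10_000)
print("average deviation:", signed)
print("average absolute deviation:", absolute)
```

The signed average tends to zero, as ergodicity requires, while the average of the absolute deviations stays close to 1/8, so the weak mixing condition fails for this pair of sets.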

The properties of ergodicity, weak mixing and strong mixing of a measure-preserving dynamical system can also be characterized by the average of observables. By von Neumann's ergodic theorem, ergodicity of a dynamical system $(X, \mathcal{A}, \mu, T)$ is equivalent to the property that, for any function $f \in L^2(X, \mu)$, the sequence $(f \circ T^n)_{n \geq 0}$ converges strongly and in the sense of Cesàro to $\int_X f \, d\mu$, i.e.,

$$\lim_{N \to \infty} \left\| \frac{1}{N} \sum_{n=0}^{N-1} f \circ T^n - \int_X f \, d\mu \right\|_{L^2(X, \mu)} = 0.$$

Since the system is assumed to be measure preserving, the strong mixing condition $\lim_{n \to \infty} \int_X (f \circ T^n)\, g \, d\mu = \int_X f \, d\mu \int_X g \, d\mu$ for all $f, g \in L^2(X, \mu)$ is equivalent to saying that $\lim_{n \to \infty} \operatorname{Cov}(f \circ T^n, g) = 0$, so that the random variables $f \circ T^n$ and $g$ become uncorrelated as $n$ grows. Since this holds for any function $g$, one can informally see mixing as the property that the random variables $f \circ T^n$ and $g$ become independent as $n$ grows.

Given two measured dynamical systems $(X, \mu, T)$ and $(Y, \nu, S)$, one can construct a dynamical system $(X \times Y, \mu \otimes \nu, T \times S)$ on the Cartesian product by defining $(T \times S)(x, y) = (T(x), S(y))$. We then have the following characterizations of weak mixing:

Proposition. A dynamical system $(X, \mu, T)$ is weakly mixing if and only if, for any ergodic dynamical system $(Y, \nu, S)$, the system $(X \times Y, \mu \otimes \nu, T \times S)$ is also ergodic.

Proposition. A dynamical system $(X, \mu, T)$ is weakly mixing if and only if $(X^2, \mu \otimes \mu, T \times T)$ is ergodic. If this is the case, then $(X^2, \mu \otimes \mu, T \times T)$ is also weakly mixing.
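The product criterion can be checked by hand on a system that is ergodic but not weakly mixing. The sketch below uses an irrational circle rotation $T$ (an illustrative choice) and verifies numerically that $F(x, y) = e^{2\pi i (x - y)}$ is invariant under $T \times T$. Because $F$ is not constant, $T \times T$ is not ergodic, which is what the proposition predicts for a system that is not weakly mixing.

```python
import cmath
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation angle (illustrative)

def rotate(x):
    """Circle rotation T(x) = x + ALPHA mod 1."""
    return (x + ALPHA) % 1.0

def product_map(x, y):
    """The product system (T x T)(x, y) = (T(x), T(y))."""
    return rotate(x), rotate(y)

def invariant_observable(x, y):
    """F(x, y) = exp(2*pi*i*(x - y)); it depends only on the difference x - y."""
    return cmath.exp(2j * math.pi * (x - y))

for x, y in [(0.1, 0.3), (0.25, 0.9), (0.7, 0.7)]:
    before = invariant_observable(x, y)
    after = invariant_observable(*product_map(x, y))
    print((x, y), abs(after - before))
```

The printed differences vanish up to rounding error, while $F$ takes different values at different points, so $F$ is a non-constant invariant function of the product system.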

A form of mixing may be defined without appeal to a measure, using only the topology of the system. A continuous map $f: X \to X$ is said to be topologically transitive if, for every pair of non-empty open sets $A, B \subset X$, there exists an integer $n$ such that

$$f^{n}(A) \cap B \neq \varnothing.$$

Lemma: If $X$ is a complete metric space with no isolated point, then $f$ is topologically transitive if and only if there exists a hypercyclic point $x \in X$, that is, a point $x$ whose orbit $\{f^{n}(x) : n \in \mathbb{N}\}$ is dense in $X$.

A system is said to be topologically mixing if, given non-empty open sets $A$ and $B$, there exists an integer $N$ such that, for all $n > N$, one has

$$f^{n}(A) \cap B \neq \varnothing.$$

For a continuous-time system, $f^{n}$ is replaced by the flow $\varphi_g$, with $g$ being the continuous parameter, and the non-empty intersection is required to hold for all $\Vert g \Vert > N$.
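The difference between topological transitivity and topological mixing can be seen on the circle. The sketch below tracks the image of an arc under two maps chosen for illustration: the doubling map $x \mapsto 2x \bmod 1$, which is topologically mixing, and an irrational rotation, which is topologically transitive but not topologically mixing. Arcs are stored as hypothetical `(start, length)` pairs, and the helper names are not from any library.

```python
from fractions import Fraction
import math

def arcs_intersect(arc_a, arc_b):
    """True if the half-open arcs [s, s + l) on the circle R/Z overlap."""
    (s1, l1), (s2, l2) = arc_a, arc_b
    return (s2 - s1) % 1 < l1 or (s1 - s2) % 1 < l2

def doubling_image(arc):
    """Image of an arc under x -> 2x mod 1: the length doubles until it covers the circle."""
    start, length = arc
    return ((2 * start) % 1, min(2 * length, 1))

def rotation_image(arc, alpha):
    """Image of an arc under x -> x + alpha mod 1: same length, shifted start."""
    start, length = arc
    return ((start + alpha) % 1, length)

def hitting_times(step, arc_a, arc_b, max_n):
    """All n in 1..max_n for which f^n(A) meets B."""
    hits, current = [], arc_a
    for n in range(1, max_n + 1):
        current = step(current)
        if arcs_intersect(current, arc_b):
            hits.append(n)
    return hits

# Doubling map with exact rational arithmetic.
A = (Fraction(0), Fraction(1, 16))
B = (Fraction(1, 2), Fraction(1, 16))
print("doubling:", hitting_times(doubling_image, A, B, 12))

# Irrational rotation with floating-point arcs.
alpha = (math.sqrt(5.0) - 1.0) / 2.0
rot_hits = hitting_times(lambda arc: rotation_image(arc, alpha),
                         (0.0, 0.05), (0.5, 0.05), 200)
print("rotation:", rot_hits)
```

For the doubling map the image of $A$ covers the whole circle from some iterate onward, so every later $n$ is a hitting time. For the rotation the hitting times keep recurring but are separated by gaps, so no threshold $N$ works.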

A system is said to be weakly topologically mixing if the shift operator has no non-constant continuous (with respect to the topology) eigenfunctions.

Topological mixing neither implies, nor is implied by, either weak or strong mixing: there are examples of systems that are weak mixing but not topologically mixing, and examples that are topologically mixing but not strong mixing.