Core Ideas

A fixed point of a function f is an input x such that f(x)=x. Fixed point theorems show that various types of functions must have fixed points, and sometimes give methods for finding those fixed points.

Fixed point theorems come in three flavors: Topological, Diagonal, and Iterative. (I sometimes refer to them by central examples as Brouwer, Lawvere, and Banach, respectively.)

Topological fixed points are non constructive. If f is continuous, f(0)>0, and f(1)<1, then we know f must have some fixed point between 0 and 1, since f(x) must somewhere transition from being greater than x to being less than x. This does not tell us where it happens. This can be especially troublesome when there are multiple fixed points, and there is no principled way to choose between them.

Diagonal fixed points are constructed with a weird trick where you feed a code for a function into that function itself. Given a function f, if you can construct a function g, which on input x, interprets x as a function, runs x on itself, and then runs f on the result (i.e. g(x):=f(x(x)).), then g(g) is a fixed point of f because g(g)=f(g(g)). This is not just an example; everything in the cluster looks like this. It is a weird trick, but it is actually very important.

Iterative fixed points can be found through iteration. For example, if f(x)=−x2, then starting with any x value, iterating f forever will converge to the unique fixed point x=0.

(There is a fourth cluster in number theory discussed here, but I am leaving it out, since it does not seem relevant to AI, and because I am not sure whether to put it by itself or to tack it onto the topological cluster.)

These applications all use topological fixed points very directly, and so carry with them most of the philosophical baggage of topological fixed points. For example, while Nash equilibria exist, they are not unique, are computationally hard to find, and feel non-constructive and arbitrary.

Diagonal Fixed Points

Diagonal fixed point theorems are all centered around the basic structure of g(g), where g(x):=f(x(x)), as mentioned previously.

The Tarski cluster is about fixed points of monotonic functions on (partially) ordered sets found by iteration. Tarski's fixed point theorem states that any order preserving function on a complete lattice has a fixed point (and further the set of fixed points forms a complete lattice). The least fixed point can be found by starting with the least element and iterating the function transfinitely. This, for example, implies that every monotonic function on from [0,1] to itself has a fixed point, even if it is not continuous. Kleene's fixed point theorem strengthens the assumptions of Tarski by adding a form of continuity (and also removes some irrelevant assumptions), which gives us that the least fixed point can be found by iterating the function only ω times. The fixed point lemma for normal functions is similar to Kleene, but with ordinals rather than partial orders. It states that any strictly increasing continuous function on ordinals has arbitrarily large fixed points.

The Banach cluster is about fixed points of contractive functions on metric spaces found by iteration. A contractive function is a function that sends points closer together. A function f is contractive if there exists an ϵ>0 such that for all x≠yd(f(x),f(y))≤(1−ϵ)d(x,y). Banach's fixed point theorem state that any contractive function has a unique fixed point. This fixed point is limn→∞fn(x) for any starting point x. An application of this to linear functions is that any ergodic stationary Markov chain has a stationary distribution (which is a fixed point of the transition map), which is converged to via iteration. This is also used in showing that correlated equilibria exist and can be found quickly. Banach can also be used to show that gradient descent converges exponentially quickly on a strongly convex function.

Interdisciplinary Nature

The mapping of the key fixed point theorems discussed in the exercises into these categories is surjective:

Lawvere's fixed point theorem is Algebra

Banach's fixed point theorem is Analysis

Brouwer fixed point theorem is Topology

Gödel's first incompleteness theorem is Logic

Sperner's lemma is Combinatorics.

On top of that, major applications of fixed point theorems show up in Differential Equations, CS theory, Machine Learning, Game Theory, and Economics.

Tomorrow's AI Alignment Forum sequences post will be two short posts 'Approval-directed bootstrapping' and 'Humans consulting HCH' by Paul Christiano in the sequence Iterated Amplification.

The next posting in this sequence will be four posts of Agent Foundations research that use fixed point theorems, on Wednesday 28th November. These will be re-posts of content from the now-defunct Agent Foundations forum, all of whose content is now findable on the AI Alignment Forum (and all old links will soon be re-directed to the AI Alignment Forum).