Simulated Annealing is a global optimization algorithm that belongs to the field of Stochastic Optimization and Metaheuristics.
Simulated Annealing is an adaptation of the Metropolis-Hastings Monte Carlo algorithm and is used in function optimization. Like the Genetic Algorithm, it provides a basis for a large variety of extensions and specialization's of the general method not limited to Parallel Simulated Annealing, Fast Simulated Annealing, and Adaptive Simulated Annealing.

Simulated Annealing is inspired by the process of annealing in metallurgy. In this natural process a material is heated and slowly cooled under controlled conditions to increase the size of the crystals in the material and reduce their defects. This has the effect of improving the strength and durability of the material. The heat increases the energy of the atoms allowing them to move freely, and the slow cooling schedule allows a new low-energy configuration to be discovered and exploited.

Each configuration of a solution in the search space represents a different internal energy of the system. Heating the system results in a relaxation of the acceptance criteria of the samples taken from the search space. As the system is cooled, the acceptance criteria of samples is narrowed to focus on improving movements. Once the system has cooled, the configuration will represent a sample at or close to a global optimum.

The information processing objective of the technique is to locate the minimum cost configuration in the search space.
The algorithms plan of action is to probabilistically re-sample the problem space where the acceptance of new samples into the currently held sample is managed by a probabilistic function that becomes more discerning of the cost of samples it accepts over the execution time of the algorithm. This probabilistic decision is based on the Metropolis-Hastings algorithm for simulating samples from a thermodynamic system.

Simulated Annealing was designed for use with combinatorial optimization problems, although it has been adapted for continuous function optimization problems.

The convergence proof suggests that with a long enough cooling period, the system will always converge to the global optimum. The downside of this theoretical finding is that the number of samples taken for optimum convergence to occur on some problems may be more than a complete enumeration of the search space.

Performance improvements can be given with the selection of a candidate move generation scheme (neighborhood) that is less likely to generate candidates of significantly higher cost.

Restarting the cooling schedule using the best found solution so far can lead to an improved outcome on some problems.

A common acceptance method is to always accept improving solutions and accept worse solutions with a probability of $P(accept) \leftarrow \exp(\frac{e-e'}{T})$, where $T$ is the current temperature, $e$ is the energy (or cost) of the current solution and $e'$ is the energy of a candidate solution being considered.

The size of the neighborhood considered in generating candidate solutions may also change over time or be influenced by the temperature, starting initially broad and narrowing with the execution of the algorithm.

A problem specific heuristic method can be used to provide the starting point for the search.

Listing (below) provides an example of the Simulated Annealing algorithm implemented in the Ruby Programming Language.
The algorithm is applied to the Berlin52 instance of the Traveling Salesman Problem (TSP), taken from the TSPLIB. The problem seeks a permutation of the order to visit cities (called a tour) that minimizes the total distance traveled. The optimal tour distance for Berlin52 instance is 7542 units.

The algorithm implementation uses a two-opt procedure for the neighborhood function and the classical $P(accept) \leftarrow \exp(\frac{e-e'}{T})$ as the acceptance function. A simple linear cooling regime is used with a large initial temperature which is decreased each iteration.

Simulated Annealing is credited to Kirkpatrick, Gelatt, and Vecchi in 1983 [Kirkpatrick1983]. Granville, Krivanek, and Rasson provided the proof for convergence for Simulated Annealing in 1994 [Granville1994].
There were a number of early studies and application papers such as Kirkpatrick's investigation into the TSP and minimum cut problems [Kirkpatrick1983a], and a study by Vecchi and Kirkpatrick on Simulated Annealing applied to the global wiring problem [Vecchi1983].

There are many excellent reviews of Simulated Annealing, not limited to the review by Ingber that describes improved methods such as Adaptive Simulated Annealing, Simulated Quenching, and hybrid methods [Ingber1993].
There are books dedicated to Simulated Annealing, applications and variations. Two examples of good texts include "Simulated Annealing: Theory and Applications" by Laarhoven and Aarts [Laarhoven1988] that provides an introduction to the technique and applications, and "Simulated Annealing: Parallelization Techniques" by Robert Azencott [Azencott1992] that focuses on the theory and applications of parallel methods for Simulated Annealing.