The canonical partition function is defined as $$Z=\sum_{s}e^{-\beta E_s}$$ with the sum being over all states of the system. The way I saw this derived was by assuming that for each state, the probability of the system occupying that state is proportional to the Boltzmann factor: $P(E=E_s) = c \cdot e^{-\beta E_s}$. Summing the probabilities to one gives $Z=\frac{1}{c}$.
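
As a concrete illustration of the normalization argument, here is a minimal sketch for a finite set of states. The energies and the value of $\beta$ are hypothetical numbers chosen only for the example, and the function name is not from any library.

```python
import math

def boltzmann_probabilities(energies, beta):
    """Return (Z, probabilities) for a finite list of state energies."""
    # Unnormalized Boltzmann factors exp(-beta * E_s), one per state
    weights = [math.exp(-beta * e) for e in energies]
    # The partition function is the sum of the Boltzmann factors
    Z = sum(weights)
    return Z, [w / Z for w in weights]

# Hypothetical three-state system (arbitrary energy units)
energies = [0.0, 1.0, 2.0]
beta = 1.0

Z, probs = boltzmann_probabilities(energies, beta)

# Recover the proportionality constant c from P = c * exp(-beta * E)
c = probs[0] / math.exp(-beta * energies[0])
```

The probabilities sum to one by construction, and the recovered constant `c` equals `1 / Z` up to rounding.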

My question is: what principles are used to obtain this probability distribution for the energy?

2 Answers

My favorite way to obtain the canonical partition function is via quantum statistical mechanics and involves essentially only one principle: maximum entropy. The principle says that to obtain the statistical state of a system in a certain ensemble, one extremizes the entropy subject to the constraints that define the ensemble.

In the context of quantum statistical mechanics for a system in the canonical ensemble, one extremizes the so-called von-Neumann entropy
$$
S_\mathrm{vn}(\rho) = -k\,\mathrm{tr}(\rho\ln\rho)
$$
subject to the constraint that the ensemble average energy has some fixed value $E$:
$$
\mathrm{tr}(\rho H) = E
$$
Here $\rho$ denotes the density operator of the system, and $H$ is its Hamiltonian. This constraint is, in fact, one way of defining the canonical ensemble. Together with the normalization condition $\mathrm{tr}(\rho) = 1$, this is a constrained optimization problem that can be solved using the method of Lagrange multipliers. The result is that the density operator of the system is
$$
\rho = \frac{1}{Z}e^{-\beta H}, \qquad Z = \mathrm{tr}(e^{-\beta H})
$$
where $\beta$ is the Lagrange multiplier corresponding to the constraint of fixed ensemble average energy.
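
The maximum-entropy claim can be checked numerically in a simple special case. The sketch below assumes a state that is diagonal in the energy eigenbasis, so the density operator reduces to a list of probabilities, and it uses units with $k = 1$. The three energy levels, the value of $\beta$, and the perturbation direction are hypothetical choices for illustration. The direction $(1, -2, 1)$ is picked because it changes neither the normalization nor the average energy for these particular levels.

```python
import math

def gibbs(energies, beta):
    """Diagonal entries of exp(-beta H) / Z in the energy eigenbasis."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def entropy(p):
    """Entropy -sum p ln p of a diagonal state, in units where k = 1."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mean_energy(p, energies):
    """Ensemble average energy tr(rho H) for a diagonal state."""
    return sum(x * e for x, e in zip(p, energies))

# Hypothetical three-level system
energies = [0.0, 1.0, 2.0]
beta = 1.0
p_gibbs = gibbs(energies, beta)

# This direction sums to zero and has zero energy-weighted sum,
# so moving along it preserves both constraints.
direction = [1.0, -2.0, 1.0]

def perturbed(t):
    """Nearby state with the same normalization and average energy."""
    return [x + t * d for x, d in zip(p_gibbs, direction)]

S_gibbs = entropy(p_gibbs)
S_plus = entropy(perturbed(0.01))
S_minus = entropy(perturbed(-0.01))
E_gibbs = mean_energy(p_gibbs, energies)
E_plus = mean_energy(perturbed(0.01), energies)
```

Both perturbed states satisfy the same constraints as the Gibbs state but have lower entropy, which is what the Lagrange multiplier solution predicts. This is only a spot check along one direction, not a proof.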

Important Digression. If you use the derivation above, it's not at all clear a priori why the multiplier $\beta$ is inverse temperature. The multiplier $\beta$ can be identified with inverse temperature using the following argument. Notice that for $\rho$ of the canonical ensemble, we have
\begin{align}
S_\mathrm{vn}(\rho)
&= -k\mathrm{tr}\left(\rho\ln \frac{e^{-\beta H}}{Z}\right)\\
&= -k\mathrm{tr}\left(\rho(\ln e^{-\beta H}-\ln Z)\right)\\
&=k\mathrm{tr}\left(\rho(\beta H+\ln Z)\right)\\
&= k(\beta \mathrm{tr}(\rho H) + \ln Z)\\
&= k(\beta E + \ln Z)
\end{align}
Now, notice that the Lagrange multiplier is actually a function of $E$, the constrained value of the ensemble average of $H$, so we have
$$
\frac{\partial S_\mathrm{vn}}{\partial E} = k\left(\beta ' E + \beta + \frac{1}{Z}\frac{\partial Z}{\partial E}\right)
$$
where $\beta'$ denotes the derivative of $\beta$ with respect to $E$, but
\begin{align}
\frac{1}{Z}\frac{\partial Z}{\partial E}
&= \frac{1}{Z}\frac{\partial}{\partial E} \mathrm{tr}(e^{-\beta H})
= \mathrm{tr}(-\beta'He^{-\beta H}/Z)
= -\beta'\mathrm{tr}(\rho H)
= -\beta' E
\end{align}
so that putting this all together, we get
$$
\frac{\partial S_\mathrm{vn}}{\partial E} = k\beta
$$
On the other hand, recall that the thermodynamic temperature satisfies
$$
\frac{1}{T} = \frac{\partial S}{\partial E}
$$
so that identifying the von-Neumann entropy with the thermodynamic entropy ($S = S_\mathrm{vn}$) gives
$$
\beta = \frac{1}{kT}
$$
as desired.
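
The relation $\partial S_\mathrm{vn}/\partial E = k\beta$ can also be checked numerically. The sketch below assumes a hypothetical three-level spectrum and units with $k = 1$. It evaluates $S = \beta E + \ln Z$ at two nearby values of $\beta$ and takes the ratio of the resulting changes in $S$ and $E$, which is a finite-difference estimate of $\partial S/\partial E$.

```python
import math

# Hypothetical energy levels (arbitrary units), with k = 1
energies = [0.0, 1.0, 2.0]

def log_Z(beta):
    """ln Z for the finite spectrum above."""
    return math.log(sum(math.exp(-beta * e) for e in energies))

def mean_energy(beta):
    """E(beta) = tr(rho H) for the canonical state."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(e * w for e, w in zip(energies, weights)) / sum(weights)

def entropy(beta):
    """S = beta * E + ln Z, the result derived above with k = 1."""
    return beta * mean_energy(beta) + log_Z(beta)

beta = 0.7   # hypothetical inverse temperature
h = 1e-4     # small step in beta for the finite difference

# Changes in S and E between beta - h and beta + h
dS = entropy(beta + h) - entropy(beta - h)
dE = mean_energy(beta + h) - mean_energy(beta - h)

# Their ratio approximates the derivative of S with respect to E
dS_dE = dS / dE
```

The value of `dS_dE` agrees with `beta` up to the finite-difference error, consistent with the cancellation of the $\beta'$ terms in the derivation.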

In the canonical ensemble, you have a heat reservoir ($L$) and the observed system ($l$).
The total energy is $E_{tot} = E_L + E_l$, and it is constant because the total system $(L + l)$ is isolated.

The probability $p_l$ of finding the observed system in a given microscopic state of energy $E_l$ is equal to the probability of finding the heat reservoir with energy $E_L = E_{tot} - E_l$, that is:

$$p_l = \frac{\Omega_L(E_L)}{\Omega_{TOT}}$$
where $\Omega_L(E_L)$ is the number of microscopic states for the heat reservoir, and $\Omega_{TOT}$ is the number of microscopic states for the total system $(L+ l)$.
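
The counting formula above can be illustrated with a toy model. This sketch makes several assumptions that are not in the text: the reservoir consists of `N` two-state units that each hold zero or one quantum of energy, so that $\Omega_L(E_L)$ is a binomial coefficient, and the observed system has three non-degenerate states with energies 0, 1 and 2 quanta. All numbers are hypothetical.

```python
import math

# Toy reservoir: N two-state units, each holding 0 or 1 energy quantum
N = 1000
# Fixed total energy of reservoir plus system, in quanta
E_tot = 250
# Hypothetical non-degenerate states of the observed system
system_energies = [0, 1, 2]

def omega_reservoir(E_L):
    """Number of reservoir microstates with energy E_L."""
    return math.comb(N, E_L)

# Reservoir microstates compatible with each system state
counts = [omega_reservoir(E_tot - E_l) for E_l in system_energies]

# Total number of microstates of the combined isolated system
omega_total = sum(counts)

# p_l = Omega_L(E_tot - E_l) / Omega_TOT
probabilities = [c / omega_total for c in counts]

# Ratios of successive probabilities
ratios = [probabilities[i + 1] / probabilities[i]
          for i in range(len(probabilities) - 1)]
```

For this large reservoir the successive ratios are nearly equal, which is the behaviour expected of a factor of the form $e^{-\beta E_l}$ with a single $\beta$.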