1D and N-dimensional Gaussian function
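As a quick sketch (not from the original notes), the 1D and N-dimensional Gaussian density functions can be written directly in NumPy; `gaussian_1d` and `gaussian_nd` are illustrative names:

```python
import numpy as np

def gaussian_1d(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def gaussian_nd(x, mu, cov):
    """Density of an N-dimensional Gaussian with mean vector mu and covariance matrix cov."""
    d = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    # solve(cov, diff) computes cov^{-1} @ diff without forming the inverse explicitly
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm
```

For d = 1 with a 1x1 covariance matrix, `gaussian_nd` reduces to `gaussian_1d`.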

GMMs
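A Gaussian mixture model is just a weighted sum of Gaussian densities, with mixing weights that sum to 1. A minimal 1D sketch (the weights, means, and standard deviations below are made-up illustrative values):

```python
import numpy as np

# Mixing weights (must sum to 1), component means, and component standard deviations.
weights = np.array([0.4, 0.6])
means   = np.array([-2.0, 3.0])
stds    = np.array([1.0, 0.5])

def gmm_density(x):
    """Weighted sum of the component Gaussian densities at point x."""
    comps = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    return np.sum(weights * comps)
```

Because each component integrates to 1 and the weights sum to 1, the mixture density also integrates to 1.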

EM

But you do not know the parameters of these Gaussians: their means, their covariance matrices, and the probability that a data point stems from the 1st or the 2nd Gaussian. These parameters are unobservable, hidden, or “latent”.

EM can find good values for these latent variables by an iterative procedure.

Starting with some initial guess for these parameters, we first

estimate, for each data point, the probability that it comes from the 1st or the 2nd Gaussian (E-step)

then use this soft assignment of data points to the Gaussians to update the estimates for the latent variables, i.e. the means, covariance matrices, and “data point producer probabilities” (M-step)
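The two steps above can be sketched for two 1D Gaussians (a hedged illustration with synthetic data and made-up initial guesses, not a production implementation; for 1D, variances take the place of covariance matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data drawn from two 1D Gaussians (illustrative parameters).
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.7, 200)])

def pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Initial guesses for the latent parameters.
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])   # the "data point producer probabilities"

for _ in range(50):
    # E-step: for each data point, the probability that it came from
    # the 1st or the 2nd Gaussian (the "responsibilities").
    resp = pi * np.stack([pdf(data, mu[k], sigma[k]) for k in range(2)], axis=1)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: update means, standard deviations, and mixing weights
    # from the responsibility-weighted data.
    nk = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(data)
```

After the loop, `mu`, `sigma`, and `pi` should sit near the values used to generate the data.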

To sum it up:

Expectation-maximization, as expected, works in two alternating steps. Expectation refers to computing the probability that each datum is a member of each class; maximization refers to altering the parameters of each class to maximize those probabilities.
from jormungand.net

EM explanation #1

A quick and easy explanation:

EM explanation #2

Explanations by Andrew Ng from Stanford University (start at minute 18):