The aims of this course are to introduce the principles and applications
of information theory. The course will study how information is measured
in terms of probability and entropy, and the relationships among conditional
and joint entropies; how these are used to calculate the capacity of a
communication channel, with and without noise; coding schemes, including
error-correcting codes; how discrete channels and measures of information
generalize to their continuous forms; the Fourier perspective; and extensions
to wavelets, complexity, compression, and efficient coding of audio-visual
information.

Foundations: probability, uncertainty, information.
How concepts of randomness, redundancy,
compressibility, noise, bandwidth, and uncertainty are
related to information.
Ensembles, random variables, marginal and conditional probabilities.
How the measures of information are grounded in the rules
of probability.
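
As a concrete illustration (a toy example, not part of the course
materials), the sketch below derives marginal and conditional
probabilities from a small hypothetical joint distribution; every
information measure that follows is built from exactly these quantities.

    import numpy as np

    # Hypothetical joint distribution p(x, y) over two binary variables.
    p_xy = np.array([[0.30, 0.10],
                     [0.20, 0.40]])

    p_x = p_xy.sum(axis=1)              # marginal p(x) = sum over y of p(x, y)
    p_y = p_xy.sum(axis=0)              # marginal p(y) = sum over x of p(x, y)
    p_y_given_x = p_xy / p_x[:, None]   # conditional p(y | x) = p(x, y) / p(x)

    # The product rule recovers the joint: p(x, y) = p(x) p(y | x).
    assert np.allclose(p_xy, p_x[:, None] * p_y_given_x)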

Entropies defined, and why they are measures of information.
Marginal entropy, joint entropy, conditional entropy,
and the Chain Rule for entropy. Mutual information between ensembles
of random variables. Why entropy is the fundamental measure of
information content.
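
For instance, the following sketch (hypothetical numbers, chosen only
for illustration) computes these entropies for a small joint
distribution and checks the Chain Rule H(X,Y) = H(X) + H(Y|X) together
with the identity I(X;Y) = H(X) + H(Y) - H(X,Y).

    import numpy as np

    def H(p):
        """Entropy in bits; 0 log 0 is taken as 0."""
        p = np.asarray(p, dtype=float).ravel()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    p_xy = np.array([[0.30, 0.10],     # hypothetical joint p(x, y)
                     [0.20, 0.40]])
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

    # Conditional entropy H(Y|X) = sum over x of p(x) H(Y | X = x).
    H_y_given_x = sum(px * H(row / px) for px, row in zip(p_x, p_xy))

    assert np.isclose(H(p_xy), H(p_x) + H_y_given_x)   # the Chain Rule
    I_xy = H(p_x) + H(p_y) - H(p_xy)                   # mutual information
    print(f"I(X;Y) = {I_xy:.4f} bits")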

Channel types, properties, noise, and channel capacity.
Perfect communication through a noisy channel. Capacity of a
discrete channel as the maximum of its mutual information over
all possible input distributions.
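
As one worked instance (an illustrative sketch, not course code): the
binary symmetric channel with crossover probability f has capacity
C = 1 - H(f), attained by the uniform input distribution, and a
brute-force search over input distributions agrees with that closed form.

    import numpy as np

    def h2(p):
        """Binary entropy in bits."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    def mutual_information(q, f):
        """I(X;Y) for input P(X=1) = q over a BSC with error probability f."""
        py1 = q * (1 - f) + (1 - q) * f     # P(Y = 1)
        return h2(py1) - h2(f)              # I = H(Y) - H(Y|X); H(Y|X) = H(f)

    f = 0.1
    qs = np.linspace(0.0, 1.0, 1001)
    capacity = max(mutual_information(q, f) for q in qs)
    print(capacity, 1 - h2(f))   # both about 0.531 bits/symbol; max at q = 1/2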

The quantized degrees of freedom in a continuous signal.
Why a continuous signal of finite bandwidth and duration has a fixed
number of degrees of freedom. Diverse illustrations of the principle
that information, even in such a signal, comes in quantized, countable
packets.
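
A sketch of the idea, with hypothetical parameters: a signal
band-limited to W hertz and observed for T seconds is fixed by roughly
2WT sample values taken at the Nyquist rate 2W, from which it can be
rebuilt by sinc interpolation.

    import numpy as np

    W, T = 4.0, 2.0                      # bandwidth (Hz) and duration (s)
    n = np.arange(int(2 * W * T))        # 2WT = 16 sample indices
    ts = n / (2 * W)                     # sample instants at the Nyquist rate

    def signal(t):
        """A toy signal containing only frequencies below W."""
        return np.sin(2 * np.pi * 1.5 * t) + 0.5 * np.cos(2 * np.pi * 3.0 * t)

    samples = signal(ts)
    t = 0.73                             # reconstruct at an arbitrary instant
    recon = np.sum(samples * np.sinc(2 * W * t - n))
    print(recon, signal(t))   # close; exact only as the observation window grows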

Gabor-Heisenberg-Weyl uncertainty relation. Optimal “Logons”.
Unification of the time domain and the frequency domain as endpoints
of a continuous deformation. The Uncertainty Principle and its optimal
solution by Gabor’s expansion basis of “logons”. Multi-resolution
wavelet codes. Extension to images, for analysis and compression.
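
As a numerical illustration (hypothetical parameters), the sketch below
builds one logon, a complex exponential under a Gaussian envelope, and
confirms that its time-frequency spread attains the lower bound of the
uncertainty product, about 1/2 in seconds times radians per second.

    import numpy as np

    N, dt, sigma, w0 = 4096, 0.01, 0.5, 2 * np.pi * 5.0
    t = (np.arange(N) - N // 2) * dt
    g = np.exp(-t**2 / (2 * sigma**2)) * np.exp(1j * w0 * t)   # the logon

    def spread(axis, density):
        """Root-mean-square width of a normalized energy density."""
        density = density / np.sum(density)
        mean = np.sum(axis * density)
        return np.sqrt(np.sum((axis - mean) ** 2 * density))

    G = np.fft.fftshift(np.fft.fft(g))
    w = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dt))

    dt_spread = spread(t, np.abs(g) ** 2)
    dw_spread = spread(w, np.abs(G) ** 2)
    print(dt_spread * dw_spread)      # about 0.5, the Gabor-Heisenberg bound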

Kolmogorov complexity. Minimal description length.
Definition of the algorithmic complexity of a data sequence, and
its relation to the entropy of the distribution from which the data
was drawn. Fractals. Minimal description length, and why this measure
of complexity is not computable.
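
Although Kolmogorov complexity is not computable, any compressor yields
an upper bound on it. The sketch below (illustrative only) contrasts the
compressed lengths of a highly regular sequence and a random one of
equal size, mirroring the link between complexity and entropy.

    import os, zlib

    n = 10_000
    regular = b"ab" * (n // 2)       # highly patterned: short description
    random_ = os.urandom(n)          # incompressible with high probability

    for name, data in [("regular", regular), ("random", random_)]:
        bound = len(zlib.compress(data, 9))
        print(f"{name}: {len(data)} bytes -> upper bound {bound} bytes")
    # The regular string compresses to a few dozen bytes; the random one
    # hardly at all.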