“The Minimum Description Length (MDL) principle is a method for inductive inference that provides a generic solution to the model selection problem, and, more generally to the overfitting problem.” (Peter Grünwald)
In this talk we will provide a very basic introduction to the philosophy and general idea behind the MDL principle, which views learning as data compression. While we focus mostly on the high-level goals of MDL and how it compares to, e.g., Bayesian inference or the information bottleneck, we will gently introduce a “crude two-part” version of MDL in some detail. Finally, we briefly outline a “refined” version of MDL and discuss its pros and cons.
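To make the “crude two-part” idea concrete, here is a minimal toy sketch (not taken from the talk): the two-part code length of a hypothesis H for data D is L(H) + L(D|H), and we pick the hypothesis minimizing that total. The coding scheme below is an illustrative assumption: a Bernoulli parameter p = k/2^b (k odd) costs b bits to describe, and the data are encoded with the Shannon code under p.

```python
import math

def two_part_code_length(data, p, bits):
    """Crude two-part MDL score: L(H) + L(D|H).
    L(H): `bits` bits to describe p at that precision (a toy model cost).
    L(D|H): Shannon code length of the binary sequence under Bernoulli(p)."""
    l_h = bits
    l_d_given_h = sum(-math.log2(p if x == 1 else 1.0 - p) for x in data)
    return l_h + l_d_given_h

data = [1] * 75 + [0] * 25  # 100 coin flips, 75 heads

# Hypotheses: p = k / 2**b with k odd, so each p has a unique precision b,
# and describing a finer-grained p costs strictly more bits.
candidates = [(k / 2 ** b, b)
              for b in range(1, 9)
              for k in range(1, 2 ** b, 2)]

best_p, best_b = min(candidates,
                     key=lambda c: two_part_code_length(data, *c))
# The coarse hypothesis p = 3/4 (2 bits) wins: a finer p cannot shorten
# L(D|H) enough to pay for its longer description.
```

This already shows the trade-off at the heart of MDL: more complex hypotheses are admissible, but they must earn their description length back in compression of the data.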

Recommended Reading

Since we start from scratch, no reading is required.
For a fruitful discussion about MDL vs. Bayesian inference vs. frequentist statistics at the end of the talk, you can browse (the beginnings of) sections 17.1 and 17.2 here: