Experimental Design

Strongly Recommended Prerequisites

Recommended Prerequisites

None

Experiments are a very important component of data science; they're the only reliable way to measure the change in one variable caused by another. Since experiments are often expensive to perform, it is important to design them so that each run yields as much information as possible. No single design is optimal for every scenario, so data scientists must be familiar with the specific designs that work well in the kinds of situations they generally encounter.
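To make the claim about causal measurement concrete, here is a minimal simulation sketch. It compares an estimate taken from randomized assignment with one taken from self-selected (observational) data. Everything in it is an illustrative assumption rather than something from the books below: the `simulate` function, the hidden confounder, the true effect of 2.0, and the noise levels are all made up for the example.

```python
import random
import statistics


def simulate(n=20000, true_effect=2.0, randomized=True, seed=0):
    """Return the difference in mean outcome between treated and untreated units.

    Hypothetical setup: a hidden confounder raises the outcome on its own.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        confounder = rng.gauss(0.0, 1.0)
        if randomized:
            # Experiment: a coin flip decides treatment, independent of the confounder.
            is_treated = rng.random() < 0.5
        else:
            # Observation: units with a high confounder tend to end up treated.
            is_treated = confounder + rng.gauss(0.0, 1.0) > 0.0
        outcome = true_effect * is_treated + 3.0 * confounder + rng.gauss(0.0, 1.0)
        (treated if is_treated else control).append(outcome)
    return statistics.mean(treated) - statistics.mean(control)


if __name__ == "__main__":
    print("randomized estimate:   ", round(simulate(randomized=True), 2))
    print("observational estimate:", round(simulate(randomized=False), 2))
```

Under these assumptions the randomized estimate should land close to the true effect of 2.0, while the observational estimate should come out well above it, because the confounder's influence gets attributed to the treatment. Randomization breaks that link, which is why the designs covered in the books below all build on it.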

Recommended Books

Design and Analysis of Experiments with R

John Lawson

Key Features

In-text exercises

Example R code

Key Topics

Completely Randomized Design

Crossover and Repeated Measure Designs

Designs to Study Variances

Experimental Strategies for Increasing Knowledge

Factorial Designs

Fractional Factorial Designs

Incomplete and Confounded Block Designs

Linear Models

Mixture Experiments

Nested Designs

Randomization

Randomized Block Designs

Replication

Response Surface Designs

Robust Parameter Design Experiments

Split Plot Designs

Description

They say, "Don't judge a book by its cover," but this book is almost worth buying for its cover alone. Lawson systematically covers the most common experimental situations and teaches you when each design is appropriate. In many cases the basic designs introduced early in the text will work well. However, if you find yourself in a situation where your experimental units are highly heterogeneous and several factors are hard to vary, you'll be glad you have this book to teach you the more obscure but more performant designs. Lawson provides example code that takes some of the guesswork out of working with arcane R packages.

Statistics for Experimenters: Design, Innovation, and Discovery

George E.P. Box, J. Stuart Hunter, William G. Hunter

Key Features

In-text exercises

Solutions to some exercises

Key Topics

Blocking and Randomization

Data Transformation

Designing Robust Products

Elementary Probability and Statistics

Evolutionary Process Operation

Factorial Designs

Fractional Factorial Designs

Latin Squares Design

Linear Models

Process Control, Forecasting, and Time Series

Randomized Block Designs

Response Surface Methods

Split Plot Design

Description

This is the OG text on experimental design, and any data scientist who does a lot of experimentation will benefit from reading through it. It isn't our top pick because it doesn't work as well as a reference, and it tries to be too many things at once: an introductory statistics book, an experimental design book, an operations research book, and so on. That said, it does have a lot of wisdom to offer on those subjects, so if you're interested in them, this book will serve you well.