"There are a lot of small data problems that occur in big data. They don't disappear because you've got lots of stuff. They get worse." is a quote by British biostatistician David J. Spiegelhalter (1953 - ). The quote appears in Tim Harford's March 28, 2014 Financial Times article, "Big data: are we making a big mistake?"

A joke for discussing how transformations can bring data closer to normality and stabilize variances across groups with different means (here, the square root transformation for Poisson data). The joke was written in 2016 by Larry Lesser from The University of Texas at El Paso.
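
For instructors who want a numerical companion to the joke, the following is a minimal sketch (not part of the original resource) of why the square root is called variance-stabilizing for Poisson counts. It sums the Poisson probability mass function directly, so no random sampling is involved. The function name `poisson_variances`, the truncation rule, and the three example means are illustrative assumptions.

```python
import math


def poisson_variances(mean, tail_sds=12):
    """Return (Var[X], Var[sqrt(X)]) for X ~ Poisson(mean), summed from the pmf."""
    # Truncate the infinite sum far into the upper tail (an assumption that is
    # more than adequate for the moderate means used here).
    upper = int(mean + tail_sds * math.sqrt(mean)) + 30
    p = math.exp(-mean)  # P(X = 0)
    ex = ex2 = esqrt = 0.0
    for k in range(upper + 1):
        ex += k * p
        ex2 += k * k * p
        esqrt += math.sqrt(k) * p
        p *= mean / (k + 1)  # recurrence: P(X = k+1) = P(X = k) * mean / (k+1)
    var_raw = ex2 - ex * ex
    var_sqrt = ex - esqrt * esqrt  # uses E[(sqrt X)^2] = E[X]
    return var_raw, var_sqrt


# Hypothetical group means chosen only for illustration.
results = {m: poisson_variances(m) for m in (4, 25, 100)}
for m, (raw, root) in results.items():
    print(f"mean={m:>3}  Var[X]={raw:.3f}  Var[sqrt X]={root:.3f}")
```

The raw variance grows in step with the mean, while the variance of the square-rooted counts stays close to 1/4 for all three groups, with the approximation improving as the mean increases.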

A joke to be used in teaching about the use of randomization in experiments or about the Pearson correlation coefficient. The idea for the joke came from Lawrence Mark Lesser of The University of Texas at El Paso in 2012.

A pun to familiarize students with Anscombe's Quartet, the group of four data sets with nearly identical means, standard deviations, correlations, and regression lines for X and Y, published by British statistician Frank Anscombe in a 1973 paper in The American Statistician. The joke was written in 2016 by Larry Lesser from The University of Texas at El Paso. It works better in a written than in an oral presentation: students who have never heard of Anscombe's Quartet will not "get" the joke, and the teaching value comes from having them look it up.
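
Students who look up the quartet can check its defining property themselves. The sketch below (an addition, not part of the original resource) computes the summary statistics for the four data sets using only the Python standard library. The data values are those published in Anscombe's 1973 paper; the names `QUARTET`, `summarize`, and `summaries` are illustrative.

```python
import math

# Data sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Means, sample SDs, Pearson r, and least-squares line for one data set."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "sd_x": math.sqrt(sxx / (n - 1)),
        "sd_y": math.sqrt(syy / (n - 1)),
        "r": sxy / math.sqrt(sxx * syy),
        "slope": slope,
        "intercept": my - slope * mx,
    }


summaries = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}
for name, s in summaries.items():
    print(name, {key: round(value, 2) for key, value in s.items()})
```

The four rows of output agree to about two decimal places, which is the point of the quartet: the differences between the data sets only become visible when they are plotted.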

This is a chapter on data wrangling excerpted from the data science book “Modern Data Science with R” by Benjamin J. Baumer, Daniel T. Kaplan, and Nicholas J. Horton. It contains the R code needed for basic data tasks such as sorting, arranging, and summarizing.