Sunday, July 23, 2017

Statistics Sunday: Statistics Reading Round-Up

I'm working on some future Statistics Sunday posts, so in the meantime, I thought I'd offer you some of the statistics books on my reading list these days.

Recently, I finished reading:

How to Lie with Statistics by Darrell Huff - a quick read at 144 pages. Lots of good information, especially for people with little statistics knowledge, because it will help you be a better consumer of the research information you encounter in the media. He doesn't really go into how probability can influence sampling, and he focuses on bias from the original researchers rather than from secondary sources sharing the research. But you'll still learn a lot, and you can knock this book out in a couple of sittings.

Statistics Done Wrong: The Woefully Complete Guide by Alex Reinhart - this book grew out of a project Reinhart did as an undergraduate, which he started before he had any statistics training whatsoever. He's now a PhD candidate. His statistics knowledge is a bit thin in some places, and he switches back and forth on a few issues, but his understanding of probability is excellent. I learned a lot from him.

Fooled by Randomness by Nassim Nicholas Taleb - this is one of the books from Taleb's Incerto series. I had two of the books on my reading list - this one and The Black Swan - so I went ahead and picked up the full set, since it was only a little more than $40. Taleb is a former trader, and talks about some of the cognitive biases that cause us to see systematic explanations for random events, using the financial industry to demonstrate key probability concepts. His writing is very readable - you feel like he's sitting across from you chatting.

I'm currently reading The Seven Pillars of Statistical Wisdom by Stephen M. Stigler. In it, Stigler breaks down the seven core concepts that influenced statistical thinking. Though they're basic concepts now, they were radical propositions at one time. For instance, the first pillar is aggregation - summarizing the data with a single number. We do this all the time now with descriptive statistics like the mean, but when it was first proposed, mathematicians saw it as throwing away data or, worse, trusting bad data along with good data (good and bad of course being subjective concepts).
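
If you'd like to see the aggregation worry in action, here's a minimal Python sketch. To be clear, this is my own illustration with made-up numbers, not an example from Stigler's book: the readings are hypothetical, and calling the last one "bad" is just an assumption for the sake of the demonstration.

```python
from statistics import mean

# Hypothetical measurements (made up for illustration).
good_readings = [10.1, 9.9, 10.0, 10.2, 9.8]

# Assume one additional reading that we suspect is "bad".
all_readings = good_readings + [15.0]

# Aggregation: each list is summarized by a single number.
mean_good = mean(good_readings)
mean_all = mean(all_readings)

print(f"Mean of the good readings: {mean_good:.2f}")
print(f"Mean including the suspect reading: {mean_all:.2f}")
```

The single suspect reading pulls the summary upward, and once you only have the mean, you can't tell which individual readings produced it. That's roughly the trade-off those early critics were uneasy about.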