
Office Pool Statistics

I just finished watching an O’Reilly webcast on statistics for NFL office pools. I don’t care much about football, unless it’s the other kind of football, but I was interested to see which pieces of Python the presenter, [Tanya Schlusser][], was going to use: [pandas][] and [scikit-learn][]. Her presentation was pretty dense, but luckily she made the code, including a Jupyter notebook, available on [GitHub][]. *Thank you, Tanya!*

A couple of other things came up in the group chat that accompanied the presentation or in the presentation itself:

* [seaborn][] is a statistical data visualization library for Python.
* [statsmodels][] “provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration.”
* You can store fitted `scikit-learn` models in [pickles][].
* I shouldn’t forget about [OpenRefine][].
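
For my own future reference, here is a minimal sketch of the pickle idea. The toy data and variable names are my own invention for illustration, not anything from Tanya’s notebook, and I am assuming a plain `LogisticRegression` just to have something to save:

```python
import pickle

from sklearn.linear_model import LogisticRegression

# Toy data, made up for illustration: point spread -> did the home team win?
X = [[-7.0], [-3.0], [-1.0], [1.0], [3.0], [7.0]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression().fit(X, y)

# Serialize the fitted model to bytes; pickle.dump would write to a file instead.
blob = pickle.dumps(model)

# Later: restore the model and predict without refitting.
restored = pickle.loads(blob)
same = list(restored.predict(X)) == list(model.predict(X))
```

Two things to keep in mind: only unpickle files you trust, and a pickled model is best loaded with the same version of `scikit-learn` that created it.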

## Addendum ##

As regular readers of these notes know, installation of `scikit-learn` is as easy as:

    % sudo port install py34-scikit-learn

What I didn’t know is that the installation of `seaborn` in MacPorts includes `statsmodels`, which in turn depends on Patsy:

> Patsy is a Python library for describing statistical models (especially linear models, or models that have a linear component) and building design matrices. Patsy brings the convenience of R “formulas” to Python.
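
To see what an R-style “formula” looks like in practice, here is a small sketch using Patsy’s `dmatrices` function. The column names and numbers are hypothetical, made up purely for illustration:

```python
import pandas as pd
from patsy import dmatrices

# Made-up numbers, purely for illustration.
games = pd.DataFrame({
    "points": [24, 17, 31, 10, 27],
    "yards": [350, 280, 410, 220, 390],
    "turnovers": [1, 2, 0, 3, 1],
})

# R-style formula: model points as a function of yards and turnovers.
# Patsy adds the intercept column automatically.
y, X = dmatrices("points ~ yards + turnovers", data=games)

# Names of the columns in the design matrix Patsy built.
columns = X.design_info.column_names
```

The left side of the `~` becomes the outcome matrix `y` and the right side becomes the design matrix `X`, which is the shape of input a linear model expects.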

