Let's Learn Statistics!

Bargava Subramanian (~bargava) | 03 May, 2015

Description:

Statistics has some important concepts and thought processes that drive Data Science. But is Statistics an arcane mathematical subject filled with esoteric formulae and concepts, and hence difficult to learn? We think not.

BUT?!!

"I am a programmer", "Math is not my cup of tea", "It's been ages since I did math. I don't know if I am capable of doing it", "WTH? I thought everything was commoditized/productized. So why learn statistics?" We hear ya!

Why don't we take an application-centric programming approach to learn some of the basic concepts that drive data science? Is it possible? Most definitely.
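To give a flavour of what "learning statistics by programming" can look like, here is a minimal sketch that estimates the standard error of a mean by resampling instead of by formula. It is only an illustration and is not taken from the workshop material: the function name `bootstrap_standard_error` and the `heights_cm` values are made up for this example, and it uses only the standard library so it runs without any extra installs.

```python
import random


def bootstrap_standard_error(sample, n_resamples=2000, seed=0):
    """Estimate the standard error of the mean by bootstrap resampling.

    Hypothetical helper for illustration; not part of the workshop repo.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values from the sample *with replacement* and record the mean
        resample = [rng.choice(sample) for _ in range(n)]
        means.append(sum(resample) / float(n))
    # The spread (standard deviation) of the resampled means is the estimate
    grand_mean = sum(means) / float(len(means))
    variance = sum((m - grand_mean) ** 2 for m in means) / (len(means) - 1)
    return variance ** 0.5


# Made-up measurements, purely for demonstration
heights_cm = [160.0, 165.5, 170.2, 158.3, 172.8, 168.1, 175.4, 162.9]
se = bootstrap_standard_error(heights_cm)
print("Bootstrap standard error of the mean: %.2f cm" % se)
```

The point of the sketch is that a loop and a random number generator are enough to see how much a sample mean would wobble from sample to sample, with no formula to memorize.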

Heavily inspired by Allen Downey's books Think Stats and Think Bayes, and also his PyCon US workshop(s), we try to demystify some of those concepts using real-life examples. Some key concepts that we plan to cover are:

We will do the data analysis using Pandas along with NumPy and SciPy, and the plotting using matplotlib/seaborn.

We will use the IPython Notebook to drive the workshop. The contents of the workshop are available in the repo: https://github.com/rouseguy/intro2stats . It is currently a work in progress. All the code, data and presentations will be available in this repository prior to the workshop.

Links to get started with all of them are given below in the Content URLs section.

Software Requirements-Must have

Python 2.7

git

Software Requirements-Recommended

We will clone a git repo and work off it. The link to it will be posted closer to the workshop date. There will be a requirements file that can be used to install all the necessary libraries. For the sake of completeness, we will need the latest versions of the following libraries:

NumPy

Pandas

SciPy

Matplotlib

Seaborn

IPython (along with IPython notebook)

Software-Optional

Attendees who are comfortable with it can install and use Anaconda. If you use Anaconda, please verify before the start of the workshop that all the requisite libraries are installed. Disclosure: I use Anaconda.
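One way to check the installation, whether or not you use Anaconda, is to try importing each library. The sketch below is an assumption about how you might do that, not an official script from the workshop repo: the helper `find_missing` and the list `REQUIRED_MODULES` are names invented here, and the entries are the import names of the libraries listed above.

```python
import importlib

# Import names of the libraries listed in the requirements above
REQUIRED_MODULES = ["numpy", "pandas", "scipy", "matplotlib", "seaborn", "IPython"]


def find_missing(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is absent (or its own imports are broken)
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing libraries: " + ", ".join(missing))
else:
    print("All required libraries are importable.")
```

Run it with the same Python interpreter you plan to use for the notebooks. A clean result only shows that the libraries import; it does not check their versions.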