Description

Advances in technology have allowed us to collect massive amounts of data. A data scientist is a person who has the skills, knowledge, and ability to extract actionable knowledge from that data, whether for the good of society, the advancement of science, or profit in business. This course covers the topics needed to solve data-science problems: data preparation (collection & integration), data characterization & presentation, data analysis (experimentation & observational studies), and data products.

Prerequisites

The class requires an ability to deal with abstract mathematical concepts such as those covered in 01:198:112, 01:198:205, and 01:198:206. You need an introductory-level background in algorithms, probability, and linear algebra. You also need to know programming for data manipulation and analysis (e.g., Python, MATLAB, or R) and Web programming (e.g., HTML, CSS, or JavaScript). The specific programming language is mostly your choice.

Grading Policies

Class project (45%), where you solve a data-science problem from data preparation to data product