More information

This course introduces a workflow for working efficiently with large amounts of data in R, using the Human Mortality Database (HMD) and the Human Fertility Database (HFD) as an extended case study. It shows how the R packages plyr and purrr can be used to automate and speed up every stage of the quantitative social science workflow, from loading and tidying data from multiple sources to producing dozens of separate analyses and data visualisations with a single chunk of code.

Alongside the case study, the course introduces related packages, processes and patterns for working efficiently with large-scale and complex data. These include stringr, tidyr and dplyr for data management, and ‘piped coding’ approaches that make R code more ‘literate’: easier to write, understand and reason about.

If you use the HMD and HFD, the code presented will likely be useful right away for your work. Even if you do not, the general patterns, concepts and methods introduced through the case study will help you think about how to manage large amounts of data and automate your own data workflows.

Price: £195 (Full fee)

Concessions: £140 for those from educational and charitable institutions