This course will provide a detailed critique of the methods and philosophy of the Null Hypothesis Significance Testing (NHST) approach to statistics, which is currently dominant in social and biomedical science. We will briefly contrast NHST with alternatives, especially with Bayesian methods. We will use some computer code (Matlab and R) to demonstrate some issues. However, we will focus on the big picture rather than on the implementation of specific procedures.

This module follows on from Foundations in Applied Statistics, and will teach you the basics of common bivariate techniques (that is, techniques that examine the associations between two variables). The module is divided between lectures, in which you'll learn the relevant theory, and hands-on practical sessions, in which you will learn how to apply these techniques to the analysis of real data.

Techniques to be covered include:

Cross-tabulations

Scatterplots

Covariance and correlation

Nonparametric methods

Two-sample t-tests

ANOVA

Ordinary Least Squares (OLS)
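Two of the techniques listed above, covariance and correlation, can be illustrated with a few lines of code. The sketch below (in Python, purely for illustration; the data are hypothetical and not course material) computes both "by hand":

```python
# Sample covariance and Pearson correlation computed from first
# principles for two small hypothetical samples.
def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    # Sample covariance uses the n - 1 denominator.
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    # Pearson's r is the covariance scaled by both standard deviations.
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5

hours = [1, 2, 3, 4, 5]        # hypothetical study hours
score = [52, 55, 61, 64, 68]   # hypothetical test scores
print(round(covariance(hours, score), 2))   # 10.25
print(round(correlation(hours, score), 3))  # 0.994
```

The correlation is just the covariance rescaled to lie between -1 and 1, which is why the two are taught together.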

For best results, students should expect to do a few hours of private study and spend a little extra time in the computer labs, in addition to coming to class.

The challenge of causal inference is ubiquitous in social science. Nearly every research project is fundamentally about causes and effects. This introductory session will:

(i) set out some basic barriers to causal inference in the social sciences and why this matters; (ii) describe the counterfactual framework that underpins much of the discussion of causal inference; (iii) talk through the intuition of several research designs that can help researchers make stronger claims for causal relationships.

The emphasis is on setting out applications of each approach, along with pros and cons, so that participants understand when a particular design may be more or less suitable to a research problem.

The module will introduce students to the study of language use as a distinctive type of social practice. Attention will be focused primarily on the methodological and analytic principles of conversation analysis (CA), and on the roots of CA in the research initiatives of ethnomethodology and in the analysis of ordinary and institutional talk. Finally, it will explore the debates between CA and Critical Discourse Analysis (CDA), and the interface between the two, as a means of addressing the relationship between the study of language use and the study of other aspects of social life.

The internet is a great resource for humanities and social science data, but much of that information is unstructured. In this course we will explore how to programmatically access information stored online, typically in HTML, to create neat, tabulated data ready for analysis. The course is made up of four tutorials, designed to build the tools needed to collect different types of data effectively. The uses of web scraping are diverse: in this course we will use the programming language R first to access data directly from newspaper websites, and then to access live data streams using APIs (YouTube, Facebook, Google Maps, Wikipedia). Collectively these sessions will give students the skillset necessary to use web scraping in their own research. Slides from last year's sessions may be consulted here: http://fredheir.github.io/WebScraping
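The core idea of web scraping, turning messy HTML into tabulated data, can be sketched in a few lines. The course itself uses R; the illustration below uses Python's standard-library HTML parser on a hypothetical inline snippet rather than a live page:

```python
# Minimal sketch of scraping: extract the rows of an HTML table into a
# list of lists, ready for tabular analysis. The HTML is hypothetical.
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

html = """
<table>
  <tr><th>headline</th><th>date</th></tr>
  <tr><td>Election called</td><td>2015-03-30</td></tr>
  <tr><td>Results announced</td><td>2015-05-08</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(html)
print(scraper.rows)
```

In practice the HTML would be fetched from a URL (or an API would return structured JSON directly), but the parse-then-tabulate step is the same.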

This module will introduce you to the theory and practice of multivariate analysis, covering Ordinary Least Squares (OLS) and logistic regressions. You will learn how to read published results critically, to do simple multivariate modelling yourself, and to interpret and write about your results intelligently.

Half of the module is based in the lecture theatre, and covers the theory behind multivariate regression; the other half is lab-based, in which students will work through practical exercises using statistical software.

To get the most out of the course, you should also expect to spend some time between sessions having fun by building your own statistical models.
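Numerically, fitting a multivariate OLS model amounts to solving the normal equations (X'X)b = X'y. The sketch below (in Python, purely for illustration; the module itself uses statistical software) uses hypothetical data constructed so that y = 2 + 3·x1 + 0.5·x2 exactly, which makes the recovered coefficients easy to check:

```python
# Fit an OLS regression with two predictors by solving the normal
# equations (X'X) b = X'y with a small Gaussian-elimination routine.

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Design matrix: an intercept column plus two hypothetical predictors.
X = [[1, x1, x2] for x1, x2 in [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]]
y = [2 + 3 * x1 + 0.5 * x2 for _, x1, x2 in X]

XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
beta = solve(XtX, Xty)
print([round(b, 3) for b in beta])  # intercept, b1, b2
```

Statistical packages do exactly this (with more numerical care, and with standard errors); the point of the sketch is that "multivariate" just means several predictor columns in X.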

This is an introductory course for students whose research involves collecting, storing or analysing data using networked digital devices. Unless your research data is only collected using pen and paper or tape recorders and is written up on a manual typewriter, this course will be relevant to you. If you are planning to collect data online through either public or private communications, or you intend to share or publish data collected by other means, it will be essential.

This module is an introduction to ethnographic fieldwork and analysis and is intended for students in fields other than anthropology. It provides an introduction to contemporary debates in ethnography, and an outline of how selected methods may be used in ethnographic study.

The ethnographic method was originally developed in the field of social anthropology, but has grown in popularity across several disciplines, including sociology, geography, criminology, education and organization studies.

Ethnographic research is a largely qualitative method, based upon participant observation among small samples of people for extended periods. A community of research participants might be defined on the basis of ethnicity, geography, language, social class, or on the basis of membership of a group or organization. An ethnographer aims to engage closely with the culture and experiences of their research participants, to produce a holistic analysis of their fieldsite.

Session 1: The Ethnographic Method
What is ethnography? Can ethnographic research and writing be objective? How does one conduct ethnographic research responsibly and ethically?

Session 2: Ethnographies in Confinement
The practice of ethnography varies greatly depending on its setting. This session will consider the value, practice, epistemology and ethics of ethnographic research conducted in organisations, particularly those, such as prisons and psychiatric institutions, which confine people. How can we ensure access, and what are the political and ethical ramifications of doing so? How can we ethically conduct research in an institution in which people are held against their will? What are the epistemological issues when ‘free’ researchers conduct research in spaces of confinement?

Session 3: Ethnographies of Freedom
Building on the previous week's session, this session will consider how the practice of ethnography differs when it is conducted in more permeable institutions. There are many advantages to conducting research where the setting is less bounded – access is less complex – but other issues are raised: consent can be harder to gauge. What is the role of the ethnographer in something that looks like everyday life? What does it mean to leave the field? What is the difference between ‘research’ and ‘friendship’? And what actually is the site of study?

Session 4: Photography and Audio Recording in Ethnographic Work
What kinds of audiovisual equipment, and practices of photography and sound recording, can be used to support an ethnographer’s research process? What kinds of epistemological, theoretical, social, and ethical considerations tend to arise around the possible use of these technologies in anthropological fieldwork and analysis?

This course aims to provide students with a range of specific technical skills that will enable them to undertake impact evaluation of policy. Too often policy is implemented but not fully evaluated. Without evaluation we cannot tell what the short- or longer-term impact of a particular policy has been. On this course, students will learn the skills needed to evaluate particular policies and will have the opportunity to do some hands-on data manipulation. A particular feature of this course is that it provides these skills in a real-world context of policy evaluation. It also focuses primarily not on experimental evaluation (randomised controlled trials) but rather on quasi-experimental methodologies that can be used where an experiment is not desirable or feasible.

This course will introduce students to the approach called "Exploratory Data Analysis" (EDA) where the aim is to extract useful information from data, with an enquiring, open and sceptical mind. It is, in many ways, an antidote to many advanced modelling approaches, where researchers lose touch with the richness of their data. Seeing interesting patterns in the data is the goal of EDA, rather than testing for statistical significance. The course will also consider the recent critiques of conventional "significance testing" approaches that have led some journals to ban significance tests.

Students who take this course will hopefully get more out of their data and achieve a more balanced overview of data analysis in the social sciences.

To understand that the emphasis on statistical significance testing has obscured the goals of analysing data for many social scientists.

To discuss other ways in which the significance testing paradigm has perverted scientific research, such as through the replication crisis and fraud.

This module introduces the statistical techniques of Exploratory and Confirmatory Factor Analyses. Exploratory Factor Analysis (EFA) is used to uncover the latent structure (dimensions) of a set of variables. It reduces the attribute space from a larger number of variables to a smaller number of factors. Confirmatory Factor Analysis (CFA) examines whether collected data correspond to a model of what the data are meant to measure. Stata will be introduced as a powerful tool for conducting confirmatory factor analysis, and a brief introduction will be given to structural equation modelling.

This module is an extension of the three previous modules in the Basic Statistics stream, and introduces more complex and nuanced aspects of the theory and practice of multivariate analysis. Students will learn the theory behind the methods covered, how to implement them in practice, how to interpret their results, and how to write intelligently about their findings. Half of the module is based in the lecture theatre; the other half is lab-based, in which students will work through practical exercises using the statistical software Stata.

Topics covered include:

Interaction effects in regression models: how to estimate these and how to interpret them

This module is shared with Geography. Students from the Department of Geography MUST book places on this course via the Department; any bookings made by Geography students via the SSRMC portal will be cancelled.

This workshop series aims to provide introductory training on Geographical Information Systems. Material covered includes the construction of geodatabases from a range of data sources, geovisualisation and mapping from geodatasets, raster-based modelling, and the presentation of maps, charts and other geodata outputs. Each session will start with an introductory lecture followed by practical exercises using GIS software.

The course will provide students with an introduction to the popular and powerful statistics package Stata. Stata is commonly used by analysts in both the social and natural sciences, and is the statistics package used most widely by the SSRMC. You will learn:

How to open and manage a dataset in Stata

How to recode variables

How to select a sample for analysis

The commands needed to perform simple statistical analyses in Stata

Where to find additional resources to help you as you progress with Stata

The course is intended for students who already have a working knowledge of statistics - it's designed primarily as a "second language" course for students who are already familiar with another package, perhaps R or SPSS. Students who don't already have a working knowledge of applied statistics should look at courses in our Basic Statistics Stream.

In this module students will be introduced to meta-analysis, a powerful statistical technique allowing researchers to synthesize the available evidence for a given research question using standardized (comparable) effect sizes across studies. The sessions teach students how to compute treatment effects, how to compute effect sizes based on correlational studies, and how to address questions such as: what is the association of bullying victimization with depression? The module will be useful for students who seek to draw statistical conclusions in a standardized manner from literature reviews they are conducting.
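The kind of standardized effect size that makes studies comparable can be sketched briefly. The example below (Python for illustration; the group summaries are hypothetical) computes Cohen's d, the mean difference between two groups scaled by the pooled standard deviation:

```python
# Cohen's d: a standardized mean difference, computed from per-group
# summary statistics of the kind reported in published studies.
def cohens_d(m1, sd1, n1, m2, sd2, n2):
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / pooled_var ** 0.5

# Hypothetical treatment group (mean 24, sd 5, n 30) versus
# control group (mean 20, sd 5, n 30).
d = cohens_d(24, 5, 30, 20, 5, 30)
print(round(d, 2))  # 0.8
```

Because d is unit-free, effects measured on different scales in different studies can be pooled on a common footing, which is what makes meta-analysis possible.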

In this module, students will be introduced to multilevel modelling (MLM), also known as hierarchical linear modelling. MLM allows the user to analyse how outcomes are influenced by factors acting at multiple levels. So, for example, we might conceptualise children's educational process as being influenced by individual or family-level factors, as well as by factors operating at the level of the school or the neighbourhood. Similarly, outcomes for prisoners might be influenced by individual and/or family-level characteristics, as well as by the characteristics of the prison in which they are detained.

This module provides an applied introduction to panel data analysis (PDA). Panel data are gathered by taking repeated observations from a series of research units (e.g. individuals, firms) as they move through time. This course focuses primarily on panel data with a large number of research units tracked for a relatively small number of time points.

The module begins by introducing key concepts, benefits and pitfalls of PDA. Students are then taught how to manipulate and describe panel data in Stata. The latter part of the module introduces random and fixed effects panel models for continuous and dichotomous outcomes. The course is taught through a mixture of lectures and practical sessions designed to give students hands-on experience of working with real-world data from the British Household Panel Survey.
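The intuition behind the fixed-effects ("within") estimator mentioned above can be sketched in a few lines. The example below (Python for illustration; the two-unit, three-wave panel is hypothetical, and in the module itself the estimation is done in Stata) demeans each unit's observations so that stable unit-level characteristics drop out:

```python
# Fixed-effects intuition: subtract each unit's own mean from its x and
# y values, then relate the demeaned variables. Anything constant within
# a unit (its "fixed effect") cancels out of the demeaned data.
panel = {
    "unit_a": [(1, 10.0), (2, 12.0), (3, 14.0)],   # (x, y) pairs over time
    "unit_b": [(4, 30.0), (5, 32.0), (6, 34.0)],
}

xs, ys = [], []
for obs in panel.values():
    mx = sum(x for x, _ in obs) / len(obs)
    my = sum(y for _, y in obs) / len(obs)
    for x, y in obs:
        xs.append(x - mx)   # deviation from the unit's own mean
        ys.append(y - my)

# Slope of demeaned y on demeaned x (no intercept needed after demeaning).
beta = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(beta)  # 2.0
```

Note that pooled OLS on the raw data would give a much steeper slope here, driven by the large level difference between the two units; the within estimator uses only each unit's change over time.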

This two-hour short course will introduce students to the concept of power analysis (also known as power calculations), type I and II errors, and how to do power analysis for t-tests, correlations and analysis of variance. Students should not expect to learn complex power analysis for structural equation modelling or multilevel modelling (the SSRMC offers individual courses on both) in this introductory course (Stata currently does not have commands for these analyses). This course aims to provide an easy and intuitive rationale behind the technique, as well as hands-on practice in how to perform power analysis in Stata.

Power analysis is an important skill for anyone doing statistical research; it is particularly useful when writing a grant proposal, and is sometimes required by funders. It involves calculating the number of observations required to undertake a given statistical analysis. If a sample is too small, significant associations may not be detectable, even though they may be present in the population from which the sample is drawn. Power analysis is useful when:

You plan to collect data for research, and want to calculate how many subjects are needed

You need to plan how much time and/or money to allow for a research project

You face budget constraints in your research, and need to establish whether the research is feasible
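The arithmetic behind a simple power calculation can be sketched briefly. The example below (Python for illustration; the course itself uses Stata's power commands) applies the standard normal approximation for comparing two group means, where the required n per group is roughly 2(z_alpha/2 + z_beta)^2 / d^2 for a standardized effect size d:

```python
# Approximate sample size per group for a two-sample comparison of
# means, using the normal approximation to the t-test.
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for two-sided alpha = .05
    z_b = NormalDist().inv_cdf(power)          # ~0.84 for power = .80
    return math.ceil(2 * (z_a + z_b) ** 2 / d ** 2)

# Detecting a medium standardized effect (d = 0.5) at the usual
# thresholds requires roughly 63 subjects per group under this
# approximation (exact t-based calculations give slightly more).
print(n_per_group(0.5))
```

The formula makes the key trade-offs visible: halving the effect size you want to detect quadruples the required sample, and demanding higher power pushes n up as well.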

Propensity score matching (PSM) is a technique that simulates an experimental study in an observational data set in order to estimate a causal effect.
In an experimental study, subjects are randomly allocated to “treatment” and “control” groups; if the randomisation is done correctly, there should be no differences in the background characteristics of the treated and non-treated groups, so any differences in the outcome between the two groups may be attributed to a causal effect of the treatment.
An observational survey, by contrast, will contain some people who have been subject to the “treatment” and some people who have not, but they will not have been randomly allocated to those groups. The characteristics of people in the treatment and control groups may differ, so differences in the outcome cannot be attributed to the treatment.
PSM attempts to mimic the experimental trial by creating two groups from the sample whose background characteristics are virtually identical. People in the treatment group are “matched” with similar people in the control group. The difference between the treatment and control groups in this case may therefore more plausibly be attributed to the treatment itself.
PSM is widely applied in many disciplines, including sociology, criminology, economics, politics, and epidemiology.
The module covers the basic theory of PSM, the steps in the implementation (e.g. variable choice for matching and types of matching algorithms), and assessment of matching quality. We will also work through practical exercises using Stata, in which students will learn how to apply the technique to the analysis of real data and how to interpret the results.
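The matching step at the heart of PSM can be sketched in miniature. In the toy example below (Python for illustration; in practice the scores come from a logistic regression and the matching is done in Stata, and all numbers here are hypothetical), each treated subject is paired with the control subject whose already-estimated propensity score is closest:

```python
# Toy nearest-neighbour matching (with replacement) on precomputed
# propensity scores, followed by the average treatment effect on the
# treated (ATT) as the mean outcome difference across matched pairs.
treated = [(0.62, 14.0), (0.35, 11.0), (0.80, 16.0)]            # (score, outcome)
controls = [(0.60, 12.5), (0.30, 10.0), (0.78, 13.5), (0.50, 11.0)]

diffs = []
for score, outcome in treated:
    # Find the control whose propensity score is nearest this subject's.
    match_score, match_outcome = min(controls, key=lambda c: abs(c[0] - score))
    diffs.append(outcome - match_outcome)

att = sum(diffs) / len(diffs)
print(round(att, 2))  # 1.67
```

Real implementations add many refinements covered in the module, such as calipers on the allowed score distance, alternative matching algorithms, and checks on matching quality, but the pair-and-compare logic is the same.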

The analysis of policy depends on many disciplines and techniques and so is difficult for many researchers to access. This module provides a mixed perspective on policy analysis, taking both an academic and a practitioner perspective. This is because the same tools and techniques can be used in academic research on policy options and change as those used in practice in a policy environment. The course is provided as three two-hour sessions delivered as a mix of lectures and seminars. No direct analysis work will be done in the sessions themselves, but some sample data and questions will be provided for students who wish to take the material into practice.

Standard statistical techniques in the social sciences are good at uncovering relationships between variables, but less good at establishing whether these relationships are causal. If A and B are correlated, does that mean A "causes" B? That B "causes" A? Or could both A and B be driven by a third factor C?

Randomised controlled trials are a type of study often considered to be the gold standard in uncovering this kind of causality. Many students and early-career researchers avoid RCTs, assuming they are complex and expensive to run. However, that need not be the case. This module will explain the theory of RCTs, how they are implemented, and will encourage participants to think about how they might design an RCT in their own field of work.

Ethics is becoming an increasingly important issue for all researchers and the aim of this session is to demonstrate the practical value of thinking seriously and systematically about what constitutes ethical conduct in social science research. The session will involve some small-group work.

Using secondary data (that is, data collected by someone else, usually a government agency or large research organisation) has a number of advantages in social science research: sample sizes are usually larger than can be achieved by primary data collection, samples are more nearly representative of the populations they are drawn from, and using secondary data for a research project often represents significant savings in time and money. This short course, taught by Dr Deborah Wiltshire of the UK Data Archive, will discuss the advantages and limitations of using secondary data for research in the social sciences, and will introduce students to the wide range of available secondary data sources. The course is based in a computer lab; students will learn how to search online for suitable secondary data by browsing the database of the UK Data Archive.

This introductory course is for graduate students who have no prior training in social network analysis (SNA). In the morning, we provide an overview of SNA concepts and analyse key articles in the literature. In the afternoon, students learn to handle relational databases and code for SNA research using R.

This intensive one-day course on structural equation modelling will provide an introduction to SEM using the statistical software Stata. The aim of the course is to introduce structural equation modelling as an analytical framework and to familiarize participants with the applications of the technique in the social sciences.

The application of the structural equation modelling framework to a variety of social science research questions will be illustrated through examples of published papers. The examples used are drawn from very recent papers, as well as publications from the early days of the technique; some use path analysis with cross-national data, others confirmatory factor analysis, and others still full structural models, to test particular hypotheses. Some example papers may be found below, though they should not be treated as the gold standard, but rather as an illustration of the variety of approaches and reporting techniques within SEM.

Students will engage in a critique of such examples, with the aim of gaining a better understanding of the SEM framework, as well as its application to real-life data. To further facilitate this application focus, the theoretical introduction will be accompanied by practical examples based on real, publicly-available data.