Many studies in education, human development, sociology, public health, and allied fields are longitudinal, multilevel, or both. Longitudinal studies observe participants repeatedly, allowing researchers to assess growth in academic achievement or change in mental health status. Multilevel data arise because participants are clustered within social settings such as classrooms, schools, and neighborhoods. These settings often form a strict hierarchy, as when classrooms are nested within schools, which are in turn nested within districts. Alternatively, they may form a cross-classified structure, as when schools draw students from multiple neighborhoods and neighborhoods send students to multiple schools. The nested versus cross-classified organization of these settings creates the need for different analytic approaches.

Studies that are both longitudinal and multilevel include research on school effects on student academic growth and on neighborhood and family effects on changes in mental health. In some cases participants migrate across social settings over time: children may experience a sequence of classrooms during a school year, and families may move to a new neighborhood. In other cases participants remain in place, but the character of a school or neighborhood changes. This short course will consider the issues of analysis, and to a limited extent design, that arise in longitudinal and multilevel research settings.

The starting point for our study will be the axiom that a statistical model represents a tentative conceptual model about the sources of variation in an outcome. The model is based on assumptions that must be made explicit and, when possible, verified. The model should reflect the measurement scale of key explanatory and outcome variables. The model guides not only the summary of quantitative evidence, but also the design of future research.

This short course in hierarchical linear models (HLM) will begin by considering two-level studies in which persons (level-1 units) are nested within organizations (level-2 units) such as schools. We will then consider two-level studies of individual change. We will view time-series data (level-1) as nested within persons (level-2). The level-1 model specifies how an individual is changing over time as a function of person-specific "micro-parameters." The level-2 model describes the population distribution of the micro-parameters of individual change as a function of macro-parameters.
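The two-level structure of individual change can be illustrated with a small simulation. The sketch below is purely illustrative: all parameter names and values (mean intercept, mean growth rate, variance components) are hypothetical, not taken from any course dataset. Each person's micro-parameters (intercept and slope) are drawn from the level-2 population distribution, and that person's level-1 time series is then generated around their own growth line.

```python
import random

random.seed(1)

# Level-2 macro-parameters (hypothetical values): population mean
# intercept and growth rate, plus between-person SDs of each.
beta_intercept, beta_slope = 50.0, 4.0
tau_intercept, tau_slope = 8.0, 1.5
sigma_e = 5.0  # level-1 residual SD around each person's growth line

def simulate_person(n_waves=4):
    """Draw one person's micro-parameters from the level-2 model,
    then generate that person's level-1 time series."""
    pi0 = random.gauss(beta_intercept, tau_intercept)  # person's intercept
    pi1 = random.gauss(beta_slope, tau_slope)          # person's growth rate
    return [pi0 + pi1 * t + random.gauss(0, sigma_e) for t in range(n_waves)]

# Each row is one person's repeated measures: level-1 nested in level-2.
sample = [simulate_person() for _ in range(200)]
mean_wave0 = sum(row[0] for row in sample) / len(sample)
mean_wave3 = sum(row[3] for row in sample) / len(sample)
print(round(mean_wave3 - mean_wave0, 1))  # should be near 3 * beta_slope = 12
```

Fitting an actual HLM reverses this process: the software estimates the macro-parameters (and variance components) from the observed time series rather than generating data from known values.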

The next phase will examine three-level models. Our initial focus will be the case in which repeated measures (level-1) are nested within individuals (level-2), who are themselves nested in organizations (level-3). Not all multilevel data involve a pure nesting. In many important cases, observations are cross-classified by two higher levels of random variation: (a) individuals may be nested in "cells" defined by the cross-classification of schools and neighborhoods, and (b) time-series observations may be cross-classified by the child and the classroom when repeated measures are collected on children who change classrooms during the elementary years. We will explore these cases and situations that involve both nesting and crossing of random factors.
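The distinction between nesting and crossing can be made concrete with a short sketch. In the hypothetical population below (all sizes and effect magnitudes are invented for illustration), each student belongs to one school and one neighborhood, but neither factor is contained within the other, so the two random factors are crossed rather than nested:

```python
import random

random.seed(3)

# Hypothetical cross-classified structure: school membership does not
# determine neighborhood membership, or vice versa.
n_schools, n_hoods = 10, 8
school_eff = [random.gauss(0, 3) for _ in range(n_schools)]  # school random effects
hood_eff = [random.gauss(0, 2) for _ in range(n_hoods)]      # neighborhood random effects

students = []
for _ in range(1000):
    s = random.randrange(n_schools)   # school and neighborhood are
    h = random.randrange(n_hoods)     # assigned independently here
    y = 50 + school_eff[s] + hood_eff[h] + random.gauss(0, 5)
    students.append((s, h, y))

# Evidence of cross-classification: each school draws students
# from several different neighborhoods.
hoods_per_school = {s: set() for s in range(n_schools)}
for s, h, _ in students:
    hoods_per_school[s].add(h)
print(min(len(v) for v in hoods_per_school.values()))
```

Under pure nesting, each school would map to exactly one neighborhood; here every school draws from many, which is why a cross-classified random-effects model, rather than a strictly hierarchical one, is required.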

All of the studies considered to this point involve nearly continuous outcomes, for which the normal distribution is at least plausible. The next step will be to generalize two- and three-level models to other types of outcomes: binary outcomes, counts, ordered outcomes, and multinomial data. All of these cases fall into the framework of the hierarchical generalized linear model.

Within this short course we will analyze statistical issues that cut across applications, including: (1) efficiency and robustness of inferences, (2) Bayes and empirical Bayes shrinkage estimation of random effects, (3) exploratory analyses and model checking, (4) univariate and multivariate hypothesis tests and confidence sets, and (5) optimal research design. The course will conclude by addressing methods to estimate hierarchical linear models from incomplete data. Software for the efficient analysis of two-level models in the presence of missing data will be demonstrated.
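The logic of empirical Bayes shrinkage, item (2) above, can be shown in a few lines. In this minimal sketch the variance components are treated as known, hypothetical values; in practice they are estimated from the data (e.g., by maximum likelihood or REML). Each group's sample mean is pulled toward the grand mean, with less shrinkage for groups that supply more data:

```python
# Empirical Bayes shrinkage of group means in a two-level model.
# All variance components below are hypothetical, assumed-known values.
tau2 = 25.0      # between-group variance
sigma2 = 100.0   # within-group variance
grand_mean = 50.0

def eb_estimate(group_mean, n):
    """Shrink a group's sample mean toward the grand mean.
    The reliability weight lam grows with group size n."""
    lam = tau2 / (tau2 + sigma2 / n)
    return lam * group_mean + (1 - lam) * grand_mean

# A small group is shrunk more than a large group with the same raw mean.
print(eb_estimate(60.0, 5))   # lam = 25/45 ≈ 0.56, estimate ≈ 55.6
print(eb_estimate(60.0, 50))  # lam = 25/27 ≈ 0.93, estimate ≈ 59.3
```

The weight is the reliability of the group mean as an estimate of the group's true effect, which is why shrinkage estimators tend to outperform raw group means when groups are small.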

Value-added models of student achievement data are now key components of educational research and policy. These methods provide the student outcomes-based performance measures of schools and teachers that are increasingly the basis for consequential decisions about educators. Value-added (VA) models have also been utilized to study teachers and the policies and programs that influence their practice. The evolving implementation of VA models has been documented in many substantive and methodological papers in the educational literature. Several recent, large-scale and high-profile research projects have tested the properties of teacher effects estimated with VA models. Debates about the use of these estimates to evaluate teachers are widespread in scholarly journals, editorials, and blogs, and have even appeared as front-page stories in daily newspapers.

The term “value-added models” refers to a large class of statistical models for longitudinal student achievement data. Common features of these models include: (1) the capacity to make inferences about the effects on student outcomes of the clustering units for students, such as schools or teachers, and (2) the use of students’ prior achievement to account for differences among the students assigned to those clustering units.
In this short course we will provide an introduction to the primary variants of VA models. We will discuss the potential for errors in inferences about individual teachers or schools that are made from value-added models. We will discuss the sources for these errors, the empirical evidence concerning their existence, and methods that may be used to reduce them. While our focus will be on the evaluation of teachers, most methods will also have application to schools. The operational issues that arise when using administrative data to fit VA models will also be examined.

Over the course of the three days, we will cover:

The basic data requirements for value-added modeling

Issues of causal effects and structural models in the context of making inferences about individual teachers or programs and policies

VA models from the statistical tradition of mixed effects or hierarchical linear models

We will review findings from the empirical research on value-added modeling on the stability, reliability, persistence, and confounding of VA estimates. Challenges in implementing the models will be explored, as well as the potential for using value-added for teacher accountability and improving educational outcomes.

A key goal of the course is to expose participants to the computational methods used to fit value-added models from the statistical tradition of mixed-effects models. Participants will gain hands-on experience fitting models using maximum likelihood methods available in R, SAS, and Stata, and implementing Bayesian versions of the models using the BUGS language. We will also explore using fixed-effects linear regression models to make inferences about individual teachers. Since measurement error in students’ prior achievement test scores creates a potential for bias in value-added models fit by linear regression, we will introduce methods to account for such measurement error when student test scores are used as covariates. The methods we will present can correct for measurement error in both linear and nonlinear models, and take full advantage of recent developments in modeling achievement test scores.
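The attenuation bias that motivates these corrections can be demonstrated with a simple simulation. The sketch below is not one of the course's methods; it illustrates only the classical reliability-based (disattenuation) correction for a single error-prone covariate, with all parameter values chosen hypothetically. Noise in the observed prior score shrinks the naive regression slope toward zero by roughly the reliability factor, and dividing by that factor recovers the slope on the latent score:

```python
import random

random.seed(7)

n = 20000
beta = 1.0                     # true slope on the latent prior score
var_true, var_err = 100.0, 25.0
reliability = var_true / (var_true + var_err)  # 0.8 in this example

x_true = [random.gauss(0, var_true ** 0.5) for _ in range(n)]
x_obs = [x + random.gauss(0, var_err ** 0.5) for x in x_true]  # error-prone score
y = [beta * x + random.gauss(0, 5) for x in x_true]

def ols_slope(x, y):
    """Simple-regression slope by ordinary least squares."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

naive = ols_slope(x_obs, y)      # attenuated toward 0 by factor ≈ reliability
corrected = naive / reliability  # disattenuated estimate
print(round(naive, 2), round(corrected, 2))
```

With multiple covariates or nonlinear models the required corrections are more involved than this single-covariate formula, which is part of what the course sessions address.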

When we conclude our 3-day intensive short course, participants will be:

Familiar with the commonly used methods for value-added modeling

Aware of potential pitfalls in those methods

Informed about the policy debates on using value-added models for teacher evaluations, and

Exposed to software for fitting cross-classified mixed-effects models and fixed-effects linear models with error-prone covariates such as prior achievement scores.

These methods are relevant not only to VA estimation for teachers and schools, but also more generally to estimating the effects of policies, programs, and other educational interventions.