Welcome to the Institute for Digital Research and Education

Stata Learning Module
Reshaping data wide to long

This module illustrates the power (and simplicity) of Stata's ability to reshape data files. The examples take
wide data files and reshape them into long form. They
show common cases of reshaping data, but do not exhaustively demonstrate the
different kinds of data reshaping that you could encounter.

Example #1: Reshaping data wide to long

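Consider a family income file with one observation per family, identified by famid, and a separate income variable for each year: faminc96, faminc97, and faminc98. This is called a wide format since the years of data are spread across variables. We may want the data to be
long, where each year of data is in a separate observation. The reshape command can accomplish this: reshape long faminc, i(famid) j(year)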
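
As a minimal sketch of this first example, the code below types in a small wide file and reshapes it. The income values are made up for illustration and are not taken from the module's data file; only the variable names famid and faminc96 through faminc98 follow the text.

```stata
* Illustrative wide data: one row per family, one income variable per year.
* The values are invented for this sketch.
clear
input famid faminc96 faminc97 faminc98
1 40000 40500 41000
2 45000 45400 45800
3 75000 76000 77000
end
list

* Reshape wide to long: faminc is the stem, famid identifies each wide row,
* and the suffixes 96, 97, 98 are stored in a new variable called year.
reshape long faminc, i(famid) j(year)
list
```

After the reshape, each family contributes three observations, one per year. Typing reshape wide afterwards should convert the data back to its wide layout.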

long tells reshape that we want to go from wide to long.
faminc tells Stata that the stem of the variable to be converted from wide to long is faminc.
i(famid) tells reshape that famid is the unique identifier for records in their wide format.
j(year) tells reshape that the suffix of faminc (i.e., 96, 97, 98) should be placed in a variable called year.

Example #2: Reshaping data wide to long

Consider the file containing the kids and their heights at 1 year of age (ht1) and at 2 years of age (ht2).

Let's reshape this data into a long format. The critical questions are:

Q: What is the stem of the variable going from wide to long?
A: The stem is ht.

Q: What variable uniquely identifies an observation when it is in the wide form?
A: famid and birth together uniquely identify the wide observations.

Q: What do we want to call the variable which contains the suffix of ht, i.e., 1 and 2?
A: Let's call the suffix age.

With the answers to these
questions, the reshape command is reshape long ht, i(famid birth) j(age).
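
The answers above can be tried out with a small sketch. The heights below are invented for illustration and the file is typed in with input rather than loaded from the module's data; isid is used here as one possible way to confirm that famid and birth really do identify the wide observations before reshaping.

```stata
* Illustrative wide data: one row per child, heights at age 1 and age 2.
* The values are invented for this sketch.
clear
input famid birth ht1 ht2
1 1 2.8 3.4
1 2 2.9 3.8
2 1 2.0 3.2
2 2 1.8 2.8
end

* isid exits with an error unless famid and birth together
* uniquely identify the observations.
isid famid birth

* ht is the stem, famid and birth identify each wide row,
* and the suffixes 1 and 2 are stored in a new variable called age.
reshape long ht, i(famid birth) j(age)
list, sepby(famid birth)
```

Each child now has two observations, one for age 1 and one for age 2.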

We would like to reshape name and inc into long format, but their suffixes are characters (d and m) instead of numbers. Stata can handle that as long as you add the
string option to the reshape command to indicate that the suffix is a character.
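
A minimal sketch of a reshape with character suffixes follows. Several details are assumptions made for illustration: the data values are invented, famid is assumed to be the identifier of the wide observations, and dadmom is a hypothetical name chosen for the variable that will hold the suffix.

```stata
* Illustrative wide data: name and income variables with the suffixes d and m.
* The values are invented for this sketch.
clear
input famid str4 named incd str4 namem incm
1 "Bill" 30000 "Bess" 15000
2 "Art"  22000 "Amy"  18000
3 "Paul" 25000 "Pat"  50000
end

* Two stems (name and inc) are reshaped at once. The string option tells
* reshape that the suffixes are characters, so dadmom (a name assumed here)
* is created as a string variable holding "d" or "m".
reshape long name inc, i(famid) j(dadmom) string
list, sepby(famid)
```

Without the string option, reshape looks for numeric suffixes and would not pick up named, namem, incd, and incm.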

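In general, the syntax for reshaping wide to long is reshape long stem-of-wide-vars, i(wide-id-var) j(var-for-suffix), where:

stem-of-wide-vars is the stem of the wide variables, e.g., faminc
wide-id-var is the variable that uniquely identifies wide observations, e.g., famid
var-for-suffix is the variable that will contain the suffix of the wide variables, e.g., year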