Adding summary statistics to a data file

Adding a statistic to each observation, reducing the file to summary statistics.

Stata has commands that allow you either to add a summary statistic to each observation in memory or to reduce the file to one observation per group, so that each resulting observation holds the summary statistics for that group.

Answers:

2. The generate command has functions, such as "log" below, that compute a value for each observation from that observation's own data. The egen command has a different set of functions. Some of its functions also compute observation-specific values, while others compute a summary statistic across all observations (or across each group) and put that statistic on every observation.

For example, the following generate command would calculate the natural log of the age of each facility in exampfac.dta and add that value to each facility's record:

generate lage= log(age)

In contrast, the following egen command would calculate the median age of all facilities of each type (authorit), and it would add that value to each facility's record:

sort authorit
by authorit: egen medage= median(age)

By the way, you can use bysort to combine the above two commands and reduce your typing:

bysort authorit: egen medage= median(age)

3. After the collapse command, the resulting file had 13 observations and 2 variables. The number of observations equals the number of distinct values of the by variable (or distinct combinations of values, if there is more than one by variable). The number of variables is one for each summary statistic calculated plus one for each by variable.
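
The exact collapse command used in the exercise is not reproduced in this answer, so the following is only a sketch of how such a result could arise. It assumes that exampfac.dta is in the working directory, that the by variable is authorit, and that the single statistic requested is the median of age; the variable name medage is chosen here for illustration.

```stata
* Sketch only: the statistic and the new variable name are assumptions
use exampfac, clear

* Reduce the file to one observation per facility type,
* holding the median age for that type
collapse (median) medage=age, by(authorit)

* Expect one variable per statistic (medage) plus one per by variable (authorit)
describe
list
```

Because collapse replaces the data in memory, you may want to save your work first, or wrap the commands in preserve and restore if you need the original observations back afterwards.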