Example 29.7: Log-Linear Model for Count Data

These data, from Thall and Vail (1990), concern the treatment
of patients who suffer from epileptic seizures.
They are also analyzed in Diggle, Liang, and Zeger (1994).
The data consist of the number of epileptic seizures in an
eight-week baseline period, before any treatment, and in each of
four two-week treatment periods, in which patients received either
a placebo or the drug Progabide in addition to other therapy.
A portion of the data is displayed in Table 29.5.
See "Gee Model for Count Data, Exchangeable Correlation"
in the SAS/STAT Sample Program Library for the complete data set.

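Model the data as a log-linear model with $V(\mu) = \mu$ (the Poisson variance function) and

$$\log(E(Y_{ij})) = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + x_{i1}x_{i2}\beta_3 + \log(t_{ij})$$

where

$Y_{ij}$ = number of epileptic seizures in interval $j$

$t_{ij}$ = length of interval $j$

$x_{i1}$ = 1 for the treatment periods, 0 for the baseline period

$x_{i2}$ = 1 for the Progabide group, 0 for the placebo group

The correlations between the counts are modeled as
$\mathrm{Corr}(Y_{ij}, Y_{ik}) = \alpha$, $j \ne k$ (exchangeable correlations).
For comparison, the correlations are also modeled
as independent (identity correlation matrix).
In this model, the regression parameters have the interpretation
in terms of the log seizure rate displayed in Table 29.6.

The difference between the log seizure rates in the
pretreatment (baseline) period and the treatment
periods is $\beta_1$ for the placebo group and
$\beta_1 + \beta_3$ for the Progabide group.
A value of $\beta_3 < 0$ indicates
a reduction in the seizure rate.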
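
The group differences described above can be checked by evaluating the log seizure rate in each treatment-by-period cell. The worked equations below are a sketch that assumes the usual parameterization for this kind of model: $\beta_0$ is the intercept, $\beta_1$ multiplies a treatment-period indicator, $\beta_2$ multiplies a Progabide indicator, and $\beta_3$ multiplies their product. These symbols are reconstructed from the surrounding text, not quoted from Table 29.6.

```latex
\begin{align*}
\text{Placebo, baseline:}    \quad & \log\bigl(E(Y_{ij})/t_{ij}\bigr) = \beta_0 \\
\text{Placebo, treatment:}   \quad & \log\bigl(E(Y_{ij})/t_{ij}\bigr) = \beta_0 + \beta_1 \\
\text{Progabide, baseline:}  \quad & \log\bigl(E(Y_{ij})/t_{ij}\bigr) = \beta_0 + \beta_2 \\
\text{Progabide, treatment:} \quad & \log\bigl(E(Y_{ij})/t_{ij}\bigr) = \beta_0 + \beta_1 + \beta_2 + \beta_3 \\[6pt]
\text{Placebo change:}       \quad & (\beta_0 + \beta_1) - \beta_0 = \beta_1 \\
\text{Progabide change:}     \quad & (\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_2) = \beta_1 + \beta_3
\end{align*}
```

Under these assumptions, the two changes differ by exactly $\beta_3$, so $\exp(\beta_3)$ is the ratio of the Progabide rate ratio to the placebo rate ratio.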

The following statements input the data, which
are arranged as one visit per observation:
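
A minimal sketch of such an input step is shown here. Only the id variable is named in the text; the data set name thall, the other variable names (y, visit, trt, bline, age), and the file name thall.dat are assumptions made for illustration. The data lines themselves are not reproduced and should be taken from the Sample Program Library entry cited earlier.

```sas
/* Sketch only: 'thall.dat' is a hypothetical file that holds the        */
/* Sample Library data lines, one record per patient visit.              */
/* Variable names other than id are assumed, not taken from the text.    */
data thall;
   infile 'thall.dat';
   input id y visit trt bline age;   /* y = seizure count at this visit  */
run;
```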

Some further data manipulations create an observation for the
baseline measures, a log time interval variable for use as an
offset, and an indicator variable for whether the observation
is for a baseline measurement or a visit measurement.
Patient 207 is deleted as an outlier, as in
the Diggle, Liang, and Zeger (1994) analysis.
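
The manipulations described above might look like the following sketch. It assumes an input data set named thall with variables id, y, visit, and bline (the baseline count); the names new, x1, and ltime are likewise illustrative choices rather than names confirmed by the text.

```sas
/* Add one baseline record (visit=0) per patient alongside the visits */
data new;
   set thall;
   output;                      /* keep the original visit record      */
   if visit = 1 then do;
      y     = bline;            /* baseline count becomes the response */
      visit = 0;
      output;
   end;
run;

/* Drop patient 207 and build the indicator and the offset */
data new;
   set new;
   if id ne 207;
   if visit = 0 then do;
      x1    = 0;                /* baseline measurement                */
      ltime = log(8);           /* eight-week baseline interval        */
   end;
   else do;
      x1    = 1;                /* visit measurement                   */
      ltime = log(2);           /* two-week treatment interval         */
   end;
run;
```

The offset is the log of the interval length, so the model describes seizure rates per week rather than raw counts.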

The GEE solution is requested by using the
REPEATED statement in the GENMOD procedure.
The SUBJECT=ID option indicates that the id variable
describes the observations for a single cluster, and
the CORRW option displays the working correlation matrix.
The TYPE= option specifies the correlation structure;
the value EXCH indicates the exchangeable structure.
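
A sketch of a PROC GENMOD call that uses these options follows. It assumes a data set named new that contains the response y, a period indicator x1, a treatment indicator trt, and a log-interval offset ltime; apart from id, these names are illustrative assumptions.

```sas
proc genmod data=new;
   class id;                                   /* cluster identifier           */
   model y = x1 | trt / dist=poisson           /* main effects and interaction */
                        offset=ltime;          /* log of interval length       */
   repeated subject=id / corrw type=exch;      /* exchangeable working corr.   */
run;
```

Replacing TYPE=EXCH with TYPE=IND requests the independence working correlation used for comparison.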

These statements first produce the usual output from fitting
a generalized linear model (GLM) to these data. The
estimates are used as initial values for the GEE solution.

Information about the GEE model is displayed in Output 29.7.2.
The results of fitting the model are displayed in Output 29.7.3.
Compare these with the model of
independence displayed in Output 29.7.1.
The parameter estimates are nearly identical, but the
standard errors for the independence case are underestimated.
The coefficient of the interaction term is
highly significant under the independence model and only marginally
significant under the exchangeable correlation model.