Using STATA for the self-controlled case series method

Our tutorial paper
explains how to carry out a case series analysis using STATA. The files
corresponding to the examples in this paper are given here.

Correction to section 4.3 of the tutorial paper: It is no longer
necessary to download the file 'aglm.ado', as the standard STATA command 'xtpoisson'
can be used to fit Poisson regression models with absorbing factors. The
following command fits a standard case series model:

xi: xtpoisson nevents i.exgr i.agegr, fe i(indiv) offset(loginterval)
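
On STATA 11 or later, where factor-variable notation is built in, the same model can probably be written without the 'xi:' prefix. The lines below are a minimal sketch rather than a replacement for the command above. They assume the same variable names and data that are already expanded to one row per interval. The 'irr' option, which reports incidence rate ratios instead of log coefficients, is not part of the original command.

```stata
* Sketch only: assumes Stata 11+ and the variable names used above
* Declare the individual identifier as the panel (absorbing) variable
xtset indiv
* Conditional (fixed-effects) Poisson model; irr is an optional extra
xtpoisson nevents i.exgr i.agegr, fe offset(loginterval) irr
```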

All the example do-files below now use 'xtpoisson'.

MMR and meningitis in Oxford example

To run the MMR and meningitis in Oxford example
detailed in the tutorial paper, save these two files in your STATA
working directory:

Then 'do' the do file: type 'do oxford'
in the command window, or open the file in the STATA do file editor and press
Ctrl+D or click on 'Do current file'. The commands in the file are
explained in the file itself as well as in the
tutorial paper.

Files for ITP and MMR examples in the tutorial paper

Save 'itp.dta', the ITP and MMR data, along with the do files into your STATA working directory.

Covariates and Interactions

Repeat exposures

Multiple exposures

Semi-parametric model

The semi-parametric version of the analysis for the MMR and meningitis in
Oxford data is in 'oxford_sp.do' and for the
ITP and MMR data it is in 'itp_sp.do'.

The semi-parametric version of analysis 1 of the OPV and intussusception data is
more computationally intensive; the do file is 'intuss_sp.do'.

These do files require the following variable names: 'sta' for the day before
the start of the observation period, 'end' for the end of the observation
period, 'indiv' for the individual identifier and 'eventday' for the day the
adverse event occurred.
Generate the exposure group cut points, naming them 'excp1', 'excp2', 'excp3'
and so on, and put the number of exposure group cut points minus 1 (usually
this is the number of exposure groups) into a local macro named 'nexgr'.
Exposure group factors are generated as in the parametric models, as a list of
increasing ordinals with a zero on the end. If the exposure groups do not
occur one straight after another, it will be necessary to correct
them using a recode function, as
described in section 6.6 (repeat exposures) in the tutorial paper.
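
As an illustration of the naming requirements above, the set-up might look like the sketch below. The raw variable names (id, obs_start, obs_end, event, vacc) and the single risk window of days 15 to 35 after vaccination are hypothetical and not taken from any of the do files. Whether a cut point should be the last day before a risk period, as assumed here, should be checked against the comments in the do files themselves.

```stata
* Sketch only: id, obs_start, obs_end, event and vacc are hypothetical
* raw variable names; the risk window is illustrative.
rename id indiv
generate sta = obs_start - 1     // day before the observation period starts
generate end = obs_end           // last day of the observation period
generate eventday = event        // day the adverse event occurred
generate excp1 = vacc + 14       // assumed: last day before the risk period
generate excp2 = vacc + 35       // assumed: last day of the risk period
local nexgr = 1                  // 2 cut points - 1 = 1 exposure group
```

A local macro defined in a do file disappears when that do file finishes, so these lines would need to run in the same do file as the analysis that reads 'nexgr'.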

Analysis for censored, perturbed or curtailed post-event exposures

These do files were used to fit analysis 4 of
the validation study in the paper 'Case series analysis for censored,
perturbed or curtailed post-event exposures'. The files use the
intussusception and OPV data from the tutorial paper.

'intuss_cens_pseudo.do':
this file contains only the pseudo-likelihood. It can be bootstrapped
to get confidence intervals; the bootstrap command is shown at the end
of intuss_censored.do, but is preceded by a * so that it will not
run, as it is very slow.

'aglm.ado', 'xtpoisson' and fitting a GLM with absorbing factors

It has been pointed out to us (many thanks to Therese Stukel of the Institute
for Clinical Evaluative Sciences, Canada) that it is possible to fit a
log-linear model with absorbing factors using the standard STATA command 'xtpoisson'.
The examples above now all use this command rather than 'aglm.ado'. However, we
have kept the old information on the aglm ado file here:

To use 'aglm.ado', save the file, which fits a GLM with absorbing factors:

For those wishing to understand what 'aglm.ado' does, these are the changes made to 'glm.ado' to create 'aglm.ado':

Open 'glm.ado' in a text editor.

On line 2 replace 'glm' with 'aglm', so that the line reads:

program aglm, eclass byable(onecall)

First we change the call to 'regress' in the IRLS option to a call to 'areg'.
On line 762 replace 'regress' with 'areg'. On line 763 replace [iw=`W'
with [aw=`W', delete 'mse1' and insert absorb(indiv) in its place.
Lines 762-3 should now read:

cap areg `z' `xvars' /*
*/ [aw=`W'*`wt'/`Wscale'], absorb(indiv) `constant'

Options for 'predict' differ between 'regress' and 'areg', so on line 769
replace 'xb' with 'xbd', giving:

predict double `eta' if `touse', xbd

The 'mse1' option, which re-scales the mean squared error to 1, does not exist
for 'areg'. To re-scale the variance-covariance matrix accordingly, on line 850
replace e(V) with e(V)/(e(rss)/e(df_r)).

The degrees of freedom from the absorbing factors need to be taken into
account. On line 848 add df_a to the end of the line, so that it reads:

tempname b V df_a

After line 850 add an extra line:

scalar `df_a' = e(df_a)

To the end of line 1041 (line 1040 before adding the extra line) add -`df_a',
so that line 1041 reads:

local df=`nobs'-`p'-`df_a'

Save in the relevant directory as type 'all files' with file name 'aglm.ado'.

The absorbing factor must always be called indiv. It would be
better to specify the name of the absorbing factor as an option when giving the
command, and also to delete the parts of the file that cannot be used.
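
For readers curious how an absorbing factor can be passed as an option, the sketch below shows the general pattern using STATA's 'syntax' command. It is not a modification of 'aglm.ado': 'absfit' is a hypothetical wrapper name, and it simply hands the named factor to 'xtpoisson', assuming STATA 11 or later.

```stata
* Sketch only: absfit is a hypothetical wrapper, not part of aglm.ado
capture program drop absfit
program define absfit
    version 11
    * absorb() is required; offset() is optional
    syntax varlist(min=2 numeric fv), ABSorb(varname) [OFFset(varname numeric)]
    * First variable is the outcome, the rest are covariates
    gettoken depvar xvars : varlist
    local offopt
    if "`offset'" != "" local offopt offset(`offset')
    * Fit the conditional Poisson model, absorbing the named factor
    xtpoisson `depvar' `xvars', fe i(`absorb') `offopt'
end
```

With the variable names used earlier on this page, a call would look like: absfit nevents i.exgr i.agegr, absorb(indiv) offset(loginterval)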