How can I use Stata to calculate power by simulation?

Note 1: If you are interested in calculating power for tests on means, proportions, variances, or correlations,
you may want to use the power command, which has subcommands for those tests.

Note 2: This FAQ has been updated for Stata 14. The procedure is based on random draws, so results differ from previous versions because of the new 64-bit Mersenne Twister pseudorandom-number generator.

First, the general procedure is

1. Use the underlying model to generate random data with (a) specified
sample sizes, (b) parameter values that express the difference one is
trying to detect with the hypothesis test, and (c) nuisance parameters
such as variances.

2. Run the Stata estimation command (regress, glm, etc.) on
these randomly generated data.

3. Retrieve the parameter estimates and standard errors from e(b) and
e(V).

4. Calculate the test statistic and p-value. For example, for a
z-test, divide the parameter estimate of interest by its standard
error and use the normal() function to get the p-value.
Specifically, suppose b is the coefficient estimate with standard error
s_b. The two-sided p-value for the test of the hypothesis of a
zero coefficient is then 2[1 − Φ(|t|)], where t = b/s_b and Φ is the
standard normal distribution function returned by normal().

5. Repeat Steps 1–4 many times, say, N, and save the
p-values in a dataset with N observations.

6. The estimated power for a level-alpha test is simply the proportion of
the N p-values that are less than alpha.
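Steps 3 and 4 can be sketched as follows. This is a minimal illustration, not part of the program below; the names bx, se, tv, and pv are illustrative, and the snippet assumes an estimation command has just been run:

```stata
* After an estimation command, e.g.:  regress y x
matrix b = e(b)                       // coefficient vector
matrix V = e(V)                       // variance-covariance matrix
scalar bx = b[1,1]                    // coefficient of interest
scalar se = sqrt(V[1,1])              // its standard error
scalar tv = bx/se                     // z statistic
scalar pv = 2*(1 - normal(abs(tv)))   // two-sided p-value
display "p-value = " pv
```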

Warning: If there are
covariates, make sure they remain fixed throughout all N iterations
of the simulation! Do not regenerate them each time.

Note: As a check, always run the simulation for the null case to make sure
the estimated power comes out to alpha (to within the sampling error implied
by the number of iterations). More generally, in the null case, the
distribution of the p-values should be uniform on [0, 1]. This is easily
tested in Stata using the ksmirnov command.
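For a variable pv holding the simulated p-values, the uniform(0,1) null distribution has CDF F(x) = x, so the one-sample test can be written as follows (a sketch; pv is the variable name used in the program below):

```stata
* One-sample Kolmogorov-Smirnov test of pv against the uniform(0,1)
* distribution, whose CDF evaluated at pv is simply pv itself:
ksmirnov pv = pv
```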

Now an example: We wish to test the effect of a drug dose on a
Poisson-distributed response y in rats. For a sample of n
rats, we have the Poisson regression model

log mu_i = b0 + b1*x_i    (i = 1, 2, ..., n)
y_i ~ Poisson(mu_i)

where xi is the dose in milligrams given to the
ith rat. Suppose we are trying to decide what sample size would give
reasonable power for the test of the null hypothesis b1 =
0 if the true value of b1 were 0.64 (the alternative
hypothesis), under an experimental design with three levels of dosage (0.2,
0.5, and 1.0) repeated r times on n = 3r different
rats. The following program estimates the power for a fixed value of
r. One can then run it for several values to arrive at an r of
choice.

/* poispow.do - does power estimation for Poisson regression model
version 14.1
model: log(mu) = b0 + b1*x
y ~ Poisson(mu)
Specifically, power is estimated for testing the hypothesis b1 = 0, against a
user-specified alternative for a user-specified sample size. Without loss of
generality, we can assume the true value of b0 is 0.
In the `args' statement below:
N is number of iterations in the simulation
r is the number of rats receiving the jth dose (j = 1, 2, 3)
d1, d2 and d3 are the fixed dosages
b1 is the "true" value of b1 (the alternative hypothesis).
*/
args N r d1 d2 d3 b1
drop _all
set obs 3
generate x=`d1' in 1
replace x=`d2' in 2
replace x=`d3' in 3
expand `r'
sort x
generate mu=exp(0 + `b1'*x) /* (the "zero" is to show b0 = 0) */
save tempx, replace
/* Note: Here, I generated my "x" and mu-values and stored them in a dataset
-tempx- so that the same values could be used throughout the simulation */
/* set up a dataset with N observations which will eventually contain N
"p"-values obtained by testing b1=0. */
drop _all
set obs `N'
generate pv = .
/* Loop through the simulations */
local i 0
while `i' < `N' {
    local i = `i' + 1
    preserve
    use tempx, clear        /* get the n = 3*r observations of x
                               and the mean mu into memory */
    generate xp = rpoisson(mu)  /* generate n obs. of a Poisson(mu) random
                                   variable in variable -xp- */
    quietly poisson xp x        /* do the Poisson regression */
    matrix V = e(V)             /* get the standard-error matrix */
    matrix b = e(b)             /* get the vector of estimated coefficients */
    scalar tv = b[1,1]/sqrt(V[1,1])       /* the "t"-ratio */
    scalar pv = 2*(1 - normal(abs(tv)))   /* the p-value */
    restore                     /* get back original dataset with N obs. */
    quietly replace pv = scalar(pv) in `i'  /* set pv to the p-value for
                                               the ith simulation */
    _dots `i' 0
}
/*The dataset in memory now contains N simulated p-values. To get an
estimate of the power, say for alpha=.05, just calculate the proportion
of pv's that are less than 0.05: */
count if pv<.05
scalar power=r(N)/`N'
scalar n=3*`r'
noisily display "Power is " %8.4f scalar(power) " for a sample size of " /*
*/ scalar(n) " and alpha = .05"
exit

We now run this program for various values of r (10, 20, 30, 40, 50,
60, and 70), with 100 simulations per run and with α = 0.05
hard coded in (of course this could easily be changed or made a user input).

. set seed 1234
. run poispow 100 10 .2 .5 1.0 0.64
Power is 0.3100 for a sample size of 30 and alpha = .05
. run poispow 100 20 .2 .5 1.0 0.64
Power is 0.5700 for a sample size of 60 and alpha = .05
. run poispow 100 30 .2 .5 1.0 0.64
Power is 0.7300 for a sample size of 90 and alpha = .05
. run poispow 100 40 .2 .5 1.0 0.64
Power is 0.8700 for a sample size of 120 and alpha = .05
. run poispow 100 50 .2 .5 1.0 0.64
Power is 0.9200 for a sample size of 150 and alpha = .05
. run poispow 100 60 .2 .5 1.0 0.64
Power is 0.9800 for a sample size of 180 and alpha = .05
. run poispow 100 70 .2 .5 1.0 0.64
Power is 0.9700 for a sample size of 210 and alpha = .05

Although each estimate is not that precise, being based on only 100 trials,
one could fit a logistic or other regression curve through these data to get
a good estimate of which value of r would give a prescribed power,
such as 0.80. Of course, one could get better accuracy by increasing
N at the expense of more computer time.
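One way to carry out such a fit is sketched below, entering the rejection counts out of 100 runs from the table above and modeling them as binomial with a logit link. The grid of r values and the variable names (rejects, trials, muhat) are illustrative choices, not part of the program above:

```stata
clear
input r rejects
10 31
20 57
30 73
40 87
50 92
60 98
70 97
end
* Model the count of rejections out of 100 runs as binomial(100)
generate trials = 100
glm rejects r, family(binomial trials) link(logit)
* Predict power over a fine grid of r and read off where it crosses 0.80
drop _all
set obs 61
generate r = 9 + _n
generate trials = 100
predict muhat, mu            // expected rejections out of 100
generate power = muhat/100
list r power if inrange(power, .78, .82)
```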

Finally (perhaps this should have been done first), we run the program with
b1 = 0 and a large value of N (say, 1,000) to see
if the level of the test is “close” to 0.05. If not, either the normal
approximation is not adequate, or we may have programmed the procedure
incorrectly. To take the most demanding case (i.e., the smallest sample
size considered), we run this for r = 10:

. set seed 1234
. run poispow 1000 10 .2 .5 1.0 0.0
Power is 0.0430 for a sample size of 30 and alpha = .05

There were 43 simulated p-values less than 0.05, which for 1,000
binomial trials gives a 95% confidence interval of about (0.0313, 0.0575) for
the level of the test (see the cii command results below). This suggests
that the normal-approximation z-test is quite good for 30
observations (r = 10).

. cii proportions 1000 43

                                                         -- Binomial Exact --
    Variable |        Obs  Proportion   Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |      1,000        .043    .0064149        .0312912    .0574863

Note: Users of Stata 14.0 and earlier should use the syntax cii 1000 43
instead of cii proportions 1000 43.

Finally, we check the entire distribution of simulated p-values for a
uniform distribution: