Monday, October 12, 2015

Sample Size Estimation Based on Precision for Survey and Clinical Studies such as Immunogenicity Studies

Sometimes, we may need to calculate the
sample size to estimate a population proportion or a population mean with a
given precision or margin of error. Here we use the terms ‘precision’ and ‘margin of
error’ interchangeably. The precision may also be referred to as “half of the
confidence interval”, “half of the width of CI”, or “Distance from mean to
limit”, depending on the sample size calculation software.

A statistician may need to estimate sample
sizes in situations such as the following:

Example 1: A survey estimated that 20% of all
Americans aged 16 to 20 drove under the influence of drugs or alcohol. A
similar survey is planned for New Zealand. The researchers want to estimate a
sample size for the survey and they want a 95% confidence interval to have a
margin of error of 0.04.

Example 2: An immunogenicity study is planned
to investigate the occurrence of the antibody to a therapeutic protein. There
is no prior information about the percentage of patients who may develop the
antibody to the therapeutic protein. How many patients are needed for the study
with a 95% confidence interval and a precision of 10%?

Example 3: A tax assessor wants to assess the
mean property tax bill for all homeowners in Madison, Wisconsin. A survey ten
years ago got a sample mean and standard deviation of $1400 and $1000. How many
tax records should be sampled for a 95% confidence interval to have a margin of
error of $100?

These are situations where the sample
size estimation is based on the confidence interval and the margin of error.
Examples #1 and #2 deal with the one-sample proportion, where we
would like to estimate the sample size needed to obtain an estimate of the
population proportion with a certain precision. Example #3 deals
with the one-sample mean, where we would like to estimate the sample size needed
to obtain an estimate of the population mean with a certain precision.

Sample Size to Estimate A Proportion With a
Precision

The usual formula is:

N = z^2 p(1-p) / d^2

where p is the proportion (which may be obtained
from a previous study) and d is the precision or margin of error. z is the
Z-score, e.g. 1.645 for a 90% confidence interval, 1.96 for a 95% confidence
interval, and 2.58 for a 99% confidence interval.

For example #1, the sample size will be
calculated as:

N = 1.96^2 x 0.2 x 0.8 / 0.04^2 = 384.2, rounded
up to 385

Similarly, if we use PASS, the input
parameters will be

Confidence Interval: Simple Asymptotic

Interval Type: Two-sided

Confidence level (1-alpha): 0.95

Confidence Interval Width (two-sided): 0.08 (note: 0.04 x 2)

P (Proportion): 0.2

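For example #2, since there is no prior
information about the proportion, the practical approach is to assume p = 0.50.
Because p(1-p) is largest at p = 0.50, this gives the most conservative (largest)
sample size, which is big enough to ensure the desired precision whatever the
true proportion turns out to be.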

If we use the formula, the sample size will be
calculated as:

N = 1.96^2 x 0.5 x 0.5 / 0.1^2 = 96.04, rounded up to 97

Similarly, if we use PASS, the input
parameters will be

Confidence Interval: Simple Asymptotic

Interval Type: Two-sided

Confidence level (1-alpha): 0.95

Confidence Interval Width (two-sided): 0.2 (note: 0.1 x 2)

P (Proportion): 0.5
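
The hand calculations for examples #1 and #2 can also be reproduced with a few lines of Python. This is only a minimal sketch of the normal-approximation formula above, not what PASS does internally; the function name sample_size_proportion and its arguments are made up for illustration. It uses the exact normal quantile (about 1.95996) instead of the rounded 1.96, so the unrounded values differ slightly from the hand calculation, but the rounded-up sample sizes are the same.

```python
import math
from statistics import NormalDist


def sample_size_proportion(p, d, confidence=0.95):
    """Sample size for estimating one proportion with margin of error d.

    Implements N = z^2 * p * (1 - p) / d^2 and rounds up to a whole subject.
    (Hypothetical helper for illustration; the name is not from any package.)
    """
    # Two-sided z-score, e.g. about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z**2 * p * (1 - p) / d**2
    return math.ceil(n)


# Example 1: p = 0.2 from the earlier survey, margin of error 0.04
print(sample_size_proportion(0.2, 0.04))   # 385

# Example 2: no prior information, so use the conservative p = 0.5
print(sample_size_proportion(0.5, 0.10))   # 97
```

Trying other values of p with the same d shows that p = 0.5 always gives the largest N, which is why it is the safe choice when nothing is known about the proportion.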

Sample Size to Estimate a Mean With a
Precision

The usual formula is:

N = (s t/d)^2

where s is the standard deviation, t is the
t-score (which can be approximated by the Z-score under a normal approximation),
and d is the precision or margin of error.

For example #3:

N = (1000 x 1.96 / 100)^2 = 384.2, rounded up to 385

Similarly, if we use PASS, the input
parameters will be:

Solved for: Sample size

Interval type: two-sided

Population size: infinite

Confidence Interval (1-alpha): 0.95

Distance from mean to limits: 100

S (standard deviation): 1000
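
The calculation for example #3 can be sketched in Python in the same way. This is a rough illustration that replaces the t-score with the Z-score, as in the hand calculation above; it is not the algorithm PASS uses, and the function name sample_size_mean is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_mean(s, d, confidence=0.95):
    """Sample size for estimating one mean with margin of error d.

    Implements N = (s * z / d)^2, using the Z-score in place of the t-score
    (an approximation), and rounds up to a whole record.
    """
    # Two-sided z-score, e.g. about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((s * z / d) ** 2)


# Example 3: standard deviation $1000, margin of error $100
print(sample_size_mean(1000, 100))   # 385
```

Because the t-score is slightly larger than the Z-score for any finite sample, a calculation based on the t-distribution would give a sample size that is the same or slightly larger.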

The sample size calculation based on
precision is popular in surveys in epidemiology studies and in polling in
political science. In clinical trials, it seems to be common in immunogenicity
studies. In immunogenicity studies, it is not limited to the one-sample situation; it
may also be used in the two-sample situation. In the book “Biosimilars:
Design and Analysis of Follow-on Biologics” by Dr Chow, the sample size section
mentions the calculation based on precision:

In immunogenicity studies, the
incidence rate of immune response is expected to be low. In this case, the
usual pre-study power analysis for sample size calculation for detecting a
clinically meaningful difference may not be feasible. Alternatively, we may
consider selecting an appropriate sample size based on precision analysis
rather than power analysis to provide some statistical inference.

The half of the width of the CI is given
by w = z(1-alpha/2) x sigma-hat, which is usually referred to as the maximum error
margin allowed for a given sample size n. In practice, the maximum error margin
allowed represents the precision that one would expect for the selected sample
size. The precision analysis for sample size determination is to consider the
maximum error margin allowed. In other words, we are confident that the true
difference delta = pR - pT would fall within the margin of w = z(1-alpha/2) x sigma-hat
for a given sample size of n. Thus, the sample size required for achieving the
desired precision can be chosen.

This approach, based on the
interest in only the type I error, is to specify precision while estimating the
true delta for selecting n. Under a fixed power and
significance level, the sample size based on power analysis is much larger than
the sample size based on precision analysis with extremely low infection rate
difference or large allowed error margin.
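
To make the comparison in the passage above concrete, here is a rough Python sketch for the two-sample situation. It rests on assumptions that are not spelled out in the quoted text: equal sample sizes n in the two groups, and sigma-hat taken as the usual standard error of the difference between two independent proportions, sqrt(p1(1-p1)/n + p2(1-p2)/n). The function names and the incidence rates of 2% and 1% are made up for illustration only.

```python
import math
from statistics import NormalDist


def z_quantile(prob):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(prob)


def n_per_group_precision(p_ref, p_test, w, alpha=0.05):
    """n per group so that the CI half-width for p_ref - p_test is at most w.

    Assumes equal group sizes and the normal-approximation standard error.
    """
    z = z_quantile(1 - alpha / 2)
    variance_sum = p_ref * (1 - p_ref) + p_test * (1 - p_test)
    return math.ceil(z**2 * variance_sum / w**2)


def n_per_group_power(p_ref, p_test, alpha=0.05, power=0.80):
    """n per group to detect the difference p_ref - p_test with given power.

    Standard two-sided normal-approximation formula with unpooled variance.
    """
    z_a = z_quantile(1 - alpha / 2)
    z_b = z_quantile(power)
    variance_sum = p_ref * (1 - p_ref) + p_test * (1 - p_test)
    delta = p_ref - p_test
    return math.ceil((z_a + z_b) ** 2 * variance_sum / delta**2)


# Hypothetical rates: 2% vs 1%, with an allowed error margin of 0.02
print(n_per_group_precision(0.02, 0.01, w=0.02))   # 284
print(n_per_group_power(0.02, 0.01))               # 2316
```

With these illustrative inputs the power-based sample size is roughly eight times the precision-based one. Under these formulas the ratio depends only on the ratio of the allowed margin w to the true difference, so the gap widens as the difference gets smaller or the allowed margin gets larger.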

SAS Proc
Power can also calculate the sample size. The exact method is used for sample
size calculation in SAS. The obtained sample size is usually greater than the
ones calculated by hand (formula) or using PASS.

For the
confidence interval for the one-sample proportion situation, the SAS code will be
something like this: