I'm at the experiment-design stage of a biomedical time-course study. Say we will have two groups of subjects, case and control. The total number of subjects is limited (for example, 30), and we need to decide how best to split them between case and control. One option is a balanced design: 50% case vs. 50% control. Another is to assign 2/3 of the subjects to case and 1/3 to control, since the case subjects would be more interesting.

What experiment design would you suggest? Which issues and pitfalls should be considered in each case? Should the design depend on the questions asked?

Are you assigning patients to cases or control, or merely collecting those who happen to be cases and controls?
– Fomite, Nov 28 '12 at 2:42

@EpiGrad: Yes, we're assigning. But did I say patients? The question is more general; subjects can be mice, for example. Are you thinking of ethical issues with humans?
– yuk, Nov 28 '12 at 20:55

3 Answers

How this is answered depends on how the data will be analyzed. But there are general principles that can be applied. The main idea is to identify what is under experimental control and vary that to optimize the ability of the statistical test to detect an effect if one indeed exists.

To illustrate how this sort of thing is worked out, let's consider the classic textbook situation evoked by the description in the question: a t-test to compare the treatment mean $\bar{X}_1$ to the control mean $\bar{X}_2$, assuming (at least approximately) equal variances $\sigma^2$ in the two groups. In this situation we can control the numbers of subjects placed into the treatment group ($n_1$ of them) and the control group ($n_2$ of them). The strategy is to do this in a way that maximizes the chances of detecting a difference between the group means in the population, if such a difference really exists. Along the way we are usually willing to make reasonable simplifying assumptions, expecting that the optimal design attained in such idealized circumstances will be close to optimal under true experimental conditions. Watch for those assumptions in the following analysis.

Under the Normality assumption, the scaled sample variances $(n_1-1)s_1^2/\sigma^2$ and $(n_2-1)s_2^2/\sigma^2$ have $\chi^2$ distributions with $n_1-1$ and $n_2-1$ degrees of freedom, respectively, independently of the group means.

Whence the sum $(n_1-1)s_1^2 + (n_2-1)s_2^2$ is distributed as $\sigma^2$ times a $\chi^2(n-2)$ variate (writing $n = n_1 + n_2$), no matter what values $n_1$ and $n_2$ may take.
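This distributional fact is easy to check numerically. Below is a small sketch (the group sizes, $\sigma$, and seed are arbitrary illustrations) that verifies the pooled sum of squares, divided by $\sigma^2$, has the mean $n-2$ and variance $2(n-2)$ of a $\chi^2(n-2)$ variate:

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, sigma = 10, 6, 2.0   # illustrative group sizes and common sd
n = n1 + n2
N = 50_000                   # Monte Carlo replications

# Pooled sum of squares (n1-1)s1^2 + (n2-1)s2^2, divided by sigma^2,
# should behave like a chi-squared variate with n - 2 degrees of freedom.
x = rng.normal(0.0, sigma, size=(N, n1))
y = rng.normal(5.0, sigma, size=(N, n2))  # a different mean does not matter
ss = (n1 - 1) * x.var(axis=1, ddof=1) + (n2 - 1) * y.var(axis=1, ddof=1)
z = ss / sigma**2

print(z.mean())  # theoretical chi-squared mean: n - 2 = 14
print(z.var())   # theoretical chi-squared variance: 2(n - 2) = 28
```

The group means drop out entirely; only the degrees of freedom $n-2$ matter, which is the point of the derivation.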

Accordingly, when the null hypothesis is true, $t$ will have a Student $t$ distribution with $n-2$ degrees of freedom regardless of how we choose $n_1$ and $n_2$. Our choice of $n_1$ and $n_2$ will not affect that. However, when the null hypothesis is false, the ratio $$\frac{\bar{X}_1 - \bar{X}_2}{s_{12}}$$ (where $s_{12}$ is the pooled standard deviation) will tend to be near the standardized effect size--the size of the difference between the groups relative to $\sigma$. In computing $t$, that difference is magnified by the coefficient $$\frac{1}{\sqrt{1/n_1 + 1/n_2}}.$$ We obtain the greatest power to detect a difference when the magnification is as great as possible.

Holding $n = n_1 + n_2$ fixed and writing $n_2 = n - n_1$, this coefficient equals $$\frac{1}{\sqrt{1/n_1 + 1/n_2}} = \sqrt{\frac{n_1(n - n_1)}{n}} = \sqrt{\frac{n^2/4 - (n_1 - n/2)^2}{n}},$$ which obviously is maximized by making $(n_1-n/2)^2$ as small as possible (giving a maximum value close to $\sqrt{n}/2$). Consequently, the best power to detect a difference between the treatment and control means with this test is attained when the subjects are divided as evenly as possible (but randomly) between the groups.
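For the $n = 30$ subjects mentioned in the question, the magnification coefficient can simply be tabulated over all splits; a minimal sketch confirming that the even split $n_1 = n_2 = 15$ attains the maximum $\sqrt{n}/2$:

```python
import math

def magnification(n1, n2):
    """Coefficient 1/sqrt(1/n1 + 1/n2) that multiplies the
    standardized effect size in the t statistic."""
    return 1.0 / math.sqrt(1.0 / n1 + 1.0 / n2)

n = 30  # total subjects, as in the question
values = {n1: magnification(n1, n - n1) for n1 in range(1, n)}
best = max(values, key=values.get)

print(best)                          # -> 15 (the even split)
print(values[best], math.sqrt(n) / 2)  # both equal sqrt(30)/2
```

A 20/10 split (the 2/3 vs. 1/3 design from the question) gives `magnification(20, 10) ≈ 2.58` against `≈ 2.74` for 15/15, so the unbalanced design throws away some power.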

As a further demonstration of this result, I simulated the case where $n=16$, with Normal distributions within each group, using four standardized effect sizes $\delta = 0, 1/2, 1,$ and $3/2$. By running $N=5000$ replications of each experiment for all reasonable values of $n_1$ (ranging from $2$ through $n/2=8$), a Monte-Carlo distribution of $t$ was obtained. The following plots show how its mean value tracks the magnification coefficient closely, except when there is no effect ($\delta=0$), exactly as claimed.
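A sketch of such a simulation (the answer's own code is not shown, so the replication count and seed below are arbitrary; the setup follows the description in the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 16, 2000  # total subjects and Monte Carlo replications

def mean_t(n1, delta):
    """Average pooled-variance two-sample t statistic over N replications,
    with n1 treatment subjects (mean delta) and n - n1 controls (mean 0)."""
    n2 = n - n1
    x = rng.normal(delta, 1.0, size=(N, n1))  # treatment group
    y = rng.normal(0.0, 1.0, size=(N, n2))    # control group
    s2 = ((n1 - 1) * x.var(axis=1, ddof=1)
          + (n2 - 1) * y.var(axis=1, ddof=1)) / (n - 2)
    t = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(s2 * (1 / n1 + 1 / n2))
    return t.mean()

for delta in (0.0, 0.5, 1.0, 1.5):
    print([round(mean_t(n1, delta), 2) for n1 in range(2, n // 2 + 1)])
```

For each $\delta > 0$, the printed row increases toward $n_1 = 8$, mirroring the magnification coefficient; the $\delta = 0$ row hovers around zero for every split.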

Because the real world is rarely as nicely behaved as a computer simulation, let's give our analysis a hard time by simulating highly skewed distributions within the groups (which is hard for a $t$ test to deal with). To this end I used a Gamma$(2, 1)$ distribution for the control group and a Gamma$(2, 1+\delta/2)$ distribution for the treatment group. Here are the results:

Although these plots are not as perfect as the preceding ones, they come close enough despite the substantial departures from the simplifying assumptions ($n$ is small and the data are nowhere near Normally distributed). Thus we can trust the conclusion that the experiment will be most powerful when subjects are close to equally divided between treatment and control groups.
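The skewed-group simulation can be sketched the same way (again, these are illustrative replication counts and seeds, not the answer's original code); only the sampling distributions change relative to the Normal case:

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 16, 2000  # total subjects and Monte Carlo replications

def mean_t_gamma(n1, delta):
    """Mean pooled-variance t statistic when both groups are highly
    skewed: Gamma(2, 1 + delta/2) treatment vs. Gamma(2, 1) control."""
    n2 = n - n1
    x = rng.gamma(2.0, 1.0 + delta / 2.0, size=(N, n1))  # treatment group
    y = rng.gamma(2.0, 1.0, size=(N, n2))                # control group
    s2 = ((n1 - 1) * x.var(axis=1, ddof=1)
          + (n2 - 1) * y.var(axis=1, ddof=1)) / (n - 2)
    t = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(s2 * (1 / n1 + 1 / n2))
    return t.mean()

for delta in (0.0, 0.5, 1.0, 1.5):
    print([round(mean_t_gamma(n1, delta), 2) for n1 in range(2, n // 2 + 1)])
```

Despite the skewness (and the unequal group variances the Gamma scales induce), the mean $t$ still grows with $\delta$ and is largest near the even split, which is the robustness claim being made.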

Even if the case subjects are more interesting, I would recommend using balanced group sizes. Reason: AN(C)OVA and other analysis methods are more robust when group sizes are equal. See, e.g., Glass, Peckham and Sanders (1972).

As you surmised, your level of interest in accurately describing the treatment vs. control groups is relevant, as is the prior information you have on each. Perhaps you have a theoretical reason to expect higher variance in the treatment group... that could encourage you to make the treatment group larger.
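One standard way to formalize that last point: to minimize the variance of $\bar{X}_1 - \bar{X}_2$ when the group variances differ, allocate subjects in proportion to the group standard deviations (Neyman allocation). A minimal sketch, with purely illustrative variance numbers:

```python
import math

def neyman_split(n, sd_treat, sd_control):
    """Allocate n subjects proportionally to group standard deviations,
    which minimizes Var(X1_bar - X2_bar) = sd1^2/n1 + sd2^2/n2."""
    n1 = round(n * sd_treat / (sd_treat + sd_control))
    return n1, n - n1

# Illustrative: treatment responses twice as variable as control.
print(neyman_split(30, sd_treat=2.0, sd_control=1.0))  # -> (20, 10)
print(neyman_split(30, sd_treat=1.0, sd_control=1.0))  # -> (15, 15)
```

With equal variances this rule recovers the balanced design, consistent with the other answers; it favors an unbalanced split only when there is real prior reason to expect unequal variances.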

One of the downsides of an unbalanced design is that it is harder to explain.