Ethical clinical trials, where the goal is to administer the better of two treatments to as many patients as possible, were the original motivation behind multi-armed bandits when Thompson introduced the problem in 1933. Since then, many variants of this problem have been investigated, especially under the impetus of online advertising. Whereas the traditional bandit setup allows the allocation strategy to be reevaluated after each patient, we study a framework in which the strategy must operate in a small number of stages, typically 2 or 3. Such a restriction is particularly relevant when the treatment effect can only be measured days or weeks after administration. Our minimax analysis provides guidelines for the size of the stages as well as the allocation policy within each stage. Moreover, we show that a very small number of stages (at most 5) is already enough to recover the optimal bounds of the unrestricted setup.