Correct me if I'm wrong, but isn't that the point? I assume that the
hypothesis is that one or more of these genes are true predictors,
i.e. for these genes the p-value should be significant. For all the
other genes, the p-value is uniformly distributed. Using a
significance level of 0.01, and prior knowledge that there are
significant genes, you will end up with on the order of 20 genes, some
of which are the "true" predictors, and the rest being false
positives. This set of 20 genes can then be further analysed. A much
smaller and easier problem to solve, no?
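
To make the arithmetic concrete, here is a rough simulation sketch. The numbers are my assumptions, not from the original problem: I picked 2000 null genes so that the expected false-positive count at 0.01 lands near the ~20 mentioned above, 3 true predictors, and the helper name screen_genes is made up. The true predictors are simply given very small p-values, which is an idealisation.

```python
import random

def screen_genes(n_null, n_true, alpha, seed=0):
    """Count genes passing the threshold (hypothetical helper).

    Null genes get p-values drawn uniformly from [0, 1).
    True predictors are ASSUMED to have tiny p-values
    (uniform on [0, alpha / 10)), purely for illustration.
    """
    rng = random.Random(seed)
    null_p = [rng.random() for _ in range(n_null)]
    true_p = [rng.random() * alpha / 10 for _ in range(n_true)]
    false_pos = sum(p < alpha for p in null_p)
    true_pos = sum(p < alpha for p in true_p)
    return true_pos, false_pos

# Assumed setup: 2000 null genes, 3 true predictors, alpha = 0.01.
N_NULL, N_TRUE, ALPHA = 2000, 3, 0.01
expected_false_pos = N_NULL * ALPHA  # about 20 on average
true_pos, false_pos = screen_genes(N_NULL, N_TRUE, ALPHA)
print("expected false positives:", expected_false_pos)
print("true positives kept:", true_pos)
print("false positives kept:", false_pos)
```

The false-positive count varies from run to run around n_null * alpha, so the shortlist is a mix of real hits and noise, but it is small enough to follow up gene by gene.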