It is commonly accepted that the length of the polyQ tract influences the transactivation capacity of the receptor in an inverse manner; that is, the longer the tract, the lower the activity. To support this hypothesis, a clear negative impact on AR activity is documented in relationship with pathological expansions of the repeat length (40 or more), known as the Kennedy syndrome (5). This syndrome is characterized by spinobulbar muscular atrophy and hypoandrogenism due to partial androgen insensitivity. On the other hand, controversies still exist about the effect of variations in polyQ within the normal polymorphic range. The normal distribution of the (CAG)n is reported as 6–39 repeats, with a median of 21–22 in White Caucasian, 19–20 in African-American, 22–23 in Asian, and 23 in Hispanic populations. Clinical observations showing a linear correlation between testosterone level and CAG repeat length support the notion of a functional effect of the polymorphism within the normal range. In fact, increased circulating testosterone and estradiol levels in men with a higher number of CAG repeats can be considered as a compensatory mechanism aimed to overcome the weaker AR activity (6, 7). However, such a linear correlation has not been clearly demonstrated by in vitro experiments. The first two functional studies reported that the longest tract (Q31) displayed lower activity when compared with the shortest one (Q15). However, no significant differences were observed by comparing these two types of alleles to an intermediate number of CAG repeats (20 or 24) (8, 9). Quite strikingly, two recent articles provided evidence for the lack of a stepwise reduction in activity with increasing CAG length across the polymorphic range (10, 11). The reporter gene assay with three different CAG lengths (16, 22, and 28) has indeed shown the highest AR activity in the presence of 22 CAG repeats (10).
The other study, performed in a human prostate tumor cell model, has provided mechanistic insights into how both increased and decreased polyQ allele length may negatively affect receptor function (11). This study has revealed a critical polyglutamine size (Q16-Q29) for optimal androgen-induced AR signaling, which corresponds to 91–99% of AR alleles within different ethnic groups. These novel in vitro findings have introduced a new concept for the analysis of AR-CAG repeat length in relationship to AR-related diseases, indicating that linear regression models are likely to be inappropriate.
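To make the point about linear regression concrete, here is a minimal sketch with entirely synthetic numbers. It assumes a toy activity curve that is flat inside the Q16-Q29 range mentioned above and drops by a made-up 0.05 per repeat outside it, evaluated over the 6 to 39 repeat span quoted earlier. The names `toy_activity`, `distance_from_range`, `least_squares` and the `PENALTY` value are illustrative assumptions, not anything taken from the cited studies.

```python
# Toy illustration only: the activity values are synthetic, not measurements.
LOW, HIGH = 16, 29          # "optimal" polyQ range reported in the quoted text
CAG_MIN, CAG_MAX = 6, 39    # normal polymorphic range reported in the quoted text
PENALTY = 0.05              # hypothetical activity loss per repeat outside the range


def distance_from_range(cag):
    """Number of repeats by which `cag` falls outside [LOW, HIGH] (0 if inside)."""
    return max(LOW - cag, 0, cag - HIGH)


def toy_activity(cag):
    """Hypothetical relative AR activity: 1.0 inside the range, lower outside."""
    return 1.0 - PENALTY * distance_from_range(cag)


def least_squares(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


cags = list(range(CAG_MIN, CAG_MAX + 1))
activity = [toy_activity(c) for c in cags]

# Model 1: activity as a straight-line function of CAG length.
_, slope_linear, r2_linear = least_squares(cags, activity)

# Model 2: activity as a function of distance from the optimal range.
distances = [distance_from_range(c) for c in cags]
_, slope_distance, r2_distance = least_squares(distances, activity)

print(f"linear in CAG length:     slope={slope_linear:.4f}, R^2={r2_linear:.4f}")
print(f"distance from Q16-Q29:    slope={slope_distance:.4f}, R^2={r2_distance:.4f}")
```

In this symmetric toy case the straight-line fit explains essentially none of the variation, even though activity depends entirely on repeat length, while the distance-from-range predictor recovers the relationship. Real data would be noisier and the curve need not be symmetric, so this only illustrates why a null linear association does not rule out an effect.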

The study by Davis-Dao et al. (12) indicates a disadvantage only in the case of short CAG repeats; however, upcoming investigations will probably shed light on whether the “optimal range” hypothesis can be applied also to this specific pathological context. In fact, the stratified analysis of nearly 4000 subjects, included in articles dealing with male infertility and AR-CAG length, has provided clinical evidence for the potential benefit of a CAG range corresponding to 22–23 triplets in spermatogenesis (24). However, it must be taken into consideration that this specific range may not be the same across different ethnic groups and may even vary in different tissues because the effect of polyQ repeat on transactivation is cell specific, presumably due to distinct profiles of coregulator proteins (11). Moreover, it is possible that spermatogenesis, more than the process of testis descent, depends predominantly on the genomic action of androgens, and thus on the direct consequence of the CAG length on transactivation. Clearly, more functional studies are needed for the interpretation of clinical data in different types of androgen-dependent diseases.

That blacks average fewer AR-CAG repeats has been held up as evidence that blacks are more "masculinized". I was unconvinced one could draw that conclusion, even accepting an inverse relationship between CAG repeat length and AR activity, since the CAG repeat represents only one link in androgen-related pathways and there may be any number of other racial differences in relevant genes. Now it appears that, compared to blacks, whites may in fact be more likely to have "optimal" CAG repeat lengths.

In addition, variation in another polymorphism of the AR gene, GGN repeat length, could conceivably lower AR activity in blacks relative to whites:

Contrary to previously published data from Caucasians and Asian populations, which have the 2 by far most common GGN alleles of 23 and 24 in the former, and 21 and 22 in the latter, we found 4 common alleles of 20, 21, 22, and 23 in our study population with the highest frequency of 20 GGN allele followed by 22, 21 and 23 (GGN)n (Fig. 2). ["Androgen receptor gene CAG and GGN polymorphisms in infertile Nigerian men"]

Some more background from the first article above:

Throughout the human genome there are trinucleotide repeat sequences susceptible to either expansion or contraction during replication, giving rise to length polymorphisms in the general population. The polymorphic CAG repeat, which encodes an uninterrupted polyglutamine (polyQ) tract in the N-terminal transactivation domain of the androgen receptor (AR), is the most extensively studied genetic variant in individuals with disorders of the male reproductive system.

Despite an impressive number of studies, the pathogenic role of this polymorphism and its clinical relevance are still a matter of debate. Although a recent meta-analysis of 33 publications (1) supports a pathogenetic role for longer polyQ length in male infertility, the authors conclude their work stating that there is a need for new, well-designed studies (1). In fact, available data do not allow us to establish what range of AR-CAG repeat lengths predisposes impaired sperm production or to estimate the entity of the associated risk (1). Similar to other genetic variants, the literature related to CAG repeats suffers from an abundance in conflicting case-control association studies and a paucity of functional data (2). There are several plausible explanations for these apparent controversies, mostly related to: 1) poor study design (inappropriate selection of patients and controls, particularly with respect to their phenotype and their ethnic/geographic origin, and underpowered size of the study population); and 2) intrinsic complexity of the interaction between the AR and its endogenous/environmental ligands. An additional intricacy derives from the presence of another polymorphic trinucleotide repeat, (GGN)n, in the first exon of the AR gene, which may modulate the functional effect of the CAG repeat length, stressing the need for a combined analysis of the two AR polymorphisms (3, 4).

“We have sequenced the whole genomes of 2,500 people. We have genotyped about 120,000 Icelanders with an Illumina chip. We can impute whole genome sequence down to variants with less than 0.1% frequency into about 370,000 Icelanders -- there are only 320,000 living today!”