Signal region synapomorphy

22 Oct 98 webmaster

The prion gene may be good for something after all -- helping resolve a vexatious issue in mammalian evolution.

Eutherian mammals are thought to have experienced a very rapid radiation roughly 100 million years ago (mya) into the various taxonomic orders of today such as rodents, primates, carnivores, and ruminants. Events this distant are difficult to resolve by aligning sequence data because fixed mutational changes are rare during the brief window critical to tree topology. If the radiation considered here took place over 1 million years, the 1:99 ratio of branch lengths in the phylogenetic tree implies observed change will be overwhelmingly post-radiation and irrelevant.

For many genes, eg cytochrome oxidase or DNA polymerase, there is no reason whatsoever to expect a favorable acceleration of rate of change during the divergence window -- these genes have the same function whatever the skeletal morphology or habitat niche so do not experience a slackening of selective pressure (though founder drift effects could be enhanced). Recent work in hox genes further illustrates that remarkable changes in morphology can arise from a few modest point mutations or duplications in genes that direct development.

In the prion gene and many others, the rate of evolutionary change varies markedly by codon, by as much as a factor of 50. Stretches such as AGAAAAGA see no change accepted for more than 310my in any lineage; at the other extreme, serine-asparagine toggle codons have fixed changes in many disparate lineages with a characteristic time scale of 10my and are often seen in extant species as alleles. Other codons, such as the ancestral tyrosine in DYEDRY, exhibit synapomorphic changes, here to tryptophan (a 2bp change with DCEDRY as probable intermediate, seen only in post-guinea pig rodents).

Thus the situation is really worse than the first two factors [branch fraction, steady rates] suggest: the mutations most likely to occur during the critical 1my window are in codons with highest rates of evolution -- exactly those which are likely to be over-written numerous times subsequently. In the main prion gene, the codon most likely to change in 1my is GGHNQW. Following any particular lineage forward in time results in several changes. Since the only data is from extant species, nothing reliable can then be inferred about the period of interest 100mya.

Codons with phylogenetic signal relative to the mammalian radiation are thus rare because of conflicting requirements -- a slowly evolving codon needs to have changed during a particular narrow window. If the rate of change by codon position is plotted, codons can be weighted for relevance by convolution of the Fourier transform with a time scale semi-gaussian for the polytomous node in question, ie, by low band-pass Fourier filtration.
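The conflict between changing too fast and too slowly can be made concrete with a minimal sketch. It assumes, which the text does not state, that accepted changes at a codon follow a Poisson process with a constant rate per million years, that the radiation window lasts 1 my, and that 99 my follow it. A codon carries signal only if it changed at least once in the window and never afterwards. The function names are illustrative only.

```python
import math

def signal_weight(rate, window=1.0, after=99.0):
    """Probability that a codon changed during the window and was never
    overwritten afterwards, under an assumed constant-rate Poisson model.
    rate is in accepted changes per million years (my)."""
    changed_in_window = 1.0 - math.exp(-rate * window)
    untouched_after = math.exp(-rate * after)
    return changed_in_window * untouched_after

def best_rate(window=1.0, after=99.0):
    """Rate that maximizes signal_weight (closed form, derivative set to zero)."""
    return math.log((window + after) / after) / window

if __name__ == "__main__":
    # Characteristic times taken from the text: 10 my toggles, 310 my frozen stretches.
    for label, rate in [("toggle (1 per 10 my)", 1 / 10),
                        ("optimum", best_rate()),
                        ("frozen (1 per 310 my)", 1 / 310)]:
        print(f"{label:22s} rate={rate:.4f}/my  weight={signal_weight(rate):.6f}")
```

Under these assumptions the best-tuned codon changes about once per 100 my, and even then it has well under a 1% chance of carrying a clean signal from the window, which is why such positions are rare.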

Codons that change too fast are effectively discarded; codons that change too slowly are moot. It is sometimes hypothesized that rate of change is regionally smooth, yet adjacent amino acids in an alpha helix do not have their side chains in proximity -- one may be a structurally critical interior residue, the next a weakly constrained polar surface residue. In the prion gene, 256 codons are quickly reduced to a handful of potentially synapomorphic positions because there are many invariant residues and additionally rapidly changing positions. The key idea here is to not treat good information (codons with an appropriate rate of change) on an equal footing with mediocre information (codons experiencing multiple hits).

This approach is equally applicable to mutational sites involving insertions and deletions (called indels when the event cannot be resolved) in genes that are not evolving chaotically. However, those indels involving the tandem repeats and oligo-glycines in the prion gene are the analogue of ser-asn toggle codons: they change too fast to have applicability to 100my time scales. Specific indels are rare to begin with; to occur twice in separate lineages or to revert has the effect of squaring an already minuscule probability. (Tandem repeats have special structural features that enhance these rates through replication slippage; retrotransposons raise other special issues.)

Fortuitously, the prion gene contains the perfect indel relative to the mammalian radiation in its signal region. The event, most parsimoniously explained as a 6 bp deletion (two codons so no frameshift), cleanly separates rodents, primates, lagomorphs from the ancestral lineage (marsupial sequence) and ferungulates (ruminants, cetaceans, carnivores, perissodactyls). In other words, the deletion establishes the existence of a common ancestor to the mouse-human-rabbit lineage not shared by the cow-mink-horse lineage.

How solid is the evidence for this interpretation?

First, note that deletions and insertions are extremely rare in the prion gene -- excluding tandem repeat slippage, the only other known example in 80 mammalian species is a glycine (GGG codon) inserted in an ancestor of Murinae rodents at the hyper-variable GPI junction: YYDGRRSsavlf.

Second, the 26-residue signal region has been quite stable to point mutation, almost comparable to mature protein. All genetic change accepted in the signal region in the last 100my in 80 species can be explained by a dozen or so point mutational events and the indel under consideration here; another few point mutations accommodate the marsupial divergence at 178my. There is little singlet change (where sequencing experimental error is concentrated in any case), a few conservative toggle codons, and a limited number of deeper synapomorphic changes all consistent with seldom-disputed aspects of the phylogenetic tree (assumed here throughout). For example, ancestral tyrosine at position 8 has gone to cysteine in old world monkeys-great apes and to 4-codon serine in ferungulates.

(Serine is unique in having 6 codons not in the same column of the standard genetic code; direct change requires two base changes and seldom occurs; threonine and cysteine, at the intersections of serine rows and columns, usually mediate change. The effect results in relatively frozen 4-codon and 2-codon serine, giving constraints on toggle opportunities; 2-codon serine can be safely inferred from ser-asn toggles, 4-codon serine from ser-ala-thr.)
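The serine argument can be checked directly against the standard genetic code. The sketch below builds the codon table from the usual TCAG ordering and asks which residues lie one base change from each serine family; the helper names are illustrative and not from the original.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def neighbors(codon):
    """All codons exactly one base change away from codon."""
    return [codon[:i] + b + codon[i + 1:]
            for i in range(3) for b in BASES if b != codon[i]]

def one_step_residues(codons):
    """Amino acids reachable from any of the codons by a single base change."""
    reached = {CODE[n] for c in codons for n in neighbors(c)}
    return reached - {"S", "*"}   # drop serine itself and stop codons

def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))

SER4 = [c for c, aa in CODE.items() if aa == "S" and c.startswith("TC")]
SER2 = [c for c, aa in CODE.items() if aa == "S" and c.startswith("AG")]

if __name__ == "__main__":
    print("4-codon serine:", SER4)
    print("2-codon serine:", SER2)
    print("min base changes between families:",
          min(hamming(a, b) for a in SER4 for b in SER2))
    print("bridging residues:",
          sorted(one_step_residues(SER4) & one_step_residues(SER2)))
```

Asparagine is one step from the 2-codon family only and alanine from the 4-codon family only, which is why a ser-asn toggle implies AGT/AGC serine and a ser-ala-thr pattern implies TCN serine.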

Next, note that the indel event can be simply explained by an insertion or deletion involving codons 3 and 4. Two similar scenarios also work, with the indel beginning at position 2 or 3 of codon 2. This region has seen very little change at silent codon positions.
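Why several gap placements can explain the same pair of sequences is easy to show with a toy case. In the sketch below the two DNA strings are invented for illustration and are not real prion signal sequence; they are built so that a 6 bp block can be removed at three adjacent offsets with an identical result, mirroring the three scenarios just described.

```python
def deletion_offsets(long_seq, short_seq):
    """Return every 0-based offset at which removing one contiguous block
    from long_seq reproduces short_seq exactly."""
    gap = len(long_seq) - len(short_seq)
    if gap <= 0:
        return []
    return [i for i in range(len(long_seq) - gap + 1)
            if long_seq[:i] + long_seq[i + gap:] == short_seq]

def describe(offset):
    """Translate a 0-based nucleotide offset into codon number and position."""
    return f"codon {offset // 3 + 1}, position {offset % 3 + 1}"

# Toy sequences invented for illustration; NOT real prion signal sequence.
LONG = "ATGGCGAACTCGCTGTGG"    # six codons
SHORT = "ATGGCGCTGTGG"         # four codons

if __name__ == "__main__":
    for off in deletion_offsets(LONG, SHORT):
        print("deletion may start at", describe(off))
```

The placements are equivalent because the bases just before the deleted block repeat at its end, so an alignment program has no sequence-based reason to prefer the in-frame choice; that preference has to come from hand-gapping.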

Alignment programs such as ClustalW or Blast often do not gap correctly in this region. This error then trickles down through research papers on the rate of change in the prion gene (or of nuclear genes in general) when alignments are not hand-gapped. The effect is not trivial when compounded with gross errors regarding the octarepeat region, because illusory changes can then quantitatively dominate the picture of prion gene evolution. Note that it is most unclear at both the DNA and protein level which residues are still homologous (or even what homologous means) because of the split between function and descendancy.

Chicken and marsupial, safe outgroups, have 26 residues in the signal region, as do all ferungulates. Despite the great span of time, they align quite well with conservative single base changes needed for concordance. This argues for the indel being a deletion within the rabbit-primate-rodent lineage, rather than separate insertions within birds, marsupial, and ferungulates. Further support could be obtained by sequencing mammalian orders not represented in the data, such as basal sloths: all are predicted to have 26 residues in the signal region.
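The parsimony argument can be spelled out by counting events on a tree. The sketch below is a small dynamic-programming count (in the spirit of Sankoff parsimony, with every length change costing one event), not the method used for the original analysis. Signal lengths of 26 and 24 residues follow the text; the branching order inside the rodent-primate-lagomorph group and the competing topology are assumptions made for illustration.

```python
import math

# Signal-region length in residues for each group, as stated in the text.
SIGNAL_LENGTH = {
    "chicken": 26, "marsupial": 26, "ferungulate": 26,
    "rodent": 24, "primate": 24, "lagomorph": 24,
}

# Rooted binary trees as nested pairs. The branching order inside the
# rodent-primate-lagomorph group is arbitrary here (an assumption).
TREE = ("chicken", ("marsupial", ("ferungulate", (("rodent", "primate"), "lagomorph"))))
# A competing topology that groups primates with ferungulates, for comparison.
ALT_TREE = ("chicken", ("marsupial", (("primate", "ferungulate"), ("rodent", "lagomorph"))))

def min_events(tree, lengths, state, alphabet=(24, 26)):
    """Fewest indel events below this node if the node itself has length state."""
    if isinstance(tree, str):                      # leaf: observed length is fixed
        return 0 if lengths[tree] == state else math.inf
    total = 0
    for child in tree:
        # Each child either keeps the parent's length (free) or changes it (one event).
        total += min(min_events(child, lengths, s, alphabet) + (s != state)
                     for s in alphabet)
    return total

if __name__ == "__main__":
    print("ancestral 26, deletion scenario :", min_events(TREE, SIGNAL_LENGTH, 26))
    print("ancestral 24, insertion scenario:", min_events(TREE, SIGNAL_LENGTH, 24))
    print("ancestral 26 on competing tree  :", min_events(ALT_TREE, SIGNAL_LENGTH, 26))
```

On the first tree a 26-residue ancestor needs a single deletion, a 24-residue ancestor needs three independent insertions, and the competing topology cannot be explained with fewer than two events.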

Another region where an indel synapomorphy seems to cleanly separate rodent-primate-lagomorph from ferungulates is in the terminal octapeptide repeat, which is always a nonapeptide in ferungulates but never in the other group. A tetra-glycine becomes a tri-glycine. The marsupial also has a nonapeptide but is not overwhelming in its similarity otherwise. The first and final repeats are not subject to erasure and over-writing like middle repeats by the nature of the slippage mechanism.

In conclusion, [((rodent-primate-lagomorph), ferungulate), marsupial] is the only tree with topology consistent with the signal and repeat region deletions. There are further synapomorphic codons but they simply confirm agreed-upon divisions such as rodent-primate or ferungulate-others. Ruminants cannot be separated from carnivores with protein sequences from this region. The rabbit node cannot be placed -- there is little value in long branch sequencing unless taken in 3's to suppress singlets, eg [(rabbit, hare), pika]. Two similarly chosen marsupials would be of even more value; [(opossum, kangaroo), monotreme] works, again because the topology is certain and the branches are not too short. Prion researchers have squandered immense resources sequencing too closely related taxa.

Here is a very curious recent paper on this same topic concerning a mitochondrial gene that also goes against the grain:

Conflict among individual mitochondrial proteins in resolving the phylogeny of eutherian orders.

"The phylogenetic relationship among primates, ferungulates (artiodactyls + cetaceans + perissodactyls + carnivores), and rodents was examined using proteins encoded by the H strand of mtDNA, with marsupials and monotremes as the outgroup. Trees estimated from individual proteins were compared in detail with the tree estimated from all 12 proteins (either concatenated or summing up log-likelihood scores for each gene). Although the overall evidence strongly suggests ((primates, ferungulates), rodents), the ND1 data clearly support another tree, ((primates, rodents), ferungulates).

To clarify whether this contradiction is due to (1) a stochastic (sampling) error; (2) minor model-based errors (e.g., ignoring site rate variability), or (3) convergent and parallel evolution (specifically between either primates and rodents or ferungulates and the outgroup), the ND1 genes from many additional species of primates, rodents, other eutherian orders, and the outgroup (marsupials + monotremes) were sequenced. The phylogenetic analyses were extensive and aimed to eliminate the following artifacts as possible causes of the aberrant result: base composition biases, unequal site substitution rates, or the cumulative effects of both.

Neither more sophisticated evolutionary analyses nor the addition of species changed the previous conclusion. That is, the statistical support for grouping rodents and primates to the exclusion of all other taxa fluctuates upward or downward in quite a tight range centered near 95% confidence. These results and a site-by-site examination of the sequences clearly suggest that convergent or parallel evolution has occurred in ND1 between primates and rodents and/or between ferungulates and the outgroup. While the primate/rodent grouping is strange, ND1 also throws some interesting light on the relationships of some eutherian orders, marsupials, and monotremes. In these parts of the tree, ND1 shows no apparent tendency for unexplained convergences."

What about convergent evolution in the signal region?

Now a great many proteins have a signal region; within a given species, these are all processed by the same endoplasmic reticulum machinery -- a limited set of signal endopeptidases must recognize the clip point in all these pre-proteins. The 'signature' of a signal peptide is not a specific linear sequence (like that of a restriction site) but rather a more generic pattern of central hydrophobicity followed by a serine or cysteine and charged residues.

This sounds like a prescription that would tolerate rapid rates of change yet it does not. It also sounds like a prescription for convergent evolution or for building exportable proteins by swapping in a universal signal domain (analogue of the Rossmann fold for nucleotides). So why, on a Blast search against an 850,000,000 bp data set, do prion queries only return other prion signal regions?

The answer may be that signal peptides are very ancient, dating back to the divergence with eubacteria -- there has been a great span of time in which to diverge. Convergent evolution does not act in this instance to drive the domain to a universal linear sequence, rather to a common generic property pattern within an immense sequence space (20 to the 26th power). There may not be many proteins with 'new' signal peptides; the source may be existing signal proteins through gene duplication and divergence.
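The size of that sequence space relative to the database is a one-line calculation. The sketch below uses the figures quoted above (26 residues, 20 amino acids, an 850,000,000 bp data set); the count of candidate windows is an assumed upper bound of one window starting at every base on both strands.

```python
SIGNAL_LENGTH = 26          # residues in the signal region
AMINO_ACIDS = 20
DATABASE_BP = 850_000_000   # size of the Blast data set quoted in the text

sequence_space = AMINO_ACIDS ** SIGNAL_LENGTH
# Assumed upper bound: one translated window per start position on each strand.
windows = 2 * DATABASE_BP

if __name__ == "__main__":
    print(f"possible 26-residue peptides : {sequence_space:.3e}")
    print(f"windows in the database      : {windows:.3e}")
    print(f"fraction of space sampled    : {windows / sequence_space:.1e}")
```

Even on this generous count the database samples a vanishing fraction of the space, so a chance match to an unrelated signal peptide is not expected.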

Note: why use INDEL for INsertion or DELetion? Because when aligning sequences, one often sees that a gap has to be introduced. That doesn't necessarily mean the shorter sequence had a deletion; the longer sequence might equally have had an insertion. In many situations it is not possible to resolve the issue. So rather than call it 'insertion or deletion' which is too cumbersome for constant use, or call it 'deletion' which is biased, people went for a neutral term, indel.

The indel in the signal region of the prion protein happened to be resolvable and was a deletion. Resolution is only probabilistic: 1 rare event is a whole lot better than 2 rare events, eg the ancestral signal could be 24 aa and the marsupial and ferungulate lineages could each have had the same 2 aa insert in the same spot while mouse-human stayed at ancestral length. Resolution is also predictive: 3-toed sloths, elephants, platypus prions etc. will have 26 aa. Guinea pig is likely 24 aa but could go either way, depending on exactly when it branched off relative to the deletion event in the common ancestor of rodent-primate-rabbit not shared by artiodactyl-carnivore.

Synapomorphy is one of many learned-sounding terms in newer taxonomic theory that do not convey any meaning per se; however, these terms end up being convenient. It refers to a character value [here an aligned amino acid, elsewhere a bump on a tooth] that occurs only and everywhere on a topological subtree. Example: DWEDRY in all rodents is DYEDRY in every other species. The tryptophan at this position is a good synapomorphic character for rodents. But the tyrosine is not, because it does not identify a monophyletic subtree. This is a 2bp change that presumably passed through cysteine (which may still be present in some pre-Murinid rodents). One could also speak of local synapomorphies, eg at codon 4, serine is diagnostic of hamsters within rodents but not within mammals.
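The claim that the tryptophan-tyrosine difference is a 2bp change routed through cysteine can be checked against the standard genetic code. The sketch below lists every codon that sits one base change from both the tryptophan codon and a tyrosine codon; the helper names are illustrative only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))

def codons_for(residue):
    """All codons encoding the given one-letter residue."""
    return [c for c, aa in CODE.items() if aa == residue]

def intermediates(start, end):
    """Residues encoded by codons one base change from both start and end."""
    return {CODE[mid] for mid in CODE
            if hamming(start, mid) == 1 and hamming(mid, end) == 1}

if __name__ == "__main__":
    trp = codons_for("W")[0]          # tryptophan has the single codon TGG
    for tyr in codons_for("Y"):
        print(trp, "<->", tyr,
              "changes:", hamming(trp, tyr),
              "intermediates:", sorted(intermediates(trp, tyr)))
```

Both tyrosine codons are two changes from TGG, and the only intermediates are cysteine or a stop codon; since a stop would truncate the protein, cysteine is the only viable stepping stone.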

Knockouts vs knockouts?

26 Oct 98 webmaster opinion

Recent papers about knockout mice are split between several finding no ill effects and a few finding minor abnormalities. In either case, there is no support for essentiality of the prion gene, no disease phenotype much less lethality associated with loss-of-function, and thus no explanation for the conservative evolution of the gene. Note transgenic mice expressing PrP with specific amino-proximal deletions develop a neurologic syndrome with ataxia and cerebellar lesions.

How good are the controls for knockout mice? Terrible. No one has ever determined the prion sequence of a real Mus musculus domesticus. A highly inbred lab mouse bears less of a relationship to a wild house mouse than a toy poodle does to a Canadian wolf.

Could 'wildtype' mice already be knockouts of normal prion function? Suppose the prion gene were essential but inbreeding fixed a bad allele, with a compensatory change in another gene strongly selected by the inbreeding process: all the experiments then compare a point mutation knockout to a deletion knockout.

No one in their right mind would ever use linc [long incubation] mice as controls -- this allele is doubly defective relative to sinc [short incubation]: L108F - T189V. (Linc is thus 3 base changes from sinc: C428T and ACc671-673GTc; the 'missing link' would have ile or ala as transition. Sinc can be shown to be closer to wildtype -- see below.) While linc doesn't cause TSE during mouse lifetimes, TSE is not a disease of normal function. The loci in humans that cause familial CJD are speculated to thermodynamically destabilize native protein, yet these are mainly mild conservative changes compared to linc. (It should not be thought that long incubation times (to scrapie passage) mean this protein is 'better' than wildtype; on the contrary, it probably means that it is less like a real prion in structure, hence harder to recruit under the like-like principle of the species barrier.) Linc mice are analogous to the many bizarre alleles in sheep prion -- artefacts of animal husbandry.

Is linc a knockout (or severe setback) of some aspect of normal function? Yes, the severity of the mutations implies this. The argument is threefold and quantitative. First, a residue's functional importance is inversely related to its characteristic rate of evolutionary change; codons 108 and 189 (and surrounding domains) are experiencing exceedingly slow rates of change. (The baseline is set in pseudogenes, introns, or intra-gene loop regions of similar base composition not experiencing selection.) Second, statistical measures of the substitutability of one residue for another reflect general design criteria of proteins (roughly PAM or Blosum matrices): if there is to be an accepted change at some codon, then it is far more likely to be certain residues than others. Third, multiple mutations are generally worse than additive in effect.

Assuming for the moment that sinc has full wildtype function, to replace both a leucine with a phenylalanine and a threonine with valine at codon positions with 100 million year scales of invariance is a bit like winning the lottery the same day you shoot a hole-in-one at golf blindfolded. This is the measure of neutrality of linc relative to sinc for retaining the full gamut of normal prion function. [A third strain of mouse was in wide use in the 1980's but has not shown up in sequences since, M133V relative to sinc; similar arguments apply to it. I call it kinc in view of probable induced structural changes.]

Naturally very few papers in the prion literature actually state up front which of the 3x3=9 genotypes of mice were used. A person familiar with the myriad strain names and their histories might be able to work this out in some cases. The key issue may be whether linc mice were derived from sinc or vice versa. The latter scenario means that even though sinc might be closer to wildtype, if it got there through a compensated linc mouse, its knockouts are no better than linc knockouts.

Alternatively, it could be argued that loss of prion function in 'wildtype' is simply not detected under conditions of cage life (or swimming or maze tests). That is, a mouse could be deaf, dumb, lack night vision and olfaction, and roll over for predators -- what does it matter when you are never more than 6 inches from your food bowl? A gene may be essential in the wild but not in the animal room. (E. coli can dispense with hundreds of genes when grown in rich broth.)

Are sinc mice really wildtype?

Fortunately, the wild type sequence of Mus musculus can be reliably predicted while we wait for someone to sequence the prion gene of real wildtype mice. The basic approach is to (1) align mouse prion with rat, gerbil, 3 hamsters, and 2 cotton rats within the Murinae, (2) clamp to the reliably dated, known topology, (3) use, in effect, an ancestor of primates reconstructed from the dozens of sequences as outgroup, (4) apply marsupial, bird, and ferungulate prion sequences as more remote outgroups and quantitate rates of change and acceptable substitutions at relevant positions in sinc mouse.

Forgetting now the differences between sinc, linc, and kinc mice, let us concentrate on dubious variations mice have relative to what is known about this protein from its evolution. One sees immediately mice have too many changes:

Changes in the signal and GPI region are major:

Are the changes in the 'wildtype' mouse signal region and the chaos near the site of GPI attachment enough to throw off these processes, resulting in a mix of GPI-attached, transmembrane terminal peptide, and extracellular-released? This could result in effects on normal function via distribution even though these domains do not appear per se in mature protein.

This is difficult to assess because sophisticated identification algorithms [eg Psort] don't like any rodent signal peptides. However, using a virtual chimera of human signal shows that mouse GPI attachment is still expected. The hypervariable region surrounding the GPI join has no good explanation; the 3' terminus itself shows extraordinary conservation. Asp preceding the join, DGRRS-ss appears to be a very old insertion in rodents. On top of this, mouse has a slippage insert with terminal point mutation immediately after the GPI. One might suppose that the GPI splice signature would be critical and conserved but instead it is one of the most variable regions in the whole protein.

Changes in the repeat region could disturb mouse prion function:

Mouse prion has three very unusual changes in this region. The first shortens the first repeat to 8 residues [seen in new world monkeys as well], the second substitutes a serine for glycine adjacent to the tryptophan of the third repeat, which is then iterated in the fourth repeat. These two changes are probably related to a single slippage even though the serine codons are different at third position. Note rats -- also highly inbred -- show two idiosyncratic serines in the repeat region at different sites.

While serines are tolerated sporadically at the first and final repeats, it is precisely these serine substitutions at the second and third repeat that are unprecedented and quite possibly enough to knock out or disrupt the structure/function of the repeat region. Mouse prion is a very poor place to study copper and zinc binding to this region because of these unique serine substitutions; most studies fortunately have been done with PHGGGWQG repeats (general mammal).

Conclusion

When the functions of the prion protein finally become measurable, it will be interesting to see if the mouse prion is fully working. Knockout mice are simply not persuasive at this point (and golden hamster would not be much better) because there is no evidence that normal function is retained by lab mice controls. It is high time that real wildtype mouse was sequenced.

Gerstmann-Straussler-Scheinker disease (GSS), a cerebello-pyramidal syndrome associated with dementia and caused by mutations in the prion protein gene (PRNP), is phenotypically heterogeneous. The molecular mechanisms responsible for such heterogeneity are unknown. Since we hypothesize that prion protein (PrP) heterogeneity may be associated with clinico-pathologic heterogeneity, the aim of this study was to analyze PrP in several GSS variants. Among the pathologic phenotypes of GSS, we recognize those without and with marked spongiform degeneration. In the latter (i.e. a subset of GSS P102L patients) we observed 3 major proteinase-K resistant PrP (PrPres) isoforms of ca. 21-30 kDa, similar to those seen in Creutzfeldt-Jakob disease. In contrast, the 21-30 kDa isoforms were not prominent in GSS variants without spongiform changes, including GSS A117V, GSS D202N, GSS Q212P, GSS Q217R, and 2 cases of GSS P102L.

This suggests that spongiform changes in GSS are related to the presence of high levels of these distinct 21-30 kDa isoforms. Variable amounts of smaller, distinct PrPres isoforms of ca. 7-15 kDa were seen in all GSS variants. This suggests that GSS is characterized by the presence of PrP isoforms that can be partially cleaved to low molecular weight PrPres peptides.

Comment (webmaster):
Two of these mutations are apparently new and not on Medline. One presumes they turned up during screening of GSS patients. People should stop calling GSS a disease or just pick one genotype for it. I favor getting rid of both FFI and GSS and just sticking with 'CJD D202N M129M' or whatever. GSS is a subset of CJD with no deep underlying definition or common ground -- no wonder there is paper after paper wrestling with 'phenotypic variability.'

Both D202N and Q212P are found in alpha helix 3 in the mouse and hamster NMR structures. D202 is an invariant residue in mammals (but glutamate in birds) just past the 2nd glycosylation site, hydrogen bonded to Y149, Y157, T199, and T199 amide. Q212 is also strongly invariant (but deleted in birds) just prior to the second cysteine of the disulphide and hydrogen bonded to T216.

Better 3D blow-ups will be posted shortly. There is also a whole long story about how hydrogen bond acceptors cannot be replaced by donors or non-acceptors, or donors by acceptors or non-donors, etc. etc. even though under other circumstances these mutations might be conservative. There are applications to sheep allele hazards and to whether any of the lab mice strains have normal functioning prion protein. E200K R208H V210I Q217R M232R are the other known mutations in this vicinity.

Nematode genome 85% complete: no prion yet

10/16/1998
WUSTL and the Sanger Centre have finished sequencing 85,341,695 bases of the 100 Mb Caenorhabditis elegans genome (make that 86,572,592 as of 25 Nov 98).

Comment (webmaster):

The prion gene has been tracked back 410 million years to the fish-mammal divergence using antibody 3F4 to the core invariant epitope. Yet there is no sign of the gene earlier in yeast, fruit fly, or nematode.

Here is the hnRNP gene product, still the best Blast hit to prion protein, found long ago by hybridization. Note that its terminal repeat does bear an uncanny resemblance, in composition and residue order, to the prion protein octarepeat (and also to the yeast sup35 prionlike protein repeat).

...The evolutionary conservation of the PrP gene has been reported in the genomes of many vertebrates as well as certain invertebrates. In the genome of nematode Caenorhabditis elegans, the sequence capable of hybridizing with the mammalian PrP cDNA probe has been demonstrated, predicting the presence of the PrP gene homologue in C.elegans. In this study, Southern analysis with the hamster PrP cDNA (HaPrP) probe confirmed the previous observation. Moreover, Northern analysis revealed that the sequence is actively transcribed in adult worms.

Thus, we screened C.elegans cDNA libraries with the HaPrP probe and isolated a cDNA that hybridizes to the same sequence in C.elegans that hybridized with the HaPrP probe in the Southern and Northern analyses. The deduced amino acid sequence of this cDNA, however, is substantially homologous with heterogeneous nuclear ribonucleoprotein (hnRNP) core proteins rather than mammalian PrPc. The hnRNPs contain the glycine-rich domain in the C-terminal half of the molecule, which also seemed to be in PrPc at the N-terminal half of the molecule. Both of the glycine-rich domains are composed of tracts with high G + C content, indicating that these tracts may [cause] the hybridizing signals. These results suggest that this cDNA clone is derived from a novel hnRNP gene homologue in C.elegans but not from a predicted PrP gene homologue.

Triplet repeat diseases: innocent inclusions?

Katrina L. Kelner opinion piece
Science 22 Oct 98

" In a curious set of neurodegenerative diseases, a long string of the nucleotide triplet CAG
lodges within genes, causing the death of subsets of neurons and ultimately disease. Exactly
how these strings of repeats cause cell death is not known, but they do not simply disrupt the function of their target gene. Rather, the long CAG
string has a deadly--but undefined--effect of its own.

One popular idea is that the CAG repeats cause the protein to form a toxic aggregate in the nucleus of cells. These so-called nuclear inclusions are common in the brains of patients with these disorders. But in two recent papers in Cell, this explanation is called into question. One group shows, in a cultured cell model system for Huntington's disease (F. Saudou et al., Cell 95, 55 1998), that cells may die even without the presence of nuclear inclusions. In the most dramatic experiment, expression of a fragment of the mutant huntingtin protein containing a 68-repeat insertion, together with an inhibitory form of the ubiquitin-conjugating enzyme, resulted in far fewer intranuclear inclusions. The mutant huntingtin actually triggered more cell death in this situation than it would have in the presence of inclusions, leading the authors to the bold suggestion that the inclusions may actually be protective.

A second group made transgenic mice that mimicked the disorder spinocerebellar ataxia type 1 (A. Klement et al., ibid., p. 41), in which the repeat-containing protein ataxin-1 lacked a self-aggregating region. These mice had no nuclear inclusions, but still showed the characteristic degeneration of cerebellar Purkinje cells. The field may now have to look elsewhere for the mechanism by which these repeats do their damage to the cell."

Comment (webmaster): While these two Cell papers should be taken seriously (not forgetting the large literature on these diseases pointing in the other direction), we have seen similar errors of interpretation many times in CJD. It is impossible to show absence of aggregates, only non-detectability up to the sensitivity of whatever methods are used. Many other effects also come into play in transgenic mutants when using proteins of unknown function and neuropathological phenotyping.

Speaking of ontogeny recapitulating phylogeny, here is the phylogenetic version of repeat disease anticipation:

Evolution of the primate androgen receptor: A structural basis for disease.

Choong CS, Kemppainen JA, Wilson EM
J Mol Evol 1998 Sep;47(3):334-42

Comparison of androgen receptor from five primate species, human, chimpanzee, baboon, macaque and collared brown lemur, supports their phylogeny with complete conservation of the DNA and steroid binding domain protein sequence. A linear increase in trinucleotide repeat expansion of homologous CAG and GGC sequences occurs in the NH2-terminal transcriptional activation region and is proportional to the time of species divergence.

A serine phosphate/glutamine repeat interaction is observed where increasing CAG repeat length is associated with an increased rate of serine 94 phosphorylation. Disparity in the calculated and apparent molecular weight with CAG repeat expansion of an AR NH2-terminal fragment suggests self-aggregation with increasing glutamine repeat length into the pathological range. These results suggest that a CAG/glutamine repeat expanded during divergence of the higher primate species, which may have a direct effect on AR structure and support a common pathway in CAG trigenic diseases in the pathophysiology of neurodegeneration observed in X-linked spinal bulbar and muscular atrophy.
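Comparing repeat length across species starts with measuring the longest uninterrupted run of the triplet in each sequence. A minimal sketch follows; the function name and the example strings are invented for illustration and are not androgen receptor sequence.

```python
import re

def longest_repeat(dna, unit="CAG"):
    """Length, in repeat units, of the longest uninterrupted run of unit in dna.
    The search is frame-agnostic: a run may start at any base."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna.upper())
    return max((len(run) // len(unit) for run in runs), default=0)

if __name__ == "__main__":
    # Toy sequences invented for illustration only.
    toy = {
        "short allele": "ATGCAGCAGCAGTTTCAGCAG",
        "long allele": "ATG" + "CAG" * 7 + "TTT",
    }
    for name, seq in toy.items():
        print(name, "longest CAG run:", longest_repeat(seq))
```

Interrupted runs are counted separately, so a tract broken by a single variant codon scores as two shorter repeats rather than one long one.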