[Preface – the subject of protein evolution pops up on a regular basis in ID circles. Recently, William Dembski mentioned the study alluded to in the title of this essay as an improved argument/piece of evidence for intelligent design. Specifically, Dembski said:

“(2) The challenge for determining whether a biological structure exhibits CSI is to find one that’s simple enough on which the probability calculation can be convincingly performed but complex enough so that it does indeed exhibit CSI. The example in NFL ch. 5 doesn’t fit the bill. The example from Doug Axe in ch. 7 of THE DESIGN OF LIFE (www.thedesignoflife.net) is much stronger.”

“The example from Doug Axe in ch. 7 of THE DESIGN OF LIFE” would appear to be Axe’s 2004 paper in the Journal of Molecular Biology, the subject of my first ever essay on The Panda’s Thumb. Since I have been a bit remiss in re-posting older essays here, I thought I would use this excuse to put this here. It’s “published” without change, so as to maintain some sort of continuity. As always, enjoy.]

Douglas Axe recently (well, sort of) published an article in the Journal of Molecular Biology entitled “Estimating the Prevalence of Protein Sequences Adopting Functional Enzyme Folds” (Axe, J Mol Biol 341, 1295-1315, 2004). In his discussion of the experimental observations, Dr. Axe mentions some numbers that are likely to generate much discussion amongst Intelligent Design advocates and critics. For example, Stephen Meyer (2004) cites Axe at a key point in the argument in his recent article advocating Intelligent Design, “The Origin of Biological Information and the Higher Taxonomic Categories,” much discussed in previous Panda’s Thumb threads (here).

“Axe (2004) has performed site directed mutagenesis experiments on a 150-residue protein-folding domain within a β-lactamase enzyme. His experimental method improves upon earlier mutagenesis techniques and corrects for several sources of possible estimation error inherent in them. On the basis of these experiments, Axe has estimated the ratio of (a) proteins of typical size (150 residues) that perform a specified function via any folded structure to (b) the whole set of possible amino acids sequences of that size. Based on his experiments, Axe has estimated his ratio to be 1 to 10^77. Thus, the probability of finding a functional protein among the possible amino acid sequences corresponding to a 150-residue protein is similarly 1 in 10^77.”
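
For readers who want to see the scale of the numbers in the passage just quoted, here is a minimal sketch of the arithmetic. It assumes the standard 20-letter amino acid alphabet (the quote does not state this) and takes the quoted "1 in 10^77" at face value. The function names are mine, chosen for illustration only; this is not a calculation taken from Axe's or Meyer's papers.

```python
import math

ALPHABET_SIZE = 20     # assumption: the 20 standard amino acids
LENGTH = 150           # residues, as stated in the quoted passage
RARITY_EXPONENT = 77   # "1 in 10^77", as stated in the quoted passage


def log10_sequence_space(alphabet_size: int, length: int) -> float:
    """Return log10 of the number of possible sequences of a given length."""
    return length * math.log10(alphabet_size)


def log10_implied_functional(alphabet_size: int, length: int,
                             rarity_exponent: float) -> float:
    """Return log10 of the sequence count implied by a 1-in-10^k ratio."""
    return log10_sequence_space(alphabet_size, length) - rarity_exponent


space = log10_sequence_space(ALPHABET_SIZE, LENGTH)
functional = log10_implied_functional(ALPHABET_SIZE, LENGTH, RARITY_EXPONENT)
print(f"possible 150-residue sequences: about 10^{space:.1f}")
print(f"sequences implied by the quoted ratio: about 10^{functional:.1f}")
```

Working in logarithms keeps the numbers readable. Under these assumptions the sequence space is roughly 10^195, so the quoted ratio would correspond to something on the order of 10^118 sequences.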

More recently, Dembski cited Axe in his Expert Witness Report for the Dover trial (see this).

“Recent research by Douglas Axe (see Appendix 3) provides such evidence in the form of a rigorous experimental assessment of the rarity of function-bearing protein sequences. By addressing this problem at the level of single protein molecules, this work provides an empirical basis for deeming functional proteins and systems of functional proteins to be unequivocally beyond Darwinian explanation.”

Given that this subject is often raised by ID proponents (such as this), and that the Biologic Institute (where Axe works) has appeared in some news accounts, it seems appropriate to review Axe’s work. The purpose of this PT blog entry is to try to lay out the study cited above (Axe DD, J Mol Biol 341, 1295-1315, 2004) in a form that is accessible to most interested parties, and to discuss a larger context into which this work might be placed. Needless to say, the grand pronouncements being made by the ID camp are not warranted.

RNA-based regulation is all the rage in biology today. The more familiar mechanisms involve small RNAs such as microRNAs and silencing-associated RNAs. The biogenesis and functioning of these RNAs involve enzymes and complexes that have been termed, among other things, Dicers and Slicers. These subcellular kitchen utensils work by processing either the small RNA precursor or the base-paired target RNA. This mode of regulation is most often associated with eukaryotes, and indeed homologous enzymes and mechanisms are not found in prokaryotes. However, systems with remarkable functional similarity may occur in bacteria. A recent review by Sorek et al. brings one such example into focus.

One curious feature of bacterial genomes is the occurrence of arrays of direct repeats in which the repeated units are separated by so-called spacers of unique sequence unrelated to the repeat units. The sizes of the repeat units vary from bacterium to bacterium, ranging from 24 to 47 bp. Likewise, the spacer sizes vary from 26 to 72 bp. These arrays are flanked by an apparent leader sequence, and yet again by arrays of protein-coding (CAS) genes, the number and composition of which vary considerably from bacterium to bacterium. The general arrangement is shown in the following figure, which is part a of Figure 1 from Sorek et al. (shown beneath the fold).