A computational biologist's personal views on new technologies & publications on genomics & proteomics and their impact on drug discovery

Friday, November 15, 2013

Did The Biochemists of Yore Know Morse Code?

So, this piece is going to be mostly asking questions. In one of the corners of my dream world I have a scientific historian on retainer, but in the real world my substitute is to throw some questions out and hope some knowledgeable people leave comments. If I spark someone’s term paper or thesis topic, I ask only that I get an electronic draft!

First, a bit of confessional back-story. A while ago, someone from a publisher
contacted me by email. Would I be interested in writing something about Craig Venter’s
new book, Life at the Speed of Light, around its release? I might, I replied, but only if I could see
the book. Very promptly I had my family
asking what the express package was that had shown up, which indeed had a
copy. That was a number of weeks ago,
and I still haven't finished the book (though it is most definitely released;
another deadline blown to add to my life tally). It’s neither a bad book nor a super long one;
I've just had some self-starting issues.
When the book arrived, I was deeply immersed in Anathem, and even on
Kindle that Neal Stephenson tale is a bit of a doorstop (but a very good one,
with its own fun little riff on synthetic biology). Then some crazy crush at work, and at home I’m
usually mentally exhausted by homework help (which is another whole topic worth
covering, if I ever get the courage). So,
the end result is that this is not a review of Life at the Speed of Light, but one
should eventually happen, if I can figure out the angle from which I want to attack it. Or perhaps I'll just run a bunch of pieces triggered by bits of the book.

I do recommend that it be widely read, with
the caveat that one should know in advance it is a highly Venter-centric view
of synthetic biology. Since he and the
set of talented people he has nucleated around him have been major contributors
to the field, that’s a fair way to tackle things, but it can’t be called a
broad overview of the field (at least based on the chapters I have read).

The early chapters are a nice readable summary of how DNA
came to be recognized as the genetic material.
There’s a whole rumination possible there on how well Venter treated the
subject and how I might do it differently (I love the Hershey-Chase experiment,
which is barely covered). There’s also room
for some gripes tempered with the realization that writing a whole book is a
daunting task. Still, the notes format
is infuriating and nearly useless: each pair of pages in the book carries the chapter name
but not its number, while the notes are organized into
sections by chapter number without names.
On an e-reader this wouldn't bother me, but in hardcover form it is a
nuisance.

But, back to my main topic (which, admittedly, could be seen
as one more procrastination strategy with regard to writing an actual
review). Venter brings out the idea,
which I learned back as a freshman, that biochemists were strongly betting on
protein to be the carrier of heredity, with DNA a dark horse or even considered
a foolish idea. Only after a series of
experiments, culminating with Hershey-Chase, was DNA recognized as clearly the master
informational molecule.

Now, there are a bunch of questions to think about here, and
I’ve wondered about them since I was an undergraduate. How strong was the case for proteins, and who
were the strongest proponents? After
some of the key experiments, such as Avery's, who were the
holdouts? Were they eventually convinced, or did
some remain unconvinced by the mounting evidence for DNA well into the 1950s and 1960s (and perhaps
come to be viewed as cranks)?

Nowadays, the idea that DNA is too simple seems
ludicrous. But I've been steeped in
digital thinking my whole life; I learned about DNA around the same time
(around age 8) I learned how to convert decimal to binary and skimmed Gödel, Escher, Bach before I hit junior high.
So it seems strange to argue that
a language with only four letters couldn't be greatly expressive. But, the 1920s through 1950s were a very
different era. Shannon, Turing and
others roughing out computer science would have been publishing around then, and probably not in circles that
would be widely followed by biochemists.
Now, as I suggest in the title, the clear model for a simple language
would have been Morse Code, which requires 3 symbols (the pauses between code
groups are really a third character after dot and dash). The family had a little telegraph set when I
was young (though I never got around to learning Morse) and Dad (who grew up in the 20’s
and 30’s) built a crystal radio, so it seems likely that at least some
biochemists of that time might have been exposed to such a code scheme. This would have also been around the time
that most of the periodic table had been filled out and the last key chunk of
the underlying basis for the atoms (neutrons) was discovered in the 1930's as
well. Did chemists think about the
variety that can be created with a 2 letter alphabet of protons and electrons (with neutrons
being almost like accent marks) and compare that to the variety of proteins
that must be codable?
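To make the "too simple" objection concrete, here is a back-of-the-envelope sketch in Python. It is only counting arithmetic, not a claim about how anyone in that era actually reasoned: it asks how long a "word" must be in a small alphabet to label twenty distinct amino acids, and compares the number of possible strings. The function name `min_word_length` and the example lengths (300 DNA letters, 100 residues) are my own illustrative choices.

```python
def min_word_length(alphabet_size: int, items_to_encode: int) -> int:
    """Smallest n such that alphabet_size ** n >= items_to_encode."""
    if alphabet_size < 2:
        raise ValueError("need at least two symbols to encode anything")
    n = 0
    capacity = 1  # number of distinct words of length n
    while capacity < items_to_encode:
        capacity *= alphabet_size
        n += 1
    return n


AMINO_ACIDS = 20  # the canonical set discussed in the post

alphabets = [
    ("two-symbol code", 2),
    ("Morse-like code (dot, dash, pause)", 3),
    ("DNA (four bases)", 4),
]

for name, size in alphabets:
    n = min_word_length(size, AMINO_ACIDS)
    print(f"{name}: words of length {n} give {size ** n} combinations")

# Illustrative comparison: strings of 300 DNA letters versus
# strings of 100 amino acids (lengths chosen arbitrarily for this sketch).
dna_strings = 4 ** 300
protein_strings = 20 ** 100
print("300-letter DNA strings outnumber 100-residue proteins:",
      dna_strings > protein_strings)
```

The point of the sketch is simply that a small alphabet loses nothing in expressiveness as long as the words are allowed to be a little longer, which is the intuition Morse Code might have offered.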

Talking about this with a colleague today, we came up with
some additional questions. We think now
in terms of there being twenty canonical amino acids and a bunch of rare ones,
but that view of reality has been strongly shaped by knowledge of the genetic
code. Natural amino acids are much more
varied, though many of the others are rare.
What was the history of working out the amino acids in proteins, and
when was the set of twenty canonized? Were
there amino acids kicked out when they didn’t fit the code, such as
hydroxyproline (abundant in collagen, which surely was a source material for
early protein biochemistry studies)? Did
biochemists attempt to create structures like the periodic table to hold the
amino acids? Was there any thought that
perhaps there remained important amino acids yet to be discovered, much like
technetium filled a long-standing hole in Mendeleev’s creation?

Ideally, I’d have a better idea of these answers, but the
fact is that textbooks present the streamlined (“Whig History”) view of things;
few if any of the dead ends and side tracks are left in the narrative. Perhaps there are leads in The Eighth Day of Creation or even in Kuhn’s The Structure of Scientific Revolutions, but I don’t
remember the latter even touching molecular biology (I haven’t read either in
about two decades, which isn't really a good thing). If you have any leads on any of this, please
feel free to put them in the comments.
Perhaps some enterprising student or scholar has already written a treatise
covering this ground; if so I’d love to know of it.

4 comments:

If I recall from Crick's "What Mad Pursuit", the geneticist Lionel Penrose (father of physicist Roger) didn't buy into DNA even after the double helix but had his own self-assembling protein model which he modeled from wood(!)

Thanks for the critical review of Venter's historical book. If you take the time to read my brief review of the NGS future ("Rothberg sequencing"), I will be grateful for any criticisms: http://biomics.ru/nomera/2013/57-rothberg-sequencing-potentials-for-semiconductor-sequencing.html

This will be a complete sidetrack, but if you are going to mention Morse Code you should expect that. Morse Code was much more than just three symbols. The gaps within a character, between letters, between words, between sentences, and even between paragraphs were all slightly different. Then there were the oddball letters. Zero is one long dash, and the letter C has a funny gap after the first two dots.

Did biochemists know Morse Code? If they were Boy Scouts before 1965 or so they sure did. (I missed out.)

And sitting for my captain's license this spring, I know that even a mariner does not need to know three dots, three dashes, three dots for SOS. Not on the test. But a few Morse codes are on the test. The letter A (dot dash) is a big one, for the fairway buoy at the opening to sea that one can pass on either side (the light blinks the code). There are a few more Morse codes on buoys, but fortunately for me I don’t have to learn the whole alphabet. (I think only Morse code alpha is going to be on the test.)

About Me

Dr. Robison spent 10 years at Millennium Pharmaceuticals working with various genomics & proteomics technologies & working on multiple teams attempting to apply these throughout the drug discovery process. He spent 2 years at Codon Devices working on a variety of protein & metabolic engineering projects as well as monitoring a high-throughput gene synthesis facility. After a brief bit of consulting, he rejoined the cancer drug discovery field at Infinity Pharmaceuticals in May 2009. In September 2011 he joined Warp Drive Bio, a startup applying genomics to natural product drug discovery. Other recurring characters in this blog are his loyal Shih Tzu Amanda and his teenaged son alias TNG (The Next Generation).
Dr. Robison can be reached via his Gmail account, keith.e.robison@gmail.com
You can also follow him on Twitter as @OmicsOmicsBlog.