TalkOrigins’ Misrepresentations of Werner Gitt and Information Theory

Werner Gitt’s book In The Beginning Was Information [Free PDF] is superb and provided the initial inspiration for all I have done with Information Theory. I owe Werner Gitt a debt of gratitude for producing one of the most clear and profound science books I’ve ever read.

The thesis of Gitt’s book is that Claude Shannon’s information theory directly implies design in biology because the existence of language is always preceded by intelligence.

Gitt connects this to John 1:1:

“In the beginning was the WORD and the WORD was with God and the WORD was God. Through Him all things were made….” Language and information are the basis of all creative acts because Jesus Christ is the basis of all language; Jesus Christ IS the language of God.

I’ve said in other places that I’ve found many criticisms of Gitt and all of them are wrong. I was specifically asked about these pages:

A striking contradiction is readily apparent in Gitt’s thinking- he holds that his view of information is an extension of Shannon, even while he rejects the underpinnings of Shannon’s work. Contrast Gitt’s words

(4) No information can exist in purely statistical processes.

and

Theorem 3: Since Shannon’s definition of information relates exclusively to the statistical relationship of chains of symbols and completely ignores their semantic aspect, this concept of information is wholly unsuitable for the evaluation of chains of symbols conveying a meaning.

with Shannon’s statement in his key 1948 paper, “A Mathematical Theory of Communication”

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.

It becomes very difficult to see how he has provided an extension to Shannon, who purposely modeled information sources as producing random sequences of symbols (see the article Classical Information Theory for further information). It would be more proper to state that Gitt offers at best a restriction of Shannon, and at worst, an outright contradiction.

TalkOrigins is misinterpreting both Gitt and Shannon on this point. Shannon’s mathematical analysis can only be applied to the statistical aspects of language because the meaning of any statement cannot be reduced to a number. But Shannon is very clear that the semantical aspect of communication objectively exists.
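The limit described above can be made concrete with a minimal sketch. It assumes the simplest possible estimate, the per-symbol entropy computed from single-character frequencies; Shannon's paper also treats higher-order statistics, which this sketch ignores. The function name `shannon_entropy` and the sample sentence are illustrative choices, not anything taken from Shannon or Gitt.

```python
import math
import random
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Return the empirical entropy of `message` in bits per symbol."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


sentence = "the brown dog jumps over the red fox"

# Shuffle the same characters into a meaningless order (seeded for repeatability).
chars = list(sentence)
random.Random(0).shuffle(chars)
scrambled = "".join(chars)

# Both strings have identical symbol counts, so this measure gives them
# the same value even though only one of them carries a meaning.
print(shannon_entropy(sentence))
print(shannon_entropy(scrambled))
```

Under these assumptions the two printed values agree, because the calculation sees only how often each symbol occurs and nothing about what the sentence says.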

When Shannon says, “These semantic aspects of communication are irrelevant to the engineering problem” what he means is that a communication system doesn’t care about the meaning of its message, it only cares about its contents. In the introduction to the book “A Mathematical Theory of Communication” published by the University of Illinois Press, Shannon’s co-author Warren Weaver talks of the importance of the meaning of a message.

He says:

Relative to the broad subject of communication, there seem to be problems at three levels. Thus it seems reasonable to ask, serially:

Level A: How accurately can the symbols of communication be transmitted? (The technical problem)

Level B: How precisely do the transmitted symbols convey the desired meaning? (The semantic problem)

Level C: How effectively does the received meaning affect conduct in the desired way? (The effectiveness problem)

The semantic problems are concerned with the identity, or satisfactorily close approximation, in the interpretation of meaning by the receiver, as compared with the intended meaning of the sender. This is a very deep and involved situation, even when one deals only with the relatively simpler problems of communicating through speech.

He says, “it is clear that communication either affects conduct or is without any discernible and probable effect at all.”

Then he goes on to say:

So stated, one would be inclined to think that Level A is a relatively superficial one, involving only the engineering details of good design of a communication system; while B and C seem to contain most if not all of the philosophical content of the general problem of communication.

The mathematical theory of the engineering aspects of communication, as developed chiefly by Claude Shannon at the Bell Telephone Laboratories, admittedly applies in the first instance only to problem A, namely, the technical problems of accuracy of transference of various types of signals from sender to receiver. But the theory has, I think, a deep significance which proves that the preceding paragraph is seriously inaccurate….

Thus the theory of level A is, at least to a significant degree, also a theory of levels B and C.

So in other words TalkOrigins is saying that information theory is concerned with the transmission of randomly generated sequences of symbols and only cares about the statistical aspects of the transactions. This is 100% wrong, because Weaver says on page 7:

The information source selects a desired message out of a set of possible messages (this is a particularly important remark, which requires considerable explanation later). The selected message may consist of written or spoken words, or of pictures, music, etc.

What TalkOrigins is doing is denying the importance of meaning in a signal, then conflating this with the fact that Shannon’s formulas cannot discern meaning but only the accuracy of the signal. This is an egregious misrepresentation of Shannon. It’s inexcusable.

Let’s go on – it gets better:

In SC2 Gitt notes that Chaitin showed randomness cannot be proven (see Chaitin’s article “Randomness and Mathematical Proof”), and that the cause of a string of symbols must therefore be known to determine information is present; yet in SC1 he relies on discerning the “ulterior intention at the semantic, pragmatic and apobetic levels.” In other words, Gitt allows himself to make guesses about the intelligence and purpose behind a source of a series of symbols, even though he doesn’t know whether the source of the symbols is random. Gitt is trying to have it both ways here. He wants to assert that the genome fits his strictly non-random definition of information, even after acknowledging that randomness cannot be proven.

You can’t prove randomness. But you can prove non-randomness! Shannon and Gitt show that you can identify non-random statistical aspects of a signal (“ergodic” means statistically consistent aspects of language).

You cannot study language without guessing about the intelligence and purpose behind a source of a series of symbols. That’s the very definition of linguistics and cryptography. In both fields you often don’t know the source of a message; you’re trying to guess it accurately.

The fact that randomness cannot be proven does not contradict Gitt at all – it contradicts Neo-Darwinism! You can’t prove randomness but you can prove non-randomness. It’s possible to prove that all languages, including the Genetic Code, are non-random.

Neo-Darwinism says that random copying errors create evolution and random processes created the genetic code in the first place. Both of these statements are by definition impossible to prove. Neo-Darwinism is by definition scientifically unprovable. And every mathematician who studies randomness knows this. (This is why I advocate a theory of evolution by systematic re-arrangement of genes. That is scientific. Random mutation theory is not.)

Next from TalkOrigins:

Gitt describes his principles as “empirical”, yet the data is not provided to back this up. Similarly, he proposes fourteen “theorems”, yet fails to demonstrate them. Shannon, in contrast, offers the math to back up his theorems. It is difficult to see how Gitt’s “empirical principles” and “theorems” are anything but arbitrary assertions.

Neither do we see a working measure for meaning (a yet-unsolved problem Shannon wisely avoided). Since Gitt can’t define what meaning is sufficiently to measure it, his ideas don’t amount to much more than arm-waving.

TalkOrigins is pretending that Gitt said he could prove his theorems. Gitt explicitly stated that they are to be taken as true until any exception can be found. All logical propositions ultimately rest on unprovable statements (after Gödel). Gitt has offered these as theorems and the burden of proof is on TalkOrigins to show that any of Gitt’s theorems are incorrect. They have not done so. We all know from experience that every one of Gitt’s theorems matches our experience.

More from TalkOrigins:

By asserting that data must have an intelligent source to be considered information, and by assuming genomic sequences are information fitting that definition, Gitt defines into existence an intelligent source for the genome without going to the trouble of checking whether one was actually there. This is circular reasoning.

Gitt does not assert that data must have an intelligent source to be considered information. He asserts that all sequences of symbols that are non-repeating and have statistics, syntax and semantics that we know the origin of have an intelligent source. That is not circular reasoning. This is standard inductive inference.

Thus every objection TalkOrigins makes to Gitt in this article is shown to be wrong.

This is patently false. It contradicts all the other genetics literature defining codes, the genetic code and the reasons why we call DNA the genetic code. I’ve got a stack of biology books and none of them call DNA a cipher. Most do not even contain the word cipher.

To say that DNA is not a code is an inexcusable misrepresentation of one of the most basic facts in all of science. TalkOrigins should be ashamed of this. I’m amazed this article even exists.

TalkOrigins says:

An essential property of language is that any word can refer to any object. That is not true in genetics. The genetic code which maps codons to proteins could be changed, but doing so would change the meaning of all sequences that code for proteins, and it could not create arbitrary new meanings for all DNA sequences. Genetics is not true language.

Webster’s Dictionary defines language as (2): a systematic means of communicating ideas or feelings by the use of conventionalized signs, sounds, gestures, or marks having understood meanings

DNA fits most definitions of language. Not necessarily all. The stipulation that any word can refer to any object is minor. It also may very well be wrong because in theory there’s an infinite number of possible configurations of biological machines.

TalkOrigins says:

The word frequencies of all natural languages follow a power law (Zipf’s Law). DNA does not follow this pattern (Tsonis et al. 1997).

[Creationist] Claim: In every case where a machine’s origin can be determined, it is the result of intelligent agency. (A machine is a device for transmitting or modifying force or energy.) Out of billions of observations, there are no exceptions. It should be considered a law of nature that machines, including those in living organisms, have an intelligent cause.

TalkOrigins Response:

1. The claim is an argument by analogy: Life is like man-made objects in containing machines, therefore it is like man-made objects in having an intelligent cause. It suffers the weaknesses of all arguments by analogy. In particular, it ignores dissimilarities between life and design, and the similarity has questionable relevance to intelligence.

My answer: The use of the word code in biology is not analogy (Yockey, 2005). Therefore the argument that life is designed is not an argument from analogy.

TalkOrigins:

Many machines occur in nature without the involvement of intelligence or, indeed, of any kind of life. The following list is far from exhaustive.

* Inclined planes, perhaps the simplest type of machine, are ubiquitous on earth. Functions include causing waves to break and making it easier for animals to climb heights.
* Ice wedges, another form of wedge, contribute significantly to erosion.
* Molecular bonds function as springs as they transmit and distribute forces through materials.
* Thunder clouds generate electrical forces.
* The earth as a whole is a dynamo, converting mechanical motion of convection into a magnetic field.
* Geysers produce eruptions which are predictable and fairly regular. If Paley’s watch can be considered a machine, surely Yellowstone’s Old Faithful is a machine, too, but I have never heard any suggestion that it is designed.

The problem here is an unacceptably vague definition of machine. An inclined plane is a machine by one definition, but that’s obviously not what the creationist was talking about. Thanks to TalkOrigins for a straw-man argument.

In my work I have focused on machines that use or produce code – i.e. variations on the Turing Machine. None of the above machines are Turing machines in the remotest sense and none of them produce codes.

TalkOrigins:

Other machines are created by life but not by intelligence. Genetic algorithms design or help to design many kinds of machines, from antennae to jet engines (Marczyk 2004). One may attempt to argue that items designed by a genetic algorithm inherit the intelligent agency of the algorithm’s designer, but this misses the point that no human mental activity directs the immediate operation of the algorithm. In some cases, for example in some electronic circuits, the algorithmically-designed results show no resemblance to their human-designed versions, and indeed, cannot be explained via human design methods (Koza et al. 2003).

All Genetic Algorithms originate from conscious beings. There are no known exceptions. At all times, GA’s are obeying the rules of a man-made code. Intelligence is always necessary to have a GA.

Thus we see that at every single point, TalkOrigins has failed to identify flaws in Gitt’s work. I would say the same of every other atheist critique. Gitt’s book remains an outstanding exposé of the flaws of materialistic biology.

9 Responses

You say “DNA fits most definitions of language”. I take it this means that there are some definitions which it doesn’t fit. Can you please explain what these definitions are, and why you feel that it doesn’t matter that it doesn’t fit them.
Also, You say “The stipulation that any word can refer to any object is minor.”
Can you explain why you think it’s minor?

1. a body of words and the systems for their use common to a people who are of the same community or nation, the same geographical area, or the same cultural tradition: the two languages of Belgium; a Bantu language; the French language; the Yiddish language.
2. communication by voice in the distinctively human manner, using arbitrary sounds in conventional ways with conventional meanings; speech.
3. the system of linguistic signs or symbols considered in the abstract (opposed to speech).
4. any set or system of such symbols as used in a more or less uniform fashion by a number of people, who are thus enabled to communicate intelligibly with one another.
5. any system of formalized symbols, signs, sounds, gestures, or the like used or conceived as a means of communicating thought, emotion, etc.: the language of mathematics; sign language.
6. the means of communication used by animals: the language of birds.
7. communication of meaning in any way; medium that is expressive, significant, etc.: the language of flowers; the language of art.
8. linguistics; the study of language.
9. the speech or phraseology peculiar to a class, profession, etc.; lexis; jargon.
10. a particular manner of verbal expression: flowery language.
11. choice of words or style of writing; diction: the language of poetry.
12. Computers. a set of characters and symbols and syntactic rules for their combination and use, by means of which a computer can be given directions: The language of many commercial application programs is COBOL.
13. a nation or people considered in terms of their speech.
14. Archaic. faculty or power of speech.

It seems that the crucial definition here is 12, particularly the “syntactic rules”. You have yet to demonstrate that there exist any rules determining which amino acids follow which others. This is crucial as without rules governing where they fit there is no evidence for assuming DNA is a language.
You also mention meaning several times. Can you clarify what meaning DNA carries?

The meaning of any code is the result you obtain in its decoded form. This is true at every level of decoding. For example the meaning of the codon GGG is the amino acid Glycine – you can look it up in any genetic code table in any biology book in the world.
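The table lookup described above can be sketched in a few lines. This is only an illustration: the dictionary lists a handful of the 64 codons of the standard genetic code (in the mRNA alphabet), and the names `CODON_TABLE` and `decode` are invented for the example.

```python
# Partial codon table (standard genetic code, mRNA alphabet). Only a few of
# the 64 codons are listed to keep the sketch short.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "GGG": "Gly", "GGA": "Gly", "GGC": "Gly", "GGU": "Gly",
    "UUU": "Phe", "UUC": "Phe",
    "AAA": "Lys", "AAG": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def decode(mrna: str) -> list[str]:
    """Translate an mRNA string codon by codon until a stop codon is read."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]  # KeyError for codons not listed above
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


print(CODON_TABLE["GGG"])
print(decode("AUGGGGUUUAAAUAA"))
```

The first lookup returns the amino acid for GGG; the second walks a short made-up sequence three bases at a time and stops at UAA.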

I have already referenced this several times in our conversations. Chris, the reason I’ve stopped communicating with you is that you seem not to read what I send:

Chris, the “syntactic rules” you ask for are easy to understand. As in a natural language, patterns of AA must obey many rules: for each mRNA there must be start codons and stop codons at specific places. There are ribosome binding sites, spliceosome recognition sites, etc. The binding sites for promoters and all forms of gene and protein regulation obey definite syntactic rules. The ENCODE project has identified thousands of components of the syntactic rules. These rules definitely determine which AA must follow others.
Hope this helps you.
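One of the rules mentioned above, a start codon followed in the same reading frame by a stop codon, can be sketched as a simple scan. This is a deliberately simplified illustration: it looks only at the first AUG and ignores ribosome binding sites, splicing and every other signal listed above. The function name `find_orf` and the test sequence are made up for the example.

```python
from typing import Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orf(mrna: str) -> Optional[str]:
    """Return the stretch from the first AUG through the first in-frame stop codon.

    Returns None if there is no AUG or no stop codon in that reading frame.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Step through the sequence three bases at a time, staying in frame.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i + 3]
    return None


print(find_orf("CCAUGGGGUUUUAACC"))
```

A stop triplet that is out of frame relative to the AUG is skipped, which is the sense in which position matters in this sketch.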

I want to propose a simple method to illustrate how DNA functions. I hope it will be useful and will help to strengthen your arguments relating to the coding abilities of DNA.

Chris and a post on the Infidels website by one of your ferocious critics prompted this response. Here is a quote from that post:

“The fact that random mutation can in fact increase the information content (In DIRECT contradiction to the author’s claim he made during his lecture) of DNA does seem to destroy the whole argument as well since then we not only have a changing code but one that clearly has evolved over time to become more complex. Self-evolving and non-conscious DNA also brings up the issue of whatever ‘message’ there originally was is now gone anyway. So how could there be any intent on the part of an intelligent and conscious sender? For the sake of argument, assume random mutations are destroying information. Then that also means that part of the original message is gone, so what was the point of sending it if after a while it is gone? what was the message? can’t god create a non-mutating message if this is to be the proof of its existence? “

What is the message contained in DNA? The message is information on how to manufacture protein molecules. The majority of these proteins are enzymes. Enzymes control chemical reactions. The basic message of the genetic code is to control these chemical reactions, the sum of which is regarded as life.

First, a few important properties of the behavioral patterns of the genetic code.
(I am sure most readers will be familiar with the genetic code and the basic structure of DNA and RNA and the roles of uracil, thymine, cytosine, adenine and guanine but allow me to stress the following points)

DNA is only the carrier of the genetic code. (Similarly, a hard disk and ROM and RAM chips are only carriers of the binary code and do not constitute the code.) The code is contained in the sequence of the organic bases.

The most important molecule, without which decoding the code would be impossible, is the small tRNA molecule. tRNA molecules are specific for specific amino acids and are absolutely necessary to place the amino acid in the correct place in the protein molecule (peptide chain).

Life is a multitude of chemical reactions occurring at a blistering speed. These reactions won’t happen, or will happen only at a very slow rate, at normal body temperature. The code’s messages regulate these reactions and allow them to occur at a blistering speed at normal body temperatures. The reactions are sped up by catalysts known as enzymes. Each enzyme is usually very specific for only one of this multitude of life-sustaining reactions.

The organic bases only act as symbols and can be replaced by any other type of symbol. Let us replace the organic bases with the numbers 1 to 5. Numbers 1, 2, 3 and 4 are present in the double helix portion of the code carrier (DNA). Numbers 1, 2, 3 and 5 are contained in the single strand portion of the code carrier (mRNA).

Let us now refer to the different amino acids as letters of the Roman alphabet used in English (or most Western languages). The genetic code contains 20 letters (amino acids). The 20 most frequently used letters in English can be used to replace the amino acids, and the genetic code will still be able to write the majority of the English verbs. Allocate three numbers to represent one letter. An example is to use 111 to represent A, etc.
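The substitution scheme above can be sketched as follows. Several details are assumptions made only for illustration: the particular set of 20 letters, the order in which triplets are assigned beyond 111 for A, and the choice of 4 as the digit that is replaced by 5 in the single-strand (mRNA) form. The names `LETTERS`, `encode`, `transcribe` and `decode` are likewise hypothetical.

```python
from itertools import product

# Hypothetical choice of 20 letters; the text only says "the 20 most
# frequently used letters", so this particular set is an assumption.
LETTERS = "abcdefghilmnoprstuwy"

# All 64 triplets over the digits 1-4 in sorted order; the first one, "111",
# is assigned to "a". The remaining assignments are arbitrary.
TRIPLETS = ["".join(t) for t in product("1234", repeat=3)]
ENCODE = dict(zip(LETTERS, TRIPLETS))
DECODE = {triplet: letter for letter, triplet in ENCODE.items()}


def encode(word: str) -> str:
    """Write a word as a string of digit triplets (the double-helix form)."""
    return "".join(ENCODE[ch] for ch in word)


def transcribe(dna_digits: str) -> str:
    """Single-strand form: digit 4 becomes 5 (assumed analogue of T -> U)."""
    return dna_digits.replace("4", "5")


def decode(digits: str) -> str:
    """Read triplets back into letters; accepts either form."""
    digits = digits.replace("5", "4")
    return "".join(DECODE[digits[i:i + 3]] for i in range(0, len(digits), 3))


message = encode("dog")
print(message, transcribe(message), decode(transcribe(message)))
```

Only 20 of the 64 possible triplets are used here, so the sketch has no redundancy and no stop signals; it shows the lookup idea and nothing more.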

The backbone of the genetic code carrier consists of chemical molecules (pentose sugars and phosphates). The backbone has no bearing on the code and can be ignored. The sequence of the symbols is the only important factor. It doesn’t matter what symbols are used: organic molecules, numbers or whatever.

Only relatively small portions of the enzymes act as catalysts. However, the spatial orientation of these active areas is of the utmost importance. The structure of the rest of the protein enzyme molecule will determine this orientation.

Letters create sentences. The most important part of a sentence is a verb.
Imagine the code to create the following sentences

The brown dog jumps over the red fox.
The brown dog mjpus over the red fox.
The first sentence describes an action. The second is meaningless, it needs a verb.

The desired action is for something to jump. The active part of an enzyme is comparable to a verb. It causes something to happen. For this action to happen the active part must occupy the correct three-dimensional position in space. In the genetic language the verb (active region) is important but useless if placed in the wrong position.

thebrowndogjumpsovertheredfox.
bsvtroffxhvtrjumpsonhreoxdhrex

Both sentences contain an active part (jumps) in a similar place, and the enzyme will be active.

Another example

thebrowndogtheredfoxoverjumps
thebrowndogovertheredfoxjumps.

Here in both sentences the active parts are situated in the wrong positions and the enzyme will be inactive , unable to catalyze the desired chemical reaction.

These examples also illustrate another remarkable ability of the genetic code, namely the ability of certain living creatures to distinguish between their own and foreign proteins. This ability forms the basis of the immune reaction.

Most mutations are harmless and won’t cause any changes in the active catalytic region or three-dimensional structure of the protein. Not only the enzymes, but also the structural proteins will be affected by this “noise”. The “innocent noise” changes the protein without disturbing its basic structure or function. The result is that the proteins of different individuals differ slightly.

The “noise” is at the root of the immune system and our modern ability to identify individuals through their DNA.

This illustration just touches on the basic elements of the genetic code and illustrates the purpose of its message. The primary message is “create life”, and it is accomplished by catalyzing chemical reactions. This message is still relevant and hasn’t changed through the ages.

One last remark. For my code to be able to be decoded the system must contain something to recognize the letters and to insert them in the sentence, something equivalent to tRNA.

I just want to add three journal abstracts about enzymes’ active regions.
Donald et al. showed that enzymes are able to tolerate mutation even in areas near the active region.
Zheng Yuan et al. showed a difference in flexibility between the active and non-active regions.
Finkelstein describes how forced mutation at the active site can enhance enzymatic activity. (My verb analogue: “jumping” (is or are jumping) compared to “jump” or “jumped”. Which one creates the more powerful image?)

THE JOURNAL OF BIOLOGICAL CHEMISTRY
© 1992 by The American Society for Biochemistry and Molecular Biology, Inc.
Vol. 267, No. 15, Issue of May 25, pp. 10248-10251, 1992
Printed in U.S.A.
Region-directed Mutagenesis of Residues Surrounding the Active Site Nucleophile in β-Glucosidase from Agrobacterium faecalis
(Received for publication, December 2, 1991)
Donald E. Trimbur, R. Antony J. Warren, and Stephen G. Withers
From the Departments of Chemistry and Microbiology, University of British Columbia, Vancouver, British Columbia V6T 1Y5, Canada

”
The active site nucleophile of the β-glucosidase of Agrobacterium faecalis has recently been identified by the use of inhibitors. A combination of site-directed and in vitro enzymatic mutagenesis was carried out on the β-glucosidase to probe the structure of the active site region. Forty-three point mutations were generated at 22 different residues in the region surrounding the active site nucleophile, Glu358. Only five positions were identified which affected enzyme activity, indicating that only a few key residues are important to enzyme activity; thus the enzyme can tolerate a number of single residue changes and still function. The importance of Glu358 to enzymatic function has been confirmed and other residues important to enzyme structure or function have been identified.
”

National Laboratory of Biomacromolecules, Institute of Biophysics, Academia Sinica, Beijing 100101, China
Received July 24, 2001.
Accepted December 10, 2002.
Revision received October 21, 2002.

Abstract
”
Protein flexibility is inherent to protein structural behavior. Experimental evidence for protein flexibility is extensive both in solution and in the solid state. A major question is whether the flexibility observed in enzymes is simply an inherent property of proteins that must always be borne in mind or is essential for catalysis or substrate binding. The temperature factors or B-values, as determined crystallographically, are linearly related to the mean square displacement of an atom and give an indication of atomic flexibility in the crystalline state. In this paper, we describe the frequency distributions of the normalized B-factor (B′-factor) for the active site and non-active site residues in the selected 69 apo-enzymes. This analysis was performed over the entire sequences and for different structural subsets defined by the three-dimensional structure of proteins, as α-helices, β-structures and coil conformation and buried and non-buried residues. The results show that in all cases, the active site residues predominantly occur in region of low B′-factor and the non-active site residues have a tendency to exist in the high B′-factor region. This observation suggests that the active site residues, in general, are less flexible than the non-active site residues and therefore the vibrational and the fast collective motions of the Cα atoms of proteins appear not to have clear biological significance.
”

Enzymes have been used by humans to carry out chemical transformations on small-molecule substrates for hundreds of years—for example, for brewing beer or making cheese. However, the high substrate specificity of many enzymes makes it challenging to generate new ‘biocatalysts’ that efficiently carry out chemical transformations on non-native substrates. Sandström et al. now show that it is possible to generate a very small library of mutant lipases and still find a highly active variant that is able to perform an enantioselective reaction on a larger non-native substrate. The authors selected nine amino acids in the substrate-binding pocket for simultaneous mutation, making a library that only contained 1,024 variants. The most active variant identified had mutations at five of the targeted positions. An analysis of single and double mutants suggested that the native enzyme lay in a relatively ‘flat’ or inactive region of the protein fitness landscape, and the authors believe that ‘walking’ (via directed evolution) from that position would not have yielded a highly enantioselective mutant lipase for the new substrate. However, that several mutations were simultaneously incorporated meant that the authors had ‘hopped’ to a different, more active region of the protein fitness landscape. This approach may facilitate the discovery of new biocatalysts for other enantioselective chemical transformations.
Just a jump to the left
”

Okay, I’ve still got a little more of the above to check out but in the meantime I still question your collective explanations:

Start and stop codons do not qualify as syntactic rules. Start codons do not always specify methionine, however, this is always the first amino acid. Therefore, the information specified by the code is ignored. Stop codons simply don’t have a stable mRNA to bind to. They break apart too quickly to be useful.

This indicates that these are not necessarily “read” by a ribosome, as it would be able to recognize these codons without the associated mechanical fit to the peptide under construction.

Also, the authors’ dismissal of the lack of an arbitrary codon-amino acid relationship is entirely unfounded. Taking your favourite example, GGG could never specify anything other than glycine. This is governed by chemical reactions, not the interpretation of the cellular components of a peptide chain.

And Perry, sorry to sound a little petulant but the reason you’ve seen this question before is that you have done a poor, not to mention downright evasive, job of answering it. The other contributors here have been of more use.

Note: I’ve also questioned your interpretation of both “science” and “information” in previous correspondence and on this site. While I realise that this may not be an appropriate thread I have yet to receive an answer. Feel free to reply to my email instead.