Cell Biology International (2004) 28, 729–739 (Printed in Great Britain)
Chance and necessity do not explain the origin of life
J.T. Trevors a,* and D.L. Abel b
a Laboratory of Microbial Technology, Department of Environmental Biology, Room 3220, Bovey Building, University of Guelph, Guelph, Ontario, Canada, N1G 2W1
b The Gene Emergence Project, The Origin-of-Life Foundation Inc., 113 Hedgewood Dr., Greenbelt, MD 20770-1610, USA
Where and how did the complex genetic instruction set programmed into DNA come into existence? The genetic set may have arisen elsewhere and been transported to the Earth. If not, it arose on the Earth, and became the genetic code in a previously lifeless, physical–chemical world. Even if RNA or DNA were inserted into a lifeless world, they would not contain any genetic instructions unless each nucleotide selection in the sequence was programmed for function. Even then, a predetermined communication system would have had to be in place for any message to be understood at the destination. Transcription and translation would not necessarily have been needed in an RNA world. Ribozymes could have accomplished some of the simpler functions of current protein enzymes. Templating of single RNA strands followed by retemplating back to a sense strand could have occurred. But this process does not explain the derivation of “sense” in any strand. “Sense” means algorithmic function achieved through sequences of certain decision-node switch-settings. These particular primary structures determine secondary and tertiary structures. Each sequence determines minimum-free-energy folding propensities, binding site specificity, and function. Minimal metabolism would be needed for cells to be capable of growth and division. All known metabolism is cybernetic – that is, it is programmatically and algorithmically organized and controlled.
Keywords: Cellular communication, Chance, Necessity, Genetic control, DNA, RNA, Evolution, Information theory, Life origin, Astrobiology, Panspermia.
*Corresponding author. Tel.: +1 519 824 4120x53367; fax: +1 519 837 0442.
Genetic information consists of linear digital algorithmic programs. These programs are recorded into the sequencing of DNA's primary structure. Algorithms consist of sequences of decision-node switch-settings. Each nucleotide selection in the sequence constitutes a quaternary decision-node selection. “Quaternary” is not used here in a chemical sense, but to distinguish four-branch decision nodes from binary decision nodes with only two branches (a fork in the road). Each selection of a nucleotide is from among the four real options.
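The distinction between a four-branch and a two-branch decision node can be made concrete with a minimal sketch. It assumes an arbitrary two-digit binary label for each nucleotide; the mapping `NUCLEOTIDE_BITS` and the function names are illustrative choices, not a biological convention. Each quaternary selection is then equivalent to two binary selections.

```python
# Illustrative assumption: this particular assignment of bit pairs is arbitrary.
NUCLEOTIDE_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def to_bits(sequence: str) -> str:
    """Spell each four-way nucleotide choice as two two-way (binary) choices."""
    return "".join(NUCLEOTIDE_BITS[base] for base in sequence.upper())


def from_bits(bits: str) -> str:
    """Invert to_bits by reading the bit string two digits at a time."""
    reverse = {code: base for base, code in NUCLEOTIDE_BITS.items()}
    return "".join(reverse[bits[i:i + 2]] for i in range(0, len(bits), 2))


if __name__ == "__main__":
    encoded = to_bits("GATTACA")
    print(encoded)             # 14 binary digits for 7 nucleotides
    print(from_bits(encoded))  # round-trips to the original sequence
```

A sequence of n nucleotides therefore corresponds to 2n binary selections under this labelling.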
Not all segments of the sequence are critical. But in those sections that are, each switch-setting is highly determinative of minimum-free-energy folding and binding success (Fontana and Schuster, 1998; Reidys et al., 2001; Schuster, 1995; Wuchty et al., 1999).
Peer-reviewed life-origin literature presupposes that, given enough time, genetic instructions arose via natural events. Thus far, no paper has provided a plausible mechanism for natural-process algorithm-writing. Only 200 million years separated the end of Earth's bombardment (Davies, 2001; Whittet, 1997) from the presumed first appearance of life on Earth 3.8 billion years ago (Martin and Russell, 2003; Mojzsis et al., 1996; Parsons et al., 1998; Schopf, 1993; Van Zuilen et al., 2002). Following cooling, it is difficult to understand how natural processes could have generated the following aspects of life in such a short time: (1) a genetic operating system with which to record programming instructions, (2) the programs themselves for production or assembly of every individual building block, biochemical pathway, and metabolic cycle needed for even the simplest protometabolism to develop, and (3) a coding system with which to translate triplet codon “language” into polyamino acid language.
2 The derivation of genetic instruction in nature
There is an immense gap from prebiotic chemistry and the lifeless Earth to a complex DNA instruction set, code encryption into codonic sequences, and decryption (translation) into amino acid sequences. Sound reductionistic science should keep breaking down the immensity of the life-origin problem into its component problems. Life-origin specialists should appreciate that the generation of instructions is a separate and distinct problem from that of devising a language system with which to record those instructions. And these two problems are separate and distinct from the problem of encryption/decryption (coding) associated with translation to proteins. We tend to naively refer to genomes as genetic code. Molecular biological cybernetics is not that simple. The phenomenon of instructions must be constantly delineated from the phenomenon of an overall language, and also from the phenomenon of code encryption/decryption. Translating one operating system into another is not the same problem as writing algorithmic programs. Genes and emergent gene networks represent programming. These algorithms are written in a pre-existent operating system environment. As in computer science, this language is used by the programmer. We must not only find models for specific genetic programming, but for the genetic operating system.
Natural processes have never been observed to write conceptual instructions, to symbolize such algorithmic meaning, or to translate it using code bijection (a one-to-one correspondence of “meaning” between alphanumeric symbols) from one language system into another. In addition, inanimate nature seems to possess no attributes capable of encryption/decryption of coded messages. Genes are algorithmic programs “instantiated” into the physical medium of nucleotide sequence. What do we mean by “instantiated”? Algorithmic programs are abstract. They are integrated strings of fundamentally nonphysical choice commitments. But of necessity any program must be recorded into a physical medium (e.g., a computer disc, a typed letter, an email, a telephone or TV signal, DNA). This recordation into physicality of choices is called “instantiation”. Successive ribonucleotide selections instantiate the abstract “instructions” for ribozyme secondary and tertiary structures. Successive triplet codon selections incorporate the instructions for polyamino acid sequence. This primary structure ultimately determines protein conformation and contextual function. The genetic operating system uses a bijective coding system whereby a certain triplet codon represents a certain amino acid. Thus, we are left with three separate missing mechanisms of molecular evolution theory which remain to be explained:
How did inanimate nature write (1) the conceptual instructions needed to organize metabolism? (2) a language/operating system needed to symbolically represent, record and replicate those instructions? (3) a bijective coding scheme (a one-to-one correspondence of symbol meaning) with planned redundancy so as to reduce noise pollution between triplet codon “block code” symbols (“bytes”) and amino acid symbols? We could even add a fourth question. How did inanimate nature design and engineer (4) a cell [Turing machine? (Turing, 1936)] capable of implementing those coded instructions?
3 The role of natural selection
Natural selection could select the fittest already-programmed phenotypes. Evolution works through differential survival and reproduction of the superior members of each species. Phenotypes are the finished products of nucleic acid (genetic) algorithms. Natural selection could not have programmed nucleic acid algorithms at the covalently-bound primary structure (sequence) level. The environment does not select nucleotide or codonic sequences. The environment favors only the fittest phenotypes. It knows nothing of genotypic programming directly. Nature has no ability to optimize a conceptual cybernetic system at the decision node (covalently-bound sequence) level. Nature cannot organize conceptual, holistic operating systems and instructions from “necessary” (Monod, 1972) mass/energy relationships. Freedom of selection is necessary at each decision node. Gene regulation and coordination are programmed algorithmically. No known hypothetical mechanism has even been suggested for the generation of nucleic acid algorithms.
The standard genetic code table is published in a multitude of scientific papers and books. Numerous papers have been published in an attempt to explain the origin of the genetic code (Barbieri, 2003; Beebe et al., 2003b; Crick, 1968; Di Giulio, 1997, 1999a,b, 2000, 2001a,b, 2002, 2003; Di Giulio and Medugno, 1998, 1999, 2000, 2001; Freeland et al., 2003; Guimaraes, 2004; Jukes, 1993; Jukes and Osawa, 1993; Knight et al., 1999; Osawa et al., 1992; Ribas de Pouplana et al., 1998; Ribas de Pouplana and Schimmel, 2000, 2001c; Ronneberg et al., 2000, 2001; Schimmel and Ribas de Pouplana, 1999; Schimmel and Wang, 1999; Seligmann and Amzallag, 2002; shCherbak, 2003; Skouloubris et al., 2003; Stevenson, 2002; Szymanski and Barciszewski, 2000; Szymanski et al., 2000; Trevors, 2003; Wang and Schultz, 2002; Woese et al., 2000; Wong, 1975, 1988; Xue et al., 2003; Yarian et al., 2002; Yarus and Christian, 1989; Yarus, 2000). All code origin models have problems and lack detail. In addition, no physical mechanism has been suggested for the source of abstract genetic instructions themselves. New formats and approaches are needed to investigate the origin of instructions, coding, biochemical pathways, cycles, metabolism, and life itself.
4 The role of long periods of time
The argument has been repeatedly made that given sufficient time, a genetic instruction set and language system could have arisen. All that would be needed would be diversification, environmental selection, and continuing optimization. But extended time does not provide an explanatory mechanism for spontaneously generated genetic instruction. What is needed is a plausible mechanism for natural-process-generation of functional algorithms. We need empirical evidence of prescriptive genetic information arising spontaneously, without artificial investigator selection and amplification. A fulfilled prediction of the latter would be ideal. So far, none has occurred.
No amount of time proposed thus far can explain this type of conceptual communication system. It is not just complex. It is conceptually complex. First, the ribosome/tRNA/aminoacyl tRNA synthetase/amino acid holistic translative system would have had to pre-exist any messages. Then we have to explain how the DNA and mRNA sequence provided the codon-encrypted instructions for the correct proteins to be synthesized. Only then could the receiver and destination have known what those instructions meant.
The appearance of genetic control does not seem possible unless the transmitted message and the decoded outcome were pre-arranged. It is an immense challenge to envision how this coding/decoding could have occurred on the early natural-process Earth, especially under harsh conditions for life. One possibility is that the first coding/decoding system was very simple and the first code produced only a few useful proteins. The code was then optimized to produce larger and larger genomes and more complex organisms. If this did occur, then the first small genetic instruction set would have needed the capacity to become more complex and diverse. As new more complex instructions were sent, the translative system would have had to maintain relative constancy for needed proteins to be synthesized. This could only be achieved if the three-base codonic code for one amino acid was already established. A change in the number of nucleotides per codon at some time during early evolution would have resulted in reading frame shifts. This would have resulted in a catastrophic loss of everything gained up to that point. Life would have had to start over again.
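The consequence of a shifted reading frame, or of a different number of nucleotides per codon, can be illustrated with a short sketch. The function name `split_codons` and the example sequence are hypothetical and chosen only for illustration.

```python
from typing import List


def split_codons(sequence: str, frame: int = 0, codon_size: int = 3) -> List[str]:
    """Group a sequence into consecutive codons, starting at offset `frame`.

    Trailing bases that do not fill a whole codon are dropped.
    """
    usable = sequence[frame:]
    count = len(usable) // codon_size
    return [usable[i * codon_size:(i + 1) * codon_size] for i in range(count)]


if __name__ == "__main__":
    message = "ATGGCTTTTGGA"  # hypothetical 12-base example
    print(split_codons(message, frame=0))       # triplets read in frame
    print(split_codons(message, frame=1))       # same bases, shifted by one
    print(split_codons(message, codon_size=4))  # same bases, four per codon
```

In this example no codon of the in-frame reading survives in either alternative reading, which is why every downstream assignment would change at once.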
5 Code bijection is a separate problem from programming
Code bijection is a one-to-one correspondence of “meaning” between alphanumeric symbols in different languages or operating systems. The phrase “genetic code” should be reserved for describing this one-to-one correspondence between each triplet codon and its corresponding amino acid (Yockey, 1992, 2000, 2002a,b,c). Proper use of the term genetic code applies only to the redundant (usually erroneously called degenerate in the literature), noise-reducing table of codon assignments. These assignments are widely regarded as an amazingly optimal coding system (Bradley, 2002; Freeland and Hurst, 1998; Gilis et al., 2001; Labouygues and Figureau, 1984). Such optimization makes it all the more difficult to explain the molecular evolution of such exquisite non-human genetic algorithms. Random walks (Markov Processes) (Holland, 2003) will never provide an adequate explanation for the generation of such a highly refined translative coding system. Nor will they provide an explanation for the codonic operating system and specific programs generated through selection of each nucleotide in a strand.
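The redundancy of the table of codon assignments can be inspected directly. The sketch below builds the standard genetic code from its usual compact 64-letter listing, written with DNA letters (T in place of U) and with `*` marking stop codons. The variable names are illustrative assumptions; only the assignments themselves come from the standard table.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 triplets to its one-letter amino acid (or stop) symbol.
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons under the symbol they specify.
SYNONYMS = defaultdict(list)
for codon, amino in CODON_TABLE.items():
    SYNONYMS[amino].append(codon)

if __name__ == "__main__":
    print(len(CODON_TABLE), "codons map to", len(SYNONYMS), "symbols")
    for amino in sorted(SYNONYMS, key=lambda a: len(SYNONYMS[a])):
        print(amino, len(SYNONYMS[amino]), " ".join(SYNONYMS[amino]))
```

Run as written, the listing shows that methionine and tryptophan each have a single codon while leucine, serine, and arginine each have six. This uneven spread is the redundancy the text refers to.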
Similarly, it is difficult to envision how the laws of physics and chemistry could explain encryption/decryption. No physicochemical link exists between programmed nucleotide sequence and amino acid functional sequence. How did each tRNA acquire the correct anticodon? In addition, how would each tRNA link up with the correct amino acid on its opposite end? Each aminoacyl tRNA synthetase would then have to establish a link with the correct amino acid. Paul Schimmel's group and others have done some fine work in this area (Beebe et al., 2003a,b; Hendrickson et al., 2002; Nangle et al., 2002; Nomanbhoy and Schimmel, 2001; Ribas de Pouplana et al., 2001; Ribas de Pouplana and Schimmel, 2001a,b,c; Schimmel and Ribas de Pouplana, 2001; Tamura and Schimmel, 2001, 2002). But much of this research involves human engineering.
Cause-and-effect physicality has no ability to anticipate or devise a conceptual system that employs symbolic representationalism. Both the semantics and syntax of codonic language must translate into appropriate semantics and syntax of protein language. That symbolization must then translate into the “language” of three-dimensional conformation via minimum-free-energy folding. No combination of the four known forces of physics can account for such conceptual relationships. Symbolism and encryption/decryption are employed. Codons represent functional meaning only when the individual amino acids they prescribe are linked together in a certain order using a different language. Yet the individual amino acids do not directly react physicochemically with each triplet codon. Even after a linear digital sequence is created in a new language, “meaning” is realized at the destination only upon folding and lock-and-key binding.
How did 20 specific tRNAs, aminoacyl tRNA synthetases, and amino acids self-organize into a holistic translative operating system? The origin of translation defies natural-process modeling as a holistic system. The ribosome is only one aspect of translation. Yet we do not even know how ribosomes formed to provide such sophisticated translation machinery.
The four known forces of physics know nothing of the phenomenon of linguistic translation. The laws of physics and chemistry cannot explain why each tRNA just happens to have the correct anticodon and links up with the correct amino acid and the correct aminoacyl tRNA synthetase.
The “chicken and egg” paradox, therefore, remains a stubborn problem. Little progress has been made on the origin of the genetic code on the Earth or elsewhere. Its improbable delivery to the Earth by bombardment or impact events billions of years ago has its own problems.
6 Astrobiological considerations
Exploring other planets and moons will provide new scientific knowledge. But astrobiology seems unlikely to explain the origin of the phenomenon of genetic instruction. Most investigators agree that the origin of life on the Earth was a very improbable event. Thus far, no evidence exists of life elsewhere in the universe. Life may have arisen on another planet or moon. But some regard it extremely unlikely that spores could have survived prolonged space travel and rapid entry into our atmosphere (Dose, 1986; Weber and Mayo Greenberg, 1985). Others disagree (Davies, 2001; Secker et al., 1996), citing spore coatings of silicon or carbon (lithopanspermia) as a mechanism of shielding even interstellar organisms from UV radiation, excessive speed, shock, and heat (Melosh, 1993; Weiss et al., 2000). Even if panspermia were true (Arrhenius, 1908; Hoyle and Wickramasinghe, 1981, 1986; Parsons, 1996), the origin of an operating system language and specific prescriptive genetic information (source code) would remain unexplained. No fixed laws or formulae can program metabolic computation. Instruction is abstract and conceptual. Yet instruction is exactly what genomes do. In addition to instructing, they actually perform metabolic function in a hardware-implementation sense.
Much attention has been given to recent media announcements by NASA suggesting that liquid water once existed on Mars (Associated Press, USA Today. April 1 2004 “NASA rover finds more signs of past water on Mars”). Water is central to life as we know it (Ball, 2004; Good, 1973; Papagiannis, 1992). Water is the biological solvent. Its high polarity and high dielectric constant contribute to protein and ribosomal folding through hydrogen-bond and van der Waals like forces. Hydrophobic and hydrophilic groups within the same polypeptide help determine protein conformation (e.g., thermostable beta-sheets) (Brack, 1993; Trevors, 2002). Some proteins’ conformation depends upon trapped water molecules within their innermost folds. The lipid bilayers of membranes are permeable to water and ionic solutes (Deamer and Bramhall, 1986). Clay surface adsorption thought to assist early RNA polymerization depends upon one or two molecular layers of water molecules (Anderson and Banin, 1975). Water is a powerful hydrolytic agent. Hydrolysis is used to break down biomolecules for recycling into different primary structures (sequences). Organic solvents could not do this (Brack, 1993).
The presence of water suggests that a planet has (1) enough mass to retain an atmosphere, (2) enough rotation to aid cooling, and (3) the right distance from its sun to provide a habitable zone or ecosphere (Papagiannis, 1992). But the mere presence of liquid water on a planet tells us little about life origin. Astrophysicist Paul Davies calls this “the ingredients fallacy” (Davies, 2003). He points out that even the presence of carbon, hydrogen, nitrogen, oxygen, sulfur, and water on a planet no more guarantees life than the presence of silicon guarantees the presence of computers. The algorithmic complexity of life puts our finest computers to shame. Water has been present in abundance on Earth for approximately 4 billion years. Yet many investigators have abandoned earth-based life-origin models for more likely astrobiological sources. Water's presence does not create or explain life.
The role of carbon polymers is also important to life. Cairns-Smith proposed clay-life models of life (Cairns-Smith, 1966, 1977, 1990; Cairns-Smith et al., 1972; Cairns-Smith and Walker, 1974). Clay-life models have not been pursued for decades for multiple reasons. Information retention would be limited to crystal irregularities in the otherwise regular, highly-ordered crystal matrix. But the number of such irregularities would be insufficient to retain the amount of information required by life. In addition, no satisfactory mechanism of genetic takeover was ever proposed. The information would have to be translated from clay crystal irregularities into nucleotide sequences. No theoretical means of code bijection (one-to-one correspondence between languages) exists. Access to the non-surface and deeper clay layers to read clay information also remained unexplained by this model. Finally, clay crystal irregularities are randomly distributed. As such, they may meet the Shannon definition of “information” (which Shannon himself eschewed). But such randomly distributed irregularities would never have programmed sophisticated metabolic functions.
The only other options discussed for non-carbon life have been nonlayered silicon (Mann and Perry, 1986; Trevors, 1997a; Williams, 1986) and boron polymers. Silicon polymers cannot gain sufficient length for adequate information retention. Silicon forms bonds with other elements that would interfere with silicon–silicon chain formation. Silicon lacks the relatively easily-broken-and-rejoined covalent-like bonds enjoyed by carbon, hydrogen, and oxygen in organic compounds. Silicon bonds are too rigid and irreversible for cellular metabolic recycling of structural, enzymatic, regulatory, and informational biopolymers. Silicon is too insoluble in an aqueous environment. Sand, a typical silicon compound, is a good example. No organisms could have been produced except in an aqueous environment. Carbon, unlike silicon, is amenable with the help of catalysts to dehydration synthesis even in an aqueous environment. Yet carbon-based organisms do not dissolve in ponds, rivers, and oceans. Carbon chains are unique. Finally, silicon chains lack the ability of carbon chains to establish a lipoprotein-like connection between different kinds of biomolecules. Lipids have a different solubility and serve different functions from proteins. Both are needed for life as we know it. Carbon–carbon bonds provide both kinds of branching using the same basic building blocks. Lipoprotein molecules can cooperate to contribute to cellular survival through such functions as membrane formation. Silicon oxide can form layers, but lacks the unique properties of lipoprotein needed for semi-permeable membranes, active transport, secretion, and excretion.
The main interest in silicon has not been as the backbone molecule of a life system, but its role as a surface adsorbent and catalyst for proper alignment and polymerization of polyadenines and polyuracils (Burton et al., 1974; Ding et al., 1996; Ertem and Ferris, 1996, 1998, 2000; Ferris et al., 1996, 1988, 1989; Ferris and Ertem, 1992; Friebele et al., 1980, 1981; Huang and Ferris, 2003; Kawamura and Ferris, 1999; Miyakawa and Ferris, 2003; Paecht-Horowitz and Eirich, 1988; Trevors, 1997a). But polyadenines and polyuracils, like the monotonous clay crystals to which they adsorb, contain almost no Shannon uncertainty (often misnamed “information”). Clay surface adsorption could not possibly be the source of highly informational genetic instructions.
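The claim that a homopolymer carries almost no Shannon uncertainty can be checked numerically. The sketch below estimates average uncertainty per symbol from observed symbol frequencies alone, using H = −Σ p log2 p. This zero-order estimate, the function name, and the example strings are simplifying assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_uncertainty(sequence: str) -> float:
    """Average Shannon uncertainty in bits per symbol, H = -sum(p * log2(p)).

    Probabilities are estimated from symbol frequencies in the sequence itself.
    """
    counts = Counter(sequence)
    total = len(sequence)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(shannon_uncertainty("AAAAAAAAAAAA"))  # polyadenine: one symbol only
    print(shannon_uncertainty("ACGUACGUACGU"))  # four equally frequent symbols
```

A polyadenine string yields zero bits per symbol, while four equally frequent nucleotides yield the two-bit maximum for a four-letter alphabet. The measure reflects symbol statistics only and says nothing about function.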
The notion of boron life has never received serious attention. Based on current astronomical knowledge, there appears to be insufficient boron in the cosmos to support life on any planet. Any evolutionary scenario would require large quantities of boron compounds to provide enough diversity from which to select happenstantial algorithmic metabolic function. The replicative potential of carbon biopolymers is lacking in the case of both boron and silicon polymers. In addition, metabolic function depends largely upon folding. Neither silicon nor boron offers the peculiar secondary and tertiary folding-versatility needed to catalyze and support life. Unique lock-and-key binding fits are not afforded by boron or silicon molecules. No empirical evidence exists for any form of life other than carbon-chemistry life. Silicon and boron life models present so many challenges that it is difficult to imagine either using even the most elementary definition of life.
One could postulate that the extraterrestrial origin of life would have provided a longer time period for life to have originated and evolved elsewhere before it was delivered to the ancient Earth by one or more impact events. If we ignore the need for cosmic cooling following a hot Big Bang, we only extend the available time from 4 billion to 14 billion years. This does little to overcome the statistical prohibitiveness of algorithmic self-organization.
Organic compounds were found in the Murchison meteorite (Deamer and Pashley, 1989; Kvenvolden et al., 1970; Ponnamperuma, 1972). The transport of life in rock fragments from Mars has been suggested. This possibility need not be limited to a nearby planet. The maximum amount of additional time available for evolution to arise on Mars over that available on Earth is only about 100 million years (Line, 2002). A maximum extension of 600 million additional years anywhere in the Solar System exists over that available on habitable Earth (Line, 2002). This suggests that the origin of life on Mars, other planets and moons, and its transport to the Earth becomes highly improbable.
Many biochemical questions remain as to how exclusively right-handed ribonucleotides could have been activated and polymerized with 3′5′ bonds in an aqueous environment (Shapiro, 1984, 1987, 1988, 1999, 2000, 2002). Ribose and ribonucleotides are hard to make and are not stable. In addition, when delivery of life from one planet to another is proposed, it focuses on the delivery of spores and/or bacterial cells. Possibly, the spores were simply the carrier for the genetic instruction set that was inserted into the primitive lifeless Earth. But neither DNA nor its instantiated instructions are themselves alive. All of the genetic instructions in DNA are usually still present milliseconds after cell death. Yet the cell is dead. Clearly, genetic programs are not synonymous with life itself. If viable spore-seeding of Earth did occur, an intervening time period of about 3.8 billion years (Mojzsis et al., 1996; Parsons et al., 1998; Schopf, 1993; Van Zuilen et al., 2002) passed, to bring us to our present state of evolution.
7 The nature of prescriptive information
All of the above problems pale in comparison to the difficulty of explaining the origin of (1) an operating system, (2) genetic programming, and (3) encryption/decryption coding. Natural processes, mechanisms, and chemical catalyses do not explain any of these emergent conceptual phenomena. An immense amount of the necessary information and knowledge is still lacking. For example, transcription and translation can only function if there is a predetermined shared meaning as to what required proteins are synthesized at translation. How was this shared meaning between source and destination pre-established? Each specific genetic message from DNA to RNA to protein can only be decoded if the coding/decoding apparatus and operating system pre-exist the message. The message received by the ribosome must be decrypted and translated into particular proteins needed for certain tasks. These proteins in turn must be transported to the correct binding site, the true destination of the source's original message. Even “meaningful” RNA or DNA inserted into a lifeless physical world such as the ancient Earth would not be “readable”. It could not communicate its coded message for protein synthesis unless a language (operating system) context already existed. Programs must be executable. This requires the equivalent of a hardware and an operating system context. All necessary structures/functions for protein synthesis would have had to be in place, and a predetermined specific correspondence between codon sequence and amino acid sequence had to have predated translation.
8 The RNA world
It is not surprising that the RNA world model is so appealing despite its many biochemical problems (Joyce and Orgel, 1999; Shapiro, 1984, 1987, 1988, 1999, 2000, 2002). The RNA world conveniently bypasses translative coding issues. The same molecule acts as a catalyst and physical information matrix. Ribonucleotides do not have to be grouped into triplet “block codes” such that each symbolizes a different amino acid letter in a protein sentence. No plausible natural-process mechanism for development of such an ingenious noise-reducing, redundancy-based coding scheme is needed for RNA world theory.
RNA may have been copied to DNA (as by current retroviruses). This would have allowed DNA to take over as the more stable, double-stranded, genetic instruction set. DNA has many other advantages such as the capability of exact replication and partitioning between offspring cells, and transcription to mRNA (Line, 2002). But retroviruses depend upon reverse transcriptase, a complex protein enzyme not available in the RNA world. Most metabolic functions require highly tailored protein enzymes that not only act to regulate DNA, but do most of the work of the cell. The “Which came first, DNA or proteins?” question, otherwise known as the “chicken and the egg problem”, is a real enigma and not easily solved despite a century of life-origin research. In addition, the conversion from RNA to DNA worlds does not explain the origin of initial RNA genetic programming. How did covalently-bound nucleotide sequencing anticipate what amino acid sequences would be needed? Moreover, the instruction code for the enzymes needed to make all this function is contained in the genetic instruction set. The instruction set needs protein synthesis to replicate the instruction set and regulate cell division (Trevors, 2004).
9 Did genetic instruction arise by “necessity”?
We keep intuiting, “there must be some natural mechanism that we are overlooking which will explain the origin of genetic instruction”. By this we mean some cause-and-effect “necessity” rather than “chance” (Monod, 1972). But is such a natural mechanism a plausible scenario for the origin of genetic instructions?
Natural mechanisms are all highly self-ordering. Reams of data can be reduced to very simple compression algorithms called the laws of physics and chemistry. No natural mechanism reducible to law can explain the high information content of genomes. This is a mathematical truism, not a matter subject to overturning by future empirical data. The cause-and-effect necessity described by natural law manifests a probability approaching 1.0. Shannon uncertainty is a probability function (−log₂ p).
Living cells are capable of replication, error correction, and sufficient mutation for diversity; yet sufficient accuracy of replication is maintained to preserve the relative constancy of any species. The DNA template can be transcribed to mRNA which is then translated into predetermined, useful, necessary, proteins. This is an immense instructional complexity and prone to errors at several steps and at every base. However, within the confines of a cell, the entire process functions at a high level of fidelity. Exceptions include the effects of mutagen(s), cell stress, injury, and death. Under appropriate environmental conditions, minimal errors are made. This means that the genetic communication system is functionally optimized (Bradley, 2002; Freeland and Hurst, 1998; Freeland et al., 2000; Gilis et al., 2001; Labouygues and Figureau, 1984).
The sequence of nucleotides in DNA determines the coded instruction set. A predetermined knowledge of the decryption cipher and the cellular enzymes and organelles is needed to translate the coded information. Specific sequences of deoxyribonucleotides are essential to communicate each biomessage. The correct protein must be synthesized in the correct amounts and at the correct time. Without a coding/decoding system, message sequences in the first mRNA and DNA molecules would have been meaningless (nonfunctional metabolically). Communication within the protocell could not have been established. The nucleotide sequence is also meaningless without a conceptual translative scheme and physical “hardware” capabilities. Ribosomes, tRNAs, aminoacyl tRNA synthetases, and amino acids are all hardware components of the Shannon message “receiver”. But the instructions for this machinery are themselves coded in DNA and executed by protein “workers” produced by that machinery. Without the machinery and protein workers, the message cannot be received and understood. And without genetic instruction, the machinery cannot be assembled.
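The dependence of a message on its decoding convention can be illustrated with a short sketch. It uses a five-codon excerpt of the standard genetic code (not the full 64-entry table) and a hypothetical `translate` helper; both the helper and the example message are illustrative assumptions, and the cellular receiver is of course far more elaborate than a lookup table.

```python
# Five-codon excerpt of the standard genetic code (mRNA codons).
# This is an illustrative subset, not the complete table.
CODON_TABLE = {
    "AUG": "Met",   # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "Stop",
}


def translate(mrna: str, table: dict) -> list:
    """Read codons in frame from position 0 until a stop codon or the end.

    Codons with no entry in `table` are reported as "?", i.e. the
    receiver has no rule for interpreting them.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = table.get(mrna[i:i + 3])
        if residue is None:
            peptide.append("?")
        elif residue == "Stop":
            break
        else:
            peptide.append(residue)
    return peptide


message = "AUGUUUGGCGCUUAA"                     # hypothetical example message
with_table = translate(message, CODON_TABLE)    # decodable by this receiver
without_table = translate(message, {})          # same sequence, no convention
```

The same nucleotide string yields a defined residue sequence only when the receiver holds the matching table; with an empty table every codon is uninterpretable, which mirrors the point that source and destination must share a convention.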
10 Did the genetic code arise by “chance”?
It is not reasonable to expect hundreds to thousands of random sequence polymers to all cooperatively self-organize into an amazingly efficient holistic metabolic network. The spontaneous generation of long sequences of DNA out of sequence space (Ω) does have the potential to include the same sequences as genetic information. But there is no reason to suspect that any instructive biopolymer would isolate itself out of Ω and present itself at the right place and time. Eigen and Schuster, along with others, have pointed out repeatedly that a competition for resources would have existed in any prebiotic environment. This would have greatly limited both sequence space and hypercyclic advance (Eigen, 1971a,b, 1983, 1987, 1992; Eigen et al., 1980a,b; Schuster, 1984; Smith, 1979). The latter is especially true of a theoretical RNA world where the number and length of RNA strands are greatly limited. In aqueous solution, RNA can polymerize to a maximum of 8–10 mers (Ferris et al., 1996; Joyce and Orgel, 1999). Up to 55 mers can polymerize on montmorillonite (Ferris et al., 1996), but only at the expense of information content. The polyadenines and polyuracils contain essentially no Shannon uncertainty. They could not have contributed to algorithmic programming of genes.
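The statement that homopolymers carry essentially no Shannon uncertainty can be checked with a minimal sketch. It assumes the usual per-symbol estimate H = −Σ pᵢ log₂ pᵢ computed from the observed base frequencies of a strand, and it counts sequence space Ω as 4ⁿ for a strand of n nucleotides. The function names and the example strands are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def per_symbol_uncertainty(seq: str) -> float:
    """Shannon uncertainty in bits per symbol, from observed frequencies."""
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def sequence_space_size(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of the given length (the size of Omega)."""
    return alphabet_size ** length


poly_a = "A" * 55        # homopolymer, like the clay-templated polyadenines
mixed = "ACGU" * 5       # hypothetical strand using all four bases equally

h_poly_a = per_symbol_uncertainty(poly_a)   # zero: every position is the same
h_mixed = per_symbol_uncertainty(mixed)     # 2 bits per symbol, the 4-letter maximum
omega_10 = sequence_space_size(10)          # 4**10 possible 10-mers
```

Note that this frequency-based measure describes only the statistical variety of a strand. A high value does not by itself indicate function, which is consistent with the distinction drawn in this section between random sequences and prescriptive information.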
Even if all the right primary structures (digital messages) mysteriously emerged at the same time from Ω, “a cell is not a bag of enzymes”. And, as we have pointed out several times, there would be no operating system to read these messages.
Without selection of functional base sequencing at the covalent level, no biopolymer would be expected to meet the needs of an organizing metabolic network. There is no prescriptive information in random sequence nucleic acid. Even if there were, unless a system for interpreting and translating those messages existed, the digital sequence would be unintelligible at the receiver and destination. The letters of any alphabet used in words have no prescriptive function unless the destination reading those words first knows the language convention.
11 The role of biological editing functions
No new information can be inserted into existing DNA without sophisticated restriction and ligase enzymes. The editing function of these enzymes, including that of the far less sophisticated ribozymes, must itself be algorithmically instructed. All in vitro ribozymic editing requires extensive artificial selection by humans (e.g., SELEX) (Ellington and Szostak, 1990; Robertson and Joyce, 1990; Tuerk and Gold, 1990). Nucleotide sequence is deliberately manipulated and steered through many iterations to achieve the experimenter's goal. SELEX experiments demonstrate the extraordinary inventive prowess of some excellent RNA chemists. But the fine work of these biopolymer engineers has little or nothing to do with natural selection. In addition, “directed evolution” is a self-contradictory nonsense phrase that has no place in the literature. If an experiment is directed, it is not evolutionary. Evolution has no goal.
Whatever prescriptive information DNA contains has to be instantiated into its physical matrix as the strand forms with covalent bonds. Conformation and function are ultimately determined by primary structure. The linear digital sequence of nucleotide selections constitutes the message of “messenger molecules”. A genetic information system or convention must have been devised prior to ribonucleotide sequencing. Only then could message source and destination (binding sites) “be on the same page”.
Random sequences are the antithesis of prescribed genetic information. There is no empirical or rational justification for theorizing that the random shuffling of nucleotides could generate instructions for a metabolic network. Progress has been made, however, on the evolution of already existing genetic instructions (Altman, 2000; Altreuter and Clark, 1999; Alves et al., 2002; Baltscheffsky, 1997; Baltscheffsky et al., 1999; Benner et al., 1987; Benner and Ellington, 1987, 1988; Blankenship and Hartman, 1998; Castresana and Saraste, 1995; Cech, 1993, 2000; Cunchillos and Lecointre, 2002, 2003; Goossens et al., 2003; Hartman, 1975; Morowitz et al., 2000; Trevors, 1997b; Wachtershauser, 1990, 1992). But none of these papers provides mechanisms whereby stochastic ensembles in prebiotic environments acquire algorithmic programming prowess. Even the earliest protometabolism would have needed integrative management.
New approaches to investigating the origin of the genetic code are required. The constraints of historical science are such that the origin of life may never be understood. Selection pressure cannot select nucleotides at the digital programming level where primary structures form. Genomes predetermine the phenotypes which natural selection only secondarily favors. Contentions that offer nothing more than long periods of time offer no mechanism of explanation for the derivation of genetic programming. No new information is provided by such tautologies. The argument simply says it happened. As such, it is nothing more than blind belief. Science must provide a rational theoretical mechanism, empirical support, prediction fulfillment, or some combination of these three. If none of these three is available, science should reconsider whether molecular evolution of genetic cybernetics is a proven fact and press forward with new research approaches which are not obvious at this time.
JTT is supported by an NSERC (Canada) Discovery grant; DLA is supported by grants from The Origin-of-Life Foundation, Inc., a 501(c)(3) science foundation.
Arrhenius S. Worlds in the making. 1908
Barbieri M. The organic codes: an introduction to semantic biology. 2003
Beebe K, Ribas de Pouplana L, Schimmel P. Elucidation of tRNA-dependent editing by a class II tRNA synthetase and significance for cell viability. EMBO J 2003:22:668-75
Benner SA, Allemann RK, Ellington AD, Ge L, Glasfeld A, Leanz GF. Natural selection, protein engineering, and the last riboorganism: rational model building in biochemistry. Cold Spring Harb Symp Quant Biol 1987:52:53-63
Burton FG, Lohrmann R, Orgel LE. On the possible role of crystals in the origins of life. VII. The adsorption and polymerization of phosphoramidates by montmorillonite clay. J Mol Evol 1974:3:141-50
Cairns-Smith AG. Seven clues to the origin of life. 1990
Di Giulio M, Medugno M. The historical factor: the biosynthetic relationships between amino acids and their physicochemical properties in the origin of the genetic code. J Mol Evol 1998:46:615-21
Ding PZ, Kawamura K, Ferris JP. Oligomerization of uridine phosphorimidazolides on montmorillonite: a model for the prebiotic synthesis of RNA on minerals. Orig Life Evol Biosph 1996:26:151-71
Eigen M, Gardiner WC, Schuster P. Hypercycles and compartments. Compartments assists – but do not replace – hypercyclic organization of early genetic information. J Theor Biol 1980:85:407-11
Eigen M. Steps toward life. 1992
Ferris JP, Huang CH, Hagan WJ. Montmorillonite: a multifunctional mineral catalyst for the prebiological formation of phosphate esters. Orig Life Evol Biosph 1988:18:121-33
Ferris JP, Ertem G, Agarwal V. Mineral catalysis of the formation of dimers of 5′-AMP in aqueous solution: the possible role of montmorillonite clays in the prebiotic synthesis of RNA. Orig Life Evol Biosph 1989:19:165-78
Friebele E, Shimoyama A, Ponnamperuma C. Adsorption of protein and non-protein amino acids on a clay mineral: a possible role of selection in chemical evolution. J Mol Evol 1980:16:269-78
Goossens A, Hakkinen ST, Laakso I, Seppanen-Laakso T, Biondi S, De Sutter V. A functional genomics approach toward the understanding of secondary metabolism in plant cells. Proc Natl Acad Sci U S A 2003:100:8595-600
Guimaraes RC. The Genetic Code as a Self-Referential and Functional System. International Conference on Computation, Communications and Control Technologies 2004:7:160-5
Hoyle F, Wickramasinghe NC. Space travelers. 1981
Kawamura K, Ferris JP. Clay catalysis of oligonucleotide formation: kinetics of the reaction of the 5′-phosphorimidazolides of nucleotides with the non-basic heterocycles uracil and hypoxanthine. Orig Life Evol Biosph 1999:29:563-91
Kvenvolden K, Lawless J, Pering K, Peterson E, Flores J, Ponnamperuma C. Evidence for extraterrestrial amino-acids and hydrocarbons in the Murchison meteorite. Nature 1970:228:923-6
Mann S, Perry CC. Structural aspects of biogenic silica. Silicon Biochemistry 1986:121:40-58
Martin W, Russell MJ. On the origins of cells: a hypothesis for the evolutionary transitions from abiotic geochemistry to chemoautotrophic prokaryotes, and from prokaryotes to nucleated cells. Philos Trans R Soc Lond B Biol Sci 2003:358:59-83
Nangle LA, De Crecy Lagard V, Doring V, Schimmel P. Genetic code ambiguity. Cell viability related to the severity of editing defects in mutant tRNA synthetases. J Biol Chem 2002:277:45729-33
Parsons I, Lee MR, Smith JV. Biochemical evolution. II: origin of life in tubular microstructures on weathered feldspar surfaces. Proc Natl Acad Sci U S A 1998:95:15173-6
Schimmel P, Ribas de Pouplana L. Genetic code origins: experiments confirm phylogenetic predictions and may explain a puzzle [published erratum appears in Proc Natl Acad Sci U S A 1999 May 11;96(10):5890]. Proc Natl Acad Sci U S A 1999:96:327-8
Schimmel P, Ribas de Pouplana L. Formation of two classes of tRNA synthetases in relation to editing functions and genetic code. Cold Spring Harb Symp Quant Biol 2001:66:161-6
Secker J, Wesson PS, Lepock JR. Astrophysical and biological constraints on radiopanspermia. J R Astro Soc Can 1996:90:184-92
Seligmann H, Amzallag GN. Chemical interactions between amino acid and RNA: multiplicity of the levels of specificity explains origin of the genetic code. Naturwissenschaften 2002:89:542-51
Skouloubris S, de Pouplana LR, de Reuse H, Hendrickson TL. A noncognate aminoacyl-tRNA synthetase that may resolve a missing link in protein evolution. Proc Natl Acad Sci U S A 2003:100:11297-302
Trevors JT. Hydrophobic Medium (HM) water interface, cell division and the self-assembly of life. Theory Biosci 2002:121:163-74
Turing AM. On computable numbers, with an application to the Entscheidungsproblem. Proc Lond Math Soc 1936:42:230-65
Williams RJP. Introduction to silicon chemistry and biochemistry. Silicon Biochemistry 1986:121:24-39
Xue H, Tong KL, Marck C, Grosjean H, Wong JT. Transfer RNA paralogs: evidence for genetic code-amino acid biosynthesis coevolution and an archaeal root of life. Gene 2003:310:59-66
Yarian C, Townsend H, Czestkowski W, Sochacka E, Malkiewicz AJ, Guenther R. Accurate translation of the genetic code depends on tRNA modified nucleosides. J Biol Chem 2002:277:16391-5
Yockey HP. Information theory and molecular biology. 1992
Yockey HP. Information theory, evolution, and the origin of life. Fundamentals of life 2002:335-48
Yockey HP. Informatics, information theory, and the origin of life. Fourth International Conference on Computational Biology and Genome Informatics 2002
Received 8 April 2004/19 May 2004; accepted 24 June 2004. doi:10.1016/j.cellbi.2004.06.006