
Sunday, April 26, 2009

Should Scientific Organizations Advocate Accommodationism?

 
John Wilkins has started an interesting debate on the topic of Science and religion for individuals and organisations. He starts with a couple of multiple choice questions.

Get on over to Evolving Thoughts and share your evolving thoughts on the subject. Meanwhile you can answer my own multiple choice question in the sidebar.
What should scientific organizations like AAAS and NAS say about religion?
a) that religion and science are compatible
b) that religion and science are incompatible
c) nothing


Saturday, April 25, 2009

Science in the Media: Put Up or Shut Up

 
Kathy Sykes is a professor of sciences and society at the University of Bristol (UK). She writes about science journalism in the latest issue of New Scientist [Science in the media: Put up or shut up].

Sykes doesn't like the fact that scientists are criticizing popular science journalism. Ryan Gregory has already posted an article about this and I urge you to go to Genomicron and leave a comment on his posting: Scientists about media: put up or shut up?.

I just want to make one point. Sykes writes ...
Similarly, New Scientist recently took flak over its cover that proclaimed "Darwin was wrong". The article inside described discoveries that are leading to modifications to the theory of evolution. A cheap trick to sell magazines while giving fodder to the enemies of evolution? Sales certainly went up that week, but if more people than usual bought the magazine and read the article, more people will have found that scientists agree that Darwin was fundamentally right.
The three most important criteria for good science journalism are: accuracy, accuracy, and accuracy. Everything else is secondary.

My objection to that article in New Scientist was that it had nothing to do with Darwin. It's not a question of whether Charles Darwin was right or wrong about horizontal gene transfer and the early evolution of prokaryotes. He had absolutely nothing to say about the matter. Dragging Darwin's name into modern molecular evolution was a cheap ploy to boost sales. People reading the article would have still got the wrong impression about Darwin's contributions, even if they had ignored the cover.

The article was scientifically inaccurate because it misrepresented the state of science in 2009 [Explaining the New Scientist Cover].


Does Intelligent Design Creationism Make Scientific Predictions?

 
It is often claimed that Intelligent Design Creationism doesn't make predictions. This is not true. IDC predicted that irreducibly complex systems could not evolve. That was a firm prediction by Michael Behe.

The prediction has been shown to be wrong. There are many natural evolutionary pathways known to give rise to irreducibly complex systems. The citric acid cycle is a clear example and so is the bacterial flagellum.

Here's another prediction, according to Barry Arrington on Uncommon Descent [FAQ4 is Open for Comment].
ID does not make scientifically fruitful predictions.

This claim is simply false. To cite just one example, the non-functionality of “junk DNA” was predicted by Susumu Ohno (1972), Richard Dawkins (1976), Crick and Orgel (1980), Pagel and Johnstone (1992), and Ken Miller (1994), based on evolutionary presuppositions. In contrast, on teleological grounds, Michael Denton (1986, 1998), Michael Behe (1996), John West (1998), William Dembski (1998), Richard Hirsch (2000), and Jonathan Wells (2004) predicted that “junk DNA” would be found to be functional.

The Intelligent Design predictions are being confirmed and the Darwinist predictions are being falsified. For instance, ENCODE’s June 2007 results show substantial functionality across the genome in such “junk DNA” regions, including pseudogenes.

Thus, it is a matter of simple fact that scientists working in the ID paradigm carry out and publish research, and they have made significant and successful ID-based predictions.
This one is more contentious. There are many scientists who think that much of what we currently call "junk DNA" actually has a function. Even though they might be atheists, their prediction is the same as the creationists'.

I'm convinced that most of our genome is truly junk. I predict that the creationist prediction will turn out to be wrong. I wonder if it means that intelligent design creationism will be falsified?


Not Me

 
Andy Thomson is a psychiatrist. He gave a talk at the Atheist Convention 2009 in Atlanta, Georgia (USA). PZ Myers thinks that Thomson's explanation of religious belief is just what he (PZ) believes.

Not me. The talk is far too adaptationist for my liking. The entire lecture is based on evolution by natural selection—the Darwinian explanation.
[Darwin's] idea gives us the only workable explanation we have for the design and architecture of the human mind.
No, it isn't the only workable explanation. I believe that our present mind is also due, in part, to accidents of evolution, some of which might have nothing to do with design. Some of them might even be maladaptations. The architecture of our brain is a product of evolution but not all of that evolution is adaptation by natural selection.

We have got to stop trying to explain everything as an adaptation or the consequences of an adaptation. Many, but not all, people are prone to superstitious beliefs. Much of that is due to culture and it can be changed. Our brains are not perfect. They can be tricked into believing all sorts of silly things and believing in God is just one of them. It does not deserve a special evolutionary explanation.

At some point in the near future, religion will be only a minor problem in most Western industrialized nations. Will we have psychiatrists giving lectures about how our brains are adapted to be atheists?

Of course not, just as today we don't have psychiatrists and psychologists giving lectures about how the human brain is adapted to prefer slavery or the inferiority of women. Perhaps they would have if they had lived 1000 years ago.



Watch the video starting at 27 minutes. You'll see Thomson praising research that locates thoughts like "God's Love" and "God's Anger" to specific parts of the brain. These are the same parts of the brain used in other thoughts. Presumably, they are the same parts of the brain used when thinking about being abducted by UFOs or believing in Santa Claus. That's not a big deal, is it?

So when Thomson says that this data "Supports theories that ground religious belief in evolved adaptive mechanisms," he could just as easily have said the same thing about UFO abductions ("The evidence supports theories that ground belief in UFO abductions in evolved adaptive mechanisms").

What is the alternative? Did anyone think that these thoughts would map to a special part of the brain that was used exclusively for thinking about God's Love?


Evolution of the Long Distance Runner

 
Today's Toronto Star has a feature article on marathon running [Any schmo can run a marathon]. The subtitle is more informative: "Humans, scientists say, are built for speed – or, at least, endurance. It's all in our shortish toes and big behind."

As one of those schmos who can't run a marathon,1 I'm always intrigued by claims that all the rest of you have evolved over millions of years to become the perfect marathon runners. The article, by staff reporter Cathel Kelley, focuses on the claims of Daniel Lieberman, an anthropologist at Harvard University. He is one of many scientists who claim that humans are vastly superior at long distance running compared to other mammals, and even compared to our ancestors. They claim there's been selection for the ability to run long distances. Is this a reasonable explanation?

Lieberman's latest paper shows that individuals with short toes are possibly better runners than those with longer toes (Rolian et al. 2009). Since humans tend to have shorter toes than non-bipedal primates, this suggests a possible evolutionary adaptation to running.

An earlier paper promoted the idea that our gluteus maximus (GM) muscle is also an adaptation for long distance running (Lieberman et al. 2006). The closing paragraph of that second paper supports an adaptive explanation but it expresses the appropriate caveats.

Future experimental and paleontological research is necessary to clarify the functional and evolutionary history of the human GM. Based on the above results, we offer several alternative scenarios that merit further study. As noted above, one possibility is that australopithecines had an intermediate configuration of the GM (Berge, 1994; Berge and Daynes, 2001), retaining some kind of caudal portion but with a less expanded cranial portion than is evident in Homo. If so, then the caudal portion would likely have been an effective extensor of the femur during climbing and perhaps walking, and the cranial portion would have helped to stabilize the sacrum, but probably would not have been a strong trunk stabilizer. An implication of this scenario is that the expansion of the cranial portion of the GM is a derived trait of Homo that would have been selected for control of trunk flexion during endurance running (Bramble and Lieberman, 2004) and/or foraging (Marzke et al., 1988). An alternative possibility, however, is that the configuration of the GM in Australopithecus was much like that of Homo in terms of the loss of the GMIF. Either the australopithecine GM as a whole was relatively smaller, as many researchers suggest, or possibly as large as in humans (Haeusler, 2002). As shown above, the GM in either case is unlikely to have played much of a role in level terrain walking, and is unlikely to have been selected for running given that the genus lacks many other features associated with running capabilities (Bramble and Lieberman, 2004). According to this scenario, the derived anatomy of the GM in Australopithecus was probably a reconfiguration of the gluteal musculature for climbing, or a novel adaptation for foraging tasks such as digging that involve flexion of the trunk (Marzke et al., 1988). We cannot discount the hypothesis that expansion of the GM might have been useful for walking on uneven terrain.
However, it is clear that expansion of the GM in Homo would have benefited any activity that requires trunk stabilization, especially running. Regardless of which scenario is correct, the expansion of cranial portion of the GM is a uniquely hominid characteristic, perhaps distinctive to the genus Homo, which played a vital role in the evolution of human running capabilities.
The newspaper description of the endurance running hypothesis (ERH) is a little more descriptive.
Humans' ability to run is unique among primates.

Why running? Because that's how we killed our food.

Experts call it persistence hunting. The Homo genus did not develop the most basic projectile – the spear – until 200,000-300,000 years ago. That left our ancestors equipped with little more than sharpened sticks for nearly two million years of carnivorous prehistory.

"Even middle-aged college professors can run at a speed that's above the trot-gallop transition of most animals," Lieberman says.

"Why is that important? Quadrupeds cannot pant and gallop at the same time. Their guts are too busy sloshing around like a piston. So, every 10 or 15 minutes, they overheat."

When they overheat, animals must stop to cool. But their bipedal pursuers keep on coming. After several stops and starts, the prey succumbs to heat exhaustion or its heart gives out.

This explains why they don't run the Iditarod in August.

Lieberman contends that this is the only explanation of how humans were capable of killing large game before developing projectile weapons.

"I defy most people to go out and kill a wildebeest with a wooden stick," he says.
Here's how I understand this story.

About one million years ago the entire human population was engaged in hunter-gatherer activities on the African savanna. Most of the small groups obtained a significant amount of their food by hunting large animals. The males would run after these large animals with no weapons. The animals would run away but the humans kept chasing them until the animals couldn't run any more and they dropped dead. (Presumably the wildebeests never caught on to the fact that they could just turn around and gore the pesky humans. Or maybe they couldn't because the humans could outrun them? Here's what happens when marathon-adapted humans try running with bulls.)

There was considerable variation within the human population. Some men had short toes and some men had long toes. Some men had well-developed gluteus maximus muscles and some men didn't. Presumably, the men with genetic traits that enabled them to run faster or farther than the other men got more food than their friends. Their friends either died of starvation or else they had so little meat they couldn't get a mate and reproduce.

Over time there was selection for men who could run farther and faster and humans became adapted to long-distance running. (Presumably the women were good at it as well because they inherited their genes from their fathers.)

When humans began to inhabit other locations that didn't require running, the adaptations remained because by that time all the low fitness variations had been eliminated from the population. That's why there was no loss of this ability when humans began to settle in northern forests and caves, and began to farm and create cities. We all remain well-adapted to long distance running so that, with only a little training, we could all chase down a wildebeest on the African savanna.

I assume the wildebeests just didn't evolve as quickly or they would have adapted as well.

The bison on the North American plains probably could run faster than the natives because, to the best of my knowledge, the North American natives didn't run after buffalo in order to make them die of heat exhaustion. They used sneaky tricks like forcing them to charge over cliffs. They also sneakily used bows and arrows. The natives only started chasing buffalo when horses became available, which is very strange since humans are better at long-distance running than horses—or so the story goes.

One of the problems with evolutionary psychology is that the psychologists claim to know exactly what human societies were like one million years ago. That's one of the problems with the endurance running hypothesis as well. It is based on the assumption that we know how primitive societies obtained food (by running after large animals on the savanna). In fact, we don't know if this is true and we don't even know what percentage of the species might have adopted this lifestyle.


1. Because my toes are too long.

[Image Credit (upper): Constantina Dita-Tomescu]

Lieberman, D.E., Raichlen, D.A., Pontzer, H., Bramble, D.M., and Cutright-Smith, E. (2006) The human gluteus maximus and its role in running. Journal of Experimental Biology 209:2143-2155. [DOI: 10.1242/jeb.02255]

Rolian, C., Lieberman, D.E., Hamil, J., Scott, J.W., and Werbel, W. (2009) Walking, running and the evolution of short toes in humans. Journal of Experimental Biology 212:713-72. [DOI: 10.1242/jeb.019885]

A Horse of a Different Color

 
John Hawks continues to post interesting articles on his blog and he continues his policy of not allowing comments. I want to ask a question about his latest posting [A horse of a different color] so I'm asking it here.

John is referring to a short paper on the evolution of coat colors in horses. Apparently, the analysis of DNA from ancient fossil horses reveals that most of them were bay in color. The chestnut coat color wasn't detected until about nine thousand years ago.

The authors of the paper claim there is no evidence for selection of coat color in horses prompting the following comment by John Hawks.
The pigment-altering mutations at these genes do not all show statistical signs of selection in contemporary samples of horses. But they aren't there in the ancient horses. That's the best evidence of selection you could possibly have. Message: tests of selection on contemporary samples are weak, particularly for loci with rare alleles or more than two alleles.
John, if I understand you correctly, you're saying that as long as an allele wasn't detectable in ancient populations but is detectable today then random genetic drift is ruled out.

That doesn't make sense to me. Perhaps you could explain? There must be more to your statement than that.
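The drift alternative can be made concrete with a toy simulation. The sketch below is a minimal Wright-Fisher model, not anything taken from the horse paper: the population size, the 10% "detectable" threshold, and the function names are all illustrative assumptions. It asks how often a brand-new neutral allele, with no selection at all, climbs from a single copy to a detectable frequency.

```python
import random


def neutral_allele_reaches(threshold, pop_size, rng, max_generations=10_000):
    """Follow one new neutral allele in a diploid Wright-Fisher population.

    Returns True if its frequency ever reaches `threshold`, and False if it
    is lost first. There is no selection: each generation is a random
    resampling of the previous one.
    """
    copies = 2 * pop_size          # number of gene copies in the population
    count = 1                      # the allele starts as a single new mutation
    for _ in range(max_generations):
        freq = count / copies
        if freq >= threshold:
            return True
        if count == 0:
            return False
        # Binomial sampling of the next generation, using only the stdlib.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
    return False


def fraction_reaching(threshold=0.10, pop_size=50, runs=1000, seed=1):
    """Fraction of independent runs in which the neutral allele becomes detectable."""
    rng = random.Random(seed)      # fixed seed so the sketch is repeatable
    hits = sum(neutral_allele_reaches(threshold, pop_size, rng) for _ in range(runs))
    return hits / runs


if __name__ == "__main__":
    print(f"neutral alleles reaching 10%: {fraction_reaching():.3f}")
```

Under neutrality the chance of reaching frequency x from a single copy is roughly (1/2N)/x, so about one run in ten with these made-up parameters. In this toy setting, an allele that was absent long ago and is present now is not, on its own, enough to exclude drift.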


Friday, April 24, 2009

On the Existence of God and the Courtier's Reply

Atheists and theists often discuss the existence of God. Unfortunately, these discussions often degenerate into classic Christian apologetics where the main goal of the theist is to rationalize why his or her god doesn't conflict with rationality.

Before long they are rambling on about how to resolve the problem of evil or why god doesn't reveal herself. These problems only exist once you've accepted the premise that there is a god/spirit. This sort of apologetics has nothing to do with the fundamental question of whether god exists in the first place.

PZ Myers invented the parable of The Courtier's Reply to describe this situation.1 Rather than address the burning question—is the Emperor wearing any clothes?—the believers will complain that you don't understand the latest fashion.

They say you can't have a serious discussion about the existence of god because you aren't versed in the sophisticated arguments of Christian apologetics. In other words, you have to be intimately familiar with all the ways of rationalizing superstitious belief in god before challenging the very existence of god.2

It's amazing how few theists get the point. The latest person to demonstrate a fundamental misunderstanding of simple logic is Joe Hinman at Atheistwatch. Hinman has a Master's degree in Theology and he is currently studying for a Ph.D. in the history of ideas. He exposes himself by complaining about Anti-Intellectual Tendencies in Atheism.
So What this courtier's reply is saying is that if the skeptic says stupid things about theology and demonstrates that he knows nothing about it and the theist says "O your criticism is invalid because you don't understand what you are criticizing" then all the atheist has to do is say "that's the courtiers reply" and the theist is supposed to go "O my God, I've violated a law of logic!" and give up and stop believing in God. But in realty it's into a log of logic, I never heard it in a logic class.It's not in a logic text book, and the meaning of it is silly. I'ts just saying 'You can't point out my ignorance of theology because I will not allow theology to have any kind of validity or importance and religious people may not not any sort of human dignity." That's all it's saying. It's nothing more than anti-intellectual stupidity.

...

This anti-intellectual tendency is not confined to this one tactic. The new tactick, which I have noticed for a few years now, is to deny any sort of discipline of scholarship that has developed within the theological community. So any self defense that a believer could make is automatically suspect and wrong merely becasue it is theological. But then one wonders how the skeptics knowledge that theology is all bull shit could ever have developed in the first place? When we consider the history of Biblical scholarship it becomes clear that the atheists are merely arguing in a circle.

The history of scholarship shows us that it was not invented in answer to pressing atheist attacks on the bible.
Bingo! Christian apologetics was developed by people who believe in god. They needed to explain why their beliefs seem inconsistent with the real world. Many of these rationalizations are extremely "sophisticated" as you might expect since the problem is difficult.

I don't give a damn about those rationalizations no matter how many books have been written. Atheists don't have a problem with evil or sin or life after death or the resurrection. It's only theists who have a problem.

If Joe Hinman wants to explain why he is a theist then I'd be happy to discuss that topic. What's his best evidence for the existence of a spiritual world?


1. Also see The Emperor's New Clothes and the Courtier's Reply.

2. It's like saying that you have to learn how to cast a horoscope before you can question astrology.

Can watery asteroids explain why life is 'left-handed'?

It's time to re-visit the so-called "racemization" or "chirality" problem. The "problem" is thought to be the absolute preference for left-handed amino acids in living organisms.

Naturally occurring amino acids are racemic mixtures of both L- and D-amino acids. How did life come to select only one of the two possible stereoisomers for making proteins?

Sandwalk readers will know that I prefer an evolutionary explanation. In the beginning there were only a small number of amino acids that combined to make catalytically active peptides. One of these might have been glycine, which doesn't have L- and D- forms. Glycine might have formed spontaneously from acetate or glycerol.

Next came other simple amino acids whose biosynthesis might have been assisted by short polyglycine peptides with a few other naturally occurring amino acids. The peptides are biological catalysts, like modern enzymes only less efficient. It's possible that the primitive pathway might have favored the synthesis of amino acids like L-alanine and/or L-serine. (Many enzymatically catalyzed pathways are stereospecific—they make only one of the two possible forms.) The accumulation of L-alanine and L-serine could have been entirely by accident. It could just as easily have been D-alanine and D-serine.

Once the simple amino acids started to accumulate by biosynthesis, additional pathways started to evolve and more amino acids were added to the mix. These additional amino acids were all derived from the simple ones so they too were exclusively L- forms. Eventually life evolved from this chemical mixture and all the pathways were making the same form of amino acid. This process would have had to take place in a "warm little pond" in order to produce appropriate concentrations of the amino acids (and other things).

According to this scenario, the exclusive presence of L-amino acids instead of D-amino acids is just an accident. The fact that all amino acids are of the same form as the first ones is just a consequence of the fact that more complicated pathways started with the first ones as precursors.

Is there another theory? Yes there is. Some people think that life began in a soup containing all twenty or so common amino acids. They believe that all these amino acids formed spontaneously by chemical reactions rather than by the evolution of primitive catalysts from simple peptides.

Some people believe that the amino acids, and other chemicals, formed in outer space and they were delivered to Earth in meteorites. There has long been evidence that meteorites contain amino acids, lending support to this explanation.

Does this solve the chirality problem? No, it doesn't, because the amino acids found in meteorites are mixtures of L- and D-forms. People who support the idea that all twenty amino acids were present from the beginning would have to account for the selection of only one form from the mixture. Since this is highly unlikely, most favor a solution where some form of chemical synthesis preferentially results in a huge excess of left-handed amino acids. So far no example of such a reaction has been found.

On the other hand, there are hints that such a chemical reaction might be possible. There are 74 different amino acids in the Murchison meteorite and all of them are racemic mixtures (L- + D- forms). But in some cases there's a slight excess of the L- form of the amino acid suggesting that chemical formation of amino acids in outer space may favor the left-handed version—the same version that's found in living cells.

A recently published paper shows that the slight excess of one amino acid, isovaline, is enhanced by formation in liquid water (Glavin and Dworkin, 2009), giving rise to a press release that was reported in New Scientist as Watery asteroids may explain why life is 'left-handed'.

The scientific paper examines the chirality of amino acids found in several different meteorites. The main finding is that there's an excess of L-isovaline over D-isovaline in most samples. The excess can be as high as 18%. Most other amino acids have equal amounts of the L- and D-forms.

Isovaline is not one of the naturally occurring amino acids found in protein and the difference is significant. All 20 of the common amino acids have a structure like that shown on the left of the figure below. The central carbon atom (called the α-carbon) is attached to an amino group (—NH₃⁺) and a carboxyl group (—COO⁻). The carboxyl group makes it an acid and that's why these compounds are called amino acids.

Each carbon atom can have four covalent bonds. In the standard amino acids one of the other groups is always a hydrogen atom (—H). The other is a side chain shown as "R" in the figure. If the four groups bound to the α-carbon atom are different then the amino acid will exist in two different forms: L- and D-.1


The simplest amino acid is glycine where the R group is just a hydrogen atom. Thus, glycine is not a chiral compound and there's no such thing as L-glycine or D-glycine. All other natural amino acids are chiral.

Valine has an R group consisting of a branched 3-carbon chain. Isovaline, which is not a natural amino acid, is quite different. The hydrogen group found in all the standard amino acids is replaced by a methyl group (—CH₃).
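The "four different groups" rule can be written out as a small check. This is only a sketch: the group labels are plain strings chosen for illustration, the function name is hypothetical, and the ethyl side chain listed for isovaline comes from its standard structure rather than from the text above.

```python
def is_chiral_center(groups):
    """Return True when all four groups on a tetrahedral carbon are different."""
    if len(groups) != 4:
        raise ValueError("a tetrahedral carbon carries exactly four groups")
    return len(set(groups)) == 4


# Groups attached to the alpha-carbon, written as simple text labels.
ALPHA_CARBON_GROUPS = {
    "glycine":   ["NH3+", "COO-", "H", "H"],          # R is a second hydrogen
    "alanine":   ["NH3+", "COO-", "H", "CH3"],
    "valine":    ["NH3+", "COO-", "H", "CH(CH3)2"],   # branched 3-carbon R group
    "isovaline": ["NH3+", "COO-", "CH3", "CH2CH3"],   # methyl replaces the hydrogen
}

if __name__ == "__main__":
    for name, groups in ALPHA_CARBON_GROUPS.items():
        label = "chiral" if is_chiral_center(groups) else "not chiral"
        print(f"{name}: {label}")
```

Comparing labels as strings is enough for these small cases. A real cheminformatics check would compare whole substituents rather than text.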

Why is this important? It's important for two reasons. First, because isovaline is extremely rare on Earth you can be confident that the meteorite isn't contaminated by isovaline from living organisms. Second, all amino acids will spontaneously undergo racemization, or conversion of L- forms into D- forms and vice versa. Over millions of years this reaction will lead to equal amounts of the two forms. With amino acids like isovaline, where there are four large chemical groups bound to the α-carbon atom, the rate of this reaction is very slow (billions of years rather than 10 million years).

The paper by Glavin and Dworkin suggests that there may be natural chemical processes leading to an increase of one stereoisomer over the other and this natural preference for L-amino acids may be the reason why life selected the L- forms over the D- forms. Because isovaline is so stable it may preserve the original bias that has been lost in the case of the other amino acids.
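The idea that a slowly racemizing amino acid preserves an early bias can be illustrated with simple first-order kinetics. In the sketch below the excess of L over D decays exponentially toward zero. The two timescales and the elapsed time are round numbers picked to echo the "10 million years" versus "billions of years" contrast. They are assumptions, not measured rate constants, and real rates depend strongly on temperature and chemistry.

```python
import math


def enantiomeric_excess(l_amount, d_amount):
    """Excess of the L- form as a fraction of the total: (L - D) / (L + D)."""
    return (l_amount - d_amount) / (l_amount + d_amount)


def excess_after(initial_excess, elapsed_years, racemization_timescale_years):
    """Excess remaining after reversible first-order racemization.

    L and D interconvert at the same rate, so the difference between them
    decays exponentially with the given e-folding timescale.
    """
    return initial_excess * math.exp(-elapsed_years / racemization_timescale_years)


if __name__ == "__main__":
    start = enantiomeric_excess(59, 41)   # an 18% excess of the L- form
    elapsed = 1e9                         # illustrative elapsed time in years
    fast = excess_after(start, elapsed, 1e7)   # assumed "fast" racemizer
    slow = excess_after(start, elapsed, 5e9)   # assumed "slow" racemizer
    print(f"starting excess: {start:.2f}")
    print(f"fast racemizer after {elapsed:.0e} years: {fast:.2e}")
    print(f"slow racemizer after {elapsed:.0e} years: {slow:.2f}")
```

With these assumed numbers the fast racemizer loses essentially all of its excess while the slow one keeps most of it, which is the pattern the isovaline argument relies on.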

In order for extraterrestrial organic matter to have fueled the origin of life, a lot of meteorites carrying organic matter had to arrive on the primitive Earth. The problems of amino acid concentration and stability were discussed in a classic paper by Jeffrey Bada published in 1991.

Some of his calculations are worth remembering.

The current flux of extraterrestrial organic material is about 3 × 10⁸ grams per year from cosmic dust and micrometeorites. About 1% of this is amino acids and most of them are not the ones found in living organisms. This should give rise over time to a concentration in the oceans of about 0.1 nM (10⁻¹⁰ M). That's not sufficient for life to have originated.

The flux in the past was almost certainly much greater and lots of organic material might have been delivered by large meteorites; however, it's unlikely that amino acid concentrations in the oceans could ever have been more than 10-100 pM for all amino acids combined.

Most amino acids will spontaneously degrade over time. There's a window of opportunity that only lasts about 10 million years because in that time all the water in the oceans will pass through hydrothermal vents and the high temperature will destroy most chemicals—including amino acids.
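The order of magnitude of the present-day figure can be checked with back-of-envelope arithmetic. The sketch below is not Bada's own calculation. It combines the flux, the 1% amino acid fraction and the 10-million-year window quoted above with two outside assumptions: an ocean volume of about 1.4 × 10²¹ litres and a mean amino acid molar mass of about 100 g/mol.

```python
FLUX_G_PER_YEAR = 3e8          # extraterrestrial organic material, from the text
AMINO_ACID_FRACTION = 0.01     # about 1% of that is amino acids, from the text
ACCUMULATION_YEARS = 1e7       # window before vent cycling destroys them, from the text
MEAN_MOLAR_MASS_G = 100.0      # assumption: typical small amino acid, g/mol
OCEAN_VOLUME_LITRES = 1.4e21   # assumption: approximate volume of today's oceans


def ocean_molarity(flux=FLUX_G_PER_YEAR,
                   fraction=AMINO_ACID_FRACTION,
                   years=ACCUMULATION_YEARS,
                   molar_mass=MEAN_MOLAR_MASS_G,
                   volume=OCEAN_VOLUME_LITRES):
    """Concentration (mol/L) if the amino acids accumulate for `years` with no other losses."""
    grams = flux * fraction * years
    moles = grams / molar_mass
    return moles / volume


if __name__ == "__main__":
    molarity = ocean_molarity()
    print(f"{molarity:.1e} M  ({molarity * 1e9:.2f} nM)")
```

With these inputs the estimate comes out near 2 × 10⁻¹⁰ M, the same order of magnitude as the 0.1 nM quoted above. Changing the assumed ocean volume or molar mass shifts the answer by a small factor, not by the many orders of magnitude needed to make a concentrated soup.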

Bada concludes with ...
There are no known effective abiotic processes for generating chiral amino acids, which suggests that on the early Earth, only racemic amino acids would have existed. Because of the problem of racemization, it is likely that only after biotic protein synthesis became an efficient process in the evolution of early life could the chirality of amino acids be maintained in proteins. Instead of amino acid chirality preceding the origin of life, it may have developed after life was well established, and possibly in close association with the origin of protein biosynthesis. As to why the protein amino acids consist only of the L-enantiomers, it is probably just a matter of chance.
The important lesson here is that there are several different scenarios leading to the preference for L- amino acids. It's wise to keep in mind that abiotic (chemical) synthesis of amino acids with a bias for the L- forms is not the only possibility.


1. It's better to call these L- and D- forms of the amino acids rather than "left-handed" and "right-handed." The "handedness" refers to the direction in which the stereoisomers rotate polarized light and in modern terminology there is no obligatory connection between the L- designation and the optical activity. As it turns out, most of the L-amino acids are actually d-amino acids (dextrorotatory = right-handed), but there are some exceptions. L-cysteine, for example, is truly "left-handed" (see Specific Rotation and Temperature Coefficients of Amino Acids). Thanks to DK (see comments) for correcting my earlier mistake.

Bada, J. (1991) Amino acid cosmogeochemistry. Phil. Trans. R. Soc. Lond. 333:349-358.

Glavin, D.P. and Dworkin, J.P. (2009) Enrichment of the amino acid l-isovaline by aqueous alteration on CI and CM meteorite parent bodies. Proc. Natl. Acad. Sci. (USA) 106:5487-5492. [DOI: 10.1073/pnas.0811618106]

Thursday, April 23, 2009

Nobel Laureate: Sir Paul Nurse

 

The Nobel Prize in Physiology or Medicine 2001

"for their discoveries of key regulators of the cell cycle"

Sir Paul M. Nurse (1949 - ) won the Nobel Prize in 2001 for his contributions to understanding the regulation of the cell cycle in yeast cells. His co-recipients were Leland Hartwell and Tim Hunt.

A major part of most signaling pathways is the phosphorylation of proteins. The attachment of a phosphate group to a protein can convert it from an active state to an inactive state, or vice versa. Enzymes that catalyze phosphorylations are called "kinases" and one of the most common kinases is cyclin-dependent kinase or CDK. The activity of the kinase is itself regulated by proteins called cyclins.

Cyclins are produced at various stages of the cell cycle as it progresses from a growth state through mitosis and cell division as shown below in the press release. Paul Nurse's contribution to understanding the cell cycle was to characterize the cyclin-dependent kinase.

Press Release

Summary

All organisms consist of cells that multiply through cell division. An adult human being has approximately 100 000 billion cells, all originating from a single cell, the fertilized egg cell. In adults there is also an enormous number of continuously dividing cells replacing those dying. Before a cell can divide it has to grow in size, duplicate its chromosomes and separate the chromosomes for exact distribution between the two daughter cells. These different processes are coordinated in the cell cycle.

This year's Nobel Laureates in Physiology or Medicine have made seminal discoveries concerning the control of the cell cycle. They have identified key molecules that regulate the cell cycle in all eukaryotic organisms, including yeasts, plants, animals and human. These fundamental discoveries have a great impact on all aspects of cell growth. Defects in cell cycle control may lead to the type of chromosome alterations seen in cancer cells. This may in the long term open new possibilities for cancer treatment.

Leland Hartwell (born 1939), Fred Hutchinson Cancer Research Center, Seattle, USA, is awarded for his discoveries of a specific class of genes that control the cell cycle. One of these genes called "start" was found to have a central role in controlling the first step of each cell cycle. Hartwell also introduced the concept "checkpoint", a valuable aid to understanding the cell cycle.

Paul Nurse (born 1949), Imperial Cancer Research Fund, London, identified, cloned and characterized with genetic and molecular methods, one of the key regulators of the cell cycle, CDK (cyclin dependent kinase). He showed that the function of CDK was highly conserved during evolution. CDK drives the cell through the cell cycle by chemical modification (phosphorylation) of other proteins.

Timothy Hunt (born 1943), Imperial Cancer Research Fund, London, is awarded for his discovery of cyclins, proteins that regulate the CDK function. He showed that cyclins are degraded periodically at each cell division, a mechanism proved to be of general importance for cell cycle control.

One billion cells per gram tissue

Cells having their chromosomes located in a nucleus and separated from the rest of the cell, so called eukaryotic cells, appeared on earth about two billion years ago. Organisms consisting of such cells can either be unicellular, such as yeasts and amoebas, or multi-cellular such as plants and animals. The human body consists of a huge number of cells, on the average about one billion cells per gram tissue. Each cell nucleus contains our entire hereditary material (DNA), located in 46 chromosomes (23 pairs of chromosomes).

It has been known for over one hundred years that cells multiply through division. It is however only during the last two decades that it has become possible to identify the molecular mechanisms that regulate the cell cycle and thereby cell division. These fundamental mechanisms are highly conserved through evolution and operate in the same manner in all eukaryotic organisms.

The phases of the cell cycle

The cell cycle consists of several phases (see figure). In the first phase (G1) the cell grows and becomes larger. When it has reached a certain size it enters the next phase (S), in which DNA-synthesis takes place. The cell duplicates its hereditary material (DNA-replication) and a copy of each chromosome is formed. During the next phase (G2) the cell checks that DNA-replication is completed and prepares for cell division. The chromosomes are separated (mitosis, M) and the cell divides into two daughter cells. Through this mechanism the daughter cells receive identical chromosome set ups. After division, the cells are back in G1 and the cell cycle is completed.

The duration of the cell cycle varies between different cell types. In most mammalian cells it lasts between 10 and 30 hours. Cells in the first cell cycle phase (G1) do not always continue through the cycle. Instead they can exit from the cell cycle and enter a resting stage (G0).

Cell cycle control

For all living eukaryotic organisms it is essential that the different phases of the cell cycle are precisely coordinated. The phases must follow in correct order, and one phase must be completed before the next phase can begin. Errors in this coordination may lead to chromosomal alterations. Chromosomes or parts of chromosomes may be lost, rearranged or distributed unequally between the two daughter cells. This type of chromosome alteration is often seen in cancer cells.

It is of central importance in the fields of biology and medicine to understand how the cell cycle is controlled. This year's Nobel Laureates have made seminal discoveries at the molecular level of how the cell is driven from one phase to the next in the cell cycle.

Cell cycle genes in yeast cells

Leland Hartwell realized already at the end of the 1960s the possibility of studying the cell cycle with genetic methods. He used baker's yeast, Saccharomyces cerevisiae, as a model system, which proved to be highly suitable for cell cycle studies. In an elegant series of experiments 1970-71, he isolated yeast cells in which genes controlling the cell cycle were altered (mutated). By this approach he succeeded to identify more than one hundred genes specifically involved in cell cycle control, so called CDC-genes (cell division cycle genes). One of these genes, designated CDC28 by Hartwell, controls the first step in the progression through the G1-phase of the cell cycle, and was therefore also called "start".

In addition, Hartwell studied the sensitivity of yeast cells to irradiation. On the basis of his findings he introduced the concept checkpoint, which means that the cell cycle is arrested when DNA is damaged. The purpose of this is to allow time for DNA repair before the cell continues to the next phase of the cycle. Later Hartwell extended the checkpoint concept to include also controls ensuring a correct order between the cell cycle phases.

A general principle

Paul Nurse followed Hartwell's approach in using genetic methods for cell cycle studies. He used a different type of yeast, Schizosaccharomyces pombe, as a model organism. This yeast is only distantly related to baker's yeast, since they separated from each other during evolution more than one billion years ago.

In the middle of the 1970s, Paul Nurse discovered the gene cdc2 in S. pombe. He showed that this gene had a key function in the control of cell division (transition from G2 to mitosis, M). Later he found that cdc2 had a more general function. It was identical to the gene ("start") that Hartwell earlier had identified in baker's yeast, controlling the transition from G1 to S.

This gene (cdc2) was thus found to regulate different phases of the cell cycle. In 1987 Paul Nurse isolated the corresponding gene in humans, and it was later given the name CDK1 (cyclin dependent kinase 1). The gene encodes a protein that is a member of a family called cyclin dependent kinases, CDK. Nurse showed that activation of CDK is dependent on reversible phosphorylation, i.e. that phosphate groups are linked to or removed from proteins. On the basis of these findings, half a dozen different CDK molecules have been found in humans.

The discovery of the first cyclin

Tim Hunt discovered the first cyclin molecule in the early 1980s. Cyclins are proteins formed and degraded during each cell cycle. They were named cyclins because the levels of these proteins vary periodically during the cell cycle. The cyclins bind to the CDK molecules, thereby regulating the CDK activity and selecting the proteins to be phosphorylated.

The discovery of cyclin, which was made using sea urchins, Arbacia, as a model system, was the result of Hunt's finding that this protein was degraded periodically in the cell cycle. Periodic protein degradation is an important general control mechanism of the cell cycle. Tim Hunt later discovered cyclins in other species and found that also the cyclins were conserved during evolution. Today around ten different cyclins have been found in humans.

The engine and the gear box of the cell cycle

The three Nobel Laureates have discovered molecular mechanisms that regulate the cell cycle. The amount of CDK-molecules is constant during the cell cycle, but their activities vary because of the regulatory function of the cyclins. CDK and cyclin together drive the cell from one cell cycle phase to the next. The CDK-molecules can be compared with an engine and the cyclins with a gear box controlling whether the engine will run in the idling state or drive the cell forward in the cell cycle.

A great impact of the discoveries

Most biomedical research areas will benefit from these basic discoveries, which may result in broad applications within many different fields. The discoveries are important in understanding how chromosomal instability develops in cancer cells, i.e. how parts of chromosomes are rearranged, lost or distributed unequally between daughter cells. It is likely that such chromosome alterations are the result of defective cell cycle control. It has been shown that genes for CDK-molecules and cyclins can function as oncogenes. CDK-molecules and cyclins also collaborate with the products of tumour suppressor genes (e.g. p53 and Rb) during the cell cycle.

The findings in the cell cycle field are about to be applied to tumour diagnostics. Increased levels of CDK-molecules and cyclins are sometimes found in human tumours, such as breast cancer and brain tumours. The discoveries may in the long term also open new principles for cancer therapy. Already now clinical trials are in progress using inhibitors of CDK-molecules.


The different phases of the cell cycle. In the first phase (G1) the cell grows. When it has reached a certain size it enters the phase of DNA-synthesis (S) where the chromosomes are duplicated. During the next phase (G2) the cell prepares itself for division. During mitosis (M) the chromosomes are separated and segregated to the daughter cells, which thereby get exactly the same chromosome set up. The cells are then back in G1 and the cell cycle is completed.

This year's Nobel Laureates, using genetic and molecular biology methods, have discovered mechanisms controlling the cell cycle. CDK-molecules and cyclins drive the cell from one phase to the next. The CDK-molecules can be compared with an engine and the cyclins with a gear box controlling whether the engine will run in the idling state or drive the cell forward in the cell cycle.


[Photo Credit: Harvard University]

The images of the Nobel Prize medals are registered trademarks of the Nobel Foundation (© The Nobel Foundation). They are used here, with permission, for educational purposes only.

Wednesday, April 22, 2009

It's a Beauty Pageant - What Did you Expect?

 
I think it's safe to assume that most of you don't watch beauty pageants. However, I think it's also a safe bet that you've seen excerpts from at least one or two while you were waiting for M*A*S*H or Star Trek reruns to begin.

Remember the impromptu speeches where the contestants showed us why "intelligent" and "beauty queen" don't belong together in the same sentence? Some of the responses have become classics. It's what we expect from women who enter beauty contests. What's the big deal?

According to The Chicago Sun-Times, during the latest Miss America contest one of the contestants, Miss Arizona, was asked "Do you think the U.S. should have universal health care as a right of citizenship? Why or why not?"

Her response was, "I think this is an issue of integrity regardless of which end of the political spectrum that I stand on. I've been raised in a family to know right from wrong, and politics, whether or not you fall in the middle, the left or the right, it's an issue of integrity, whatever your opinion is and I say that with the upmost conviction."

Right. That's exactly why most of us don't pay any attention to these shows. The only surprise here is that she didn't mention world peace or freedom.

The blogosphere is all aglow over the response of Miss California.


What's the problem? Were you expecting an intelligent answer from someone with a fake smile and ten pounds of makeup?

Here's another quote on the same topic. Can you guess who said it?1
"I'm a Christian. And so, although I try not to have my religious beliefs dominate or determine my political views on this issue, I do believe that tradition, and my religious beliefs say that marriage is something sanctified between a man and a woman."
Here are two more people who have announced on television that they are personally opposed to gay marriage. They are a little bit more important than Miss California.




1. Barack Obama during an interview with the Chicago Daily Tribune, as reported on about.com: Lesbian Life.

Facts Supporting Intelligent Design Creationism?

 
We all know that the Intelligent Design Creationist movement consists almost exclusively of attacks on science. The idea seems to be that if you can cast doubt on evolution then this is evidence in favor of God.

Some unnamed Professor has challenged students to come up with facts that support Intelligent Design Creationism. The only criterion is that the "fact can be any observation in biology that is substantiated by publication in a scientific journal."

Casey Luskin attempts to meet the challenge over on the Discovery Institute propaganda site, Evolution News & Views: Helping Students Answer a Professor's Challenge to "Find a Fact" That Supports Intelligent Design (Part 2).

Here's a list of "scientific" publications submitted by Luskin. I haven't read all of them but, of the ones I've read, there isn't a single one containing a fact that supports the existence of God, let alone evidence that he/she designed anything at all. Furthermore, many of them aren't from a scientific journal. It looks like Casey Luskin has goofed, once again.

Let me know if any of these publications contain evidence of Intelligent Design Creationism. Is this the best they can do? (I've put asterisks in front of the ones I've read.)
*Douglas D. Axe, "Extreme Functional Sensitivity to Conservative Amino Acid Changes on Enzyme Exteriors," Journal of Molecular Biology, Vol. 301:585-595 (2000)

*Douglas D. Axe, "Estimating the Prevalence of Protein Sequences Adopting Functional Enzyme Folds," Journal of Molecular Biology, 1-21 (2004)

*Michael Behe, Darwin's Black Box: The Biochemical Challenge to Evolution (Free Press, 1996)

*Michael J. Behe & David W. Snoke, "Simulating Evolution by Gene Duplication of Protein Features That Require Multiple Amino Acid Residues," Protein Science, Vol 13:2651-2664 (2004)

Geoff Brumfiel, “Outrageous Fortune,” Nature, Vol. 439: 10-12 (Jan. 5, 2006)

Bract, "Inventions, Algorithms, and Biological Design," in Progress in Complexity, Information, and Design (Vol. 1.1, 2002)

*William A. Dembski, The Design Inference: Eliminating Chance Through Small Probabilities (Cambridge University Press, 1998)

a. William A. Dembski and Robert J. Marks II, “Conservation of Information in Search: Measuring the Cost of Success” (In publication, 2009)

b. William A. Dembski and Robert J. Marks II, “The Search for a Search: Measuring the Information Cost of Higher Level Search” (In publication, 2009)

*William Dembski and Jonathan Wells, The Design of Life: Discovering Signs of Intelligence in Living Systems, (FTE, 2008) (see www.thedesignoflife.net)

*Wayt T. Gibbs, “The Unseen Genome: Gems among the Junk,” Scientific American (November, 2003)

Guillermo Gonzalez and Jay Wesley Richards, The Privileged Planet: How our Place in the Cosmos is Designed for Discovery, (Regnery, 2004)

*Graham Lawton, "Why Darwin was wrong about the tree of life," New Scientist (January 21, 2009)

Hiroaki Kitano, “Systems Biology: A Brief Overview,” Science, Vol. 295: 1662-1664 (March 1, 2002)

Wolf-Ekkehard Lönnig, "Dynamic genomes, morphological stasis, and the origin of irreducible complexity," in Dynamical Genetics pp. 101-119 (Valerio Parisi, Valeria De Fonzo, and Filippo Aluffi-Pentini eds., 2004)

Casey Luskin, “Human Origins and Intelligent Design,” Progress in Complexity and Design, (Vol 4.1, November, 2005)

Casey Luskin, "Intelligent Design Has Scientific Merit in Paleontology," part of the "Does Intelligent Design Have Merit?" debate at OpposingViews.com (September, 2008)

*Wojciech Makalowski, “Not Junk After All,” Science, Vol. 300(5623) (May 23, 2003)

Stephen C. Meyer, Marcus Ross, Paul Nelson & Paul Chien, "The Cambrian Explosion: Biology's Big Bang," in Darwinism, Design, and Public Education (John A. Campbell and Stephen C. Meyer eds., Michigan State University Press, 2003)

*a. Stephen C. Meyer, “The Cambrian Information Explosion,” in Debating Design (edited by Michael Ruse and William Dembski; Cambridge University Press 2004)

b. Stephen C. Meyer, “The origin of biological information and the higher taxonomic categories,” Proceedings of the Biological Society of Washington, Vol. 117(2):213-239 (2004)

Scott A. Minnich & Stephen C. Meyer, “Genetic analysis of coordinate flagellar and type III regulatory circuits in pathogenic bacteria,” in Proceedings of the Second International Conference on Design & Nature, Rhodes Greece (M.W. Collins & C.A. Brebbia eds., 2004)

Paul Nelson and Jonathan Wells, “Homology in Biology,” in Darwinism, Design, and Public Education, (Michigan State University Press, 2003)

*Richard v. Sternberg, "On the Roles of Repetitive DNA Elements in the Context of a Unified Genomic–Epigenetic System," Annals of the New York Academy of Sciences, Vol. 981: 154–188 (2002)

J.T. Trevors and D.L. Abel, "Chance and necessity do not explain the origin of life," Cell Biology International, Vol. 28: 729-739 (2004)

D. L. Abel & J. T. Trevors, “Self-organization vs. self-ordering events in life-origin models," Physics of Life Reviews, Vol. 3: 211–228 (2006)

Øyvind Albert Voie, "Biological function and the genetic code are interdependent," Chaos, Solitons and Fractals, Vol. 28:1000–1004 (2006)

Jonathan Wells, "Using Intelligent Design Theory to Guide Scientific Research" Progress in Complexity, Information, and Design (Vol. 3.1.2, November 2004)

Jonathan Wells, "Do Centrioles Generate a Polar Ejection Force?," Rivista di Biologia / Biology Forum, Vol. 98:71-96 (2005)


The Trouble with NCSE

 
I am a member of National Center for Science Education (NCSE). That does not mean I agree with everything they do. My biggest bone of contention is over their accommodationist tactics [National Academies: Science, Evolution and Creationism, Appeasers, Spaghetti Monsters, and NCSE].

I don't like the fact that NCSE cozies up to theistic evolutionists like Ken Miller and Francis Collins while, at the same time, actively distancing itself from vocal atheist scientists like Richard Dawkins. I think NCSE shouldn't take sides and shouldn't promote the idea that science and religion are compatible.

Jerry Coyne agrees. He has published a lengthy essay on his blog where he takes NCSE to task [Truckling to the Faithful: A Spoonful of Jesus Helps Darwin Go Down]. I have warned many people at NCSE that they risk losing the support of non-theist scientists but, for the most part, they think the risk is worth it.

I wonder if they still think that way?


Tuesday, April 21, 2009

How to Evaluate Genome Level Transcription Papers

It's often very difficult to evaluate the results of large-scale genome studies. Part of the problem is that the technology is complicated and the controls are not obvious. Part of the problem is that the results depend a great deal on the software used to analyze the data and the limitations of the software are often not described.

But those aren't the only problems. We also have to take into consideration the biases of the people who write the papers. Some of those biases are the same ones we see in other situations except that they are less obvious in the case of large-scale genome studies.

Laurence Hurst has written up a nice summary of the problem and I'd like to quote from his recent paper (Hurst, 2009).
In the 1970s and 80s there was a large school of evolutionary biology, much of it focused on understanding animal behavior, that to a first approximation assumed that whatever trait was being looked at was the product of selection. Richard Dawkins is probably the most widely known advocate for this school of thought, John Maynard Smith and Bill (WD) Hamilton its main proponents. The game played in this field was one in which ever more ingenious selectionist hypotheses would be put forward and tested. The possibility that selection might not be the answer was given short shrift.

By contrast, during the same period non-selectionist theories were gaining ground as the explanatory principle for details seen at the molecular level. According to these models, chance plays an important part in determining the fate of a new mutation – whether it is lost or spreads through a population. Just as a neutrally buoyant particle of gas has an equal probability of diffusing up or down, so too in Motoo Kimura's neutral theory of molecular evolution an allele with no selective consequences can go up or down in frequency, and sometimes replace all other versions in the population (that is, it reaches fixation). An important extension of the neutral theory (the nearly-neutral theory) considers alleles that can be weakly deleterious or weakly advantageous. The important difference between the two theories is that in a very large population a very weakly deleterious allele is unlikely to reach fixation, as selection is given enough opportunity to weed out alleles of very small deleterious effects. By contrast, in a very small population a few chance events increasing the frequency of an allele can be enough for fixation. More generally then, in large populations the odds are stacked against weakly deleterious mutations and so selection should be more efficient in large populations.

In this framework, mutations in protein-coding genes that are synonymous – that is, that replace one codon with another specifying the same amino acid and, therefore, do not affect the protein – or mutations in the DNA between genes (intergene spacers) are assumed to be unaffected by selection. Until recently, a neutralist position has dominated thinking at the genomic/molecular level. This is indeed reflected in the use of the term 'junk DNA' to describe intergene spacer DNA.

These two schools of thought then could not be more antithetical. And this is where genome evolution comes in. The big question for me is just what is the reach of selection. There is little argument about selection as the best explanation for gross features of organismic anatomy. But what about more subtle changes in genomes? Population genetics theory can tell you that, in principle, selection will be limited when the population comprises few individuals and when the strength of selection against a deleterious mutation is small. But none of this actually tells you what the reach of selection is, as a priori we do not know what the likely selective impact of any given mutation will be, not least because we cannot always know the consequences of apparently innocuous changes. The issue then becomes empirical, and genome evolution provides a plethora of possible test cases. In examining these cases we can hope to uncover not just what mutations selection is interested in, but also to discover why, and in turn to understand how genomes work. Central to the issue is whether our genome is an exquisite adaption or a noisy error-prone mess.
Sandwalk readers will be familiar with this problem. In the context of genome studies, the adaptationist approach is most often reflected as a bias in favor of treating all observations as evidence of functionality. If you detect it, then it must have been selected. If it was selected, it must be important.
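Hurst's contrast between large and small populations can be made concrete with a standard textbook result that is not part of his paper or this post: Kimura's diffusion approximation for the probability that a new mutation reaches fixation. The sketch below assumes a diploid population whose effective size equals its census size N, a new mutation starting at frequency 1/(2N), and an illustrative selection coefficient of -0.0001. The function name and the chosen values are mine.

```python
import math

def fixation_probability(s, n):
    """Kimura's diffusion approximation for a new mutation in a diploid
    population of size n (assumed equal to the effective size).
    s is the selection coefficient; s < 0 means deleterious."""
    if s == 0:
        return 1.0 / (2 * n)          # neutral case: initial frequency
    exponent = -4 * n * s
    if exponent > 700:                # math.expm1 would overflow here;
        return 0.0                    # the probability is effectively zero
    return math.expm1(-2 * s) / math.expm1(exponent)

# Compare a neutral allele with a very weakly deleterious one (assumed s).
for n in (100, 10_000, 1_000_000):
    neutral = fixation_probability(0.0, n)
    weak = fixation_probability(-1e-4, n)
    print(f"N={n:>9,}: neutral {neutral:.3e}, "
          f"s=-1e-4 {weak:.3e}, ratio {weak / neutral:.3e}")
```

Under these assumptions the weakly deleterious allele behaves almost like a neutral one when N is 100, but has essentially no chance of fixing when N is a million, which is the sense in which selection is "more efficient" in large populations.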

As Hurst points out, the real question in evaluating genome studies boils down to a choice between an exquisitely adapted genome or one that is messy and full of mistakes. The battlefields are studies on the frequency of alternative splicing, transcription, the importance of small RNAs, and binding sites for regulatory proteins.

Let's take transcription studies as an example.
Consider, for example, the problem of transcription. Although maybe only 5% of the human genome comprises genes encoding proteins, the great majority of the DNA in our genome is transcribed into RNA [1]. In this the human genome is not unusual. But is all this transcription functionally important? The selectionist model would propose that the transcription is physiologically relevant. Maybe the transcripts specify previously unrecognized proteins. If not, perhaps the transcripts are involved in RNA-level regulation of other genes. Or the process of transcription may be important in keeping the DNA in a configuration that enables or suppresses transcription from closely linked sites.

The alternative model suggests that all this excess transcription is unavoidable noise resulting from promiscuity of transcription-factor binding. A solid defense can be given for this. If you take 100 random base pairs of DNA and ask what proportion of the sequence matches some transcription factor binding site in the human genome, you find that upwards of 50% of the random sequence is potentially bound by transcription factors and that there are, on average, 15 such binding sites per 100 nucleotides. This may just reflect our poor understanding of transcription factor binding sites, but it could also mean that our genome is mostly transcription factor binding site. If so, transcription everywhere in the genome is just so much noise that the genome must cope with.
There is no definitive solution to this conflict. Both sides have passionate advocates and right now you can't choose one over the other. My own bias is that most of the transcription is just noise—it is not biologically relevant.
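The kind of calculation Hurst describes (take random DNA and ask how much of it matches a binding site) can be sketched in a few lines. This is only an illustration of the method, not a reproduction of his numbers. The five motifs are well-known consensus sequences that I picked as placeholders, matching is exact and on one strand only, and a real analysis would use hundreds of degenerate motifs from a transcription factor database, so the coverage printed here will be far lower than the figures he quotes.

```python
import random

# Placeholder motif set (assumption): a handful of classic consensus sites.
MOTIFS = ["TATAAA", "CACGTG", "GGGCGG", "TGACTCA", "CCAAT"]

def covered_fraction(seq, motifs):
    """Return (number of exact motif hits, fraction of positions covered
    by at least one hit). Assumes seq is non-empty."""
    covered = [False] * len(seq)
    hits = 0
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits += 1
            for i in range(start, start + len(motif)):
                covered[i] = True
            start = seq.find(motif, start + 1)   # allow overlapping hits
    return hits, sum(covered) / len(seq)

# Average over many random 100 bp sequences with equal base frequencies.
rng = random.Random(0)
trials = 1000
total_hits = 0
total_fraction = 0.0
for _ in range(trials):
    seq = "".join(rng.choices("ACGT", k=100))
    hits, fraction = covered_fraction(seq, MOTIFS)
    total_hits += hits
    total_fraction += fraction
print(f"mean hits per 100 bp: {total_hits / trials:.3f}")
print(f"mean fraction covered: {total_fraction / trials:.3%}")
```

Swapping in a larger, degenerate motif collection is what pushes the coverage up, and that dependence on the motif set is exactly why the result may reflect our poor understanding of binding sites as much as the genome itself.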

That's not the point, however. The point is that as a reader of the scientific literature you have to make up your mind whether the data and the interpretation are believable.

Here are two criteria that I use to evaluate a paper on genome level transcription.
  1. I look to see whether the authors are aware of the adaptation vs noise controversy. If they completely ignore the possibility that what they are looking at could be transcriptional noise, then I tend to dismiss the paper. It is not good science to ignore alternative hypotheses. Furthermore, such papers will hardly ever have controls or experiments that attempt to falsify the adaptationist interpretation. That's because they are unaware of the fact that a controversy exists.1
  2. Does the paper have details about the abundance of individual transcripts? If the paper is making the case for functional significance then one of the important bits of evidence is reporting on the abundance of the rare transcripts. If the authors omit this bit of information, or skim over it quickly, then you should be suspicious. Many of these rare transcripts are present in less than one or two copies per cell and that's perfectly consistent with transcriptional noise, even if it's only one cell type that's expressing the RNA. There aren't many functional roles for an RNA whose concentration is in the nanomolar range. Critical thinkers will have thought about the problem and be prepared to address it head-on.
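To see what "one or two copies per cell" means as a concentration, divide the number of molecules by Avogadro's number times the cell volume. The short calculation below is a sketch; the two volumes are round-number assumptions of mine (about 1 femtolitre for a bacterium-sized compartment and about 1 picolitre for a mammalian-scale cell), not values taken from any of the papers discussed.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molar_concentration(copies, volume_litres):
    """Concentration in mol/L of `copies` molecules in the given volume."""
    return copies / (AVOGADRO * volume_litres)

# Assumed, round-number volumes for illustration only.
volumes = [
    ("1 fL (roughly bacterium-sized)", 1e-15),
    ("1 pL (mammalian-scale, assumed)", 1e-12),
]
for label, volume in volumes:
    for copies in (1, 2):
        conc = molar_concentration(copies, volume)
        print(f"{copies} copy/copies in {label}: {conc:.2e} M")
```

With these assumptions a single molecule works out to roughly a nanomolar concentration in the smaller volume and roughly a thousand times less in the larger one, so the exact figure depends heavily on which volume you think is relevant.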


1. Or, maybe they know there's a controversy but they don't want you to be thinking about it as you read their paper. Or, maybe they think the issue has been settled and the "messy" genome advocates have been routed. Either way, these are not authors you should trust.

Hurst, L.D. (2009) Evolutionary genomics and the reach of selection. Journal of Biology 8:12 [DOI:10.1186/jbiol113]

Monday's Molecule #118: Winners

 
UPDATE: The molecule is cyclin-dependent kinase 2 (CDK2), a protein involved in signaling [PDB 1b38]. The Nobel Laureate is Paul Nurse.

This week's winners are Mike Fraser of Toronto and Alex Ling of the University of Toronto.


This is a very famous protein but most of you won't be able to identify it from the structure alone. You'll need a hint of some sort.

Letting you know that the ligands are Mg2+ and adenosine-5′-triphosphate might not be enough so I'll also tell you that one of the authors on the structure paper was M.E. Noble.

There is one Nobel Laureate who is most closely identified with the function of this particular molecule, although that scientist was NOT the first to identify it. You have to identify the Nobel Laureate who got the prize for working out the function of the protein.

The first person to identify the molecule and the Nobel Laureate wins a free lunch at the Faculty Club. Previous winners are ineligible for one month from the time they first won the prize.

There are six ineligible candidates for this week's reward: Bill Chaney of the University of Nebraska, Elvis Cela from the University of Toronto, Peter Horwich from Dalhousie University, Devin Trudeau from the University of Toronto, Shumona De of Dalhousie University, and Maria Altshuler of the University of Toronto.

I note that Canadians are trouncing the rest of the world. That's as it should be.

I still have one extra free lunch donated by a previous winner to a deserving undergraduate so I'm going to continue to award an additional free lunch to the first undergraduate student who can accept it. Please indicate in your email message whether you are an undergraduate and whether you can make it for lunch.

THEME:

Nobel Laureates
Send your guess to Sandwalk (sandwalk (at) bioinfo.med.utoronto.ca) and I'll pick the first email message that correctly identifies the molecule and names the Nobel Laureate(s). Note that I'm not going to repeat Nobel Prizes so you might want to check the list of previous Sandwalk postings by clicking on the link in the theme box.

Correct responses will be posted tomorrow.

Comments will be blocked for 24 hours. Comments are now open.



Sequenced genomes contain thousands of "unknown" genes

 
The total number of genes in the human genome has dropped from the initial estimates of 30-35,000 to about 25,000. Of these, more than 4,000 encode functional RNAs, leaving about 20,500 protein-encoding genes in the human genome [Humans Have Only 20,500 Protein-Encoding Genes].

Up to 40% of these protein-encoding genes are "unknown" in the sense that no function has been assigned to their protein products. In the jargon of genomics, the genes are "unannotated," meaning that nobody has assigned a function to the gene in the human genome database (Reichardt, 2007).

That means roughly 8,000 unknown genes. About 1,000 of these genes are "orphan" genes: genes that have no homologues in other species, including chimpanzees (Clamp, 2007).
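The arithmetic behind these estimates is easy to check. The snippet below only uses the rounded figures quoted above (20,500 protein-encoding genes, up to 40% unknown, about 1,000 orphans); the variable names are mine and the results are upper-bound estimates, not counts from the database.

```python
protein_coding_genes = 20_500   # from the post
unknown_fraction = 0.40         # "up to 40%", so this is an upper bound
orphan_genes = 1_000            # approximate figure from the post

unknown_genes = round(protein_coding_genes * unknown_fraction)
orphan_share = orphan_genes / unknown_genes

print(f"unknown protein-encoding genes (upper bound): {unknown_genes:,}")
print(f"orphans as a share of the unknown genes: {orphan_share:.1%}")
```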

Humans aren't unique. All sequenced eukaryotic genomes have a high percentage (~30-40%) of "unknown" protein-encoding genes.

A new paper in PLoS One looks at the "unknown" genes in the filamentous fungus Neurospora crassa (pink bread mold) (Kasuga et al. 2009). The Neurospora genome has about 9,000 protein-encoding genes and more than half of them have not been annotated. They are the "unknown" genes.

The genomes of about 40 different species of fungus have been sequenced and many of these are filamentous fungi related to Neurospora. What this means is that it's possible to compare the Neurospora genes to those in many different genomes from closely related species; those that are part of the same family (less closely related); part of the same phylum; and distantly related. You can't do such an extensive study with human genomes because there aren't very many mammalian genomes that have been sequenced and carefully annotated. A draft sequence of the chimpanzee genome, for example, has been published but it is neither complete nor reliable enough for genomic comparisons. The only other primate genome is from macaque (Rhesus monkey) and that's far from finished. (The human and mouse genomes are the only ones listed as "complete" on the NCBI/Entrez website.)

The question is: are the unknown genes confined to Neurospora and its close relatives? If so, it would suggest that new genes have evolved within the past several million years and that's why we don't know their function.

Kasuga et al. created six sets of genes ...
  1. Genes with homologs in distantly related eukaryotes and possibly prokaryotes. These are ancient genes.
  2. Genes that are only found in fungi and not in plants or animals or protists (Dikarya).
  3. Genes found only in Ascomycetes.
  4. Genes confined to the Pezizomycotina clade to which Neurospora belongs.
  5. Genes found only in Neurospora.
  6. Others: genes that are found in some of the first groupings but not in all the smaller groupings.
The classification depends on the similarity cutoff. If the lowest cutoff is 25% sequence identity, then there will be more homologs in the eukaryote or prokaryote class than if the cutoff is raised to 35%. The distribution of the various classes at each of three minimum sequence identity cutoffs is shown in their second figure.
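The effect of the cutoff on the classification can be illustrated with a toy example. This is a sketch of the general idea, not the pipeline Kasuga et al. actually used. It assumes you already have, for each gene, the best percent identity of any hit within each taxonomic tier, and it ignores the "Others" class of mixed patterns. The tier labels, the function name, and the identity values for the example gene are all made up.

```python
# Tiers ordered from most distantly related to most closely related.
TIERS = ["eukaryote/prokaryote", "Dikarya", "Ascomycota", "Pezizomycotina"]

def classify(best_identity, cutoff):
    """Assign a gene to the broadest tier that has a hit at or above the
    percent-identity cutoff; genes with no qualifying hit are orphans."""
    for tier in TIERS:
        if best_identity.get(tier, 0.0) >= cutoff:
            return tier
    return "Neurospora orphan"

# Hypothetical gene: best percent identity of its hits in each tier.
gene = {
    "eukaryote/prokaryote": 28.0,
    "Dikarya": 33.0,
    "Ascomycota": 41.0,
    "Pezizomycotina": 62.0,
}

for cutoff in (25, 30, 35):
    print(f"cutoff {cutoff}%: {classify(gene, cutoff)}")
```

The same gene counts as ancient at a 25% cutoff but drops into progressively narrower classes as the cutoff rises, which is why the class sizes shift between the three thresholds.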


Taking the 30% threshold numbers (middle group), it looks like there are 2,358 highly conserved genes with homologs in distantly related eukaryotes and prokaryotes. In contrast, there are 2,219 genes that don't have homologs in any other species. These are the orphan genes in Neurospora.

You might expect that most of the unknown/unannotated genes would be confined to Neurospora and closely related species. You might expect that highly conserved genes would be more likely to have been identified. That's partly true. Here are the numbers.


Only 16.5% of the highly conserved genes are mystery genes of unknown function. While this is much lower than the total (56%), it's still surprising that so many of the core genes remain unidentified. Presumably they are doing something very important. There are dozens of thesis projects available for talented graduate students who want to make a valuable contribution to biology.

It's not a surprise that 94% of the orphans are unannotated. These genes are likely to be new genes that have evolved recently in Neurospora and they would be expected to carry out unusual reactions that aren't found in other species. These "genes" are also the ones most likely to be artifacts (false positives) of the gene searching software. They may not be genes at all.
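
For a rough sense of scale, the percentages can be turned into approximate gene counts. This back-of-the-envelope sketch assumes that the 16.5% and 94% figures apply to the class sizes quoted above for the 30% identity threshold (2,358 and 2,219 genes). That pairing is my assumption, and the variable names are just for illustration, so treat the results as estimates rather than numbers from the paper.

```python
# Rough estimate only. Assumes the quoted percentages apply to the
# class sizes at the 30% identity cutoff; that pairing is an assumption.
conserved_genes = 2358        # highly conserved class (from the text)
orphan_genes = 2219           # Neurospora-only class (from the text)

unknown_conserved_fraction = 0.165   # 16.5% of conserved genes are unannotated
unknown_orphan_fraction = 0.94       # 94% of orphans are unannotated

unknown_conserved = round(conserved_genes * unknown_conserved_fraction)
unknown_orphans = round(orphan_genes * unknown_orphan_fraction)

print(unknown_conserved)  # 389
print(unknown_orphans)    # 2086
```

On these assumptions, that is nearly 400 conserved genes with no known function, and about 2,100 unannotated orphans.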


[Image Credit: Neurospora-National Institute of General Medical Sciences]

Clamp, M., Fry, B., Kamal, M., Xie, X., Cuff, J., Lin, M.F., Kellis, M., Lindblad-Toh, K. and Lander, E.S. (2007) Distinguishing protein-coding and noncoding genes in the human genome. Proc. Natl. Acad. Sci. (USA) 104:19428-19433. [DOI:10.1073/pnas.0709013104]

Kasuga, T., Mannhaupt, G., and Glass, N.L. (2009) Relationship between Phylogenetic Distribution and Genomic Features in Neurospora crassa. PLoS ONE 4(4):e5286. [DOI:10.1371/journal.pone.0005286]

Reichardt, J.K.V. (2007) Quo vadis, genoma? A call to pipettes for biochemists. Trends in Biochemical Sciences (TIBS) 32:529-530. [DOI:10.1016/j.tibs.2007.10.001]