Sandwalk: ORFans

Sunday, November 26, 2006

ORFans

Over on talk.origins there's a discussion about ORFans. It was started by referring to an article from The Christian Post that reported on a talk given by Paul Nelson. According to Nelson, the presence of ORFan genes in bacterial genomes represents a serious challenge to evolution.

Ernest Major posted a nice analysis of the paper with references to the many explanations of the origin of ORFans. I'd like to add a bit more to his description of the "problem."

Here's the primary reference ...

Yin, Y. and Fischer, D. (2006) On the origin of microbial ORFans: quantifying the strength of the evidence for viral lateral transfer. BMC Evolutionary Biology 6:63.

ORF stands for "open reading frame," a term that refers to a stretch of codons for amino acids. An ORF probably identifies a protein-encoding gene. In order to be meaningful, the ORF should: (a) begin with a start codon, (b) end with a termination codon, and (c) contain a minimum number of codons (typically more than 100).

In this age of genomics and bioinformatics, there are computer programs that scan both strands of DNA to identify ORF's. These are putative genes. When the first genomes were sequenced there were a lot of putative genes that matched sequences already in the database. In other words, the computer programs identified ORF's that showed significant sequence similarity to individual genes that had already been cloned and sequenced by other labs. These genomic ORF's represented genes that were homologous to known genes.
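The three criteria above, applied to both strands, can be sketched in a few lines of Python. This is only a toy scan, not the method used by any real annotation program: the function names are made up for illustration, only ATG is treated as a start codon (bacteria also use alternatives such as GTG and TTG), and the length threshold is a parameter rather than a fixed rule.

```python
# Toy ORF scan for illustration; names and defaults here are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"  # simplification: alternative start codons are ignored


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=100):
    """Return (strand, frame, start, end) for ORFs found on both strands.

    Coordinates are 0-based offsets on the strand that was scanned.
    `end` is exclusive and includes the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i  # (a) begins with a start codon
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3  # codons before the stop
                    if n_codons >= min_codons:  # (c) long enough
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # (b) ended by a termination codon


    return orfs


if __name__ == "__main__":
    demo = "ATG" + "GCT" * 5 + "TAA"
    print(find_orfs(demo, min_codons=3))
```

Every hit from a scan like this is only a putative gene. Nothing in the procedure checks whether the sequence is transcribed or translated.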

Yin and Fischer are interested in the ORF's that aren't homologous to known genes. They concentrate on bacterial (prokaryote) genomes since the coverage is more extensive. As more and more genomes were sequenced the number of new genes represented by these non-homologous ORF's declined, as expected. Today, for every new genome that's added to the database, almost 80% of the genes have been previously identified.

The surprise is that there are so many unique ORF's in every genome. These are putative genes that have no known homologues. They are ORFans. In order to determine the number of ORFans, Yin and Fischer analyzed the complete genomes of 277 bacteria. For each and every gene they ran a search against all other genes in the database. The result was the histogram shown below.

The figure shows the distribution of all 818,906 ORF's in 277 sequenced prokaryote genomes. (A typical genome has about 3000 genes.) The bottom axis represents the frequency of each of the putative genes in the database. The tall bar at the extreme left-hand side shows the number of ORF's that are only found in a single species. These are the ORFans. There are almost 80,000 of them, or about 280 per genome. This is what the paper is all about.
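The per-genome averages can be checked with some quick arithmetic. The totals are the ones quoted above. The ORFan count is only given as "almost 80,000", so the value used here is an upper bound and not the exact number from the paper.

```python
total_orfs = 818_906    # all ORF's in the data set (from the post)
genomes = 277           # sequenced prokaryote genomes (from the post)
orfans_upper = 80_000   # "almost 80,000": an upper bound, not the exact count

genes_per_genome = total_orfs / genomes
orfans_per_genome = orfans_upper / genomes
orfan_fraction = orfans_upper / total_orfs

print(f"genes per genome: {genes_per_genome:.0f}")
print(f"ORFans per genome: at most {orfans_per_genome:.0f}")
print(f"ORFan share of all ORF's: at most {orfan_fraction:.1%}")
```

The upper bound works out to just under 290 ORFans per genome, which is consistent with "about 280" if the true count is a little below 80,000. Either way, roughly one putative gene in ten has no detectable homologue.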

There are some putative genes that are only present in one or two related species. These are represented by the bars at U=0.01, 0.02 etc. Some of these are also counted as ORFans since they are only present in closely related species.

As you can see, there's a broad peak of genes found in about 60% (U=0.6) of all sequenced prokaryote genomes. These represent the standard genes of metabolism. Hardly any genes are present in every single species (U=1.0). This is because the database may be incomplete, the genes may have diverged too far to be detectable, or the species is really missing that gene.
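A histogram like this one could be built roughly as follows. This is a sketch under assumptions: U is taken here to be the fraction of the other genomes in which an ORF has a detectable homologue, which may not match the paper's exact definition, and the ORF and genome names are invented toy data.

```python
from collections import Counter


def occurrence_fraction(hits_by_orf, n_genomes):
    """Map each ORF to the fraction of the other genomes with a homologue.

    hits_by_orf: ORF id -> set of other genomes where a homologue was found.
    """
    return {orf: len(hits) / (n_genomes - 1) for orf, hits in hits_by_orf.items()}


def histogram(fractions, n_bins=10):
    """Count ORFs per bin of U; U == 1.0 falls into the last bin."""
    counts = Counter(min(int(u * n_bins), n_bins - 1) for u in fractions.values())
    return [counts.get(b, 0) for b in range(n_bins)]


# Hypothetical toy data: five genomes, four ORFs from genome g1.
toy_hits = {
    "orfA": set(),                      # no homologue anywhere: an ORFan
    "orfB": {"g2"},
    "orfC": {"g2", "g3", "g4"},
    "orfD": {"g2", "g3", "g4", "g5"},   # present in every other genome
}

u = occurrence_fraction(toy_hits, n_genomes=5)
orfans = [orf for orf, value in u.items() if value == 0]
print(u)
print(histogram(u, n_bins=4))
print(orfans)
```

In this toy version the ORFans are simply the entries with U equal to zero. A real analysis depends heavily on the similarity cutoff used to decide what counts as a detectable homologue.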

Where did the unique genes (ORFans) come from? If they are real, it seems unlikely that they sprang into existence in a single lineage. They were most likely "borrowed" from a distantly related species by a process known as lateral gene transfer. However, as more and more genomes from diverse species are added to the database, it becomes worrisome that the source of these genes still hasn't been identified.

What about viruses? It has long been known that viral genes can be incorporated into bacterial genomes so this seems like a good possibility. Yin and Fischer screened all 818,906 ORF's against the viral database to test this hypothesis. They found that only 2.8% of bacterial ORFans have detectable homologues in the viral genomes. Thus, the transfer of viral genes to bacterial genomes doesn't seem to account for all of the ORFans.

The authors discuss the problems with their experiment and urge us not to reject the viral origin hypothesis just yet. There are only 280 bacteriophage in the viral genome database and this represents a very tiny percentage of all bacteriophage. (There may be 100 million different phage.) There are still lots of places for ORFan homologues to hide.

I think there's another problem; one that the authors are not taking seriously. It's quite possible that many of the ORFans aren't real genes at all. The computer programs that detect these ORF's are notorious for their false positives. There may be ORFan "genes" that are never transcribed or there may be ORFan "genes" that are transcribed and translated but the protein product doesn't do anything. It's an accident of evolution. In addressing this problem the authors make the common mistake of pointing to those cases where known ORFans have proven to be functional genes, while ignoring the fact that most haven't. Just because some of them are real genes doesn't mean that all of them are. If most ORFans are artifacts then it's not surprising that they aren't found in other species.
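To get a feel for how open reading frames can arise by chance, here is a back-of-the-envelope calculation. It assumes a purely random sequence in which all 64 codons are equally likely, which is not true of any real genome, and it says nothing about how actual gene-finding programs score their candidates. The function name is made up for illustration.

```python
def prob_no_stop(n_codons, stop_fraction=3 / 64):
    """Chance that n_codons consecutive random codons contain no stop codon.

    Assumes independent, equally likely codons (3 of the 64 are stops).
    """
    return (1 - stop_fraction) ** n_codons


for n in (30, 100, 300):
    print(n, f"{prob_no_stop(n):.2e}")
```

Under these assumptions a stretch of 100 codons stays open a little less than 1% of the time, so a genome with millions of possible start positions in six reading frames will contain some long ORFs that aren't genes. The three stop codons are AT-rich, so GC-rich genomes should have even more of them.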

2 comments :

Lorena said...: This is a very interesting subject that seems to hardly be addressed. It would be nice if you discussed how these ORFans could be found to be functional or not. In prokaryotic organisms these ORF can be knocked out and checked for function. But what about eukaryotes?; Thursday, February 03, 2011 8:36:00 PM
The Other Jim said...: In Drosophila, this is being done by RNA knockdown...

http://www.sciencemag.org/content/330/6011/1682.full; Friday, February 04, 2011 3:33:00 AM
