
Tuesday, November 28, 2006

Linux Commands

 

Here's a list of the top 200 Linux commands. Some of them are very useful, especially for us old fogies. For example, I use "whoami" at least once a day.

Some of them will drive you crazy. Don't even think about typing "vi" unless you're prepared to throw the keyboard into the monitor. "Format" is lots of fun, try it with every letter of the alphabet.

"Grep" isn't what you think it is.

The computer is "Darwin." It's owned by David Greig (DIG) and lives in my office. "Darwin" is the server for the newsgroups talk.origins and sci.bio.evolution. DIG moderates talk.origins with a lot of help from Darwin. Josh Hayes attempts to control sci.bio.evolution—it's a losing battle.

I think I'll try out some of the more unusual Linux commands on Darwin. I wonder what "kill" does? ....

Monday, November 27, 2006

Something to be Proud of?

 

Gene Expression has a new icon in the sidebar. Apparently the author is proud to be an appeaser.

For the record, here's what it means to be a Neville Chamberlain Atheist. It means you're happy to attack Intelligent Design Creationists like Michael Denton (Nature's Destiny) and Michael Behe (Darwin's Black Box) for mixing science and religion. But, you don't say a word when Ken Miller (Finding Darwin's God), Francis Collins (The Language of God: A Scientist Presents Evidence for Belief) and the Rev. Ted Peters (Evolution from Creation to New Creation) spout equally bad religious nonsense in the name of science.

The Neville Chamberlain Atheists object when Behe talks about intelligent design but mum's the word when Ken Miller talks about how God tweaks mutations to get what He wants. Hypocrisy is a strange thing to be proud of.

He must be joking, right?

An Inconvenient Truth

If you haven't seen it, get yourself to a video store tomorrow. Then read the debate about whether the National Science Teachers Association should accept 50,000 free DVDs [What's up NSTA?].

I don't agree with PZ on this one. The science is good but Al Gore is exploring the possibility of a run for the Democratic nomination in 2008. I saw him in action for three days at Chautauqua last summer. If it were Carl Sagan I'd say NSTA should show the DVDs in every classroom but it's silly to pretend that Al Gore isn't a politician.

Who Let Him Out on his Own?

Deepak Chopra demonstrates, once again, why we call them IDiots. PZ Myers has pointed out the foolishness in his posting on Pharyngula, "Oh, no...not more Chopra!". The thrust of the rant has something to do with seeing things in your mind. (I didn't pay much attention, it's kindergarten stuff.) Apparently, the fact that you can create an image of a yellow flower means that God exists. PZ tells it like it is but he stops short of the important part of Chopra's article where Chopra says,
Yet you assume--as do all who fall for the superstition of materialism--that flowers and the color yellow exist 'out there' in the world and are photographically reproduced by the brain, acting as a camera made of organic tissue. In fact, existence of flowers shifts mysteriously once it is closely examined. The experience of sight, sound, touch, taste, and smell is created in consciousness. Molecules don't assemble in your head to make the sound of a trumpet blaring in a brass band, for example. The brain is silent. So where does the world of sights and sounds come from?

Materialists cannot offer any reasonable explanation. The fact is that an enormous gap exists between any physical, measurable event and our perception. If I talk to you, all I am doing is vibrating air with my vocal cords. Every aspect of that event can be seen and measured, but turning those vibrating air molecules into meaningful words has never been seen or measured. It can't be.
I can't resist. Yes, Deepak, turning the vibrating air molecules that come out of your mouth into meaningful words has never been seen or measured. And your point is?

Why does the Huffington Post put up with this IDiot?

Recording Lectures

Every time I give a lecture there’s a bunch of recorders in front of me. Following the lecture, there’s an active trade in lecture recordings on our student newsgroups.

I have mixed feelings about this. On the one hand, I understand why students would want to take advantage of cheap technology to make a permanent record of my words of wisdom. :-)

On the other hand, my words aren’t always wise and I don’t want students to memorize everything I say without checking it against the textbook and other sources. Lecture recordings should be supplements to learning and not the only source. (Don’t get me started about podcasts!)

This point was brought home in one of the threads on our student forums. The students in one of our biochemistry courses had just finished a midterm exam. One of the multiple choice questions was about cholesterol. For those of you who haven’t committed the structure of cholesterol to memory—I am one—I’m including a picture. The description in the textbook (it happens to be my textbook) is ....
Steroids are a third class of lipids found in the membranes of eukaryotes, and, very rarely, in bacteria. Steroids, along with lipid vitamins and terpenes, are classified as isoprenoids because their structures are related to the five carbon molecule isoprene. Steroids contain four fused rings: three six-carbon rings designated A, B, and C and a five-carbon D ring.

I then go on to describe cholesterol, an important steroid.

Choice “C” in the multiple choice question referred to the 4-ring structure of cholesterol. It was a correct choice and the students were supposed to choose another response, which happened to be an obvious incorrect choice. Cholesterol certainly has four rings, so what’s the problem?

The day after the exam, students started complaining on the newsgroup. Apparently Prof. X (no, it wasn’t me, this isn’t my course) said in lecture that cholesterol has only three rings and students have the recording to prove it! Several students demanded that they be given a mark for choosing response C. The complaints quickly escalated with some highly indignant students demanding an extra mark on the exam. According to their logic, it is unfair for students to be penalized because the Professor made a mistake in the lecture.

Other students chimed in. They pointed out that the Professor’s notes referred to four rings and the textbook clearly shows four rings; A, B, C, and D. They suggested that their fellow students have a responsibility to study from the notes and textbook as well as the recording. If there was a discrepancy, then it was up to the student to resolve it, including asking the Professor if necessary.

One of the best responses was from student “YYZ,” who has given me permission to quote him.
I’m saying you can’t only listen to the lecture and that’s it. You have to analyze what he says, look at the slides, think over if things make sense, etc. Studying isn’t mindlessly memorizing words coming out of a professors mouth ...
By Jove, I think he’s got it! It’s refreshing to see that some students understand how to study and it’s refreshing to see them take on the whiners. That’s how things are going to change in the universities. Professors are the enemy and nothing they say has any credibility (at least in the first two years). Responsible students have to speak up.

World AIDS Day

The Faculty of Medicine and the University of Toronto are hosting a series of events this week in association with World AIDS Day (Friday, Dec. 1). Check the flyers around the campus for events near you. There will be a student presentation in my class on Wednesday prior to the symposium on Promoting Evidence-based ART in Resource-poor Settings.

Light a Candle

 
Light a candle and Bristol-Myers Squibb will donate $1, up to a total of $100,000, to the National AIDS Fund (USA). I'm usually not a fan of big PHARMA but .... why not?

Monday's Molecule #3

Name this molecule. Comments will be blocked for 24 hours. Comments are now unblocked.



BTW, you guys did great on last week's molecule but not so good on the supplemental questions in the comments. Does everyone now know the 21st, 22nd, and 23rd amino acids?

Sunday, November 26, 2006

What a Coincidence!!!

 

Listen to this What a Coincidence. That's my little girl! She's the one on the left.

Imagine No Religion

In The God Delusion, Richard Dawkins refers to John Lennon who asked us to imagine no religion. For those of you who never knew John Lennon, here he is singing Imagine. No, John, you're not the only one. (Thanks to The Scientific Indian for finding the video on Google Videos.)

The Three Domain Hypothesis (part 3)

The scientific dispute over The Three Domain Hypothesis is based on the validity of RNA trees, the importance of protein trees that disagree with the rRNA tree, the evidence for fusions, and the frequency of Lateral Gene Transfer (LGT). But, as usual, there’s more to it than just science. The side with the best advocates has a huge advantage in fights like this.

Let's set the stage by quoting from the article by William Martin.
Thus, it seems to me that there is a schisma abrew in cell evolution, with the rRNA tree and proponents of its infallibility on the one side and other forms of evidence, proponents of LGT, or proponents of a symbiotic origin of eukaryotes on the other. The former camp is well organized behind a unified view (be it right or wrong, still a view) and is arguing that we already have the answers to microbial evolution. The latter camp is not organized into castes of recognized leadership and followers, meaning that (if we are lucky) concepts and their merits, not position or power, will determine the outcome of the battle as to what ideas might or might not be worthwhile entertaining as a working hypothesis for the purpose of further scientific endeavour.
The article by Norman Pace represents the side that already has the answers. He is a strong proponent of the Three Domain Hypothesis. These days, the main thrust of his argument is that we should all jump on the bandwagon or risk being left behind. I heard him speak in San Francisco last April and he sounded more like a preacher than a scientist. His article in Nature, “Time for a Change”, is an example of the way the Three Domain Hypothesis proponents have been arguing for 20 years.

One of the key problems in deep phylogeny is choosing the right gene. Pace argues in favor of ribosomal RNA—not a surprise since he has invested over 20 years in this molecule. Ideally, what kind of gene do we want to examine in order to determine the deepest branches in the tree of life? According to Pace there are three criteria ....
1. The gene must be universal.
2. The gene must have resisted lateral gene transfer.
3. The gene must be large enough to provide useful phylogenetic information.
Only ribosomal RNA meets all three criteria, says Pace.

There’s no question about #1. Ribosomal RNA genes are found in all species. There are very few other genes that meet this criterion. Almost all other candidates are absent in at least a few species. Ribosomal RNA satisfies #3 as well. Even the small subunit is large enough.

What about #2? Which genes have “resisted” lateral gene transfer? You can’t just declare by fiat that ribosomal RNA genes haven’t been transferred. It’s a debatable question as we’ll see later on.

I would add three other criteria.
4. The gene must be unique, or if it isn’t, paralogues must be easily recognized.
5. The gene must encode a protein because it’s much more accurate to analyze amino acid sequences than nucleic acid sequences. (And easier to align.)
6. The gene must be highly conserved in order to retain significant sequence similarity at the deepest levels.
Ribosomal RNA doesn’t do so well when we add these criteria. Most bacterial genomes have multiple copies of ribosomal RNA genes. They are usually 99% similar but there are known examples of more divergent paralogues. This is not likely to be a serious problem for deep phylogeny, but it has caused problems at the species level.

Ribosomal RNA does not encode protein. That’s a serious problem that Pace never addresses.

Ribosomal RNA genes are well conserved but not as highly conserved as some others. This is why rRNA can be used to distinguish closely related species whereas the sequences of other genes are identical unless the species diverged more than 10-20 million years ago. Part of the problem with using rRNA sequences in deep phylogeny is that they are too divergent.

Having declared that ribosomal RNA genes are the best choice, Pace then goes on to show us the “true” universal tree of life. As you can see, it is divided into three distinct clusters separated by long branches. The clades represent Bacteria, Archaea, and Eukaryotes; the Three Domains. The prokaryotes (Bacteria and Archaea) seem to associate and the eukaryotes seem to be more distantly related.

But first impressions can be misleading. Pace puts the root on the branch leading to bacteria and not on the long branch leading to Eukaryotes. This root is based entirely on two old 1989 papers, which he references. Both of these papers have been refuted, but that’s not something you would learn from reading Pace’s article. (There are other, more recent, experiments that root the tree on the bacterial branch and these should have been used. The fact that they weren’t reflects Pace’s degree of critical thinking on this problem.)

To many of us, the large scale structure of the tree of life just doesn’t look right. The long branches leading from the trifurcation point to Bacteria and Eukaryotes smack of artifact. The branching within each of the domains looks too simple. It’s part of the reason why there’s skepticism about the rRNA tree, as we’ll see.

The rest of the article is a passionate defense of the importance of bacteria. I agree with him, for the most part, and so do lots of evolutionary biologists. Bacteria are much more important than eukaryotes! :-)

Pace contributes very little to the debate since he is not willing to entertain any doubts about the Three Domain Hypothesis. For that we have to look at some other papers.



Microbial Phylogeny and Evolution: Concepts and Controversies Jan Sapp, ed., Oxford University Press, Oxford UK (2005)

Jan Sapp The Bacterium’s Place in Nature

Norman Pace The Large-Scale Structure of the Tree of Life.

Wolfgang Ludwig and Karl-Heinz Schleifer The Molecular Phylogeny of Bacteria Based on Conserved Genes.

Carl Woese Evolving Biological Organization.

W. Ford Doolittle If the Tree of Life Fell, Would it Make a Sound?.

William Martin Woe Is the Tree of Life.

Radhey Gupta Molecular Sequences and the Early History of Life.

C. G. Kurland Paradigm Lost.

ORFans

Over on talk.origins there's a discussion about ORFans. It was started by referring to an article from The Christian Post that reported on a talk given by Paul Nelson. According to Nelson, the presence of ORFan genes in bacterial genomes represents a serious challenge to evolution.

Ernest Major posted a nice analysis of the paper with references to the many explanations of the origin of ORFans. I'd like to add a bit more to his description of the "problem."

Here's the primary reference ...
Yin, Y. and Fischer, D. (2006) On the origin of microbial ORFans: quantifying the strength of the evidence for viral lateral transfer. BMC Evolutionary Biology 2006, 6:63
[Get your free copy here]

ORF stands for "open reading frame," a term that refers to a stretch of codons for amino acids. The presence of an ORF means that the region probably identifies a protein-encoding gene. In order to be meaningful, the ORF should: (a) begin with a start codon, (b) end with a termination codon, and (c) contain a minimum number of codons (typically more than 100).

In this age of genomics and bioinformatics, there are computer programs that scan both strands of DNA to identify ORF's. These are putative genes. When the first genomes were sequenced there were a lot of putative genes that matched sequences already in the database. In other words, the computer programs identified ORF's that showed significant sequence similarity to individual genes that had already been cloned and sequenced by other labs. These genomic ORF's represented genes that were homologous to known genes.
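For readers who like to see the idea in code, here is a minimal sketch of an ORF scan that applies the three criteria described above to both strands of a DNA sequence. It is a toy, not what real gene-finding programs do. The function names are made up for illustration, and it assumes that ATG is the only start codon (bacteria also use alternatives such as GTG and TTG) and that the start codon counts toward the minimum length while the stop codon does not.

```python
# Toy ORF scanner: all names here are hypothetical, for illustration only.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"  # assumption: ignore alternative start codons

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs_one_strand(seq, min_codons=100):
    """Return (start, end) index pairs for ORFs in the three frames of seq.

    start is the index of the first base of the start codon and end is
    one past the last base of the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # (a) begin with a start codon
            elif start is not None and codon in STOP_CODONS:
                # (b) end with a termination codon
                n_codons = (i - start) // 3  # start codon counted, stop not
                if n_codons >= min_codons:  # (c) minimum length
                    orfs.append((start, i + 3))
                start = None
    return orfs


def find_orfs(seq, min_codons=100):
    """Scan both strands; return (strand, start, end) tuples.

    Coordinates for the "-" strand are relative to the reverse complement,
    not to the original sequence.
    """
    seq = seq.upper()
    found = [("+", s, e) for s, e in find_orfs_one_strand(seq, min_codons)]
    rc = reverse_complement(seq)
    found += [("-", s, e) for s, e in find_orfs_one_strand(rc, min_codons)]
    return found
```

Calling find_orfs(genome_sequence) with the default min_codons=100 returns every candidate on either strand. Each candidate is only a putative gene, which is why false positives matter so much in this kind of analysis.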

Yin and Fischer are interested in the ORF's that aren't homologous to known genes. They concentrate on bacterial (prokaryote) genomes since the coverage is more extensive. As more and more genomes were sequenced the number of new genes represented by these non-homologous ORF's declined, as expected. Today, for every new genome that's added to the database, almost 80% of the genes have been previously identified.

The surprise is that there are so many unique ORF's in every genome. These are putative genes that have no known homologues. They are ORFans. In order to determine the number of ORFans, Yin and Fischer analyzed the complete genomes of 277 bacteria. For each and every gene they ran a search against all other genes in the database. The result was the histogram shown below.

The figure shows the distribution of all 818,906 ORF's in 277 sequenced prokaryote genomes. (A typical genome has about 3000 genes.) The bottom axis represents the frequency of each of the putative genes in the database. The tall bar at the extreme left-hand side shows the number of ORF's that are only found in a single species. These are the ORFans. There are almost 80,000 of them; or, about 280 per genome. This is what the paper is all about.
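The per-genome numbers are easy to check. This short sketch uses only the figures quoted above. Because the text says "almost 80,000," the value 80,000 is treated as an upper bound, not as the exact count from the paper.

```python
# Back-of-the-envelope check using only the numbers quoted in the text.
total_orfs = 818_906
n_genomes = 277
orfan_upper_bound = 80_000  # assumption: "almost 80,000" used as an upper bound

genes_per_genome = total_orfs / n_genomes
orfans_per_genome = orfan_upper_bound / n_genomes
orfan_fraction = orfan_upper_bound / total_orfs

print(f"average ORFs per genome:   {genes_per_genome:.0f}")
print(f"ORFans per genome (max):   {orfans_per_genome:.0f}")
print(f"ORFan share of ORFs (max): {orfan_fraction:.1%}")
```

The average works out to just under 3,000 ORFs per genome, and the upper bound on ORFans is a little under 290 per genome, which is consistent with the "about 280" figure. ORFans make up somewhat less than 10% of all the ORFs.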

There are some putative genes that are only present in one or two related species. These are represented by the bars at U=0.01, 0.02 etc. Some of these are also counted as ORFans since they are only present in closely related species.

As you can see, there's a broad peak of genes found in about 60% (U=0.6) of all sequenced prokaryote genomes. These represent the standard genes of metabolism. Hardly any genes are present in every single species (U=1.0). This is because the database may be incomplete, the genes may have diverged too far to be detectable, or the species is really missing that gene.
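Here is a rough sketch of how a histogram like this could be tallied once the homology searches are done. It assumes that U is simply the fraction of sequenced genomes in which an ORF has a detectable homologue, and that the search results are already available as a mapping from each ORF to the set of genomes with a hit (including the ORF's own genome). The function names and the input format are hypothetical. The paper's actual pipeline may differ.

```python
from collections import Counter


def ubiquity(hits_by_orf, n_genomes):
    """Map each ORF id to U, the fraction of genomes with a detectable homologue.

    hits_by_orf is assumed to map an ORF id to the set of genome ids where a
    homologue was found, including the genome the ORF came from.
    """
    return {orf: len(genomes) / n_genomes for orf, genomes in hits_by_orf.items()}


def orfans(hits_by_orf):
    """ORFs whose only hit is in their own genome (the strictest definition)."""
    return sorted(orf for orf, genomes in hits_by_orf.items() if len(genomes) == 1)


def histogram(u_values, n_bins=10):
    """Count how many ORFs fall into each equal-width bin of U between 0 and 1."""
    counts = Counter(min(int(u * n_bins), n_bins - 1) for u in u_values)
    return [counts.get(b, 0) for b in range(n_bins)]
```

With four genomes, for example, an ORF found only in its own genome has U = 0.25 and is an ORFan, while an ORF found in all four has U = 1.0. The looser definition that also counts genes shared by a few closely related species would need extra information about which genomes are close relatives.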

Where did the unique genes (ORFans) come from? If they are real, it seems unlikely that they sprang into existence in a single lineage. They were most likely "borrowed" from a distantly related species by a process known as lateral gene transfer. However, as more and more genomes from diverse species are added to the database it becomes worrisome that the source of these genes isn't identified.

What about viruses? It has long been known that viral genes can be incorporated into bacterial genomes so this seems like a good possibility. Yin and Fischer screened all 818,906 ORF's against the viral database to test this hypothesis. They found that only 2.8% of bacterial ORFans have detectable homologues in the viral genomes. Thus, the transfer of viral genes to bacterial genomes doesn't seem to account for all of the ORFans.

The authors discuss the problems with their experiment and urge us not to reject the viral origin hypothesis just yet. There are only 280 bacteriophage in the viral genome database and this represents a very tiny percentage of all bacteriophage. (There may be 100 million different phage.) There are still lots of places for ORFan homologues to hide.

I think there's another problem; one that the authors are not taking seriously. It's quite possible that many of the ORFans aren't real genes at all. The computer programs that detect these ORF's are notorious for their false positives. There may be ORFan "genes" that are never transcribed or there may be ORFan "genes" that are transcribed and translated but the protein product doesn't do anything. It's an accident of evolution. In addressing this problem the authors make the common mistake of pointing to those cases where known ORFans have proven to be functional genes, while ignoring the fact that most haven't. Just because some of them are real genes doesn't mean that all of them are. If most ORFans are artifacts then it's not surprising that they aren't found in other species.

A Cartoon for the Appeasers

 
Non Sequitur by Wiley