Friday, March 30, 2007

Animal Chauvinism

 
There's much to criticize in the field of evolutionary developmental biology or evo-devo. Some of the "theories" are little more than wide-eyed speculation. I'm thinking particularly of The Plausibility of Life by Marc Kirschner and John Gerhart.

The thing that bugs me more than anything else is the attempt to create a general theory of evolution based entirely on a subset of living species; namely multicellular animals. Most proponents of evo-devo seem to be entirely unaware of the fact that there are other species where genes are developmentally regulated.

This strange bias is spectacularly illustrated in a recent review in Nature Reviews Genetics. The authors, Ronald Jenner and Matthew Wills, say,
Study of the model organisms of developmental biology was crucial in establishing evo–devo as a new discipline. However, it has been claimed that this limited sample of organisms paints a biased picture of the role of development in evolution. Consequently, judicious choice of new model organisms is necessary to provide a more balanced picture. The challenge is to determine the best criteria for choosing new model organisms, given limited resources.
Great! I couldn't agree more. When I used to teach this stuff I would begin with development in bacteriophage lambda where there is a beautiful example of a genetic switch. I then described development during sporulation in the bacterium Bacillus subtilis where there's a nice simple example of communication between the mother cell and the developing spore. Both of these examples made it into my textbook back in 1993.

Yeast development got a lot of play in my courses and it still does in the courses that are taught here. I would also look for examples of plant development since that's where I first learned about development as an undergraduate. We need to teach more plant development.

So, as you can imagine, I was excited to read the abstract of this paper. Jenner and Wills bemoan the fact that most of the work in the field is based on just six model organisms: Caenorhabditis elegans, Gallus gallus, Xenopus laevis, Mus musculus, Danio rerio, and Drosophila melanogaster. How right they are. The evo-devo crowd needs to expand their horizons to cover bacteria, protists, fungi, and plants.

So I eagerly read on to see which organisms they would name. Here are their choices: sea urchin, dung beetle, water flea (Daphnia), and sea anemone. All animals.

Evo-devo is never going to gain widespread respectability among evolutionary biologists unless the proponents abandon their animal chauvinism and start to recognize that development is important in four other kingdoms. [Press Release from the University of Bath]
Jenner, R.A., Wills, M.A. (2007) The choice of model organisms in evo-devo. Nat Rev Genet. 8:311-314. Epub 2007 Mar 6.

University Classes Doubled in Size when Grade 13 Was Abolished in Ontario

Friday's Urban Legend: FALSE

Back in the 20th century Ontario had a unique education system where students spent an extra year in high school. They didn't graduate until they had completed Grade 13.

This system was abolished in order to bring Ontario into line with the rest of the civilized world. There didn't seem to be any logical reason to force Ontario students to stay in high school for an extra year. When they entered university they ended up being a year older than students from every other country and every other province in Canada.

The new system began with an overhaul of the high school curriculum so that five years' worth of material could be taught in four years. Some voluntary breadth courses were abandoned. On implementation day all students entering grade nine were going to graduate at the end of grade 12.

This created a double cohort of graduating students since those completing the new four year program were graduating at the same time as the class ahead of them who were the last to finish grade 13. Naturally, the universities in Ontario were expected to accommodate the double cohort so that students in the first year of the new system would not be penalized. It was widely believed that the "double cohort" really meant there would be twice as many students entering university at some point.

Newspapers published articles about the double cohort as though the class sizes would double. Parents believed that classes would double in size and so did students. Even today, after we have seen the result, it is still widely believed that there were twice as many students in the double cohort year.

It never happened. The universities knew that their enrolment would not double and they published lots of data to explain why. As it turned out they were right and they publicized that too. Still the myth persists. An article in this week's Toronto Star shows how little we've learned (see below the fold).

Let's start with a little quiz. Here's some data on the size of first year science classes at the University of Toronto. The red bars represent students enrolled in our first year biology class. Green is for chemistry, blue is calculus, and yellow is physics. I haven't told you when the so-called "double" cohort entered university. See if you can guess by looking at the data.


The double cohort class entered university in the Fall of 2003. The universities predicted that class sizes in that year would increase by about 20-25% over those of the previous years. They also predicted that class sizes would remain at that level for several years. The double cohort hit universities at the same time that applications were expanding because of the echo boom and because of increases in participation rates. The chart below shows the increase in university students throughout Canada over the past decade. You can see that the numbers grew from 2000 to 2005 and this has nothing to do with the double cohort in Ontario. Even without a double cohort there was a predicted increase in enrolment during this time frame.

Why was the "double cohort" increase only 25% and not 100%? There are many reasons but the most obvious one is that universities attract students from all over Canada and from many foreign countries as well. The double cohort only affected graduates from Ontario high schools. If only half the students at the University of Toronto are from Ontario, for example, then the expected increase would only be 50% assuming that the double cohort really was twice the size and assuming that all qualified applicants were accepted.
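The arithmetic behind that 50% figure can be written out as a small sketch. The function name and the Ontario shares in the loop are illustrative assumptions, not numbers from the universities; the only figure taken from the text is the "half the students" example.

```python
def expected_increase(ontario_fraction, cohort_multiplier=2.0):
    """Fractional growth of an entering class when only the Ontario share grows.

    ontario_fraction: share of the normal entering class from Ontario high schools.
    cohort_multiplier: how many times larger the Ontario graduating group is
    (2.0 is the naive "double cohort" assumption).
    """
    # Students from elsewhere are unaffected; the Ontario share is scaled up.
    new_size = (1.0 - ontario_fraction) + ontario_fraction * cohort_multiplier
    return new_size - 1.0


# Hypothetical Ontario shares, chosen only to illustrate the calculation.
for share in (1.0, 0.75, 0.5):
    print(f"Ontario share {share:.0%}: expected increase {expected_increase(share):.0%}")
```

Even under the naive assumption that the Ontario cohort really doubled and that every qualified applicant was accepted, the class only doubles if every student comes from Ontario.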

The reason it was less than that had to do with other, predicted, events. First, a significant number of students in the last year of the five year program were allowed to "fast-track" in order to finish in four years and get ahead of the double cohort. A significant number of students in the first year of the new four year program took an extra year in order to fall behind the double cohort. Many more students than normal in the double cohort went to universities in other provinces.

Let's look closely at the actual numbers by normalizing the class size to that of 1997-98.


Now we see that the largest increase was in 2002-03, the year before the double cohort entered university. This increase is entirely due to expanded enrolment in anticipation of further increases that are due to increased participation. It had the added benefit of accommodating the fast-trackers. The actual enrolment increase in the double cohort year was only 10-15% higher than that in the previous year and it was less than the numbers in the following year. This is an important point. The real increase in that particular year (2003-04) was no more than 20% and in most cases was considerably less. Part of the increase (about 20%) in this period was due to demographic factors unrelated to the double cohort as demonstrated in the chart for Canada as a whole.
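For anyone who wants to repeat the normalization with their own numbers, here is a minimal sketch. The enrolment values are made-up placeholders, not the University of Toronto data, and the function names are my own.

```python
def normalize(enrolment_by_year, base_year):
    """Express each year's class size relative to the base year (base year = 1.0)."""
    base = enrolment_by_year[base_year]
    return {year: count / base for year, count in enrolment_by_year.items()}


def year_over_year(enrolment_by_year):
    """Fractional change from each year to the next, in the order given."""
    years = list(enrolment_by_year)
    return {
        later: enrolment_by_year[later] / enrolment_by_year[earlier] - 1.0
        for earlier, later in zip(years, years[1:])
    }


# Placeholder numbers for illustration only; substitute real enrolment counts.
example = {"1997-98": 1000, "2001-02": 1100, "2002-03": 1300, "2003-04": 1450}

relative = normalize(example, "1997-98")
changes = year_over_year(example)
for year in example:
    change = changes.get(year)
    note = "" if change is None else f" ({change:+.0%} vs previous listed year)"
    print(f"{year}: {relative[year]:.2f}{note}")
```

Normalizing to a base year shows the cumulative growth, while the year-over-year figure is the one that matters for judging how much larger the double cohort class was than the class ahead of it.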

This brings me to the Toronto Star article [Double cohort graduating again].
There was much concern when the last Grade 13s and the first graduating-year Grade 12s combined to create the largest group to finish high school en masse in the province's history.

The decision was designed to cut public education costs and bring Ontario in line with the rest of the continent, where 12 grades were already the norm, but it left educators facing serious challenges.

Would universities and colleges have enough staff and classroom space? What about residences? Would crowded schools affect the quality of education? Would thousands of students fall through the cracks just because they happened to be born in the wrong year?

Four years later, the Ontario government is again straining to accommodate the double cohort. Apart from concern about a flood of entries to the labour force, the province has to provide an extra $240 million a year to create 14,000 graduate school spaces by 2009.
The Star interviewed three students. I'd like to quote the remarks of two of them in order to illustrate the double cohort mythology.
As part of the double cohort's older half, Allard regrets not having fast-tracked her way through high school.

"In high school, I thought it was no big deal. Now I've come to realize that for the rest of my life, this group is going to follow me wherever I go. Whether it's grad school, medical school or work, there will be twice as many people trying to do everything I'm trying to do. If I'd fast-tracked, I could have gone to university a year earlier."
As a double cohort student, I presume Allard was interested in the numbers. She probably read the predictions and she probably read about the actual increase in class size. I can't imagine that she didn't. At some point she must have been exposed to the fact that her class was less than 20% larger than the one ahead of her and smaller than the one behind her.

She has just spent four years in university where one hopes she learned how to think critically. She must have noticed that her classes weren't twice as large as other classes. So why does she say that she will always be competing with twice as many people? I can't help but feel that we've failed to do a good job of educating if there are so many out there who believe in things that are easily refuted by facts and observations.
Young will graduate with a degree in political science from the University of Western Ontario next month. A good student in high school, he had no trouble getting into his university of choice. In fact, he liked being in the double cohort.

"It was fun," he says. "I was in the younger year of the cohort, so I got to spend my year with twice as many students, and half of them were older than me."
Hmmm ... one wonders just how much attention he was paying in class. A good many of his classmates were not from Ontario so they were the same age. The class was only 20% bigger than the previous class so where did he get the idea that there were twice as many students?
Young's plans were also affected by the double cohort. Had it not been for the increased competition for graduate school positions, he says he likely would have continued his education.

"In a different year, I probably would have worked a bit, then considered getting my master's, which would have helped me land the kind of job I want."
The data is clear. His "competition" is no greater than most other years. This is because the actual increase in the graduating class this year will be less than 20% and the number of graduate positions has increased significantly. We must not have done a good job of teaching critical thinking in this case either. Maybe it's not a requirement in political science?

Thanks to Brenda Bradshaw in our office for getting some of the data on very short notice.

Thursday, March 29, 2007

Blogging with Bush

 
This is cool. President George Bush reads blogs and he even quotes them in his speeches. These bloggers were from Iraq. They say America is winning [Blogging with Bush].

[Hat Tip: Canadian Cynic]

Wednesday, March 28, 2007

Evolution of Mammals

 
A paper in this week's issue of Nature presents a nice summary of recent work on mammalian evolution. Bininda-Emonds et al. (2007) have combined a lot of data from various studies in order to construct a supertree of mammalian evolution. The study incorporates fossil data with molecular sequence data to arrive at estimates of divergence times for 4,510 species of mammal out of a total of 4,554 extant species (99% complete).

This is a study of macroevolution. The authors are addressing questions about the mode, tempo, and pattern of speciation over a period of more than 150 million years. The main questions are when did mammals diversify and did it have anything to do with the mass extinction event at the Cretaceous/Tertiary (K/T) boundary. This is the event that resulted from an asteroid impact 65 million years ago.

The results are presented in the form of a large phylogenetic tree showing the major groups of mammals. The first split in the mammalian tree occurred 166 million years (My) ago when monotremes such as platypus and echidnas (black) split off from the other mammals. Marsupials such as opossums, kangaroos, and koalas (orange) separated from placental mammals 148 My ago.

Within the placental mammals, all of the extant orders appeared by 75 My ago. This includes the clades labelled on the outside of the circle plus others. For a complete list and a description of the species, see the NCBI Taxonomy website [Eutheria].


All of these orders were established at least 10 Myr before the mass extinction event (dashed circle on the circular tree). This is one of the main conclusions of the meta-analysis. The most significant diversification of mammals takes place well before the extinction of non-avian dinosaurs.

The other conclusion is that subsequent radiations at the level of families were not significant until after 50 Myr ago. This period of diversification lasted until about 10 Myr ago. There is no evidence to suggest that the radiations within each order were synchronous, ruling out global climate change as a mechanism.

Furthermore, the data clearly shows no connection between the mass extinction event at the K/T boundary (65 Myr ago) and subsequent radiations of mammalian groups. This effectively puts an end to the long held belief that mammals diversified after the devastation in order to fill up the niches left by dinosaurs. This is not the first paper to refute that belief but it may be the final nail in the coffin.

This summary serves as a warning to those who continue to associate evolution with environmental change. At this level of analysis there does not seem to be a connection between rates of speciation and climate change. This is most obvious with respect to the asteroid impact of 65 My ago. While it led to mass extinction, it did not lead to increases in the rate of evolution of the survivors. The branching pattern of cladogenesis in the figure is hardly affected by the cataclysm.

Similarly, there are no other speciation events that correlate with known climate change over the past 150 million years, including recent ice ages. There is growing recognition among evolutionary biologists that rates of speciation cannot be attributed to large-scale environmental change. (The data has not prevented speculation. Many reports on this paper attempt to manufacture some correlation between global environmental change and speciation. The old idea of a link between them is too entrenched to give up so easily.)

There's an interesting sidebar to this story. The paper clearly states the two main conclusions,
... the pivotal macroevolutionary events for extant mammalian lineages occur either well before the boundary (significant decrease in diversification rate at approximately 85 Myr ago, after establishment and initial radiations of the placental superorders and major orders at approximately 93 Myr ago) or well afterwards, from the Early Eocene onwards (when net diversification began to accelerate)....

Therefore, the demise of the non-avian dinosaurs, and the K/T mass extinction event in general, do not seem to have had a substantial direct impact on the evolutionary dynamics of the extant mammalian lineages.
However, in the title of the paper, The delayed rise of present-day mammals, the authors focus attention on the second conclusion at the expense of the first. Some of the press releases picked up on this emphasis, leading to the false impression that mammalian evolution is more recent than scientists thought [Did the Dino Die-Off Make Room for Mammals?] while others got it right [Mammals not such late developers, after all].

The point about early diversification is emphasized in the Nature News & Views commentary that's published with the article in the March 29th issue. David Penny and Matthew J. Phillips begin with a summary of the evidence for early evolution,
On page 507 of this issue, Bininda-Emonds and co-authors present an evolutionary tree of more than 4,500 mammals, and conclude that more than 40 lineages of modern mammals have survived from the Cretaceous, some 100 million to 85 million years (Myr) ago, to the present. This is paralleled by Brown and colleagues' analyses for birds, just published in Biology Letters: they claim that more than 40 avian lineages have likewise survived from before the extinctions at the Cretaceous/Tertiary (K/T) boundary 65 Myr ago. These numbers of surviving lineages push back the evolutionary history of many mammals and birds much further than earlier estimates based on smaller data sets. But strong claims need strong evidence to support them.
Later on they re-emphasize this point,
But the most challenging aspect of the phylogeny is the inference that more than 40 lineages of living mammals (and of birds, as described by Brown et al. 2007) survived from the Cretaceous to the present.
There are some quibbles about the data. Personally, I think the estimates for early divergence are too recent rather than too old. It all depends on the first fixed data point which is the separation of monotremes. This date (166 Myr ago) is a minimum estimate and there's evidence for an older date. The popular report on the Nature website [Disappearing dinos didn't clear the way for us] mentions this possibility. Mark Springer of the University of California, Riverside (USA) is interviewed and the article states,
"This is a reasonable first approximation," he [Springer] says. "Some of the dates and relationships are probably right on, and some are probably going to move around."

For example, says Springer, the team estimates that the deepest split in the mammals' family tree, between the egg-laying monotremes (such as the duck-billed platypus) and the rest happened 166 million years ago. But some molecular analyses suggest it happened more than 200 million years ago; Springer thinks this earlier date is probably closer to the truth. If that fundamental point changes, he notes, other things will have to shift too. "That date influences everything else through the tree," he says.
I suspect he's right and all the dates will move back in time. One wonders whether the late radiation at 50 My will then shift closer to the K/T boundary.

It's clear that more work needs to be done but the significance of this paper is that it assembles a lot of evidence into one place and publicizes a debate that's been smoldering among evolutionary biologists for over a decade.

Bininda-Emonds, O.R.P., Cardillo, M., Jones, K.E., MacPhee, R.D.E., Beck, R.M.D., Grenyer, R., Price, S.A., Vos, R.A., Gittleman, J.L., and Purvis, A. (2007) The delayed rise of present-day mammals. Nature 446: 507-512. [PDF]

Penny, D. and Phillips, M.J. (2007) Evolutionary biology: Mass survivals. Nature News & Views, Nature 446: 501-502. [PDF]

Some Days I Feel Really Old

 
The students in my biochemistry course are also taking a course called Molecular Cell Biology. They use the textbook Molecular Biology of the Cell by Bruce Alberts et al.

I was showing several students my copy of the older 3rd edition. It has this picture (below) on the back cover so I asked them to name the street they were crossing. None of them had a clue.


The street is only two blocks from the house in St. John's Wood where the authors meet to work on the book. I've been there and I crossed the street at that very crosswalk. For people my age the street and the crosswalk (and the white building in the background) are holy places. For students born after 1988 they aren't. I feel really old.

The Largest Single Organism on Earth

 
The largest known organism is not some giant squid or other cephalopod. It's a stand of quaking aspen in Utah known as Pando. What seem to be individual trees are actually just the visible expression of a gigantic underground organism. Every "tree" is connected via the root system. The individual "trees" are genetically identical. (Erroneously referred to as "clones.")

The total size and weight of this organism isn't known with certainty but it's surely more than 6,000,000 kg. The Wikipedia article mentions that Pando is probably the oldest known organism as well, dating back 80,000 years. I'd like to confirm this, if possible. Does anyone know how accurate this date is and whether there is anything older?

Lots of plants are bigger than cephalopods. There are even some mushrooms that are bigger!

Nobel Laureates: Dam and Doisy

 

The Nobel Prize in Physiology or Medicine 1943.


Henrik Carl Peter Dam (1895-1976): "for his discovery of vitamin K"

Edward Adelbert Doisy (1893-1986): "for his discovery of the chemical nature of vitamin K"

Henrik Dam and Edward Doisy won the Nobel Prize in 1943 for their contributions to the understanding of blood clotting, especially the role of vitamin K.

Dam was working at the Biochemical Institute in Copenhagen during the 1930's. He was studying diet in chickens and noticed that his flock was suffering from frequent hemorrhages. After eliminating the most obvious causes, including lack of vitamin C, Dam proceeded to isolate the missing factor that caused the deficiency in blood clotting. The effort is described in the presentation speech.
In cooperation with F. Schønheyder, it was found by Dam in 1934 that an addition of hempseed to the food prevented the bleedings. This forced him to the conclusion that hempseed must contain a still unknown substance which has a protective effect against certain hemorrhages. This substance, which was found to be necessary for the coagulation of the blood, is termed by Dam the coagulation vitamin or vitamin K. Dam moreover found that this vitamin occurs not only in the vegetable kingdom, for example in the seeds of cabbage, tomatoes, soya beans and lucerne, but also in certain animal organs, especially in the liver. Dam and the American investigator Almquist showed almost simultaneously that activity follows the non-saponifiable lipoid fraction. Vitamin K is formed also by bacteria in the intestinal canal, as was shown in 1938 by Almquist and his co-workers. The organism's need of this vitamin may thus be satisfied either by supply with the food, or by its formation in the intestinal canal.
Dam was able to show that a lack of vitamin K led to a deficiency in prothrombin, the precursor of thrombin. Thrombin is the enzyme that cleaves fibrinogen to create fibrin and it is fibrin molecules that interact to form a blood clot.

The nature of vitamin K remained a mystery until 1939 when Edward A. Doisy, Professor of Biochemistry at St. Louis University School of Medicine, determined its structure and synthesized it in the laboratory.


By 1943 it was apparent that vitamin K could relieve the symptoms of inappropriate hemorrhaging in humans and treatment with vitamin K became routine as described in the Nobel Prize presentation speech.
It was in fact soon found that this vitamin was to assume great importance in the treatment of hemorrhagic diseases in man. Certain diseases of the liver and gall ducts with jaundice are characterized by a marked tendency to hemorrhage, and it was found that this tendency, being due to a lack of prothrombin, could be remedied with vitamin K. In this way operative treatment in such cases has become much less risky than before. Also in certain protracted intestinal diseases there is a hemorrhagic tendency, due to insufficient absorption of vitamin K through the intestine. These cases too have been successfully treated with vitamin K.

It is, however, in the checking of hemorrhages in newborn babies that this vitamin has assumed its greatest practical importance. At this early age, hemorrhages - sometimes involving menace to life - occur far oftener than in more advanced stages. A great many of these cases have proved to be due to deficiency of vitamin K and can be cured by the supply of that vitamin. What is more, by treating the mother shortly before delivery, or the newborn child immediately afterwards, it is possible also to prevent the occurrence of such hemorrhages. Even if there are also neonatal hemorrhages which are not due to a lack of vitamin K and therefore cannot be cured by the supply thereof, the number of cases of such deficiency in the neonatal stage is rather large, and then vitamin K often conduces to save life. Indeed, it may be said that the discovery of vitamin K has revolutionized the treatment of these not uncommon cases.
Nowadays the role of vitamin K is so well understood, and the compound is so easily available, that it's rare to encounter deficiencies.

Tuesday, March 27, 2007

Most Metabolic Diseases Affect Unimportant Genes

 
Okay, so the title is a little bit disingenuous. Obviously metabolic diseases like cystic fibrosis, thalassemia, phenylketonuria, and Huntington's Disease are not trivial. They cause devastating problems for patients and family. Many metabolic diseases are lethal. That's not "unimportant."

The point isn't that the genes are "unimportant" in that sense. What I meant is that defects in essential genes (the ones that are part of core metabolism) do not usually show up as metabolic defects. The reason is that any defects in, say, RNA polymerase, will usually be embryonic lethals and we will never see them [RNA Polymerase Genes in the Human Genome].

The defects that are most likely to show up as metabolic diseases are those where the defect is not so severe as to prevent embryonic development. Thus, a defect in adult hemoglobin (thalassemia), for example, will only be manifest after birth and even then there are compensating genes that can prevent death. Same with cystic fibrosis. Not to minimize the consequences of the disease, but we only see it as a metabolic defect because it isn't immediately lethal.

The point of this little note is to correct a widespread misconception. Many people think that metabolic diseases identify the most important genes in humans. The ones that are essential for life. In fact that's not usually the case. The really important genes do not have associated metabolic diseases. As a general rule, it's only the second tier of important genes that are associated with metabolic disease. The ones that are not essential for cell survival during fetal development.

Vitamin K

 
Vitamin K (phylloquinone) is a lipid vitamin found in plants as K1, or phytylmenaquinone, and in bacteria as K2, or multiprenylmenaquinone. Vitamin K is related to ubiquinone [Monday's Molecule #10]. Ubiquinone serves as an electron carrier in reactions such as membrane-associated electron transport [Ubiquinone and the Proton Pump]. Related cofactors in plants (plastoquinone) and bacteria (menaquinone) can be absorbed in the intestine and converted to vitamin K.


Although we can't synthesize vitamin K ourselves, we usually get enough of it from intestinal bacteria. Vitamin K deficiency is not common for this reason. The most common symptom of vitamin K deficiency is hemorrhaging due to a defect in blood clotting. The symptoms are frequently seen in newborn babies, especially those born prematurely because they lack intestinal bacteria. This is why premature babies are given vitamin K.

Vitamin K is a cofactor in reactions required for the synthesis of some of the proteins involved in blood coagulation. It is the coenzyme for a mammalian carboxylase that catalyzes the conversion of specific glutamate residues to γ-carboxyglutamate residues. The reduced (hydroquinone) form of vitamin K participates in the carboxylation as a reducing agent.

Silent Mutations and Neutral Theory

 
This is a post about the quality of science writing and what can be done about it. I'm picking on an article in SEED magazine here but it's not because SEED is any worse than the competition. It's partly because SEED makes claims about raising the quality of science writing and science education. For example, this statement from a SEED press release seems to indicate that they aspire to better science writing than the competition [Seed Media Group Adds Scientific and Political Pundits to Editorial Team] and certainly their commitment to science bloggers suggests the same aspiration.
As part of its growth strategy, Seed Media Group will develop original science content aimed at a general audience for distribution across a number of media channels, including magazines, books, newspapers, online, topical blogs, digital, film and television. Seed Media Group's endeavors will present science in the same culturally articulate and accessible style that earned Seed a prestigious UTNE Independent Press Award in 2004 and the support of leading advertisers.
It's reasonable, in my opinion, to expect that SEED will live up to this billing. They've certainly made a major step in that direction by hiring PZ Myers to write a monthly science column. The science in his first two contributions is impeccable. As we will see, it raises the question of whether you need to be a scientist in order to get the science right. I hope that's not true.

I'm not going to criticize PZ's articles. Instead, I want to examine another article published in the March 2007 issue of SEED. (That's the one with the "TRUTH" prominently displayed on the cover!) The article in question is titled The Sound of Silence and it's written by Lindsay Borthwick, an experienced science writer with an M.Sc. from McGill and a Masters degree in journalism from Ryerson University here in Toronto. She's a senior editor at SEED so I'm assuming she can take criticism.

The article talks about silent mutations in protein-coding regions. The focus is on a recent Science paper showing that some silent mutations affect the activity of a protein. The point I will make is that the SEED article is very misleading and misrepresents the state of knowledge in this field.

Before getting into the article, let me give you some background.

The most common kinds of mutations are those where one nucleotide is substituted for another. For example, a G or an A or a T replaces a C. This substitution usually results from an error during DNA replication.
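To make the idea concrete, here is a small sketch that applies a single-nucleotide substitution to a codon and checks whether the encoded amino acid changes. The codon table is a deliberately tiny excerpt of the standard genetic code, and the function names are my own, chosen for illustration.

```python
# Partial codon table (standard genetic code), limited to the codons used below.
CODON_TO_AA = {
    "GGC": "Gly", "GGT": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GAC": "Asp",
    "GCC": "Ala",
}


def substitute(codon, position, new_base):
    """Return the codon with one nucleotide (0-based position) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def classify(codon, position, new_base):
    """Label a substitution as silent or as an amino acid change."""
    mutant = substitute(codon, position, new_base)
    same = CODON_TO_AA[codon] == CODON_TO_AA[mutant]
    return mutant, ("silent" if same else "amino acid change")


# A C-to-T change in the third position of a glycine codon is silent.
print(classify("GGC", 2, "T"))
# A G-to-A change in the second position replaces glycine with aspartate.
print(classify("GGC", 1, "A"))
```

"Silent" here only means that the amino acid sequence is unchanged. Whether such a substitution has any effect on the organism is a separate question.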

If the mutation (allele) persists in a population, it's called a single nucleotide polymorphism or SNP (pronounced snip). The term polymorphism means that there are at least two different alleles segregating in the population. Often these are the original "wild-type" allele and the new mutant allele.

We now recognize that genomes within a population are very heterogeneous. Polymorphism is common. This level of variation was discovered in studies during the 1960's and it's much higher than most scientists thought prior to 1960.

There are three explanations that can, in theory, account for this high level of polymorphism.

First, if we think about SNPs, they can represent a transient phase of fixation by natural selection. In this case, one of the alleles is rapidly replacing the other and we just happen to catch it in the act. Back in the days when natural selection was the only game in town it was thought that this transient stage would be rare so populations were not expected to show much variation.

Polymorphism can also be explained by balancing selection. This is when the population has to maintain several different alleles because there is selection for heterogeneity. The classic example is the mutation for sickle cell anemia. When a person is homozygous for the mutant allele they exhibit the symptoms of anemia but when heterozygous they are resistant to malaria. Balancing selection is not common and it can't explain the variation that was discovered in the 1960's.

The third explanation was that the variation is mostly neutral. The idea here is that the majority of mutations are not being acted upon by natural selection. They are not being removed by purifying selection; they are not being maintained by balancing selection; and they are not rising to fixation under positive selection. It was the discovery of significant polymorphism in populations that gave rise to Neutral Theory in the first place (Kimura, 1968; King and Jukes, 1969).

Neutral mutations will eventually become fixed or be eliminated from the population and the change in frequency is due entirely to random genetic drift. Drift is a much slower process than natural selection so there will always be large numbers of neutral alleles in the process of becoming fixed or extinct.
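For readers who like to see drift in action, here's a minimal sketch in Python. It is not taken from any of the papers cited here. It assumes a simple resampling model of the Wright-Fisher type, in which every gene copy in the next generation is drawn at random from the current generation. The function name, the population size of 50 gene copies, and the starting frequency are all made up for illustration.

```python
import random

def drift(num_copies, start_count, rng):
    """Follow one neutral allele until it is fixed or lost.

    num_copies  -- total gene copies in the population (held constant)
    start_count -- how many of those copies carry the allele at the start
    Returns (fixed, generations).
    """
    count = start_count
    generations = 0
    while 0 < count < num_copies:
        freq = count / num_copies
        # Each copy in the next generation is drawn at random from the
        # current one; no allele has any selective advantage.
        count = sum(1 for _ in range(num_copies) if rng.random() < freq)
        generations += 1
    return count == num_copies, generations

rng = random.Random(2007)  # fixed seed so the run is repeatable
runs = [drift(num_copies=50, start_count=25, rng=rng) for _ in range(200)]
fixed_fraction = sum(fixed for fixed, _ in runs) / len(runs)
mean_generations = sum(g for _, g in runs) / len(runs)
print(f"fixed in {fixed_fraction:.0%} of runs, "
      f"mean time to fixation or loss: {mean_generations:.0f} generations")
```

A neutral allele that starts at a frequency of 50% should end up fixed in roughly half of the runs and lost in the rest, since the probability of fixation of a neutral allele equals its starting frequency. Nothing in the loop favors one allele over the other; chance alone decides the outcome.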

Neutral Theory and random genetic drift explain variation, and they also explain molecular evolution and the (approximate) molecular clock. There are no other explanations that make sense and nobody has offered a competing explanation since Motoo Kimura (1968) or Jack King and Thomas Jukes (1969) published their papers almost forty years ago. (Aside from occasional nitpicks, of course. There are always scientists who like to show that some mutations that were thought to be neutral are actually beneficial or deleterious. None of them have mounted a serious claim that most variation or most of molecular evolution can be explained by natural selection.)

The history of variation and the competing explanations were well covered by Lewontin in his 1974 book The Genetic Basis of Evolutionary Change. (Lewontin published the classic 1960's papers that revealed extensive within-population variation.)

Long-held assumptions about "silent" genetic mutations have been torn down, challenging a fundamental evolutionary theory.

Lindsay Borthwick
SEED
March 2007
This brings me to the article in the March issue of SEED [The Sound of Silence]. It begins with a conclusion that's all too common in popular science writing these days,
Scattered throughout the human genome are thousands of mutations that biologists have treated mostly as footnotes. They're hardly few in number—in coding regions of the genome, there are as many as 15,000—but biologists regard them as mutations that simply don't change the way a cell functions. Both in name and effect, they have been accepted as "silent." Now, however, new discoveries are showing that silent mutations appear to play an important role in dozens of human genetic diseases, a fact that is forcing biologists to discard a long-held evolutionary theory and to reexamine the very rules governing the transfer of information from DNA to proteins.
What's going on here? Has there been some extraordinary new discovery that's about to overthrow evolutionary theory and the "rules" of information flow? I will attempt to show that this rhetoric is completely unjustified. It presents a misleading picture of the state of modern science.

The author is talking about silent mutations. These are mutations in the coding region of a gene that alter a codon without changing the amino acid. The genetic code is redundant because there are 64 possible codons and only 20 amino acids. This means that several amino acids have multiple codons. For example, there are six codons for leucine (Leu): TTA, TTG, CTT, CTC, CTA, and CTG. If an original TTA codon is mutated to CTA then it still specifies leucine and this is a silent mutation.
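Here's a short Python sketch of that definition. It builds the standard genetic code (the usual 64-codon table written in T, C, A, G order) and asks whether two codons specify the same amino acid. The function name `is_silent` is my own invention for illustration, and the sketch assumes the standard code applies, which is not true of every genome (mitochondria, for example, use variants).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def is_silent(original_codon, mutant_codon):
    """True if both codons specify the same amino acid (or both are stops)."""
    return CODON_TABLE[original_codon] == CODON_TABLE[mutant_codon]

leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")

print(len(CODON_TABLE), "codons,", len(set(AMINO_ACIDS) - {"*"}), "amino acids")
print("Leucine codons:", leucine_codons)
print("TTA -> CTA silent?", is_silent("TTA", "CTA"))
print("TTA -> TCA silent?", is_silent("TTA", "TCA"))
```

The TTA to CTA change comes out silent because both codons specify leucine. The TTA to TCA change does not, because TCA specifies serine.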

The concept of codon bias has been known for almost forty years and it's an important part of all university courses in molecular biology. Some codons are more efficient than others during translation because the levels of various tRNAs in a cell are not identical. A rare codon will be translated less efficiently because the tRNA that recognizes it is scarce, so the codon is not read as quickly as the codons recognized by more abundant tRNAs. There are published codon usage tables for most species showing the preferred codons in that species. Highly expressed genes will preferentially use the codons recognized by the abundant tRNA species. All students know what these tables mean: not all silent mutations are neutral. (It's on the exam!)
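If you've never seen how a codon usage table is put together, the following Python sketch shows the basic bookkeeping for the six leucine codons. The sequence `example_gene` is made up for illustration and does not come from any real gene. Real tables are compiled from many genes, so treat this as a toy version of the counting step only.

```python
from collections import Counter

LEUCINE_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}

def codon_counts(coding_sequence):
    """Count each codon in a coding sequence, reading in frame from position 0."""
    usable = len(coding_sequence) - len(coding_sequence) % 3
    return Counter(coding_sequence[i:i + 3] for i in range(0, usable, 3))

def leucine_usage(coding_sequence):
    """Return the fraction of leucine codons contributed by each synonym."""
    counts = codon_counts(coding_sequence)
    total = sum(counts[c] for c in LEUCINE_CODONS)
    return {c: counts[c] / total for c in sorted(LEUCINE_CODONS) if counts[c]}

# Hypothetical coding sequence: ATG CTG CTG TTA CTG CTC CTG TAA
example_gene = "ATGCTGCTGTTACTGCTCCTGTAA"
usage = leucine_usage(example_gene)
for codon, fraction in usage.items():
    print(f"{codon}: {fraction:.2f}")
```

In this made-up gene CTG accounts for four of the six leucine codons. A silent mutation from CTG to one of the rarely used synonyms would leave the protein sequence unchanged but could, in principle, affect how efficiently the message is translated.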

This is only one possible reason for silent mutations not being neutral. The various possibilities were discussed by Jukes back in 1980 when he revisited the evidence for neutral changes (Jukes, 1980). He gave some specific examples and then addressed the theoretical problem,
The question arises, are these silent changes actually neutral, or have they taken place for adaptive reasons, such as the requirement for a specific secondary structure in mRNA, or a preferential use for certain transfer RNAs in regulating the rate of synthesis of a protein?
For some results the answer is that the silent changes really are neutral, although in a few cases there is evidence of adaptation. The point is that these issues have been recognized and dealt with for decades.

The existence of a few exceptions to a rule does not invalidate the generality. That's an important point. It's one that all science journalists need to grasp. There are no absolute, inviolate, rules in biology. The generalities are all about relative frequencies. Are most silent mutations neutral or are most subject to natural selection?

As Jukes put it 27 years ago,
The neutral approach to molecular evolution is a proposal to prove a negative, which is something like trying to show that a given substance is not a carcinogen. The counterresponse to the publications by Kimura (1968) and King & Jukes (1969) has been quite strong. Any exceptions to neutrality are usually taken as disproof of it, and many authors have cited such exceptions for this purpose. We have, indeed, developed evidence for such exceptions ourselves, because a theory should be challenged by those who have postulated it.

For example, the finding that synonymous codons for each amino acid are not used in equal amounts in β-hemoglobin mRNA has been cited as disproof of the neutral model, as if such a departure from randomness in a single gene were pertinent.
As the old expression goes, "those who are ignorant of history are doomed to repeat it." It helps a lot to be aware of the history of biology and the contributions of those who developed our current understanding. Modern science writers often fail to understand that there's not much that's new in biology these days. In this case, it's just not true that biologists were too stupid to recognize that some silent mutations weren't neutral. There was no "orthodoxy" that all silent mutations were neutral and, therefore, no orthodoxy has been overturned.

Silent mutations have no impact on the amino acid sequence of proteins and, therefore, were not expected to change their function.

Lindsay Borthwick
SEED
March 2007
The SEED article goes on to describe the results of experiments done by Kimchi-Sarfaty et al. (2007). They presented evidence that a cluster of three silent mutations in the MDR1 gene led to a slowdown in translation and subsequent misfolding of the protein. Lindsay Borthwick then writes, "Through a series of elegant experiments, the team put to rest the idea that silent mutations were neutral." Of course, they did no such thing. They merely added one more data point to something that we already knew; namely, not all silent mutations are neutral.

Borthwick closes with,
Most fundamentally, the involvement of silent mutations in disease undermines the neutral theory of molecular evolution. This theory, posited by Motoo Kimura in the late 1960s and a powerful influence ever since, asserted that the vast majority of mutations were neutral, having no effect on the fitness of an organism, and spread through a population by chance. The fact that silent mutations are not harmless anomalies of nature means that they are not neutral. In contrast, some, if not all, silent sites must be subject to the forces of Darwinian natural selection.
The theme of the article is that neutral theory is in big trouble. This point is emphasized in the highlighted quotations that are prominently displayed on page 35 (see the two boxes above). That's totally wrong and it distorts the modern consensus among knowledgeable scientists. Neutral Theory is alive and well, thank you very much. It can easily accommodate one more example of a non-neutral mutation.

I believe that science writers have an obligation to get the concepts right and I believe they shouldn't misrepresent the science they're supposed to be presenting in a "culturally articulate and accessible style" to a general audience. A layperson reading this article would go away with the impression that a decades-old concept has just been overthrown by a single paper published in Science. That's irresponsible journalism.

Jukes, T.H. (1980) Neutral Changes Revisited. In The Evolution of Protein Structure and Function, pp. 203-219.

Lewontin, R.C. (1974) The Paradox of Variation. In Evolution, Mark Ridley ed., Oxford University Press, Oxford UK.

Kimchi-Sarfaty, C., Oh, J.M., Kim, I.W., Sauna, Z.E., Calcagno, A.M., Ambudkar, S.V., and Gottesman, M.M. (2007) A "silent" polymorphism in the MDR1 gene changes substrate specificity. Science 315, 525-528.

Kimura, M. (1968) Evolutionary rate at the molecular level. Nature 217, 624-626.

King, J.L. and Jukes, T.H. (1969) Non-Darwinian evolution. Science 164, 788-798.

The Taxonomy Song

 
Pop on over to Evolving Thoughts where John Wilkins has posted a YouTube video of kids singing The Taxonomy Song. Do you recognize the accent? I think they're Australian, no?

Problem is, they get the highest levels wrong (the rest is okay). How many people can correctly name the FIVE KINGDOMS of life? How many people can come up with a better classification scheme?

While you're over on John's blog, read Microbial Species - postlude. It's a discussion about how to identify bacterial species. Remember that the biological species concept doesn't work very well with bacteria because they don't have sex in the same way that eukaryotes do. John has some important insights into this problem but it hurts my brain when I try and understand them.

Homer Jay Simpson Evolves

 
Everyone's going to be posting this but here it is anyway, ... just in case you don't see it elsewhere. As far as cartoon versions of evolution go, it's not bad. Just remember that individuals don't evolve, populations evolve. And don't forget that there's no direction to evolution and humans are not the only living mammal that evolved.

Monday, March 26, 2007

Internet Connection Speeds

 
Here are the results of a test for internet connection speed from InternetFrog.com. This is the speed I get at the university.

Now, here are some questions for all you technical experts out there. The download speed varies from a low of 2 Mbps to a high of close to 9 Mbps. Why? Does anyone have faster connections on a regular basis?

The upload speed varies from a low of about 150 Kbps to a high of 1.7 Mbps. Why? And why is the upload speed so much slower than the download speed? Does that depend on the speed of my processor?



[Hat Tip: Kevin Black]

Happy 66th Birthday Richard Dawkins

 
Today is Richard Dawkins' birthday [RichardDawkins.net]. Go [here] to enter your own birthday message.

I may disagree with Dawkins on some parts of evolutionary theory but I think he's done a wonderful job of focusing attention on evolution as a scientific fact. I'm also a strong supporter of his attacks on superstition and support for rationality (e.g., The God Delusion).

Dawkins is just one of many intelligent men and women whose brains did not shut down when they turned 50 or 60. We need to point this out because there are, unfortunately, too many people who think that you can only make a contribution to science when you're under 40.

Monday's Molecule #19

 
Name this molecule. You must be specific but we don't need the full correct scientific name. (If you know it then please post it.)

As usual, there's a connection between Monday's molecule and this Wednesday's Nobel Laureate. This one's easy once you know the molecule and make the connection. There'll be a few extra bonus points for guessing Wednesday's Nobel Laureate(s).

Comments will be blocked for 24 hours. Comments are now open.

Sunday, March 25, 2007

RNA Polymerase Genes in the Human Genome

 
The structure of yeast RNA polymerase II was solved by Roger Kornberg [Nobel Laureate: Roger Kornberg]. There are many different polypeptide subunits labelled Rpb1 to Rpb12 in the nomenclature used by yeast workers. The mammalian enzyme is very similar. Most of the same subunits are present but they have different names.

The core of RNA polymerase is composed of two very large subunits called Rpb1 and Rpb2 in yeast. In mammals they are called subunits A (220 kDa) and B (140 kDa). These subunits are homologous to the β and β′ subunits in bacterial RNA polymerases. The genes for these polypeptides in humans are called POLR2A and POLR2B. They are located on chromosomes 17p13.1 and 4q12 respectively.

The Online Mendelian Inheritance in Man database has entries for both genes but there are no genetic diseases associated with mutations in either gene [OMIM POLR2A and OMIM POLR2B]. This should not be a surprise since it is rare for genetic diseases to be associated with important essential genes.

Recall that mammals have four different RNA polymerases [Eukaryotic RNA Polymerases]. Both RNA polymerase I and RNA polymerase III have homologous large A and B subunits. The genes for these polypeptides are called POLR1A (194 kDa, chromosome 2p11.2), POLR1B (128 kDa, chromosome 2q13), POLR3A (155 kDa, chromosome 10q22-q23), and POLR3B (~120 kDa, chromosome 12q23.3). As is the case with the large subunits of RNA polymerase II, none of these genes are associated with metabolic diseases because they are essential, important housekeeping genes.

These genes make up a typical eukaryotic gene family. It's important to remember that a gene family refers to homologous genes within the same genome and not to a group of homologous genes from different species. Gene families arise from gene duplication events.

The "A" genes evolved from a common ancestral RNA polymerase β′ gene several billion years ago and the "B" genes evolved from an ancestral β gene. The β and β′ genes, in turn, evolved from a common ancestor near the time life began about 3.5 billion years ago.

The "A" and "B" genes have evolved independently by divergence. In such cases the family members are often on different chromosomes and the intron-exon organization of each member is very different in spite of the fact that the genes are still closely related in amino acid sequence.

In addition to the "A" and "B" genes for each RNA polymerase, there are genes for three different subunits of RNA polymerase I (POLR1C, POLR1D, POLR1E), 12 different subunits of RNA polymerase II (SURB7 and POLR2C - POLR2L), and 9 different subunits of RNA polymerase III. There are also dozens of genes for the general transcription factors required for initiation, elongation, and termination. Altogether, there are at least 80 different genes required for transcription and that's not counting any gene-specific regulatory genes.

The fourth RNA polymerase in humans is the mitochondrial version. Its gene is POLRMT located on chromosome 19p13.3. The large subunit of the mitochondrial RNA polymerase is only distantly related to the others. There are no metabolic defects associated with mutations in POLRMT [OMIM POLRMT].

The Salem Conjecture

 
The Salem Conjecture was popularized by Bruce Salem on the newsgroup talk.origins. It dates to before my time on that newsgroup (1990) and I haven't been able to find archives to research the exact origin. The conjecture was explained by Bruce on numerous occasions; here's a statement from Sept. 5, 1996.
My position is not that most creationists are engineers or even that engineering predisposes one to Creationism. In fact, most engineers are not Creationists and more well-educated people are less predisposed to Creationism, the points the statistics in the study bear out. My position was that of those Creationists who presented themselves with professional credentials, or with training that they wished to represent as giving them competence to be critics of Evolution while offering Creationism as the alternative, a significant number turned out to be engineers.
This is the so-called "soft" version of the conjecture. The "hard" version is that there is something about being an engineer that leads one to become a creationist. That's not what Bruce said,
For a long time the so-called "soft" hypothesis is the one I have been putting forth, not the one earlier attributed to me. I have also further qualified it by saying numerous times that religious belief was the most significant factor. The reason I prefer to call my idea a "conjecture" is that I have had only anecdotal data to support it.
The Salem Hypothesis has its own entry on Wikipedia [Salem Hypothesis]. Both versions of the Salem Conjecture are listed there. The talk.origins Jargon File is incorrect because it only lists the hard version and attributes it to Bruce Salem.

We all know that scientists overwhelmingly reject creationism so it doesn't come as a surprise that there are so few scientists in the creationist movement. Ironically, the creationists long for scientific validity while, at the same time, they attack all the basic principles of science. The few so-called scientists who subscribe to superstition get very prominent play among the creationists.

Engineers are not scientists and they did not have much scientific training in school. They are technologists (i.e., engineers) and that's not the same thing. I don't think engineers spend much time studying evolutionary theory in university. (It's probably too difficult for them.)

Among the general public the distinction between scientists and technologists is lost so whenever an engineer comes out in favor of superstition (s)he is counted as a scientist. This is what the Salem Conjecture says. Whenever you see a common run-of-the-mill creationist who claims to have scientific knowledge, chances are they're an engineer and not a scientist.

Here's how Bruce explained it on talk.origins on May 10, 1996 in response to an engineer who was objecting to the conjecture.
By your own admission you are running the risk of becoming yet another data point for something called the "Salem Hypothesis" or "Salem Conjecture" in which I noticed some time ago the number of people publically supporting Creationism whether in Creationist publications or this group claiming to be "scientists" were mostly engineers. Most of them had little knowledge of the scientific disciplines that relate to the scientific acceptance of evolution and an old earth. Many people have noticed subsequently that while engineers as a group seem more inclined as a majority to believe Darwin, those with a background in certain religions and those concerned with intelligent design seemed predisposed to accept Creationism or the arguments that support it.
This morning Larry Faraman, the author of the blog I'm From Missouri, posted this message [The Salem Hypothesis].
I have been aware for a long time that engineers have an especially strong tendency to be skeptical of Darwinism, but I just now learned that this tendency has a name: the "Salem hypothesis." I am especially interested in this tendency because I am an engineer myself ....

I feel that the reason why we engineers tend to be skeptical of Darwinism is that we are a logical, practical, no bullshit, cut the malarkey, "I'm from Missouri," "show me" kind of people.
The irony is palpable. Mr. Faraman, an engineer, is skeptical of evolutionary biology and, by implication, most of the rest of science. On the other hand, he's not the least bit skeptical of creationism. Another solid data point for the Salem Conjecture. In this case, it's the "hard" version that Mr. Faraman is supporting. He claims that training in technology predisposes one to believe in superstitious nonsense. Maybe he's right. I look forward to hearing from other engineers on this point.

BTW, Missouri must be a very strange state. These days when someone begins a conversation with "I'm from Missouri" it's usually followed by something irrational.

Saturday, March 24, 2007

Dennis Kucinich on Universal Health Care

 
This is why I would vote for Dennis Kucinich ... if only I could vote. Why don't you vote for him?



[Hat Tip: Corpus Callosum]

Gene Genie #3

 
Hsien Hsien Lei has just posted Gene Genie #3 at Genetics and Health. There are >26,000 genes in the human genome and we hope to cover them in a finite amount of time. At this rate we'll be done sometime in the Spring of 2207! Let's pick up the pace, fellow bloggers.

The next Gene Genie (#4) will be hosted right here on Sandwalk. Send me your articles by email or submit them at blogcarnival [gene genie]. You ain't never had a friend like Gene Genie!

Summary of Genes on Human Chromosomes

 
I've prepared a table of the number and types of gene on each human chromosome based on the data at the Ensembl site managed by the Wellcome Trust Sanger Institute in Cambridge UK.

The total number of genes comes to 26,290.

The different categories of gene are:

Known: The "known" protein-encoding genes are those for which there is solid full-length cDNA evidence that they are actually expressed.

Novel: The "novel" class is reserved for genes that are predicted but lack confirming evidence.

miRNA: Micro RNAs are short single-stranded RNAs that are thought to play a role in regulating gene expression.

snRNA: Small nuclear RNAs are required for a number of cellular processes such as RNA processing. Those required for splicing associate with proteins in the nucleus to form small nuclear ribonucleoprotein particles or "snurps."

rRNA: Ribosomal RNA forms the core of the ribosome.

snoRNA: Small nucleolar RNAs are required for proper processing of ribosomal RNA. They are located in the nucleolar region of the nucleus because that's where ribosomal RNA is made.

other RNA: The "other" category includes transfer RNA (tRNA) and some specialized RNAs such as 7SL RNA and P1 RNA.
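
| Chr. | Size (bp) | Protein known | Protein novel | Pseudogenes | miRNA | rRNA | snRNA | snoRNA | other RNA | Total Genes |
|------|-------------|-------|----|-----|----|----|-----|----|----|-------|
| 1 | 247,249,719 | 2,146 | 54 | 159 | 43 | 42 | 178 | 60 | 93 | 2,616 |
| 2 | 242,951,149 | 1,375 | 84 | 40 | 23 | 24 | 116 | 37 | 74 | 1,733 |
| 3 | 199,501,827 | 1,111 | 47 | 45 | 24 | 21 | 89 | 30 | 66 | 1,388 |
| 4 | 191,273,063 | 828 | 59 | 32 | 21 | 13 | 81 | 16 | 58 | 1,076 |
| 5 | 180,857,866 | 922 | 63 | 23 | 19 | 22 | 74 | 18 | 67 | 1,185 |
| 6 | 170,899,992 | 1,103 | 29 | 81 | 17 | 16 | 82 | 25 | 56 | 1,328 |
| 7 | 158,821,424 | 984 | 68 | 48 | 31 | 14 | 64 | 27 | 62 | 1,250 |
| 8 | 146,274,826 | 736 | 32 | 19 | 17 | 14 | 61 | 21 | 39 | 920 |
| 9 | 140,273,252 | 921 | 38 | 66 | 26 | 11 | 43 | 15 | 47 | 1,101 |
| 10 | 135,374,737 | 819 | 35 | 52 | 16 | 17 | 64 | 8 | 42 | 1,001 |
| 11 | 134,452,384 | 1,390 | 52 | 61 | 19 | 19 | 51 | 40 | 47 | 1,618 |
| 12 | 132,349,534 | 1,088 | 51 | 38 | 21 | 15 | 77 | 21 | 65 | 1,338 |
| 13 | 114,142,980 | 358 | 10 | 41 | 14 | 9 | 29 | 12 | 34 | 466 |
| 14 | 106,368,585 | 661 | 28 | 25 | 51 | 14 | 42 | 56 | 38 | 890 |
| 15 | 100,338,915 | 657 | 65 | 34 | 15 | 6 | 43 | 95 | 35 | 916 |
| 16 | 88,827,254 | 915 | 49 | 25 | 14 | 13 | 39 | 14 | 31 | 1,075 |
| 17 | 78,774,742 | 1,232 | 60 | 56 | 32 | 10 | 47 | 29 | 52 | 1,462 |
| 18 | 76,117,153 | 293 | 20 | 8 | 5 | 42 | 12 | 12 | 21 | 401 |
| 19 | 63,811,651 | 1,428 | 49 | 45 | 71 | 6 | 14 | 12 | 18 | 1,598 |
| 20 | 62,435,964 | 612 | 15 | 29 | 16 | 8 | 32 | 16 | 34 | 733 |
| 21 | 46,944,323 | 271 | 23 | 9 | 7 | 10 | 5 | 5 | 6 | 325 |
| 22 | 49,691,432 | 509 | 26 | 39 | 15 | 2 | 18 | 11 | 20 | 601 |
| X | 154,913,754 | 878 | 37 | 80 | 58 | 19 | 64 | 25 | 48 | 1,129 |
| Y | 57,772,954 | 86 | 27 | 2 | 6 | 6 | 14 | 3 | 2 | 140 |
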
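If you want to check the arithmetic, here's a small Python sketch that adds up the "Total Genes" column from the table above. The numbers are copied from the table; the variable names are mine. The second check is an observation about the numbers rather than anything stated on the Ensembl site: for chromosome 1 the listed total equals the sum of the categories when pseudogenes are left out.

```python
# "Total Genes" column, chromosomes 1-22 followed by X and Y.
chromosomes = [str(n) for n in range(1, 23)] + ["X", "Y"]
total_genes = [
    2616, 1733, 1388, 1076, 1185, 1328, 1250, 920,
    1101, 1001, 1618, 1338, 466, 890, 916, 1075,
    1462, 401, 1598, 733, 325, 601, 1129, 140,
]
assert len(total_genes) == len(chromosomes)

genome_total = sum(total_genes)
print("Genome total:", genome_total)

# Chromosome 1, category by category, as listed in the table.
chr1 = {
    "protein known": 2146, "protein novel": 54, "pseudogenes": 159,
    "miRNA": 43, "rRNA": 42, "snRNA": 178, "snoRNA": 60, "other RNA": 93,
}
chr1_without_pseudogenes = sum(
    count for category, count in chr1.items() if category != "pseudogenes"
)
print("Chromosome 1 without pseudogenes:", chr1_without_pseudogenes)
```

The column adds up to the 26,290 quoted above, and the chromosome 1 categories without pseudogenes add up to its listed total of 2,616.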

Daisy, the Canada Goose

 
From KARE 11 News in Minneapolis, St. Paul, Minnesota (USA) [Daisy the Goose]. Why would evolution favor behaviors where a beagle, a Canada goose, and a human could get along in a boat? The video on the TV station's website is much, much better than the YouTube video. It's worth watching.

PZ's Hooked a Live One!

 
Hop on over to Pharyngula where PZ Myers is pointing out the deficiencies of an IDiot school teacher in Colorado [What's the matter with Colorado?].

This guy, Ken Poppe, has actually written a book exposing his ignorance. Look at the cover—that's supposed to be DNA but the structure is so wrong it makes you wonder if Poppe knows anything about science at all.

But here's the fun part. Ken Poppe has popped into Pharyngula to comment on PZ's post. Poppe's first comment is "I'm not afraid of your witchhunt. Bring it on, secularists. Bring it on." It goes downhill from there.

Friday, March 23, 2007

How Many Genes Do We Have?

 
The number of genes in the human genome fluctuates on a monthly basis as the genome annotators add new genes and remove false positives. It's an ongoing process that's not likely to be complete in the near future.

The original draft sequences of the human genome had between 25,000 and 30,000 genes but these numbers were not reliable since they were based entirely on computer predictions. The programs were still in the testing stage for complex genomes when they were used in 2001. They are much better now but it really takes human intervention to assess whether a prediction is correct or not. The annotation process is tedious.

The latest summary from NCBI is based on the Oct. 17, 2006 genome assembly [NCBI Reference Assembly]. It lists 28,961 genes for the public genome and 26,245 for the private Celera assembly.

The Ensembl site has better data because the curation seems to be more rigorous. It lists 26,720 genes of which 3,994 have RNA products (mainly ribosomal RNA, tRNAs, and snoRNAs) [Ensembl Homo sapiens]. This is not much different than the NCBI number. It looks like the total number of genes is stabilizing at 27,000 total genes and about 23,000 protein encoding genes.

Carl Zimmer recently posted an article about the number of genes in the human genome [You Don't Miss Those 8,000 Genes, Do You?]. He referred to the PANTHER database where they quote 25,431 genes on their current website [PANTHER pie chart]. This differs considerably from the 18,308 genes shown in Zimmer's original article at this site [PANTHER filtered NP]. The difference is due to filtering the total number of genes (25,431) by showing only those that have a RefSeq entry in the Entrez database. This is an underestimate since not all genes have been assigned a RefSeq entry, particularly those that produce an RNA product rather than a protein.
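The numbers quoted in this post are easy to cross-check. The Python sketch below just does the subtraction. It assumes that every Ensembl gene without an RNA product is protein-encoding, which is my reading of the summary and not something the site states in those words. All variable names are made up.

```python
# Figures quoted above.
ensembl_total = 26_720
ensembl_rna_genes = 3_994
panther_total = 25_431
panther_refseq_filtered = 18_308

# Assumption: genes without an RNA product are protein-encoding.
ensembl_protein_genes = ensembl_total - ensembl_rna_genes
genes_removed_by_filter = panther_total - panther_refseq_filtered

print("Ensembl protein-encoding genes:", ensembl_protein_genes)
print("Genes dropped by the RefSeq filter:", genes_removed_by_filter)
```

The Ensembl difference is 22,726, which is where the figure of about 23,000 protein-encoding genes comes from. The RefSeq filter removes 7,123 genes from the PANTHER total.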

[Thanks to Scientia Natura for the cartoon]

Your Hotel Key Card Contains Personal Information and Credit Card Numbers

 
Friday's Urban Legend: FALSE

I received this warning in an email message from a friend.
Ever wonder what is on your magnetic key card?
Answer:
a. Customer's name
b. Customer's partial home address
c. Hotel room number
d. Check-in date and out dates
e. Customer's credit card number and expiration date!
When you turn them in to the front desk your personal information is there for any employee to access by simply scanning the card in the hotel scanner.

An employee can take a hand full of cards home and using a scanning device, access the information onto a laptop computer and go shopping at your expense.

Simply put, hotels do not erase the information on these cards until an employee re-issues the card to the next hotel guest.

At that time, the new guest's information is electronically "overwritten" on the card and the previous guest's information is erased in the overwriting process.

But until the card is rewritten for the next guest, it usually is kept in a drawer at the front desk with YOUR INFORMATION ON IT!

The bottom line is:
Keep the cards, take them home with you, or destroy them.
Snopes debunks this urban myth at [Card Sharks].

They say,
In January 2006, Computerworld investigated the key card rumors by collecting and examining over 100 hotel card keys and found no personally identifiable information on any of them:
As part of a Computerworld investigation into the allegations, reporters and other staff members who traveled last fall brought back 52 hotel card keys over a six-week period. The cards came from a wide range of hotels and resorts, from Motel 6 to Hyatt Regency and Disney World. We scanned them using an ISO-standard card reader from MagTek Inc. in Carson, Calif. — the type anyone could buy online.

We then sent the cards to Terry Benson, engineering group leader at MagTek, for a more in-depth examination using specialized equipment. MagTek also gathered cards from its own staff. In all, 100 cards were tested.

Most cards were completely unreadable with an off-the-shelf card reader. Neither Benson nor Computerworld found any personally identifiable information on them. Based on these results, we think it's unlikely that hotel guests in the U.S. will find any personal information on their hotel card keys.
We also purchased our own MagTek card scanner and have scanned several dozen magnetic room keys we acquired during our various hotel stays over the last few years and likewise found not a single key with any personal information stored on it.

Nevertheless, the rumor dies hard. In a followup report consumeraffairs.com claims that there have been instances of personal information stored on a hotel key card [Hotel Key Cards: Identity Theft Risk or Not? "Mythbusters" Aside, the Answer's Not Clear-Cut]. In some cases it's because thieves have stolen hotel key cards and entered stolen credit card information so the key cards can be used as fake credit cards. In other cases, it appears there were hotels that encoded personal information in the past. (These reports sound a lot like hearsay.)









Thursday, March 22, 2007

Forgetfulness - Billy Collins Animated Poetry

 
I was just sent this a few minutes ago. (My wife again! Is there a message here?) I'm posting this right away, partially in hono(u)r of PZ Myers who just turned 50, but mostly so I won't forget.

How RNA Polymerase Works: The Topology of the Reaction and the Structure of the Enzyme

 
Transcription is one of the most important steps in gene expression. During the elongation phase, the transcription complex moves along double-stranded DNA creating a transcription bubble by local unwinding of the helix (Transcription). As RNA is synthesized it forms a transient DNA:RNA helix at the active site of the enzyme (How RNA Polymerase Works: The Chemical Reaction). We now know what this transcription bubble really looks like, thanks to the work of 2006 Nobel Laureate Roger Kornberg.

The figure on the left is taken from a review in Science magazine written by Aaron Klug (A Marvellous Machine for Making Messages). It shows the structure of the RNA polymerase II complex (Eukaryotic RNA polymerases) associated with a DNA:RNA hybrid that Kornberg's lab synthesized. They solved the structure of the co-crystal.

The solid blue and green lines represent fragments of DNA. As you can see from the diagram it is in the form of a double helix at the front end of RNA polymerase where it enters the groove on the leading edge. (The transcription complex is moving from left to right.) The DNA is gripped by the "jaw" region near the opening of the groove.

As the double-stranded region reaches the active site (identified by the purple Mg2+ ion), it unwinds to a single-stranded form creating a bubble. The bubble isn't actually seen in the crystal structure but its location can be inferred (dotted green and blue lines).

It looks like the unwinding is promoted when the DNA runs into the "wall" and is forced to make a sharp upward turn before exiting near the "clamp" where the two strands of DNA come back together to form a helix.

The blue strand of the transcription bubble is the template strand and part of it is associated with a short strand of RNA (red) behind the active site. There's a large funnel at the bottom of the enzyme that serves as a pathway from the outside to the site of polymerization. This is where nucleoside triphosphates (NTPs) enter and leave the active site. It also appears to be the site where the 3′ end of the RNA is extruded when the enzyme backs up for proofreading (backtracking).

The "bridge" part of the enzyme is required for the translocation step. This is the step following addition of a ribonucleotide when the enzyme has to shift by one nucleotide (base pair) to the right. The new 3′ end of RNA has to be re-positioned at the active site during this shift. At the same time, one base pair of DNA is unwound by the "fork" region of the enzyme and one base pair is reformed at the back end of the bubble by the "zipper" region.

The "bridge" acts like a flexible ratchet allowing a shift of one base pair while maintaining a grip on the growing end of the RNA molecule. This movement is steered by the "rudder."
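The cycle of nucleotide addition followed by translocation can be written as a simple loop. The Python sketch below is a toy model only. It ignores the structure of the enzyme entirely, it assumes the template strand is written in the 3′ to 5′ direction so the RNA comes out 5′ to 3′, and the function name and the six-nucleotide template are made up for illustration.

```python
# Which ribonucleotide pairs with each base in the DNA template strand.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def elongate(template_3_to_5):
    """Build an RNA (5' to 3') along a DNA template written 3' to 5'."""
    rna = []
    position = 0  # template base currently sitting at the active site
    while position < len(template_3_to_5):
        # Addition step: the complementary ribonucleotide joins the 3' end.
        rna.append(PAIRING[template_3_to_5[position]])
        # Translocation step: shift by exactly one base pair so the new
        # 3' end of the RNA is back at the active site.
        position += 1
    return "".join(rna)

template = "TACGGT"  # hypothetical template strand, 3' to 5'
print(elongate(template))
```

Each pass through the loop stands for one ribonucleotide added and one base pair of movement, which is the ratchet-like behavior attributed to the "bridge" above.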

Most of these terms ("bridge," "rudder" etc.) refer to short α helices or loops within RNA polymerase and almost all of them are part of the conserved β and β′ subunits. The same features are seen in the bacterial enzymes although the resolution of the bacterial enzyme structures is not good enough to decipher the translocation step. This is one of the achievements of the Kornberg group in the two famous papers (Gnatt et al., 2001; Cramer et al., 2001).
Gnatt, A.L., Cramer, P., Fu, J., Bushnell, D.A., and Kornberg, R.D. (2001) Structural Basis of Transcription: An RNA Polymerase II Elongation Complex at 3.3 Å Resolution. Science 292:1876 - 1882.

Cramer, P., Bushnell, D.A., and Kornberg, R.D. (2001) Structural Basis of Transcription: RNA Polymerase II at 2.8 Ångstrom Resolution. Science 292:1863 - 1876.