
Saturday, June 28, 2008

Good Science Writers: Helena Curtis

Helena Curtis is another science writer who didn't make it into The Oxford Book of Modern Science Writing. She died in 2005. One of many obituaries appeared in The Villager [Helena Curtis, 81, wrote ‘elegant’ science textbooks].
Helena Curtis, a noted science writer and college biology textbook author, died on Feb. 11 at the age of 81. She was a resident of Sag Harbor and Greenwich Village.

Her first book, “The Viruses,” published in 1965 by Natural History Press, was followed in 1968 by “The Marvelous Animals.” In 1966, she was signed to a contract for a college biology textbook by Worth Publishers. The idea of a textbook written not by an academic, but by a professional science writer, in consultation with biology experts, was at that time revolutionary and greeted with skepticism. However, when Curtis’s “Biology” was published in 1968, it received a laudatory review in Scientific American by Nobel Laureate Salvador Luria. Through five editions in English it has sold 1.3 million copies. A shorter book, “Invitation to Biology,” has sold 600,000 copies. Both books have enjoyed success in Spanish and Italian editions, with more than 1 million of the books sold in Italian. On the later editions of both books, she was joined by N. Sue Barnes as co-author. Curtis also co-authored “Biology of Plants.”

Curtis’s books and her articles for encyclopedias, journals and magazines were praised for their scientific accuracy, elegant writing and wit. In 1988, Professor John O. Corliss of the University of Maryland said, with regard to the fifth edition of “Biology”: “The writing is about the closest to poetry that a scientific textbook can ever hope to get. It is thoroughly enjoyable, stimulating, imaginative, yet beautifully factual.”

The passage I've chosen is from her biology book. It's the opening paragraphs of Chapter 1. Her phrase "You and I are flesh and blood, but we are also stardust" is one of the most widely quoted sentences ever to come from a biology textbook.
Our universe began, according to current theory, with an explosion that filled all space, with every particle of matter hurled away from every other particle. The temperature at the time of the explosion—some 10 to 20 billion years ago—was about 100,000,000,000 degrees Celsius (10¹¹ °C). At this temperature, not even atoms could hold together; all matter was in the form of subatomic, elementary particles. Moving at enormous velocities, even those particles had fleeting lives. Colliding with great force, they annihilated one another, creating new particles and releasing great energy.

As the universe cooled, two types of stable particles, previously present only in relatively small amounts, began to assemble. (By this time, several hundred thousand years after the "big bang" is believed to have taken place, the temperature had dropped to a mere 2500°C, about the temperature of white-hot wire in an incandescent light bulb.) These particles—protons and neutrons—are very heavy as subatomic particles go. Held together by forces that are still incompletely understood, they formed the central cores, or nuclei, of atoms. These nuclei, with their positively charged protons, attracted small, light, negatively charged particles—electrons—which moved rapidly around them. Thus, atoms came into being.

It is from these atoms—blown apart, formed, and re-formed over the course of several billion years—that all the stars and planets of our universe are formed, including our particular star and planet. And it is from the atoms present on this planet that living systems assembled themselves and evolved. Each atom in our own bodies had its origin in that enormous explosion 10 to 20 billion years ago. You and I are flesh and blood, but we are also stardust.

This text begins where life begins, with the atom. At first, the universe aside, it might appear that lifeless atoms have little to do with biology. Bear with us, however. A closer look reveals that the activities we associate with being alive depend on combinations and exchanges between atoms, and the force that binds the electron to the atomic nucleus stores the energy that powers living systems.

Darwinism at the ROM

Yesterday I attended a symposium on evolution at the Royal Ontario Museum [Darwin Symposium at the ROM]. The emphasis was on Charles Darwin, in line with the Darwin exhibit that is currently running at the ROM.

What I was expecting was a series of lectures that explain how Darwin fits into modern ideas of evolutionary biology. What I got was an adaptationist lovefest.

This was a free public symposium. By the time it started every seat in the auditorium was full and people were standing at the back. There were about 320 people of all ages and all walks of life. I sat beside a high school teacher and talked to retirees from the suburbs.

The first speaker was Michael Ruse. The original title of his talk was Has Darwinism Expired? but he modified it slightly to Is Darwin's Theory Past Its "Sell By" Date. His opening remarks were promising because he mentioned Stephen Jay Gould and Gould's criticism of Darwinism. He said that this was a distorted picture of evolution. It was downhill from that point on.

Ruse never explained why modern evolutionary theory differs from Darwin's evolution by natural selection. Instead he spent close to an hour going over examples of "evolution by natural selection." Most of his examples were, indeed, evidence of evolution but they were not necessarily evidence of evolution by natural selection. It's clear that Michael Ruse does not distinguish between "evolution" and "natural selection." Evidence for evolution is treated as evidence for natural selection.

By the time he finished, the audience was completely unaware of random genetic drift, or any other mechanism of evolution. Ruse never explained why anyone would even bother to ask the question he asks in the title of his talk. According to Ruse, Darwinism is still the dominant paradigm in evolutionary biology. When examining characteristics of organisms, he says, biologists always ask "What is it for?" and the answer will be explained by natural selection. This is the adaptationist fallacy. The correct question should be "Is this 'for' anything?"

I know that Ruse is more of an adaptationist than a pluralist. I know that he favors Richard Dawkins and Daniel Dennett over Stephen Jay Gould and the pluralists. That's not the problem. What bothers me most is that when giving a public lecture Ruse does not even present the other side of the issue. What would it have cost him to mention that there are many evolutionary biologists who do not think of themselves as Darwinists? Why couldn't he explain that many of us think random genetic drift—and not natural selection—is the dominant mechanism of evolution? It doesn't diminish the importance of natural selection and adaptation. It doesn't diminish the contribution of Charles Darwin who still remains the greatest scientist who ever lived.

The second talk was by Spencer Barrett of the Department of Ecology & Evolutionary Biology here at the University of Toronto. Spencer Barrett was recently appointed to the rank of University Professor, our highest rank, in recognition of his work on evolution in flowering plants.

The title of his talk was A Darwinian Perspective on the Evolution of Plant Sexual Diversity and that's an accurate reflection of its content. Spencer Barrett is an adaptationist but in terms of his research he's a very successful example of this worldview. He chooses examples from plant evolution that almost certainly are adaptive and can be explained by natural selection. When faced with a strange example of plant sexual organs, Barrett begins by asking "What is the adaptive significance?"

After lunch we were treated to a lecture by Peter and Rosemary Grant on the evolution of Darwin's Finches. Most of you know the story. The Grants have spent 30 years collecting data on finches in the Galapagos. Everything about the evolution of Darwin's finches is explained by natural selection, especially changes in beak size. It has become the dominant example of evolution by natural selection.

The last lecture was delivered by Allan Baker of the Royal Ontario Museum. His title was Modern Darwinism: Natural Selection and Molecular Evolution. Baker works on bird evolution at the molecular level. He is trying to sort out the complicated, and controversial, relationship of bird clades. Baker pointed out that there are many conflicting data sets in the field and he explained how the use of signature sequences (in his case retrotransposon insertions) can be helpful. He noted in passing that he disagrees with the recent Science paper and cautioned that bird evolution is still very much up in the air.

The irony here is that Baker was not studying "Darwinian" evolution at all. In spite of his title, it's extremely unlikely that the changes he looks at are due to natural selection. This was another missed opportunity, in my opinion. Baker could have explained to this public audience that molecular evolution is not Darwinian. It is an example of random genetic drift, which, incidentally, is why there's a molecular clock.
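The link between drift and the molecular clock can be made concrete with a little arithmetic. The sketch below is my own illustration, not something presented at the symposium. It assumes the standard neutral-theory result that a new neutral mutation in a diploid population of N individuals becomes fixed with probability 1/(2N). The function name, the population sizes, and the mutation rate are hypothetical values chosen only for the example.

```python
from fractions import Fraction


def neutral_substitution_rate(pop_size: int, mutation_rate: Fraction) -> Fraction:
    """Expected neutral substitutions per site per generation (diploid population)."""
    gene_copies = 2 * pop_size
    # New neutral mutations entering the population each generation.
    new_mutations = gene_copies * mutation_rate
    # Assumption: each new neutral mutation fixes by drift with probability 1/(2N).
    fixation_probability = Fraction(1, gene_copies)
    return new_mutations * fixation_probability


# Hypothetical mutation rate per site per generation.
mu = Fraction(1, 10**8)

for n in (1_000, 100_000, 10_000_000):
    print(n, neutral_substitution_rate(n, mu))
```

Under these assumptions the population size cancels out and the substitution rate equals the mutation rate. That is the sense in which changes fixed by drift can accumulate at a roughly steady pace.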

In talking to the lecturers afterward, I tried to find out how they thought about evolution. Baker is well aware of the importance of random genetic drift. Barrett does not agree with me when I say that random genetic drift is the dominant mechanism of evolution at the molecular level and he does not agree that drift plays a role in speciation. Professor Barrett is one of the lecturers in our first year biology course on ecology and evolution. I've pointed out previously that in my second year course the students do not understand or appreciate random genetic drift and they tell me that it is barely mentioned in first year [Freedom in the Classroom]. I really enjoyed talking to Spencer Barrett and I hope we can continue the debate at another time.

The Grants claim that their evidence for natural selection is strong enough to rule out random genetic drift during the years when most of the finch population dies of starvation. The fluctuations in between could be due to drift.

Further reading ...

What Is Darwinism?
A Confused Philosopher
Darwin and Design by Michael Ruse
Why I'm Not a Darwinist
Evolution by Accident
Random Genetic Drift
Visible Mutations and Evolution by Natural Selection
Dennett on Adaptationism
The Evolution Poll of Sandwalk Readers

Friday, June 27, 2008

The Globe and Mail Reviews "Expelled"

So far, all of the Canadian reviews of Expelled have been good (i.e., they pan the movie). Some are more creative than others. Liam Lacy's review in the Globe and Mail is one of the best. It's set in the form of ten biblical verses [Expelled: No Intelligence Allowed]. The last verse is,
10. Then the Lord looked upon Ben Stein's work and declared: “Though I am a loving God, quite frankly, Ben, this film is an appallingly unscrupulous example of hack propaganda and it sucketh mightily. What's more, I didn't laugh once.”
The rating is "0." Denyse is going to be so disappointed in her journalist friends.

Mars to Sequence Cocoa Genome

Jonathan Eisen is excited because the Mars company is planning to sequence the cocoa genome [Combining two of my favorite things - chocolate and genomes]. See why this turns him on and why it isn't an example of science by press release.

The original press report is in the Washington Post [Unwrapping the Chocolate Genome]. The relevant point is the following ...
Mars plans to make the research results free and accessible through the Public Intellectual Property Resource for Agriculture, a group that supports agricultural innovation, as they become available. The intent is to prevent opportunists from patenting the plant's key genes.
Kudos to Mars if this turns out to be true.

In the interest of full disclosure I reveal that one of my children (Jane) works for Mars.

Thursday, June 26, 2008

Good Science Writers: David Suzuki

Of all the scientist writers who didn't make it into The Oxford Book of Modern Science Writing, David Suzuki surely counts as the most famous rejected Canadian.

David Takayoshi Suzuki earned his Ph.D. from the University of Chicago in 1961 and was a Professor at the University of British Columbia from 1963 until he retired in 2001. His research interests centered on the genetics of Drosophila melanogaster.

Suzuki founded the radio program Quirks and Quarks and serves as the host of the TV program On the Nature of Things. He has written 43 books and has received numerous awards for his contributions to science education. For the past three decades Suzuki has concentrated his efforts on environmental issues. Whether you agree with him or not, he is one of the world's best science writers.

The David Suzuki Foundation was set up, "to find ways for society to live in balance with the natural world that sustains us. Focusing on four program areas – oceans and sustainable fishing, climate change and clean energy, sustainability, and the Nature Challenge - the Foundation uses science and education to promote solutions that conserve nature and help achieve sustainability within a generation."

This essay on The Beauty of Wind Farms is copied from the New Scientist website. It was published on April 16, 2005.
OFF the coast of British Columbia in Canada is an island called Quadra, where I have a cabin that is as close to my heart as you can imagine. From my porch on a good day you can see clear across the waters of Georgia Strait to the snowy peaks of the rugged Coast Mountains. It is one of the most beautiful views I have seen. And I would gladly share it with a wind farm.

But sometimes it seems like I'm in the minority. All across Europe and North America, environmentalists are locking horns with the wind industry over the location of wind farms. In Alberta, one group is opposing a planned wind farm near Cypress Hills Provincial Park, claiming it would destroy views of the park and disturb some of the last remaining native prairie in the province. In the UK more than 100 national and local groups, led by some of the country's most prominent environmentalists, have argued that wind power is inefficient, destroys the ambience of the countryside and makes little difference to carbon emissions. And in the US, the Cape Wind Project, which would site 130 wind turbines off the coast of affluent Cape Cod, Massachusetts, has come under fire from famous liberals, including Senator Edward Kennedy and Walter Cronkite.

It is time for some perspective. With the growing urgency of climate change, we cannot have it both ways. We cannot shout from the rooftops about the dangers of global warming and then turn around and shout even louder about the "dangers" of windmills. Climate change is one of the greatest challenges humanity will face this century. It cannot be solved through good intentions. It will take a radical change in the way we produce and consume energy - another industrial revolution, this time for clean energy, conservation and efficiency.

We have undergone such transformations before and we can do it again. But first we must accept that all forms of energy have associated costs. Fossil fuels are limited in quantity and create vast amounts of pollution. Large-scale hydroelectric power floods valleys and destroys animal habitat. Nuclear power is terribly expensive and creates radioactive waste.

Wind power also has its downsides. It is highly visible and can kill birds. The fact is, though, that any man-made structure can kill birds - houses, radio towers, skyscrapers. In Toronto alone, it is estimated that 10,000 birds collide with the city's tallest buildings every year. Compared with this, the risk to birds from well-sited wind farms is very low.

Even at Altamont Pass in California, where 7000 turbines were erected on a migratory route, only 0.2 birds per turbine per year have been killed. Indeed, the real risk to birds comes not from windmills but from a changing climate, which threatens the very existence of bird species and their habitats. This is not to say that wind farms should be allowed to spring up anywhere. They should always be subject to environmental impact assessments. But a blanket "not in my backyard" approach is hypocritical and counterproductive.

Pursuing wind power as part of our move towards clean energy makes sense. It is the fastest-growing source of energy in the world - a $6 billion industry last year. Its cost has dropped dramatically over the past two decades because of larger turbines and greater knowledge of how to build, install and operate turbines more effectively. Prices will likely decrease further as the technology improves.

Are windmills ugly? I remember when Mostafa Tolba, executive director of the United Nations Environment Programme from 1976 to 1992, told me how when he was growing up in Egypt, smokestacks belching out smoke were considered signs of progress. Even as an adult concerned about pollution, it took him a long time to get over the instinctive pride he felt when he saw a tower pouring out clouds of smoke.

We see beauty through filters shaped by our values and beliefs. Some people think wind turbines are ugly. I think smokestacks, smog, acid rain, coal-fired power plants and climate change are ugly. I think windmills are beautiful. They harness the power of the wind to supply us with heat and light. They provide local jobs. They help clean our air and reduce climate change.

And if one day I look out from my cabin's porch and see a row of windmills spinning in the distance, I won't curse them. I will praise them. It will mean we are finally getting somewhere.
Today I was reminded of David Suzuki when John Pieret quoted from an article that Suzuki just published on [What a difference 50 years makes]. Here's an excerpt ...
I began speaking out on television in 1962 because I was shocked by the lack of understanding of science at a time when science as applied by industry, medicine, and the military was having such a profound impact on our lives. I felt we needed more scientific understanding if we were to make informed decisions about the forces shaping our lives. Today, thanks to computers and the Internet, and television, radio, and print media, we have access to more information than humanity has ever had. To my surprise, this access has not equipped us to make better decisions about such matters as climate change, peak oil, marine depletion, species extinction, and global pollution. That's largely because we now have access to so much information that we can find support for any prejudice or opinion.

Don't want to believe in evolution? No problem - you can find support for intelligent design and creationism in magazines, on websites, and in all kinds of books written by people with PhDs. Want to believe aliens came to Earth and abducted people? It's easy to find theories about how governments have covered up information on extraterrestrial aliens. Think human-induced climate change is junk science? Well, if you choose to read only certain national newspapers and magazines and listen only to certain popular commentators on television or radio, you'll never have to change your mind. And so it goes. The challenge today is that there is a huge volume of information out there, much of it biased or deliberately distorted. As I think about my grandson, his hopes and dreams and the immense issues my generation has bequeathed him, I realize what he and all young people need most are the tools of skepticism, critical thinking, the ability to assess the credibility of sources, and the humility to realize we all possess beliefs and values that must constantly be reexamined. With those tools, his generation will certainly leave a better world to its children and grandchildren 50 years from now.

[Photo Credit: Wikipedia: The copyright holder of this file, Joshua Sherurcij, allows anyone to use it for any purpose, provided that the copyright holder is properly attributed. Redistribution, derivative work, commercial use, and all other use is permitted.]

The Oxford Book of Modern Science Writing

The Oxford Book of Modern Science Writing, Richard Dawkins ed., Oxford University Press, Oxford, United Kingdom (2008)

I'd be lying if I said I read this book from cover to cover. Richard Dawkins has collected 83 short examples of science writing. Most of them are quite good, some of them are excellent, and some I didn't finish. The examples span the entire range of traditional science disciplines with a heavy emphasis on physics, astronomy, and biology. There are very few examples from chemistry or geology.

These are selections picked by Richard Dawkins and they strongly reflect his views of science and of good writing. One of the criteria for inclusion is "good writing by professional scientists, not excursions into science by professional writers" (p. xvii). There's nothing wrong with that, as long as Dawkins sticks to his guns.

Unfortunately, there are so many exceptions to the rule that one wonders why the rule was made up in the first place. We don't have anything by Carl Zimmer (a non-scientist), for example, but we do have examples from Matt Ridley who, while he has a Ph.D. in science, has never been a professional scientist. Neither has Daniel Dennett. Rachel Carson worked as a biologist for a while but she was a full-time writer by the time she wrote most of her books. Roger Lewin has never been a professional scientist, as far as I know.

Perhaps Dawkins meant to restrict his authors to those who have earned an advanced degree in science regardless of whether they actually became working scientists. Perhaps that's what he means by "professional scientist." If that's what he means then, as he points out on page 171, Margaret Thatcher might qualify since she got her Master's degree in Chemistry and worked with Dorothy Hodgkin as an undergraduate. Hodgkin is included. Thatcher isn't.

Dawkins has three other rules. First, all of the works were produced within the past 100 years. This is an excellent restriction, in my opinion. Second, the works must have been first published in English—no translations are allowed (with a few exceptions). Finally, no works by Richard Dawkins are included.

Dawkins introduces each author with a few paragraphs of background material that often includes personal anecdotes. This is where we learn that Dawkins and Gould, "enjoyed—or suffered—a kind of love/hate relationship." We also discover that Fred Hoyle wrote an article that serves as, "an example of the insight that a physical scientist can bring to biology," bearing in mind that it was written, "before Hoyle began the perverse campaign of his old age, against all aspects of Darwinism." [1]

Some of the choices are very pleasant surprises. I had never heard of James Jeans, the first author in the book. His entry is so interesting—and so in line with modern thought—that I can't resist a quotation. Keep in mind that this was written in 1930.
Into such a universe we have stumbled, if not exactly by mistake, at least as the result of what may properly be described as an accident. The use of such a word need not imply any surprise that our earth exists, for accidents will happen, and if the universe goes on for long enough, every conceivable accident is likely to happen in time.
Many of Dawkins' choices have nothing to do with biology but, of those that do, most extol the virtues of design and natural selection. For example, there is a passage from Helena Cronin's The Ant and the Peacock that's as fine an example of science-related prose as can be found anywhere in the anthology. It is good writing but I don't think it is good science. Dawkins does, and it's his book and his choice.

This brings me to an important point. In order to be included in any collection of good science writing the work has to be both good writing and good science. [2] Both of these criteria are subjective so whether an author is included or not will depend very much on the point of view of the editor. In this case we learn almost as much about Richard Dawkins as we do about the authors he selects.

And the authors he omits. Of the three giants of population genetics, R.A. Fisher and J.B.S. Haldane get included but Sewall Wright isn't even named. There's nothing by Richard Lewontin, Niles Eldredge, Gabriel Dover or David Raup. Ken Miller, Francis Collins, and Simon Conway-Morris are missing as well. Daniel Dennett is there but not Michael Ruse. One gets the impression that a similar book edited by Stephen Jay Gould would look quite different.

Now don't get me wrong. There's nothing wrong with this. As far as I'm concerned "good" science writing includes scientific accuracy and Dawkins has every right to pick and choose those authors who get it right, in his opinion. However, I'd prefer that editors lay their cards on the table and admit openly that their selections are influenced by this bias. [3]

No book of this sort should be complete without Peter Medawar and this book is no exception. Medawar's famous wit is unequaled, and often unappreciated. I fear it is a lost art. Dawkins is such a fan of Medawar (as am I) that he includes five excerpts from his books and essays—more than any other science writer.

Let me close with a quotation from Peter Medawar's essay on Science and Literature where Medawar is discussing a modern trend among philosophers and scientists to write very complicated prose,
Let me end this section with a declaration of my own. In all territories of thought which science or philosophy can lay claim to, including those upon which literature also has a proper claim, no one who has something original or important to say will willingly run the risk of being misunderstood: people who write obscurely are either unskilled in writing or up to some mischief. The writers I am speaking of are, however, in a purely literary sense, extremely skilled.

1. In contrast to Dawkins' praise, I found the passage from Hoyle to be almost incomprehensible. It is not good science writing, in my opinion, and it certainly isn't "insightful."

2. There are exceptions. Dawkins included a passage from R.A. Fisher's book The Genetical Theory of Natural Selection even though he (Dawkins) recognizes that it may not be an example of good writing.

3. I'm going to post some examples of my own biases with respect to good science writing, concentrating almost exclusively on those writers that don't appear in The Oxford Book of Modern Science Writing.

Wednesday, June 25, 2008

Nobel Laureate: Susumu Tonegawa


The Nobel Prize in Physiology or Medicine 1987.
"for his discovery of the genetic principle for generation of antibody diversity"

Susumu Tonegawa (1939 - ) received the Nobel Prize in Physiology or Medicine for working out the mechanism of generating antibody diversity. This was one of the fundamental problems in immunology—and, indeed, all of biology. How do antibodies recognize so many different antigens?

We now know the answer thanks to Tonegawa and his coworkers. The genes for antibody proteins are constructed in antibody-producing cells by recombining bits of DNA from several different locations. Millions of different permutations can be constructed to create a random library of antibody molecules. The chance that one of these randomly constructed antibody molecules will recognize a new antigen, such as virus, is very high.
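A quick calculation shows how recombining a small number of DNA segments can produce so many antibodies. This is only a sketch: the segment counts below are hypothetical round numbers that I picked for illustration, not figures from Tonegawa's papers, and the function names are my own. It counts only the variable-region segments (V and J for the short chain; V, D, and J for the long chain) and ignores the extra diversity that comes from imprecise joining.

```python
from math import prod
from typing import Mapping

# Hypothetical segment counts, chosen only for illustration.
LIGHT_CHAIN_SEGMENTS = {"V": 40, "J": 5}
HEAVY_CHAIN_SEGMENTS = {"V": 40, "D": 25, "J": 6}


def chain_combinations(segments: Mapping[str, int]) -> int:
    """One segment is picked from each family, so the counts multiply."""
    return prod(segments.values())


def antibody_combinations(light: Mapping[str, int], heavy: Mapping[str, int]) -> int:
    """Any light chain can pair with any heavy chain."""
    return chain_combinations(light) * chain_combinations(heavy)


total_segments = sum(LIGHT_CHAIN_SEGMENTS.values()) + sum(HEAVY_CHAIN_SEGMENTS.values())
print("segments in the genome:", total_segments)
print("light chains:", chain_combinations(LIGHT_CHAIN_SEGMENTS))
print("heavy chains:", chain_combinations(HEAVY_CHAIN_SEGMENTS))
print("antibodies:", antibody_combinations(LIGHT_CHAIN_SEGMENTS, HEAVY_CHAIN_SEGMENTS))
```

With these made-up numbers, 116 segments give 200 light chains, 6,000 heavy chains, and 1.2 million possible antibodies. The point is the multiplication, not the particular values.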

This is one of the most significant Nobel Prizes of the 20th century. Tonegawa's discovery deserves a lot more attention than it normally gets.

The presentation speech was delivered (in Swedish) by Professor Hans Wigzell of the Karolinska Institute.

Your Majesties, Your Royal Highnesses, Ladies and Gentlemen,

The defence of our body against infections is carried out by the immune system, a talented cellular society with a capacity to distinguish between self and non-self and with a memory capable of remembering a previous contact for decades. The system is managing this through the inbuilt capacity in a single human being to produce billions of different forms of protective molecules, antibodies. The Nobel Prize of this year is given for the elucidation of the unique capacity of the immune system to produce this enormous diversity of specific antibodies.

Susumu Tonegawa is the great molecular biologist in immunology. In a series of ingenious experiments carried out in the middle of the 1970's he solved the problem how our limited genetic material is capable of generating the diversity required to create protection against established as well as future disease provoking microorganisms. When Tonegawa did his experiments at the Basel Institute of Immunology in Switzerland other scientists had already generated a considerable amount of knowledge regarding the features and functions of antibodies. But this knowledge had also led to uncertainty and even confusion. Antibodies are proteins and their structure is strictly ruled by genes, by the DNA in our chromosomes. When Tonegawa carried out his experiments it was commonly believed that each protein, each polypeptide chain, was governed by its gene in a relation one to one. But at the same time calculations on the number of genes in the chromosomes in man determining proteins gave a number probably below one hundred thousand genes. They should suffice to all the proteins in the body, to the hemoglobin in the red blood cells, to the pigment in our eyes and so on. Only a minor part, maybe one percent, could probably be used for the creation of antibodies. Around one thousand genes being able to create billions of different forms of antibodies? The equation seemed impossible to solve.

Our antibodies are made up of two sorts of polypeptide chains, short and long ones. Tonegawa did first acquire a toolbox, filling it with the best precision tools there were of hybrid-DNA nature, developed new methods and started to study the actual construction of the genes determining the short chains of antibody molecules. He discovered something entirely new and revolutionary in genetics. On the chromosome where the gene for the short chain was expected to be located, there was not one single, but a string, of pearls of genes. One special gene resided at one position whereas two other sets of variable genes create two gene families, in all maybe around one hundred genes. When a cell should start to make antibodies - this was preceded by a gene-lottery.

One member of the largest gene family selected at random was cut out from the chromosome and moved close to a member of the second gene family, whereafter they created a functional gene for the short chain together with the solitary gene. Three and not one gene participate in the creation of the short chain of antibody molecules. Each member in one family can probably be linked to any one of the members of the second gene family, increasing variability by multiplication. The results showed beyond doubt that our body has the capacity to carry out advanced recombinant DNA processes. The intelligence of Nature can also be seen as the studies went on. The recombination of genes and their coupling together do not occur in exactly the correct manner. While such relative misfits should in other systems be bad, here they constituted yet another mechanism of increasing the diversity of antibodies. Experiments by Tonegawa as well as other scientists also revealed that the same genetic lottery principle did apply to the generation of the long chain although here the number of variants were even larger. Four different genes could be shown to create these chains together. The number of variant short chains should then be multiplied by the combinatorial possibilities of the heavy chain to give the variation at the antibody level, a fact which will also drastically enhance the diversity of antibodies.

The equation was in essence solved. A few hundred genes are used by the body in a new, revolutionary way and can thus generate billions of different antibodies. Through this genetic lottery the immune system is always prepared to react against known as well as unknown microorganisms. The economic usage of precious DNA is compensated by wasting more dispensable material. Every minute our body produces several millions of white blood cells - lymphocytes. Each one of these has undergone the hybrid-DNA procedure and is prepared with its own, unique antibodies. If not called upon to react they will rapidly die. If, however, they make contact with the fitting foreign structures they receive a reward, i.e., they are allowed to proliferate and live longer. After the great randomized gene lottery natural selection will pick the winners, thereby generating specific immunity, the cheapest and most efficient protection there is against infections.

Dr. Tonegawa,

On behalf of the Nobel Assembly of the Karolinska Institute I would like to congratulate you on your outstanding accomplishments and ask you to receive the Nobel Prize in Physiology or Medicine from the hands of His Majesty the King.

[Photo Credit: Nature]

[Figure credit: The figure showing immunoglobulin gene rearrangement is from]

Tangled Bank #108

The latest issue of Tangled Bank is #108. It's hosted at Wheat-dogg's world [The Tangled Bank #108].
Welcome to The Tangled Bank 108 and to the little-known but still fascinating Wheat-dogg’s World. I hope that after you peruse the fine entries in this edition of The Tangled Bank you’ll stroll around and check out things here in my neck of the Worldwide Woods.

Today we have science bloggers musing on some of the greater profundities of the universe as well on more concrete issues closer to home. Some of these posts ask more questions than they answer, but heck that’s what science is all about, hey?

If you want to submit an article to Tangled Bank send an email message to Be sure to include the words "Tangled Bank" in the subject line. Remember that this carnival only accepts one submission per week from each blogger. For some of you that's going to be a serious problem. You have to pick your best article on biology.

Get a Job at CFI

The Center for Inquiry (Toronto) is looking for someone to help out with their various activities. This is a part-time job. It's a wonderful opportunity for a student.

Please spread the word to anyone interested in taking on a decisive leadership position in the expanding freethought movement in Canada. Position starts mid July.

Deadline to apply: Monday, July 7.
Full info and updates will be provided online.
Visit Center for Inquiry.

The Centre for Inquiry is an international education and outreach organization dedicated to promoting and advancing reason, science, secular ethics and freedom of inquiry in all areas of human endeavour. We engage in educational lectures, debates and conferences, coordinate 30 campus freethought groups across Canada, run a robust series of secular humanist social and community services, and undertake advocacy defending church-state separation, the integrity of science and equality rights for non-believers. The new CFI Ontario is CFI's first location in Canada and our nation’s premiere venue for secular humanists, skeptics and freethinkers.


This position is two-fold:
  1. The successful candidate will act as an assistant director at CFI Canada headquarters in Toronto. He/she will lead CFI Ontario’s in-house and ongoing programming, event planning and hosting, promotions, newsletter publishing, social services, campus outreach and membership committees. There will be numerous leadership opportunities through support staff and volunteer recruitment, training, supervision and delegation.
  2. CFI's Canadian operations have recently expanded with the launching of new Communities in Montreal and Calgary and the anticipated launch of a Community of Vancouver in the next few months. The successful candidate will provide organizing assistance to our new CFI Communities in Canada.


If you are interested in applying, please email a cover letter, resume/CV and writing sample as a text, Word or PDF attachment, to Justin Trottier at Include a brief statement of your academic background, interests, your activities with the skeptic or humanist movements and/or other extracurricular, community, work or voluntary experience of relevance, and why interning at CFI is something you want to do. You are also encouraged to include any documentation or samples of your relevant experiences (eg. media coverage of your event, political policy statement you wrote, poster you created, etc).

This is an exciting opportunity to contribute to the overall growth of the secular community in Canada and to strengthen your relationship with CFI. We hope you will consider joining us.

Position starts mid July. Deadline to apply: Monday, July 7


This position will last one year with the possibility of renewal. The daily and weekly time commitment are flexible but would work out to ~ 15-20 hours/week. Please indicate your daily and weekly availability as well as the duration of your commitment.


An understanding of the freethought/humanist/skeptic community and/or some demonstration of commitment to the values of free and critical inquiry is essential.

To perform this job successfully an individual must possess excellent skills in organization, promoting and leading. The individual must also have the ability to exercise independent judgment and manage multiple priorities, the ability to organize and lead volunteers, strong verbal and written communication skills, and the ability to represent CFI via public speaking and media appearances. The job frequently involves speaking in front of crowds and other PR activities for which the successful candidate must be comfortable, experienced and proficient. Knowledge of the non-profit sector and community development strategies is ideal.

Since there is some travel access to own car is very helpful. In addition, because the computer resources at CFI Ontario are limited, access to own laptop is also ideal.

Since the successful candidate will be involved in setting up our Community of Montreal, he/she must be very comfortable conversing and writing in the French language. In addition, some knowledge of Montreal and Quebec culture is ideal.


To assist in specific projects, the following technical background is helpful, though not completely required. Candidates without such background should still apply. Candidates with such technical knowledge should highlight it in application:
  • Web development experience
  • Basic image editing skills in Adobe Photoshop or similar program
  • Experience using and maintaining SQL databases (eg. MySQL) or similar technology
  • Basic understanding of video technology and video editing, uploading and embedding (e.g. through youtube or google video)
  • Proficiency in Adobe Illustrator, Photoshop, Microsoft Publisher or similar program for poster and ad creation

Tuesday, June 24, 2008

Selfish Genes

In honey bee colonies the queen is the only fertile female. She lays all the eggs. The worker bees are female but sterile. The process of ovulation in worker bees is suppressed in response to pheromones. This is an example of genetic altruism where the reproductive benefit of worker bees is suppressed in favor of the good of the hive. There are good theories about why this would ultimately benefit the workers.

In some hives, a few worker bees can lay eggs and these eggs will hatch. The presence of "cheaters" in an altruistic society is expected and normal. Oxley et al. (2008) looked at the DNA from these "cheater" hives and compared it to the DNA from bees that were sterile. The idea was to identify the gene responsible for suppressing ovulation in workers; presumably that gene was somehow different in the hives with "cheaters."

Here's the abstract of the paper.
The all-female worker caste of the honey bee (Apis mellifera) is effectively barren in that workers refrain from laying eggs in the presence of a fecund queen. The mechanism by which workers switch off their ovaries in queenright colonies is pheromonally cued, but there is genetically-based variation among individuals: some workers have high thresholds for ovary activation, while for others the response threshold is lower. Genetic variation for threshold response by workers to ovary-suppressing cues is most evident in "anarchist" colonies in which mutant patrilines have a proportion of workers that activate their ovaries and lay eggs, despite the presence of a queen. In this study we use a selected anarchist line to create a backcross queenright colony that segregated for high and low levels of ovary activation. We used 191 informative microsatellite loci, covering all 16 linkage groups to identify QTLs for ovary activation and test the hypothesis that anarchy is recessively inherited. We reject this hypothesis, but identify four QTLs that together explain approximately 25% of the phenotypic variance for ovary activation in our mapping population. They provide the first molecular evidence for the existence of quantitative loci that influence selfish cheating behavior in a social animal.
This is an interesting paper but that's not the reason for this posting. The real reason is to contrast the actual paper with the press release from the University of Western Ontario (Canada) [Discovery proves 'selfish gene' exists]. Here's the complete press release.
A new discovery by a scientist from The University of Western Ontario provides conclusive evidence to support decades-old evolutionary beliefs about the existence of a so-called selfish gene.

Since renowned British biologist Richard Dawkins ("The God Delusion") introduced the concept of the ‘selfish gene’ in 1976, scientists the world over have hailed the theory as a natural extension to the work of Charles Darwin.

In studying genomes, the word ‘selfish’ does not refer to self-centred behaviour but rather to the blind tendency of genes wanting to continue their existence into the next generation. Ironically, this ‘selfish’ tendency can appear anything but selfish when the gene does move ahead for selfless and even self-sacrificing reasons.

For instance, in the honey bee colony, a complex social breeding system described as a ‘super-organism,’ female worker bees are sterile. The adult queen bee, selected and developed by worker bees, is left to mate with male drones.

Because the ‘selfish’ gene controlling worker sterility has never been isolated by scientists, the understanding of how reproductive altruism can evolve has been entirely theoretical – until now.

Working with Peter Oxley of the University of Sydney in Australia, Western biology professor Graham Thompson has, for the first time, isolated a region on the honey bee genome that houses this ‘selfish’ gene in female workers bees.

“We don’t know exactly which gene it is, but we’re getting close.”

“This basically provides a validation for a huge body of socio-biology,” says Thompson, who adds the completion of Honey Bee Genome Project in 2006 was crucial to this discovery.
So, what's the beef? The problem is that the press release is horribly confusing. In The Selfish Gene Dawkins argues that one can look at evolution from the perspective of the gene and not the organism. The goal of each and every gene, according to Dawkins, is to replicate itself and pass on copies to future generations. Every gene (allele) is selfish in his view. The selfish gene of The Selfish Gene has nothing to do with altruism. At least not directly.

Now, according to Dawkins' view of evolution, worker sterility is not a violation of the selfish gene principle. Dawkins believes that Hamilton is correct and that altruistic behavior can be explained as an indirect way of propagating one's genes to future generations. Thus, the bee gene is a selfish gene in the Dawkins sense, but so is every other gene (allele) in the bee genome.

This study is not "conclusive evidence" of selfish genes. We've had that kind of conclusive evidence ever since the discovery of alleles that confer a fitness advantage, such as alleles for antibiotic resistance in bacteria. This study is interesting because it points to the discovery of altruistic genes (alleles) but that is something quite different from what it says in the press release. The "cheater" allele represents selfishness of a different kind.

Incidentally, there's nothing in the paper itself about "selfish genes" or Richard Dawkins.

If you want to follow up on this topic you should read the comments on Richard [New discovery proves 'selfish gene' exists]. As you might imagine, the readers over there are split between those who hail this as confirmation that Dawkins is vindicated and those who have actually read The Selfish Gene. Many are calling for clarification from Richard Dawkins himself. I hope he responds because this is a perfect opportunity for him to set the record straight.

[Photo Credit: The Telegraph (UK)]

Oxley, P.R., Thompson, G.J., Oldroyd, B.P. (2008) Four QTLs that Influence Worker Sterility in the Honey Bee (Apis mellifera). Genetics. 2008 Jun 18. [Epub ahead of print] [PubMed] [DOI: 10.1534/genetics.108.087270]

Monday, June 23, 2008

Monday's Molecule #77

Today's molecule is related to a previous Monday's Molecule. This time you have to name the molecule and identify the various symbols on the cartoon. Be as specific as possible.

There's a direct connection between today's molecule and a Nobel Prize. The prize was awarded for figuring out how this molecule was made. It was one of the most brilliant discoveries of the 20th century.

The first person to correctly identify the molecule (and its parts) and name the Nobel Laureate(s), wins a free lunch at the Faculty Club. Previous winners are ineligible for one month from the time they first collected the prize. There are five ineligible candidates for this week's reward. You know who you are.


Nobel Laureates
Send your guess to Sandwalk (sandwalk (at) and I'll pick the first email message that correctly identifies the molecule and names the Nobel Laureate(s). Note that I'm not going to repeat Nobel Laureate(s) so you might want to check the list of previous Sandwalk postings by clicking on the link in the theme box.

Correct responses will be posted tomorrow. I may select multiple winners if several people get it right.

Comments will be blocked for 24 hours. Comments are now open.

UPDATE: The molecule is immunoglobulin G (IgG)—same as last week. The V, D, and J symbols stand for variable, diversity, and joining regions of the protein. The antigen binding site is formed from the combination of these regions on the heavy (H) chain and the light (L) chain of the molecule. The ability of antibodies to recognize a huge number of different antigens is due to formation of a huge number of different antigen-binding sites. This is achieved by rearranging the genome in order to bring together one of hundreds of V regions with 20 or so D regions and 5-6 J regions. The recombination events are associated with mutations that serve to create even more diversity.

The generation of antibody diversity by genomic rearrangement was discovered by Susumu Tonegawa who received the Nobel Prize in 1987. Today's winner is Alex Ling of the University of Toronto.
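
To get a feel for the arithmetic behind that diversity, here's a back-of-the-envelope sketch in Python. The segment counts are placeholders picked from the rough ranges in the update above (hundreds of V regions, 20 or so D regions, 5-6 J regions). The light chain numbers are purely hypothetical, and the extra diversity from mutations at the junctions is ignored, so treat the result as an illustration and not a measurement.

```python
def heavy_chain_combinations(n_v, n_d, n_j):
    """Number of V-D-J combinations for the heavy chain."""
    return n_v * n_d * n_j


def light_chain_combinations(n_v, n_j):
    """Number of V-J combinations for the light chain (no D segment)."""
    return n_v * n_j


# Assumed, illustrative segment counts (not taken from any genome annotation).
heavy = heavy_chain_combinations(n_v=200, n_d=20, n_j=6)
light = light_chain_combinations(n_v=100, n_j=5)

# Any heavy chain can pair with any light chain, so the counts multiply.
antibody_combinations = heavy * light

print(f"heavy chain combinations: {heavy:,}")
print(f"light chain combinations: {light:,}")
print(f"heavy x light pairings:   {antibody_combinations:,}")
```

With these made-up counts, a few hundred gene segments already give millions of possible pairings before any junctional mutations are taken into account.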

Peter McKnight Reviews Expelled

The movie is called Expelled: No Intelligence Allowed. Perhaps you've heard of it? After flopping in the USA it's about to open in Canada on June 27th. I heard an ad on the radio today as I was driving to work.

The movie is full of errors and lies. For a complete list, check out Expelled Exposed.

Peter McKnight of The Vancouver Sun reviewed the movie last Saturday [No intelligence allowed in Stein's film]. It's an excellent review. I recommend you read the whole thing. Here are the opening paragraphs ...
Although you're probably not aware of it, scientists, lobby groups, the media and the courts are all united in a massive conspiracy to destroy your freedom. But have no fear, freedom fighter Ben Stein is here.

That, in effect, is the thesis of Expelled: No Intelligence Allowed, the new anti-science "documentary" which opens across Canada on June 27, was produced by Vancouver's Premise Media, and stars Stein, the lawyer, actor, game show host and speechwriter for former U.S. president Richard Nixon.

The subtitle of the film is wholly appropriate as there is precious little intelligence displayed in its more than 90 minutes. But the subtitle's reference to the content of the film was unwitting -- it was meant to refer to a giant conspiracy to banish intelligent design theory from the halls of academe and the culture as a whole.

[Hat Tip: John Pieret]

Darwin Symposium at the ROM

Come to the Darwin Symposium at the Royal Ontario Museum (Toronto) this Friday.
Darwin Symposium

Friday, June 27, 10:30 am - 3:30 pm

Status: Available

Join these leading thinkers in Evolution and Darwinism for a day of fascinating presentations.

10:30 am - 11:30 am
Michael Ruse: Has Darwinism Expired?

11:30 am - 12:30 pm
Spencer Barrett: A Darwinian Perspective on the Evolution of Plant Sexual Diversity

12:30 - 1:30 pm, LUNCH BREAK

1:30 - 2:30 pm
Rosemary and Peter Grant: Darwin's Finches

2:30 - 3:30 pm
Alan Baker: Modern Darwinism: Natural Selection and Molecular Evolution

Talks will be approximately 40-45 minutes long with a question and answer period after each talk. Lunch is not included.

Location: Royal Ontario Museum, Level 1B
Signy and Cléophée Eaton Theatre; Please enter through the South Entrance.

Cost: Free Lecture. Museum admission not included.

Come and Meet the Friendly Atheist!


Hemant Mehta (Friendly Atheist) is giving a talk in Toronto this Friday. The talk is sponsored by the University of Toronto Secular Alliance.
When? Friday, June 27th, 2008 (Meet & Greet: 6-7pm, Lecture: 7pm-9pm)

Where? U of T's Multi-Faith Centre, 569 Spadina Ave, Multi-Purpose Room (2nd Floor)

How Much? Free!

RSVP (if possible): Facebook

Hemant Mehta comes to Toronto to talk about faith and his experience with "selling his soul".

Hemant Mehta was born into the Jain faith, and became an atheist at age 14. In 2006, he created an auction on eBay offering up his atheist mind and body to attend a worship service of the winning bidder's choice. Every $10 would equal one hour in that particular place of worship.

The bidding ended on February 3, 2006 with the final bid sitting at $504 from Jim Henderson, a minister from Seattle, Washington. The money was later donated by Hemant to the Secular Student Alliance, a non-profit organization. The agreement was for Hemant to visit a variety of churches and to write about his experiences at them at the web-site

Hemant later developed these experiences into his book "I Sold My Soul On eBay". He continues to open up dialogue at his personal blog

Co-hosted by the University of Guelph Skeptics, sponsored by the Secular Students Alliance.
Even though the meeting notice and the RSVP are on Facebook, I'm told that the meeting is not exclusively for university students. If you don't have a Facebook account (I don't) then you can just show up at the meeting (I think).

Sunday, June 22, 2008

Professor Sues Students for "Anti-intellectualism"

Priya Venkatesan earned a Master's degree in genetics then went on to a Ph.D. in literature. She was a Professor at Dartmouth last year and now has a position at Northwestern.

Here's an excerpt from an article she wrote in Dartmouth Medicine last summer [Yin, meet yang].
In graduate school, I was inculcated in the tenets of a field known as science studies, which teaches that scientific knowledge has suspect access to truth and that science is motivated by politics and human interest. This is known as social constructivism and is the reigning mantra in science studies, which considers historical and sociological understandings of science. From the vantage point of social constructivism, scientific facts are not discovered but rather created within a social framework. In other words, scientific facts do not correspond to a natural reality but conform to a social construct.


In many ways, social constructivism has been reframed as postmodernism, since both movements question the scientific realm's theory of truth—that is, that scientific facts mirror an external reality which does indeed exist. However, this reframing is unnecessary, since clear distinctions exist between social constructivism and postmodernism. Through my experience in the laboratory, I have found that postmodernism offers a constructive critique of science in ways that social constructivism cannot, due to postmodernism's emphasis on openly addressing the presupposed moral aims of science. In other words, I find that while an individual ethic of motivation exists, and indeed guides the conduct of laboratory routine, I have also observed that a moral framework—one in which the social implications of science and technology are addressed—is clearly absent in scientific settings. Yet I believe such a framework is necessary. Postmodernism maintains that it is within the rhetorical apparatus of science—how scientists talk about their work—that these moral aims of science may be accomplished.
By all accounts, Priya Venkatesan is one of those post-modernist thinkers who are much more impressed with their words than with their ideas. They are the ones spoofed by the Sokal hoax back in 1996. Venkatesan is the author of a book called Molecular Biology in Narrative Form, which is also the title of her Ph.D. thesis. At least one reviewer thought the book was a joke.

Following all the forms of both Literary deconstructionist criticism as well as Scientific Peer-reviewed journals, one might actually think this was a serious work -- and indeed, the sheer volume of dense text only adds to the realism. Happily, the content of the text seems to be some remarkably well-disguised trippy-hippy post-modernist anti-establishment pseudo-feminist rant taken directly from protest speeches from the sixties.

Well Done! This deserves a place on your shelf alongside your bound volumes of The Journal of Improbable Research and other 'Mad Scientist' Jokebooks!
Professor Venkatesan taught a freshman course on "Science, Technology and Society" at Dartmouth where she expounded at length on various post-modern ideas and the concept that, "scientific facts do not correspond to a natural reality but conform to a social construct." During one of those lectures some of her students challenged her and attempted to refute her post-modernist views. Some students applauded the dissidents.

The result was traumatic for Professor Venkatesan and she ended up quitting her job at Dartmouth. She also launched a lawsuit against her students for violating her civil rights. You can read about it in the Wall Street Journal [Dartmouth's 'Hostile' Environment].
After a winter of discontent, the snapping point came while Ms. Venkatesan was lecturing on "ecofeminism," which holds, in part, that scientific advancements benefit the patriarchy but leave women out. One student took issue, and reasonably so – actually, empirically so. But "these weren't thoughtful statements," Ms. Venkatesan protests. "They were irrational." The class thought otherwise. Following what she calls the student's "diatribe," several of his classmates applauded.

Ms. Venkatesan informed her pupils that their behavior was "fascist demagoguery." Then, after consulting a physician about "intellectual distress," she cancelled classes for a week. Thus the pending litigation.
Her lawsuit has been dropped but the fact that she initiated it in the first place is deeply troubling.

Here's the text of the email message she sent to her students.
Date: Sat, 26 Apr 2008 20:56:35 -0400 (EDT)
From: Priya.Venkatesan@Dartmouth.EDU
To: "WRIT.005.17.18-WI08":;, Priya.Venkatesan@Dartmouth.EDU
Subject: WRIT.005.17.18-WI08: Possible lawsuit

Dear former class members of Science, Technology and Society:

I tried to send an email through my server but got undelivered messages. I regret to inform you that I am pursuing a lawsuit in which I am accusing some of you (whom shall go unmentioned in this email) of violating Title VII of anti-federal [SIC] discrimination laws.
The feeling that I am getting from the outside world is that Dartmouth is considered a bigoted place, so this may not be news and I may be successful in this lawsuit.
I am also writing a book detailing my experiences as your instructor, which will "name names" so to speak. I have all of your evaluations and these will be reproduced in the book.

Have a nice day.
In most of my classes I'd be delighted if students challenged what I was saying and engaged in debate. That's what university is supposed to be all about. The troubling aspect of this case is not only that students were dissatisfied with what Professor Venkatesan was saying in class but, more importantly, that Dartmouth gave her the opportunity to spout such nonsense to freshman students in the first place. What was Dartmouth thinking?

The sad thing is that we know the answer to the last question. There are too many universities these days whose faculties subscribe to the gibberish of these post-modernist pseudo-intellectuals. This may be a bigger threat to university students than creationism.

There's lots of stuff about this on the internet. I wish I could have included a link to some of Venkatesan's supporters but I couldn't find any.

The Dartmouth Review: TDR Interview: Priya Venkatesan '90

Sepiamutiny: The Strange, Twisted Tale of Priya Venkatesan, PhD

The Reference Frame: Priya Venkatesan: a mad scholar sues her students

[Hat Tip: John Hawks Anthropology Weblog]

Saturday, June 21, 2008

Sequence Alignment

Sequence alignment is one of the crucial steps in deciding whether two genes/proteins are homologous. The two sequences are aligned from one end to the other and the number of identical, or similar, residues is counted. If this number reaches a significant percentage of the total length (usually >25%) then the two sequences are homologous—they descend from a common ancestor.
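
The counting step can be sketched in a few lines of Python. This is only a minimal illustration, assuming the two sequences have already been aligned to the same length. The sequences are invented toy examples, and the 25% cutoff is just the rule of thumb mentioned above, not a statistical test.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of alignment positions where both sequences have the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # A gap character ('-') never counts as an identity.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# Two made-up peptide fragments of equal length.
first = "MKVLAAGIVG"
second = "MKVLSAGLVG"

identity = percent_identity(first, second)
print(f"{identity:.1f}% identical")
print("above the 25% rule of thumb" if identity > 25 else "below the 25% rule of thumb")
```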

Sequence alignment is not straightforward, even for two sequences, because in addition to substitutions the genes might have undergone insertions or deletions (indels). In order to identify conserved residues, one needs to insert gaps in one sequence or the other to compensate for these indel events.

You can't just willy-nilly stick in gaps to maximize the number of aligned residues because the gaps represent true historical events (insertions and deletions). In theory, you can get high identity scores with any two sequences as long as you insert enough gaps but that isn't allowed. When the alignment is done by computer algorithm, each gap is associated with a gap penalty.

The determination of proper gap penalties is a major challenge in multiple sequence alignment. A crude estimate is that each gap comes with a penalty of 3; that is, you have to generate at least three identities in order to make the gap worthwhile. The number of gaps and gap penalties have to be subtracted from the identity/similarity scores when deciding about homology. (This isn't always done.)
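
Here's a rough sketch of that bookkeeping in Python, using the crude penalty of 3 per gap. I'm assuming that a "gap" means one contiguous run of dashes (one insertion or deletion event) and not each individual dash. Real alignment programs use more elaborate schemes, and the sequences below are invented.

```python
import re

GAP_PENALTY = 3  # the crude per-gap penalty mentioned above


def count_gaps(aligned_seq):
    """Count gaps as runs of dashes; each run is treated as one indel event (an assumption)."""
    return len(re.findall(r"-+", aligned_seq))


def penalized_score(aligned_a, aligned_b, gap_penalty=GAP_PENALTY):
    """Number of identities minus a fixed penalty for every gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identities = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    gaps = count_gaps(aligned_a) + count_gaps(aligned_b)
    return identities - gap_penalty * gaps


# Toy alignment: one two-residue gap in the first sequence.
print(penalized_score("MKV--LAAG", "MKVSTLAAG"))
```

In this toy case the single gap costs 3 points against 7 identities. Had the gap produced fewer than three new identities, a gapless alignment would have scored better.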

Here's an example of a multiple sequence alignment from a region of bacterial HSP70 genes. The letters represent the amino acid residues and the dashes are gaps due to insertions and deletions.

The HSP70 genes are the most highly conserved genes in biology so, in principle, it should be easy to align them. In fact, it is easy in most regions but the one shown above is the most difficult. This is a manual alignment that takes into account the similarities of groups of sequences. Those that are most similar are clustered together and whenever possible the alignment is adjusted so that the positions of the gaps in the most closely related sequences are identical.

This is a procedure known as phylogenetic alignment but it would be better to call it similarity alignment because what we're actually doing is clustering sequences by their overall similarity and not their phylogeny. (The fact that their phylogenetic relatedness closely corresponds to their similarity is a consequence of the analysis and not a cause.)1.

The placing of gaps in this region of HSP70 sequences is very difficult. No computer program can come close to achieving the quality of alignments that well-trained humans can achieve. That's because the overall alignment has to take into account a number of variables simultaneously and the progressive alignment takes many trial-and-error steps. As a general rule of thumb, if you see a paper where phylogenetic trees are constructed using computer-generated multiple sequence alignments only, then you should assign a low confidence value to that work.

Is this important? Indeed it is. The exact nature and position of the large gap in the above sequences, for example, plays an important role in testing the Three Domain Hypothesis. Different alignments give different trees and the most important variable is the position of gaps.

This brings me to an important paper just published in this week's issue of Science. Löytynoja and Goldman (2008) have developed a new algorithm for multiple sequence alignment. The abstract of their paper describes the problem, and their solution.
Genetic sequence alignment is the basis of many evolutionary and comparative studies, and errors in alignments lead to errors in the interpretation of evolutionary information in genomes. Traditional multiple sequence alignment methods disregard the phylogenetic implications of gap patterns that they create and infer systematically biased alignments with excess deletions and substitutions, too few insertions, and implausible insertion-deletion–event histories. We present a method that prevents these systematic errors by recognizing insertions and deletions as distinct evolutionary events. We show theoretically and practically that this improves the quality of sequence alignments and downstream analyses over a wide range of realistic alignment problems. These results suggest that insertions and sequence turnover are more common than is currently thought and challenge the conventional picture of sequence evolution and mechanisms of functional and structural changes.
The authors test their phylogeny-aware program (PRANK) against several other multiple sequence alignment programs (ClustalW, MAFFT, MUSCLE, and T-COFFEE) using a set of sequences that were "evolved" using a computer program that created substitutions and insertions/deletions. Since the true phylogeny of this artificial set is known, they were able to evaluate the performance of the various programs.
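
When the correct alignment is known, as it is for simulated sequences, one common way to grade a program's output is to ask what fraction of the truly aligned residue pairs it recovers. The Python sketch below shows that idea with a simple sum-of-pairs measure. It is my own illustration and not necessarily the measure used in the paper, and the two tiny alignments are invented.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    alignment is a list of equal-length strings with '-' marking gaps.
    Each residue is identified as (sequence index, residue index).
    """
    counters = [0] * len(alignment)
    pairs = set()
    for column in zip(*alignment):
        present = []
        for seq_index, char in enumerate(column):
            if char != "-":
                present.append((seq_index, counters[seq_index]))
                counters[seq_index] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_accuracy(true_alignment, test_alignment):
    """Fraction of residue pairs in the true alignment that the test alignment reproduces."""
    true_pairs = aligned_pairs(true_alignment)
    if not true_pairs:
        raise ValueError("true alignment contains no aligned residue pairs")
    return len(true_pairs & aligned_pairs(test_alignment)) / len(true_pairs)


# Hypothetical "true" alignment and a program's attempt with the gap misplaced.
true_alignment = ["AC-GT", "ACAGT"]
test_alignment = ["ACG-T", "ACAGT"]
print(sum_of_pairs_accuracy(true_alignment, test_alignment))
```

Shifting the gap by a single column is enough to lose one of the four correct pairs, which is why gap placement matters so much in these comparisons.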

As you might expect, PRANK came out best in this test. I'm not sure that it would work best with real data but that's not really my point. My point is that this is an ongoing problem that has not been fully solved. It is still best to avoid multiple sequence alignments that have not been manually improved by humans with considerable experience in sequence alignment.

I'll close by quoting from the discussion in Löytynoja and Goldman (2008) just to remind everyone how important this is. They argue that even post-alignment human "refinement" of computer generated sequence alignments suffers from systemic bias.
Our analyses show that sequence alignment remains a challenging task, and alignments generated with methods based on the traditional progressive algorithm may lead to seriously incorrect conclusions in evolutionary and comparative studies. The main reason for their systematic error is disregard of the phylogenetic implications of gap patterns created—which is not corrected by considering alignment consistency (13) or using post alignment refinement (14, 15)—and this error is intensified by methods that intentionally force gaps into tight blocks. Affected methods can be positively misleading and become increasingly confident of erroneous solutions as more sequences are included. It is not the progressive algorithm as such that is defective, rather, correct alignment requires that we take account of sequences' phylogeny, irrespective of alignment method used or data type, but the original implementations of the progressive algorithm have a flaw that has gone unnoticed as long as different methods have been consistent in the error they create.

That such a significant error has passed undetected may be explained by the alignment field's historical focus on proteins, where these biases tend to be manifested in less-constrained regions such as loops (compare Fig. 1). Alignments with insertions and deletions squeezed compactly between conserved blocks may suffice for, and even be preferred by, some molecular biologists working with proteins. We have shown, however, that these patterns are, in fact, imposed by systematic biases in alignment algorithms, even in cases where they are incorrect and, indeed, phylogenetically unreasonable. We contend that algorithms that impose gap patterns like those found in structural alignments of proteins are inappropriate for the increasingly widespread analysis of genomic DNA and are likely to cause error when the resulting alignments are used for evolutionary inferences.

1. In a sense, phylogenetic alignment creates a circular argument. What we're trying to do is to build a phylogenetic tree from the multiple sequence alignments. If we use the presumed phylogeny to generate the alignments then we have a problem. Part of the problem goes away once we recognize that the alignment is driven by clustering similar sequences rather than phylogenetically related sequences.

Löytynoja, A. and Goldman, N. (2008) Phylogeny-Aware Gap Placement Prevents Errors in Sequence Alignment and Evolutionary Analysis. Science 320:1632-1635. [DOI: 10.1126/science.1158395]

A Graduate Student Oath

The Institute of Medical Science (IMS) at the University of Toronto is a large department with many graduate students. Many of them are M.D.s doing clinical research.

The department has instituted a graduate student oath that beginning graduate students recite at their first meeting. The idea is to teach students the value of social and moral responsibilities. Beginning graduate students also have to take a mandatory seminar course on ethics.

The oath is explained and reproduced in this week's issue of Science magazine in an article by Davis et al. (2008). Here it is.
"I, [NAME], have entered the serious pursuit of new knowledge as a member of the community of graduate students at the University of Toronto.

"I declare the following:

"Pride: I solemnly declare my pride in belonging to the international community of research scholars.

"Integrity: I promise never to allow financial gain, competitiveness, or ambition cloud my judgment in the conduct of ethical research and scholarship.

"Pursuit: I will pursue knowledge and create knowledge for the greater good, but never to the detriment of colleagues, supervisors, research subjects or the international community of scholars of which I am now a member.

"By pronouncing this Graduate Student Oath, I affirm my commitment to professional conduct and to abide by the principles of ethical conduct and research policies as set out by the University of Toronto."
What do you think? Is this something that all departments should consider?

Davis, K.D., Seeman, M.V., Chapman, J. and Rotstein, O.D. (2008) A Graduate Student Oath. Science 320:1587-1588. [DOI: 10.1126/science.320.5883.1587b]

Friday, June 20, 2008

Errors in Sequence Databases

Sandra Porter at Discovering Biology in a Digital World brings up an issue that has been bugging me for two decades [Biologists vs. the Age of Information]. The issue is the accuracy of information in biological databases.
Let's begin with GenBank - GenBank is the main database of nucleotide sequences at the NCBI. Sequence data are submitted to GenBank by researchers or sequencing centers. If mistakes are found, the information in the records can be updated by the submitters or by third parties if the corrected versions are published. This correction activity doesn't always happen though, and the requirement for third party annotations to be published makes it pretty unlikely that anyone will submit small corrections to a sequence.

This is why we see these kinds of quotes from Steven Salzberg (3):
So you think that gene you just retrieved from GenBank [1] is correct? Are you certain? If it is a eukaryotic gene, and especially if it is from an unfinished genome, there is a pretty good chance that the amino acid sequence is wrong. And depending on when the genome was sequenced and annotated, there is a chance that the description of its function is wrong too.
This is a serious problem. Most people don't realize that GenBank is full of sequences that are known to be incorrect and/or poorly annotated. In most cases, the errors are relatively minor such as one or two incorrect codons or deletion of a single codon. In other cases, the errors are more important, such as a pseudogene being represented as a real gene, or missing exons. Sometimes the identity of a gene is completely wrong. I've even seen examples where the species is incorrectly identified.

Sandra asks,
So what do we do? Do we care if the database information is up-to-date? If so, who should be responsible for the updates?

I'm sure some people would like the NCBI to be the final authority and just fix everything but I don't think that's very realistic.

Other people have proposed that wikis are the answer. Maybe they're right, but I really wonder if researchers would be any better at updating wikis than they are at updating information in places like the NCBI.

Well, dear readers, what do you think? Does GenBank need to be fixed? Do we just need more alternatives? Does it even matter?
Back in 1992, I spent part of a summer at the GenBank site in Los Alamos (New Mexico, USA). That was before GenBank moved to NCBI in Bethesda. My task was to explore the possibility of curating GenBank to fix all the errors. I worked with the HSP70 sequences since I had already documented most of the errors in those sequences (The HSP70 Sequence Database).

We decided that I could make corrections to any HSP70 sequence as long as I annotated the changes and got permission from the authors by 'phone.¹ This didn't work. Most of the authors were unwilling to allow changes 'cause they weren't aware of the fact that there was a conflict between their sequences and the aligned sequence database. They didn't even know that others had sequenced the same gene and gotten a different sequence.

We discussed this problem. At the time, everyone was aware of the fact that the SwissProt database was curated and that the curators were making decisions on their own about which sequences were correct and which ones were errors. Here's an example of the entry for human HSPA1A showing the conflicts and variations.

Sometimes the SwissProt curators get it wrong and identify the correct sequence as an error and vice versa. Sometimes they really screw up. Here's an example of that mistake [P23931].

Curating a sequence database is incredibly expensive. You need to hire hundreds of competent workers who can analyze every sequence as it comes in. There are some tools that will help identify errors but in order to reach an acceptable level of accuracy you need to build aligned sequence databases for every gene. That can't be done automatically; you need to have real people look at the data and make the best alignment if you are going to use it to make judgements on the accuracy of a submitted sequence.
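As a rough illustration of the kind of tool that can help, here is a minimal sketch that scans an aligned set of sequences for the same gene and flags residues that disagree with a well-supported majority. The function name, the 0.75 threshold, and the toy sequences are all assumptions made for the example. It only produces candidates for a person to examine; it cannot decide which sequence is actually correct.

```python
from collections import Counter


def flag_conflicts(alignment, min_support=0.75):
    """List residues that differ from a well-supported column majority.

    alignment: dict mapping sequence name -> gapped string ('-' is a gap).
    Returns (name, column, observed, majority) tuples, columns 0-based.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all rows must have the same length")
    flags = []
    for col in range(length):
        column = [alignment[n][col] for n in names]
        majority, count = Counter(column).most_common(1)[0]
        # Skip unanimous columns and columns with no clear majority.
        if count == len(names) or count / len(names) < min_support:
            continue
        for name, observed in zip(names, column):
            if observed != majority:
                flags.append((name, col, observed, majority))
    return flags


# Hypothetical toy alignment of four submissions of the same gene.
submissions = {
    "s1": "MAKAA",
    "s2": "MAKAA",
    "s3": "MAKAA",
    "s4": "MARA-",
}
for flag in flag_conflicts(submissions):
    print(flag)
```

A flagged position could be a sequencing error, a real variant, or a bad alignment, which is exactly why someone with experience still has to look at the data.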

The final decision at GenBank was to forget about correcting errors and treat the database as an archive of submitted sequences. It would be up to every researcher to become aware of the error-prone nature of the database before drawing any conclusions. I think this was the correct decision; it was the only realistic decision. Unfortunately, the average researcher doesn't realize how many errors are being propagated in the sequence databases.

1. It was a huge ego-trip to have the power to change records in GenBank. All of the changes I made to other people's sequences have been removed but the ones I made to my own sequences are still there. You can check out [M76613] to see an example of what an annotated sequence could have looked like. Note the references to "old-sequence," "conflict," "variation," and "unsure." These represent differences between the genomic sequence and our older error-prone cDNA sequences.

Kristin Roovers Punished for Falsifying Data

Kristin Roovers was a post-doc at the Ottawa Health Research Institute in Ottawa (Canada) until last week. Her job was abruptly terminated when OHRI learned that she had been convicted and punished for falsifying data while she was a graduate student and a post-doc at the University of Pennsylvania. Apparently they first heard that something was wrong from an article in The Chronicle of Higher Education [Journals Find Fakery in Many Images Submitted to Support Research].

Read about it in yesterday's Ottawa Citizen [Researcher's tainted past leads Ottawa health facility to sever ties]. See the fraudulent data on baylab [Kristin-gate at the OHRI].

You can read the July 2007 report from the Office of Research Integrity (USA) at Case Summary - Kristin Roovers.

Here's the question. Why was she hired at OHRI? They probably didn't ask for letters of reference and they certainly didn't Google her name.