Sandwalk: I Don't Have Time for This!

Thursday, May 06, 2010

The banner headline on the front page of The Toronto Star says, "U of T cracks the code." You can read the newspaper article on their website: U of T team decodes secret messages of our genes. ("U of T" refers to the University of Toronto - our newspaper thinks we're the only "T" university in the entire world.)

The hyperbole is beyond disgusting.

The work comes from labs run by Brendan Frey and Ben Blencowe and it claims to have discovered the "splicing code" mediating alternative splicing (Barash et al., 2010). You'll have to read the paper yourself to see if the headlines are justified. It's clear that Nature thought it was important 'cause they hyped it on the front cover of this week's issue.

The frequency of alternative splicing is a genuine scientific controversy. We've known for 30 years that some genes are alternatively spliced to produce different protein products. The controversy is over what percentage of genes have genuine biologically relevant alternative splice variants and what percentage simply exhibit low levels of inappropriate splicing errors.

Personally, I think most of the predicted splice variants are impossible. The data must be detecting splicing errors [Two Examples of "Alternative Splicing"]. I'd be surprised if more than 5% of human genes are alternatively spliced in a biologically relevant manner.

Barash et al. (2010) disagree. They begin their paper with the common mantra of the true believers.

Transcripts from approximately 95% of multi-exon human genes are spliced in more than one way, and in most cases the resulting transcripts are variably expressed between different cell and tissue types. This process of alternative splicing shapes how genetic information controls numerous critical cellular processes, and it is estimated that 15% to 50% of human disease mutations affect splice site selection.

I don't object to scientists who hold points of view that are different than mine—even if they're wrong! What I object to is those scientists who promote their personal opinions in scientific papers without even acknowledging that there's a genuine scientific controversy. You have to look very carefully in this paper for any mention of the idea that a lot of alternative splicing could simply be due to mistakes in the splicing machinery. And if that's true, then the "splicing code" that they've "deciphered" is just a way of detecting when the machinery will make a mistake.

We've come to expect that science writers can be taken in by scientists who exaggerate the importance of their own work, so I'm not blaming the journalists at The Toronto Star and I'm not even blaming the person who wrote the University of Toronto press release [U of T researchers crack 'splicing code']. I'll even forgive the writers at Nature for failing to be skeptical [The code within the code] [Gene regulation: Breaking the second genetic code].

It's scientists who have to accept the blame for the way science is presented to the general public.

Frey compared his computer decoder to the German Enigma encryption device, which helped the Allies defeat the Nazis after it fell into their hands.

“Just like in the old cryptographic systems in World War II, you’d have the Enigma machine…which would take an instruction and encode it in a complicated set of symbols,” he said.

“Well, biology works the same way. It turns out to control genetic messaging it makes use of a complicated set of symbols that are hidden in DNA.”

Given the number of biological activities needed to grow and govern our bodies, scientists had believed humans must have 100,000 genes or more to direct those myriad functions.

But that genomic search of the 3 billion base pairs that make up the rungs of our twisting DNA ladders revealed a meagre 20,000 genes, about the same number as the lowly nematode worm boasts.

“The nematode has about 1,000 cells, and we have at least 1,000 different neuron (cells) in our brains alone,” said Benjamin Blencowe, a U of T biochemist and the study’s co-senior author.

To achieve this huge complexity, our genes must be monumental multi-taskers, with each one having the potential to do dozens or even hundreds of different things in different parts of the body.

And to be such adroit role switchers, each gene must have an immensely complex set of instructions – or a code – to tell them what to do in any of the different tissues they need to perform in.

I wish I had time to present a good review of the paper but I don't. Sorry.

Barash, Y., Calarco, J.A., Gao, W., Pan, Q., Wang, X., Shai, O., Blencowe, B.J., and Frey, B.J. (2010) Deciphering the splicing code. Nature 465: 53–59. [doi:10.1038/nature09000] [Supplementary Information]

26 comments :

crf said...: It's like they're code-breaking the Enigma machine?

And you're disagreeing that there is even a code! THEN, OMG, THAT'S JUST WHAT HITLER WOULD SAY!; Thursday, May 06, 2010 5:57:00 PM
Georgi Marinov said...: That the paper is overhyped beyond any justifiable level is undeniable. It's also undeniable that it is completely unjustified to assume that just because you see something in the data, it is functional. However, I don't think the "5% functional alternative transcripts" number is correct either, as if you look at deep sequencing data, many more genes than that show expression of different dominant isoforms between cell types, which to me implies that the phenomenon is more widespread than that. Of course not all of it is due to alternative splicing, we have alternative TSS choice and polyadenylation sites, but still, there is more diversity than 5%. A lot, maybe most, of those junctions you can find in the data are probably due to errors in a mechanism that doesn't have to be always accurate to let the organism survive, but not all of them; Thursday, May 06, 2010 6:38:00 PM
Larry Moran said...: Georgi Marinov says,

However, I don't think the "5% functional alternative transcripts" number is correct either, as if you look at deep sequencing data, many more genes than that show expression of different dominant isoforms between cell types, which to me implies that the phenomenon is more widespread than that. Of course not all of it is due to alternative splicing, we have alternative TSS choice and polyadenylation sites, but still, there is more diversity than 5%.

We agree that a lot of what passes for variation is probably accident.

What we need to know is how much is biologically relevant. I agree with you that quantification of various alternatively spliced transcripts is the key bit of data we need. I'd love to see a table showing how many genes have reproducible levels of different transcripts in different cell types where the levels are clearly stated.

What cutoff would convince you that a real biological phenomenon is in play? Would the minor mRNA have to be greater than 0.1% of the major one or would it have to be 1% or 10%?

The fact that there could be different splice variants in one tissue than in another is not proof that they are functional. Given that different tissues have different splice factors, we expect that different error levels will occur in different tissues. Some of the deep sequencing methods are perfectly capable of picking up less than one transcript per cell and distinguishing that from 1/10th that amount. It looks like a ten-fold difference but is it relevant?

If you know of any paper that provides information on the levels of various transcripts per cell for all 20,000 genes then I'll be happy to read it. Heck, if you can even point me to papers that show 1000 genes like this I'd change my position.

Waiting .....

While Georgi is searching the literature, the rest of you might enjoy looking at the predicted alternatively spliced RNA for the most highly conserved gene in biology [The Frequency of Alternative Splicing]. Does anyone seriously think this gene has 18 different mRNAs, some of which only encode a small piece of the 70 kDa protein?

This is what you need to defend if you're a proponent of the idea that most human genes show alternative splicing. Good luck.; Thursday, May 06, 2010 10:20:00 PM
Georgi Marinov said...: The literature on transcript quantification from deep sequencing is just appearing now, but people have been working on it for a while.

http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.1621.html#/

Of course, there is no published paper that looks at all 20,000 genes, because this would involve doing RNA-Seq on a lot of tissues and cell types. Which is certainly being done in various labs around the world but it will take some time for the results to be published

BTW, I was not referring to different minor isoforms found at 1% of the level of the major isoform, I was referring to major isoform switches between cell types; Thursday, May 06, 2010 11:50:00 PM
Anonymous said...: ...the rest of you might enjoy looking at the predicted alternatively spliced RNA for the most highly conserved gene in biology

Actually, I'd be more interested in predicted alternatively spliced RNA for rapidly evolving recent genes. You're seriously (and unfairly) stacking the deck in favor of spurious results by specifying "the most highly conserved gene in biology" for your analysis.; Friday, May 07, 2010 12:06:00 AM
Psi Wavefunction said...: IMNSHO, alternative splicing is way overrated. Just subfunctionalisation on a little bit of crack...; Friday, May 07, 2010 1:56:00 AM
Larry Moran said...: anonymous says,

You're seriously (and unfairly) stacking the deck in favor of spurious results by specifying "the most highly conserved gene in biology" for your analysis.

I take this as an admission that you can't explain the predicted alternative transcripts of the BiP gene, right?

If you recognize that the predicted alternative transcripts of a well-studied gene are mostly artifacts, then what's your basis for concluding that the predictions of a less well-studied gene are mostly accurate?

I thought rationality was something that scientists are supposed to admire? :-); Friday, May 07, 2010 10:15:00 AM
Larry Moran said...: Georgi Marinov says,

BTW, I was not referring to different minor isoforms found at 1% of the level of the major isoform, I was referring to major isoform switches between cell types

Fine. What's your cutoff? What percentage of the predominant form do you consider to be a "major isoform"? It would be good to specify this before the results are in so you can't be accused of post-hoc rationalization.

Let's say one tissue has 5000 copies per cell of a particular mRNA and 100 copies of a minor isoform (2%). Another tissue has only 150 copies of each isoform per cell (50%). Does that count?

These numbers are important. It astonishes me that support for abundant, biologically relevant, alternative splicing is so widespread when nobody knows what the data says.; Friday, May 07, 2010 10:28:00 AM
Anonymous said...: If you recognize that the predicted alternative transcripts of a well-studied gene are mostly artifacts, then what's your basis for concluding that the predictions of a less well-studied gene are mostly accurate?

Now you're being silly. The predictions are a starting point, to be followed up with experimental data. Of course the predictions are not going to be completely accurate (the recent UofT paper notwithstanding). The decision of which predictions are worth expending resources for experimental confirmation has to be informed by our knowledge of the underlying biology of the genes in question.; Friday, May 07, 2010 10:39:00 AM
Georgi Marinov said...: Let's say one tissue has 5000 copies per cell of a particular mRNA and 100 copies of a minor isoform (2%). Another tissue has only 150 copies of each isoform per cell (50%). Does that count?

There aren't that many genes expressed at 5000 copies per cell in any cell type, so it's not fair to require that. What I was talking about is one isoform being expressed at say, N1 copies per cell in one cell type with some or no minor isoforms there, and another isoform being expressed at N2 copies per cell with some or no minor isoforms in that cell type, where both N1 and N2 are significantly larger than the expression of the minor isoforms. In the paper I cited, at conservative cutoff levels, this happens a few hundred to a thousand times during a single developmental transition.

I am by no means a supporter of the "I see a read mapping here, therefore there must be a functional transcript it is coming from" approach to the data, but you are often at the other extreme, being too ready to dismiss data altogether; Friday, May 07, 2010 11:11:00 AM
Larry Moran said...: One of my colleagues has alerted me to a paper that provides some quantitative data for us to sink our teeth into.

Work from Chris Burge's lab at MIT indicates that 86% of human genes produce a minor isoform that is 15% or more of the level of the major isoform. If this data holds up, it strongly suggests that the minor alternatively spliced form isn't just "noise."

Wang et al. (2008) Nature 456: 470-476.; Friday, May 07, 2010 2:04:00 PM
Larry Moran said...: Georgi Marinov says,

... but you are often at the other extreme, being too ready to dismiss data altogether

I teach a course on "Scientific Controversies and Misconceptions" and one of the main points is that in real scientific controversies the data is contradictory.

I try to make graduate students (and colleagues) understand the implications of this. What it means is that the controversy isn't likely to be resolved with just one or two experiments. It also means that some of the data has to be wrong.

Scientists are obliged to recognize and respect other interpretations of the data. In my case, I'm fully aware of the deep sequencing experiments (but see comment above) and the EST data. I'm fully aware of the fact that many people interpret this data as evidence for massive amounts of biologically relevant alternative splicing.

I still prefer to interpret the data as being mostly due to errors in splicing, or "noise." I think the data showing that much of it is noise is more credible than the data showing that it is functional (see Melamud and Moult (2009) Nucleic Acids Res. 37:4862-4872).

The thing that troubles me the most is when scientists—mostly those on the other side of this controversy—completely ignore the fact that there even IS a controversy. They publish papers where they don't even bother to reference the work of people who disagree with their interpretation. Note, for example, that there are no references to proponents of "noise" in the paper from the Blencowe/Frey labs.

We realize, of course, why they're doing this. If most alternative splicing is due to mistakes in the splicing machinery then all the recent paper has done is develop a method for predicting when these errors are likely to happen. That interpretation would greatly change the nature of their discovery and might even mean that their paper wouldn't be on the cover of Nature. So, I can guess why they would ignore the controversy.

This might qualify as unethical behavior. What do you think?; Friday, May 07, 2010 2:26:00 PM
DK said...: I am 100% with Larry on this one. The fact is, ALL of the transcript detection employed in these 'omics studies is not quantitative. Sure, authors like to pretend that it is, but no, it isn't. The question is simple and the way to answer it is very straightforward. But very labor-intensive. First, forget RNA. Protein-coding gene expression is not transcription. It's translation. So, make and purify Ab against every exon from >50 genes known to have tissue-specific isoforms. And from >50 genes that we don't know this about (including some, like HSA and actin, where we know for sure that there are no splice isoforms at all). After that, only about 10,000 Western runs ought to tell what's real and what's not. It will also provide an experimental test for the predictions made by paper biochemists cracking the enigma codes. But, of course, that's too difficult and not sexy enough to actually be funded and done.; Friday, May 07, 2010 3:36:00 PM
Anonymous said...: This might qualify as unethical behavior. What do you think?

Good question. This might have qualified as unethical 30 years ago, but this is science in the 21st century and zealously ruthless self-promotion is required for survival today. I think as long as students of the literature are aware of this trend the harm done is not too great. Unfortunately, however, the general public (read: media) really get misled quite badly: a topic you have previously discussed at some length.; Friday, May 07, 2010 3:44:00 PM
Anonymous said...: 1. This back and forth was better than the best journal club meeting I have ever attended. Thank you all.

2. Re the 86% of human genes making an alternatively spliced product that is at 15% or more of the level of the major splice variant (phew): Does this suggest functionality merely because of the high level (or have I missed your point completely?)? I am perfectly willing to accept such a high error rate - look at the estimated 10-20% of fertilized eggs and blastocysts that spontaneously give up.; Friday, May 07, 2010 3:45:00 PM
Anonymous said...: I teach introductory biology courses at the community college level (soon in a tenure track position), often for non-majors who have little interest in the subject. I like to think that this gives me a better snapshot of what the majority of people are thinking regarding biology than my soon to be ending research job at TSRI. If it does, then your average person is very confused when they hear about contradictory findings (real or oversold). Probably the most important thing I do in these classes is try to explain how science works and why disagreement is so important.; Friday, May 07, 2010 3:50:00 PM
Georgi Marinov said...: Work from Chris Burge's lab at MIT indicates that 86% of human genes produce a minor isoform that is 15% or more of the level of the major isoform. If this data holds up, it strongly suggests that the minor alternatively spliced form isn't just "noise."

Wang et al. (2008) Nature 456: 470-476.

The problem with the Wang et al. paper is that at the time reads were single-end 32bp, which seriously confounds mapping when you're dealing with splices, and no real attempt was made there to quantify individual transcripts.

Reads now are paired-end, 75-100bp, and will be even longer with further improvements in sequencing technologies, and transcript-level quantification, while still computationally difficult and not getting it right 100% of the time, at least exists, so we can say a lot more about the transcriptome and with much higher confidence.; Friday, May 07, 2010 8:41:00 PM
SPARC said...: This was so predictable: DI's Casey Luskin jumps the waggon [Casey Luskin].; Saturday, May 08, 2010 12:57:00 AM
SPARC said...: Let's hope that the authors didn't use PLIER for their analyses.; Saturday, May 08, 2010 1:40:00 AM
Georgi Marinov said...: Scientists are obliged to recognize and respect other interpretations of the data. In my case, I'm fully aware of the deep sequencing experiments (but see comment above) and the EST data. I'm fully aware of the fact that many people interpret this data as evidence for massive amounts of biologically relevant alternative splicing.

I still prefer to interpret the data as being mostly due to errors in splicing, or "noise." I think the data showing that much of it is noise is more credible than the data showing that it is functional (see Melamud and Moult (2009) Nucleic Acids Res. 37:4862-4872).

The thing that troubles me the most is when scientists—mostly those on the other side of this controversy—completely ignore the fact that there even IS a controversy. They publish papers where they don't even bother to reference the work of people who disagree with their interpretation. Note, for example, that there are no references to proponents of "noise" in the paper from the Blencowe/Frey labs.

See, I agree with you about the basic point of most of the novel transcripts being noise, and this comes from having actually worked with the data and seen where those come from and what they look like. What I don't agree with is the 5% number; I claim it is higher than that, that's all.

We realize, of course, why they're doing this. If most alternative splicing is due to mistakes in the splicing machinery then all the recent paper has done is develop a method for predicting when these errors are likely to happen. That interpretation would greatly change the nature of their discovery and might even mean that their paper wouldn't be on the cover of Nature. So, I can guess why they would ignore the controversy.

This might qualify as unethical behavior. What do you think?

Yes, it is unethical behavior. But I am not certain that it is the case in all papers that people have published on the subject; Saturday, May 08, 2010 5:18:00 AM
Georgi Marinov said...: I am 100% with Larry on this one. The fact is, ALL of the transcript detection employed in these 'omics studies is not quantitative. Sure, authors like to pretend that it is, but no, it isn't. The question is simple and the way to answer it is very straightforward. But very labor-intensive. First, forget RNA. Protein-coding gene expression is not transcription. It's translation.

The problem with this is that transcripts don't have to be coding for proteins to be functional

So, make and purify Ab against every exon from >50 genes known to have tissue-specific isoforms. And from >50 genes that we don't know this about (including some, like HSA and actin, where we know for sure that there are no splice isoforms at all). After that, only about 10,000 Western runs ought to tell what's real and what's not. It will also provide an experimental test for the predictions made by paper biochemists cracking the enigma codes. But, of course, that's too difficult and not sexy enough to actually be funded and done.

I have a hard time seeing how such a thing would ever even work. People are having a hard enough time trying to make good, reliably working and sufficiently specific antibodies against whole proteins, against exons it is a whole other nightmarish level of difficulty; Saturday, May 08, 2010 5:22:00 AM
DK said...: The problem with this is that transcripts don't have to be coding for proteins to be functional

That's a cop out. Whenever people talk about function of splice isoforms, they talk mainly, if not exclusively, about protein isoforms.

I have a hard time seeing how such a thing would ever even work.

Is it possible that this is because you've never done it yourself?

People are having a hard enough time trying to make good, reliably working and sufficiently specific antibodies against whole proteins, against exons it is a whole other nightmarish level of difficulty

If people are having a hard time making good Ab for Westerns, then it only means that these people are completely incompetent.

These aren't MAbs we are talking about and aren't Ab that should recognize native folds. Stick an exon into a bacterial expression vector (make it into a fusion protein if the exon is < 150 bp), purify the protein (denatured or not, doesn't matter) = perfect antigen. Purify total IgG against a membrane containing a crapload of the antigen = perfect primaries. (Preabsorb against the fusion partner if necessary.) 1.5 months from A to Z. Trivial. Can, if desired, be made into a quasi-high-throughput thing.; Saturday, May 08, 2010 1:31:00 PM
charlie wagner said...: I read the paper.

They're right. You're wrong.

Sorry!; Tuesday, May 11, 2010 2:14:00 PM
Devin said...: ^
You must have misread it. Don't worry, it happens.; Wednesday, May 12, 2010 12:15:00 AM

Quotations

The old argument of design in nature, as given by Paley, which formerly seemed to me to be so conclusive, fails, now that the law of natural selection has been discovered. We can no longer argue that, for instance, the beautiful hinge of a bivalve shell must have been made by an intelligent being, like the hinge of a door by man. There seems to be no more design in the variability of organic beings and in the action of natural selection, than in the course which the wind blows.

Charles Darwin (c1880)

Although I am fully convinced of the truth of the views given in this volume, I by no means expect to convince experienced naturalists whose minds are stocked with a multitude of facts all viewed, during a long course of years, from a point of view directly opposite to mine. It is so easy to hide our ignorance under such expressions as "plan of creation," "unity of design," etc., and to think that we give an explanation when we only restate a fact. Any one whose disposition leads him to attach more weight to unexplained difficulties than to the explanation of a certain number of facts will certainly reject the theory.

Charles Darwin (1859)

Science reveals where religion conceals. Where religion purports to explain, it actually resorts to tautology. To assert that "God did it" is no more than an admission of ignorance dressed deceitfully as an explanation...

Peter Atkins

Quotations

The world is not inhabited exclusively by fools, and when a subject arouses intense interest, as this one has, something other than semantics is usually at stake. Stephen Jay Gould (1982)
I have championed contingency, and will continue to do so, because its large realm and legitimate claims have been so poorly attended by evolutionary scientists who cannot discern the beat of this different drummer while their brains and ears remain tuned to only the sounds of general theory. Stephen Jay Gould (2002) p.1339
The essence of Darwinism lies in its claim that natural selection creates the fit. Variation is ubiquitous and random in direction. It supplies raw material only. Natural selection directs the course of evolutionary change. Stephen Jay Gould (1977)
Rudyard Kipling asked how the leopard got its spots, the rhino its wrinkled skin. He called his answers "just-so stories." When evolutionists try to explain form and behavior, they also tell just-so stories—and the agent is natural selection. Virtuosity in invention replaces testability as the criterion for acceptance. Stephen Jay Gould (1980)
Since 'change of gene frequencies in populations' is the 'official' definition of evolution, randomness has transgressed Darwin's border and asserted itself as an agent of evolutionary change. Stephen Jay Gould (1983) p.335
The first commandment for all versions of NOMA might be summarized by stating: "Thou shalt not mix the magisteria by claiming that God directly ordains important events in the history of nature by special interference knowable only through revelation and not accessible to science." In common parlance, we refer to such special interference as "miracle"—operationally defined as a unique and temporary suspension of natural law to reorder the facts of nature by divine fiat. Stephen Jay Gould (1999) p.84

Quotations

My own view is that conclusions about the evolution of human behavior should be based on research at least as rigorous as that used in studying nonhuman animals. And if you read the animal behavior journals, you'll see that this requirement sets the bar pretty high, so that many assertions about evolutionary psychology sink without a trace.

Jerry Coyne
Why Evolution Is True

I once made the remark that two things disappeared in 1990: one was communism, the other was biochemistry and that only one of them should be allowed to come back.

Sydney Brenner
TIBS Dec. 2000

It is naïve to think that if a species' environment changes the species must adapt or else become extinct.... Just as a changed environment need not set in motion selection for new adaptations, new adaptations may evolve in an unchanging environment if new mutations arise that are superior to any pre-existing variations

Douglas Futuyma

One of the most frightening things in the Western world, and in this country in particular, is the number of people who believe in things that are scientifically false. If someone tells me that the earth is less than 10,000 years old, in my opinion he should see a psychiatrist.

Francis Crick

There will be no difficulty in computers being adapted to biology. There will be luddites. But they will be buried.

Sydney Brenner

An atheist before Darwin could have said, following Hume: 'I have no explanation for complex biological design. All I know is that God isn't a good explanation, so we must wait and hope that somebody comes up with a better one.' I can't help feeling that such a position, though logically sound, would have left one feeling pretty unsatisfied, and that although atheism might have been logically tenable before Darwin, Darwin made it possible to be an intellectually fulfilled atheist

Richard Dawkins

Another curious aspect of the theory of evolution is that everybody thinks he understand it. I mean philosophers, social scientists, and so on. While in fact very few people understand it, actually as it stands, even as it stood when Darwin expressed it, and even less as we now may be able to understand it in biology.

Jacques Monod

The false view of evolution as a process of global optimizing has been applied literally by engineers who, taken in by a mistaken metaphor, have attempted to find globally optimal solutions to design problems by writing programs that model evolution by natural selection.

Richard Lewontin
