tag:blogger.com,1999:blog-37148773.post6749045198580450027..comments2024-03-27T14:50:47.345-04:00Comments on <center>Sandwalk</center>: How to Frame a Null HypothesisLarry Moranhttp://www.blogger.com/profile/05756598746605455848noreply@blogger.comBlogger24125tag:blogger.com,1999:blog-37148773.post-11545698528811813742009-05-12T15:51:00.000-04:002009-05-12T15:51:00.000-04:00Well, the facts are facts, the question is how you...Well, the facts are facts, the question is how you interpret them when you don't have all the facts you need. If you ask me whether there is a lot of overselling in this article, the answer is yes, I agree with that. But this does not mean that we should automatically switch to the opposite extreme of the opinion spectrum either - that all ncRNA phenomena are products of that queen of the omics sciences, the artifactomics. <br /><br />The correct position according to me is to admit that there is a lot we don't know and we have yet to learn, then start figuring it out (which we are doing) but in the same time be very careful how we formulate and communicate our hypothesis about what might be going on to the public (which isn't happening). Because, as it is well known, the subtle details of the scientific debate will almost certainly be ignored or misinterpreted.Georgi Marinovhttps://www.blogger.com/profile/12226357993389417752noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-2800449715297719652009-05-12T09:32:00.000-04:002009-05-12T09:32:00.000-04:00Georgi Marinov says,
Truth to be said, I don't re...Georgi Marinov says,<br /><br /><I>Truth to be said, I don't recall any of these papers (and I admit that I have yet to find time to read the FANTOM papers in depth) making the grand claim that everything they found is functional, ...</I><BR><BR>You are correct. None of the papers makes the overt claim that everything is functional. Instead, they state or imply that a large percentage of the non-coding RNAs are functional.<br /><br />Here's a recent review from a few weeks ago by John Mattick: <A HREF="http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000459" REL="nofollow">The Genetic Signatures of Noncoding RNAs</A>. What do you think of this form of scientific paper?Larry Moranhttps://www.blogger.com/profile/05756598746605455848noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-58229962740532414782009-05-11T09:45:00.000-04:002009-05-11T09:45:00.000-04:00Let's be real clear on what I'm saying. I'm saying...<I>Let's be real clear on what I'm saying. I'm saying that it is scientifically unethical to claim that transcripts are functional simply because they exist. Ignoring an important counter-argument is not the way good scientists are supposed to behave.</I><BR>And I agree with this 100% too. <br /><br /><I>I find it interesting that so few of you have found papers where the issue is treated, correctly, as a controversy.<br /><br />Why is that?</I><BR>The cynical explanation as I said above is that when you have the hottest technology to come out since PCR, and you have spent a good amount of money to do the experiments (because this type of experiments are not cheap yet, although they will become soon), it is somewhat not in your best interest to treat the results as noise. 
I don't think it is all noise, as I said in previous posts, but I am rationalizing as to why if it was noise it would still be reported as more than that by the authors, who certainly should know what their data tell them better than anybody else. So what you do is to say "Hey, we discovered such and such transcripts, we don't know what they do, but it would be interesting if they turn out to be functional". <br /><br />The other explanation is that the standards of scientific reasoning one needs to meet in order to get the high-profile publication and the amount of publicity these papers receive aren't that high. It is as much a failure of authors as a failure of editors and reviewers. <br /><br />Truth to be said, I don't recall any of these papers (and I admit that I have yet to find time to read the FANTOM papers in depth) making the grand claim that everything they found is functional, they just do not spend too much time talking about the possibility that most of it is noise (which is still not the correct thing to do, of course)Georgi Marinovhttps://www.blogger.com/profile/12226357993389417752noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-90321638838373468272009-05-11T09:29:00.000-04:002009-05-11T09:29:00.000-04:00The argument that low transcript levels mean noise...<I>The argument that low transcript levels mean noise is not a convincing one.</I><BR><BR>I agree 100%. But low abundance is an important bit of information that's consistent with noise. That fact (low abundance) should not be ignored.<br /> <br /><I>I am not arguing that most of those transcripts are functional, let it be clear, but I don't think we should dismiss them without further consideration either.</I><BR><BR>Let's be real clear on what I'm saying. I'm saying that it is scientifically unethical to claim that transcripts are functional simply because they exist. 
Ignoring an important counter-argument is not the way good scientists are supposed to behave.<br /><br />I find it interesting that so few of you have found papers where the issue is treated, correctly, as a controversy.<br /><br />Why is that?Larry Moranhttps://www.blogger.com/profile/05756598746605455848noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-77672151823943008452009-05-11T09:22:00.000-04:002009-05-11T09:22:00.000-04:00MartinC
It can be a mistake, however, to assume t...MartinC<br /><br /><I>It can be a mistake, however, to assume that low level transcription simply equates to non-functional noise.</I><BR><BR>I agree; that's why I would never make such a stupid argument. On the other hand, if you are going to argue that a low abundance transcript is functional then you have to invoke hypotheses that make those transcripts unusual.<br /><br />What I'm challenging is the belief that because it exists, it must be functional. I'm also challenging the way most papers ignore the fact that these transcripts are rare. <br /><br /><I>Indeed the figure of less than one transcript per cell that Larry mentioned is not exactly unusual for known protein encoding mRNAs. A single mRNA gives rise to several thousand molecules of protein ...</I><BR><BR>That's an incorrect statement. A typical mammalian mRNA is only translated about 100 times or fewer. And there are very few proteins that can be functional in a mammalian cell at a concentration of only 100 molecules. A typical regulatory protein, for example, has to be present in >10,000 copies.<br /><br /><I>... such that a gene expressing an mRNA with a short half-life can still have important functional effects even though its average mRNA transcript level is below one per cell since the protein can still be present at several thousand copies.</I><BR><BR>Your reasoning is incorrect because your facts are wrong.Larry Moranhttps://www.blogger.com/profile/05756598746605455848noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-45117061591068602292009-05-11T08:33:00.000-04:002009-05-11T08:33:00.000-04:00The argument that low transcript levels mean noise...The argument that low transcript levels mean noise is not a convincing one. 
If you look at some RNA-Seq data (which allows you to get a crude estimate of the number of transcripts per cell) one of the striking things to be noticed is that some very famous (and presumably essential) genes are expressed at a few transcripts per cell at most. Of course, this might be an artifact of the cell culture systems and tissues that the datasets I have looked at personally come from, but it is definitely not a result that supports the "low expression = non-functionality" argument. <br /><br />I am not arguing that most of those transcripts are functional, let it be clear, but I don't think we should dismiss them without further consideration either. Probably a few of the novel RNA classes described will turn out to be reproducible errors inherent to the process, or having something to do with the silencing of the regions they originate from, or turn out to be trivial for some other reason, but some will turn out to be more than that. The future will tellGeorgi Marinovhttps://www.blogger.com/profile/12226357993389417752noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-90540283618855842682009-05-11T08:09:00.000-04:002009-05-11T08:09:00.000-04:00Transcript levels are certainly important and an u...Transcript levels are certainly important and an unbiased deep sequencing approach using cDNA isolated from a known number of cells is probably the best way to examine this question but this is something that has only recently become methodologically possible. <br />It can be a mistake, however, to assume that low level transcription simply equates to non-functional noise. Indeed the figure of less than one transcript per cell that Larry mentioned is not exactly unusual for known protein encoding mRNAs. 
A single mRNA gives rise to several thousand molecules of protein such that a gene expressing an mRNA with a short half-life can still have important functional effects even though its average mRNA transcript level is below one per cell since the protein can still be present at several thousand copies. Many low copy number RNAs have recently been identified that show evidence of function, since siRNA targeting leads to important cellular effects (frequently at the level of chromatin remodelling of specific promoters).<br />Neither of these points allows us to propose a generalized model for transcriptional regulation, but they should at least remind us to keep our minds open about the possibility that novel functional transcripts exist in the database.<br />This doesn't, by itself, rule out these same transcripts as simple noise either, but it suggests that a combination of chromatin analysis, transcriptional profiling and transcriptional functional analysis (siRNA targeting, for instance) provides the best route towards creating such a model.Sigmundhttps://www.blogger.com/profile/00262375488263086844noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-37487045873804924112009-05-10T09:08:00.000-04:002009-05-10T09:08:00.000-04:00MartinC asks,
The sort of things we see from the ...MartinC asks,<br /><br /><I>The sort of things we see from the data are a convergence of many factors (RNA POLII binding, multiple independent chromatin modifications, DNAse accessibility, high numbers of transcripts etc). We know that these factors are associated with promoter or other such regulatory regions so the evidence does suggest something more than background noise. As I've said, we are at an early stage in the understanding of this but its not a question of pure untestable speculation as some seem to imply.</I><BR><BR>RNA POLII binding, multiple independent chromatin modifications, and DNAse accessibility are not independent variables. They would all be associated with random noise, so you can't use them to distinguish between noise and function.<br /><br />The abundance of transcripts, on the other hand, is important. That's why I list it as one of the criteria necessary to <A HREF="http://sandwalk.blogspot.com/2009/04/how-to-evaluate-genome-level.html" REL="nofollow">Evaluate Genome Level Transcription Papers</A>.<br /><br />Unfortunately, you won't find much information about the abundance of various transcripts in most of those papers. The authors know very well that they're dealing with only a few transcripts per cell, perhaps less than one, but for some strange reason they don't think it's important to mention this in the paper.Larry Moranhttps://www.blogger.com/profile/05756598746605455848noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-9037485494273726452009-05-10T08:29:00.000-04:002009-05-10T08:29:00.000-04:00MartinC, I'm content to ask if Larry can reconcile...MartinC, I'm content to ask if Larry can reconcile his remarks with some of the interesting new results that have come out in the past few years. 
I can, but I would rather not be putting words in Larry's mouth.Arthttp://www.aghunt.wordpress.comnoreply@blogger.comtag:blogger.com,1999:blog-37148773.post-43322893384241096282009-05-09T20:35:00.000-04:002009-05-09T20:35:00.000-04:00Seconding the "you protesteth too much" remark.Seconding the "you protesteth too much" remark.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-37148773.post-83525087998453673342009-05-09T06:31:00.000-04:002009-05-09T06:31:00.000-04:00Art, take as a model a 1 Mb genomic sequence which...Art, take as a model a 1 Mb genomic sequence which contains a single well defined promoter of a known functional gene exactly at the center point. <br />Now ask yourself, if we look at the EST database results from multiple tissues (essentially a sampling of the transcripts from the 1 Mb segment) do the two paragraphs predict the same result?<br />The second paragraph suggests that we will see many ESTs corresponding to the known gene and others corresponding to transcription initiated at the same promoter, but in the opposite orientation (essentially transcription linked to the promoter but in more than one direction).<br />Larry's paragraph, however, suggests multiple initiation events throughout the 1 Mb segment.<br />If the important point here is to distinguish 'noise' from signal then it is certainly not pedantic to point out that these two models predict very different transcription profiles and thus different possible interpretations of 'noise'.Sigmundhttps://www.blogger.com/profile/00262375488263086844noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-60012428049917483082009-05-08T20:18:00.000-04:002009-05-08T20:18:00.000-04:00Larry:
"The key fact that most scientists are ove...Larry:<br /><br /><I>"The key fact that most scientists are overlooking is that RNA polymerase and the various transcription factors must bind non-specifically at thousands of sites in a random sequence of junk DNA. This is just basic biochemistry of the sort that should be taught in undergraduate classes. Transcription will be initiated by accident at some of these sites even though they are not functional promoters. Again, this is basic biochemistry."</I>Neil et al.:<br /><br /><I>"However, most of the identified CUTs corresponded to transcripts divergent from the promoter regions of genes, indicating that they represent by-products of divergent transcription occurring at many and possibly most promoters."</I>One can be pedantic about this and find possible items of disagreement, but the basic gists of these two quotes are very similar.Arthttp://aghunt.wordpress.comnoreply@blogger.comtag:blogger.com,1999:blog-37148773.post-46583388798767684662009-05-08T02:37:00.000-04:002009-05-08T02:37:00.000-04:00Art, those papers you linked to do not support the...Art, those papers you linked to do not support the idea Larry described in his final paragraph. Read what he said again. I agree with the conclusions in the papers (a lot of apparent non-coding transcripts seem to come from bidirectional promoters and that a lot of eukaryotic promoters seem to be inherently bidirectional. That is quite a different point to that made by Larry. 
Whether the CUTs have a function in and of themselves is a different matter (there is evidence that some do in a sequence specific manner - for instance those associated with CCND1, or in a non sequence specific manner as 'pioneer' transcripts that allow for the opening of chromatin for access to high output transcription of coding transcripts on the same or opposite strand) but that is a different question and one that really needs a lot more work in order to draw firm conclusions.Sigmundhttps://www.blogger.com/profile/00262375488263086844noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-30870494028813041832009-05-07T21:40:00.000-04:002009-05-07T21:40:00.000-04:00"The sort of things we see from the data are a con...<I>"The sort of things we see from the data are a convergence of many factors (RNA POLII binding, multiple independent chromatin modifications, DNAse accessibility, high numbers of transcripts etc). We know that these factors are associated with promoter or other such regulatory regions so the evidence does suggest something more than background noise. As I've said, we are at an early stage in the understanding of this but its not a question of pure untestable speculation as some seem to imply."</I>I think that the "noise" explanation is still pretty good. From a paper by Neil et al:<br /><br /><I>"Our data reveal numerous new CUTs with such a potential regulatory role. However, most of the identified CUTs corresponded to transcripts divergent from the promoter regions of genes, indicating that they represent by-products of divergent transcription occurring at many and possibly most promoters. Eukaryotic promoter regions are thus intrinsically bidirectional, a fundamental property that escaped previous analyses because in most cases divergent transcription generates short-lived unstable transcripts present at very low steady-state levels."</I>The paper: Helen Neil, Christophe Malabat, Yves d’Aubenton-Carafa, Zhenyu Xu, Lars M. 
Steinmetz & Alain Jacquier. 2009. Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature 457, 1038.<br /><br /><A HREF="http://aghunt.wordpress.com/2009/02/22/more-strangeness/" REL="nofollow">A bit more</A> about this subject.Arthttp://www.aghunt.wordpress.comnoreply@blogger.comtag:blogger.com,1999:blog-37148773.post-19127819012377020862009-05-07T09:48:00.000-04:002009-05-07T09:48:00.000-04:00Pondering Fool said:
"Certain sequences are favore...Pondering Fool said:<br />"Certain sequences are favored by the polymerase. Long stretches of sequence you would expect by chance certain regions would be favored."<br />I suppose the question we are asking is how do we distinguish the types of favored sites you mention with actual functional regions. Without the sort of whole genome approach that's been applied recently we are really just speculating and even at this stage we still have a lot of confirmatory work to do to really work out the rules. I would, however, suggest that what we are discussing here is not just a matter of a random sequence that just happens to produce a higher than background spike of RNA PolII binding. The sort of things we see from the data are a convergence of many factors (RNA POLII binding, multiple independent chromatin modifications, DNAse accessibility, high numbers of transcripts etc). We know that these factors are associated with promoter or other such regulatory regions so the evidence does suggest something more than background noise. As I've said, we are at an early stage in the understanding of this but its not a question of pure untestable speculation as some seem to imply.Sigmundhttps://www.blogger.com/profile/00262375488263086844noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-53722987401677731942009-05-07T08:41:00.000-04:002009-05-07T08:41:00.000-04:00If you have multiple CAGE tags, or TFs and PolII b...If you have multiple CAGE tags, or TFs and PolII binding consistently mapping to the same sites in the middle of nowhere, this is a good evidence it is not just transcriptional noise and things are more complicated than we thought. <br /><br />***********************<br /><br />Or it could still be noise just the sequence there for whatever reason (including chance) that has nothing to do with the transcript made is favored over other random sequences, hence it shows up over and over again. 
Certain sequences are favored by the polymerase. Long stretches of sequence you would expect by chance certain regions would be favored.PonderingFoolhttps://www.blogger.com/profile/10767758746935185528noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-10618073615621827852009-05-07T00:26:00.000-04:002009-05-07T00:26:00.000-04:00Which part of the data rules out noise? If you hav...<I>Which part of the data rules out noise? If you have widespread transcription then it implies that a large part of the genome is available for binding, right? </I>If you have multiple CAGE tags, or TFs and PolII binding consistently mapping to the same sites in the middle of nowhere, this is a good evidence it is not just transcriptional noise and things are more complicated than we thought. <br /><br />It still does not mean those are functional transcripts, of course, although it seems certain at this point that the repertoire of functional RNA molecules that get produced is really greater than traditionally expected.<br /><br />As was pointed out, though, part of the problem may be the way the papers are presented. Those are indeed data-heavy papers that do not always have clear conclusions apparent in the data. But because something on the order of a million dollars and above has been spent, they have to be published in prestigious journals, which means that "a story" has to be present. 
So this may be the source of some of the stretching of the limits of sound scientific reasoning that we see.Georgi Marinovhttps://www.blogger.com/profile/12226357993389417752noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-35734498892651177712009-05-06T23:15:00.000-04:002009-05-06T23:15:00.000-04:00These kinds of people are almost as bad as ID prop...These kinds of people are almost as bad as ID proponents sometimes ._.Anthonzihttps://www.blogger.com/profile/13221898252667487718noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-25260820193765979982009-05-06T20:07:00.000-04:002009-05-06T20:07:00.000-04:00Theres so much interesting data being produced rec...<I>There's so much interesting data being produced recently that it's going to take several years before we put it into some sort of perspective.</I> <br /><br />Like human, yeast and worm "protein interactomes" that overlap at best by a couple of percent of total "interactions". Isn't it more reasonable to discard them as massive noise artefacts than spend years trying to make sense of something that obviously does not make sense?DKnoreply@blogger.comtag:blogger.com,1999:blog-37148773.post-28746873822572290162009-05-06T18:43:00.000-04:002009-05-06T18:43:00.000-04:00Agnosticism is good. I'd like to see a lot more of...<I>Agnosticism is good. I'd like to see a lot more of it.</I>Only in molecular biology, or in other fields as well?John S. Wilkinshttps://www.blogger.com/profile/04417266986565803683noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-33052734710908215522009-05-06T18:18:00.000-04:002009-05-06T18:18:00.000-04:00Larry, I tend to read most of the big project pape...Larry, I tend to read most of the big project papers as data dumps rather than take their conclusions as gospel, so to speak. There's so much interesting data being produced recently that it's going to take several years before we put it into some sort of perspective. 
<br />Being overly speculative is simply a necessity for receiving continued funding these days. You won't get a decent publication by simply confirming previously known or speculated points. <br />As for an example of a decently done paper I would suggest Barski et al in Cell 2007, High-Resolution Profiling of Histone Methylations in the Human Genome, looking at chromatin structure corresponding to transcription and silencing.Sigmundhttps://www.blogger.com/profile/00262375488263086844noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-16127243642829088462009-05-06T17:20:00.000-04:002009-05-06T17:20:00.000-04:00MartinC says,
I disagree with your final paragrap...MartinC says,<br /><br /><I>I disagree with your final paragraph. In a mammalian genome RNA Pol and transcription factors do not have to bind non specifically across non functional sequence. We have known for years that the chromatin structure of genomic regions is critical to the binding capacity of such factors.</I><BR><BR>Yes, that's true. There are parts of the genome that are heterochromatic or at least bound in a "closed" conformation of chromatin. Those regions are less likely to bind RNA polymerase. It doesn't change the argument very much.<br /><br /><I>Promoter regions tend to be modified for access and downstream parts of genes and non functional parts of the genome correspondingly modified to prevent access to these factors.<br />A hypothesis that RNA Polymerase is simply binding and transcribing noisily across the genome at random is simply not supported by the current data (Encode project, CAGE tag deep sequence analysis etc).</I><BR><BR>Which part of the data rules out noise? If you have widespread transcription then it implies that a large part of the genome is available for binding, right? <br /><br /><I>I am not personally of the opinion that all or even most of the RNA that is transcribed has some adaptive function but I think the data is at least sufficient to suggest an agnostic approach to the question, at least in principle.</I><BR><BR>Agnosticism is good. I'd like to see a lot more of it. Can you point out a paper from one of the megaprojects that exhibits the kind of agnosticism that you admire?Larry Moranhttps://www.blogger.com/profile/05756598746605455848noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-45731978803582254382009-05-06T17:13:00.000-04:002009-05-06T17:13:00.000-04:00Larry, you protesteth too much!
I disagree with yo...Larry, you protesteth too much!<br />I disagree with your final paragraph. In a mammalian genome RNA Pol and transcription factors do not have to bind non specifically across non functional sequence. We have known for years that the chromatin structure of genomic regions is critical to the binding capacity of such factors. Promoter regions tend to be modified for access and downstream parts of genes and non functional parts of the genome correspondingly modified to prevent access to these factors. <br />A hypothesis that RNA Polymerase is simply binding and transcribing noisily across the genome at random is simply not supported by the current data (Encode project, CAGE tag deep sequence analysis etc). <br />I am not personally of the opinion that all or even most of the RNA that is transcribed has some adaptive function but I think the data is at least sufficient to suggest an agnostic approach to the question, at least in principle.Sigmundhttps://www.blogger.com/profile/00262375488263086844noreply@blogger.comtag:blogger.com,1999:blog-37148773.post-30439182167770278182009-05-06T15:26:00.000-04:002009-05-06T15:26:00.000-04:00My next probs and stats class are going to be requ...My next probs and stats class are going to be required to read this article. :-)Harriethttps://www.blogger.com/profile/17953435368705942387noreply@blogger.com