Showing posts with label Genome. Show all posts

Wednesday, January 04, 2017

Do seahorses evolve faster?

Genome sequencing is becoming so routine that it's difficult to publish your new genome sequence in a top journal. The trick is to find something unique and exciting about your genome so you can attract the attention of the leading journals. The latest success is the seahorse genome published in the Dec. 15, 2016 issue of Nature (Lin et al., 2016).

The species is the tiger tail seahorse Hippocampus comes. The assembled genome is 502Mb or about 1/6th the size of the human genome. The seahorse has 23,458 genes (protein-coding?) or about the same number as most other vertebrates. About 25% of the genome is junk (transposon-related).1

Wednesday, December 14, 2016

The ENCODE publicity campaign of 2007

ENCODE1 published the results of a pilot project in 2007 (Birney et al., 2007). They looked at 1% (30Mb) of the genome with a view to establishing their techniques and dealing with large amounts of data from many different groups. The goal was to "provide a more biologically informative representation of the human genome by using high-throughput methods to identify and catalogue the functional elements encoded."

The most striking result of this preliminary study was the confirmation of pervasive transcription. Here's what the ENCODE Consortium leaders said in the abstract,
Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap with one another.
ENCODE concluded that 93% of the genome is transcribed in one tissue or another. There are two possible explanations that account for pervasive transcription.

Friday, December 09, 2016

Using conservation to determine whether splice variants are functional

We've been having a discussion about function and how to recognize it. This is important when it comes to determining how much junk is in our genome [see Restarting the function wars (The Function Wars Part V)]. There doesn't seem to be any consensus on how to define "function" although there's general agreement on using sequence conservation as a first step. If some sequence under investigation is conserved in other species then that's a good sign that it's under negative selection and has a biological function. What if it's not conserved? Does that rule out function? The correct answer is "no" because one can always come up with explanations/excuses for such an observation. We discussed the example of de novo genes, which, by definition, are not conserved.

Let's look at another example: splice variants. Splice variants are different forms of RNA produced from the same gene. If they are biologically relevant then they will produce different forms of the protein (for protein-coding genes). This is an example of alternative splicing if, and only if, relevance has been proven.

Tuesday, December 06, 2016

Restarting the function wars (The Function Wars Part V)

The term "function wars" refers to debates over the meaning of the word "function" in biology. It refers specifically to the discussion about junk DNA because junk DNA is defined as DNA that does not have a biological function. The wars were (re-)started when the ENCODE Consortium decided to use a stupid definition of function in order to prove that most of our genome was functional. This prompted a number of papers attempting to create a more meaningful definition.

None of them succeeded, in my opinion, because biology is messy and doesn't lend itself to precise definitions. Look how difficult it is to define a "gene," for example. Or "evolution."

Nevertheless, some progress was made. Dan Graur has recently posted a summary of the two most important definitions of function [What does “function” mean in the context of evolution & what absurd situations may arise by using the wrong definition?]. The two definitions are "selected-effect" and "causal-role" (there are synonyms).

How many proteins in the human proteome?

Humans have about 25,000 genes. About 20,000 of these genes are protein-coding genes.1 That means, of course, that humans make at least 20,000 proteins. Not all of them are different since the number of protein-coding genes includes many duplicated genes and gene families. We would like to know how many different proteins there are in the human proteome.

The latest issue of Science contains an insert with a chart of the human proteome produced by The Human Protein Atlas. Publication was timed to correspond with release of a new version of the Cell Atlas at the American Society for Cell Biology meeting in San Francisco. The Cell Atlas maps the location of about 12,000 proteins in various tissues and organs. Mapping is done primarily by looking at whether or not a gene is transcribed in a given tissue.

A total of 7367 genes (60%) are expressed in all tissues. These "housekeeping" genes correspond to the major metabolic pathways and the gene expression pathway (e.g. RNA polymerase subunits, ribosomal proteins, DNA replication proteins). Most of the remaining genes are tissue-specific or developmentally specific.

Friday, October 07, 2016

Scientists at the Lawrence Berkeley National Laboratory do not understand basic molecular biology

The Lawrence Berkeley National Laboratory employs a number of scientists who work on genes and gene expression. Here's part of a press release published two days ago [For Normal Heart Function, Look Beyond the Genes: Loss of noncoding elements of genome results in heart abnormalities, finds Berkeley Lab study]. It demonstrates that the workers at this National Laboratory don't understand anything about mammalian genomes.

The only other possibility is that the person who wrote the press release doesn't understand molecular biology1 and the scientists who work there just don't care what their institution publishes.
Researchers have shown that when parts of a genome known as enhancers are missing, the heart works abnormally, a finding that bolsters the importance of DNA segments once considered “junk” because they do not code for specific proteins.
Regular readers of this blog know that ...
  1. No knowledgeable scientist ever said that all noncoding DNA was junk.
  2. We've known about regulatory sequences for half a century. We've known about enhancers—just another kind of regulatory sequence—for thirty-five years. Nobody ever thought they were junk. Nobody ever thought they were unimportant.
When scientists sequenced the human genome, they discovered that less than 5 percent of our DNA were genes that actually coded for protein sequences. The biological functions of the noncoding portions of the genome were unclear.

Over the past fifteen years, however, there has been a growing appreciation for the importance of these noncoding regions, thanks in large part to the efforts of individual labs and, more recently, large international efforts such as the Encyclopedia of DNA Elements (ENCODE) project.

What became clear from this work is that there are many elements of the genome, including enhancers, that are involved in regulating gene expression, even though they do not encode for proteins directly.
At some point this flagrant misrepresentation of facts must be stopped. It's hurting science.

How can you believe anything in the press release once you read this? Do you think this represents the views of the scientists who published the paper? If so, shame on them. If not, shame on the Lawrence Berkeley National Laboratory.


1. I sent her a link to this post.

Monday, September 05, 2016

How many lncRNAs are functional: can sequence comparisons tell us the answer?

A large percentage of the human genome is transcribed at some time or another during development. The vast majority of those transcripts are very rare transcripts that look very much like spurious products of accidental transcription initiation at sequences resembling true promoters. They have been rejected by genome annotators. They do not define genes. They are junk RNA. Pervasive transcription does not mean that most of the genome is functional.

Among the transcripts is a class called long non-coding RNAs or lncRNAs. These are usually defined as capped and polyadenylated transcripts longer than 200 nucleotides. Many of them are processed by splicing. They look a lot like mRNA except they don't encode any polypeptides.1

We don't know how many of these RNAs exist because different labs use different criteria to describe them. Some databases exclude low abundance lncRNAs and some include non-polyadenylated RNAs. There is general agreement that they number in the tens of thousands. A common number in the scientific literature is 60,000 lncRNAs.

Tuesday, August 23, 2016

Splice variants of the human triose phosphate isomerase gene: is alternative splicing real?

Triose phosphate isomerase (TIM) is one of the enzymes in the gluconeogenesis pathway leading to the synthesis of glucose from simple precursors. It also plays a role in the degradation of glucose (glycolysis). The enzyme catalyzes the following reaction ....


Triose phosphate isomerase is found in almost all species. The structure and sequence of the enzyme are well conserved. It is a classic β-barrel enzyme that usually forms a dimer. The overall structure of a single subunit is a classic example of an αβ-barrel known as a TIM-barrel in reference to this enzyme.

To the best of my knowledge, no significant variants of this enzyme due to alternative promoters, alternative splicing, or proteolytic cleavage are known.1 The enzyme has been actively studied in biochemistry laboratories for at least eighty years.

Wednesday, August 03, 2016

More junk science in Science

The latest issue of the journal Science (Aug. 1, 2016) has an article on a recent paper by Aires et al. (2016) published in Developmental Cell. Here's the abstract of the paper ...

Vertebrates exhibit a remarkably broad variation in trunk and tail lengths. However, the evolutionary and developmental origins of this diversity remain largely unknown. Posterior Hox genes were proposed to be major players in trunk length diversification in vertebrates, but functional studies have so far failed to support this view. Here we identify the pluripotency factor Oct4 as a key regulator of trunk length in vertebrate embryos. Maintaining high Oct4 levels in axial progenitors throughout development was sufficient to extend trunk length in mouse embryos. Oct4 also shifted posterior Hox gene-expression boundaries in the extended trunks, thus providing a link between activation of these genes and the transition to tail development. Furthermore, we show that the exceptionally long trunks of snakes are likely to result from heterochronic changes in Oct4 activity during body axis extension, which may have derived from differential genomic rearrangements at the Oct4 locus during vertebrate evolution.
... those ignorant of history are not condemned to repeat it; they are merely destined to be confused.

Stephen Jay Gould
Ontogeny and Phylogeny (1977)
The results were written up by a freelance journalist named Diana Crow [‘Junk DNA’ tells mice—and snakes—how to grow a backbone]. She writes ...
‘Junk DNA’ tells mice—and snakes—how to grow a backbone

Why does a snake have 25 or more rows of ribs, whereas a mouse has only 13? The answer, according to a new study, may lie in "junk DNA," large chunks of an animal’s genome that were once thought to be useless. The findings could help explain how dramatic changes in body shape have occurred over evolutionary history.

Scientists began discovering junk DNA sequences in the 1960s. These stretches of the genome—also known as noncoding DNA—contain the same genetic alphabet found in genes, but they don’t code for the proteins that make us who we are. As a result, many researchers long believed this mysterious genetic material was simply DNA debris accumulated over the course of evolution. But over the past couple decades, geneticists have discovered that this so-called junk is anything but. It has important functions, such as switching genes on and off and setting the timing for changes in gene activity.
Sandwalk readers will see all the mistakes and misconceptions in these paragraphs. She's talking about regulatory sequences that were never, ever, thought to be junk. The paper being discussed has nothing to do with junk DNA and the results do not in any way alter our understanding of developmental gene regulation.

If you look carefully at the abstract, you'll see the word "heterochronic." This is one of Stephen Jay Gould's favorite words. He wrote about it in Ontogeny and Phylogeny.
I wish to emphasize one other distinction. Evolution occurs when ontogeny is altered in one of two ways: when new characters are introduced at any stage of development with varying effects upon subsequent stages, or when characters already present undergo changes in developmental timing. Together, these two processes exhaust the formal concept of phyletic change; the second process is heterochrony. [my emphasis ... LAM] If change in developmental timing is important in evolution, then this second process must be very common.
This was written in 1977—that's almost 40 years ago! These ideas were around for decades before Gould wrote his book1 and they have been shown to be correct by numerous studies in the 1980s.

What's going on here? Science is supposed to be one of the leading science journals. How could it publish an article that misrepresents the field so badly? Do the editors send these "Latest News" articles out for review?


1. Ed Lewis shared the Nobel Prize in 1995 for his contribution to "the genetic control of early embryonic development" [The Nobel Prize in Physiology or Medicine 1995].

Thursday, July 28, 2016

You are junk

There's an article about junk DNA in the latest issue of New Scientist (July 27, 2016) [You are junk: Why it’s not your genes that make you human]. I've already discussed the false meme at the beginning of the article [False history and the number of genes: 2016]. Now it's time to look at the main argument.

The subtitle is ...
Genes make proteins make us – that was the received wisdom. But from big brains to opposable thumbs, some of our signature traits could come from elsewhere.
You can see where this is going. You start with a false paradigm, "Genes make proteins make us," then proceed to refute it. This is called "paradigm shafting."1

False history and the number of genes: 2016

There's an article about junk DNA in the latest issue of New Scientist. The title is: You are junk: Why it’s not your genes that make you human. The author is Colin Barras, a science writer from Michigan with a Ph.D. in paleontology.

He begins with .....
IT WAS a discovery that threatened to overturn everything we thought about what makes us human. At the dawn of the new millennium, two rival teams were vying to be the first to sequence the human genome. Their findings, published in February 2001, made headlines around the world. Back-of-the-envelope calculations had suggested that to account for the sheer complexity of human biology, our genome should contain roughly 100,000 genes. The estimate was wildly off. Both groups put the actual figure at around 30,000. We now think it is even fewer – just 20,000 or so.

"It was a massive shock," says geneticist John Mattick. "That number is tiny. It’s effectively the same as a microscopic worm that has just 1000 cells."

Thursday, June 30, 2016

Do Intelligent Design Creationists still think junk DNA refutes ID?

I'm curious about whether Intelligent Design Creationists still think their prediction about junk DNA has been confirmed.


Here's what Stephen Meyer wrote in Darwin's Doubt (p. 400).
The noncoding regions of the genome were assumed to be nonfunctional detritus of the trial-and-error mutational process—the same process that produced the functional code in the genome. As a result, these noncoding regions were deemed "junk DNA," including by no less a scientific luminary than Francis Crick.

Because intelligent design asserts that an intelligent cause produced the genome, design advocates have long predicted that most of the nonprotein-coding sequences in the genome should perform some biological function, even if they do not direct protein synthesis. Design theorists do not deny that mutational processes might have degraded some previously functional DNA, but we have predicted that the functional DNA (the signal) should dwarf the nonfunctional DNA (the noise), and not the reverse. As William Dembski, a leading design proponent, predicted in 1998, "On an evolutionary view we expect a lot of useless DNA. If, on the other hand, organisms are designed, we expect DNA, as much as possible, to exhibit function."
I'm trying to write about this in my book and I want to be as fair as possible.

Do most ID proponents still believe this is an important prediction from ID theory?

Do most ID proponents still think that most of the human genome is functional?


Wednesday, June 15, 2016

What does a person's genome reveal about their ethnicity and their appearance?

If you knew the complete genome sequence of someone could you tell where they came from and their ethnic background (race)? The answer is confusing according to Siddhartha Mukherjee writing in his latest book "The Gene: an intimate history." The answer appears to be "yes" but then Mukherjee denies that knowing where someone came from tells us anything about their genome or their phenotype. He writes the following on page 342.

... the genetic diversity within any racial group dominates the diversity between racial groups. This degree of intraracial variability makes "race" a poor surrogate for nearly any feature: in a genetic sense, an African man from Nigeria is so "different" from another man from Namibia that it makes little sense to lump them into the same category.

For race and genetics, then, the genome is strictly a one-way street. You can use the genome to predict where X or Y came from. But knowing where A or B came from, you can predict little about the person's genome. Or: every genome carries a signature of an individual's ancestry—but an individual's racial ancestry predicts little about the person's genome. You can sequence DNA from an African-American man and conclude that his ancestors came from Sierra Leone or Nigeria. But if you encounter a man whose great-grandparents came from Nigeria or Sierra Leone, you can say little about the features of this particular man. The geneticist goes home happy; the racist returns empty-handed.
I find this view very strange. Imagine that you were an anthropologist who was an expert on humans and human evolution. Imagine you were told that there's a woman in the next room whose eight great-grandparents all came from Japan. According to Mukherjee, such a scientist could not predict anything about the features of that woman. Does that make any sense?

I suspect this is just a convoluted way of reconciling science with political correctness.

Steven Monroe Lipkin has a different view. He's a medical geneticist who recently published a book with Jon R. Luoma titled "The Age of Genomes: tales from the front lines of genetic medicine." Here's how they explain it on page 6.
Many ethnic groups carry distinct signatures. For example, from a genome sequence you can usually tell if an individual is African-American, Caucasian, Asian, Satnami, or Ashkenazi Jew, even if you've never laid eyes on the patient. A well-regarded research scientist whom I had never met made his genome sequence publically available as part of a research study. I remember scrolling through his genetic variant files and trying, more successfully than I had expected, to guess what he would look like before I peeked at his webpage photo. The personal genome is more than skin deep.
This makes more sense to me. If you know what to look for, and Steven Monroe Lipkin certainly does, then many of the features of a particular person can be deduced from their genome sequence. And if you know which variants are more common in certain ethnic groups then you can certainly predict what a person might look like just by knowing where their ancestors came from.

What's wrong with that?


Tuesday, May 24, 2016

University of Toronto press release distorts conclusions of RNA paper

My colleague, Ben Blencowe, just published a paper ...

Sharma, E., Sterne-Weiler, T., O’Hanlon, D., and Blencowe, B.J. (2016) Global Mapping of Human RNA-RNA Interactions. Molecular Cell, [doi: 10.1016/j.molcel.2016.04.030]

ABSTRACT (Summary)

The majority of the human genome is transcribed into non-coding (nc)RNAs that lack known biological functions or else are only partially characterized. Numerous characterized ncRNAs function via base pairing with target RNA sequences to direct their biological activities, which include critical roles in RNA processing, modification, turnover, and translation. To define roles for ncRNAs, we have developed a method enabling the global-scale mapping of RNA-RNA duplexes crosslinked in vivo, ‘‘LIGation of interacting RNA followed by high-throughput sequencing’’ (LIGR-seq). Applying this method in human cells reveals a remarkable landscape of RNA-RNA interactions involving all major classes of ncRNA and mRNA. LIGR-seq data reveal unexpected interactions between small nucleolar (sno) RNAs and mRNAs, including those involving the orphan C/D box snoRNA, SNORD83B, that control steady-state levels of its target mRNAs. LIGR-seq thus represents a powerful approach for illuminating the functions of the myriad of uncharacterized RNAs that act via base-pairing interactions.

Monday, May 09, 2016

Research for a book

I'm on sabbatical this term, working on a possible book whose working title is "What's in Your Genome?: 90% of your genome is junk."

Here are some of the most important books I've read (or re-read) in the past few months.


I've also read a lot of papers and scribbled notes on what's important and what's not. The most difficult part about keeping up with the scientific literature is organizing it in some meaningful way so you can quickly find it again if you need to, something I do just about every day.

Everyone has their own method. What works for me is to keep an electronic reference with key words and links to a file folder on a particular topic. (I use EndNote.) Here are the folders with all the papers I've been reading in the past few months.


I don't know how other authors behave but for me the most difficult thing about writing a book is organizing my thoughts and planning how to present them in the most effective manner. I tend to write too much on too many topics so the initial drafts usually have to be pared down considerably. Keeping that in mind, what are YOUR favorite topics?


Sunday, March 27, 2016

Georgi Marinov reviews two books on junk DNA

The December issue of Evolution: Education and Outreach has a review of two books on junk DNA. The reviewer is Georgi Marinov, a name that's familiar to Sandwalk readers. He is currently working with Michael Lynch at Indiana University in Bloomington, Indiana, USA. You can read the review at: A deeper confusion.

The books are ...
The Deeper Genome: Why there is more to the human genome than meets the eye, by John Parrington, (Oxford, United Kingdom: Oxford University Press), 2015. ISBN:978-0-19-968873-9.

Junk DNA: A Journey Through the Dark Matter of the Genome, by Nessa Carey, (New York, United States: Columbia University Press), 2015. ISBN:978-0-23-117084-0.
You really need to read the review for yourselves but here are a few teasers.
If taken uncritically, these texts can be expected to generate even more confusion in a field that already has a serious problem when it comes to communicating the best understanding of the science to the public.
Parrington claims that noncoding DNA was thought to be junk and Georgi replies,
However, no knowledgeable person has ever defended the position that 98 % of the human genome is useless. The 98 % figure corresponds to the fraction of it that lies outside of protein coding genes, but the existence of distal regulatory elements, as nicely narrated by the author himself, has been at this point in time known for four decades, and there have been numerous comparative genomics studies pointing to a several-fold larger than 2% fraction of the genome that is under selective constraint.
I agree. That's a position that I've been trying to advertise for several decades and it needs to be constantly reiterated since there are so many people who have fallen for the myth.

Georgi goes on to explain where Parrington goes wrong about the ENCODE results. This critique is devastating, coming, as it does, from an author of the most relevant papers.1 My only complaint about the review is that Georgi doesn't reveal his credentials. When he quotes from those papers, as he does many times, he should probably have mentioned that he is an author of those quotes.

Georgi goes on to explain four main arguments for junk DNA: genetic load, the C-value Paradox, transposons (selfish DNA), and modern evolutionary theory. I like this part since it's similar to the Five Things You Should Know if You Want to Participate in the Junk DNA Debate. The audience of this journal is teachers and this is important information that they need to know, and probably don't.

His critique of Nessa Carey's book is even more devastating. It begins with,
Still, despite a few unfortunate mistakes, The Deeper Genome is well written and gets many of its facts right, even if they are not interpreted properly. This is in stark contrast with Nessa Carey’s Junk DNA: A Journey Through the Dark Matter of the Genome. Nessa Carey has a PhD in virology and has in the past been a Senior Lecturer in Molecular Biology at Imperial College, London. However, Junk DNA is a book not written at an academic level but instead intended for very broad audience, with all the consequences that the danger of dumbing it down for such a purpose entails.
It gets worse. Nessa Carey claims that scientists used to think that all noncoding DNA was junk but recent discoveries have discredited that view. Georgi sets her straight with,
Of course, scientists have had a very good idea why so much of our DNA does not code for proteins, and they have had that understanding for decades, as outlined above. Only by completely ignoring all that knowledge could it have been possible to produce many of the chapters in the book. The following are referred to as junk DNA by Carey, with whole chapters dedicated to each of them (Table 3).


The inclusion of tRNAs and rRNAs in the list of “previously thought to be junk” DNA is particularly baffling given that they have featured prominently as critical components of the protein synthesis machinery in all sorts of basic high school biology textbooks for decades, not to mention the role that rRNAs and some of the other noncoding RNAs on that list play in many “RNA world” scenarios for the origin of life. How could something that has so often been postulated to predate the origin of DNA as the carrier of genetic information (Jeffares et al. 1998; Fox 2010) and that must have been of critical importance both before and after that be referred to as “junk”?
You would think that this is something that doesn't have to be explained to biology teachers but the evidence suggests otherwise. One of those teachers recently reviewed Nessa Carey's book very favorably in the journal The American Biology Teacher and another high school teacher reveals his confusion about the subject in the comments to my post [see Teaching about genomes using Nessa Carey's book: Junk DNA].

It's good that Georgi Marinov makes this point forcefully.

Now I'm going to leave you with an extended quote from Georgi Marinov's review. Coming from a young scientist, this is very potent and it needs to be widely disseminated. I agree 100%.
The reason why scientific results become so distorted on their way from scientists to the public can only be understood in the socioeconomic context in which science is done today. As almost everyone knows at this point, science has existed in a state of insufficient funding and ever increasing competition for limited resources (positions, funding, and the small number of publishing slots in top scientific journals) for a long time now. The best way to win that Darwinian race is to make a big, paradigm shifting finding. But such discoveries are hard to come by, and in many areas might actually never happen again—nothing guarantees that the fundamental discoveries in a given area have not already been made. ... This naturally leads to a publishing environment that pretty much mandates that findings are framed in the most favorable and exciting way, with important caveats and limitations hidden between the lines or missing completely. The author is too young to have directly experienced those times, but has read quite a few papers in top journals from the 1970s and earlier, and has been repeatedly struck by the difference between the open discussion one can find in many of those old articles and the currently dominant practices.

But that same problem is not limited to science itself, it seems to be now prevalent at all steps in the chain of transmission of findings, from the primary literature, through PR departments and press releases, and finally, in the hands of the science journalists and writers who report directly to the lay audience, and who operate under similar pressures to produce eye-catching headlines that can grab the fleeting attention of readers with ever decreasing ability to concentrate on complex and subtle issues. This leads to compound overhyping of results, of which The Deeper Genome is representative, and to truly surreal distortion of the science, such as what one finds in Nessa Carey’s Junk DNA.

The field of functional genomics is especially vulnerable to these trends, as it exists in the hard-to-navigate context of very rapid technological changes, a potential for the generation of truly revolutionary medical technologies, and an often difficult interaction with evolutionary biology, a controversial for a significant portion of society topic. It is not a simple subject to understand and communicate given all these complexities while in the same time the potential and incentives to mislead and misinterpret are great, and the consequences of doing so dire. Failure to properly communicate genomic science can lead to a failure to support and develop the medical breakthroughs it promises to deliver, or what might be even worse, to implement them in such a way that some of the dystopian futures imagined by sci-fi authors become reality. In addition, lending support to anti-evolutionary forces in society by distorting the science in a way that makes it appear to undermine evolutionary theory has profound consequences that given the fundamental importance of evolution for the proper understanding of humanity’s place in nature go far beyond making life even more difficult for teachers and educators of even the general destruction of science education. Writing on these issues should exercise the needed care and make sure that facts and their best interpretations are accurately reported. Instead, books such as The Deeper Genome and Junk DNA are prime examples of the negative trends outlined above, and are guaranteed to only generate even deeper confusion.
It's not easy to explain these things to a general audience, especially an audience that has been inundated with false information and false ideas. I'm going to give it a try but it's taking a lot more effort than I imagined.


1. Georgi Marinov is an author on the original ENCODE paper that claimed 80% of our genome is functional (ENCODE Project Consortium, 2012) and the paper where the ENCODE leaders retreated from that claim (Kellis et al., 2014).

ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature, 489:57-74. [doi: 10.1038/nature11247]

Kellis, M., Wold, B., Snyder, M.P., Bernstein, B.E., Kundaje, A., Marinov, G.K., Ward, L.D., Birney, E., Crawford, G.E., and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) 111:6131-6138. [doi: 10.1073/pnas.1318948111]

Sunday, March 20, 2016

Another failure: "The Mysterious World of the Human Genome"

The Mysterious World of the Human Genome
by Frank Ryan
William Collins, an imprint of HarperCollins, London UK (2015)
ISBN 978-0-00-754906-1

This is just another "gosh, gee whiz" book on the amazing and revolutionary (not!) discoveries about the human genome. The title tells you what to expect: The Mysterious World of the Human Genome.

The author is Frank P. Ryan, a physician who was employed as an "Honorary Senior Lecturer" in the Department of Medical Education at the University of Sheffield (UK). He's a member of The Third Way group. You can read more about him at their website: Frank P. Ryan.

Wednesday, March 09, 2016

A 2004 kerfuffle over pervasive transcription in the mouse genome

The first drafts of the human genome sequence were published in 2001. There was still work to do on "finishing" the sequence but a lot of the International Human Genome Project (IHGP) team shifted to work on the mouse genome. The FANTOM Consortium and the RIKEN Genome Exploration Groups (I and II) published an analysis of mouse transcripts in December 2002.
Okazaki, Y., Furuno, M., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., Suzuki, H. et al. (2002) Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature, 420:563-573. [doi: 10.1038/nature01266]

Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 ‘transcriptional units’, contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense–antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.

Wednesday, March 02, 2016

When philosophers talk about genomes

Postgenomics is a compendium of twelve scholarly articles by philosophers and sociologists who write about the implications of the human genome sequence and subsequent work on interpreting the results. The volume is edited by Sarah Richardson, a professor in Social Sciences (History of Science) at Harvard University (Cambridge, Massachusetts, USA), and by Hallam Stevens, a professor of History at Nanyang Technological University in Singapore.


The first essay is by Stevens and Richardson and it outlines the goal of the book.

Tuesday, February 16, 2016

Happy birthday human genome sequence!

The draft sequences of the human genome were published fifteen years ago. The International Human Genome Project (IHGP) published its draft sequence in Nature on Feb. 15, 2001 (Lander et al., 2001) and Celera Genomics published its draft sequence in Science on Feb. 16, 2001 (Venter et al., 2001).1

For me the timing was perfect since I was scheduled to give a Journal Club talk on March 16th and you could hardly ask for a better topic.