
Showing posts with label Gene Expression.

Friday, December 09, 2016

Using conservation to determine whether splice variants are functional

We've been having a discussion about function and how to recognize it. This is important when it comes to determining how much junk is in our genome [see Restarting the function wars (The Function Wars Part V)]. There doesn't seem to be any consensus on how to define "function" although there's general agreement on using sequence conservation as a first step. If some sequence under investigation is conserved in other species then that's a good sign that it's under negative selection and has a biological function. What if it's not conserved? Does that rule out function? The correct answer is "no" because one can always come up with explanations/excuses for such an observation. We discussed the example of de novo genes, which, by definition, are not conserved.

Let's look at another example: splice variants. Splice variants are different forms of RNA produced from the same gene. If they are biologically relevant then they will produce different forms of the protein (for protein-coding genes). This is an example of alternative splicing if, and only if, relevance has been proven.
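To make the distinction concrete, here is a minimal sketch with hypothetical exon sequences and a tiny codon table (nothing here comes from a real gene): two splice variants of the same pre-mRNA translate into different proteins. Whether that counts as alternative splicing, in the sense used above, still depends on showing that the second isoform actually does something.

```python
# Hypothetical example: two splice variants of one gene encode different
# proteins. Exon sequences and codon table are made up for illustration.

CODON_TABLE = {
    "ATG": "M", "GCT": "A", "GAA": "E", "TGG": "W",
    "AAA": "K", "TAA": "*",
}

def translate(cds):
    """Translate a coding sequence codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i+3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical exons (lengths chosen to stay in frame).
exons = {"e1": "ATGGCT", "e2": "GAATGG", "e3": "AAATAA"}

variant_1 = exons["e1"] + exons["e2"] + exons["e3"]   # all three exons
variant_2 = exons["e1"] + exons["e3"]                 # exon 2 skipped

print(translate(variant_1))  # MAEWK
print(translate(variant_2))  # MAK
```

The two RNAs are undeniably different transcripts; the open question for any real gene is whether the shorter protein has a biological function or is just splicing noise.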

Tuesday, December 06, 2016

How many proteins in the human proteome?

Humans have about 25,000 genes. About 20,000 of these genes are protein-coding genes.1 That means, of course, that humans make at least 20,000 proteins. Not all of them are different since the number of protein-coding genes includes many duplicated genes and gene families. We would like to know how many different proteins there are in the human proteome.

The latest issue of Science contains an insert with a chart of the human proteome produced by The Human Protein Atlas. Publication was timed to correspond with release of a new version of the Cell Atlas at the American Society of Cell Biology meeting in San Francisco. The Cell Atlas maps the location of about 12,000 proteins in various tissues and organs. Mapping is done primarily by looking at whether or not a gene is transcribed in a given tissue.

A total of 7367 genes (60%) are expressed in all tissues. These "housekeeping" genes correspond to the major metabolic pathways and the gene expression pathway (e.g. RNA polymerase subunits, ribosomal proteins, DNA replication proteins). Most of the remaining genes are tissue-specific or developmentally specific.
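As a rough illustration of the kind of classification behind those numbers, here is a minimal sketch with made-up expression data (the gene symbols are real genes used only as placeholders; the tissue calls are invented): a gene detected in every tissue is counted as housekeeping, a gene detected in only one tissue as tissue-specific.

```python
# Toy classification of genes as housekeeping vs. tissue-specific based on
# which tissues their transcripts are detected in. Data are invented.

expression = {   # gene -> set of tissues where transcripts are detected
    "RPL13": {"liver", "brain", "heart"},    # ribosomal protein: everywhere
    "POLR2A": {"liver", "brain", "heart"},   # RNA polymerase subunit: everywhere
    "ALB": {"liver"},                        # albumin: liver only
    "MYH7": {"heart"},                       # cardiac myosin: heart only
}
all_tissues = {"liver", "brain", "heart"}

housekeeping = [g for g, t in expression.items() if t == all_tissues]
tissue_specific = [g for g, t in expression.items() if len(t) == 1]

print(f"housekeeping: {housekeeping}")        # ['RPL13', 'POLR2A']
print(f"tissue-specific: {tissue_specific}")  # ['ALB', 'MYH7']
print(f"fraction housekeeping: {len(housekeeping) / len(expression):.0%}")
```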

Friday, October 07, 2016

Scientists at the Lawrence Berkeley National Laboratory do not understand basic molecular biology

The Lawrence Berkeley National Laboratory employs a number of scientists who work on genes and gene expression. Here's part of a press release published two days ago [For Normal Heart Function, Look Beyond the Genes: Loss of noncoding elements of genome results in heart abnormalities, finds Berkeley Lab study]. It demonstrates that the workers at this National Laboratory don't understand anything about mammalian genomes.

The only other possibility is that the person who wrote the press release doesn't understand molecular biology1 and the scientists who work there just don't care what their institution publishes.
Researchers have shown that when parts of a genome known as enhancers are missing, the heart works abnormally, a finding that bolsters the importance of DNA segments once considered “junk” because they do not code for specific proteins.
Regular readers of this blog know that ...
  1. No knowledgeable scientist ever said that all noncoding DNA was junk.
  2. We've known about regulatory sequences for half a century. We've known about enhancers—just another kind of regulatory sequence—for thirty-five years. Nobody ever thought they were junk. Nobody ever thought they were unimportant.
When scientists sequenced the human genome, they discovered that less than 5 percent of our DNA were genes that actually coded for protein sequences. The biological functions of the noncoding portions of the genome were unclear.

Over the past fifteen years, however, there has been a growing appreciation for the importance of these noncoding regions, thanks in large part to the efforts of individual labs and, more recently, large international efforts such as the Encyclopedia of DNA Elements (ENCODE) project.

What became clear from this work is that there are many elements of the genome, including enhancers, that are involved in regulating gene expression, even though they do not encode for proteins directly.
At some point this flagrant misrepresentation of facts must be stopped. It's hurting science.

How can you believe anything in the press release once you read this? Do you think this represents the views of the scientists who published the paper? If so, shame on them. If not, shame on the Lawrence Berkeley National Laboratory.


1. I sent her a link to this post.

Tuesday, August 23, 2016

Splice variants of the human triose phosphate isomerase gene: is alternative splicing real?

Triose phosphate isomerase (TIM) is one of the enzymes in the gluconeogenesis pathway leading to the synthesis of glucose from simple precursors. It also plays a role in the degradation of glucose (glycolysis). The enzyme catalyzes the interconversion of dihydroxyacetone phosphate and glyceraldehyde 3-phosphate.


Triose phosphate isomerase is found in almost all species, and the structure and sequence of the enzyme are well conserved. It usually forms a dimer, and the overall structure of a single subunit is a classic example of an αβ-barrel, known as a TIM barrel in reference to this enzyme.

To the best of my knowledge, no significant variants of this enzyme due to alternative promoters, alternative splicing, or proteolytic cleavage are known.1 The enzyme has been actively studied in biochemistry laboratories for at least eighty years.

Saturday, July 30, 2016

Siddhartha Mukherjee tries to correct his book

There are lots of things wrong with Mukherjee's best-selling book The Gene. I've listed a few things that I know about [What is a "gene" and how do genes work according to Siddhartha Mukherjee?]. Others have come up with different problems.

The biggest problem is that Mukherjee misrepresents the current state of knowledge in genetics, biochemistry, and molecular biology. He misleads his readers by promoting silly viewpoints that conflict with the consensus view. He doesn't mention that there are other views that are well supported by tons of scientific evidence.

The best example is regulation of gene expression. He fails to explain the standard textbook understanding of transcriptional regulation by transcription factors—a view that's solidly backed by decades of work in biochemistry, developmental genetics, molecular biology, and genomics. Instead, he promotes a flaky epigenetic theory that, according to him, threatens to overthrow Darwinian evolution.

Thursday, July 28, 2016

You are junk

There's an article about junk DNA in the latest issue of New Scientist (July 27, 2016) [You are junk: Why it’s not your genes that make you human]. I've already discussed the false meme at the beginning of the article [False history and the number of genes: 2016]. Now it's time to look at the main argument.

The subtitle is ...
Genes make proteins make us – that was the received wisdom. But from big brains to opposable thumbs, some of our signature traits could come from elsewhere.
You can see where this is going. You start with a false paradigm, "Genes make proteins make us," then proceed to refute it. This is called "paradigm shafting."1

Monday, August 10, 2015

Insulators, junk DNA, and more hype and misconceptions

The folks at Evolution News & Views (sic) can serve a very useful purpose. They are constantly scanning the scientific literature for any hint of evidence to support their claim about junk DNA. Recall that Intelligent Design Creationists have declared that if most of our genome is junk then intelligent design is falsified since one of the main predictions of intelligent design is that most of our genome will be functional.

They must be getting worried because their most recent posts sound quite desperate. The latest one is The Un-Junk Industry. It quotes a popular press report on a paper published recently in the Proceedings of the National Academy of Sciences (USA). The creationists concede that the paper itself doesn't even mention junk DNA but the article in EurekAlert does.

Sunday, July 19, 2015

The fuzzy thinking of John Parrington: pervasive transcription

Opponents of junk DNA usually emphasize the point that they were surprised when the draft human genome sequence was published in 2001. They expected about 100,000 genes but the initial results suggested fewer than 30,000 (the final number is about 25,000).1 The reason they were surprised was that they had not kept up with the literature on the subject and they had not been paying attention when the sequence of chromosome 22 was published in 1999 [see Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome].

The experts were expecting about 30,000 genes and that's what the genome sequence showed. Normally this wouldn't be such a big deal. Those who were expecting a large number of genes would just admit that they were wrong and they hadn't kept up with the literature over the past 30 years. They should have realized that discoveries in other species and advances in developmental biology had reinforced the idea that mammals only needed about the same number of genes as other multicellular organisms. Most of the differences are due to regulation. There was no good reason to expect that humans would need a huge number of extra genes.

That's not what happened. Instead, opponents of junk DNA insist that the complexity of the human genome cannot be explained by such a low number of genes. There must be some other explanation to account for the missing genes. This sets the stage for at least seven different hypotheses that might resolve The Deflated Ego Problem. One of them is the idea that the human genome contains thousands and thousands of nonconserved genes for various regulatory RNAs. These are the missing genes and they account for a lot of the "dark matter" of the genome—sequences that were thought to be junk.

Here's how John Parrington describes it on page 91 of his book.
The study [ENCODE] also found that 80 per cent of the genome was generating RNA transcripts having importance, many were found only in specific cellular compartments, indicating that they have fixed addresses where they operate. Surely there could hardly be a greater divergence from Crick's central dogma than this demonstration that RNAs were produced in far greater numbers across the genome than could be expected if they were simply intermediates between DNA and protein. Indeed, some ENCODE researchers argued that the basic unit of transcription should now be considered as the transcript. So Stamatoyannopoulos claimed that 'the project has played an important role in changing our concept of the gene.'
This passage illustrates my difficulty in coming to grips with Parrington's logic in The Deeper Genome. Just about every page contains statements that are either wrong or misleading, and when he strings them together they lead to a fundamentally flawed conclusion. In order to critique the main point, you have to correct each of the so-called "facts" that he gets wrong. This is very tedious.

I've already explained why Parrington is wrong about the Central Dogma of Molecular Biology [John Avise doesn't understand the Central Dogma of Molecular Biology]. His readers don't know that he's wrong so they think that the discovery of noncoding RNAs is a revolution in our understanding of biochemistry—a revolution led by the likes of John A. Stamatoyannopoulos in 2012.

The reference in the book to the statement by Stamatoyannopoulos is from the infamous Elizabeth Pennisi article on ENCODE Project Writes Eulogy for Junk DNA (Pennisi, 2012). Here's what she said in that article ...
As a result of ENCODE, Gingeras and others argue that the fundamental unit of the genome and the basic unit of heredity should be the transcript—the piece of RNA decoded from DNA—and not the gene. “The project has played an important role in changing our concept of the gene,” Stamatoyannopoulos says.
I'm not sure what concept of a gene these people had before 2012. It appears that John Parrington is under the impression that genes are units that encode proteins and maybe that's what Pennisi and Stamatoyannopoulos thought as well.

If so, then perhaps the publicity surrounding ENCODE really did change their concept of a gene, but all that proves is that they were remarkably uninformed before 2012. Intelligent biochemists have known for decades that the best definition of a gene is "a DNA sequence that is transcribed to produce a functional product."2 In other words, we have been defining a gene in terms of transcripts for 45 years [What Is a Gene?].

This is just another example of wrong and misleading statements that will confuse readers. If I were writing a book I would say, "The human genome sequence confirmed the predictions of the experts that there would be no more than 30,000 genes. There's nothing in the genome sequence or the ENCODE results that has any bearing on the correct understanding of the Central Dogma and there's nothing that changes the correct definition of a gene."

You can see where John Parrington's thinking is headed. Apparently, Parrington is one of those scientists who were completely unaware that genes could specify functional RNAs and unaware that Crick knew this back in 1970 when he tried to correct people like Parrington. Thus, Parrington and his colleagues were shocked to learn that the human genome had only 25,000 genes and that many of them didn't encode proteins. Instead of realizing that his view was wrong, he thinks that the ENCODE results overthrew those old definitions and changed the way we think about genes. He tries to convince his readers that there was a revolution in 2012.

Parrington seems to be vaguely aware of the idea that most pervasive transcription is due to noise or junk RNA. However, he gives his readers no explanation of the reasoning behind such a claim. Spurious transcription is predicted because we understand the basic concept of transcription initiation. We know that promoter sequences and transcription factor binding sites are short sequences and we know that they HAVE to occur at high frequency in large genomes just by chance. This is not just speculation. [see The "duon" delusion and why transcription factors MUST bind non-functionally to exon sequences and How RNA Polymerase Binds to DNA]

If our understanding of transcription initiation is correct then all you need is an activator transcription factor binding site near something that's compatible with a promoter sequence. Any given cell type will contain a number of such factors and they must bind to a large number of nonfunctional sites in a large genome. Many of these will cause occasional transcription, giving rise to low abundance junk RNA. (Most of the ENCODE transcripts are present at less than one copy per cell.)
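Here is the back-of-the-envelope arithmetic behind that claim, assuming a random-sequence genome with equal base frequencies (a rough approximation, but good enough to make the point): a specific short binding site is expected to occur a very large number of times by chance alone.

```python
# Expected number of exact matches to one specific short motif in a genome
# the size of ours, assuming random sequence with equal base frequencies.

GENOME_SIZE = 3.2e9   # haploid human genome, base pairs (approximate)

def expected_chance_sites(motif_length, genome_size=GENOME_SIZE):
    """Expected exact matches to one specific motif, counting both strands
    of a random, equal-frequency genome."""
    p = 0.25 ** motif_length      # probability of a match at one position
    return 2 * genome_size * p    # two strands

for n in (6, 8, 10):
    print(f"{n} bp site: ~{expected_chance_sites(n):,.0f} chance occurrences")
# roughly 1.6 million, 98,000, and 6,000 matches expected by chance alone
```

Even a 10 bp site, which is longer than many real transcription factor recognition sequences, is expected thousands of times by chance, which is why occasional spurious initiation is the default expectation rather than a surprise.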

Different tissues will have different transcription factors. Thus, the low abundance junk RNAs must exhibit tissue specificity if our prediction is correct. Parrington and the ENCODE workers seem to think that the cell specificity of these low abundance transcripts is evidence of function. It isn't—it's exactly what you expect of spurious transcription. Parrington and the ENCODE leaders don't understand the scientific literature on transcription initiation and transcription factor binding sites.

It takes me an entire blog post to explain the flaws in just one paragraph of Parrington's book. The whole book is like this. The only thing it has going for it is that it's better than Nessa Carey's book [Nessa Carey doesn't understand junk DNA].


1. There are about 20,000 protein-encoding genes and an unknown number of genes specifying functional RNAs. I'm estimating that there are about 5,000 but some people think there are many more.

2. No definition is perfect. My point is that defining a gene as a DNA sequence that encodes a protein is something that should have been purged from textbooks decades ago. Any biochemist who ever thought seriously enough about the definition to bring it up in a scientific paper should be embarrassed to admit that they ever believed such a ridiculous definition.

Pennisi, E. (2012) "ENCODE Project Writes Eulogy for Junk DNA." Science 337: 1159-1161. [doi:10.1126/science.337.6099.1159]

Monday, March 23, 2015

Quantifying the "central dogma"

There was a short article in a recent issue of Science that caught my eye. The title was "Statistics requantitates the central dogma."

As most Sandwalk readers know, The Central Dogma of Molecular Biology says,
... once (sequential) information has passed into protein it cannot get out again (F.H.C. Crick, 1958)
The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential information. It states that such information cannot be transferred from protein to either protein or nucleic acid. (F.H.C. Crick, 1970)
You might wonder how you can quantify the idea that once information gets into protein it can't flow back to nucleic acids. You can't, of course.

The authors are referring to the standard scheme of information flow from DNA to RNA to protein. This is often mistakenly referred to as the Central Dogma by those scientists who haven't read the original papers. In this case, the authors of the Science article are asking whether the levels of protein in different cells are mostly controlled at the level of transcription, translation, mRNA degradation, or protein degradation.
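For readers who want the quantitative framing, here is a minimal sketch of the standard birth-and-death kinetic model that such studies use; the rate constants below are illustrative, not measured values from the paper.

```python
# At steady state, protein abundance is set jointly by the rates of
# transcription, mRNA decay, translation, and protein decay.
# Rate constants are illustrative, not measurements.

def steady_state_protein(k_tx, k_mdeg, k_tl, k_pdeg):
    """Steady-state mRNA and protein copies per cell for a simple
    birth-death model: dm/dt = k_tx - k_mdeg*m, dp/dt = k_tl*m - k_pdeg*p."""
    m = k_tx / k_mdeg          # mRNA copies per cell
    p = k_tl * m / k_pdeg      # protein copies per cell
    return m, p

m, p = steady_state_protein(
    k_tx=2.0,      # mRNAs made per hour
    k_mdeg=0.5,    # per hour (mRNA half-life ~1.4 h)
    k_tl=40.0,     # proteins made per mRNA per hour
    k_pdeg=0.05,   # per hour (protein half-life ~14 h)
)
print(f"mRNA ~ {m:.0f} copies, protein ~ {p:.0f} copies per cell")
# A two-fold change in any one of the four rates changes the protein level
# two-fold, which is why each step is a potential point of regulation.
```

Asking which of the four rates explains most of the variation in protein levels between genes and between cells is a perfectly good question; it just isn't a test of the Central Dogma.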

Mary Lyon (1925 - 2014)

Mary Lyon died on Christmas day last December. She was 89 years old.

She was a famous mouse geneticist who spent most of her working career at the MRC labs in Harwell, United Kingdom (near Oxford). The labs are known as an international center for mouse genetics.

Mary Lyon is famous for discovering the phenomenon of X chromosome inactivation. This is when one of the two X chromosomes of female mammals is selectively inactivated so that the products of the X chromosome genes are quantitatively similar to the dosage in males, where there's only one X chromosome. The phenomenon used to be referred to as Lyonization.

I never met Mary Lyon but from what people say about her, I'm sure I would have liked her. Here's an excerpt from the obituary in Nature: Mary F. Lyon (1925 - 2014).
Lyon was a central figure in twentieth-century mouse genetics. She laid the intellectual foundations and developed the genetic tools for the use of mice as model organisms in molecular medicine, cell and developmental biology and in deciphering the function of the human genome. Lyon was editor of Mouse News Letter from 1956 to 1970, a publication that had a key role in establishing a mouse-focused research community in the pre-Internet age. She also helped to develop a common language for the field by chairing the Committee on Standardised Genetic Nomenclature for Mice from 1975 to 1990. Her pivotal contribution was recognized by the naming of the Mary Lyon Centre, an international facility for mouse-genetic resources, opened at Harwell in 2004, and by the creation of the Mary Lyon Medal by the UK Genetics Society in 2014.

Because everything Mary said was so carefully thought through, she could be difficult to talk to: on the phone, it was easy to think you had been cut off. She did not suffer fools gladly, but was a great supporter of the bright young scientist, often eschewing authorship of publications to enhance the profile of junior collaborators. She was intellectually rigorous but not dictatorial. When I began my PhD with her in 1977, she gave me a handful of papers, showed me the genetic tools — mice carrying the various mutations and chromosomal rearrangements — and said, “do something on X-inactivation”. That degree of academic freedom was exhilarating, coupled as it was with the safety net of robust critique.

... Her first love was mice, although she always had a cat — a tortoiseshell, of course.
X chromosome inactivation is one of the classic examples of epigenetics, sensu stricto. It was the subject of one of my most popular posts of all time: Calico cats. Calico cats almost always have to be female but there are very rare examples of male calico cats. Can anyone figure out why?



Tuesday, March 10, 2015

A physicist tries to understand junk DNA

Rob Sheldon has a Ph.D. in physics and an M.A. in religion.1 With two strikes against him already, he attempts to understand biology by discussing evolution, junk DNA, and the Onion Test [Physicist suggests: “Onion test” for junk DNA is challenge to Darwinism, not ID]. As you might imagine, posting on Uncommon Descent in support of Intelligent Design Creationism leads directly to strike three.

The Onion Test was created by Ryan Gregory in 2007 [The Onion Test] and published in the scientific literature by Palazzo and Gregory in 2014. It goes like this. Take your favorite hypothesis suggesting that most of the DNA in the human genome is functional and use it to explain why the onion, Allium cepa, needs a genome that is five times larger than the human genome. Then explain why closely related species of onion need twenty times more DNA than humans.
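The arithmetic is simple enough to write down. The onion genome sizes below are just the 5x and 20x figures quoted above applied to an approximate 3.2 Gb human genome; they are illustrative numbers, not independent measurements.

```python
# Illustrative genome sizes (haploid, base pairs) for the Onion Test arithmetic.
genome_sizes = {
    "human": 3.2e9,
    "Allium cepa (onion)": 5 * 3.2e9,       # ~5x human, as quoted above
    "larger Allium relative": 20 * 3.2e9,   # ~20x human, as quoted above
}

human = genome_sizes["human"]
for species, size in genome_sizes.items():
    print(f"{species}: {size / 1e9:.1f} Gb, {size / human:.0f}x the human genome")
```

Any hypothesis that explains the "function" of most human DNA has to explain what the onion is doing with five times as much, and what its relatives are doing with twenty times as much.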

Tuesday, September 23, 2014

A new mechanism of gene regulation!

I love it when new things are discovered, especially if they concern biochemistry. I'm always on the lookout for exciting discoveries that are going to make it into the next edition of my textbook.

That's why my eyes lit up (not!) when I saw this headline in Biology News Net: New mechanism in gene regulation revealed. Here's the teaser ...
The information encoded in our genes is translated into proteins, which ultimately mediate biological functions in an organism. Messenger RNA (mRNA) plays an important role, as it is the molecular template used for translation. Scientists from the Helmholtz Zentrum Muenchen and the Technische Universität Muenchen, in collaboration with international colleagues, have now unraveled a molecular mechanism of mRNA recognition, which is essential for understanding differential gene regulation in male and female organisms. The results are published in the renowned scientific journal Nature.
It took me a few minutes to track down the article because there weren't many hints in the press release. Turns out it still hasn't appeared in the print copy but it's available online.
Hennig, J., Militti, C., Popowicz, G.M., Wang, I., Sonntag, M., Geerlof, A., Gabel, F., Gebauer, F., and Sattler, M. (2014) Structural basis for the assembly of the Sxl–Unr translation regulatory complex. Nature published online Sept. 7, 2014 [doi:10.1038/nature13693]
The "new mechanism" is the binding of a protein to mRNA to block translation.

I suppose it depends on your definition of "new." We've been teaching undergraduates about this for over thirty years.

There's nothing in the paper about a new mechanism of gene regulation and there's no evidence in the press release that any of the authors make such a claim.


Thursday, August 07, 2014

The Function Wars: Part IV

The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.
Stephen Jay Gould (1982)
This is my fourth post on the function wars.

The first post in this series covered the various definitions of "function" [Quibbling about the meaning of the word "function"]. In the second post I tried to create a working definition of "function" and I discussed whether active transposons count as functional regions of the genome or junk [The Function Wars: Part II]. I claim that junk DNA is DNA that is nonfunctional and it can be deleted from the genome of an organism without affecting its survival, or the survival of its descendants.

In the third post I discussed a paper by Rands et al. (2014) presenting evidence that about 8% of the human genome is conserved [The Function Wars: Part III]. This is important since many workers equate sequence conservation with function. It suggests that only 8% of our genome is functional and the rest is junk. The paper is confusing and I'm still not sure what they did in spite of the fact that the lead author (Chris Rands) helped us out in the comments. I don't know what level of sequence similarity they counted as "constrained." (Was it something like 35% identity over 100 bp?)
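To show what a threshold like that would mean in practice, here is a minimal sketch of percent identity in a sliding window over an ungapped pairwise alignment. The 35%-over-100-bp figure is the question posed above, not Rands et al.'s actual criterion, and the sequences below are made up.

```python
# Sliding-window percent identity between two aligned sequences, with a
# hypothetical threshold for calling a window "constrained."

def window_identity(seq_a, seq_b, window=100, threshold=0.35):
    """Yield (start, identity, passes_threshold) for each full window of
    an ungapped pairwise alignment."""
    assert len(seq_a) == len(seq_b)
    for start in range(0, len(seq_a) - window + 1):
        matches = sum(a == b for a, b in zip(seq_a[start:start + window],
                                             seq_b[start:start + window]))
        identity = matches / window
        yield start, identity, identity >= threshold

# Toy example with a short window; real analyses use genome alignments.
human = "ACGTACGTACGTACGTACGT"
other = "ACGTTCGTACGAACGTACCT"
for start, ident, ok in window_identity(human, other, window=10, threshold=0.35):
    print(f"pos {start:2d}: {ident:.0%} identical, constrained? {ok}")
```

The point of the question in the post is that the fraction of the genome called "constrained" depends heavily on where you set this kind of cutoff, so the 8% figure can't be evaluated without knowing the criterion.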

My position is that there's no simple definition of function but sequence conservation is a good proxy. It's theoretically possible to have selection for functional bulk DNA that doesn't depend on sequence but, so far, there are no believable hypotheses that make the case. It is wrong to arbitrarily DEFINE function in terms of selection (for sequence) because that rules out all bulk DNA hypotheses by fiat and that's not a good way to do science.

So, if the Rands et al. results hold up, it looks like more than 90% of our genome is junk.

Let's see how a typical science writer deals with these issues. The article I'm selecting is from Nature. It was published online yesterday (Aug. 6, 2014) (Woolston, 2014). The author is Chris Woolston, a freelance writer with a biology background. Keep in mind that it was Nature that started the modern function wars by falling hook, line, and sinker for the ENCODE publicity hype. As far as I know, the senior editors have not admitted that they, and their reviewers, were duped.

Tuesday, July 29, 2014

Walter Gehring (1939 - 2014)

I just learned today that Walter Gehring died in a car accident in Greece on May 29th. I learned of his death from the obituary by Michael Levine in Science [Walter Gehring (1939–2014)]. There's another obituary on the Biozentrum (Basel, Switzerland) website [Obituary for Walter Gehring (1939 – 2014)]. He was only seven years older than me.

I first met Walter Gehring when I was a post-doc in Alfred Tissières' lab in Geneva (Switzerland) in the mid-1970s. The two labs collaborated on cloning and characterizing the major heat shock gene (Hsp70) of Drosophila melanogaster. Paul Schedl and Spyros Artavanis-Tsakonas made the library in Gehring's lab in 1976-1977 and Marc-Edouard Mirault and I isolated the mRNA for screening and then identified the genes we cloned. The result was three papers in Cell (see below). (John Lis, then in David Hogness' lab, was cloning the same gene.)

I met Gehring dozens of times but I only had a few conversations with him one-on-one. We always talked about evolution. I always found him to be very charming and very curious and not embarrassed to admit that he didn't know something. Other post-docs and students in his lab have different impressions.

As Michael Levine puts it ...
An amazing group of students and postdocs was attracted to the Gehring lab over the years: Eric Wieschaus (Nobelist), Christianne Nüsslein-Volhard (Nobelist), David Ish-Horowicz, Spyros Artavanis-Tsakonas, Paul Schedl, Alex Schier, Georg Halder, Hugo Bellen, and Markus Affolter, to mention just a few. I worked closely with two of my future lifelong friends and colleagues: Ernst Hafen and Bill McGinnis. The lab was an absolute blast, but a strange mix of anarchy and oppression. Walter permitted considerable independence, but was hardly laissez-faire. He could be confrontational, and did not hesitate to call us out (particularly me) when he felt we were misbehaving.

I found Walter to be a complicated character. He had the mannerisms of an authoritative Herr Doktor Professor, but was also folksy and unaffected and always ready to laugh and joke. He sometimes felt competitive with his students and postdocs, but was also highly supportive and proud of our independent careers. In short, I believe the key to Walter's success was his yin and yang embodiment of old-world scholar and modern competitive scientist. He was able to exude charm and empathy, but nothing we did seemed to be quite good enough. In other words, tough love, possibly the perfect prescription for eliciting the very best efforts from his students and postdocs.
Walter Gehring was one of a small group of people who changed the way I think about science.


Artavanis-Tsakonas, S., Schedl, P., Mirault, M.-E., Moran, L. and J. Lis (1979) Genes for the 70,000 dalton heat shock protein in two cloned D. melanogaster DNA segments. Cell 17, 9-18. [doi: 10.1016/0092-8674(79)90290-3]

Moran, L., Mirault, M.-E., Tissières, A., Lis, J., Schedl, P., Artavanis-Tsakonas, S. and W.J. Gehring (1979) Physical map of two D. melanogaster DNA segments containing sequences coding for the 70,000 dalton heat shock protein. Cell 17, 1-8. [doi: 10.1016/0092-8674(79)90289-7]

Schedl, P., Artavanis-Tsakonas, S., Steward, R., Gehring, W. J., Mirault, M.-E., Goldschmidt-Clermont, M., Moran, L. and A. Tissières (1978) Two hybrid plasmids with D. melanogaster DNA sequences complementary to mRNA coding for the major heat shock protein. Cell 14, 921-929. [doi: 10.1016/0092-8674(78)90346-X]

Monday, July 28, 2014

Transcription Initiation Sites: Do You Think This Is Reasonable? (revisited)

I'm curious about how different people read the scientific literature. My way of thinking about science is to mentally construct a model of how I think things work. The more I know about a subject, the more sophisticated the model becomes.

When I read a new paper I immediately test it against my model of how things are supposed to work. If the conclusions of the paper don't fit with my views, I tend to be very skeptical of the paper. Of course I realize that my model could be wrong and I'm always on the lookout for new results that challenge the current dogma, but, in most cases, if the paper conflicts with current ideas then it's probably flawed.

This is what people mean when they talk about making sense of biology. The ENCODE papers don't make sense, according to my model of how genomes work so I was immediately skeptical of the reported claims. The arseniclife paper conflicted with my understanding of the structure of DNA and how it evolved so I knew it was wrong even before Rosie Redfield pointed out the flaws in the methodology.

Wednesday, May 14, 2014

What did the ENCODE Consortium say in 2012?

When the ENCODE Consortium published their results in September 2012, the popular press immediately seized upon the idea that most of our genome was functional and the concept of junk DNA was debunked. The "media" in this case includes writers at prestigious journals like Science and Nature and well-known science writers in other respected publications and blogs.

In most cases, those articles contained interviews with ENCODE leaders and direct quotes about the presence of large amounts of functional DNA in the human genome.

The second wave of the ENCODE publicity campaign is trying to claim that this was all a misunderstanding. According to this revisionist view of recent history, the actual ENCODE papers never said that most of our genome had to be functional and never implied that junk DNA was dead. It was the media that misinterpreted the papers. Don't blame the scientists.

You can see an example of this version of history in the comments to How does Nature deal with the ENCODE publicity hype that it created?, where some people are arguing that the ENCODE summary paper has been misrepresented.

Friday, May 09, 2014

How does Nature deal with the ENCODE publicity hype that it created?

Let's briefly review what happened in September 2012 when the ENCODE Consortium published their results (mostly in Nature).

Here's the abstract of the original paper published in Nature in September 2012 (Birney et al. 2012). Manolis Kellis (see below) is listed as a principal investigator and member of the steering committee.
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.
Most people reading this picked up on the idea that 80% of the genome had a function.

Saturday, May 03, 2014

Michael White's misleading history of the human gene

There are many ways of defining the gene but only some of them are reasonable in the 20th and 21st centuries [What Is a Gene?]. By the 1980s most knowledgeable biologists were thinking of a gene as a DNA sequence that's transcribed to produce a functional product.

They were familiar with genes that encoded proteins and with a wide variety of genes that produce functional RNAs like ribosomal RNA, transfer RNA, regulatory RNAs, and various catalytic RNAs. It would have been difficult to find many knowledgeable biologists who thought that all genes encoded proteins.

By the 1980s, most knowledgeable biologists were aware of RNA processing. They knew that the primary transcripts of genes could be modified in various ways to produce the final functional form. They knew about alternative splicing. All these things were taught in undergraduate courses and written in the textbooks.

Here's how Michael White views that history in: Your Genes Are Obsolete.

Saturday, April 19, 2014

Core concepts in genetics

I stumbled across an article in Science & Education that caught my eye. The authors discuss the way genetics is taught in introductory university courses (McElhinny et al. 2014). They quote several sources that define the core concepts of genetics that students must learn.

Here's the list ...
  1. DNA is the universal information molecule in living organisms, encoding genes and allowing for genetic variation within and genetic continuity between generations (DNA);
  2. Mendelian patterns of inheritance are directly related to the mechanisms of meiosis (MENDELIAN);
  3. Traits result from the expression of one or more genes working alone or together, with the environment, often in unpredictable ways (GENE EXPRESSION);
  4. The activities of genes and the environment regulate all developmental processes (GENES + ENVIRONMENT);
  5. Genetic variation underlies variation in traits, which is the basis for the differential survival and reproduction that allow populations to evolve (VARIATION); and
  6. The ability to analyze and manipulate genetic information raises a variety of complex issues for individuals and society (GENES + SOCIETY).
These six concepts for genetic literacy will hereafter be referred to as the core genetics concepts.
This is a strange list. Let me explain why.
  1. The structure of DNA and how it is expressed should be covered in other mandatory courses, including introductory biology and biochemistry. You should not have to spend any time at all on these topics in a genetics course. (P.S. DNA does not "encode genes.") You may want to spend some time on the biochemistry of recombination if it's not covered elsewhere. Students should understand Holliday junctions and how they are resolved.
  2. Mendelian genetics is important. Students should learn and understand the three laws he discovered. They should also learn about meiosis and sex. However, it's important for students to understand that simple transmission genetics is not limited to diploid eukaryotes. Bacteria also do genetics.
  3. Traits (phenotype) are due to information in DNA (not just genes) but most of those traits have very little to do with the external environment.
  4. Of course the activities of genes regulate development. They also regulate the citric acid cycle, photosynthesis, and protein synthesis. Surely you don't want undergraduates to think that development is the only thing that's important in genetics?
  5. It's important for students to understand that populations contain genetic variation. That means they have to learn about MUTATION and how it happens. They also have to learn why there's so much variation in populations—one of the most important discoveries in genetics in the last century. The answer is Neutral Theory and random genetic drift (a minimal drift simulation is sketched after this list). No genetics course should leave out this important concept, especially because so few students will have heard of it before enrolling in the course.
  6. Discussions about cloning, GM foods, and personal genomes are interesting but, unfortunately, there are very few scientists who can handle those issues in a genetics course. The important core concept is to get the science right and make sure students understand that getting the science right is absolutely essential whenever you discuss controversial issues.
  7. POPULATION GENETICS is an essential core concept in an introductory genetics course. You can't teach students about the genetics of EVOLUTION without it.
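As promised in point 5, here is a minimal Wright-Fisher sketch of random genetic drift for a neutral allele (population size, starting frequency, and number of generations are arbitrary): no selection acts, yet the allele frequency wanders and the allele is eventually lost or fixed.

```python
# Wright-Fisher drift for one neutral allele in a diploid population of
# constant size: each generation is a binomial sample of 2N allele copies.
import random

def wright_fisher(pop_size=100, start_freq=0.5, generations=200, seed=1):
    """Return the trajectory of a neutral allele's frequency over time."""
    random.seed(seed)
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        copies = sum(random.random() < freq for _ in range(2 * pop_size))
        freq = copies / (2 * pop_size)
        history.append(freq)
        if freq in (0.0, 1.0):    # allele lost or fixed
            break
    return history

trajectory = wright_fisher()
print(f"final frequency after {len(trajectory) - 1} generations: {trajectory[-1]:.2f}")
```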
The authors discovered that only the first three "core concepts" were taught in every genetics course. Variation and mutation were taught in only 88% of the courses they surveyed. Only 63% covered GENES + ENVIRONMENT and only 9% covered GENES + SOCIETY.

McElhinny et al. (2014) discuss one possible change in the curriculum. It's a suggestion originally made by Dougherty (2009) and echoed by Redfield (2012). The idea is to "invert" genetics courses by beginning with coverage of populations, variation, and complex traits. I strongly disagree with Rosie Redfield's proposal [see Questions for Genetics Students] but what surprises me in the McElhinny et al. (2014) paper is that they can seriously list those core concepts without mentioning mutation and population genetics.


McElhinny, T.L., Dougherty, M.J., Bowling, B.V., and Libarkin, J.C. (2014) The Status of Genetics Curriculum in Higher Education in the United States: Goals and Assessment. Science and Education 23:445-464. [doi: 10.1007/s11191-012-9566-1]

Redfield, R.J. (2012) "Why do we have to learn this stuff?"—A new genetics for 21st century students. PLoS Biology 10: e1001356 [doi: 10.1371/journal.pbio.1001356]

Monday, March 24, 2014

What is epigenetics?

Several students in my class decided to write essays on epigenetics. This was very brave of them since nobody seems to have a good definition of epigenetics and much of the hype about epigenetics is not very scientific. I'm also more than a little skeptical about some of the claims that have been made.

Here's a video. What do you think? Is this a useful contribution to our understanding of a complex issue? Is the inheritance of methylation sites at restriction/modification loci in bacteria an example of epigenetics? After E. coli divides, both cells inherit some lac repressor molecules and the lac operon is not expressed provided the parent wasn't exposed to lactose. Is this epigenetics?