Saturday, February 03, 2018

What's in Your Genome?: Chapter 5: Regulation and Control of Gene Expression

I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes ]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?]. Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs [What's in Your Genome? Chapter 4: Pervasive Transcription].

Chapter 5 is Regulation and Control of Gene Expression.
Chapter 5: Regulation and Control of Gene Expression

What do we know about regulatory sequences?
The fundamental principles of regulation were worked out in the 1960s and 1970s by studying bacteria and bacteriophage. The initiation of transcription is controlled by activators and repressors that bind to DNA near the 5′ end of a gene. These transcription factors recognize relatively short sequences of DNA (6-10 bp) and their interactions have been well characterized. Transcriptional regulation in eukaryotes is more complicated for two reasons. First, there are usually more transcription factors and more binding sites per gene. Second, access to binding sites depends on the state of chromatin. Nucleosomes forming higher-order structures create a "closed" domain where DNA binding sites are not accessible. In "open" domains the DNA is more accessible and transcription factors can bind. The transition between open and closed domains is an important additional layer of gene regulation in eukaryotes.
The limitations of genomics
By their very nature, genomics studies look at the big picture. Such studies can tell us a lot about how many transcription factors bind to DNA and how much of the genome is transcribed. They cannot tell us whether the data actually reflect function. For that, you have to take a more reductionist approach and dissect the roles of individual factors on individual genes. But working on single genes can be misleading ... you may miss the forest for the trees. Genomic studies have the opposite problem: they may see a forest where there are no trees.
Regulation and evolution
Much of what we see in evolution, especially when it comes to phenotypic differences between species, is due to differences in the regulation of shared genes. The idea dates back to the 1930s and the mechanisms were worked out mostly in the 1980s. It's the reason why all complex animals should have roughly the same number of genes—a prediction that was confirmed by sequencing the human genome. This is the field known as evo-devo or evolutionary developmental biology.
           Box 5-1: Can complex regulation evolve by accident?
Slightly harmful mutations can become fixed in a small population. This may cause a gene to be transcribed less frequently. Subsequent mutations that restore transcription may involve the binding of an additional factor to enhance transcription initiation. The result is more complex regulation that wasn't directly selected.
Open and closed chromatin domains
Gene expression in eukaryotes is regulated, in part, by changing the structure of chromatin. Genes in domains where nucleosomes are densely packed into compact structures are essentially invisible. Genes in more open domains are easily transcribed. In some species, the shift between open and closed domains is associated with methylation of DNA and modifications of histones but it's not clear whether these associations cause the shift or are merely a consequence of the shift.
           Box 5-2: X-chromosome inactivation
In females, one of the X chromosomes is preferentially converted to a heterochromatic state where most of the genes are in closed domains. Consequently, many of the genes on the X chromosome are only expressed from one copy, as is the case in males. The partial inactivation of an X chromosome is mediated by a regulatory RNA molecule and this inactivated state is passed on to all subsequent descendants of the original cell.
           Box 5-3: Regulating gene expression by rearranging the genome

In several cases, the regulation of gene expression is controlled by rearranging the genome to bring a gene under the control of a new promoter region. Such rearrangements also explain some developmental anomalies, such as the growth of legs instead of antennae on the heads of fruit flies. They also account for many cancers.
ENCODE does it again
Genomic studies carried out by the ENCODE Consortium reported that a large percentage of the human genome is devoted to regulation. What the studies actually showed is that there are a large number of binding sites for transcription factors. ENCODE did not present good evidence that these sites were functional.
Does regulation explain junk?
The presence of huge numbers of spurious DNA binding sites is perfectly consistent with the view that 90% of our genome is junk. The idea that a large percentage of our genome is devoted to transcriptional regulation is inconsistent with everything we know from studies of individual genes.
           Box 5-4: A thought experiment
Ford Doolittle asks us to imagine the following thought experiment. Take the fugu genome, which is very much smaller than the human genome, and the lungfish genome, which is very much larger, and subject them to the same ENCODE analysis that was performed on the human genome. All three genomes have approximately the same number of genes and most of those genes are homologous. Will the number of transcription factor binding sites be similar in all three species or will the number correlate with the size of the genomes and the amount of junk DNA?
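A rough back-of-the-envelope version of this thought experiment can be sketched in Python. This is only an illustration under stated assumptions, not an analysis from the chapter: it treats each genome as random sequence with equal base frequencies (real genomes are not random), counts exact matches to a single motif of 6-10 bp, and uses round-number genome sizes that are approximations chosen for illustration (the 40x lungfish multiple is the figure quoted in the comments below). The function name `expected_chance_sites` is hypothetical.

```python
def expected_chance_sites(genome_bp, motif_len, both_strands=True):
    """Expected number of exact matches to ONE specific motif in random DNA.

    Assumes independent positions and equal A/C/G/T frequencies, so a given
    motif of length k matches at any position with probability (1/4)**k.
    Palindromic motifs would be double counted when both_strands is True.
    """
    per_position = 0.25 ** motif_len
    positions = max(genome_bp - motif_len + 1, 0)
    strands = 2 if both_strands else 1
    return strands * positions * per_position


# Approximate, illustrative genome sizes in base pairs (assumptions).
HUMAN_BP = 3_100_000_000
GENOMES_BP = {
    "fugu": 400_000_000,
    "human": HUMAN_BP,
    "lungfish": 40 * HUMAN_BP,  # "40x" multiple quoted in the comments
}

for name, size in GENOMES_BP.items():
    for k in (6, 8, 10):
        n = expected_chance_sites(size, k)
        print(f"{name:9s} k={k:2d}  expected chance matches: {n:,.0f}")
```

Under these assumptions the expected number of chance matches scales almost exactly with genome size, which is the pattern the junk DNA view predicts for spurious binding sites. The sketch says nothing about whether any particular site is functional.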
Small RNAs—a revolutionary discovery?
Does the human genome contain hundreds of thousands of genes for small non-coding RNAs that are required for the complex regulation of the protein-coding genes?
A “theory” that just won’t die
"... we have refuted the specific claims that most of the observed transcription across the human genome is random and put forward the case over many years that the appearance of a vast layer of RNA-based epigenetic regulation was a necessary prerequisite to the emergence of developmentally and cognitively advanced organisms." (Mattick and Dinger, 2013)
What the heck is epigenetics?
Epigenetics is a confusing term. It refers loosely to the regulation of gene expression by factors other than differences in the DNA. It's generally assumed to cover things like methylation of DNA and modification of histones. Both of these effects can be passed on from one cell to the next following mitosis. That fact has been known for decades. It is not controversial. The controversy is about whether the heritability of epigenetic features plays a significant role in evolution.
           Box 5-5: The Weismann barrier
The Weismann barrier refers to the separation between somatic cells and the germ line in complex multicellular organisms. The "barrier" is the idea that changes (e.g. methylation, histone modification) that occur in somatic cells can be passed on to other somatic cells but in order to affect evolution those changes have to be transferred to the germ line. That's unlikely. It means that Lamarckian evolution is highly improbable in such species.
How should science journalists cover this story?
The question is whether a large part of the human genome is devoted to regulation, thus accounting for an unexpectedly large genome. It's an explanation that attempts to refute the evidence for junk DNA. The issue is complex and very few science journalists are sufficiently informed to do it justice. They should, however, be making more of an effort to inform themselves about the controversial nature of the claims made by some scientists and they should be telling their readers that the issue has not yet been resolved.


25 comments :

John Harshman said...

Take the fugu genome, which is very much smaller than the human genome, and the lungfish genome, which is very much larger, and subject them to the same ENCODE analysis that was performed on the human genome.

Let me attempt to anticipate the counter-claim here. Humans are of course much more complex than fugu, so all that extra DNA is surely functional. Humans are of course much more complex than lungfish, so all that extra DNA is an index of how much less efficient lungfish developmental control is than human. Is that about it?

The Other Jim said...

Ugh... coffee in my nose from laughing....

Tom Mueller said...

Barb made the salient observation that gene expression in eukaryotes may not be regulated, in part, by changing the structure of chromatin.

As she remarked - perhaps gene expression rather requires folding such inconvenient bulk-DNA out of the way to permit efficient gene expression.

Meanwhile - Doolittle also offered the following: "That "excess" DNA (that 40x as much that lungfish have compared to us, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by us "junk" supporters."

In other words, this chromosomal super-structure constitutes non-informational nucleic acid "clean fill" (to use Doolittle's phraseology) between the crucial bits that get moved around, and represents an extravagantly redundant level of gene regulation.

... again, on the shaky presumption I really am understanding the nuances of all of this correctly.


Larry Moran said...

Bulk DNA hypotheses are perfectly reasonable. There's just no solid evidence to support them.

Restarting the function wars (The Function Wars Part V)

John Harshman said...

Hmmm..."reply" seems not to work. Tom, I missed the point where you explain why lungfish need 40x as much bulk DNA as humans do, and why fugu need almost none. And that's the big problem with any bulk DNA hypothesis.

Tom Mueller said...

Hi John - let's see if I can reply.

The issue, at least as I understand it, is less than profound. Gene regulation is always leaky, for lack of a better word. Therefore several overlapping and redundant levels of gene regulation have their advantages - akin to simultaneously wearing belt and suspenders.

I contacted Ford Doolittle and asked him whether my rendition of ENCODE was sufficiently correct to present to my students. I asked:

For example, would I perhaps be stretching a point too far when I claim interphase chromosome architecture is "functional"?

Is my suggestion perfectly acceptable from an ENCODE POV? I guess I am asking whether my redefinition of "functionality" is not similarly "disingenuous" along the lines you criticize in your PNAS paper.


Here is Ford Doolittle in his own words:

Well no, not disingenuous, but perhaps not new. That "excess" DNA (that 40x as much that lungfish have compared to us, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by us "junk" supporters.

To my mind, it's like the "clean fill" you see signs for along the highway. There may be a need for that much DNA but it doesn't matter what it is, as long as it doesn't contain deleterious sequences. You are also suggesting, I think, that larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily?) elaborate ways. I agree, as spelled out in the [PNAS] paper.


Tom Mueller said...

Hi Larry - quick question:

How does one go about getting a personally signed copy of your book? I, for one, would be willing to pay whatever extra expense is involved.

Tom Mueller said...

I had occasion to contact Ford Doolittle again.

I asked him:

There is significant disagreement regarding the precise intent of what you said in your PNAS paper “Is junk DNA bunk? A critique of ENCODE”

I was hoping against hope you would take a minute to set the record straight.

The thread can be found here:


http://sandwalk.blogspot.ca/2015/05/ford-doolittle-talks-about-transposons.html

I appreciated the reply which was direct and to the point:

Ford Doolittle: I guess I do not see the problem. The junk idea was that much DNA does not serve an informational role. It is not genes and not involved in any DIRECT way in the regulation of the expression of genes. This need not mean that selection does not care at all how much DNA an organism carries around. Moreover much DNA is transposable elements, which are answerable to selection at their own level, but do not necessarily enhance "host" fitness, so from the host's perspective they are junk...

To set context - this falls in line with my earlier query to Ford Doolittle whether all that "parasitic" DNA which serves no immediate direct benefit to the host can serve as "bric-a-brac" as opposed to "junk"?

... i.e. as material just lying around as possibly handy for future exaptation?

Like I said - "belt and suspenders"

Larry mentioned above: Bulk DNA hypotheses are perfectly reasonable. There's just no solid evidence to support them.

I am uncertain about that... are ecdysozoans perhaps more vulnerable to TE mutagenesis than deuterostomes? Has the question been addressed?

Is the question even relevant? After all, Drosophila have facultative heterochromatin, whose structure seems regulated according to the timing of those polytene puffs as different genes are being expressed.

I am the first to concede I may be overly naive and missing something

Larry Moran said...

Ford Doolittle and I disagree on the meaning of the word "junk." I think functional transposons are functional, not junk, even though they may not affect the host. To me, it makes no sense to refer to a transposon that makes a functional transposase enzyme as junk.

Similarly, spacer DNA and any other DNA that serves a non-sequence-specific function is functional, not junk. If the DNA has been selected to bulk up the genome for some reason then it isn't junk DNA.

Junk DNA is confined, in my opinion, to DNA that's dispensable in the life of the organism or the evolution of the species.

My view also conflicts with that preferred by Dan Graur. He limits the functional part of the genome to DNA that's selected for sequence and/or length. In that sense we agree that bulk/spacer DNA is functional. However, he subdivides nonfunctional DNA (rubbish DNA) into two categories: junk DNA and garbage DNA. Garbage DNA consists of detrimental sequences that are being actively selected against. To me, this is a useless distinction (junk vs garbage). Garbage DNA clearly has a selected-effect function if it's being actively eliminated by natural selection.

Tom Mueller said...

Hi Larry

I am not getting something here:

You state

1 - I think functional transposons are functional, not junk, even though they may not affect the host.

I have difficulty reconciling this with your other statement:

You suggest that: 2 - "Junk DNA is confined, in my opinion, to DNA that's dispensable in the life of the organism or the evolution of the species."

So what about those functional transposons?

- these are “DNA that’s dispensable in the life of the organism”? in other words “junk”

but simultaneously

- functional transposons … even though they may not affect the host. or, in other words "not junk".

If you rescue this apparent contradiction by invoking the eventual role of the innocuous Transposon DNA lying about as providing the eventual wherewithal important for ”the evolution of the species” – I would counter any such admittedly hypothetical rebuttal, with the question of how that hypothetical rebuttal differs from my original suggestion. (forgive me for jumping ahead on assumed rebuttals)

What am I not getting?


John Harshman said...

It would seem to me, then, that any bulk DNA in excess of that in fugu, at a maximum, must be really truly junk. There's no point in imagining that lungfish have some need for more of that bulk stuff than fugu.

John Harshman said...

What you're not getting is what Larry means by "functional" here. A functional transposon is one that's capable of inserting itself into a new place in the genome. That's functional from the perspective of the parasitic little bit of DNA, but it isn't functional from the perspective of the organism. (Of course there are a few transposon insertions that have become functional for the organism, just as any mutation can become functional, but that's not what Larry is talking about.)

Tom Mueller said...

Hi John – thanks for answering

Re: What you're not getting is what Larry means by "functional" here.

No – I got that the first time. I have no problem with the term “functional” being “fluid” depending on whether one is changing as you say “perspective” from parasite to host.

My problem is with the notion of “junk”.

Larry explains that he disagrees with both Doolittle’s and Graur’s definitions when considering the notion of “junk” from a host’s perspective.

I do not think that is, in fact, logically possible while simultaneously maintaining that "Junk DNA is confined… to DNA that's dispensable in the life of the organism or the evolution of the species".

John Harshman said...

Perhaps, then, you mistake a necessary condition for a sufficient one. Just because you limit "junk" to sequence that's useless to the host doesn't mean that all sequences useless to the host must be junk. Consider a Venn diagram with concentric circles. (On this I would disagree with Larry and call transposons junk, again excepting those few that have an organismal function.)

Larry Moran said...

Tom, biology is messy. You are spending too much time parsing details and not enough on seeing the big picture.

Functional transposons are essential to the life of the transposon "organism" and some of them produce functional proteins. I don't like to dismiss them as junk.

However, I realize that my working definition of junk DNA doesn't recognize different levels of selection so functional transposons must count as junk since they are dispensable in the life of the host organism.

Tough. No definitions in biology are 100% perfect. That's why Doolittle and Graur and a host of others have different criteria for junk DNA. It's why I think the "Function Wars" are a waste of time. Just state your criteria and move on.

The important thing is to be clear on what one means when one talks about junk DNA and why most of our genome is junk. I think I'm doing that. Only a tiny percentage (<0.1%) of our genome contains active transposons. It's irrelevant.

Larry Moran said...

John, I'm comfortable with you and Doolittle saying that active transposons are junk DNA as long as you make clear that, in this case, junk DNA may be transcribed into mRNA which is translated to produce a functional protein and the sequences are under some form of selection for function.

Recently duplicated protein-coding genes are another ambiguous example. I'd rather not call one copy "junk" even though its ultimate fate is to become a pseudogene. This is a causal-role function but I'm not going to lose sleep over it.

Tom Mueller said...

Hi Larry - we just cross-posted

I am not certain that the apparent (emphasis on the word "apparent") contradiction can be rescued by invoking higher levels of selection.

I suggest in humblest terms that your criterion/definition as currently stated does not communicate all that you, in fact, intend or imply according to your subtler-than-most understanding of the admittedly fuzzy notion of "junk".

Tom Mueller said...

oops - let's move this to the proper place in the thread:

Hi John - I love invoking Jesuitical notions of necessary vs sufficient

Let’s go back to your original suggestion:

JH: …what Larry means by "functional" here. A functional transposon is one that's capable of inserting itself into a new place in the genome. That's functional from the perspective of the parasitic little bit of DNA, but it isn't functional from the perspective of the organism.

Actually, I even disagree with that statement of yours: Any freeloading and self-propagating DNA – innocuous or not – IS INDEED “functional” from a host’s perspective; if I am understanding Larry’s contention correctly.

Let’s examine Larry’s first statement where he disagrees with Ford Doolittle:

LM: I think functional transposons are functional, not junk, even though they may not affect the host.

In other words, innocuously freeloading DNA (even when “useless to the host”) is “not junk”, when it is “active” in any transcriptional and even translational sense of the word; i.e. “not junk” even from a host’s perspective. That is how I understand Larry.

However, this is the important bit from where I admittedly may misunderstand things: “innocuously freeloading DNA” is later deemed “Junk” according to Larry’s definition where he explains:

LM: Junk DNA is confined, in my opinion, to DNA that's dispensable in the life of the organism or the evolution of the species.

This is where Larry loses me:

“Innocuously freeloading DNA” is by definition “dispensable in the life of the organism or the evolution of the species.”
Larry’s definition of “junk” juxtaposed with his restatement of Graur’s and Doolittle’s versions is internally inconsistent as I understand things. I think Larry may actually be agreeing with Graur by having accidentally conflated two different meanings of the word functional from a Biochemist’s perspective.

I welcome correction.

Tom Mueller said...

Thank you Larry - your answer to John was a big help.

I am still eager to find out how I can obtain an autographed copy of your new book ;-)

I would be delighted to pay whatever extra cost is involved.

thanks

John Harshman said...

Larry,

Are transposons really subject to selection? Certainly individual instances are not. The population of transposons in a genome is, in the sense that a transposon that suffers an inactivating mutation no longer reproduces. But that's selection on a whole different track from the sort of selection that affects non-junk (or other non-junk by your definition) sequences.

Tom Mueller said...

Hi John

This question was already discussed on an earlier occasion when I suggested the metaphor of an extension cord.

A 10 foot extension cord is just as functional as a 5 foot extension cord, when only a 5 foot cord is required to do the job.

Metaphors of course are clumsy

John Harshman said...

If you ask me, whatever length of the extension cord is unnecessary to its function is just plain junk. Now of course bulk DNA has (allegedly) a little function, but that function might be considered as distributed across the sequence; if it's 10x longer than its minimal length, each little bit of the sequence could be considered 90% junk, 10% bulk-functional. To a first approximation, junk.

Tom Mueller said...

Hi John

I don't think we are really disagreeing over fundamentals. Like Larry said - nothing worth losing sleep over

Tom Mueller said...

Interesting tweet

https://www.quantamagazine.org/with-downsized-dna-flowering-plants-took-over-the-world-20180111

Anonymous said...

Interesting article.