Wednesday, September 05, 2007

The Role of Ultraconserved Non-Coding Elements in Mammalian Genomes

Ultraconserved elements are stretches of DNA that are 100% identical in mouse, rat, and human genomes. In order to qualify as an ultraconserved element, the length has to be greater than 200 bp. This eliminates most sequences that might be identical by chance.
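
A rough back-of-the-envelope sketch of why the 200 bp threshold is so stringent. It assumes that each aligned site stays identical in all three species independently, with probability `p_identity`. The value 0.7 and the rounded genome size are illustrative assumptions, not figures from the paper, and the function name is made up for this example.

```python
import math

def log10_expected_identical_windows(p_identity, window_bp, genome_bp):
    """log10 of the expected number of windows of length window_bp that are
    identical at every site, if sites match independently with p_identity."""
    return math.log10(genome_bp) + window_bp * math.log10(p_identity)

# Illustrative assumptions (not taken from the paper).
P_IDENTITY = 0.7   # assumed chance that a neutral site is unchanged in all species
GENOME_BP = 3e9    # rough size of a mammalian genome

for window in (20, 50, 100, 200):
    value = log10_expected_identical_windows(P_IDENTITY, window, GENOME_BP)
    print(f"{window:>4} bp: about 10^{value:.1f} windows expected by chance")
```

Under these assumptions, short identical windows are expected by chance in large numbers, while a 200 bp window is not expected even once. Real sites do not mutate independently or at a uniform rate, so this only gives the order of magnitude.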

The most interesting elements are those that fall outside of coding regions. These ultraconserved elements are most likely to be involved in regulating gene expression or some other essential feature of non-coding DNA. The fact that they are identical in species that last shared a common ancestor 100 million years ago is powerful evidence of adaptation.

Ahituv et al. (2007) set out to test this hypothesis by selecting four examples of ultraconserved elements for further analysis. They discovered that the elements function as tissue-specific enhancers in a test designed to look at how they control expression of a marker gene in mouse embryos. The results are shown in Figure 1 (left) of their paper, which was just published in the open access journal PLoS Biology.

The figure shows the genomic locations of the four ultraconserved elements: uc248 (222 bp), uc329 (307 bp), uc467 (731 bp), and uc482 (295 bp).

Of these, uc467 is the most remarkable because it is 731 bp in length and resides in the last intron of the DNA polymerase alpha 1 gene (POLA1) on the human X chromosome. The enhancer trap experiment shows that this segment of conserved DNA directs expression of the marker gene in embryonic brain cells (shown as the dark blue area in the embryo above the 467 site). This is usually taken as evidence of specific regulatory sequences that bind transcription factors.

Ahituv et al. then deleted the four ultraconserved sequences from the mouse genome using standard knockout technology. Mice that were homozygous for the knockouts showed no evidence of any defect compared to wild-type mice. In other words, the ultraconserved elements seemed to be completely dispensable—a result that is not consistent with their extreme conservation.

Junk DNA

What are the possible explanations? It's possible that the authors missed a phenotype that can only be detected outside the laboratory. It's also possible that the sequences are conserved not because they perform an important function but for some other reason. Here's how the authors explain their results:
Based on the compelling evidence that ultraconserved elements are conserved due to functional constraint, it has been proposed that their removal in vivo would lead to a significant phenotypic impact [7,8]. Accordingly, our results were unexpected. It is possible that our assays were not able to detect dramatic phenotypes that under a different setting, for instance, outside the controlled laboratory setting, would become evident. Moreover, possible phenotypes might become evident only on a longer timescale, such as longer generation time. It is also possible that subtler genetic manipulations of the ultraconserved elements might lead to an evident phenotype due to a gain-of-function-type mechanism. All four elements examined in this study demonstrated in vivo enhancer activity when tested in a transgenic mouse assay (Figure 1) [6], which would suggest regulatory element redundancy as another possible explanation for the lack of a significant impact following the removal of these specific elements. Just as gene redundancy has been shown to be responsible for the lack of phenotypes associated with many seemingly vital gene knockouts, regulatory sequence redundancy [22] can similarly provide a possible explanation for the lack of a marked phenotype in this study. While our studies have not defined a specific need for the extreme sequence constraints of noncoding ultraconserved elements, they have ruled out the hypothesis that these constraints reflect crucial functions required for viability.
[UPDATE: Ryan Gregory at Genomicron discusses the same paper with more thorough coverage of the background information and the relevance to junk DNA (Ultraconserved non-coding regions must be functional... right?). R. Ford Denison at This Week in Evolution has some thoughts on the paper (If it's junk, can we get rid of it?)]

Ahituv, N., Zhu, Y., Visel, A., Holt, A., Afzal, V., Pennacchio, L.A., and Rubin, E.M. (2007) Deletion of Ultraconserved Elements Yields Viable Mice. PLoS Biol 5(9): e234 doi:10.1371/journal.pbio.0050234.


  1. Would sequence lengths a lot smaller than 200 bp still rule out sequences that are identical by chance? For instance, PCR primers are around 20 bp long, which puts the chance of an identical sequence turning up at any given position in a genome at 1/4^20. I might also be wrong, but can't regulatory sequences that bind transcription factors be shorter than 200 bp?

  2. Ahh, so nice to see a demonstration that no amount of genome-crunching will ever eliminate the need to actually do experiments. ;)

    If not purifying selection, what other mechanism could maintain such extreme sequence conservation? Maybe I'm missing something obvious but it seems like a bit of a head-scratcher.

  3. The most interesting elements are those that fall outside of coding regions.

    Spoken like a true pluralist!

    Yet I agree. This paper is very similar to one that I commented on here:

    Three of the authors are the same on both papers as well.

  4. Very interesting i think that they obviously have an function otherwise they would drift betwen the difrent species of mammals we are missing something there can t be no magic here i read i post on Genomicron by Andras that i think is pertinent "Ryan,

    You are a great blogger because of you sharp eye for controversial issues and a sharp tongue for eloquently voicing your doubtful or occasionally outright negative opinions.

    These are important traits for a research professor – but I wonder if you also teach? People are intrigued by controversies, but need the catharsis of some meaningful resolution. Some sort of a “take home lesson” – if you will. With no resolution, titillation may result in confusion, frustration, followed by disinterest and finally boredom.

    I loved your “Onion Test” – for any who might think has a coherent concept about the function of “junk DNA” to please explain why different sub-species (e.g. of the onion) show huge differences in the amount of “junk DNA” they have.

    I took your challenge of the “Onion Test” in another blog where I provided my explanation based on FractoGene (that is, if junk DNA provides the auxiliary information for recursive fractal development of organelles, organisms and organs, governed by the fractal genes/nongenes) - it follows from fractal algorithms that small changes in their parameters may result in very rapid - or extremely slow - convergence, thus needing small or very large amounts of "Junk DNA" (that of course, is "anything but junk").

    Why don’t we formulate an “Elephant Fish Test” for the enigma of ultraconserved and ultraselected “non-coding DNA”? On one hand, such ultraselection was outright called a “Mystery” – while the last 3 authors (Venter, Strausberg, Brenner), out of 12 contributors of the “Elephant fish" paper arrived at the conclusion on conserved noncoding elements (CNEs) that “The ancient vertebrate-specific CNEs in the elephant shark and human genomes are likely to play key regulatory roles in vertebrate gene expression”. Anyone who thinks has a coherent concept about the function of whole genome should be able to provide an explanation of non-coding sequences conserved over half a billion years - yet, at the same time some of these sequences, when eliminated, apparently don't show up in regular functioning?

    Here we go. Let’s see if FractoGene not only can provide an at least as good answer to the “Onion Test” than any other I know of (please let me know if I am missing some) – but let’s see if FractoGene can pass the “Elephant Fish Test”?

    FractoGene holds that protein-coding DNA (formerly called “genes”, before ENCODE necessitated an “update of definition” of both "junk" and "gene") produce the “materials” of the organisms, while non-(directly)-coding DNA holds the information what is the “design” how they should be put together.

    As shown in JunkDNA, some "design” is worth “ultraconserving” for survival (e.g. the vertebral column for vertebrates) – while particular proteins might be substituted.

    So how about “not noticing the lack of some important design feature”?

    Since this is just a blog, let’s take an example that any truck-driver will understand. In the “evolution” of cars, some “design-elements” are “ultra-conserved” (wheels, steering, engine) - while "materials" have been substituted extremely widely. (Hardly any wood and even elimination of glass in late model cars, while none of today’s plastics in early cars, etc., etc.)

    Can it happen that one will not detect in the functioning of the automobile if an important "design element" that has been "ultraconserved" is missing?

    Of course, it can happen.

    For instance, having a spare tire is an “ultraconserved” element of the automobile design through the ages. (As for "survival of the fittest", who would pick from two comparable models a car that does *not* have a spare tire??)

    In normal use (on clean freeways), however, you will not note its absence sometimes for decades.

    Taking your car "to the wild" (to unpaved rocky roads, for instance) - is a different matter.

    Let’s see if in the scope of a blog, very intelligent “Onion” and “Elephant Fish” tests can be posted – and see if we are getting any "catharsis" (at the least, some attempts at resolutions). "

  5. Ah, the unexpected, the source of everything good! That the obvious answer doesn't seem to work is even more hopeful as an indicator of something new and wonderful.


    You are addressing the wrong blog (and with the wrong material).

  6. Or it's a clue that some stretches of sequences are really good at staying unchanged for its own sake, with no obvious effect at the organism level.

    Without redundancy, there cannot be evolution. If the pan-adaptionists were right, all life on Earth would have gone extinct ages ago.

  7. Or it's a clue that some stretches of sequences are really good at staying unchanged for its own sake, with no obvious effect at the organism level.

    But that's just restating the results- the question is, what's the mechanism by which it stays unchanged?

  8. Torbjörn Larsson:
    Nope im not adressing the wrong blog i should altought made a shorter transcription of the original reply because there are parts of it that aren t pertinent but the last part of it when it says " For instance, having a spare tire is an “ultraconserved” element of the automobile design through the ages. (As for "survival of the fittest", who would pick from two comparable models a car that does *not* have a spare tire??)

    In normal use (on clean freeways), however, you will not note its absence sometimes for decades.

    Taking your car "to the wild" (to unpaved rocky roads, for instance) - is a different matter"
    This is the point i wanted to make of this reply..

  9. Pretty freaking mysterious... what's keeping mutations out of that sequence for tens of millions of years if purifying selection seems to be excluded?

    Maybe this is highly biochemical and mechanistic... better spatial exposure to sequence-correcting enzymes or something like that? Gee. Well, I'm no biochemist.

  10. Jose:

    Nope im not adressing the wrong blog

    "... that i think is pertinent "Ryan, [sic]"

    Why Andras Pellionisz's drivel doesn't pass the Onion smell test is explained by geneticists here.

  11. Oh my god!! i didn t know Andras was an ID advocate Shame on me :) well the only reason i have mentioned was because i think that this study could have missed the function that this highly conserved sequences might have when the animals are on theirs impredictble natural habitats, thanks for the tip in relation to Andras Torbjörn Larsson!

  12. But that's just restating the results- the question is, what's the mechanism by which it stays unchanged?

    I wouldn't dare speculate beyond the authors' explanation. Suffice to say chromatin has a lot of stuff in it, and a Pac-Man is probably involved.

  13. Jose:

    i didn t know Andras was an ID advocate

    He could be that, he could even be a commercial kook with earlier publications, if just his idea was interesting. (And as you can see from the thread, in some respects it is, according to the geneticists.) But as for ID it is mostly nonpredictive.

    And Andras operates with a funny concept - IIRC he sees a correlation in one characteristic with genomic size and another correlation in one characteristic with genomic size, and then claims a causal relationship between the two characteristics. (Through his preferred "mechanism".)

    This is compounded by the fact that neither correlation seems to bear scrutiny.

    In summary, some of this could have been interesting, but Andras himself is full of it.

  14. [Pellionisz]

    Just for the record, I consider all dogmatic categorizations contrary to the purposes of advancement of science, and I am particularly turned off by name calling. However, since someone (without my knowledge or approval) quoted my conceptual explanation of the "mystery" of "ultraconserved and ultraselected elements' absence without apparent phenotypic adversity" and the dichotomy of antagonistic belief systems came up in this blog, I feel compelled to re-state that my philosophy rejects belief-systems interfering with scientific rationale. I pursue an "Algorithmic Design" approach to Genomics (NOT invoking a designer), since the coding of information of heredity in my view will not be revealed as long as we don't find some algorithmic approach. I am one of the few who do occasionally post in both of the "antagonistic" camps - perhaps this triggers a degree of envy in some.

    It is easy to attack any particular theoretical approach, especially if it is novel, but we should be mindful that Genomics has never been "theory-heavy" and these days since ENCODE even the basic axioms are largely gone.

    The 62-Founders (and members representing 35 countries) of International PostGenetics Society are "theory friendly", probably for a reason.

  15. Andras Pellionisz:

    name calling

    There is no name calling here, but an attempt at analysis. We could refer you to a group, but that makes no difference.

    since ENCODE even the basic axioms are largely gone.

    Thank you! A better observation that you are living on another planet than the rest of us couldn't be made. The ENCODE project confirms old ideas:

    Some of the conclusions reinforce ideas that have already been in the literature for several years, for example that the majority of the human genome is transcribed (see, e.g., Wong et al. 2000; Wong et al. 2001). ... So, at this stage, we have increasingly convincing evidence of function for about 3% of the genome, with another 2% likely to fall into this category as it becomes more thoroughly characterized.

  16. One possible explanation for the lack of phenotype in the UC null mice mentioned in the paper is that there is redundancy, so that some other DNA is able to execute the function of the deleted

    While in the original 2004 UC paper a point is made that "....selection to maintain protein coding, protein-nucleic acid interactions, or RNA-RNA interactions does not result in near total conservation over long stretches of bases unless multiple functions are overlaid on the same DNA".

    (Which makes intuitive sense - the more constraints you impose on a space, the more you reduce its degrees of freedom - these UC elements evidently are subject to sufficient constraints to reduce the dimensionality of this piece of sequence space from 200, to

    Combining these gives a hypothesis in which this UC DNA carries out multiple functions, with each function backed up elsewhere. Sequence conservation is then potentially driven both by the molecular-machinery constraints of multiple function, and also by selection, with any mutation that abolishes one of the functions of an element being eliminated in any individual with a non-functional backup allele for that function.

    It's interesting to note from the 2004 paper that although these UC elements are 100% conserved between human, mouse and rat, there is some population variation, with 6 validated SNPs reported inside UC elements in human.

    One possible investigation of the redundancy + multiple function hypothesis would be to see whether the genotypes at these SNPs are correlated with genotypes at other SNPs in the genome - or at least, more correlated "than expected". The idea is that the minor allele of each of these UC element SNPs presumably compromises one of the functions of its containing UC element (in view of the highly constrained nature of the sequence); and that the combination of this minor allele with a non-functional allele in the DNA of the corresponding backup - wherever that is - would result in failure of both functional elements. The two alleles should therefore not occur together in a normal individual, so that the two genotypes should be correlated.
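
The check proposed in the last comment can be sketched roughly as follows. Genotypes are coded as minor-allele counts (0, 1 or 2) per individual. The function names and the toy data are hypothetical, and a real analysis would need to account for linkage, population structure and multiple testing.

```python
from math import sqrt

def genotype_correlation(a, b):
    """Pearson correlation between two genotype vectors (minor-allele counts)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def double_carrier_counts(a, b):
    """Observed vs. expected (under independence) number of individuals
    carrying at least one minor allele at both SNPs."""
    n = len(a)
    carriers_a = sum(1 for x in a if x > 0)
    carriers_b = sum(1 for y in b if y > 0)
    observed = sum(1 for x, y in zip(a, b) if x > 0 and y > 0)
    expected = carriers_a * carriers_b / n
    return observed, expected

# Toy data (made up): minor-allele counts for 10 individuals.
uc_snp     = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]   # SNP inside a UC element
backup_snp = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]   # SNP in the hypothetical backup

r = genotype_correlation(uc_snp, backup_snp)
observed, expected = double_carrier_counts(uc_snp, backup_snp)
print(f"r = {r:.2f}; double carriers: observed {observed}, expected {expected:.1f}")
```

If the hypothesis held, individuals carrying both minor alleles would be rarer than expected under independence, which would show up as a negative correlation between the two genotype vectors.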