Showing posts with label Genome. Show all posts

Tuesday, October 09, 2018

Alternative splicing and the gene concept

I just learned about a workshop scheduled for the end of this month. The topic is: Evolutionary Roles of Transposable Elements and Non-coding DNA: The Science and the Philosophy.

I'd love to attend but it's just a small workshop designed to encourage dialogue between scientists and philosophers who are interested in the topic. Here's a list of the speakers ...
  • Ryan Gregory: Junk DNA, genome size, and the onion test.
  • Stefan Linquist: Four decades debating junk DNA and the Phenotype Paradigm is (somehow) alive and well.
  • Chris Ponting: 92.9% of the human genome evolved neutrally.
  • Paul Griffiths: Both adaptation and adaptivity are relevant to diagnosing function.
  • Ford Doolittle: Selfish genes and selfish DNA: is there a difference?
  • Justin Garson: Biological functions, the liberality problem, and transposable elements.
  • Joyce Havstad: Evolutionary Thinking about Critique of Function Talk.
  • Guillaume Bourque: Impact of transposable elements on human gene regulatory networks.
  • Ulrich Stegman: On parity, genetic causation and coding.
  • Steven Downes: Understanding non-coding variants as disease risk alleles.
  • Alexander Palazzo: How nuclear retention and cytoplasmic export of RNAs reduces the deleteriousness of junk DNA.
  • David Haig: Pax somatica
  • Cedric Feschotte: Transposable elements as catalysts of genome evolution.

Friday, July 13, 2018

How many protein-coding genes in the human genome?

The three main human databases (GENCODE/Ensembl, RefSeq, UniProtKB) contain a total of 22,210 protein-coding genes but only 19,446 of these genes are found in all three databases. That leaves 2,764 potential genes that may or may not be real. A recent publication suggests that most of them are not real genes (Abascal et al., 2018). The issue is the same problem that I discussed in recent posts [Disappearing genes: a paper is refuted before it is even published] [Nature falls (again) for gene hype].
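
The arithmetic behind those numbers is easy to check. The sketch below assumes that the 22,210 figure counts genes listed in at least one of the three catalogues (a union) and that 19,446 is their intersection. The toy gene identifiers are invented purely to illustrate the set operations; they are not real database entries.

```python
# Counts quoted in the post (Abascal et al., 2018).
TOTAL_GENES = 22_210   # genes listed in at least one database (assumed union)
SHARED_GENES = 19_446  # genes listed in all three databases (intersection)

disputed = TOTAL_GENES - SHARED_GENES
disputed_percent = 100 * disputed / TOTAL_GENES

# Hypothetical toy catalogues; these IDs are invented for illustration only.
gencode = {"G1", "G2", "G3", "G4"}
refseq = {"G1", "G2", "G3", "G5"}
uniprot = {"G1", "G2", "G6"}

toy_union = gencode | refseq | uniprot    # present in at least one catalogue
toy_shared = gencode & refseq & uniprot   # present in all three catalogues
toy_disputed = toy_union - toy_shared     # missing from at least one catalogue

print(f"{disputed} disputed genes ({disputed_percent:.1f}% of the total)")
print(f"toy example: {sorted(toy_disputed)} are missing from at least one catalogue")
```

On this reading, roughly one in eight of the annotated protein-coding genes is in dispute.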

Sunday, July 08, 2018

Nature falls (again) for gene hype

Nature is arguably the most prestigious science journal. Articles published in Nature are widely perceived to be correct, unbiased, and factual. This perception is certainly true of articles that appear in the News section of the journal since these articles are presumably written by expert science writers who have evaluated the new study and decided that it's worth reporting.

Sandwalk readers know that this perception is false (fake news). It turns out that science writers who publish in Nature are not very much better than science writers in general and that's not good.

I recently published a post about an extraordinary claim concerning the number of human genes [Disappearing genes: a paper is refuted before it is even published]. It concerns a paper posted on an archive site claiming to have found 4,998 new genes of which 1,178 are new protein-coding genes (Pertea et al., 2018). About five weeks later another paper was posted that effectively refuted the claim of new protein-coding genes (Jungreis et al., 2018). In between publication of those two papers, a freelance science writer, Cassandra Willyard, wrote an article for Nature News that covered the original claim of 4,998 new genes [New human gene tally reignites debate].

Let's see how she handled the controversy.

Disappearing genes: a paper is refuted before it is even published

Several readers alerted me to a paper that was posted on bioRxiv a few weeks ago (May 28, 2018). The paper claimed that the human genome contains 43,162 genes consisting of 21,306 protein-coding genes and 21,856 noncoding genes. The authors reported that they had discovered 3,819 new noncoding genes and 1,178 new protein-coding genes. In addition, they claim to have discovered 97,511 new splice variants raising the total number of splice variants to 12.5 per protein-coding gene although they seem to suggest that almost one-third of these splice variants are non-functional splicing errors. The most striking result, according to the authors, is that 95% of all transcripts are just transcriptional noise.

Here's the paper ...

Thursday, May 10, 2018

Philosophers talking about genes

It's important to define what you mean when you use the word "gene." I use the molecular definition since most of what I write refers to DNA sequences. There's no perfect definition but, for most purposes, a good working definition is: A gene is a DNA sequence that is transcribed to produce a functional product. [What Is a Gene?].

There are two types of genes: protein-coding genes and those that specify a functional noncoding RNA (e.g., ribosomal RNA, lincRNA). The gene is the part of the DNA that's transcribed so it includes introns. Transcription is controlled by regulatory sequences such as promoters, operators, and enhancers but these are not part of the gene.

In addition to genes, there are many other functional parts of the genome. In the case of eukaryotic genomes, these include centromeres, telomeres, origins of replication, SARs, and some other bits. None of this is new ... these functions have been known for decades and the working definition I use has been common among knowledgeable experts for half-a-century. Scientists know what they are talking about when they say that the human genome contains about 20,000 protein-coding genes and at least 5,000 genes for non-coding RNAs. They are comfortable with the idea that our genome has lots of other functional regions that lie outside of the genes.

Non-experts may not be familiar with the topic and they may have many misconceptions about genes and DNA sequences but we don't base our science on the views of non-experts.

Because of my interest in this topic, I was intrigued by the title of a new book, The Gene: from Genetics to Postgenomics. I ordered it as soon as I heard about it and I've just finished reading it. The version I read has been translated from German by Adam Bostanci.

Saturday, April 07, 2018

Required reading for the junk DNA debate

This is a list of scientific papers on junk DNA that you need to read (and understand) in order to participate in the junk DNA debate. It's not a comprehensive list because it's mostly papers that defend junk DNA and refute arguments for massive amounts of function. The only exception is the paper by Mattick and Dinger (2013).1 It's the only anti-junk paper that attempts to deal with the main evidence for junk DNA. If you know of any other papers that make a good case against junk DNA then I'd be happy to include them in the list.

If you come across a publication that argues against junk DNA, then you should immediately check the reference list. If you do not see some of these references in the list, then don't bother reading the paper because you know the author is not knowledgeable about the subject.

Brenner, S. (1998) Refuge of spandrels. Current Biology, 8:R669-R669. [PDF]

Brunet, T.D., and Doolittle, W.F. (2014) Getting “function” right. Proceedings of the National Academy of Sciences, 111:E3365-E3365. [doi: 10.1073/pnas.1409762111]

Casane, D., Fumey, J., et Laurenti, P. (2015) L’apophénie d’ENCODE ou Pangloss examine le génome humain. Med. Sci. (Paris) 31: 680-686. [doi: 10.1051/medsci/20153106023] [The apophenia of ENCODE or Pangloss looks at the human genome]

Cavalier-Smith, T. (1978) Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox. Journal of Cell Science, 34:247-278. [PDF]

Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. (USA) published online March 11, 2013. [PubMed] [doi: 10.1073/pnas.1221376110]

Doolittle, W.F., Brunet, T.D., Linquist, S., and Gregory, T.R. (2014) Distinguishing between “function” and “effect” in genome biology. Genome biology and evolution 6, 1234-1237. [doi: 10.1093/gbe/evu098]

Doolittle, W.F., and Brunet, T.D. (2017) On causal roles and selected effects: our genome is mostly junk. BMC biology, 15:116. [doi: 10.1186/s12915-017-0460-9]

Eddy, S.R. (2012) The C-value paradox, junk DNA and ENCODE. Current Biology, 22:R898. [doi: 10.1016/j.cub.2012.10.002]

Eddy, S.R. (2013) The ENCODE project: missteps overshadowing a success. Current Biology, 23:R259-R261. [doi: 10.1016/j.cub.2013.03.023]

Graur, D. (2017) Rubbish DNA: The functionless fraction of the human genome. In Evolution of the Human Genome I (pp. 19-60). Springer. [doi: 10.1007/978-4-431-56603-8_2 (book)] [PDF]

Graur, D. (2017) An upper limit on the functional fraction of the human genome. Genome Biology and Evolution, 9:1880-1885. [doi: 10.1093/gbe/evx121]

Graur, D., Zheng, Y., Price, N., Azevedo, R. B., Zufall, R. A., and Elhaik, E. (2013) On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution published online: February 20, 2013 [doi: 10.1093/gbe/evt028]

Graur, D., Zheng, Y., and Azevedo, R.B. (2015) An evolutionary classification of genomic function. Genome Biology and Evolution, 7:642-645. [doi: 10.1093/gbe/evv021]

Gregory, T. R. (2005) Synergy between sequence and size in large-scale genomics. Nature Reviews Genetics, 6:699-708. [doi: 10.1038/nrg1674]

Haerty, W., and Ponting, C.P. (2014) No Gene in the Genome Makes Sense Except in the Light of Evolution. Annual review of genomics and human genetics, 15:71-92. [doi: 10.1146/annurev-genom-090413-025621]

Hurst, L.D. (2013) Open questions: A logic (or lack thereof) of genome organization. BMC biology, 11:58. [doi: 10.1186/1741-7007-11-58]

Kellis, M., Wold, B., Snyder, M.P., Bernstein, B.E., Kundaje, A., Marinov, G.K., Ward, L.D., Birney, E., Crawford, G. E., and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) 111:6131-6138. [doi: 10.1073/pnas.1318948111]

Mattick, J. S., and Dinger, M. E. (2013) The extent of functionality in the human genome. The HUGO Journal, 7:2. [doi: 10.1186/1877-6566-7-2]

Morange, M. (2014) Genome as a Multipurpose Structure Built by Evolution. Perspectives in biology and medicine, 57:162-171. [doi: 10.1353/pbm.2014.000]

Niu, D. K., and Jiang, L. (2012) Can ENCODE tell us how much junk DNA we carry in our genome? Biochemical and biophysical research communications 430:1340-1343. [doi: 10.1016/j.bbrc.2012.12.074]

Ohno, S. (1972) An argument for the genetic simplicity of man and other mammals. Journal of Human Evolution, 1:651-662. [doi: 10.1016/0047-2484(72)90011-5]

Ohno, S. (1972) So much "junk" in our genome. In H. H. Smith (Ed.), Evolution of genetic systems (Vol. 23, pp. 366-370): Brookhaven symposia in biology.

Palazzo, A.F., and Gregory, T.R. (2014) The Case for Junk DNA. PLoS Genetics, 10:e1004351. [doi: 10.1371/journal.pgen.1004351]

Rands, C. M., Meader, S., Ponting, C. P., and Lunter, G. (2014) 8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage. PLOS Genetics, 10:e1004525. [doi: 10.1371/journal.pgen.1004525]

Thomas Jr, C.A. (1971) The genetic organization of chromosomes. Annual review of genetics, 5:237-256. [doi: annurev.ge.05.120171.001321]


1. The paper by Kellis et al. (2014) is ambiguous. It's clear that most of the ENCODE authors are still opposed to junk DNA even though the paper is mostly a retraction of their original claim that 80% of the genome is functional.

Thursday, April 05, 2018

Subhash Lakhotia: The concept of 'junk DNA' becomes junk

Continuing my survey of recent papers on junk DNA, I stumbled upon a review by Subhash Lakhotia that has recently been accepted in The Proceedings of the Indian National Science Academy (Lakhotia, 2018). It illustrates the extent of the publicity campaign mounted by ENCODE and opponents of junk DNA. In the title of this post, I paraphrased a sentence from the abstract that summarizes the point of the paper; namely, that the 'recent' discovery of noncoding RNAs refutes the concept of junk DNA.

Lakhotia claims to have written a review of the history of junk DNA but, in fact, his review perpetuates a false history. He repeats a version of history made popular by John Mattick. It goes like this. Old-fashioned scientists were seduced by Crick's central dogma into thinking that the only important part of the genome was the part encoding proteins. They ignored genes for noncoding RNAs because they didn't fit into their 'dogma.' They assumed that most of the noncoding part of the genome was junk. However, recent new discoveries of huge numbers of noncoding RNAs reveal that those scientists were very stupid. We now know that the genome is chock full of noncoding RNA genes and the concept of junk DNA has been refuted.

Peter Larsen: "There is no such thing as 'junk DNA'"

The March 2018 issue of Chromosome Research is a Special Issue on Transposable Elements and Genome Function. I found it as I was doing my routine search for papers on junk DNA in order to see whether scientists are finally beginning to understand the issue. Peter Larsen (guest editor) wrote the introduction to the special issue. He says ...
There is no such thing as “junk DNA.” Indeed, a suite of discoveries made over the past few decades have put to rest this misnomer and have identified many important roles that so-called junk DNA provides to both genome structure and function (this special issue; Biémont and Vieira 2006; Jeck et al. 2013; Elbarbary et al. 2016; Akera et al. 2017; Chen and Yang 2017; Chuong et al. 2017). Nevertheless, given the historical focus on coding regions of the genome, our understanding of the biological function of non-coding regions (e.g., repetitive DNA, transposable elements) remains somewhat limited, and therefore, all those enigmatic and poorly studied regions of the genome that were once identified as junk are instead best viewed as genomic “dark matter.”

Tuesday, March 27, 2018

Sunday, March 18, 2018

What is "dark DNA"?

Some DNA sequencing technologies aren't very good at sequencing and assembling DNA that's rich in GC base pairs. What this means is that some sequenced genomes could be missing stretches of GC-rich DNA if they rely exclusively on those techniques. This difficult-to-sequence DNA was called "dark DNA" in a paper published last summer (July 2017).

The paper looked at some missing genes in the genome of the sand rat Psammomys obesus. The authors initially used a standard shotgun strategy in order to sequence the sand rat genome. They combined millions of short reads (<200 bp) to assemble a complete genome. A large block of genes seemed to be missing, genes that were conserved and present in the genomes of related species (Hargreaves et al., 2017). They knew the genes were present because they could detect the mRNAs corresponding to those genes.

Tuesday, March 13, 2018

Making Sense of Genes by Kostas Kampourakis

Kostas Kampourakis is a specialist in science education at the University of Geneva, Geneva (Switzerland). Most of his book is an argument against genetic determinism in the style of Richard Lewontin. You should read this book if you are interested in that argument. The best way to describe the main thesis is to quote from the last chapter.

Here is the take-home message of this book: Genes were initially conceived as immaterial factors with heuristic values for research, but along the way they acquired a parallel identity as DNA segments. The two identities never converged completely, and therefore the best we can do so far is to think of genes as DNA segments that encode functional products. There are neither 'genes for' characters nor 'genes for' diseases. Genes do nothing on their own, but are important resources for our self-regulated organism. If we insist in asking what genes do, we can accept that they are implicated in the development of characters and disease, and that they account for variation in characters in particular populations. Beyond that, we should remember that genes are part of an interactive genome that we have just begun to understand, the study of which has various limitations. Genes are not our essences, they do not determine who we are, and they are not the explanation of who we are and what we do. Therefore we are not the prisoners of any genetic fate. This is what the present book has aimed to explain.

Wednesday, February 28, 2018

Junk DNA and selfish DNA

Selfish DNA is a term that became popular with the publication of a series of papers in Nature in 1980. The authors were referring to viruses and transposons that insert themselves into a genome where they exist solely for the purposes of propagating themselves. These selfish DNA sequences are often thought, incorrectly, to be the same as the Selfish Genes of Richard Dawkins1 [Selfish genes and transposons]. In fact, "selfish genes" refers to the idea that some DNAs enhance fitness and the frequency of these genes will increase in a population through their effect on the vehicle that carries them. It's an adaptationist view of evolution. The selfish DNA of transposons and viruses is quite different. These sequences only propagate themselves—the fitness of the organism is largely irrelevant. These elements do not contribute directly to the adaptive evolution of the species.

Transposons and integrated viruses are subjected to mutation just like the rest of the genome. Deleterious mutations cannot be purged by natural selection because inactivating a transposon has no effect on the fitness of the organism.2 As a result, large genomes are littered with defective transposons and bits and pieces of dead transposons. This is not selfish DNA by any definition. It is junk DNA [What's in Your Genome?].

Sunday, February 18, 2018

Human genome books

I'm trying to read all the recent books on the human genome and anything related. There are a lot of them. Here's a list with some brief comments. You should buy some of these books. There are others you should not buy under any circumstances.

Friday, February 09, 2018

Are splice variants functional or noise?

This is a post about alternative splicing. I've avoided using that term in the title because it's very misleading. Alternative splicing produces a number of different products (RNA or protein) from a single intron-containing gene. The phenomenon has been known for 35 years and there are quite a few very well-studied examples, including several where all of the splice regulatory factors have been characterized.

Wednesday, February 07, 2018

The Salzburg sixty discuss a new paradigm in genetic variation

Sixty evolutionary biologists are going to meet next July in Salzburg (Austria) to discuss "a new paradigmatic understanding of genetic novelty" [Evolution – Genetic Novelty/Genomic Variations by RNA Networks and Viruses]. You probably didn't know that a new paradigm is necessary. That's because you didn't know that the old paradigm of random mutations can't explain genetic diversity. (Not!) Here's how the symposium organizers explain it on their website ...

Tuesday, February 06, 2018

How many lncRNAs are functional?

There's solid evidence that 90% of your genome is junk. Most of it is transcribed at some time but the transcripts are transient and usually confined to the nucleus. They are junk RNA [Functional RNAs?]. This is the view held by many experts but you wouldn't know that from reading the scientific literature and the popular press. The opposition to junk DNA gets much more attention in both venues.

There are prominent voices expressing the view that most of the genome is devoted to producing functional RNAs required for regulating gene expression [John Mattick still claims that most lncRNAs are functional]. Most of these RNAs are long noncoding RNAs known as lncRNAs. Although most of them fail all reasonable criteria for function there are still those who maintain that tens of thousands of them are functional [How many lncRNAs are functional: can sequence comparisons tell us the answer?].

Monday, February 05, 2018

ENCODE's false claims about the number of regulatory sites per gene

Some beating of dead horses may be ethical, where here and there they display unexpected twitches that look like life.

Zuckerkandl and Pauling (1965)

I realize that most of you are tired of seeing criticisms of ENCODE but it's important to realize that most scientists fell hook-line-and-sinker for the ENCODE publicity campaign and they still don't know that most of the claims were ridiculous.

I was reminded of this when I re-read Brendan Maher's summary of the ENCODE results that were published in Nature on Sept. 6, 2012 (Maher, 2012). Maher's article appeared in the front section of the ENCODE issue.1 With respect to regulatory sequences he said ...
The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 ‘promoter’ regions — the sites, just upstream of genes, where proteins bind to control gene expression — and nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes ... But the job is far from done, says [Ewan] Birney, a computational biologist at the European Molecular Biology Laboratory’s European Bioinformatics Institute in Hinxton, UK, who coordinated the data analysis for ENCODE. He says that some of the mapping efforts are about halfway to completion, and that deeper characterization of everything the genome is doing is probably only 10% finished.

Saturday, February 03, 2018

What's in Your Genome?: Chapter 5: Regulation and Control of Gene Expression

I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?]. Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs [What's in Your Genome? Chapter 4: Pervasive Transcription].

Chapter 5 is Regulation and Control of Gene Expression.
Chapter 5: Regulation and Control of Gene Expression

What do we know about regulatory sequences?
The fundamental principles of regulation were worked out in the 1960s and 1970s by studying bacteria and bacteriophage. The initiation of transcription is controlled by activators and repressors that bind to DNA near the 5′ end of a gene. These transcription factors recognize relatively short sequences of DNA (6-10 bp) and their interactions have been well-characterized. Transcriptional regulation in eukaryotes is more complicated for two reasons. First, there are usually more transcription factors and more binding sites per gene. Second, access to binding sites depends on the state of chromatin. Nucleosomes forming higher-order structures create a "closed" domain where DNA binding sites are not accessible. In "open" domains the DNA is more accessible and transcription factors can bind. The transition between open and closed domains is an important addition to regulating gene expression in eukaryotes.
The limitations of genomics
By their very nature, genomics studies look at the big picture. Such studies can tell us a lot about how many transcription factors bind to DNA and how much of the genome is transcribed. They cannot tell you whether the data actually reflects function. For that, you have to take a more reductionist approach and dissect the roles of individual factors on individual genes. But working on single genes can be misleading ... you may miss the forest for the trees. Genomic studies have the opposite problem: they may see a forest where there are no trees.
Regulation and evolution
Much of what we see in evolution, especially when it comes to phenotypic differences between species, is due to differences in the regulation of shared genes. The idea dates back to the 1930s and the mechanisms were worked out mostly in the 1980s. It's the reason why all complex animals should have roughly the same number of genes—a prediction that was confirmed by sequencing the human genome. This is the field known as evo-devo or evolutionary developmental biology.
           Box 5-1: Can complex evolution evolve by accident?
Slightly harmful mutations can become fixed in a small population. This may cause a gene to be transcribed less frequently. Subsequent mutations that restore transcription may involve the binding of an additional factor to enhance transcription initiation. The result is more complex regulation that wasn't directly selected.
Open and closed chromatin domains
Gene expression in eukaryotes is regulated, in part, by changing the structure of chromatin. Genes in domains where nucleosomes are densely packed into compact structures are essentially invisible. Genes in more open domains are easily transcribed. In some species, the shift between open and closed domains is associated with methylation of DNA and modifications of histones but it's not clear whether these associations cause the shift or are merely a consequence of the shift.
           Box 5-2: X-chromosome inactivation
In females, one of the X-chromosomes is preferentially converted to a heterochromatic state where most of the genes are in closed domains. Consequently, many of the genes on the X chromosome are only expressed from one copy as is the case in males. The partial inactivation of an X-chromosome is mediated by a small regulatory RNA molecule and this inactivated state is passed on to all subsequent descendants of the original cell.
           Box 5-3: Regulating gene expression by rearranging the genome

In several cases, the regulation of gene expression is controlled by rearranging the genome to bring a gene under the control of a new promoter region. Such rearrangements also explain some developmental anomalies such as the growth of legs instead of antennae on the heads of fruit flies. They also account for many cancers.
ENCODE does it again
Genomic studies carried out by the ENCODE Consortium reported that a large percentage of the human genome is devoted to regulation. What the studies actually showed is that there are a large number of binding sites for transcription factors. ENCODE did not present good evidence that these sites were functional.
Does regulation explain junk?
The presence of huge numbers of spurious DNA binding sites is perfectly consistent with the view that 90% of our genome is junk. The idea that a large percentage of our genome is devoted to transcriptional regulation is inconsistent with everything we know from the studies of individual genes.
           Box 5-4: A thought experiment
Ford Doolittle asks us to imagine the following thought experiment. Take the fugu genome, which is very much smaller than the human genome, and the lungfish genome, which is very much larger, and subject them to the same ENCODE analysis that was performed on the human genome. All three genomes have approximately the same number of genes and most of those genes are homologous. Will the number of transcription factor binding sites be similar in all three species or will the number correlate with the size of the genomes and the amount of junk DNA?
Small RNAs—a revolutionary discovery?
Does the human genome contain hundreds of thousands of genes for small non-coding RNAs that are required for the complex regulation of the protein-coding genes?
A “theory” that just won’t die
"... we have refuted the specific claims that most of the observed transcription across the human genome is random and put forward the case over many years that the appearance of a vast layer of RNA-based epigenetic regulation was a necessary prerequisite to the emergence of developmentally and cognitively advanced organisms." (Mattick and Dinger, 2013)
What the heck is epigenetics?
Epigenetics is a confusing term. It refers loosely to the regulation of gene expression by factors other than differences in the DNA. It's generally assumed to cover things like methylation of DNA and modification of histones. Both of these effects can be passed on from one cell to the next following mitosis. That fact has been known for decades. It is not controversial. The controversy is about whether the heritability of epigenetic features plays a significant role in evolution.
           Box 5-5: The Weismann barrier
The Weismann barrier refers to the separation between somatic cells and the germ line in complex multicellular organisms. The "barrier" is the idea that changes (e.g. methylation, histone modification) that occur in somatic cells can be passed on to other somatic cells but in order to affect evolution those changes have to be transferred to the germ line. That's unlikely. It means that Lamarckian evolution is highly improbable in such species.
How should science journalists cover this story?
The question is whether a large part of the human genome is devoted to regulation thus accounting for an unexpectedly large genome. It's an explanation that attempts to refute the evidence for junk DNA. The issue is complex and very few science journalists are sufficiently informed to do it justice. They should, however, be making more of an effort to inform themselves about the controversial nature of the claims made by some scientists and they should be telling their readers that the issue has not yet been resolved.


Friday, November 17, 2017

Calculating time of divergence using genome sequences and mutation rates (humans vs other apes)

There are several ways to report a mutation rate. You can state it as the number of mutations per base pair per year in which case a typical mutation rate for humans is about 5 × 10⁻¹⁰. Or you can express it as the number of mutations per base pair per generation (~1.5 × 10⁻⁸).

You can use the number of mutations per generation or per year if you are only discussing one species. In humans, for example, you can describe the mutation rate as 100 mutations per generation and just assume that everyone knows the number of base pairs (6.4 × 10⁹).
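
Here's a quick sanity check on how those three ways of stating the rate fit together, as a minimal sketch. The generation time isn't stated above; it's simply what the two per-base-pair rates imply when you divide one by the other. The function names are my own and only there for illustration.

```python
# Values quoted in the post.
MU_PER_BP_PER_YEAR = 5e-10          # mutations per base pair per year
MU_PER_BP_PER_GENERATION = 1.5e-8   # mutations per base pair per generation
DIPLOID_GENOME_BP = 6.4e9           # base pairs in a diploid human genome


def implied_generation_time(per_generation: float, per_year: float) -> float:
    """Years per generation implied by the two per-base-pair rates."""
    return per_generation / per_year


def mutations_per_generation(per_bp_per_generation: float, genome_bp: float) -> float:
    """Expected number of new mutations per genome per generation."""
    return per_bp_per_generation * genome_bp


generation_years = implied_generation_time(MU_PER_BP_PER_GENERATION, MU_PER_BP_PER_YEAR)
new_mutations = mutations_per_generation(MU_PER_BP_PER_GENERATION, DIPLOID_GENOME_BP)

print(f"implied generation time: about {generation_years:.0f} years")
print(f"new mutations per generation: about {new_mutations:.0f}")
```

The two rates imply a generation time of about 30 years, and the per-generation rate multiplied by the diploid genome size gives about 96 new mutations, which is where the round figure of 100 mutations per generation comes from.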

Wednesday, November 08, 2017

How much mitochondrial DNA in your genome?

Most mitochondrial genes have been transferred from the ancestral mitochondrial genome to the nuclear genome over the course of 1-2 billion years of evolution. They are no longer present in mitochondria but they are easily recognized because they resemble α-proteobacterial sequences more than the other nuclear genes [see Endosymbiotic Theory].

This process of incorporating mitochondrial DNA into the nuclear genome continues to this day. The latest human reference genome has about 600 examples of nuclear sequences of mitochondrial origin (= numts). Some of them are quite recent while others date back almost 70 million years—the limit of resolution for junk DNA [see Mitochondria are invading your genome!].