Wednesday, March 17, 2021

I think I'll skip this meeting

I just received an invitation to a meeting ...

On behalf of the international organizing committee, we would like to invite you to a conference to be held in Neve Ilan, near Jerusalem, from 4-8 October 2021, entitled ‘Potential and Limitations of Evolutionary Processes’. The main goal of this interdisciplinary, international conference is to bring together scientists and scholars who hold a range of viewpoints on the potential and possible limitations of various undirected chemical and biological processes.

The conference will include presentations from a broad spectrum of disciplines, including chemistry, biochemistry, biology, origin of life, evolution, mathematics, cosmology and philosophy. Open-floor discussion will be geared towards delineating mechanistic details, with a view to discussing in such a way that speakers and participants feel comfortable expressing different opinions and different interpretations of the data, in the spirit of genuine academic inquiry.

I'm pretty sure I got this invite because I attended the Royal Society Meeting on New trends in evolutionary biology: biological, philosophical and social science perspectives back in 2016. That meeting was a big disappointment because the proponents of extending the modern synthesis didn't have much of a case [Kevin Laland's new view of evolution].

I was curious to see what kind of followup the organizers of this new meeting were planning so I checked out the website at: Potential and Limitations of Evolutionary Processes. Warning bells went off immediately when I saw the list of topics.

  • Fine-Tuning of the Universe
  • The Origin of Life
  • Origin & Fine-Tuning of the Genetic Code
  • Origin of Novel Genes
  • Origin of Functional Islands in Protein Sequence Space
  • Origin of Multi-Component Molecular Machines
  • Fine-Tuning of Molecular Systems
  • Fine-Tuning in Complex Biological Systems
  • Evolutionary Waiting Times
  • History of Life & Comparative Genomics

This is a creationist meeting. A little checking shows that three of the four organizers, Russ Carlson, Anthony Futerman, and Siegfried Scherer, are creationists. (I don't know about the other organizer, Joel Sussman, but in this case guilt by association seems appropriate.)

I don't think I'll book a flight to Israel.


Happy St. Patrick's Day!

Happy St. Patrick's Day! These are my great-grandparents Thomas Keys Foster, born in County Tyrone on September 5, 1852 and Eliza Ann Job, born in Fintona, County Tyrone on August 18, 1852. Thomas came to Canada in 1876 to join his older brother, George, on his farm near London, Ontario, Canada. Eliza came the following year and worked on the same farm. Thomas and Eliza decided to move out west where they got married in 1882 in Winnipeg, Manitoba, Canada.

The couple obtained a land grant near Saltcoats, Saskatchewan, a few miles south of Yorkton, where they built a sod house and later on a wood frame house that they named "Fairview" after a hill in Ireland overlooking the house where Eliza was born. That's where my grandmother, Ella, was born.

Other ancestors in this line came from the adjacent counties of Donegal (surname Foster) and Fermanagh (surnames Keys, Emerson, Moore) and possibly Londonderry (surname Job).

One of the cool things about studying your genealogy is that you can find connections to almost everyone. This means you can celebrate dozens of special days. In my case it was easy to find other ancestors from England, Scotland, Netherlands, Germany, France, Spain, Poland, Lithuania, Belgium, Ukraine, Russia, and the United States. Today, we will be celebrating St. Patrick's Day. It's rather hectic keeping up with all the national holidays but somebody has to keep the traditions alive!

It's nice to have an excuse to celebrate, especially when it means you can drink beer. However, I would be remiss if I didn't mention one little (tiny, actually) problem. Since my maternal grandmother is pure Irish, I should be 25% Irish but my DNA results indicate that I'm only 4% Irish. That's probably because my Irish ancestors were Anglicans and were undoubtedly the descendants of settlers from England, Wales, and Scotland who moved to Ireland in the 1600s. This explains why they don't have very Irish-sounding names.

I don't mention this when I'm in an Irish pub.


Monday, March 15, 2021

Is science the only way of knowing?

Most of us learned that science provides good answers to all sorts of questions ranging from whether a certain drug is useful in treating COVID-19 to whether humans evolved from primitive apes. A more interesting question is whether there are any limitations to science or whether there are any other effective ways of knowing. The question is related to the charge of "scientism," which is often used as a pejorative term to describe those of us who think that science is the only way of knowing.

I've discussed these issues many times on this blog so I won't rehash all the arguments. Suffice it to say that there are two definitions of science: the broad definition and the narrow one. The narrow definition says that science is merely the activity carried out by geologists, chemists, physicists, and biologists. Using this definition it would be silly to say that science is the only way of knowing. The broad definition can be roughly described as: science is a way of knowing that relies on evidence, logic (rationality), and healthy skepticism.

The broad definition is the one preferred by many philosophers and it goes something like this ...

Unfortunately neither "science" nor any other established term in the English language covers all the disciplines that are parts of this community of knowledge disciplines. For lack of a better term, I will call them "science(s) in the broad sense." (The German word "Wissenschaft," the closest translation of "science" into that language, has this wider meaning; that is, it includes all the academic specialties, including the humanities. So does the Latin "scientia.") Science in a broad sense seeks knowledge about nature (natural science), about ourselves (psychology and medicine), about our societies (social science and history), about our physical constructions (technological science), and about our thought construction (linguistics, literary studies, mathematics, and philosophy). (Philosophy, of course, is a science in this broad sense of the word.)

Sven Ove Hansson, "Defining Pseudoscience and Science" in Philosophy of Pseudoscience: Reconsidering the Demarcation Problem.

Friday, March 12, 2021

The bad news from Ghent

A group of scientists, mostly from the University of Ghent (Belgium), have posted a paper on bioRxiv.

Lorenzi, L., Chiu, H.-S., Cobos, F.A., Gross, S., Volders, P.-J., Cannoodt, R., Nuytens, J., Vanderheyden, K., Anckaert, J. and Lefever, S. et al. (2019) The RNA Atlas, a single nucleotide resolution map of the human transcriptome. bioRxiv:807529. [doi: 10.1101/807529]

The human transcriptome consists of various RNA biotypes including multiple types of non-coding RNAs (ncRNAs). Current ncRNA compendia remain incomplete partially because they are almost exclusively derived from the interrogation of small- and polyadenylated RNAs. Here, we present a more comprehensive atlas of the human transcriptome that is derived from matching polyA-, total-, and small-RNA profiles of a heterogeneous collection of nearly 300 human tissues and cell lines. We report on thousands of novel RNA species across all major RNA biotypes, including a hitherto poorly-cataloged class of non-polyadenylated single-exon long non-coding RNAs. In addition, we exploit intron abundance estimates from total RNA-sequencing to test and verify functional regulation by novel non-coding RNAs. Our study represents a substantial expansion of the current catalogue of human ncRNAs and their regulatory interactions. All data, analyses, and results are available in the R2 web portal and serve as a basis to further explore RNA biology and function.

They spent a great deal of effort identifying RNAs from 300 human samples in order to construct an extensive catalogue of five kinds of transcripts: mRNAs, lncRNAs, antisense RNAs, miRNAs, and circular RNAs. The paper goes off the rails in the first paragraph of the Results section, where they immediately equate transcripts with genes. They report the following:

  • 19,107 mRNA genes (188 novel)
  • 18,387 lncRNA genes (13,175 novel)
  • 7,309 asRNA genes (2,519 novel)
  • 5,427 miRNAs
  • 5,427 circRNAs
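
To see how much of this catalogue rests on newly annotated "genes," here is a minimal sketch that tallies the three categories for which novel counts are listed. The numbers are copied from the list above; the dictionary layout and variable names are only illustrative and are not taken from the paper.

```python
# Counts reported in the list above: (total genes, novel genes).
# The dictionary layout and names are illustrative, not from the paper.
reported = {
    "mRNA": (19_107, 188),
    "lncRNA": (18_387, 13_175),
    "asRNA": (7_309, 2_519),
}

total_genes = sum(total for total, _ in reported.values())
total_novel = sum(novel for _, novel in reported.values())

# Share of each category that is newly annotated.
for biotype, (total, novel) in reported.items():
    print(f"{biotype}: {novel / total:.1%} of reported genes are novel")

print(f"all three: {total_genes:,} genes, {total_novel:,} novel")
```

Most of the increase comes from the lncRNA and asRNA categories, which is exactly where equating a transcript with a gene matters most.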

Is science a social construct?

Richard Dawkins has written an essay for The Spectator in which he says,

"[Science is not] a social construct. It’s simply true. Or at least truth is real and science is the best way we have of finding it. ‘Alternative ways of knowing’ may be consoling, they may be sincere, they may be quaint, they may have a poetic or mythic beauty, but the one thing they are not is true. As well as being real, moreover, science has a crystalline, poetic beauty of its own."

The essay is not particularly provocative but it did provoke Jerry Coyne, who pointed out that "The profession of science" can be construed as a social construct. In this sense Jerry is agreeing with his former supervisor, Richard Lewontin,¹ who wrote,

"Science is a social institution about which there is a great deal of misunderstanding, even among those who are part of it. We think that science is an institution, a set of methods, a set of people, a great body of knowledge that we call scientific, is somehow apart from the forces that rule our everyday lives and that govern the structure of our society... The problems that science deals with, the ideas that it uses in investigating those problems, even the so-called scientific results that come out of scientific investigation, are all deeply influenced by predispositions that derive from the society in which we live. Scientists do not begin life as scientists after all, but as social beings immersed in a family, a state, a productive structure, and they view nature through a lens that has been molded by their social structure."

Coincidentally, I just happened to be reading Science Fictions, an excellent book by Stuart Ritchie, who also believes that science is a social construct but has a slightly different take on the matter.

"Science has cured diseases, mapped the brain, forecasted the climate, and split the atom; it's the best method we have of figuring out how the universe works and of bending it to our will. It is, in other words, our best way of moving towards the truth. Of course, we might never get there—a glance at history shows us how hubristic it is to claim any facts as absolute or unchanging. For ratcheting our way towards better knowledge about the world, though, the methods of science are as good as it gets.

But we can't make progress with those methods alone. It's not enough to make a solitary observation in your lab; you must also convince other scientists that you've discovered something real. This is where the social part comes in. Philosophers have long discussed how important it is for scientists to show their fellow researchers how they came to their conclusions."

Dawkins, Coyne, Lewontin, and Ritchie are all right in different ways. Dawkins is talking about science as a way of knowing, although he restricts his definition of science to the natural sciences. The others are referring to the practice of science, or as Jerry Coyne puts it, the profession. It's true that the methods of science are the best way we have to get at the truth and it's true that the way of knowing is not a social construct in any meaningful sense.

Jerry Coyne is right to point out that the methods are employed by human scientists (he's also restricting the practice of science to scientists) and humans are fallible. In that sense, the enterprise of (natural) science is a social construct. Lewontin warns us that scientists have biases and prejudices and that may affect how they do science.

Ritchie makes a different point by emphasizing that (natural) science is a collective endeavor and that "truth" often requires a consensus. That's the sense in which science is social. This is supposed to make science more robust, according to Ritchie, because real knowledge only emerges after careful and skeptical scrutiny by other scientists. His book is mostly about how that process isn't working and why science is in big trouble. He's right about that.

I think it's important to distinguish between science as a way of knowing and the behavior and practice of scientists. The second one is affected by society and its flaws are well-known but the value of science as a way of knowing can't be so easily dismissed.


1. The book is actually a series of lectures (The Massey Lectures) that Lewontin gave in Toronto (Ontario, Canada) in 1990. I attended those lectures.

Tuesday, February 16, 2021

The 20th anniversary of the human genome sequence:
6. Nature doubles down on ENCODE results

Nature has now published a series of articles celebrating the 20th anniversary of the publication of the draft sequences of the human genome [Genome revolution]. Two of the articles are about free access to information and, unlike a similar article in Science, the Nature editors aren't shy about mentioning an important event from 2001; namely, the fact that Science wasn't committed to open access.

By publishing the Human Genome Project’s first paper, we worked with a publicly funded initiative that was committed to data sharing. But the journal acknowledged there would be challenges to maintaining the free, open flow of information, and that the research community might need to make compromises to these principles, for example when the data came from private companies. Indeed, in 2001, colleagues at Science negotiated publishing the draft genome generated by Celera Corporation in Rockville, Maryland. The research paper was immediately free to access, but there were some restrictions on access to the full data.

Friday, February 12, 2021

The 20th anniversary of the human genome sequence:
5. 90% of our genome is junk

This is the fifth (and last) post in celebration of the 20th anniversary of publishing the draft sequence. The first four posts dealt with: (1) the way Science chose to commemorate the occasion [Access to the data]; (2) finishing the sequence; (3) the number of genes; and (4) the amount of functional DNA in the genome.

Back in 2001, knowledgeable scientists knew that most of the human genome is junk and the sequence confirmed that knowledge. Subsequent work on the human genome over the past 20 years has provided additional evidence of junk DNA so that we can now be confident that something like 90% of our genome is junk DNA. Here's a list of data and arguments that support that claim.

Wednesday, February 10, 2021

The 20th anniversary of the human genome sequence:
4. Functional DNA in our genome

We know a lot more about the human genome than we did when the draft sequences were published 20 years ago. One of the most important discoveries is the recognition and extent of true functional sequences in the genome. Genes are one example of such functional sequence but only a minor component (about 1.4%). Most of the functional regions of the genome are not genes.

Here's a list of functional DNA in our genome other than the functional part of genes.

  • Centromeres: There are 24 different centromeres and the average size is four million base pairs. Most of this is repetitive DNA and it adds up to about 3% of the genome. The total amount of centromeric DNA ranges from 2%-10% in different individuals. It's unlikely that all of the centromeric DNA is essential; about 1% seems to be a good estimate.
  • Telomeres: Telomeres are repetitive DNA sequences at the ends of chromosomes. They are required for the proper replication of DNA and they take up about 0.1% of the genome sequence.
  • Origins of replication: DNA replication begins at origins of replication. The size of each origin has not been established with certainty but it's safe to assume that 100 bp is a good estimate. There are about 100,000 origin sequences but it's unlikely that all of them are functional or necessary. It's reasonable to assume that only 30,000 - 50,000 are real origins and that means 0.3% of the genome is devoted to origins of replication.
  • Regulatory sequences: The transcription of every gene is controlled by sequences that lie outside of the genes, usually at the 5′ end. The total amount of regulatory sequence is controversial but it seems reasonable to assume about 200 bp per gene for a total of five million bp or less than 0.2% of the genome (0.16%). The most extreme claim is about 2,400 bp per gene or 1.8% of the genome.
  • Scaffold attachment regions (SARs): Human chromatin is organized into about 100,000 large loops. The base of each loop consists of particular proteins bound to specific sequences called anchor loop sequences. The nomenclature is confusing; the original term (SAR) isn't as popular today as it was 35 years ago but that doesn't change the fact that about 0.3% of the genome is required to organize chromatin.
  • Transposons: Most of the transposon-related sequences in our genome are just fragments of defective transposons but there are a few active ones. They account for only a tiny fraction of the genome.
  • Viruses: Functional virus DNA sequences account for less than 0.1% of the genome.

If you add up all the functional DNA from this list, you get to somewhere between 2% and 3% of the genome.
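
The sum can be checked with a short sketch. The percentages are the conservative figures stated in the list above; where the text gives only an upper bound, the bound is used, and transposons ("a tiny fraction") are left out. The 3.2 Gb genome size used for the regulatory check is an assumed round figure, and the names are illustrative.

```python
# Percent of the genome for each category, as stated in the list above.
# Upper bounds ("less than 0.1%") are used as-is; transposons are
# described only as "a tiny fraction" and are left out of the sum.
functional_percent = {
    "centromeres": 1.0,
    "telomeres": 0.1,
    "origins of replication": 0.3,
    "regulatory sequences": 0.16,
    "scaffold attachment regions": 0.3,
    "viruses": 0.1,
}

total = sum(functional_percent.values())
print(f"non-gene functional DNA: about {total:.2f}% of the genome")

# Check on the regulatory figure: five million bp out of the genome.
GENOME_BP = 3_200_000_000  # assumed round figure of 3.2 Gb
regulatory_bp = 5_000_000
regulatory_share = regulatory_bp / GENOME_BP
print(f"regulatory sequences: {regulatory_share:.2%}")
```

With these conservative values the total lands at roughly 2%, the low end of the range; the more generous estimates mentioned in the list would push it higher.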


Image credit: Wikipedia.

Monday, February 08, 2021

The 20th anniversary of the human genome sequence: 3. How many genes?

This week marks the 20th anniversary of the publication of the first drafts of the human genome sequence. Science chose to celebrate the achievement with a series of articles that had little to say about the scientific discoveries arising out of the sequencing project; one of the articles praised the openness of sequence data without mentioning that the journal had violated its own policy on openness by publishing the Celera sequence [The 20th anniversary of the human genome sequence: 1. Access to the data and the complicity of Science].

I've decided to post a few articles about the human genome beginning with one on finishing the sequence. In this post I'll summarize the latest data on the number of genes in the human genome.

Saturday, February 06, 2021

The 20th anniversary of the human genome sequence:
2. Finishing the sequence

It's been 20 years since the first drafts of the human genome sequence were published. These first drafts from the International Human Genome Project (IHGP) and Celera were far from complete. The IHGP sequence covered about 82% of the genome and it contained about 250,000 gaps and millions of sequencing errors.

Celera never published an updated sequence but IHGP published a "finished" sequence in October 2004. It covered about 92% of the genome and had "only" 300 gaps. The error rate of the finished sequence was down to 10⁻⁵.

International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature 431:931-945. doi: 10.1038/nature03001

We've known for many decades that the correct size of the human genome is close to 3,200,000 kb or 3.2 Gb. There isn't a more precise number because different individuals have different amounts of DNA. The best average estimate was 3,286 Mb based on the sequence of 22 autosomes, one X chromosome, and one Y chromosome (Morton 1991). The amount of actual nucleotide sequence in the latest version of the reference genome (GRCh38.p13) is 3,110,748,599 bp and the estimated total size is 3,272,116,950 bp based on estimating the size of the remaining gaps. This means that 95% of the genome has been sequenced. [see How much of the human genome has been sequenced? for a discussion of what's missing.]
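
The 95% figure follows directly from the two GRCh38.p13 numbers quoted above. A minimal sketch of the arithmetic, with illustrative variable names:

```python
# Figures quoted above for the reference genome GRCh38.p13.
sequenced_bp = 3_110_748_599        # actual nucleotide sequence
estimated_total_bp = 3_272_116_950  # sequence plus estimated gap sizes

fraction_sequenced = sequenced_bp / estimated_total_bp
remaining_bp = estimated_total_bp - sequenced_bp

print(f"sequenced: {fraction_sequenced:.1%}")
print(f"estimated gaps: {remaining_bp:,} bp")
```

The remainder, a bit over 160 million bp, is the estimated size of the gaps that were still unsequenced.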

Recent advances in sequencing technology have produced sequence data covering the repetitive regions in the gaps and the first complete sequence of a human chromosome (X) was published in 2019 [First complete sequence of a human chromosome]. It's now possible to complete the human genome reference sequence by sequencing at least one individual but I'm not sure that the effort and the expense are worth it.


Image credit: the figure is from Miga et al. (2019).

Miga, K.H., Koren, S., Rhie, A., Vollger, M.R., Gershman, A., Bzikadze, A., Brooks, S., Howe, E., Porubsky, D., Logsdon, G.A. et al. (2019) Telomere-to-telomere assembly of a complete human X chromosome. Nature 585:79-84. [doi: 10.1038/s41586-020-2547-7]

Morton, N.E. (1991) Parameters of the human genome. Proceedings of the National Academy of Sciences 88:7474-7476. [doi: 10.1073/pnas.88.17.7474]

The 20th anniversary of the human genome sequence: 1. Access to the data and the complicity of Science

The first drafts of the human genome sequence were published 20 years ago. The paper from the International Human Genome Project (IHGP) was published in Nature on February 15, 2001 and the paper from Celera was published in Science on February 16, 2001.

The original agreement was to publish both papers in Science but IHGP refused to publish their sequence in that journal when it chose to violate its own policy by allowing Celera to restrict access to its data. I highly recommend James Shreeve's book The Genome War for the history behind these publications. It paints an accurate, but not pretty, picture of science and politics.

Lander, E., Linton, L., Birren, B., Nusbaum, C., Zody, M., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, K., Heaford, A., Howland, J., Kann, L., Lehoczky, J., LeVine, R., McEwan, P., McKernan, K., Meldrim, J., Mesirov, J., Miranda, C., Morris, W., Naylor, J., Raymond, C., Rosetti, M., Santos, R., Sheridan, A. and Sougnez, C. (2001) Initial sequencing and analysis of the human genome. Nature 409:860-921. doi: 10.1038/35057062

Venter, J., Adams, M., Myers, E., Li, P., Mural, R., Sutton, G., Smith, H., Yandell, M., Evans, C., Holt, R., Gocayne, J., Amanatides, P., Ballew, R., Huson, D., Wortman, J., Zhang, Q., Kodira, C., Zheng, X., Chen, L., Skupski, M., Subramanian, G., Thomas, P., Zhang, J., Gabor Miklos, G., Nelson, C., Broder, S., Clark, A., Nadeau, J., McKusick, V. and Zinder, N. (2001) The sequence of the human genome. Science 291:1304 - 1351. doi: 10.1126/science.1058040

Thursday, December 31, 2020

On the importance of controls

When doing an experiment, it's important to keep the number of variables to a minimum and it's important to have scientific controls. There are two types of controls. A negative control covers the possibility that you will get a signal by chance; for example, if you are testing an enzyme to see whether it degrades sugar then the negative control will be a tube with no enzyme. Some of the sugar may degrade spontaneously and you need to know this. A positive control is when you deliberately add something that you know will give a positive result; for example, if you are doing a test to see if your sample contains protein then you want to add an extra sample that contains a known amount of protein to make sure all your reagents are working.

Lots of controls are more complicated than the examples I gave but the principle is important. It's true that some experiments don't appear to need the appropriate controls but that may be an illusion. The controls might still be necessary in order to properly interpret the results but they're not done because they are very difficult. This is often true of genomics experiments.
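
The logic of the two simple controls described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `interpret_assay`, the single `threshold` cutoff, and all of the readings are hypothetical and were made up for the example.

```python
def interpret_assay(sample, negative_control, positive_control, threshold):
    """Interpret one reading using a negative and a positive control.

    All readings are in the same arbitrary units. `threshold` is the
    smallest background-corrected signal counted as a real result.
    """
    # Positive control: a known-good sample must give a clear signal,
    # otherwise the reagents are suspect and nothing can be concluded.
    if positive_control - negative_control < threshold:
        return "invalid: positive control failed"

    # Negative control: subtract the signal that appears without the
    # thing being tested (e.g. spontaneous degradation of the sugar).
    corrected = sample - negative_control
    return "positive" if corrected >= threshold else "negative"


# Hypothetical readings, made up for illustration.
print(interpret_assay(sample=0.82, negative_control=0.05,
                      positive_control=0.90, threshold=0.10))
print(interpret_assay(sample=0.12, negative_control=0.05,
                      positive_control=0.90, threshold=0.10))
print(interpret_assay(sample=0.82, negative_control=0.05,
                      positive_control=0.08, threshold=0.10))
```

The third call shows why the positive control matters: the sample reading looks convincing, but with failed reagents it can't be interpreted at all.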

Saturday, December 19, 2020

What do believers in epigenetics think about junk DNA?

I've been writing some stuff about epigenetics so I've been reading papers on how to define the term [What the heck is epigenetics?]. Turns out there's no universal definition but I discovered that scientists who write about epigenetics are passionate believers in epigenetics no matter how you define it. Surprisingly (not!), there seems to be a correlation between belief in epigenetics and other misconceptions such as the classic misunderstanding of the Central Dogma of Molecular Biology and rejection of junk DNA [The Extraordinary Human Epigenome].

Here's an illustration of this correlation from the introduction to a special issue on epigenetics in Philosophical Transactions B.

Ganesan, A. (2018) Epigenetics: the first 25 centuries, Philosophical Transactions B. 373: 20170067. [doi: 10.1098/rstb.2017.0067]

Epigenetics is a natural progression of genetics as it aims to understand how genes and other heritable elements are regulated in eukaryotic organisms. The history of epigenetics is briefly reviewed, together with the key issues in the field today. This themed issue brings together a diverse collection of interdisciplinary reviews and research articles that showcase the tremendous recent advances in epigenetic chemical biology and translational research into epigenetic drug discovery.

In addition to the misconceptions, the text (see below) emphasizes the heritable nature of epigenetic phenomena. This idea of heritability seems to be a dominant theme among epigenetic believers.

A central dogma became popular in biology that equates life with the sequence DNA → RNA → protein. While the central dogma is fundamentally correct, it is a reductionist statement and clearly there are additional layers of subtlety in ‘how’ it is accomplished. Not surprisingly, the answers have turned out to be far more complex than originally imagined, and we are discovering that the phenotypic diversity of life on Earth is mirrored by an equal diversity of hereditary processes at the molecular level. This lies at the heart of modern day epigenetics, which is classically defined as the study of heritable changes in phenotype that occur without an underlying change in genome sequence. The central dogma's focus on genes obscures the fact that much of the genome does not code for genes and indeed such regions were derogatively lumped together as ‘junk DNA’. In fact, these non-coding regions increase in proportion as we climb up the evolutionary tree and clearly play a critical role in defining what makes us human compared with other species.

At the risk of beating a dead horse, I'd like to point out that the author is wrong about the Central Dogma and wrong about junk DNA. He's right about the heritability of some epigenetic phenomena such as methylation of DNA but that fact has been known for almost five decades and so far it hasn't caused a noticeable paradigm shift, unless I missed it [Restriction, Modification, and Epigenetics].


Saturday, December 05, 2020

Mouse traps Michael Denton

Michael Denton is a New Zealand biochemist, a Senior Fellow at the Discovery Institute, and the author of two Intelligent Design Creationist books: Evolution: A Theory in Crisis (1985) and Nature's Destiny (1998).

He has just read Michael Behe's latest book and he (Denton) is impressed [Praise for Behe’s Latest: “Facts Before Theory”]:

Behe brings out more forcibly than any other author I have recently read just how vacuous and biased are the criticisms of his work and of the ID position in general by so many mainstream academic defenders of Darwinism. And what is so telling about his many wonderfully crafted responses to his Darwinian critics is that it is Behe who is putting the facts before theory while his many detractors — Kenneth Miller, Jerry Coyne, Larry Moran, Richard Lenski, and others — are putting theory before the facts. In short, this volume shows that it is Behe rather than his detractors who is carefully following the evidence.

I don't know what planet Michael Denton is living on—probably the same one as Michael Behe—but let's make one thing clear about facts and evidence. Behe's entire argument is based on the "fact" that he can't see how Darwin's theory of natural selection can account for the evolution of complex features: therefore god(s) must have done it. This is NOT putting facts before theory and it is NOT carefully following the evidence.

It's just a somewhat sophisticated version of god of the gaps based on Behe's lack of understanding of the basic mechanisms of evolution.

(See, Of mice and Michael, where I explain why Michael Behe fails to answer my critique of The Edge of Evolution.)


Tuesday, December 01, 2020

Of mice and Michael

Michael Behe has published a book containing most of his previously published responses to critics. I was anxious to see how he dealt with my criticisms of The Edge of Evolution but I was disappointed to see that, for the most part, he has just copied excerpts from his 2014 blog posts (pp. 335-355).

I think it might be worthwhile to review the main issues so you can see for yourself whether Michael Behe really answered his critics as the title of his most recent book claims. You can check out the dueling blog posts at the end of this summary to see how the discussion evolved in real time more than four years ago.

Many Sandwalk readers participated in the debate back then and some of them are quoted in Behe's book although he usually just identifies them as commentators.

My Summary

Michael Behe has correctly identified an extremely improbable evolutionary event; namely, the development of chloroquine resistance in the malaria parasite. This is an event that is close to the edge of evolution, meaning that more complex events of this type are beyond the edge of evolution and cannot occur naturally. However, several of us have pointed out that his explanation of how that event occurred is incorrect. This is important because he relies on his flawed interpretation of chloroquine resistance to postulate that many observed events in evolution could not possibly have occurred by natural means. Therefore, god(s) must have created them.

In his response to this criticism, he completely misses the point and fails to understand that what is being challenged is his misinterpretation of the mechanisms of evolution and his understanding of mutations.


The main point of The Edge of Evolution is that many of the beneficial features we see could only have evolved by selecting for a number of different mutations where none of the individual mutations confer a benefit by themselves. Behe claims that these mutations had to occur simultaneously or at least close together in time. He argues that this is possible in some cases but in most cases the (relatively) simultaneous occurrence of multiple mutations is beyond the edge of evolution. The only explanation for the creation of these beneficial features is god(s).

Tuesday, November 17, 2020

Using modified nucleotides to make mRNA vaccines

The key features of the mRNA vaccines are the use of modified nucleotides in their synthesis and the use of lipid nanoparticles to deliver them to cells. The main difference between the Pfizer/BioNTech vaccine and the Moderna vaccine is in the delivery system. The lipid vesicles used by Moderna are somewhat more stable and the vaccine doesn't need to be kept constantly at ultra-low temperatures.

Both vaccines use modified RNAs. They synthesize the RNA using modified nucleotides based on variants of uridine and cytidine; namely, pseudouridine, N1-methylpseudouridine, and 5-methylcytidine. (The structures of the nucleosides are from Andries et al., 2015.) The best versions are those that use both 5-methylcytidine and N1-methylpseudouridine.

I'm not an expert on these mRNAs and their delivery systems but the way I understand it is that regular RNA is antigenic—it induces antibodies against it, presumably when it is accidentally released from the lipid vesicles outside of the cell. The modified versions are much less antigenic. As an added bonus, the modified RNA is more stable and more efficiently translated.

Two of the key papers are ...

Andries, O., Mc Cafferty, S., De Smedt, S.C., Weiss, R., Sanders, N.N. and Kitada, T. (2015) "N1-methylpseudouridine-incorporated mRNA outperforms pseudouridine-incorporated mRNA by providing enhanced protein expression and reduced immunogenicity in mammalian cell lines and mice." Journal of Controlled Release 217: 337-344. [doi: 10.1016/j.jconrel.2015.08.051]

Pardi, N., Tuyishime, S., Muramatsu, H., Kariko, K., Mui, B.L., Tam, Y.K., Madden, T.D., Hope, M.J. and Weissman, D. (2015) "Expression kinetics of nucleoside-modified mRNA delivered in lipid nanoparticles to mice by various routes." Journal of Controlled Release 217: 345-351. [doi: 10.1016/j.jconrel.2015.08.007]


Sunday, November 15, 2020

Why is the Central Dogma so hard to understand?

The Central Dogma of molecular biology states ...

... once (sequential) information has passed into protein it cannot get out again (F.H.C. Crick, 1958).

The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential information. It states that such information cannot be transferred from protein to either protein or nucleic acid (F.H.C. Crick, 1970).

This is not difficult to understand since Francis Crick made it very clear in his original 1958 paper and again in his 1970 paper in Nature [see Basic Concepts: The Central Dogma of Molecular Biology]. There's nothing particularly complicated about the Central Dogma. It merely states the obvious fact that sequence information can flow from nucleic acid to protein but not the other way around.
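Crick's statement can be written down as a small lookup table. The sketch below follows the three classes of transfer in Crick's 1970 paper (general, special, and those the Central Dogma says do not occur); the names `GENERAL`, `SPECIAL`, `FORBIDDEN` and `violates_central_dogma` are made up for illustration and don't come from Crick.

```python
# Sequence-information transfers as classified by Crick (1970).
# Each pair is (source, target). All names here are illustrative.
GENERAL = {("DNA", "DNA"), ("DNA", "RNA"), ("RNA", "protein")}
SPECIAL = {("RNA", "RNA"), ("RNA", "DNA"), ("DNA", "protein")}

# The Central Dogma itself: no residue-by-residue transfer out of protein.
FORBIDDEN = {("protein", "protein"), ("protein", "RNA"), ("protein", "DNA")}


def violates_central_dogma(source: str, target: str) -> bool:
    """Return True only for a transfer that starts from protein."""
    return (source, target) in FORBIDDEN


if __name__ == "__main__":
    # Making a tRNA or rRNA is just DNA -> RNA, a general transfer.
    print(violates_central_dogma("DNA", "RNA"))      # False
    # Reverse transcription is a special transfer, not a violation.
    print(violates_central_dogma("RNA", "DNA"))      # False
    # Only information flowing back out of protein would be a violation.
    print(violates_central_dogma("protein", "DNA"))  # True
```

Notice that nothing in the table says a gene has to make a protein. A gene whose final product is RNA stops at DNA to RNA and never touches the forbidden set.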

So, why do so many scientists have trouble grasping this simple idea? Why do they continue to misinterpret the Central Dogma while quoting Crick? It seems obvious that they haven't read the paper(s) they are referencing.

I just came across another example of such ignorance and it is so outrageous that I just can't help sharing it with you. Here are a few sentences from a recent review in the 2020 issue of Annual Review of Genomics and Human Genetics (Zerbino et al., 2020).

Once the role of DNA was proven, genes became physical components. Protein-coding genes could be characterized by the genetic code, which was determined in 1965, and could thus be defined by the open reading frames (ORFs). However, exceptions to Francis Crick's central dogma of genes as blueprints for protein synthesis (Crick, 1958) were already being uncovered: first tRNA and rRNA and then a broad variety of noncoding RNAs.

I can't imagine what the authors were thinking when they wrote this. If the Central Dogma actually said that the only role for genes was to make proteins then surely the discovery of tRNA and rRNA would have refuted the Central Dogma and relegated it to the dustbin of history. So why bother even mentioning it in 2020?


Crick, F.H.C. (1958) On protein synthesis. Symp. Soc. Exp. Biol. XII:138-163. [PDF]

Crick, F. (1970) Central Dogma of Molecular Biology. Nature 227, 561-563. [PDF file]

Zerbino, D.R., Frankish, A. and Flicek, P. (2020) "Progress, Challenges, and Surprises in Annotating the Human Genome." Annual review of genomics and human genetics 21:55-79. [doi: 10.1146/annurev-genom-121119-083418]

Wednesday, November 11, 2020

On the misrepresentation of facts about lncRNAs

I've been complaining for years about how opponents of junk DNA misrepresent and distort the scientific literature. The same complaints apply to the misrepresentation of data on alternative splicing and on the prevalence of noncoding genes. Sometimes the misrepresentation is subtle so you hardly notice it.

I'm going to illustrate subtle misrepresentation by quoting a recent commentary on lncRNAs that's just been published in BioEssays. The main part of the essay deals with ways of determining the function of lncRNAs with an emphasis on the structures of RNA and RNA-protein complexes. The authors don't make any specific claims about the number of functional RNAs in humans but it's clear from the context that they think this number is very large.

Wednesday, October 07, 2020

Undergraduate education in biology: no vision, no change

I was looking at the Vision and Change document the other day and it made me realize that very little has changed in undergraduate education. I really shouldn't be surprised since I reached the same conclusion in 2015—six years after the recommendations were published [Vision and Change] [Why can't we teach properly?].

The main recommendations of Vision and Change are that undergraduate education should adopt the proven methods of student-centered education and should focus on core concepts rather than memorization of facts. Although there has been some progress, it's safe to say that neither of these goals has been achieved in the vast majority of biology classes, including biochemistry and molecular biology classes.

Things are getting even worse in this time of COVID-19 because more and more classes are being taught online and there seems to be general agreement that this is okay. It is not okay. Online didactic lectures go against everything in the Vision and Change document. It may be possible to develop online courses that practice student-centered, concept teaching that emphasizes critical thinking but I've seen very few attempts.

Here are a couple of quotations from Vision and Change that should stimulate your thinking.

Traditionally, introductory biology [and biochemistry] courses have been offered as three lectures a week, with, perhaps, an accompanying two- or three-hour laboratory. This approach relies on lectures and a textbook to convey knowledge to the student and then tests the student's acquisition of that knowledge with midterm and final exams. Although many traditional biology courses include laboratories to provide students with hands-on experiences, too often these "experiences" are not much more than guided exercises in which finding the right answer is stressed while providing students with explicit instructions telling them what to do and when to do it.
"Appreciating the scientific process can be even more important than knowing scientific facts. People often encounter claims that something is scientifically known. If they understand how science generates and assesses evidence bearing on these claims, they possess analytical methods and critical thinking skills that are relevant to a wide variety of facts and concepts and can be used in a wide variety of contexts."

National Science Foundation, Science and Technology Indicators, 2008

If you are a student and this sounds like your courses, then you should demand better. If you are an instructor and this sounds like one of your courses then you should be ashamed; get some vision and change [The Student-Centered Classroom].

Although the definition of student-centered learning may vary from professor to professor, faculty generally agree that student-centered classrooms tend to be interactive, inquiry-driven, cooperative, collaborative, and relevant. Three critical components are consistent throughout the literature, providing guidelines that faculty can apply when developing a course. Student-centered courses and curricula take into account student knowledge and experience at the start of a course and articulate clear learning outcomes in shaping instructional design. Then they provide opportunities for students to examine and discuss their understanding of the concepts presented, offering frequent and varied feedback as part of the learning process. As a result, student-centered science classrooms and assignments typically involve high levels of student-student and student-faculty interaction; connect the course subject matter to topics students find relevant; minimize didactic presentations; reflect diverse views of scientific inquiry, including data presentation, argumentation, and peer review; provide ongoing feedback to both the student and professor about the student's learning progress; and explicitly address learning how to learn.

This is a critical time for science education since science is under attack all over the world. We need to make sure that university students are prepared to deal with scientific claims and counter-claims for the rest of their lives after they leave university. This means that they have to be skilled at critical thinking and that's a skill that can only be taught in a student-centered classroom where students can practice argumentation and learn the importance of evidence. Memorizing the enzymes of the Krebs Cycle will not help them understand climate change or why they should wear a mask in the middle of a pandemic.


Saturday, October 03, 2020

On the importance of random genetic drift in modern evolutionary theory

The latest issue of New Scientist has a number of articles on evolution. All of them are focused on extending and improving the current theory of evolution, which is described as Darwin's version of natural selection [New Scientist doesn't understand modern evolutionary theory].

Most of the criticisms come from a group who want to extend the evolutionary synthesis (EES proponents). Their main goal is to advertise mechanisms that are presumed to enhance adaptation but that weren't explicitly included in the Modern Synthesis that was put together in the late 1940s.

One of the articles addresses random genetic drift [see Survival of the ... luckiest]. The emphasis in this short article is on the effects of drift in small populations and it gives examples of reduced genetic diversity in small populations.

Wednesday, September 30, 2020

New Scientist doesn't understand modern evolutionary theory

New Scientist has devoted much of their September 26th issue to evolution, but not in a good way. Their emphasis is on 13 ways that we must rethink evolution. Readers of this blog are familiar with this theme because New Scientist is talking about the Extended Evolutionary Synthesis (EES)—a series of critiques of the Modern Synthesis in an attempt to overthrow or extend it [The Extended Evolutionary Synthesis - papers from the Royal Society meeting].

My main criticism of EES is that its proponents demonstrate a remarkable lack of understanding of modern evolutionary theory and they direct most of their attacks against the old adaptationist version of the Modern Synthesis that was popular in the 1950s. For the most part, EES proponents missed the revolution in evolutionary theory that occurred in the late 1960s with the development of Neutral Theory, Nearly-Neutral Theory, and the importance of random genetic drift. EES proponents have shown time and time again that they have not bothered to read a modern textbook on population genetics.

Tuesday, September 22, 2020

The Function Wars Part VIII: Selected effect function and de novo genes

Discussions about the meaning of the word "function" have been going on for many decades, especially among philosophers who love that sort of thing. The debate intensified following the ENCODE publicity hype disaster in 2012 where ENCODE researchers used the word function in an entirely inappropriate manner in order to prove that there was no junk in our genome. Since then, a cottage industry based on discussing the meaning of function has grown up in the scientific literature and dozens of papers have been published. This may have enhanced a lot of CVs but none of these papers has proposed a rigorous definition of function that we can rely on to distinguish functional DNA from junk DNA.

The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.
Stephen Jay Gould (1982)

That doesn't mean that all of the papers have been completely useless. The net result has been to focus attention on the one reliable definition of function that most biologists can accept; the selected effect function. The selected effect function is defined as ...

Friday, August 07, 2020

Alan McHughen defends his views on junk DNA

Alan McHughen is the author of a recently published book titled DNA Demystified. I took issue with his stance on junk DNA [More misconceptions about junk DNA - what are we doing wrong?] and he has kindly replied to my email message. Here's what he said ...

Thursday, August 06, 2020

More misconceptions about junk DNA - what are we doing wrong?

I'm actively following the views of most science writers on junk DNA to see if they are keeping up on the latest results. The latest book is DNA Demystified by Alan McHughen, a molecular geneticist at the University of California, Riverside. It's published by Oxford University Press, the same publisher that published John Parrington's book The Deeper Genome. Parrington's book was full of misleading and incorrect statements about the human genome so I was anxious to see if Oxford had upped its game.1, 2

You would think that any book with a title like DNA Demystified would contain the latest interpretations of DNA and genomes, especially with a subtitle like "Unraveling the Double Helix." Unfortunately, the book falls far short of its objectives. I don't have time to discuss all of its shortcomings so let's just skip right to the few paragraphs that discuss junk DNA (p.46). I want to emphasize that this is not the main focus of the book. I'm selecting it because it's what I'm interested in and because I want to get a feel for how correct and accurate scientific information is, or is not, being accepted by practicing scientists. Are we falling for fake news?

Saturday, August 01, 2020

ENCODE 3: A lesson in obfuscation and opaqueness

The Encyclopedia of DNA Elements (ENCODE) is a large-scale, and very expensive, attempt to map all of the functional elements in the human genome.

The preliminary study (ENCODE 1) was published in 2007 and the main publicity campaign surrounding that study focused on the fact that much of the human genome was transcribed. The implication was that most of the genome is functional. [see: The ENCODE publicity campaign of 2007].

The ENCODE 2 results were published in 2012 and the publicity campaign emphasized that up to 80% of our genome is functional. Many stories in the popular press touted the death of junk DNA. [see: What did the ENCODE Consortium say in 2012]

Both of these publicity campaigns, and the published conclusions, were heavily criticized for not understanding the distinction between fortuitous transcription and real genes and for not understanding the difference between fortuitous binding sites and functional binding sites. Hundreds of knowledgeable scientists pointed out that it was ridiculous for ENCODE researchers to claim that most of the human genome is functional based on their data. They also pointed out that ENCODE researchers ignored most of the evidence supporting junk DNA.

ENCODE 3 has just been published and the hype has been toned down considerably. Take a look at the main publicity article just published by Nature (ENCODE 3). The Nature article mentions ENCODE 1 and ENCODE 2 but it conveniently ignores the fact that Nature heavily promoted the demise of junk DNA back in 2007 and 2012. The emphasis now is not on how much of the genome is functional—the main goal of ENCODE—but on how much data has been generated and how many papers have been published. You can read the entire article and not see any mention of previous ENCODE/Nature claims. In fact, they don't even tell you how many genes ENCODE found or how many functional regulatory sites were detected.

The News and Views article isn't any better (Expanded ENCODE delivers invaluable genomic encyclopedia). Here's the opening paragraph of that article ...
Less than 2% of the human genome encodes proteins. A grand challenge for genomic sciences has been mapping the functional elements — the regions that determine the extent to which genes are expressed — in the remaining 98% of our DNA. The Encyclopedia of DNA Elements (ENCODE) project, among other large collaborative efforts, was established in 2003 to create a catalogue of these functional elements and to outline their roles in regulating gene expression. In nine papers in Nature, the ENCODE consortium delivers the third phase of its valuable project.1
You'd think with such an introduction that you would be about to learn how much of the genome is functional according to ENCODE 3 but you will be disappointed. There's nothing in that article about the number of genes, the number of regulatory sites, or the number of other functional elements in the human genome. It's almost as if Nature wants to tell you about all of the work involved in "mapping the functional elements" without ever describing the results and conclusions. This is in marked contrast to the Nature publicity campaigns of 2007 and 2012 where they were more than willing to promote the (incorrect) conclusions.

In 2020 Nature seems to be more interested in obfuscation and opaqueness. One other thing is certain: the Nature editors and writers aren't the least bit interested in discussing their previous claims about 80% of the genome being functional!

I guess we'll have to rely on the ENCODE Consortium itself to give us a summary of their most recent findings. The summary paper has an intriguing title (Perspectives on ENCODE) that almost makes you think they will revisit the exaggerated claims of 2007 and 2012. No such luck. However, we do learn a little bit about the human genome.
  • 20,225 protein-coding genes [almost 1000 more than the best published estimates - LAM]
  • 37,595 noncoding genes [I strongly doubt they have evidence for that many functional genes]
  • 2,157,387 open chromatin regions [what does this mean?]
  • 1,224,154 transcription factor binding sites [how many are functional?]
That's it. The ENCODE Consortium seems to have learned only two things in 2012. They learned that it's better to avoid mentioning how much of the genome is functional in order to avoid controversy and criticism and they learned that it's best to ignore any of their previous claims for the same reason. This is not how science is supposed to work but the ENCODE Consortium has never been good at showing us how science is supposed to work.

Note: I've looked at some of the papers to try and find out if ENCODE stands by its previous claim that most of the genome is functional but they all seem to be written in a way that avoids committing to such a percentage or addressing the criticisms from 2007 and 2012. The only exception is a paper stating that cis-regulatory elements occupy 7.9% of the human genome (Expanded encyclopaedias of DNA elements in the human and mouse genomes). Please let me know if you come across anything interesting in those papers.


1. Isn't it about time to stop dwelling on the fact that 2% (actually less than 1%) of our genome encodes protein? We've known for decades that there are all kinds of other functional regions of the genome. No knowledgeable scientist thinks that the remaining 98% (99%) has no function.

Saturday, July 11, 2020

The coronavirus life cycle

The coronavirus life cycle is depicted in a figure from Fung and Liu (2019). See below for a brief description.
The virus particle attaches to receptors on the cell surface (mostly ACE2 in the case of SARS-CoV-2). It is taken into the cell by endocytosis and then the viral membrane fuses with the host membrane releasing the viral RNA. The viral RNA is translated to produce the 1a and 1ab polyproteins, which are cleaved to produce 16 nonstructural proteins (nsps). Most of the nsps assemble to form the replication-transcription complex (RTC). [see Structure and expression of the SARS-CoV-2 (coronavirus) genome]

RTC transcribes the original (+) strand creating (-) strands that are subsequently copied to make more viral (+) strands. RTC also produces a cluster of nine (-) strand subgenomic RNAs (sgRNAs) that are transcribed to make (+) sgRNAs that serve as mRNAs for the production of the structural proteins. N protein (nucleocapsid) binds to the viral (+) strand RNAs to help form new viral particles. The other structural proteins are synthesized in the endoplasmic reticulum (ER) where they assemble to form the protein-membrane virus particle that engulfs the viral RNA.
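The (+) to (-) to (+) copying described above is just complementary base pairing applied twice. Here's a minimal sketch, assuming perfect copying and using a short made-up sequence rather than real viral RNA; the function name `copy_strand` is invented for the example.

```python
# Complementary RNA strand synthesis, as a toy model of what the RTC does.
# The sequence below is arbitrary and only for illustration.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def copy_strand(rna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return rna.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    plus = "AUGGCUAAC"             # toy (+) strand
    minus = copy_strand(plus)      # (-) strand template
    new_plus = copy_strand(minus)  # copying the (-) strand restores (+)
    print(minus)                   # GUUAGCCAU
    print(new_plus == plus)        # True
```

Copying the (-) strand gives back the original (+) sequence, which is why the (-) strands can serve as templates for both new genomes and the (+) sgRNAs.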

New virus particles are released when the vesicles fuse with the plasma membrane.

The entire life cycle takes about 10-16 hours and about 100 new virus particles are released before the cell commits suicide by apoptosis.


Fung, T.S. and Liu, D.X. (2019) Human coronavirus: host-pathogen interaction. Annual review of microbiology 73:529-557. [doi: 10.1146/annurev-micro-020518-115759]


Thursday, July 09, 2020

Structure and expression of the SARS-CoV-2 (coronavirus) genome


Coronaviruses are RNA viruses, which means that their genome is RNA, not DNA. All of the coronaviruses have similar genomes but I'm sure you are mostly interested in SARS-CoV-2, the virus that causes COVID-19. The first genome sequence of this virus was determined by Chinese scientists in early January and it was immediately posted on a public server [GenBank MN908947]. The viral RNA came from a patient in intensive care at the Wuhan Yin-Tan Hospital (China). The paper was accepted on Jan. 20th and it appeared in the Feb. 3rd issue of Nature (Zhou et al. 2020).

By the time the paper came out, several universities and pharmaceutical companies had already constructed potential therapeutics and several others had already cloned the genes and were preparing to publish the structures of the proteins.1

By now there are dozens and dozens of sequences of SARS-CoV-2 genomes from isolates in every part of the world. They are all very similar because the mutation rate in these RNA viruses is not high (about 10^-6 per nucleotide per replication). The original isolate has a total length of 29,891 nt not counting the poly(A) tail. Note that these RNA viruses are about four times larger than a typical retrovirus; they are the largest known RNA viruses.
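To get a feel for what that mutation rate means, you can multiply it by the genome length. The sketch below uses the two figures quoted above and assumes, purely for simplicity, that mutations are independent and equally likely at every position; the function names are made up for the example.

```python
# Back-of-the-envelope arithmetic with the figures quoted in the post.
MUTATION_RATE = 1e-6    # per nucleotide per replication (approximate)
GENOME_LENGTH = 29_891  # nt, original isolate, not counting the poly(A) tail


def expected_mutations(rate: float, length: int) -> float:
    """Expected number of new mutations in one copy of the genome."""
    return rate * length


def prob_error_free_copy(rate: float, length: int) -> float:
    """Probability that a copy has no new mutations at all.

    Assumes every position mutates independently with the same probability.
    """
    return (1.0 - rate) ** length


if __name__ == "__main__":
    print(expected_mutations(MUTATION_RATE, GENOME_LENGTH))
    print(prob_error_free_copy(MUTATION_RATE, GENOME_LENGTH))
```

Under these assumptions the expected number of new mutations is about 0.03 per genome per replication, so roughly 97% of copies are identical to their template. That's consistent with the isolates being so similar.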

Wednesday, July 08, 2020

Where did your chicken come from?

Scientists have sequenced the genomes of modern domesticated chickens and compared them to the genomes of various wild pheasants in southern Asia. It has been known for some time that chickens resemble a species of pheasant called red jungle fowl and this led Charles Darwin to speculate that chickens were domesticated in India. Others have suggested Southeast Asia or China as the site of domestication.

The latest results show that modern chickens probably descend from a subspecies of red jungle fowl that inhabits the region around Myanmar (Wang et al., 2020). The subspecies is Gallus gallus spadiceus and the domesticated chicken subspecies is Gallus gallus domesticus. As you might expect, the two subspecies can interbreed.

The authors looked at a total of 863 genomes of domestic chickens, four species of jungle fowl, and all five subspecies of red jungle fowl. They identified a total of 33.4 million SNPs, which were enough to genetically distinguish between the various species AND the subspecies of red jungle fowl. (Contrary to popular belief, it is quite possible to assign a given genome to a subspecies (race) based entirely on genetic differences.)

The sequence data suggest that chickens were domesticated from wild G. g. spadiceus about 10,000 years ago in the northern part of Southeast Asia. The data also suggest that modern domesticated chickens (G. g. domesticus) from India, Pakistan, and Bangladesh interbred with another subspecies of red jungle fowl (G. g. murghi) after the original domestication. These chickens from South Asia contain substantial contributions from G. g. murghi ranging from 8-22%.

Next time you serve chicken, if someone asks you where it came from you won't be lying if you say it came from Myanmar.


Image credits: BBQ chicken, Creative Commons License [Chicken BBQ]
Red Jungle Fowl, Creative Commons License [Red_Junglefowl_-Thailand]
Map: Lawler, A. (2020) Dawn of the chicken revealed in Southeast Asia. Science 368:1411.

Wang, M., Thakur, M., Peng, M. et al. (2020) 863 genomes reveal the origin and domestication of chicken. Cell Research [doi: 10.1038/s41422-020-0349-y]

Monday, July 06, 2020

A storm of cytokines

Cytokines are a diverse group of small signal proteins that act like hormones to turn on genes in blood cells and cells of the immune system. In COVID-19 the production of cytokines can be over-stimulated to produce a cytokine storm that activates immune cells producing all kinds of severe, sometimes lethal, effects. There are dozens of different cytokines but they all act in a similar manner. Each one binds to a receptor on the membrane of a target cell and this stimulates the cytoplasmic side of the receptor to activate a transcription factor that enters the nucleus and turns on a specific set of genes. The activation step requires phosphorylation just like dozens of other signalling pathways. (See Morris et al. (2018) for a recent review.)

I was curious about the structures of these cytokines so I looked up a few of them on PDB. Here are three fairly representative structures.



Morris, R., Kershaw, N.J., and Babon, J.J. (2018) The molecular details of cytokine signaling via the JAK/STAT pathway. Protein Science 27: 1984-2009. [doi: 10.1002/pro.3519]

Saturday, June 13, 2020

What's in Your Genome? Chapter 3: Repetitive DNA and Mobile Genetic Elements

By the end of chapter 3, readers will be familiar with two main lines of evidence for junk DNA: the C-Value Paradox, and the fact that most of our genome is full of bits and pieces of dead transposons and viruses. They will also understand that this is perfectly consistent with modern evolutionary theory.

Chapter 3: Repetitive DNA and Mobile Genetic Elements
  • Centromeres
  • Telomeres
  • Mobile genetic elements
  • Hidden viruses in your genome
  • What the heck is a transposon?
  • LINES and SINES
  • How much of our genome is composed of transposon-related sequences?
  • BOX 3-1: What does the humped bladderwort tell us about junk DNA?
  • Selfish genes and selfish DNA
  • Mitochondria are invading your genome!
  • Selection hypotheses
  • Exaptation and the post hoc fallacy
  • Box 3-2: Natural genetic engineering?
  • If it walks like a duck ...


What's in Your Genome? Chapter 2: The Evolution of Sloppy Genomes

I had to completely reorganize chapter 2 in order to move population genetics closer to the beginning of the book and reduce the number of words.

Chapter 2: The Evolution of Sloppy Genomes
  • Fugu sashimi
  • Variation in genome size
  • The Onion Test
  • Instantaneous genome doubling
  • Modern evolutionary theory
  • Random genetic drift
  • Neutral Theory
  • Nearly-Neutral Theory
  • Box 2-1: Are humans still evolving?
  • Population size and the Drift-Barrier Hypothesis
  • Bacteria have small genomes
  • On the evolution of sloppy genomes



What's in Your Genome? Chapter 1: Introducing Genomes

My book is progressing slowly. The main task is to reduce it to about 120,000 words and that's proving to be a lot more difficult than I imagined.

Here's what's now in Chapter 1: Introducing Genomes
  • The genome war
  • Finishing the human genome sequence
  • What is DNA?
  • The double helix
  • The sequence of all the base pairs was the goal of the human genome project
  • How big is your genome?
  • Packaging DNA: chromatin
  • Transcription
  • Translation
  • The genetic code
  • Introns and exons
  • The history of junk DNA



Thursday, June 11, 2020

Dan Graur proposes a new definition of "gene"

I've thought a lot about how to define the word "gene." It's clear that no definition will capture all the possibilities but that doesn't mean we should abandon the term. Traditionally, the biochemical definition attempts to describe the part of the genome that produces a functional product. Most scientists seem to think that the only possible product is a protein so it's common to see the word "gene" defined as a DNA sequence that produces a protein.

But from the very beginning of molecular biology the textbooks also talked about genes for ribosomal RNAs and tRNAs so there was never a time when knowledgeable scientists restricted their definition of a gene to protein-coding regions. My best molecular definition is described in What Is a Gene?.

A gene is a DNA sequence that is transcribed to produce a functional product.

Dan Graur has also thought about the issue and he comes up with a different definition in a recent blog post: What Is a Gene? A Very Short Answer with a Very Long Footnote

A gene is a sequence of genomic material (DNA or RNA) that has a selected effect function.

This is obviously an attempt to equate "function" with "gene" so that all functional parts of the genome are genes, by definition. You might think this is rather silly because it excludes some obvious functional regions but Dan really does want to count them as genes.
Performance of the function may or may not require the gene to be translated or even transcribed.

Genes can, therefore, be classified into three categories:

(1) protein-coding genes, which are transcribed into RNA and subsequently translated into proteins.

(2) RNA-specifying genes, which are transcribed but not translated

(3) nontranscribed genes.
Really? Is it useful to think of centromeres and telomeres as genes? Is it useful to define an origin of replication as a gene? And what about regulatory sequences? Should each functional binding site for a transcription factor be called a gene?

The definition also leads to some other problems. Genes (my definition) occupy about 30% of the human genome but most of this is introns, which are mostly junk (i.e. no selected effect function). How does that make sense using Dan's definition?


Saturday, April 18, 2020

Three scientists discuss junk DNA

I just found this video that was posted to YouTube in May 2019. It's produced by the University of California and it features three researchers discussing the question, "Is Most of Your DNA Junk?" The three scientists are:
  • Rusty Gage, a neuroscientist at the Salk Institute
  • Alysson Muotri, who studies brain development at the University of California, San Diego
  • Miles Wilkinson, who studies neuronal and germ cell development at the University of California, San Diego
None of them appears to be an expert on genomes or junk DNA, although one of them (Wilkinson) seems to have some knowledge of the evidence for junk DNA even if many of his explanations are garbled. What's interesting is that they emphasize the fact that some transposon-related sequences are expressed in some cells and they rely on this fact to remain skeptical of junk DNA. They also propose that excess DNA might be present in order to ensure diversity and prepare for future evolution. All three seem to be comfortable with the idea that excess DNA may be protecting the rest of the functional genome.

This is a good example of what we are up against when we try to convince scientists that most of our genome is junk.





Wednesday, April 08, 2020

Alternative splicing: function vs noise

This post is about a recent review of alternative splicing published by my colleague Ben Blencowe in the Dept. of Medical Genetics at the University of Toronto (Toronto, Ontario, Canada). (The other author is Jernej Ule of The Francis Crick Institute in London (UK).) They are strong supporters of the idea that alternative splicing is a common feature of most human genes.

I am a strong supporter of the idea that most splice variants are due to splicing errors and only a few percent of human genes undergo true alternative splicing.

This is a disagreement about the definition of "function." Is the mere existence of multiple splice variants evidence that they are biologically relevant (functional) or should we demand evidence of function—such as conservation—before accepting such a claim?