Thursday, July 09, 2020
Structure and expression of the SARS-CoV-2 (coronavirus) genome
Coronaviruses are RNA viruses, which means that their genome is RNA, not DNA. All of the coronaviruses have similar genomes but I'm sure you are mostly interested in SARS-CoV-2, the virus that causes COVID-19. The first genome sequence of this virus was determined by Chinese scientists in early January and it was immediately posted on a public server [GenBank MN908947]. The viral RNA came from a patient in intensive care at the Wuhan Yin-Tan Hospital (China). The paper was accepted on Jan. 20th and it appeared in the Feb. 3rd issue of Nature (Zhou et al. 2020).
By the time the paper came out, several universities and pharmaceutical companies had already constructed potential therapeutics and several others had already cloned the genes and were preparing to publish the structures of the proteins.1
By now there are dozens and dozens of sequences of SARS-CoV-2 genomes from isolates in every part of the world. They are all very similar because the mutation rate in these RNA viruses is not high (about 10^-6 per nucleotide per replication). The original isolate has a total length of 29,891 nt, not counting the poly(A) tail. Note that these RNA viruses are about four times larger than a typical retrovirus; they are the largest known RNA viruses.
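To put those two numbers together, here is a rough back-of-the-envelope sketch in Python. It uses only the genome length and mutation rate quoted above; the function names are mine, and the assumption that every site mutates independently at the same rate is a simplification, not something the sequence data establish.

```python
# Back-of-the-envelope estimate: new mutations per copy of the genome.
GENOME_LENGTH_NT = 29_891   # length of the original isolate (from the text)
MUTATION_RATE = 1e-6        # per nucleotide per replication (from the text)


def expected_mutations_per_replication(length_nt, rate):
    """Expected number of new mutations in one copy of the genome."""
    return length_nt * rate


def prob_error_free_copy(length_nt, rate):
    """Probability that a copy carries no new mutation.

    Assumes every site mutates independently at the same rate
    (an illustrative simplification).
    """
    return (1.0 - rate) ** length_nt


expected = expected_mutations_per_replication(GENOME_LENGTH_NT, MUTATION_RATE)
p_clean = prob_error_free_copy(GENOME_LENGTH_NT, MUTATION_RATE)

print(f"Expected mutations per replication: {expected:.3f}")
print(f"Probability of an error-free copy:  {p_clean:.3f}")
```

Under these assumptions the expected number of new mutations is about 0.03 per replication, so roughly 97% of copies would be identical to their template. That fits with isolates from around the world looking so similar.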
Labels: Gene Expression, Genes, Genome
Wednesday, July 08, 2020
Where did your chicken come from?
Scientists have sequenced the genomes of modern domesticated chickens and compared them to the genomes of various wild pheasants in southern Asia. It has been known for some time that chickens resemble a species of pheasant called red jungle fowl and this led Charles Darwin to speculate that chickens were domesticated in India. Others have suggested Southeast Asia or China as the site of domestication.
The latest results show that modern chickens probably descend from a subspecies of red jungle fowl that inhabits the region around Myanmar (Wang et al., 2020). The subspecies is Gallus gallus spadiceus and the domesticated chicken subspecies is Gallus gallus domesticus. As you might expect, the two subspecies can interbreed.
The authors looked at a total of 863 genomes of domestic chickens, four species of jungle fowl, and all five subspecies of red jungle fowl. They identified a total of 33.4 million SNPs, which were enough to genetically distinguish between the various species AND the subspecies of red jungle fowl. (Contrary to popular belief, it is quite possible to assign a given genome to a subspecies (race) based entirely on genetic differences.)
The sequence data suggest that chickens were domesticated from wild G. g. spadiceus about 10,000 years ago in the northern part of Southeast Asia. The data also suggest that modern domesticated chickens (G. g. domesticus) from India, Pakistan, and Bangladesh interbred with another subspecies of red jungle fowl (G. g. murghi) after the original domestication. These chickens from South Asia contain substantial contributions from G. g. murghi ranging from 8-22%.
Next time you serve chicken, if someone asks you where it came from you won't be lying if you say it came from Myanmar.
Image credits: BBQ chicken, Creative Common License [Chicken BBQ]
Red Jungle Fowl, Creative Commons License [Red_Junglefowl_-Thailand]
Map: Lawler, A. (2020) Dawn of the chicken revealed in Southeast Asia. Science 368: 1411.
Wang, M., Thakur, M., Peng, M. et al. (2020) 863 genomes reveal the origin and domestication of chicken. Cell Res [doi: 10.1038/s41422-020-0349-y]
Monday, July 06, 2020
A storm of cytokines
Cytokines are a diverse group of small signal proteins that act like hormones to turn on genes in blood cells and cells of the immune system. In COVID-19 the production of cytokines can be over-stimulated to produce a cytokine storm that activates immune cells, producing all kinds of severe, sometimes lethal, effects. There are dozens of different cytokines but they all act in a similar manner. Each one binds to a receptor on the membrane of a target cell and this stimulates the cytoplasmic side of the receptor to activate a transcription factor that enters the nucleus and turns on a specific set of genes. The activation step requires phosphorylation, just like dozens of other signalling pathways. (See Morris et al. (2018) for a recent review.)
I was curious about the structures of these cytokines so I looked up a few of them on PDB. Here are three fairly representative structures.
Morris, R., Kershaw, N.J., and Babon, J.J. (2018) The molecular details of cytokine signaling via the JAK/STAT pathway. Protein Science 27: 1984-2009. [doi: 10.1002/pro.3519]
Saturday, June 13, 2020
What's in Your Genome? Chapter 3: Repetitive DNA and Mobile Genetic Elements
By the end of chapter 3, readers will be familiar with two main lines of evidence for junk DNA: the C-Value Paradox, and the fact that most of our genome is full of bits and pieces of dead transposons and viruses. They will also understand that this is perfectly consistent with modern evolutionary theory.
Chapter 3: Repetitive DNA and Mobile Genetic Elements
- Centromeres
- Telomeres
- Mobile genetic elements
- Hidden viruses in your genome
- What the heck is a transposon?
- LINES and SINES
- How much of our genome is composed of transposon-related sequences?
- BOX 3-1: What does the humped bladderwort tell us about junk DNA?
- Selfish genes and selfish DNA
- Mitochondria are invading your genome!
- Selection hypotheses
- Exaptation and the post hoc fallacy
- Box 3-2: Natural genetic engineering?
- If it walks like a duck ...
Labels: Junk DNA
What's in Your Genome? Chapter 2: The Evolution of Sloppy Genomes
I had to completely reorganize chapter 2 in order to move population genetics closer to the beginning of the book and reduce the number of words.
Chapter 2: The Evolution of Sloppy Genomes
- Fugu sashimi
- Variation in genome size
- The Onion Test
- Instantaneous genome doubling
- Modern evolutionary theory
- Random genetic drift
- Neutral Theory
- Nearly-Neutral Theory
- Box 2-1: Are humans still evolving?
- Population size and the Drift-Barrier Hypothesis
- Bacteria have small genomes
- On the evolution of sloppy genomes
Labels: Junk RNA
What's in Your Genome? Chapter 1: Introducing Genomes
My book is progressing slowly. The main task is to reduce it to about 120,000 words and that's proving to be a lot more difficult than I imagined.
Here's what's now in Chapter 1: Introducing Genomes
- The genome war
- Finishing the human genome sequence
- What is DNA?
- The double helix
- The sequence of all the base pairs was the goal of the human genome project
- How big is your genome?
- Packaging DNA: chromatin
- Transcription
- Translation
- The genetic code
- Introns and exons
- The history of junk DNA
Labels: Junk RNA
Thursday, June 11, 2020
Dan Graur proposes a new definition of "gene"
I've thought a lot about how to define the word "gene." It's clear that no definition will capture all the possibilities but that doesn't mean we should abandon the term. Traditionally, the biochemical definition attempts to describe the part of the genome that produces a functional product. Most scientists seem to think that the only possible product is a protein so it's common to see the word "gene" defined as a DNA sequence that produces a protein.
But from the very beginning of molecular biology the textbooks also talked about genes for ribosomal RNAs and tRNAs so there was never a time when knowledgeable scientists restricted their definition of a gene to protein-coding regions. My best molecular definition is described in What Is a Gene?.
A gene is a DNA sequence that is transcribed to produce a functional product.
Dan Graur has also thought about the issue and he comes up with a different definition in a recent blog post: What Is a Gene? A Very Short Answer with a Very Long Footnote
A gene is a sequence of genomic material (DNA or RNA) that has a selected effect function.
This is obviously an attempt to equate "function" with "gene" so that all functional parts of the genome are genes, by definition. You might think this is rather silly because it excludes some obvious functional regions but Dan really does want to count them as genes.
Performance of the function may or may not require the gene to be translated or even transcribed. Genes can, therefore, be classified into three categories:
(1) protein-coding genes, which are transcribed into RNA and subsequently translated into proteins.
(2) RNA-specifying genes, which are transcribed but not translated
(3) nontranscribed genes.
Really? Is it useful to think of centromeres and telomeres as genes? Is it useful to define an origin of replication as a gene? And what about regulatory sequences? Should each functional binding site for a transcription factor be called a gene?
The definition also leads to some other problems. Genes (my definition) occupy about 30% of the human genome but most of this is introns, which are mostly junk (i.e. no selected effect function). How does that make sense using Dan's definition?
Labels: Biochemistry, Genes
Saturday, April 18, 2020
Three scientists discuss junk DNA
I just found this video that was posted to YouTube in May 2019. It's produced by the University of California and it features three researchers discussing the question, "Is Most of Your DNA Junk?" The three scientists are:
- Rusty Gage, a neuroscientist at the Salk Institute
- Alysson Muotri, who studies brain development at the University of California, San Diego
- Miles Wilkinson, who studies neuronal and germ cell development at the University of San Diego
This is a good example of what we are up against when we try to convince scientists that most of our genome is junk.
Wednesday, April 08, 2020
Alternative splicing: function vs noise
This post is about a recent review of alternative splicing published by my colleague Ben Blencowe in the Dept. of Medical Genetics at the University of Toronto (Toronto, Ontario, Canada). (The other author is Jernej Ule of The Francis Crick Institute in London, UK.) They are strong supporters of the idea that alternative splicing is a common feature of most human genes.
I am a strong supporter of the idea that most splice variants are due to splicing errors and only a few percent of human genes undergo true alternative splicing.
This is a disagreement about the definition of "function." Is the mere existence of multiple splice variants evidence that they are biologically relevant (functional) or should we demand evidence of function—such as conservation—before accepting such a claim?
Monday, April 06, 2020
The Function Wars Part VII: Function monism vs function pluralism
This post is mostly about a recent paper published in Studies in History and Philosophy of Biol & Biomed Sci where two philosophers present their view of the function wars. They argue that the best definition of function is a weak etiological account (monism) and pluralistic accounts that include causal role (CR) definitions are mostly invalid. Weak etiological monism is the idea that sequence conservation is the best indication of function but that doesn't necessarily imply that the trait arose by natural selection (adaptation); it could have arisen by neutral processes such as constructive neutral evolution.
The paper makes several dubious claims about ENCODE that I want to discuss but first we need a little background.
Background
The ENCODE publicity campaign created a lot of controversy in 2012 because ENCODE researchers claimed that 80% of the human genome is functional. That claim conflicted with all the evidence that had accumulated up to that point in time. Based on their definition of function, the leading ENCODE researchers announced the death of junk DNA and this position was adopted by leading science writers and leading journals such as Nature and Science.
Let's be very clear about one thing. This was a SCIENTIFIC conflict over how to interpret data and evidence. The ENCODE researchers simply ignored a ton of evidence demonstrating that most of our genome is junk. Instead, they focused on the well-known facts that much of the genome is transcribed and that the genome is full of transcription factor binding sites. Neither of these facts was new and both of them had simple explanations: (1) most of the transcripts are spurious transcripts that have nothing to do with function, and (2) random non-functional transcription factor binding sites are expected from our knowledge of DNA binding proteins. The ENCODE researchers ignored these explanations and attributed function to all transcripts and all transcription factor binding sites. That's why they announced that 80% of the genome is functional.
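The second explanation is easy to illustrate with a rough calculation. The sketch below is mine, not ENCODE's or anyone else's analysis. It assumes a haploid human genome of about 3.2 billion base pairs, equal frequencies of the four bases, independent positions, and exact matches to a single short motif. The motif lengths are illustrative only, and real binding sites are degenerate, which makes chance matches more common, not less.

```python
# Rough estimate of how often a short motif turns up by chance in a big genome.
GENOME_SIZE_BP = 3.2e9   # assumed approximate haploid human genome size


def expected_chance_matches(motif_length, genome_size_bp=GENOME_SIZE_BP):
    """Expected chance occurrences of one exact motif, counting both strands.

    Assumes equal base frequencies and independent positions; real genomes
    do not satisfy either assumption exactly. A palindromic motif would be
    counted twice by the factor of 2.
    """
    per_position_probability = 0.25 ** motif_length
    return 2 * genome_size_bp * per_position_probability


# Illustrative motif lengths (not taken from any particular data set).
for k in (6, 8, 10, 12):
    print(f"{k}-bp motif: ~{expected_chance_matches(k):,.0f} chance matches")
```

Under these assumptions, even a perfectly specific 8-bp recognition sequence is expected to occur tens of thousands of times by chance alone, so finding a transcription factor bound at a site is not, by itself, evidence that the site has a function.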
Wednesday, February 12, 2020
Happy Darwin Day! 2020
Charles Darwin, the greatest scientist who ever lived, was born on this day in 1809 [Darwin still spurs tributes, debates] [Happy Darwin Day!] [Darwin Day 2017]. Darwin is mostly famous for two things: (1) he described and documented the evidence for evolution and common descent and (2) he provided a plausible scientific explanation of evolution—the theory of natural selection. He put all this in a book, The Origin of Species by Means of Natural Selection published in 1859—a book that spurred a revolution in our understanding of the natural world. (You can still buy a first edition copy of the book but it will cost you several hundred thousand dollars.)
Friday, February 07, 2020
The Function Wars Part VI: The problem with selected effect function
The term "Function Wars" refers to the debate over the meaning of 'function,' especially in the context of junk DNA.1 That debate intensified in 2012 after the ENCODE publicity campaign that tried to redefine function to mean anything they want as long as it refutes junk DNA. This is the sixth in a series of posts exploring the debate and why it's important, or not. Links to the other five posts can be found at the bottom of this post.
The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.
Stephen Jay Gould (1982)
Much of the discussion seems like quibbling over semantics but I'm reminded of a similar debate over the mode of evolution: is it gradual or punctuated? As Gould pointed out in 1982, there's a serious issue underlying the debate, an issue that shouldn't get lost in bickering over the meaning of 'gradualistic.' The same warning applies here. It's important to determine how much of the human genome is junk and that requires an understanding of what we mean by junk DNA. However, it's easy to get distracted by focusing on the exact meaning of the word 'function' instead of looking at the big picture.
Friday, January 31, 2020
lncRNA nonsense from Los Alamos
A group of scientists at the Los Alamos National Laboratory (Los Alamos, NM, USA) and their collaborators in Vienna (Austria) and Lethbridge (Alberta, Canada) have worked out the structure of Braveheart lncRNA from mice.
Kim, D.N., Thiel, B.C., Mrozowich, T., Hennelly, S.P., Hofacker, I.L., Patel, T.R., Sanbonmatsu, K.Y. (2020) Zinc-finger protein CNBP alters the 3-D structure of lncRNA Braveheart in solution. Nat. Commun. 11:148 [doi: 10.1038/s41467-019-13942-4]
The authors point out in their paper that lncRNAs are difficult to work with and the 3D structures of only a small number have been characterized. There's nothing in the paper about the problems associated with determining the functions of lncRNAs and nothing about the number of lncRNAs except for this brief opening statement: "Long non-coding RNAs (lncRNAs) constitute a significant fraction of the transcriptome ..."
Tuesday, January 14, 2020
The Three Domain Hypothesis: RIP
The Three Domain Hypothesis died about twenty years ago but most people didn't notice.
The original idea was promoted by Carl Woese and his colleagues in the early 1980s. It was based on the discovery of archaebacteria as a distinct clade that was different from other bacteria (eubacteria). It also became clear that some eukaryotic genes (e.g. ribosomal RNA) were more closely related to archaebacterial genes and the original data indicated that eukaryotes formed another distinct group separate from either the archaebacteria or eubacteria. This gave rise to the Three Domain Hypothesis where each of the groups, bacteria (Eubacteria), archaebacteria (Archaea), and eukaryotes (Eucarya, Eukaryota), formed a separate clade that contained multiple kingdoms. These clades were called Domains.
Wednesday, January 08, 2020
Are pseudogenes really pseudogenes?
There are many junk DNA skeptics who claim that most of our genome is functional. Some of them have even questioned whether pseudogenes are mostly junk. The latest challenge comes from a recent review in Nature Reviews Genetics where the authors try to place the burden of proof on those who say that pseudogenes are broken, nonfunctional genes (Cheetham et al., 2019). The authors of the review try to make the case that we should not label a DNA sequence as a pseudogene until we can prove that it is truly nonfunctional junk.
I'm about to refute this ridiculous stance but first we need a little background.