Showing posts with label Biochemistry. Show all posts

Sunday, December 15, 2019

The evolution of citrate synthase

Citrate synthase [EC 2.3.3.1] is one of the key enzymes of the citric acid cycle. It catalyzes the joining of acetyl-CoA and oxaloacetate to produce citrate.
acetyl-CoA + H₂O + oxaloacetate → citrate + HS-CoA + H⁺
We usually think of this reaction in terms of energy production since acetyl-CoA is derived from pyruvate, the end product of glycolysis, and the citric acid cycle produces substrates that enter the electron transport system, leading to production of ATP. However, it's important to keep in mind that the enzyme also catalyzes the reverse reaction.

Thursday, May 10, 2018

Fixing carbon by reversing the citric acid cycle

The citric acid cycle¹ is usually taught as depicted in the diagram on the right.² A four-carbon compound called oxaloacetate is joined to a two-carbon compound called acetyl-CoA to produce a six-carbon tricarboxylic acid called citrate. In subsequent reactions, two carbons are released in the form of carbon dioxide to regenerate the original oxaloacetate. The cycle then repeats. The reactions produce one ATP equivalent (ATP or GTP), three NADH molecules, and one QH2 molecule.

The GTP/ATP molecule and the reduced coenzymes (NADH and QH2) are used up in a variety of other reactions. In the case of NADH and QH2, one of the many pathways to oxidation is the membrane-associated electron transport system that creates a proton gradient across a membrane. The electron transport complexes are buried in membranes—plasma and internal membranes in bacteria and the inner mitochondrial membrane in eukaryotes. Students are often taught that this is the only fate of NADH and QH2 but that's not true.

One of the other common misconceptions is that the citric acid cycle runs exclusively in one direction; namely, the direction shown in the diagram. That's also not true. The reactions of the citric acid cycle are near-equilibrium reactions like most reactions in the cell. What this means is that the concentrations of the reactants and products are close to the equilibrium values so that a slight increase in one of them will lead to a rapid equilibration. The reactions can run in either direction.³
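The near-equilibrium idea can be made concrete with a small calculation. This is only a sketch, using the standard relation ΔG = RT ln(Q/Keq), where Q is the actual ratio of product to reactant concentrations and Keq is the equilibrium constant. The temperature and the Q/Keq ratios below are illustrative assumptions, not measured values for any citric acid cycle enzyme.

```python
import math

R_KJ = 8.314e-3   # gas constant in kJ/(mol*K)
TEMP_K = 298.15   # assumed temperature (25 degrees C), for illustration only


def delta_g(q_over_keq: float, temperature: float = TEMP_K) -> float:
    """Actual Gibbs free energy change (kJ/mol) for a given Q/Keq ratio."""
    return R_KJ * temperature * math.log(q_over_keq)


# Hypothetical ratios: slightly below, at, and slightly above equilibrium.
for ratio in (0.9, 1.0, 1.1):
    dg = delta_g(ratio)
    if dg < 0:
        direction = "forward"
    elif dg > 0:
        direction = "reverse"
    else:
        direction = "no net flux"
    print(f"Q/Keq = {ratio:.1f}: dG = {dg:+.3f} kJ/mol ({direction})")
```

When Q is close to Keq, the free energy change is close to zero, and a small shift in concentrations on either side is enough to change its sign. That is why a near-equilibrium reaction can run in either direction.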

Tuesday, October 31, 2017

The history of DNA sequencing

This year marks the 40th anniversary of DNA sequencing technology (Gilbert and Maxam, 1977; Sanger et al., 1977).¹ The Sanger technique soon took over and by the 1990s it was the only technique used to sequence DNA. The development of reliable sequencing machines meant the end of those large polyacrylamide gels that we all hated.

Pyrosequencing was developed in the mid-1990s and by the year 2000 massively parallel sequencing using this technique was becoming quite common. This "NextGen" sequencing technique was behind the massive explosion in sequences in the early part of the 21st century.²

Even newer techniques are available today and there's a debate about whether they should be called Third Generation Sequencing (Heather and Chain, 2015).

Nature has published a nice review of the history of DNA sequencing (Shendure et al., 2017). I recommend it to anyone who's interested in the subject. The figure above is taken from that article.


1. Many labs were using the technology in 1976 before the papers were published.

2. New software and enhanced computer power played an important, and underappreciated, role.

Heather, J.M., and Chain, B. (2015) The sequence of sequencers: The history of sequencing DNA. Genomics, 107:1-8. [doi: 10.1016/j.ygeno.2015.11.003]

Maxam, A.M., and Gilbert, W. (1980) Sequencing end-labeled DNA with base-specific chemical cleavages. Methods in enzymology, 65:499-560. [doi: 10.1016/S0076-6879(80)65059-9]

Sanger, F., Nicklen, S., and Coulson, A.R. (1977) DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, 74:5463-5467. [PDF]

Shendure, J., Balasubramanian, S., Church, G.M., Gilbert, W., Rogers, J., Schloss, J.A., and Waterston, R.H. (2017) DNA sequencing at 40: past, present and future. Nature, 550:345-353. [doi: 10.1038/nature24286]


Wednesday, October 11, 2017

Historical evolution is determined by chance events

Modern evolutionary theory is based on the idea that alleles become fixed in a population over time. They can be fixed by natural selection if they confer selective advantage or they can be fixed by random genetic drift if they are nearly neutral or slightly deleterious [Learning about modern evolutionary theory: the drift-barrier hypothesis]. Alleles arise by mutation and the path that a population follows over time depends on the timing of mutations [Mutation-Driven Evolution]. That's largely a chance event.
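The difference between fixation by selection and fixation by drift can be illustrated with Kimura's diffusion approximation for the fixation probability of a single new mutation. This is a sketch under textbook assumptions that are not stated in the post: a diploid population of constant size N, with effective size equal to census size, and additive selection with coefficient s. The function name and the example numbers are hypothetical.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Kimura's approximation for a new mutation starting at frequency 1/(2N).

    s is the selection coefficient (negative if deleterious) and n is the
    diploid population size, assumed equal to the effective size.
    """
    if s == 0:
        return 1.0 / (2 * n)  # strictly neutral: fixation by drift alone
    # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


N = 10_000  # illustrative population size
for label, s in [("beneficial", 0.01), ("neutral", 0.0), ("slightly deleterious", -1e-5)]:
    print(f"{label:>20}: s = {s:+.0e}, P(fix) = {fixation_probability(s, N):.2e}")
```

Under these assumptions a beneficial allele still fixes only about 2s of the time, while a slightly deleterious allele fixes almost as often as a neutral one. Which mutations arise, and when, remains a matter of chance.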

Friday, August 25, 2017

How much of the human genome is devoted to regulation?

All available evidence suggests that about 90% of our genome is junk DNA. Many scientists are reluctant to accept this evidence—some of them are even unaware of the evidence [Five Things You Should Know if You Want to Participate in the Junk DNA Debate]. Many opponents of junk DNA suffer from what I call The Deflated Ego Problem. They are reluctant to concede that humans have about the same number of genes as all other mammals and only a few more than insects.

One of the common rationalizations is to speculate that while humans may have "only" 25,000 genes they are regulated and controlled in a much more sophisticated manner than the genes in other species. It's this extra level of control that makes humans special. Such speculations have been around for almost fifty years but they have gained in popularity since publication of the human genome sequence.

In some cases, the extra level of regulation is thought to be due to abundant regulatory RNAs. This means there must be tens of thousands of extra genes expressing these regulatory RNAs. John Mattick is the most vocal proponent of this idea and he won an award from the Human Genome Organization for "proving" that his speculation is correct! [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research]. Knowledgeable scientists know that Mattick is probably wrong. They believe that most of those transcripts are junk RNAs produced by accidental transcription at very low levels from non-conserved sequences.

Tuesday, June 27, 2017

Debating alternative splicing (Part IV)

In Debating alternative splicing (Part III) I discussed a review published in the February 2017 issue of Trends in Biochemical Sciences. The review examined the data on detecting predicted protein isoforms and concluded that there was little evidence they existed.

My colleague at the University of Toronto, Ben Blencowe, is a forceful proponent of massive alternative splicing. He responded in a letter published in the June 2017 issue of Trends in Biochemical Sciences (Blencowe, 2017). It's worth looking at his letter in order to understand the position of alternative splicing proponents. He begins by saying,
It is estimated that approximately 95% of multiexonic human genes give rise to transcripts containing more than 100 000 distinct AS events [3,4]. The majority of these AS events display tissue-dependent variation and 10–30% are subject to pronounced cell, tissue, or condition-specific regulation [4].

Monday, June 26, 2017

Debating alternative splicing (Part III)

Proponents of massive alternative splicing argue that most human genes produce many different protein isoforms. According to these scientists, this means that humans can make about 100,000 different proteins from only ~20,000 protein-coding genes. They tend to believe humans are considerably more complex than other animals even though we have about the same number of genes. They think alternative splicing accounts for this complexity [see The Deflated Ego Problem].

Opponents (I am one) argue that most splice variants are due to splicing errors and most of those predicted protein isoforms don't exist. (We also argue that the differences between humans and other animals can be adequately explained by differential regulation of 20,000 protein-coding genes.) The controversy can only be resolved when proponents of massive alternative splicing provide evidence to support their claim that there are 100,000 functional proteins.

Saturday, June 24, 2017

Debating alternative splicing (part II)

Mammalian genomes are very large. It looks like about 90% of each genome is junk DNA. These genomes are pervasively transcribed, meaning that almost 90% of the bases are complementary to a transcript produced at some time during development. I think most of those transcripts are due to inappropriate transcription initiation. They are mistakes in transcription. The genome is littered with transcription factor binding sites but only a small percentage are directly involved in regulating gene expression. The rest are due to spurious binding, a well-known property of DNA-binding proteins. These conclusions are based, I believe, on a proper understanding of evolution and basic biochemistry.

If you add up all the known genes, they cover about 30% of the genome sequence. Most of this (>90%) is intron sequence and introns are mostly junk. The standard mammalian gene is transcribed to produce a precursor RNA that is subsequently processed by splicing out introns to produce a mature RNA. If it's a messenger RNA (mRNA) then it will be translated to produce a protein (technically, a polypeptide). So far, the vast majority of protein-coding genes produce a single protein but there are some classic cases of alternative splicing where a given gene produces several different protein isoforms, each of which has a specific function.

Friday, June 23, 2017

Debating alternative splicing (part I)

I recently had a chance to talk science with my old friend and colleague Jack Greenblatt. He has recently teamed up with some of my other colleagues at the University of Toronto to publish a paper on alternative splicing in mouse cells. Over the years I have had numerous discussions with these colleagues since they are proponents of massive alternative splicing in mammals. I think most splice variants are due to splicing errors.

There's always a problem with terminology whenever we get involved in this debate. My position is that it's easy to detect splice variants but they should be called "splice variants" until it has been firmly established that the variants have a biological function. This is not a distinction that's acceptable to proponents of massive alternative splicing. They use the term "alternative splicing" to refer to any set of processing variants regardless of whether they are splicing errors or real examples of regulation. This sometimes makes it difficult to have a discussion.

In fact, most of my colleagues seem reluctant to admit that some splice variants could be due to meaningless errors in splicing. Thus, they can't be pinned down when I ask them what percentage of variants are genuine examples of alternative splicing and what percentage are splicing mistakes. I usually ask them to pick out a specific gene, show me all the splice variants that have been detected, and explain which ones are functional and which ones aren't. I have a standing challenge to do this with any one of three sets of genes [A Challenge to Fans of Alternative Splicing].
  1. Human genes for the enzymes of glycolysis
  2. Human genes for the subunits of RNA polymerase with an emphasis on the large conserved subunits
  3. Human genes for ribosomal proteins
I realize that proponents of massive alternative splicing are not under any obligation to respond to my challenge but it sure would help if they did.

Thursday, June 22, 2017

Are most transcription factor binding sites functional?

The ongoing debate over junk DNA often revolves around data collected by ENCODE and others. The idea that most of our genome is transcribed (pervasive transcription) seems to indicate that genes occupy most of the genome. The opposing view is that most of these transcripts are accidental products of spurious transcription. We see the same opposing views when it comes to transcription factor binding sites. ENCODE and their supporters have mapped millions of binding sites throughout the genome and they believe this represents abundant and exquisite regulation. The opposing view is that most of these binding sites are spurious and non-functional.

The messy view is supported by many studies on the biophysical properties of transcription factor binding. These studies show that any DNA-binding protein has a low affinity for random sequence DNA. These proteins will also bind with much higher affinity to sequences that resemble, but do not precisely match, the specific binding site [How RNA Polymerase Binds to DNA; DNA Binding Proteins]. If you take a species with a large genome, like us, then a typical protein binding site of 6 bp will be present, by chance alone, at about 800,000 sites. Not all of those sites will be bound by the transcription factor in vivo because some of the DNA will be tightly wrapped up in dense chromatin domains. Nevertheless, an appreciable percentage of the genome will be available for binding so that typical ENCODE assays detect thousands of binding sites for each transcription factor.
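The 800,000 figure follows from simple arithmetic, which can be sketched as below. The calculation assumes a haploid genome of about 3.2 billion bp (a number not given in the post), equal frequencies of the four bases, and a single strand searched for exact matches. The function name is hypothetical.

```python
def expected_chance_sites(genome_bp: int, site_bp: int) -> float:
    """Expected number of exact matches to one specific site_bp-long sequence.

    Each position matches with probability (1/4) ** site_bp, assuming the four
    bases are equally frequent. The handful of positions lost at the end of the
    sequence is ignored because it is negligible at genome scale.
    """
    return genome_bp / 4 ** site_bp


GENOME_BP = 3_200_000_000  # assumed haploid human genome size

print(round(expected_chance_sites(GENOME_BP, 6)))  # 781250, i.e. roughly 800,000
```

Counting both strands, or allowing near matches as described above, would only increase the number of chance sites.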

This information appears in all the best textbooks and it used to be a standard part of undergraduate courses in molecular biology and biochemistry. As far as I can tell, the current generation of new biochemistry researchers wasn't taught this information.

Wednesday, June 21, 2017

John Mattick still claims that most lncRNAs are functional

Most of the human genome is transcribed at some time or another in some tissue or another. The phenomenon is now known as pervasive transcription. Scientists have known about it for almost half a century.

At first the phenomenon seemed really puzzling since it was known that coding regions accounted for less than 1% of the genome and genetic load arguments suggested that only a small percentage of the genome could be functional. It was also known that more than half the genome consists of repetitive sequences that we now know are bits and pieces of defective transposons. It seemed unlikely back then that transcripts of defective transposons could be functional.

Part of the problem was solved with the discovery of RNA processing, especially splicing. It soon became apparent (by the early 1980s) that a typical protein-coding gene was stretched out over 37,000 bp of which only 1300 bp were coding region. The rest was introns, and intron sequences appeared to be mostly junk.
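As a quick check on those numbers, the coding fraction of such a typical gene can be worked out directly. This sketch uses only the two figures quoted above, and the variable names are illustrative.

```python
GENE_BP = 37_000    # length of a typical protein-coding gene, from the text
CODING_BP = 1_300   # coding region within that gene, from the text

coding_fraction = CODING_BP / GENE_BP
noncoding_fraction = 1 - coding_fraction

print(f"coding: {coding_fraction:.1%}, non-coding: {noncoding_fraction:.1%}")
# coding: 3.5%, non-coding: 96.5%
```

So even within the boundaries of a gene, only a few percent of the transcribed sequence actually codes for protein.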

Tuesday, May 30, 2017

We are scientists


You can tell we are scientists because we're all wearing lab coats.

Left to right: David Isenman, Larry Moran, Marc Perry, Kim Ellison, Trevor Moraes, Mike Ellison.

The photo was taken in the biochemistry department labs at the University of Toronto (Toronto, Canada).




Three generations of scientists


Bottom row, left to right.

Marc Perry: Bioinformatics researcher and former graduate student in my lab.
Mike Ellison: Professor, University of Alberta (Alberta, Canada) and former graduate student in the lab of my colleague David Pulleyblank.
Trevor Moraes: Professor in my department at the University of Toronto and former graduate student with Mike Ellison.
Kim (Bird) Ellison: Professor at the University of Alberta, former undergraduate student in my lab (where she met Mike Ellison), Ph.D. at MIT.



Friday, April 28, 2017

Professor, please can I have more marks?

I submitted my grades on Thursday morning and they were approved by the Department of Biochemistry in short order. Once the final grades have been approved and submitted to the Faculty they can't be changed unless the change is approved by the Departmental Chair. Students may appeal their grade by paying a fee to re-read their final exam but, even then, I do not have the authority on my own to change a grade. I have to justify any change in writing. This is a good thing.

A few hours after the grades were posted I received an email message from a student [It's that time of year, again]. Here's part of what the student said,
I just saw my final mark ... which was an 76, and was very surprised. I thought I'd done well on the final exam, and had studied hard. My performance on the Midterm was good, and I had expected this to be just as well. As such, I wanted to humbly inquire whether it'd be possible to move me a 77 (a 1% increase) or even an 80. This small difference could make a very big impact on my GPA as I apply for positions to pursue a master or other professional degrees. With the mark as it is now, I fall below the GPA requirement for a program I wish to enroll in next year and will have to do another few courses or a full year to make up for it.

Friday, April 21, 2017

I'm going to Chicago!

I leave tomorrow for Chicago where I'm attending Experimental Biology 2017. Is anyone else going to be there? Wanna get together? I'm there until Wednesday.



Thursday, April 20, 2017

Bill Martin is coming to town!!!

Contact me by email if you'd like to meet him on Sunday, April 30th.




The last molecular evolution exam: Question #6

How can alleles be fixed in a population by positive natural selection (i.e. adaptation) if the environment remains constant for thousands of years?

Question #1, Question #2, Question #3, Question #4, Question #5, Question #6


The last molecular evolution exam: Question #5

Many people believe that recombination evolved because it increases genetic variation in a population and this provided a selective advantage over species that didn’t have recombination. Do you agree with this explanation for the evolution of recombination? Why, or why not? What are the other possibilities?



The last molecular evolution exam: Question #4

More than 90% of our genome is transcribed when you add up all the transcripts from various cell types and various times of development (= pervasive transcription). Many biologists take this as evidence that most of the DNA in our genome is functional. What are the counter-arguments? Who do you believe and why?



The last molecular evolution exam: Question #3

The Three Domain Hypothesis has eukaryotes and archaea branching off from eubacteria. It shows eukaryotes more closely related to archaea than to eubacteria. However, many scientific studies indicate that a majority of our genes are more similar to eubacterial genes than to archaeal genes. How do you explain this apparent conflict?
