Friday, July 22, 2011

Stop the Press!!! Scientists Discover the 7th and 8th Bases of DNA!

Science Daily published a press release from the University of North Carolina School of Medicine: Researchers Identify Seventh and Eighth Bases of DNA. The news was so extraordinary that the article was copied elsewhere. Here are the opening paragraphs from the press release ...

ScienceDaily (July 21, 2011) — For decades, scientists have known that DNA consists of four basic units -- adenine, guanine, thymine and cytosine. Those four bases have been taught in science textbooks and have formed the basis of the growing knowledge regarding how genes code for life. Yet in recent history, scientists have expanded that list from four to six.

Now, with a finding published online in the July 21, 2011, issue of the journal Science, researchers from the UNC School of Medicine have discovered the seventh and eighth bases of DNA.

These last two bases -- called 5-formylcytosine and 5 carboxylcytosine -- are actually versions of cytosine that have been modified by Tet proteins, molecular entities thought to play a role in DNA demethylation and stem cell reprogramming.


Much is known about the "fifth base," 5-methylcytosine, which arises when a chemical tag or methyl group is tacked onto a cytosine. This methylation is associated with gene silencing, as it causes the DNA's double helix to fold even tighter upon itself.

Last year, Zhang's group reported that Tet proteins can convert 5 methylC (the fifth base) to 5 hydroxymethylC (the sixth base) in the first of a four step reaction leading back to bare-boned cytosine.

Speaking of textbooks, this amazing discovery couldn't have come at a better time since I'm just wrapping up the final chapters of my introductory biochemistry book. I'd better review what I wrote to see if I can include the 7th and 8th bases. Here's what I've got so far ...

DNA and RNA contain a number of modified nucleotides. The ones present in transfer RNA are well known (Section 21.8B) but the modified nucleotides in DNA are just as important. Some of the more common modified bases in DNA are shown in Figure 18.17. Most of them are only found in a few species or in bacteriophage while others are more widespread.

We will encounter N6-methyladenine in the next chapter when we discuss restriction endonucleases. 5-Methylcytosine is a common modified base in mammalian DNA because it plays a role in chromatin assembly and the regulation of transcription. About 3% of all deoxycytidylate residues in mammalian DNA are modified to 5-methylcytidine.

Oh dear. Looks like I've made a serious mistake. I've shown bases #5, #6, #7, #8, #9, and #10 but everyone knows that up until yesterday only six bases were known.

Where did I go wrong? Can anyone help me out before I have to send this chapter to the printer? [1]

(The original Science paper is Ito et al. (2011). The authors really do imply that there are only six known modified nucleotides but they add an important qualifier that seems to have been played down in the press release.)

1. One of my sources is Gommers-Ampt and Borst (1995). In addition to the modified bases I've shown above they describe three forms of glycosylated hydroxymethylcytosine (#11, #12, #13), uracil (#14), α-putrescinylthymine (#15), two different sugar-substituted forms of 5-dihydroxypentyluracil (#16, #17), α-glutamylthymine (#18), 7-methylguanine (#19), N6-carbamoylmethyladenine (#20), N4-methylcytosine (#21), three versions of glycosylated 5-hydroxycytosine (#22, #23, #24) and β-D-glucosylhydroxymethyluracil (#25).

Gommers-Ampt, J.H. and Borst, A.P. (1995) Hypermodified bases in DNA. FASEB J. 9: 1034-1042 [FASEB]

Ito, S., Shen, L., Dai, Q., Wu, S.C., Collins, L.B., Swenberg, J.A., He, C., and Zhang, Y. (2011) Tet Proteins Can Convert 5-Methylcytosine to 5-Formylcytosine and 5-Carboxylcytosine. Science Published Online 21 July 2011 [doi:10.1126/science.1210597]


  1. There is even a 5-hmC detection kit for sale.

    It's been shown to inhibit binding of methyl-CpG-binding domain proteins, reversing some of the effects of DNA methylation on expression.

    If this is the case, then it's another layer of epigenetic control, and all our bisulfite sequencing will have to be reconsidered.

  2. Oh, dear, if that's how the number of bases gets multiplied, think of how many different amino acid residues in proteins are out there!

  3. The cool thing about the new bases is that they are actually found in mammals, unlike most of the other ones. The fuss about the discovery of formylcytosine is quite interesting, as it was already published by a European group a month ago - but nobody seemed to care ...

  4. anonymous says,

    the cool thing about the new bases is that they are actually found in mammals, unlike most of the other ones.

    Hmmm ... I thought it was pretty cool when the other 25 modified bases were found in a variety of different species. Why is it more cool that some are found in mammalian genomes?

    And does this myopic view justify implying that the new bases are only the 7th and 8th bases that have been found? This is one of the problems with science these days. Even supposedly knowledgeable researchers are completely ignorant of fundamental work done years ago in bacteria and phage.

  5. Dr. Moran,

    In the figure, O4 in 5-hydroxymethyluracil needs a double bond. Just wanted to point that out in case the error is present in the figure from your textbook as well.

  6. Any idea when your text will be published?

  7. Even supposedly knowledgeable researchers are completely ignorant of fundamental work done years ago in bacteria and phage.

    This complaint sounds a little like getting all upset that people are so busy being impressed with lasers that they forgot all that pioneering work done years ago with flashlights. *tease*

  8. anonymous asks,

    Any idea when your text will be published?

    It goes to the printer on Friday. We should have copies by September 1st.