A recent report began with ... [Parts of genome without a known function may play a key role in the birth of new proteins]
Researchers in Biomedical Informatics at IMIM (Hospital del Mar Medical Research Institute) and at the Universitat Politècnica de Catalunya (UPC) have recently published a study in eLife showing that RNA called non-coding (lncRNA) plays an important role in the evolution of new proteins, some of which could have important cell functions yet to be discovered.
That sounds intriguing. Maybe I should read the paper even though it's in eLife.
It took a little more work than I expected, but eventually I found the paper (Ruiz-Orera et al., 2014). Here's the abstract.
Deep transcriptome sequencing has revealed the existence of many transcripts that lack long or conserved open reading frames (ORFs) and which have been termed long non-coding RNAs (lncRNAs). The vast majority of lncRNAs are lineage-specific and do not yet have a known function. In this study, we test the hypothesis that they may act as a repository for the synthesis of new peptides. We find that a large fraction of the lncRNAs expressed in cells from six different species is associated with ribosomes. The patterns of ribosome protection are consistent with the translation of short peptides. lncRNAs show similar coding potential and sequence constraints than evolutionary young protein coding sequences, indicating that they play an important role in de novo protein evolution.
The study suggests that a lot of "noncoding" RNAs are being translated. The products appear to be short polypeptides of less than 100 residues.
New protein encoding genes do arise from time to time although the number of proven examples is very small. Let's assume, for the sake of argument, that a new gene arises about once every million years in a given lineage. That would mean about five new genes in humans since they split from chimpanzees and that seems about right for an upper limit.
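The arithmetic behind that estimate is just a rate multiplied by a time span. Here is a minimal sketch of it in Python. Both numbers are assumptions: the rate is the one adopted above for the sake of argument, and the divergence time of roughly five million years is the value implied by "about five new genes" (published estimates vary). The function name is made up for illustration.

```python
# Back-of-the-envelope estimate; both inputs are assumptions, not measurements.
NEW_GENES_PER_MILLION_YEARS = 1.0  # rate assumed in the text, for the sake of argument
MILLION_YEARS_SINCE_SPLIT = 5.0    # assumed human-chimp divergence time; estimates vary


def expected_new_genes(rate_per_myr: float, elapsed_myr: float) -> float:
    """Expected number of new genes = birth rate x elapsed time."""
    return rate_per_myr * elapsed_myr


estimate = expected_new_genes(NEW_GENES_PER_MILLION_YEARS, MILLION_YEARS_SINCE_SPLIT)
print(f"Expected new genes since the split: {estimate:g}")
```

Changing either assumption scales the answer linearly, which is why the figure is best read as a rough upper limit and not a count.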
Now, if you make a lot of junk RNAs by randomly transcribing junk DNA, then some of them will undoubtedly make short polypeptides. There's a chance that random mutations will create a peptide that takes on a functional role of some kind. There's an even smaller chance that this function will confer a selective advantage on the individual carrying the mutation. That's one way new genes are born.
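A toy simulation can make the first step of that argument concrete. This is only a sketch under loud assumptions: the "transcripts" are uniformly random sequences with equal base frequencies, only the three forward reading frames are scanned, an ORF is counted only if it runs from ATG to an in-frame stop codon, and the function names, transcript length, and length cutoff are all hypothetical choices of mine. It says nothing about whether such ORFs are actually translated, let alone whether the products do anything.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def random_transcript(length: int, rng: random.Random) -> str:
    """Return a random DNA-alphabet sequence with equal base frequencies (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def longest_orf_codons(seq: str) -> int:
    """Longest ORF in the three forward frames, in codons (start included, stop excluded).

    Only ORFs closed by an in-frame stop codon are counted.
    """
    best = 0
    for frame in range(3):
        current = 0  # 0 means we are not inside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if current == 0:
                if codon == START_CODON:
                    current = 1
            elif codon in STOP_CODONS:
                best = max(best, current)
                current = 0
            else:
                current += 1
    return best


def fraction_with_orf(n_transcripts: int, length: int, min_codons: int, seed: int) -> float:
    """Fraction of random transcripts containing an ORF of at least min_codons codons."""
    rng = random.Random(seed)
    hits = sum(
        longest_orf_codons(random_transcript(length, rng)) >= min_codons
        for _ in range(n_transcripts)
    )
    return hits / n_transcripts


# Hypothetical parameters: 500 random transcripts of 1,000 bases, ORFs of 20+ codons.
print(fraction_with_orf(n_transcripts=500, length=1000, min_codons=20, seed=1))
```

Raising `min_codons` shows how quickly long ORFs become rare in random sequence, which fits the observation that the putative products are short.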
Is this a reason for carrying a huge amount of junk DNA in your genome and making thousands of lncRNAs? Is the potential to make a new gene one million years in the future sufficient explanation for the preservation of junk DNA? The answer is "no."
You don't have junk DNA because it might prove useful in the future. You have it because you can't get rid of it. You don't transcribe your junk DNA because it might be useful; you transcribe it because the general properties of RNA polymerase and transcription factors don't allow for perfect discrimination between real genes and junk DNA. Junk transcripts aren't translated because they contain potential coding regions; they are sometimes translated because they must, by chance, contain some open reading frames.
Sloppiness might, by accident, lead to new genes, but that's not why things are sloppy. If having junk DNA were a clear advantage for future evolution, then the genomes of all extant lineages should have lots of junk DNA and should make lots of lncRNAs. Many of them don't.
Ruiz-Orera, J., Messeguer, X., Subirana, J.A., and Alba, M.M. (2014) Long non-coding RNAs as a source of new peptides. eLife 2014;3:e03523 [doi: 10.7554/eLife.03523]