This is an updated version of what's in your genome based on the latest data. The simple version is ...
about 90% of your genome is junk
The more sophisticated version is...
There are several ways of estimating the amount of functional DNA and the amount of junk DNA. All of them are approximations but they only differ by a few percent. Note that several categories overlap. For example, introns and pseudogenes contain substantial amounts of DNA derived from transposons. The total amount of transposon-related sequence is about 55% of the genome when you include this fraction.
Here's a list of DNA sequences that are known or presumed to have a function (i.e. they are not junk).
- functional parts of protein-coding genes (mostly coding regions): 0.9%
- functional parts of genes for likely noncoding RNAs: 0.6%
- regulatory sequences: 0.2%
- scaffold attachment regions (SARs): 0.3%
- origins of replication: 0.3%
- centromeres: 1%
- telomeres: 0.1%
- (functional virus sequences: 0.1%)
- (functional transposons: 0.1%)
- conserved sequences of unknown function: ~4.6% (maximum)
This adds up to about 8% of the genome. Note that there's considerable debate over the definition of function and how it applies to virus sequences and transposon sequences that are still intact. Those intact sequences qualify as junk DNA by my definition (DNA that can be deleted without affecting the survival of the organism), but I want to make it clear that all of the virus- and transposon-related sequences included in junk below are not intact and are thus clearly junk by any definition.
Here's a list of DNA sequences that are known or presumed to be junk DNA.
- pseudogenes: 5%
- introns (including 25% of transposon sequences): 43%
- additional defective transposon sequences: 30%
- defective virus sequences: 9%
- mitochondrial DNA: 0.01%
- extra repetitive DNA: 2%
The total amount of known or presumed junk DNA adds up to 89%. That leaves another 3% unaccounted for. Some of it could be nonconserved spacer DNA that's functional or it could be additional conserved sequences since the total amount of conserved DNA could be closer to 10% according to some studies. Or it could be junk DNA.
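If you want to check the arithmetic yourself, here is a minimal sketch that tallies the two lists above. The percentages are copied straight from the lists; the dictionary keys and variable names are just labels I made up for the example.

```python
# Percentages of the genome, copied from the two lists in the text.
# The keys are illustrative labels only.
functional = {
    "protein-coding genes (functional parts)": 0.9,
    "noncoding RNA genes (functional parts)": 0.6,
    "regulatory sequences": 0.2,
    "scaffold attachment regions (SARs)": 0.3,
    "origins of replication": 0.3,
    "centromeres": 1.0,
    "telomeres": 0.1,
    "functional virus sequences": 0.1,
    "functional transposons": 0.1,
    "conserved sequences of unknown function (maximum)": 4.6,
}

junk = {
    "pseudogenes": 5.0,
    "introns": 43.0,
    "additional defective transposon sequences": 30.0,
    "defective virus sequences": 9.0,
    "mitochondrial DNA": 0.01,
    "extra repetitive DNA": 2.0,
}

functional_total = sum(functional.values())
junk_total = sum(junk.values())

# Whatever is left over after both lists is the unaccounted fraction.
unaccounted = 100.0 - functional_total - junk_total

print(f"functional:  {functional_total:.1f}%")
print(f"junk:        {junk_total:.1f}%")
print(f"unaccounted: {unaccounted:.1f}%")
```

The totals come out to roughly 8%, 89%, and 3%, which is where the numbers in the text come from.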
Note that there are about 20,000 protein-coding genes and they take up about 39% of the genome (~1% exons, ~37% introns). We don't know exactly how many noncoding genes there are but a reasonable (and generous) estimate is 5,000. These genes take up an additional 7% of the genome (~1% exons, ~6% introns). (Much of the functional part of noncoding RNA genes consists of 300 copies of ribosomal RNA genes (0.4%).) The important point is that roughly 45% of the genome is genes when we define a gene as a DNA sequence that's transcribed. A lot of this is junk within introns.
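The gene numbers can be cross-checked against the junk list in the same way. This is only a rough sketch using the rounded figures quoted above; the variable names are mine, and the rounded values are approximations, so small mismatches of a percent or so are expected.

```python
# Rounded percentages of the genome quoted in the text.
protein_coding_total = 39   # all protein-coding genes
protein_coding_introns = 37
noncoding_total = 7         # all noncoding genes (generous estimate)
noncoding_introns = 6

# Genes, defined as transcribed DNA sequences.
gene_total = protein_coding_total + noncoding_total

# Intron sequence across both kinds of genes.
intron_total = protein_coding_introns + noncoding_introns

print(f"genes:   {gene_total}% of the genome")
print(f"introns: {intron_total}% of the genome")
print(f"share of gene sequence that is intron: {intron_total / gene_total:.0%}")
```

The intron total matches the 43% intron entry in the junk list, and the gene total lands at the "roughly 45%" quoted above.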
The figure below shows a region from the short arm of chromosome 12 (p13.31) in order to illustrate the gene density. A lot of people don't realize that almost half of our genome is genes.