Sandwalk: Segmental duplications in the human genome

Sunday, April 03, 2022

Segmental duplications in the human genome

The newly completed human genome sequence contains some previously unknown large duplications (segmental duplications).

This is my third post on the complete telomere-to-telomere sequence of the human genome in cell line CHM13 (T2T-CHM13). There were six papers in the April 1st edition of Science. My posts on all six papers are listed at the bottom of this post.

Segmental duplications (SDs) are large regions of the human genome (typically defined as longer than 1 kb with at least 90% sequence identity between copies) that have been duplicated, usually by recombination errors. Some of these duplications are ancient and may be shared with closely related species, but many are quite recent, giving rise to polymorphisms within the species. Some of us have certain duplicated regions and others don't. There are thousands of known SDs.

The standard reference genome (GRCh38) contains a number of segmental duplications, but most of us are missing some of those SDs and most of us have about 1000 extra ones that aren't in the reference genome. The assembly of the standard reference genome was complicated by the presence of SDs, so it isn't clear whether it represents a typical human genome. The new T2T-CHM13 sequence is assembled from very long reads and, furthermore, the DNA is essentially haploid, so it was easier to recognize SDs and other genomic rearrangements. (Most genome sequences are from diploid cells, and the two homologous chromosomes may differ in the locations of insertions and deletions, making it difficult to assemble a single complete genome that represents both copies.)

Vollger, M.R., Guitart, X., Dishuck, P.C., Mercuri, L., Harvey, W.T., Gershman, A., Diekhans, M., Sulovari, A., Munson, K.M., Lewis, A.M. et al. (2022) Segmental duplications and their variation in a complete human genome. Science 376:eabj6965. [doi: 10.1126/science.abj6965]

Despite their importance in disease and evolution, highly identical segmental duplications (SDs) are among the last regions of the human reference genome (GRCh38) to be fully sequenced. Using a complete telomere-to-telomere human genome (T2T-CHM13), we present a comprehensive view of human SD organization. SDs account for nearly one-third of the additional sequence, increasing the genome-wide estimate from 5.4 to 7.0% [218 million base pairs (Mbp)]. An analysis of 268 human genomes shows that 91% of the previously unresolved T2T-CHM13 SD sequence (68.3 Mbp) better represents human copy number variation. Comparing long-read assemblies from human (n = 12) and nonhuman primate (n = 5) genomes, we systematically reconstruct the evolution and structural haplotype diversity of biomedically relevant and duplicated genes. This analysis reveals patterns of structural heterozygosity and evolutionary differences in SD organization between humans and other primates.

The T2T-CHM13 genome contains 208 Mb of unique non-repetitive SDs (including an estimate of the Y chromosome sequence). There are significant SDs in the ribosomal RNA clusters, bringing the total amount of SDs in a typical human genome to about 7% of the total sequence. The fact that these SDs are polymorphic suggests a dynamic genome with frequent duplications and deletions that, to a first approximation, don't appear to have any effect on fitness.
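As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic behind the figures quoted in the abstract (218 Mbp of SDs making up 7.0% of the genome). The function and variable names are my own, and the calculation only shows what genome size those two figures imply. It is not how the authors computed their estimate.

```python
# Figures quoted in the abstract of Vollger et al.
SD_TOTAL_MBP = 218.0    # total SD sequence, in millions of base pairs
SD_PERCENT_NEW = 7.0    # genome-wide SD estimate using T2T-CHM13
SD_PERCENT_OLD = 5.4    # earlier genome-wide SD estimate


def implied_genome_size_mbp(sd_mbp: float, sd_percent: float) -> float:
    """Genome size (Mbp) implied by an SD total and the percentage it represents."""
    return sd_mbp / (sd_percent / 100.0)


# Hypothetical helper call: back out the genome size from the two quoted figures.
genome_mbp = implied_genome_size_mbp(SD_TOTAL_MBP, SD_PERCENT_NEW)
print(f"Implied genome size: {genome_mbp:.0f} Mbp")
print(f"Increase in SD share: {SD_PERCENT_NEW - SD_PERCENT_OLD:.1f} percentage points")
```

The implied genome size comes out at a little over 3.1 billion base pairs, which is what you expect for a complete human genome. Note that the earlier 5.4% estimate was made against a different, incomplete assembly, so the two percentages can't be converted into base pairs using the same denominator.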

Note that in addition to the SDs studied in this paper, there are unique (non-SD) regions of the genome that are missing in some individuals, suggesting that they are junk DNA. About 7% of the unique sequences in the entire genome can be deleted without noticeable effect, although no two humans differ by more than 1% in the unique regions (see Bergström et al., 2020).

Vollger et al. identified 33 new inversion polymorphisms, bringing the total number to 62 known inversion polymorphisms. These are regions of the genome that have been flipped, or inverted, relative to a standard reference genome. The fact that the inversions are present in some people but not others (i.e. polymorphic) suggests that they are innocuous.
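If it helps to picture what "inverted relative to a reference" means at the sequence level, here is a toy sketch. Because DNA is double-stranded, flipping a segment means that the reference strand now reads the reverse complement of the original segment. The function names and the 12-base example sequence are made up for illustration and have nothing to do with the methods used in the paper.

```python
# Map each base to its complement on the opposite strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (uppercase A, C, G, T)."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_region(reference: str, start: int, end: int) -> str:
    """Return the reference with the half-open interval [start, end) inverted."""
    if not 0 <= start <= end <= len(reference):
        raise ValueError("interval out of range")
    flipped = reverse_complement(reference[start:end])
    return reference[:start] + flipped + reference[end:]


# Toy example: flip the four bases in the middle of a made-up sequence.
reference = "AAAACCGTTTTT"
inverted = invert_region(reference, 4, 8)
print(reference)
print(inverted)
```

The flanking sequence is untouched and only the middle segment changes orientation. Inverting the same interval a second time restores the original sequence.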

Smaller duplications can also be polymorphic and sometimes they are associated with genes. This gives rise to copy number variation, and the T2T-CHM13 genome added quite a few extra examples, bringing the total number of known copy number variants to 1292. In terms of copy number variants, the T2T-CHM13 genome is closer to the typical genome than the standard reference genome. This is just one more bit of evidence showing that the T2T-CHM13 genome is a more faithful representation of a typical genome than GRCh38. (Most of the GRCh38 sequence is from an anonymous donor in Buffalo, New York.)

The authors are clearly interested in the functions of duplicated regions and they present the data with an adaptationist bias that tends to assume functionality. There is no mention of the possibility that much of the SD and copy number variation could be unrelated to function. This approach is similar to the other papers that seem to go out of their way to avoid any mention of junk DNA.

What do we do with two different human genome reference sequences?

Epigenetic markers in the last 8% of the human genome sequence

Segmental duplications in the human genome

Bergström, A., McCarthy, S.A., Hui, R., Almarri, M.A., Ayub, Q., Danecek, P., Chen, Y., Felkel, S., Hallast, P., Kamm, J. et al. (2020) Insights into human genetic variation and population history from 929 diverse genomes. Science 367:eaay5012. [doi: 10.1126/science.aay5012]
