The sequence of the human genome was announced on June 26, 2000, although the actual sequences weren't published until a year later. There were two sequences. One was the product of the International Human Genome Project, led by Francis Collins, who said,
"It is humbling for me and awe-inspiring to realize that we have caught the first glimpse of our own instruction book, previously known only to God."
The sequence was a composite of a number of individuals.
The second sequence was from Celera Genomics, led by Craig Venter. It was mostly his genome, making him the second being to know his own instruction book ... right after God.
It took another seven years to finish and publish the complete sequence of all of Craig Venter's chromosomes. The paper was published in PLoS Biology (Levy et al., 2007) and highlighted in a Nature News article: All about Craig: the first 'full' genome sequence.
What's unique about this genome sequence—other than the fact that it's God's, er, Craig Venter's—is that all 46 chromosomes were sequenced. In other words, enough data was generated to put together separate sequences of each pair of chromosomes. That produces some interesting data.
There were 4.1 million differences between homologous chromosomes (22 autosomes). 78% of these events were single nucleotide polymorphisms (SNPs). The rest, about 0.9 million events, were indels (insertions and deletions) and other non-SNP variants. Although far fewer in number, these events involved many more nucleotides than the SNPs did: they made up 74% of the total variant sequence.
In addition, there were 62 copy number variants (duplications) accounting for an additional 10 Mb of variation between the haploid sets of chromosomes. Adding up all the indels, SNPs, and duplications gives a total of 13.9 Mb of nucleotide differences. By this calculation, the two haploid genomes differ by about 0.5% (the total amount sequenced was 2,895 Mb).
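That last arithmetic is easy to check. Here's a quick sketch in Python using only the figures quoted above (nothing below comes from the paper beyond those numbers):

```python
# Back-of-the-envelope check using only the numbers quoted above.
total_events = 4.1e6      # variant events between the two haploid genomes
snp_fraction = 0.78       # fraction of events that are SNPs

snp_events = total_events * snp_fraction
non_snp_events = total_events - snp_events
print(f"SNP events:     {snp_events / 1e6:.1f} million")      # ~3.2 million
print(f"non-SNP events: {non_snp_events / 1e6:.1f} million")  # ~0.9 million

total_variant_mb = 13.9   # Mb of SNPs + indels + duplications combined
genome_mb = 2895          # Mb of sequence compared
print(f"haploid difference: {100 * total_variant_mb / genome_mb:.2f}%")  # ~0.48%
```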
When the two copies of all annotated genes were compared, it turned out that 44% were heterozygous—the two copies were not identical.
Craig Venter's genome sequence differs from the composite human reference genome at 4,118,889 positions. Most of these were already known as variants in the human population but 31% were new variants (in 2007).
Venter has written about his genome sequence in A Life Decoded. He has variants in his APOE gene that are associated with Alzheimer's and cardiovascular diseases. He has variants in his SORL1 gene that also put him at risk for Alzheimer's, according to 2007 data. Just about everyone who gets their genome sequenced will find variants that put them at greater risk for some genetic disease.
Massimo Pigliucci is an atheist who thinks that science and religion are compatible because they rule in different domains. He takes a very narrow view of "science"— one that excludes the work of historians and philosophers who are presumably using some other way of knowing. (He doesn't tell us what that is.)
I prefer the broad view of science as a way of knowing that relies on evidence, rational thinking, and healthy skepticism. This broad view of science is not universal—but it's not uncommon. In fact, Alan Sokal has defended this view on Massimo Pigliucci's own blog: [What is science and why should we care? — Part III]. According to this view, any attempt to gain knowledge should employ the scientific worldview. Historians and philosophers should follow this path if they hope to be successful. Pigliucci should know that there are different definitions of science, and any discussion of the compatibility of science and religion must take those differences into account.
We know quite a lot about the origin of new genes (Carvunis et al., 2012; Kaessmann, 2010; Long et al., 2003; Long et al., 2013; Näsvall et al., 2012; Neme and Tautz, 2013; Schlötterer, 2015; Tautz and Domazet-Lošo, 2011; Wu et al., 2011). Most of them are derived from gene duplication events and subsequent divergence. A smaller number are formed de novo from sequences that were not part of a gene in the ancestral species.
In spite of what you might have read in the popular literature, there are not a large number of newly formed genes in most species. Genes that appear to be unique to a single species are called "orphan" genes. When a genome is first sequenced there will always be a large number of potential orphan genes because the gene prediction software tilts toward false positives in order to minimize false negatives. Further investigation and annotation reduces the number of potential genes.
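As a toy illustration (my own sketch, not any real annotation pipeline), imagine a gene finder that calls every open reading frame (ORF) longer than some cutoff a gene. A permissive cutoff misses few real genes but floods the annotation with spurious "orphan" candidates, which is why later curation whittles the list down:

```python
import random

random.seed(1)

# Toy model: real genes tend to have long ORFs; random DNA yields short ORFs by chance.
real_gene_orfs = [random.randint(150, 2000) for _ in range(100)]  # lengths in nt
spurious_orfs = [random.randint(30, 600) for _ in range(10000)]   # chance ORFs in non-genic DNA

for cutoff in (100, 300, 500):
    real_called = sum(length >= cutoff for length in real_gene_orfs)
    spurious_called = sum(length >= cutoff for length in spurious_orfs)
    print(f"cutoff {cutoff} nt: {real_called}/100 real genes kept, "
          f"{spurious_called} spurious 'orphan' candidates")
```

Lowering the false-negative rate (missed genes) necessarily inflates the false-positive rate, so a freshly sequenced genome always starts out with an inflated orphan count.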
Thanks for reading the commentary on my university’s communication page, hastily written for brevity and digestibility by me and our science communication officer, Lawrence Goodman. I was originally hoping the piece could focus on my latest research, but it turned into this sort of general Q&A chat. The commentary was written rather quickly and meant for a general audience perusing Brandeis research, so it is obviously not a peer-reviewed scientific publication.
I am well aware of both your reputations as fiery critics and experts of evolutionary biology, and you have somewhat of a following on the internet. Some of your earlier blog posts have been entertaining and even on point regarding how big projects like ENCODE have over-hyped the functional proportions of our genomes. So, it does NOT surprise me one bit that I would become your latest vitriolic target in your posts here, and here.
Could I learn more from you two about evolutionary biology theory? Indeed, I could. Can we revise our Q&A commentary to be more scientifically accurate while still being digestible to a general audience? Perhaps, if we have the time and I survive my tenure review, we may do so and take your input into consideration. Why respond and risk another snarky post from you guys? I could care less about your trivial blog critiques when I’ve received plenty of grants and paper rejections that cut much deeper into my existence as a young academic struggling to survive when the academic track has never been more challenging (<10% grant success rates at NIH, NSF, CIHR, etc).
I’m responding to ask that both of you reflect on the message your posts are sending to students and postdocs. As a young scientist, having a chat with my university PR rep, I have to now think twice about two senior tenured professors slamming my scientific credibility on your internet soapbox without a single direct email to me. How passive-aggressive!
Your message makes academic science even less inviting to young scientists than it already is, with faculty positions and grants falling way short of demand, and with every young scientist already making tough sacrifices for the craft that we love. If we condone this type of sniping behavior, why would any young scientist want to learn from and discuss with the older scientists of your generation?
A direct email from you to me expressing your scientific concerns of our commentary would have been a better way to go. I am willing to stand corrected. Your blog posts, however, are disappointing and appear petty to me. Let’s all set a better example here for our trainees.
If you wish to post this response verbatim on your blogs, go ahead, since I had thought of posting this response on your blog’s comments section. But to follow my own advice, I’ll try a direct email to you first. And if I don’t hear back from you, I may then ask my friend Bjorn to help me post this on his blog.
This time it's Assistant Professor of Biology Nelson Lau. He studies Piwi proteins and piRNAs.
Lau was interviewed by Lawrence Goodman, a science communication officer at Brandeis University: DNA dumpster diving. The subject is junk DNA and you will be astonished at how ignorant Nelson Lau is about a subject that's supposed to be important in his work.
How does this happen? Aren't scientists supposed to be up-to-date on the scientific literature before they pass themselves off as experts? How can an Assistant Professor make such blatantly false and misleading statements about his own area of research expertise? Has he never encountered graduate students, post-docs, or mentors who would have corrected his misconceptions?
Here's the introduction to the interview,
Since the 1960s, it's largely been assumed that most of the DNA in the human genome was junk. It didn't encode proteins -- the main activity of our genes-- so it was assumed to serve no purpose. But Assistant Professor of Biology Nelson Lau is among a new generation of scientists questioning that hypothesis. His findings suggest we've been wrong about junk DNA and it may be time for a reappraisal. If we want to understand how our bodies work, we need to start picking through our genetic garbage.
BrandeisNow sat down with Lau to ask him about his research.
There's nothing wrong with being a "new generation" who questions the wisdom of their elders. That's what all scientists are supposed to do.
But there are certain standards that apply. The most important standard is that when you are challenging other experts you'd better be an expert yourself.
First off, what is junk DNA?
About two percent of our genome carries out functions we know about, things like building our bones or keeping the heart beating. What the rest of our DNA does is still a mystery. Twenty years ago, for want of a better term, some scientists decided to call it junk DNA.
Dan has already addressed this response but let me throw in my own two cents.
There was never, ever, a time when knowledgeable scientists said that all of the 98% of our DNA that isn't part of a gene was junk. Not today, not twenty years ago (1996), and not 45 years ago.
There has never been a time since the 1960s when all non-gene DNA was a mystery. It certainly isn't a mystery today. If you don't know this, then you'd better do some reading ... quickly. Google could be your friend, Prof. Lau; it will save you from further embarrassment. Search on "junk DNA" and read everything ... not just the entries that you agree with.
I added a bunch of links at the bottom of this post to help you out.
Is it really junk?
There’s two camps in the scientific community, one that believes it doesn’t do anything and another that believes it’s there for a purpose.
And you’re in the second camp?
Yes. It's true that sometimes organisms carry around excess DNA, but usually it is there for a purpose. Perhaps junk DNA has been coopted for a deeper purpose that we have yet to fully unravel.
It is possible that the extra DNA in our genome has an unknown deeper purpose but right now we have more than enough information to be confident that it's junk. You have to refute or discredit all the work that's been done in the past 40 years in order to be in the second camp.
Maybe when junk DNA moves to the right place in our DNA, this could cause better or faster evolution. Maybe when junk genes interacts with the non-junk ones, it causes a mutation to occur so humans can better adapt to changes in the environment.
Most of the undergraduates who took my course could easily refute that argument. I'm guessing that undergraduates in biology at Brandeis aren't as smart. Or maybe they're just too complacent to challenge a professor?
We've got a serious problem here folks. There are scientists being hired at respectable universities who aren't keeping up with the scientific literature in their own field. How does this happen? Are there newly hired biology professors who don't understand evolution?
Niu, D.K., and Jiang, L. (2012) Can ENCODE tell us how much junk DNA we carry in our genome? Biochemical and Biophysical Research Communications 430:1340-1343. [doi: 10.1016/j.bbrc.2012.12.074]
Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. (USA) published online March 11, 2013. [PubMed] [doi: 10.1073/pnas.1221376110]
Graur, D., Zheng, Y., Price, N., Azevedo, R.B., Zufall, R.A., and Elhaik, E. (2013) On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution published online: February 20, 2013 [doi: 10.1093/gbe/evt028]
Eddy, S.R. (2013) The ENCODE project: missteps overshadowing a success. Current Biology, 23:R259-R261. [doi: 10.1016/j.cub.2013.03.023]
Hurst, L.D. (2013) Open questions: A logic (or lack thereof) of genome organization. BMC Biology, 11:58. [doi: 10.1186/1741-7007-11-58]
Kellis, M., Wold, B., Snyder, M.P., Bernstein, B.E., Kundaje, A., Marinov, G.K., Ward, L.D., Birney, E., Crawford, G. E., and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) 111:6131-6138. [doi: 10.1073/pnas.1318948111]
Morange, M. (2014) Genome as a Multipurpose Structure Built by Evolution. Perspectives in Biology and Medicine, 57:162-171. [doi: 10.1353/pbm.2014.000]
I'm taking Ms. Sandwalk and hope to show her Down House. She loves old English houses.
She'll also be really excited to see Darwin's tomb in Westminster Abbey and tour the Natural History Museum. We'll make it a fun-filled week of science and evolution! Why don't you join us?
He brings up two points that are worth discussing.
What is a model organism?
There are two common definitions. Birney leans toward defining a model organism as one that models human biochemistry and physiology. This is a common definition. It emphasizes the meaning of "model" as "model of something."
Watch two medical educators from my Faculty of Medicine at the University of Toronto. They are being interviewed by Steve Paikin of The Agenda. They rightly deplore the traditional lecture style of learning that's common in my university, but their solution is more online learning.
The real problem with medical education is that much of the first two years is based on the "memorize and regurgitate" model that we know is ineffective. The best way to change the system is to use evidence-based methods that emphasize student-based learning. The idea is to teach medical students how to access information and how to interpret it rather than have them memorize facts. When teaching biochemistry, for example, it's pointless to ask medical students to take an exam based on structures and pathways that they will forget the day after the exam.
These two physicians are in charge of reforming medical education. They want to please the students by creating a new way of teaching that emphasizes the way "millennials" want to learn. (Short online courses, no lectures.) You'll watch the entire show without hearing any references to the pedagogical literature and what's known to work. Is there any evidence that undergraduate medical students are experts on medical education? (Hint: ... no.)
If this is the wave of the future, I fear that future doctors are not going to be any more informed than the current crop. They will still not be capable of critical thinking.
The way we teach needs to change, but not this way.
We've known for a long time that the most common mistake ID proponents make is assuming that there's only one solution to a problem. They see an end result, like a bacterial flagellum, or resistance to malaria, or the binding of two proteins, and assume that a few very specific mutations had to occur in a specific sequence in order to produce that result.
judmarc calls this the "lottery fallacy" and I think it's a good term [see lottery fallacy],
This is of course what I like to call the "lottery fallacy." It's used by virtually every ID proponent to produce erroneously inflated probabilities against evolution.
Lottery fallacy: The odds against any *particular individual* winning the PowerBall lottery are ~175 million to 1. But there were three winners just last night. That's because *someone* winning the PowerBall is not an especially rare occurrence. It happens every few weeks throughout the year.
In exactly the same way, Axe, Gauger, Behe, and the rest of the ID folks always base their math on the chances that a *particular* neutral or beneficial mutation will occur, and just as with the lottery, the chances of a *particular* outcome are utterly minuscule. The occurrence of *some* neutral or beneficial mutation, however, is, as with the lottery, so relatively common as to be completely unremarkable.
To summarize: ID proponents misuse probability math to make the common appear impossible.
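To put numbers on judmarc's point, here's a minimal sketch (the 175-million odds come from the comment above; the number of tickets sold is an assumed round figure for illustration):

```python
# Lottery fallacy in numbers: "this ticket wins" vs. "some ticket wins".
p_one_ticket = 1 / 175_000_000   # quoted odds for any particular ticket
tickets_sold = 200_000_000       # assumed round number for a big drawing

# P(at least one winner) = 1 - P(every single ticket loses)
p_someone_wins = 1 - (1 - p_one_ticket) ** tickets_sold

print(f"P(your ticket wins): {p_one_ticket:.1e}")    # ~5.7e-09
print(f"P(someone wins):     {p_someone_wins:.2f}")  # ~0.68
```

The ID calculations compute the first number and then present it as if it were the second.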
As it turns out, someone on Evolution News & Views (sic) just posted an excellent example of this fallacy [Intelligent Design on Target]. Here's what he/she/it says,
In his second major treatise on design theory, No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence, William Dembski discusses searches and targets. One of his main points is that the ability to reach a target in a vast space of possibilities is an indicator of design. A sufficiently complex target that satisfies an independent specification, he argues, creates a pattern that, when observed, satisfies the Design Filter. There are rigorous mathematical and logical proofs of this concept in the book, but at one point, he uses an illustration even a child can understand.
Consider the case of an archer. Suppose an archer stands fifty meters from a large wall with a bow and arrow in hand. The wall, let us say, is sufficiently large that the archer cannot help but hit it. Now suppose each time the archer shoots an arrow at the wall, the archer paints a target around the arrow so that the arrow sits squarely in the bull's-eye. What can be concluded from this scenario? Absolutely nothing about the archer's ability as an archer. Yes, a pattern is being matched; but it is a pattern fixed only after the arrow has been shot. The pattern is thus purely ad hoc. [No Free Lunch, pp. 9-10, emphasis added.]
Most people have experience with target shooting of some kind, whether with bows and arrows, guns (including squirt guns), snowballs, darts, or most sports like baseball, soccer, basketball, hockey, and football. Children laugh when they picture an archer who "couldn't even hit the broadside of a barn" and rushes up to the arrow and paints a bull's-eye around it. Grown-ups might compare that to a biologist looking at an irreducibly complex biological system and simply stating, "It evolved." In each of these cases, Dembski would say that since the pattern was not independently specified, therefore it is ad hoc.
Do you see the fallacy? Just because we observe a complex adaptation or structure does NOT mean that it was specified or pre-ordained. There are certainly many different structures that could have evolved—most of them we never see because they didn't happen. And when a particular result is observed it doesn't mean that there was only one pathway (target) to producing that structure.
To continue the analogy—at the risk of abusing it—there may be hundreds of targets in the woods and most of them have very large bullseyes. Imagine you're out for a walk in the woods and you see that almost every tree has a big target with a large bullseye. You find an arrow stuck at the edge of one of the bullseyes and lots of arrows stuck in the trees, the ground, and parts of most of the targets outside of the central bullseyes. Would you write a book about how good the archer must have been?
I've been trying to figure out why Intelligent Design Creationists are so excited about epigenetics. They seem to think it's going to overthrow everything we know about evolution (= "Darwinism"). That means, in their minds, that "naturalism" and "materialism" aren't sufficient to explain biology.
I'll just quote the relevant part and let you try and figure out whether Denyse represents mainstream Intelligent Design Creationism. 'Cause if she does, the movement is in far worse shape than even I imagined.
I remember one adoptive mother, taunted by a rebellious teenager who wanted to find her “real” mother, taking the girl by the shoulders and saying, “Look, I raised you from when you were seven days old; I supported you, sat with you in emergency rooms and juvenile court, laughed and cried with you, … and got you into a good school in the end. I don’t know who or where your birth mother is. But I do know this: I am the only ‘real mother’ you have ever had or ever will have. Look at me. Get used to it. It doesn’t GET better than this.”
I hope the kid smartened up. Meanwhile what if she discovers, when she has children, that their genome reflects in part traits she acquired growing up in the adoptive home? Maybe that would allay some of the sense of alienation.
Might epigenetics provide some basis for understanding? Time will tell.
See also: Epigenetic change: Lamarck, wake up, you’re wanted in the conference room!
Vincent Torley read a post by Jerry Coyne where Jerry wondered if Intelligent Design Creationism was in trouble because the Discovery Institute has lost Bill Dembski and Casey Luskin [Is the Discovery Institute falling apart?].
Torley disagrees, obviously, but he focuses on a couple of the scientific statements in Jerry Coyne's post and comes up with Two quick questions for Professor Coyne.
I hope Professor Coyne won't mind if I answer.
Before answering, let's take note of the fact that Vincent Torley has been convinced by the evidence that most of our genome is junk. I wonder how that will go over in the ID community?
If Behe & Snoke are correct, then modern evolutionary theory cannot explain the formation of new functions that require multiple mutations.
Casey Luskin is aware of the fact that this result has not been widely accepted. He mentions one specific criticism:
In 2008, Behe and Snoke's would-be critics tried to refute them in the journal Genetics, but found that to obtain only two specific mutations via Darwinian evolution "for humans with a much smaller effective population size, this type of change would take > 100 million years." The critics admitted this was "very unlikely to occur on a reasonable timescale."
He's referring to a paper by Durrett and Schmidt (2008). Those authors examined the situation where one transcription factor binding site is disrupted by mutation and another one nearby is created by a second mutation. The event requires two prespecified, coordinated mutations.
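To see why "prespecified" does all the work in that calculation, here's a minimal sketch with ballpark numbers of my own choosing (not Durrett and Schmidt's actual model): any one specific mutation is rare, but the total supply of new mutations is enormous.

```python
# Specific vs. "some" mutations: the same contrast as the lottery fallacy.
mu = 1e-8            # rough per-site, per-generation mutation rate (ballpark)
genome_sites = 3e9   # sites in a haploid human genome
pop_size = 10_000    # effective population size typical of such calculations

# Expected new mutations arising somewhere in the population each generation:
print(f"new mutations per generation: {2 * pop_size * mu * genome_sites:,.0f}")

# Expected new copies of one *particular* prespecified mutation per generation:
specific_per_gen = 2 * pop_size * mu
print(f"copies of one specific mutation per generation: {specific_per_gen:.4f}")
print(f"average wait for it to appear once: {1 / specific_per_gen:,.0f} generations")
```

Demanding a second specific mutation on top of the first, before the first is lost to drift, is what stretches the waiting time past 100 million years; demanding *some* useful combination of mutations does not.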