Monday, June 06, 2016

Can scientists describe what they're doing to a fifth grader?

I'm working on a review of "The Gene" by Siddhartha Mukherjee. It raises a huge number of issues about science writing and the conflict between producing a bestseller and educating the public about science.

As part of the research for that blog post I've been reading all the reviews of his book and I came across an interview with Mukherjee on the Smithsonian website [Siddhartha Mukherjee Follows Up Biography of Cancer With “An Intimate History” of Genetics].

Here's an interesting answer to an important question ...
Those quotations also humanize the topics, which in The Gene, often have names that might intimidate a casual reader: transgenic, mitochondrial lineages. Family history and historical narratives bring the abstract science of genetics to life, as well. How do you balance the science with the narrative?

Readers are never casual. They come into books extremely informed. Just like you and I can sit in a musical performance, and while we may not be musicians ourselves, we can detect a false note immediately. I think readers detect false notes very quickly. I believe that we are hungry for this information. We need to be able to have a language that is not simplistic but is clear enough, simple enough.

I like this quote from one of my mentors: "If you can't describe what you're doing in science to a fifth grader using language that is easily understandable, it's probably not worth the effort of what you're doing." Even if you're working in string theory, you can basically describe why you're doing what you're doing, what the basic method is, and why it's important. You may not be able to get to all the details, but I think striking the right balance is important.
The scary part about the question is that even on the Smithsonian.com website the questioner thinks that terms such as "transgenic" and "mitochondrial" are intimidating. That's what we're up against if we try to write for the general audience. Apparently, the only way to bring genetics to life is to avoid the science and concentrate on history and family stories!1

Mukherjee's answers are interesting. In the first paragraph he claims that his readers are extremely informed. He claims they can detect "false notes" very quickly, but we know this isn't true. Mukherjee's book is full of false notes, especially his writing about epigenetics. The fact that the book is #1 on the bestseller list tells us that his readers are NOT extremely informed. They are gullible and easily swayed by rhetoric and style.

The second paragraph is also informative. I don't think you can dumb down science to the fifth grade level. If I'm interested in how much of our genome is junk, for example, it's going to be very difficult to explain this problem to a fifth grader in any meaningful way. Mukherjee is being extremely naive if he thinks he can accomplish this goal. He certainly didn't do it in his book—you can read the entire book (592 pages) and still not be able to correctly explain what a gene is. (The title of the book is "The Gene.")

We know that educating the general public about science is a problem. A fifth grade level knowledge of evolution, for example, is not enough to make informed decisions in the modern world. Almost all adults in our society are graduates of high school. In an ideal world we should be able to explain what we're doing to adults at the high school level. If we do a good job, we should be able to elevate them a bit above high school level as we explain our field.

We all know that's not going to work, probably because the average high school graduate is nowhere near a high school level of education in the sciences.

Can you explain what you're interested in to a fifth grader in any meaningful way? I bet you can't if you're working at the frontiers of knowledge.2


1. Mukherjee writes about schizophrenia and bipolar disease in his own family.

2. Creationists can do it. That's because they're already at the fifth grade level.

193 comments :

unknowing said...

I think "critically" or "nuanced" might be better word choices than "meaningfully," at least with regard to the sort of "meaning" implied by the Mukherjee quote. I imagine the majority of scientists are quite adept at painting the "meaning" of their work in such broad strokes (the infamous "elevator speech"). However, the honest scientist (and hopefully, science writer) would accurately recognize such a description as marketing. The meaning you imply, I feel, is that of a critical understanding, which necessarily requires strong foundations in the discipline.
I absolutely concur that the confounding of the two is common and problematic. I think it's dishonest of Mukherjee to suggest that simplistic, conversational explanations provide any depth of understanding. It's a great selling point though, I'll admit, convincing readers they can grok a complex, technical subject with minimal effort by reading an edutainment piece. However, to be fair to both author and readers, I think it's quite difficult to appreciate the depth of knowledge required for a critical understanding of a field if one has not themselves explored such frontiers or had formal scientific training.

Beaglelady said...

Very disappointing...I liked his book about cancer. And PBS made it into a 3-part series.

Jmac said...

Can anybody detect where this one of the most accomplished scientists in cancer research and experimental oncology has stepped on Larry's toe?

I have not read anything of the above yet, but I think I can pretty much guess where it has been at and how Larry is going to handle it...

Robert Byers said...

I agree you can't explain the frontiers of science to a grade school kid. Nor high school.
One must be already knowledgeable about the subject and that must cross a standard worthy of how complicated the subject is.
Settled knowledge possibly can be put in just the right words for quick understanding. Not genetics though.
Certainly not anything that has contention or is still figuring things out.

Why are science subjects considered complicated and get rewards for new discoveries if it's that easy to understand after a few paragraphs for the public?
Science has a right to claim its doing complicated things and the public must have a foundation in basics to keep up with the most articulate on some subject.

DGA said...

Eric, many other professionals in the field of genetics and biochemistry have already written critiques of this book and an excerpt that was published in the New Yorker. These have been highly unfavourable; Larry is just the latest. But you don't know that because you don't keep up, and you shoot your mouth off out of a Dunning-Kruger-like ignorance.

jb said...

terms such as "transgenic" and "mitochondrial" are intimidating. ... Apparently, the only way to bring genetics to life is to avoid the science and concentrate on history and family stories!

There is an important distinction between technical scientific words and sophisticated scientific concepts. Scientific jargon is shorthand that facilitates progress by experts but poses barriers for newcomers. Popular science writing can describe complex topics without jargon.

Probably the most extreme example is Randall Munroe's book, "Thing Explainer", where he describes science and technology using only the ten hundred most common words. There is a page on "Tiny Bags of Water You're Made Of" that talks about "little animals (not really animals) that got stuck in our bags of water a long time ago, like the green things in leaves" and "the control area" that "holds information about how to make the different parts of your body." Overall, it's very creative, great fun to read, and I learned some things. I hope they're right.

judmarc said...

I have not read anything of the above yet

And that, as Keats would say, is all ye need to know.

judmarc said...

I do want to disagree slightly, not with the specifics of the comment, but with the general proposition that it is impossible to explain things simply to a non-expert. After all, don't many of the great scientific advances (including Darwin's theory of the origin of species) render a relatively simplified, unifying clarity from what was apparently complex and confused?

As a fifth grader, I was certainly neither confused nor intimidated by words like "mitochondria." And I can tell you that a mere one year after being a dumb grade schooler only equipped to learn "valence theory" in senior chemistry class, I somehow had no problem learning the little bit of quantum mechanics necessary to explain electron orbitals in freshman physical chemistry in college. It was a hell of a lot more interesting and fun than valence theory.

Yes, the average fifth grader these days is going to have a very tough time with many advanced scientific concepts. To my mind, that says far more about how science is presented to kids than it does about the kids' innate curiosity and ability to learn.

Jonathan Badger said...

"Can you explain what you're interested in to a fifth grader in any meaningful way? I bet you can't if you're working at the frontiers of knowledge"

It's funny how physicists always like to say how what they are working on isn't that complicated and that children could understand what they do but biologists are the other way around. Some of it is false modesty on the part of physicists, I would imagine, but an awful lot of the "complexity" of biology is terminology rather than concepts.

Ted said...

As a physicist, I would agree. I particularly hate how cladistics became an opportunity to introduce a slew of Greek polysyllables instead of simple and clear phrases like "shared derived." Once the textbooks forced the younger generation to jump through these terminological hoops, they became enshrined forever.

Mathematicians are the field leaders in the subtle concepts biz, and they have the self confidence to adopt a relaxed attitude towards terminology. Compare biologists' fussing over "non-avian dinosaurs" with the mathematical common sense approach of just saying "by an abuse of language, we will agree to say...."

John Harshman said...

Have to disagree. "Synapomorphy" has fewer syllables than "shared, derived character state". It's convenient to have a single word for a lot of things you could express as a phrase. And what's wrong with "non-avian dinosaurs"? These are simple terms for easily described concepts, and a certain amount of new vocabulary is essential if you're going to talk about any technical field without cumbersome circumlocutions.

I think I could explain what I do to a fifth-grader, but it would take a little time. Maybe I'm not on the frontiers of knowledge. And I think a Higgs boson would be harder to explain.

Jonathan Badger said...

Jargon is fine when discussing in private -- the problem is when simple concepts are clothed in complex terminology in order to impress outsiders. Personally, I'd say terms like "synapomorphy" and "homoplasy" aren't very useful even among scientists -- I find that people who tend to use them are typically people from the Willi Hennig school rather than more reasonable statistical types.

John Harshman said...

I find that pretty much all systematists use those terms, unless that's what you mean by "Willi Hennig school". I'm not sure what you would substitute for "homoplasy" that wouldn't be cumbersome.

Jmac said...

Are any of you doctors? How about oncology specialists?
What makes you specialists over Sid?

Ted said...

Convergence?

Jonathan Badger said...

I don't think very many *molecular* systematists use the terms. If I'm making a tree of thousands of bacteria from an alignment of a thousand nucleotides or more, I'm hardly going to worry about whether a particular nucleotide evolved convergently or not. And if I did, "convergent evolution" explains the concept nicely.

John Harshman said...

I'm a molecular systematist, and the molecular systematists I know use those terms, though not often in reference to single nucleotides, since nobody generally thinks about them one at a time. People often use the terms in reference to indels and rarer events like retroelement insertions, though. "Convergence" doesn't have quite the same meaning.

The Lorax said...

Yes I can definitely describe my work to a fifth grader as well as a senior in high school and a first year graduate student. I can also discuss my research to my family members who are not college educated as well as the bar I like to frequent. If you can't do this, I suggest you are not a good science communicator nor are you well enough versed in your area of expertise to be able to discuss your work at an appropriate level.

Robert Byers said...

I think biology is the most complicated subject in the material world aside from human thought.
Physics has a prestige of being complicated but it surely doesn't come close.
That's why so few people, so early, figured things out in physics. Newton, Einstein etc.
This because it was not that hard.
Benjamin Franklin, youtube bio, predicted in the late 1700's or so how there was coming a great revolution in intelligence and how diseases and other stuff would be figured out. He was wrong. Biology was too complicated and still we don't heal ourselves much. Yet it was the other stuff that advanced greatly.
Biology in concepts, never mind origins, is more than memorizing details. Physics etc is easily memorized to explain raw structures in the universe.
It's not just terms. Biology investigation is still entry level. Physics is almost done or something.
Origin biology surely is contentious because it's not been seen or can be seen in action and so is studied by other ways. Unlike physics which quickly demonstrates who is right and so walls direct future investigation.

Deuterostome said...

Physics has never had a better "explainer" than Richard Feynman, but watch him tell an interviewer why he can't explain magnetic attraction "in terms of anything else that's familiar to you": https://www.youtube.com/watch?v=MO0r930Sn_8




Larry Moran said...

Please use your extraordinary skills to explain evolution to the average American.

Unknown said...

Let's see. An oncology specialist has just written a book about genetics. And geneticists have issues with it. How are they specialists in genetics compared to an oncologist? Nah, you got me there.

Unknown said...

People often use the terms in reference to indels and rarer events like retroelement insertions, though.

Because these actually reasonably fulfill the criteria for apomorphies. The main reason nobody uses them for single nucleotides is that the term would be incorrect - a key requirement for an apomorphy is that it arises at a very low rate and single nucleotide substitutions occur at a couple of OOMs faster.

I particularly hate how cladistics became an opportunity to introduce a slew of Greek polysyllables instead of simple and clear phrases like "shared derived."

The key to phylogenetic systematics is that it clarified homology. And that it made systematics exclusively about descent. Both are important. The fact is that the original terminology has been ground down in common usage: Synautapomorphy became synapomorphy and often just apomorphy. And as John has pointed out, that's simply shorter than giving what amounts to a definition of the term. That's what jargon is after all, a way to reference concepts without having to reiterate them. As a physicist, biological jargon feels more like an obstacle to you than physicist jargon, which you have simply picked up. But I notice physicists will use a term like gamma-radiation, rather than "oscillating electromagnetic field with a frequency above 10^19 Hz". And that's one most scientifically literate people will know. My brother's a physicist and I'll generally start to get lost halfway through the abstract when I read his papers. The same is true in reverse.

As for "non-avian Dinosaurs" - this is relevant because it's something different than Dinosauria and you can't simply ignore this difference. "non-avian dinosaurs" are not a monophylum. There is an issue here, because there are a lot of clades and we are now getting to a point where databases of taxa are generated by data mining software. If somebody uses a paraphyletic group as if it was monophyletic in a publication that will end up in databases and cause problems. In comparison something like the periodic table has a small number of elements and therefore can be managed by hand. But imagine if there were millions of elements and because some chemists didn't care the digitally created periodic table had earth, fire, water, air and the element of surprise in it.

Now if you are working with databases of taxa you have to manually check whether somebody used a non-clade like a clade. You have a couple of 1000 database entries and for each you have to get the original publication to check if they were being a bit imprecise with their terminology. That's as boring as it sounds. So whenever I hear people complaining that cladists don't want them to use some "perfectly reasonable" term, like "Fish" for "Chordata with the exception of Tetrapods" I get somewhat annoyed. I know that's not making me popular and I've heard "but it's totally clear what I *mean*" often enough to last a lifetime (but I'm absolutely sure I'm going to hear it a lot more). But it's not clear to that data mining software. And while yes, I completely understand what you mean after I had a look at your paper, it's going to be the 100th paper I'm going through on that day and one of the 20% or so garbage that is making my work harder.
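The manual check described above (is a named group actually a clade?) can be sketched in a few lines. This is only an illustration, not the commenter's actual workflow: the nested-tuple tree, the six-taxon sample, and the function names are all assumptions made for the example.

```python
# Hypothetical, heavily simplified chordate tree written as nested tuples.
TREE = ("lamprey", ("shark", ("trout", ("lungfish", ("frog", "lizard")))))


def leaf_set(tree):
    """Return the set of taxon names found below this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree):
    """Yield the leaf set of every subtree, i.e. every clade in the tree."""
    yield leaf_set(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, group):
    """A group is monophyletic if it matches the leaf set of some subtree."""
    return frozenset(group) in set(clades(tree))


TETRAPODS = {"frog", "lizard"}
FISH = {"lamprey", "shark", "trout", "lungfish"}  # chordates minus tetrapods

tetrapods_ok = is_monophyletic(TREE, TETRAPODS)
fish_ok = is_monophyletic(TREE, FISH)
```

Under these assumptions `tetrapods_ok` is true and `fish_ok` is false: no subtree contains exactly the "fish", which is what makes the group paraphyletic and why a database that treats it as a clade needs correcting.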

John Harshman said...

The main reason nobody uses them for single nucleotides is that the term would be incorrect - a key requirement for an apomorphy is that it arises at a very low rate and single nucleotide substitutions occur at a couple of OOMs faster.

I wouldn't agree with that at all. I don't think there's any such requirement for an apomorphy, which is nothing more than a derived character state. I suppose it would be good if that state were fixed within a species, but I wouldn't say even that was a requirement. Anyway, in some taxa at least, transversions happen at a rate not dissimilar to that of one-base indels. Your "OOM" idea would apply to truly rare events like big indels and retroelement insertions, but I still don't think that's relevant to the meaning of "apomorphy". Most often, of course, the term is applied to morphological characters. Is there a reason Jonathan wants to ignore morphology?

Ted said...

if you are working with databases of taxa you have to manually check whether somebody used a non-clade like a clade

Ouch! I deeply sympathize, but I fail to see how this relates to terminology. Why can't the data mining software search for "shared derived" just as easily as synautapomorphy/synapomorphy/apomorphy? And as for the cladistically correct phrase
"Chordata with the exception of Tetrapods" surely the data mining software can't reliably find the innumerable variants such as "Chordata except Tetrapods" or "non-tetrapod chordates?"

Joe Felsenstein said...

It could also be argued that Greek-rooted terms like "synapomorphy" are less biased toward English-speakers.

One addendum to what people are saying about using groups in the classification. My own position on the classification system is the IDMVM position -- the "it doesn't matter very much" position, to which I seem to be the only adherent. If we have estimates of the phylogeny, we can do our analysis of sequences using those, and do not really need to use the higher-order taxa in a classification system. It then doesn't matter whether or not they are monophyletic groups.

I've been pointing this out for 15-20 years now. But I think I am still the only fan of the IDMVM view.

The Lorax said...

Do you really think the issue about evolution and the average American is that it is not explained well? If you ask most Americans what biologists think about evolution, they can tell you in general terms about common descent and change over time. The problem is if you ask them what they think, then you get to differences in belief systems. The problem is not one of dissemination of knowledge.

Jonathan Badger said...

Arguably Unifrac analyses in microbiome studies follow your IDMVM position.

Larry Moran said...

It's hard to explain what you're doing to a fifth grader if he/she has a belief system that rejects everything you say. You said that if I don't succeed then I'm not a good science communicator or I'm not well enough versed in my area of expertise.

I suspect that Mukherjee has very little experience trying to get the average American to understand what a gene is, how it works, and how it evolved.

judmarc said...

And I think a Higgs boson would be harder to explain.

It makes stuff heavy.

John Harshman said...

It makes stuff heavy.

And you consider that to be an explanation?

Unknown said...

It makes stuff heavy.
Not even true at that level of "explanation".

The Higgs field is around all the time, and makes some stuff massive. The boson is a rare occurrence that is evidence for the field.
No, that's not an explanation either.

The Lorax said...

"You said that if I don't succeed then I'm not a good science communicator or I'm not well enough versed in my area of expertise." Fair point, and poor communication on my part. I was assuming the person(s) in question legitimately wanted to know what you worked on/studied. And I also assume that what is explained to a 5th grader/high school student, etc is not an actual description of the specific research you do, but more of a 50,000 foot explanation. i.e. 'All of the animals you know shared a common ancestor millions of years ago. I'm studying how different animals, like cats and dogs, came to be.' might be a good starting point in talking to a 3rd grader.

judmarc said...

And you consider that to be an explanation?

Like, really heavy, man.

With two people feeling the need to weigh in with serious criticism of my witticism, obviously humor impairment is rampant. I didn't think I actually had to use a "winkie" to get the point across, but will know better in the future.

judmarc said...

The boson is a rare occurrence that is evidence for the field.

As long as we're being pedantic, the boson is not a rare occurrence, our ability to produce the conditions under which we can observe it is.

Bill Cole said...

The difficulty of a simple explanation may be due to the maturity level of the science and how abstract the explanation is. Newton's law of gravity is mature, with a simple mathematical model followed by direct experimental evidence. Einstein's general relativity is more conceptual (curved space-time) and has a very complex mathematical model but has direct experimental evidence.
Evolutionary theory that we all share a common ancestor is a simpler concept than general relativity but lacks direct experimental evidence and a mathematical model. The fifth grader may grasp the concept but then lose interest later in life when the how questions become fuzzy due to the lack of a directly testable mechanism that explains life's connection.

Unknown said...

I don't think there's any such requirement for an apomorphy, which is nothing more than a derived character state.

The problem here is that you can't really distinguish between apomorphies, plesiomorphies and homoplasies when it comes to single nucleotides. And I'd argue that you shouldn't call something an apomorphy unless you are pretty certain that it is one.
Let's say we have 3 taxa with this phylogeny ((A,B),C) and at a particular position A and B have T while C has G. There are several possibilities here:
- G plesiomorphic, T arises on the stemline to (A,B) and thus is a synapomorphy
- T plesiomorphic, G arises on the stemline to C
- G plesiomorphic, T arose independently on the stemlines to A and B.
And that's ignoring cases where multiple changes happen in a single stemline or where neither T nor G were plesiomorphic states.
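The three scenarios listed above can be checked with a small parsimony sketch. It uses Fitch's bottom-up pass on the rooted tree ((A,B),C) and then counts changes for each scenario explicitly. The tree encoding and function names are assumptions made for illustration, and the sketch ignores any outgroup, as the example in the comment does.

```python
# Rooted tree ((A,B),C) as nested tuples, with the observed nucleotide per taxon.
TREE = (("A", "B"), "C")
STATES = {"A": "T", "B": "T", "C": "G"}


def fitch(tree, states):
    """Fitch bottom-up pass: return (possible states at this node, minimum changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: take the union and charge one change.
    return left_set | right_set, left_cost + right_cost + 1


def count_changes(root_state, ab_state, states):
    """Count changes on ((A,B),C) for fixed states at the root and at the (A,B) ancestor."""
    edges = [
        (root_state, ab_state),     # root -> ancestor of A and B
        (ab_state, states["A"]),    # ancestor -> A
        (ab_state, states["B"]),    # ancestor -> B
        (root_state, states["C"]),  # root -> C
    ]
    return sum(parent != child for parent, child in edges)


root_states, min_changes = fitch(TREE, STATES)

scenario_costs = {
    "G ancestral, T arises on stem of (A,B)": count_changes("G", "T", STATES),
    "T ancestral, G arises on stem of C": count_changes("T", "T", STATES),
    "G ancestral, T arises separately in A and B": count_changes("G", "G", STATES),
}
```

With these inputs the Fitch pass leaves the root ambiguous between G and T at a cost of one change. The first two scenarios each cost one change and so cannot be told apart by parsimony on the ingroup alone, while the third costs two.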

That's IMO the main reason to use ML or Bayesian approaches in molecular systematics, because MP makes the implicit assumption that we can mostly ignore these cases (and if you recover a lot of homoplasy in a morphological MP tree, it usually points to issues with your character coding).

Is there a reason Jonathan wants to ignore morphology?

IDK.

And as for the cladistically correct phrase
"Chordata with the exception of Tetrapods" surely the data mining software can't reliably find the innumerable variants such as "Chordata except Tetrapods" or "non-tetrapod chordates?"


No, but in these cases it would just find "Chordata" and possibly "Tetrapoda", both of which are clades.

. If we have estimates of the phylogeny, we can do our analysis of sequences using those, and do not really need to use the higher-order taxa in a classification system. It then doesn't matter whether or not they are monophyletic groups.

Well, the idea of phylogenetic systematics was that taxon names are shorthands for phylogeny. If you say Tetraconata, you say that you think Insects are sister to Crustacea, while Pancrustacea says Insects are part of the Crustacea. It makes things easier to reference in text or conversation (you can put "We find that X is paraphyletic" in an abstract, but you can't put your phylogenetic reconstruction in it. That's rather helpful if somebody is doing a literature search as well). Here's what not naming clades can result in:
node 125 of figure 1 in (2) (From Tong et al. 2015, Comment on “Phylogenomics resolves the timing and pattern of insect evolution”, Science, 349:487).
Or
clade containing Orthoptera and Blattodea (Gunkel et al. in prep)
That's all rather inelegant and it'd be great if that clade (which is rather well supported) had a name.

judmarc said...

lacks direct experimental evidence and a mathematical model

It has both, as you've been repeatedly informed. But this gets back to what Larry said above about the 5th grader who has a belief system that requires denying these facts.

John Harshman said...

Simon,

One major problem with your scenario is that parsimony requires at least 4 taxa, while your example has only 3. While it's true that the most parsimonious optimization is not necessarily correct, that applies to all characters, not just nucleotides. Nor does parsimony assume that multiple hits are rare; it assumes instead that the probability of a character changing on any single branch is low but makes no such assumptions about the entire tree. That parsimony ignores homoplasy is not a reason to use ML at all, since it doesn't. The main reason is that parsimony is inconsistent under conditions we have reason to believe are commonly encountered, while ML, given a good enough model, is not.

Nor does lots of homoplasy say anything about your character coding. What it says is that there's lots of homoplasy.

Jonathan Badger said...

John --

"Most often, of course, the term is applied to morphological characters. Is there a reason Jonathan wants to ignore morphology?"

How do you create a meaningful statistical model of morphological characters? That's kind of the reason why the Hennig cladists stuck to parsimony and their morphological characters when the rest of us moved to molecular ML trees. Besides, molecular systematics has shown us that relying on morphology is often misleading (in the 1990s I remember people still insisting that the fact that molecular trees weren't congruent with morphology meant molecular phylogeny was "wrong", but I don't see that argument used much anymore).

John Harshman said...

The Lewis model is good enough for many purposes. And you do understand that while molecular data may be great for phylogenetic analysis of extant taxa, morphology is still very useful for a great many purposes, right? It's all we have for extinct taxa beyond a few hundred thousand years, and it's crucial for mapping onto trees (usually molecular trees) in an attempt to understand morphological evolution, which some people still find interesting. So I fail to see why you think any term used in morphological systematics is useless.

Joe Felsenstein said...

Bill Cole: You asserted that there is a "lack of a directly testable mechanism that explains life's connection".

Actually such a mechanism has been observed. It is called reproduction. Parents have been observed to have offspring, and this seems to happen in all known species.

Robert Byers said...

The Lorax.
Oh no this is not true.
You can't explain biology to children or teenagers unless they are personally interested and apply themselves.
If it's just a few simple paragraphs that you think explains biology then that's the point.
Biology is not basic conclusions. That's why evolutionism is not persuasive to any already hostile audience. Just saying we all come from the same fish is saying nothing about how and HOW even possible.
If biology is so easy then why not create it out of raw materials! Or heal everybody!
Kids just memorize very raw data on biology.
Knowing that the blood circulates and its motor and where is irrelevant to the biological structure of blood and unknown things about it.
Either there is a complexity to biology or there is not.
The same with physics. Yet physics is childsplay compared to biology.
There must be a curve on the graph of complexity relative to age and basic understanding about any science.

Creationists these days lead in saying the origin of biology is not simple.
Saying selection on mutations and friends is just raw data claims.
Indeed it must be proven by getting into the details. Its complicated.
So much so error crept in too much and more.

Rolf Aalberg said...

Unlearning is very difficult for a person who's been indoctrinated in early childhood. That's why they are before it is too late. Santa Claus is the only exception, for obvious reasons.

Proverbs 22:6: Train up a child in the way he should go: and when he is old, he will not depart from it.

Unknown said...

One major problem with your scenario is that parsimony requires at least 4 taxa, while your example has only 3

The tree I gave is rooted. The reason parsimony (or any other method) requires 4 taxa is that there is only 1 unrooted tree for 3 taxa, but there are 3 rooted trees for 3 taxa (the same number as for 4 taxa). By rooting the tree I've given the implicit inclusion of some outgroup, but only treated the character state for the ingroup. This did save me some typing, but I don't think it should be a conceptual problem (it's also worth noting that Joe's seminal 1978 paper also discusses a rooted 3 taxon phylogeny first).
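The tree counts quoted above follow from the standard formulas for fully resolved (bifurcating) trees with labeled tips: (2n-5)!! unrooted trees and (2n-3)!! rooted trees for n taxa. A short sketch, with function names chosen only for this example:

```python
def double_factorial(k):
    """Product k * (k-2) * (k-4) * ... down to 1 or 2; returns 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def unrooted_trees(n):
    """Number of unrooted bifurcating trees on n labeled taxa (n >= 3)."""
    if n < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return double_factorial(2 * n - 5)


def rooted_trees(n):
    """Number of rooted bifurcating trees on n labeled taxa (n >= 2)."""
    if n < 2:
        raise ValueError("need at least 2 taxa for a rooted tree")
    return double_factorial(2 * n - 3)


counts = {
    "unrooted, 3 taxa": unrooted_trees(3),
    "rooted, 3 taxa": rooted_trees(3),
    "unrooted, 4 taxa": unrooted_trees(4),
}
```

This gives 1 unrooted tree for 3 taxa and 3 rooted trees for 3 taxa, the same as the 3 unrooted trees for 4 taxa. In general the rooted count for n taxa equals the unrooted count for n+1, which is the sense in which rooting adds an implicit outgroup.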

The main reason is that parsimony is inconsistent under conditions we have reason to believe are commonly encountered

That's not correct. The reason for LBA is homoplasy, if you do not allow convergence or reversal then LBA doesn't happen. It's also worth noting that ML models can also be prone to LBA, it's not a problem native to parsimony.

Nor does lots of homoplasy say anything about your character coding. What it says is that there's lots of homoplasy.

Given an optimal character coding there is no such thing as homoplasy in morphological data sets. If two superficially identical traits evolve in two different species, then there are differences between these traits when not treated superficially. And "clean" reversals don't happen either. The solution to LBA in morphological studies is to increase the resolution of your characters, which removes homoplasy and leads to MP producing consistent results. In molecular trees you always have homoplasy, which is why MP is generally not the way to go.

Which gets me to
How do you create a meaningful statistical model of morphological characters? That's kind of the reason why the Hennig cladists stuck to parsimony and their morphological characters when the rest of us moved to molecular ML trees.

Nope. As noted above MP is actually optimal for a morphological phylogeny given a detailed enough coding of characters. You need more sophisticated statistical models when dealing with molecular data, because you will always have a high degree of homoplasy, simply due to the fact that there is a finite character space and reversals have a positive probability.

in the 1990s I remember people still insisting that the fact that molecular trees weren't congruent with morphology meant molecular phylogeny was "wrong", but I don't see that argument used much anymore

There's a simple reason for this: Molecular phylogeny has gotten better. There were molecular phylogenies with bad taxon sampling, using single genes, sometimes not even single-copy ones, so they'd have paralogs in their alignments. You wouldn't get a lot of these published today, because as the tools available to molecular systematists have become more refined, the standards have risen. And in a lot of cases modern molecular phylogenies support clades that have been proposed by morphological systematists and where the two are at odds it's worth trying to figure out why that is the case (sometimes there are issues with the molecular data, sometimes it turns out that looking at a trait that was a supposed synapomorphy for a clade not supported by the molecular data in more detail shows differences in development or microstructure).

The relationship between morphologists and molecular biologists is not an antagonistic one (I'm stopping to write this post to meet with a morphologist, a molecular biologist and two fellow paleontologists to discuss the placement of some fossils in a molecular phylogeny for the purpose of calibrating a relaxed molecular clock dating scheme).

judmarc said...

Actually such a mechanism has been observed. It is called reproduction.

Perhaps Bill feels that observing reproduction is immoral.

John Harshman said...

A rooted 3-taxon tree is really a 4-taxon tree; you're just ignoring the outgroup used to root it. Your 3-taxon tree assumes we know nothing about the states at the root, which means that, functionally, it's unrooted. That means that two of your three scenarios are identical, which is why parsimony can't tell them apart.

Yes, the reason for LBA is homoplasy, but not just homoplasy: way too much homoplasy. And yes, ML can be inconsistent too. That's why I said (which you snipped) "given a good enough model".

Your idea of an "optimal" morphological character set is pure fantasy. If such things existed, parsimony would be unnecessary. You'd just assemble a tree by making a node for each character. In the real world, homoplasy is not generally detectable by inspection. Homoplastic characters differ in detail, but so do homologous ones. Evolution is always changing this or that. Errors in coding are unavoidable, and parsimony attempts to discern these errors by comparisons among characters. While it's true that homoplasy in sequence characters is undetectable by inspection, even in principle, morphological characters are not nearly as different in this regard as you suppose. There have been studies on the prevalence of homoplasy in morphological trees, and it's pretty high. You may attribute this to poor character coding, but I doubt you could do better. Morphology may be theoretically unlimited, but in practice there seem to be a limited number of states per character.

And there are indeed likelihood models for morphological characters, e.g. the one I have already mentioned: the Lewis version of Jukes-Cantor.

Bill Cole said...

Joe Felsenstein:
"Actually such a mechanism has been observed. It is called reproduction. Parents have been observed to have offspring, and this seems to happen in all known species."

Yes, I understand your point here. If we could demonstrate this process evolving kind b and c from kind a then we could easily explain this theory to a fifth grader. Unfortunately the genome is a sequence, as I know you understand very well. How are new protein complexes, splicing codes and DNA timing expression formed through this process? Do you believe that reproduction drift can account for this? From looking at the papers that talk about timing changes and alternative splicing changes from man vs chimps I honestly don't see how to reconcile that chimps and man share a common ancestor solely from genetic changes in isolated populations.

Unknown said...

A rooted 3-taxon tree is really a 4-taxon tree; you're just ignoring the outgroup used to root it.

That's what I said.

Your 3-taxon tree assumes we know nothing about the states at the root, which means that, functionally, it's unrooted.

Nope. It assumes we don't know the state of that one particular base at the root. If we include a 4th taxon, the same options remain, because we don't know if there is a substitution on the stemline to either taxon 4 or to the root.

Homoplastic characters differ in detail, but so do homologous ones.

Sure, but I'd expect the homoplastic characters to be homologous for a clade. Increasing taxon sampling would resolve this. I agree that getting the homoplasy out is not practical, but that's due to constraints in time, funding and number of researchers available, not because there truly are cases where identical character states arise from different initial states.

And there are indeed likelihood models for morphological characters

I'm aware of that, but I don't think they are useful as replacements for MP in pure morphological studies; they are primarily aimed at giving morphology and molecular data a common data format for total evidence (TE) analysis. Now, whether TE is a good direction to go is another question (there are a few TE studies I don't think are as reliable as either pure molecular or pure morphological analyses of the same phylogenies).

John Harshman said...

Nope. It assumes we don't know the state of that one particular base at the root. If we include a 4th taxon, the same options remain, because we don't know if there is a substitution on the stemline to either taxon 4 or to the root.

Here you are incorrect. Try it. The state in the 4th taxon (assuming you intend that taxon to root the 3-taxon tree) makes one of your two scenarios more parsimonious than the other. If it's G, then your first scenario is preferred; if T, your second.
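The effect of the fourth taxon can be checked by brute force on a single site. This is only a sketch: the tip states (A = G, B = T, C = T), the topology (outgroup,(A,(B,C))) and the function name are assumptions chosen for illustration, since the original example is not reproduced in this part of the thread.

```python
STATES = "ACGT"


def min_changes(root, a, b, c, outgroup=None):
    """Fewest substitutions on the tree (outgroup,(A,(B,C))) when the
    ingroup root is fixed to `root`. The state x at the ancestor of B and C
    is chosen to minimise the count. If `outgroup` is None, the branch to
    the outgroup is ignored (the plain rooted 3-taxon case)."""
    best = None
    for x in STATES:
        n = (root != a) + (root != x) + (x != b) + (x != c)
        if outgroup is not None:
            n += (outgroup != root)
        best = n if best is None else min(best, n)
    return best


# Hypothetical tip states for the three ingroup taxa.
A, B, C = "G", "T", "T"

for og in (None, "G", "T"):
    print(og, {r: min_changes(r, A, B, C, og) for r in "GT"})
```

With these assumed states, a G root and a T root each need one change when no outgroup is scored. Scoring the outgroup as G makes the G root cost one change and the T root two; scoring it as T reverses that.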

As for homoplasy, I think you're still setting up a fantasy scenario. It isn't that homoplastic characters are absolutely identical; it's that homologous characters aren't absolutely identical either. You can't in most cases diagnose homoplasy by inspection, even if you look really hard.

What exactly is wrong with ML models in morphology? Parsimony, after all, is isomorphic to the "no common mechanism" model.

Unknown said...

I agree that given the state in the 4th taxon one option is more parsimonious than the other. But my point here was that all of these scenarios are possible, although they are not equally parsimonious, and for this reason we cannot be certain whether a particular base's state is a synapomorphy or a plesiomorphy or convergent even if the phylogeny is known.

What exactly is wrong with ML models in morphology?

Nothing wrong with them. But in many cases they don't have an advantage over MP and they tend to take more computational resources.

John Harshman said...

Of course all scenarios are possible. Of course we can't be certain that anything is a synapomorphy. But that's a feature of science, and is as true about morphological characters as DNA bases.

Joe Felsenstein said...

Bill Cole: Oh, "kinds". Don't use that concept myself. Now, about species -- take the example of the White-Crowned Sparrow (Zonotrichia leucophrys) and the Golden-Crowned Sparrow (Zonotrichia atricapilla). They are different species. May I assume you would regard them as the same "kind" and thus would have no difficulty in saying that they have common ancestry?

christine janis said...

Yo --- while we're on this topic, can anybody give me a good simple summary paper of the pros and cons of molecular versus morphological phylogenies? (Yes, I know that's an extremely simplistic view, but I've been charged with writing 500 words for a textbook. 5,000 words would actually be easier.)

christine janis said...

"Just saying we all come from the same fish is saying nothing about how and HOW even possible."

Maybe not ---- but pointing out the similarities between us and fish could be explained to a first grader, especially if you could get a fish from the supermarket to demonstrate with (a derived teleost isn't ideal for this, but it can still show a lot).

Jmac said...

I've been trying to find some cancer research experts here on this blog.

No success so far.

As far as I could research, no one, including the host has ever done any research into cancer. If I'm wrong about it, please correct.

We are talking about the actual experiments performed by the scientists who by their experimental results challenge the other scientists to resolve the possible conflicts.

Which experiments did you perform Larry?
How about you Joe F? Harshman? Diogenis? Chris B? Mikkel Rumraket Rasmussen???

As I had predicted, none of them have done it. None.

I think it is easy for some to criticise someone who is at least trying to save a life or more when he does not follow the atheistic requirement by registering first at devil.com

John Harshman said...

I can think of a few such papers, but they're so old as to be nearly useless, i.e. before we had the vast amount of sequence data we have now.

Robert Byers said...

Pointing out similarities could be done the children. However this has nothing to do with biology as a science.
Indeed similarities is irrelevant to figuring out why they are unless options are banned except one.
First graders are too dumb to question what they are told.
Although a few might question still that likeness is proof or common descent. Or even a hint.
Kids are not that dumb!

Chris B said...

Eric,

What does being a cancer research expert have to do with anything?

"We are talking about the actual experiments performed by the scientists who by their experimental results challenge the other scientists to resolve the possible conflicts. "

This is gibberish. Can you rephrase your point here, if there was one?

"I think it is easy for some to criticize someone who is at least trying to save a life or more when he does not follow the atheistic requirement by registering first at devil.com "

This makes no sense whatsoever. Why would it be easier to criticize someone who is "trying to save a life or more"? Do only cancer research experts do research to try and save lives? Why would this be a necessary requirement to write a book about genetics?

If I have deciphered your incoherent yammering correctly, you are offering an argument from motive and an argument from authority. Is that really all you have to offer?

"As far as I could research, no one, including the host has ever done any research into cancer. If I'm wrong about it, please correct. "

Out of curiosity, what "research" did you do to determine this?

Bill Cole said...

Joe Felsenstein:
Yes, I use kinds, because I agree that species that are biochemically close could share a common ancestor.

Joe Felsenstein said...

Bill Cole:
OK. So you agree that there are processes that can explain change of species like these from a common ancestor.

And presumably one defines "kinds" as things that do not have a common ancestor for different "kinds".

Joe Felsenstein said...

At which point I should pop in for a bit of advertising: the use of Sewall Wright's "threshold model" of 1934. See my 2012 paper in American Naturalist.

Unknown said...

@ Joe: I keep encountering Brownian models for continuous characters and every time I wonder if it wouldn't be better to assume that actual character states are stepwise lognormal, i.e. that instead of taking a Brownian motion to model character values you set actual character values to exp(x), where x is the value from Brownian motion. A lot of continuous characters seem to be more adequately modeled in this way and while this shouldn't affect the discrete characters you model with a threshold, it would affect other continuous characters.

@John: The main difference is that with a large enough alignment it becomes almost certain that all cases are within the dataset and the chance of getting more than one variant in a molecular phylogeny of decent size is rather high. I wouldn't expect a lot of cases where we misidentify an apomorphic trait in a morphological phylogeny.

Bill Cole said...

Joe Felsenstein:
I agree with your point. Where I become skeptical is when two species or kinds or whatever we call them are separated biochemically with different functional sequences. When we look at the DNA codes, splicing sequences, and gene expression, can we reconcile that it is a reasonable hypothesis that the change was caused by generational drift?
There is an interesting discussion at UD on common descent: http://www.uncommondescent.com/human-evolution/common-descent-ann-gaugers-response-to-vincent-torley/

Joe Felsenstein said...

"Generational drift?"

John Harshman said...

It occurs to me that "Can scientists describe what they're doing to a fifth grader?" could be read in an odd way by someone who stumbles into this site. I just want to reassure that hypothetical stumbler that nobody is doing anything to any fifth-graders.

John Harshman said...

Forget it. He's rolling.

christine janis said...

Akk --- no worries, I've figured out what to write. But would be grateful if you would read it ---- please contact me.

Bill Cole said...

Joe Felsenstein:
Sorry, better stated: Genetic drift over multiple generations.

Faizal Ali said...

The mechanism by which speciation occurs has been explained many times to Bill, by several people here on Sandwalk. He refuses to accept this. I'm pretty sure I know why, and it's not because the explanations were inadequate.

Faizal Ali said...

Thanks for the link to that UD discussion, however, Bill. VJ Torley is to be commended for trying to talk some sense into the IDiots. I doubt he will succeed, however, and would give even odds that he will instead be banned for daring to suggest that Intelligent Design proponents should actually accept a scientific fact, even if it means some of their creationist allies abandon the movement.

John Harshman said...

I've sent an email to your Brown address. Let me know if you don't see it.

Chris B said...

Bill Cole,

"Yes, I use kinds, because I agree that species that are biochemically close could share a common ancestor."

If you understood the implications of your statement, you would have little problem with modern evolutionary theory.

Joe Felsenstein said...

I find interesting and relevant papers by simply searching using the keywords: morphology molecular systematic biology in Google Scholar. Systematic Biology being a central journal for those arguments.

Joe Felsenstein said...

lutesuite: Yes, I get the point. If Bill Cole still thinks that genetic drift is the main mechanism invoked by evolutionary biologists for innovations in different lineages, then there is no point in trying to (re-)educate him here.

John Harshman: Love your description of Cole as "rolling". Though I suspect that you did not intentionally leave out the T, but succumbed to a case of unwanted spelling correction.

judmarc said...

Joe F: I thought John might be referring to the Animal House scene -

Bluto: Nothing is over until we decide it is! Was it over when the Germans bombed Pearl Harbor? Hell no!

Otter: [to Boon] Germans?

Boon: Forget it, he's rolling.

judmarc said...

If Bill Cole still thinks that genetic drift is the main mechanism invoked by evolutionary biologists for innovations in different lineages

The fact that drift can account for the majority of genetic changes but not most "innovations" is likely on the subtle side of things for many people coming from an ID point of view.

Faizal Ali said...

Bill Cole's position, as I understand it, is that functional sequences are too rare in "sequence space" to be arrived at by chance. God needs to be directing the mutations in order for them to occur within the time that has elapsed since the existence of the common ancestor. I've tried to correct him on that, but he is stubbornly ignorant. You guys can give it a shot if you want.

Mikkel Rumraket Rasmussen said...

"From looking at the papers that talk about timing changes and alternative splicing changes from man vs chimps I honestly don't see how to reconcile that chimps and man share a common ancestor solely from genetic changes in isolated populations."

Then you are having a psychological problem, not a logical or scientific one.

John Harshman said...

Joe,

When I try that I get a lot of stuff over 20 years old and a few over 10 years old. But I am vaguely remembering something by Pete Wagner that might prove useful. It was about whether, as commonly alleged, morphological characters can access a huge number of character states. (Functionally no, it turned out.) Hmmm...Though he seems to have visited the subject a few times, I may be thinking of this: Wagner, P.J. 2000. The exhaustion of morphological character states among fossil taxa. Evolution 54:365-386.

Faizal Ali said...

BTW, this is the article that Bill Cole most commonly cites in support of his claim that functional sequences cannot evolve thru natural means:

http://www.pandasthumb.org/archives/2007/01/92-second-st-fa.html

As you'll notice, that article actually demonstrates the precise opposite of what Bill claims. This is what you're up against. Good luck.

The Lorax said...

Hi Robert Byers. I realize you have limitations and I'm truly impressed that you do not let them stand in your way. Please don't assume we all have those same limitations.

christine janis said...

"Pointing out similarities could be done the children. However this has nothing to do with biology as a science. "

You may think that pointing out the similarities and differences between animals has nothing to do with biological science.

As somebody who has taught this as biological science for getting on for half a century, I disagree. Vehemently.

Bill Cole said...

Joe Felsenstein:
"lutesuite: Yes, I get the point. If Bill Cole still thinks that genetic drift is the main mechanism invoked by evolutionary biologists for innovations in different lineages, then there is no point in trying to (re-)educate him here."
Found your opinion on the skeptical zone:-)
Joe Felsenstein May 6, 2016 at 6:23 am
The whole reason that creationist debaters insist that natural selection “is random” is to fool the gullible. Why, if it’s “random” then that means that evolutionists want us to believe that impressive adaptations are forming “randomly”, like a tornado in a junkyard. Creationist debaters trust that they can fool their audiences in this way.

Biologists keep trying to make the point that while mutational processes can be considered to be random, natural selection sorts things out in a highly directional way, such as making a bird fly faster rather than slower.

Tom is correct: Lönnig in his second ENV piece is playing slick word games, saying in effect that, well, what natural selection is doing is entirely based on the available mutations, so therefore it “is random”.

Is this still your opinion or is there a new one at point? Do you think Larry agrees with this? The real first question is not is this theory right but is it clear enough.

Joe Felsenstein said...

Bill Cole: Yes, of course I stand by that comment. I suspect Larry agrees with it, but you can ask him yourself if it is somehow important for you to know. I am not sure what "this theory" is: the statement that "natural selection sorts things out in a highly directional way", or the statement that creationist debaters are trying to mislead gullible audiences by insisting on using the word "random" to describe changes brought about by evolutionary forces that include natural selection. I'll happily stand by both of those.

Based on your apparent insistence that genetic drift is the main mechanism of evolutionary change, I'd say you were one of the gullible folks in the audience when the creationist debaters were characterizing evolutionary biology.

As for the "is it clear enough" trope, spare us the "concern trolling". If it isn't clear enough, that will become apparent soon enough.

Faizal Ali said...

The more pressing question is "When will Bill Cole finally learn to format his posts so one can tell what he's quoting and what he has written himself?"

What do you find unclear or debatable about what Joe wrote there?

John Harshman said...

Most evolution happens through drift. Most adaptive evolution happens through natural selection. How is this unclear? Or controversial?

Bill Cole said...

Joe Felsenstein John Harshman Lutesuite: Thank you for the clarification. John was actually the first person to educate me on what a specie is.
From the skeptical zone;
John Harshman March 26, 2016 at 6:16 pm
colewd:
John Harshman,

"I am unsure that you know what “speciation” means. Briefly, it’s the evolution of reproductive isolation between two populations. It has nothing directly to do with unique function or evolutionary innovation. I think that, instead of speciation or descent, you are concerned with the evolution of new features in populations, e.g. what makes humans different from chimps, such as our upright stance and fancy brains. Is that correct? If so, that’s a completely different question from what you have actually mentioned, whether you know that or not."

This is why I use kinds when talking about different animals with a real change in morphology and biochemistry.

John:"Most evolution happens through drift. Most adaptive evolution happens through natural selection. How is this unclear? Or controversial?"

If we are talking about the split of chimps and man from a common ancestor what differences are due to drift and what is due to natural selection?

Joe Felsenstein said...

There was a recent paper revisiting these issues. However, for the moment I can't find it. I'll post a reference if I happen to come across it.

Faizal Ali said...

This is why I use kinds when talking about different animals with a real change in morphology and biochemistry.

How do you distinguish changes that are "real" from those that are not "real"?

Faizal Ali said...

Perhaps a question that will more helpfully clarify your position: Do you consider the example of a strain of E. coli developing the novel ability to metabolize citrate an example of "real change"?

Hopefully you will start addressing my questions, rather than just ignoring them. You might find you learn something if you do. I'm no expert like Joe or Larry, but I flatter myself that I still have something to contribute.

judmarc said...

If we are talking about the split of chimps and man from a common ancestor what differences are due to drift and what is due to natural selection?

The mutations at loci that are changing at "molecular clock" or "default" rates are due to drift, and those at loci that are changing more rapidly are under selection. There are a fair number of papers in the academic literature on this topic discussing specific changes and comparing them. Have fun reading!

John Harshman said...

John was actually the first person to educate me on what a specie is.

A little more education: it's "species". The singular and plural are the same. And if anything I said has led you to refer to "kinds", then you don't understand what I said.

If we are talking about the split of chimps and man from a common ancestor what differences are due to drift and what is due to natural selection?

Quantitatively, around 40 million are due to drift, and at most a few thousand are due to selection. Judmarc has mentioned one of the ways we try to tell which ones the latter are.

Bill Cole said...

Lutesuite: "Perhaps a question that will more helpfully clarify your position: Do you consider the example of a strain of E. coli developing the novel ability to metabolize citrate an example of "real change"?"

The change that would validate natural selection's ability to drive diversity would start with the evolution of a new sequence. The Lenski experiment showed new ability based on a few mutations of transcriptional proteins, not the new long functional sequences that we observe in the comparison between chimp and man biochemistry. The challenge is that the sequential space is so large that this is very difficult to model mathematically and essentially impossible when sequences get about 50 AA long, and now you are facing sequential space as large as the number of atoms in our galaxy. Richard Dawkins attempted to address this in The Blind Watchmaker with his Weasel program. This discussion occurs on about page 50 of the book. There are versions of this program you can play with and see where it breaks. My experience is with 80 letters and a 5% mutation rate the program will not finish even if it runs for days. The reason is the high mutation rate requires the last mutations to finish simultaneously. The real issue with this program is it does not simulate natural selection because it requires a target to create a new sequence. If you play with this program you will see the difficulty of trying to work through a sequence. Joe Felsenstein understands this very well. The Weasel program has been debated extensively at the skeptical zone where John and Joe have frequently participated.

Bill Cole said...

John Harshman:
"Quantitatively, around 40 million are due to drift, and at most a few thousand are due to selection. Judmarc has mentioned one of the ways we try to tell which ones the latter are."

Do you believe the few thousand that were fixed are what represent the functional differences between man and chimps?

Faizal Ali said...

The real issue with this program is it does not simulate natural selection because it requires a target to create a new sequence.

Precisely. Think about that one for a while.

Also, your experiment confirms that evolution proceeds very slowly if you have an effective population size of one. Do you find that surprising?

judmarc said...

Do you believe the few thousand that were fixed are what represent the functional differences between man and chimps?

Consider what selection operates on.

Joe Felsenstein said...

Bill Cole quoth (re Weasel): My experience is with 80 letters and a 5% mutation rate the program will not finish even if it runs for days.

If that is 5% mutation rate per site, and a target string of 80 letters, there should be an average of 4 mutations per offspring. The mutation is providing variability enabling selection to move toward the target, but is also causing the string to mutate away from the target at an average of 4 places once it gets close.

So it is hard for it to get all the way to the target, but it will rather quickly get to about 30 matches out of 80. Then it does not get farther, owing to the high mutation rate causing back-mutations.

Try a mutation rate of about 0.1 of 1% (i.e., 0.001). It should get all the way to the target in fewer than 20,000 generations.
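The arithmetic and the effect of the mutation rate can be explored with a small Weasel-style simulation. This is a sketch under stated assumptions, not the program Bill Cole ran: the 27-letter alphabet, the brood size of 100, the rule that the best offspring replaces the parent, and all function names are choices made here for illustration, and where a run stalls or how long it takes will depend on them.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "  # assumed 27-character alphabet


def expected_mutations(length, rate):
    """Mean number of mutated sites per offspring."""
    return length * rate


def matches(candidate, target):
    """Number of positions at which candidate agrees with target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy parent, replacing each site with a random letter with
    probability `rate` (the replacement may equal the original letter)."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def weasel(target, rate, offspring=100, max_generations=20000, seed=0):
    """Return (generation at which target was reached or None, final matches)."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    for generation in range(1, max_generations + 1):
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        # Selection: the closest offspring becomes the next parent.
        # The old parent is not retained, so back-mutation can lose matches.
        parent = max(brood, key=lambda s: matches(s, target))
        if parent == target:
            return generation, len(target)
    return None, matches(parent, target)


print(expected_mutations(80, 0.05))  # 80 sites at 5% per site
print(weasel("METHINKS IT IS LIKE A WEASEL", rate=0.01, seed=1))
```

To compare the rates discussed above, call `weasel` with an 80-letter target and `rate=0.05`, `rate=0.01` or `rate=0.001`, and try `offspring=1` to see how strongly the brood size matters.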

John Harshman said...

Do you believe the few thousand that were fixed are what represent the functional differences between man and chimps?

As is so often the case, your use of language is confusing. I do think that the few thousand mutations that were fixed by natural selection cause almost all the functional differences between humans and chimps. The 40 million that were fixed by drift are overwhelmingly non-functional, if by "functional" you mean "maintained by selection". If you mean something else, what would it be?

Joe Felsenstein said...

Actually, a mutation rate of 0.01 will do even better.

Joe Felsenstein said...

@Simon: (Sorry for the delay in answering). Yes, as I say in my online text Theoretical Evolutionary Genetics, on page 395 of the current edition.

For a trait with a natural zero point, first take the logarithm of the phenotypes and base analysis on it. Do not return to the original scale unless you can come up with positive reasons why the genetic or environmental factors are likely to act additively on that scale.

This is at the end of a section in which I fret a lot about transformations of scale.

Once you are on the log scale the Brownian Motion model works as usual. I am not sure what you mean by "stepwise" lognormal. I don't think you mean to discretize the character.
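A minimal sketch of what working on the log scale means in practice: run ordinary Brownian motion on the logarithm of the trait and exponentiate only to report values on the original scale. The step variance, starting value and function names are assumptions for illustration and are not taken from Theoretical Evolutionary Genetics.

```python
import math
import random


def brownian_path(x0, sigma, steps, rng):
    """Discrete-time Brownian motion: each step adds Normal(0, sigma) noise."""
    path = [x0]
    for _ in range(steps):
        path.append(path[-1] + rng.gauss(0.0, sigma))
    return path


def log_scale_trait(initial_value, sigma, steps, seed=0):
    """Evolve log(trait) by Brownian motion, then map back with exp().
    The returned values are always positive, as befits a trait with a
    natural zero point."""
    rng = random.Random(seed)
    log_path = brownian_path(math.log(initial_value), sigma, steps, rng)
    return [math.exp(x) for x in log_path]


trait = log_scale_trait(initial_value=10.0, sigma=0.1, steps=50, seed=1)
print(min(trait) > 0, len(trait))
```

Taking `math.log` of the returned values recovers the underlying Brownian path, so any analysis that assumes Brownian motion can be run on the logged phenotypes directly.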

Bill Cole said...

John
"As is so often the case, your use of language is confusing. I do think that the few thousand mutations that were fixed by natural selection cause almost all the functional differences between humans and chimps. The 40 million that were fixed by drift are overwhelmingly non-functional, if by "functional" you mean "maintained by selection". If you mean something else, what would it be?"
Yes maintained by selection but also can account for the phenotypic differences. The math is troubling here because according to Lynch's 2010 paper 6 mutation takes more evolutionary resources than the chimp to man common ancestor split. You are talking about 4 million mutations 10% based on junk DNA predictions turning into a few thousand fixed and functional mutations.

Bill Cole said...

Joe
"Actually, a mutation rate of 0.01 will do even better."

I agree. With a slower mutation rate the program finishes because it does not need a simultaneous sequence to finish.

John Harshman said...

Sorry again, Bill, but that was gibberish to such an extent that I don't know what you meant to say.

Bill Cole said...

Lutesuite
"The real issue with this program is it does not simulate natural selection because it requires a target to create a new sequence.

Precisely. Think about that one for a while.

Also, your experiment confirms that evolution proceeds very slowly if you have an effective population size of one. Do you find that surprising?"

There is no experiment here. A target is not a simulation of evolution; it is an experiment of design. I am not a design advocate, but natural selection does not have a sequence target; it has fitness as a target. Dawkins's program proves that if you know the target you can find it by natural selection. This is not the theory of evolution. If Joe can create software without a target (and yes, he has been diligently trying) then we have a chance at a mathematical model, but short of that we have an untested hypothesis without a model at best. I have great respect for Joe but I do not believe that he can overcome the sequential space problem. Joe, do you disagree?

Bill Cole said...

John
Here is Lynch's 2005 paper on evolutionary resources to fix adaption: http://dx.doi.org/10.1110%2Fps.041171805. Help me reconcile how, based on this, a few thousand fixed adaptions can be fixed over 6 million years of the split from a common ancestor of chimps and man.

Bill Cole said...

John
I don't think that you can reconcile Lynch's paper with the chimp to man transition because the genome is a sequence. As I said to Larry months ago this is a show stopper. Sequences are great to create diversity but suck at finding function through a random search.

Faizal Ali said...

Bill Cole, you have a real gift for reading scientific papers so that they seem to say the exact opposite of what they actually say. How did you interpret this sentence from the abstract:

Numerous simple pathways exist by which adaptive multi-residue functions can evolve on time scales of a million years (or much less) in populations of only moderate size. Thus, the classical evolutionary trajectory of descent with modification is adequate to explain the diversification of protein functions.

judmarc said...

If that is 5% mutation rate per site, and a target string of 80 letters, there should be an average of 4 mutations per offspring. The mutation is providing variability enabling selection to move toward the target, but is also causing the string to mutate away from the target at an average of 4 places once it gets close.

So it is hard for it to get all the way to the target, but it will rather quickly get to about 30 matches out of 80. Then it does not get farther, owing to the high mutation rate causing back-mutations.

Wouldn't a mutation rate anywhere near that high in a living organism get to error catastrophe pretty quickly?
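
The arithmetic behind the "4 mutations per offspring" figure can be checked with a short sketch. It assumes that each of the 80 sites mutates independently at the quoted 5% rate and that every mutation changes the letter; the variable names are illustrative and not taken from any particular Weasel implementation.

```python
# Per-offspring mutation load for the figures quoted above.
# Assumption: sites mutate independently, and a mutation always changes the letter.
mutation_rate = 0.05   # per-site rate quoted in the comment
string_length = 80     # target string length quoted in the comment

# Expected number of mutated sites in one offspring.
expected_mutations = mutation_rate * string_length

# Chance that an offspring is copied with no mutation at all.
p_clean_copy = (1 - mutation_rate) ** string_length

print(f"expected mutations per offspring: {expected_mutations:.1f}")
print(f"probability of an error-free copy: {p_clean_copy:.4f}")
```

Under these assumptions fewer than 2 offspring in 100 would be exact copies of their parent, which is why a string that gets close to the target keeps drifting away from it again.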

judmarc said...

Sequences are great to create diversity but suck at finding function through a random search.

What sucks at finding function through random search is your attempt to counter the theory of evolution via bits of ideas turned up through Google.

judmarc said...

The math is troubling here because according to Lynch's 2010 paper 6 mutations take more evolutionary resources than the chimp to man common ancestor split. You are talking about 4 million mutations (10% based on junk DNA predictions) turning into a few thousand fixed and functional mutations.

Then the fact that we have a Powerball lottery winner every few weeks at odds of 175 million to 1 ought to trouble you greatly.

judmarc said...

Yes maintained by selection but also can account for the phenotypic differences.

So you didn't take the hint to consider what selection operates on.

If there is no phenotypic difference, is there anything for selection to operate on?

Faizal Ali said...

Wouldn't a mutation rate anywhere near that high in a living organism get to error catastrophe pretty quickly?

Welcome to creationist "science": Deliberately choose a parameter that has no resemblance whatsoever to what exists in the real world and which, in fact, would lead to rapid extinction if it did exist. Then, run a program using that parameter. When it doesn't work, conclude that you have now demonstrated that evolution cannot work, therefore God.

judmarc said...

@lutesuite: Exactly - thus what it shows is two very important things:

- The program is reasonably reflective of the real world; give it something that absolutely doesn't work in the real world and it probably won't work in the program.

- Evolution adheres to mathematical laws arising directly from the scientific principles on which it operates.

John Harshman said...

Your reference to Lynch's paper is another example of gibberish. It doesn't say anything like what you claim, and in fact seems to be closer to the exact opposite, or would be if your claim were coherent to begin with.

Faizal Ali said...

Bill Cole can confirm this, but I think his misunderstanding is that when he reads "adaptive multi-residue functions can evolve on time scales of a million years" he understands that to mean that a single adaptive sequence will be fixed over the course of a million years, with no other adaptations arising and moving towards fixation over that time. Then, another adaptation arises, and a million years later it is fixed. And so on, one sequence at a time, each requiring a million years. So, by his understanding, if a genome contains 2000 such adaptations, this would require 2,000,000,000 years.

Correct me if I'm wrong, Bill.

Bill Cole said...

John Harshman
"Your reference to Lynch's paper is another example of gibberish. It doesn't say anything like what you claim, and in fact seems to be closer to the exact opposite, or would be if your claim were coherent to begin with."

Explain why Lynch's paper, which models adaptions and puts population and time constraints on them, is not relevant to the chimp to man reconciliation.

Faizal Ali said...

If I may speak on John's behalf: It is relevant. What on earth made you think John was saying it isn't?

The issue, as was plainly stated, is that the paper does not show what you claim it does. It shows the exact opposite, i.e. that "the classical evolutionary trajectory of descent with modification is adequate to explain the diversification of protein functions." Again, this was stated plainly.

Bill Cole said...

lutesuite
"Bill Cole can confirm this, but I think his misunderstanding is that when he reads "adaptive multi-residue functions can evolve on time scales of a million years" he understands that to mean that a single adaptive sequence will be fixed over the course of a million years,"

This is what I am reading. Fewer than 10 adaptions can take millions of years. I am not reading any more into this. This time is consistent with the sequence. BTW, how many people would win lotto if you went from 6 balls to 50?

Faizal Ali said...

Let me ask you this, Bill:

Lynch's paper states that the adaptations that are so vexing you can take place over a time scale of a million years, if not less, using only modest population sizes.

The MRCA of chimps and humans existed about 6 million years ago.

Why is it unlikely that something that can occur in under a million years would occur in 6 million years?

Do you not understand that 6 million is greater than 1 million?

Bill Cole said...

lutesuite

"The issue, as was plainly stated, is that the paper does not show what you claim it does. It shows the exact opposite, i.e. that "the classical evolutionary trajectory of descent with modification is adequate to explain the diversification of protein functions." Again, this was stated plainly."
Please back up this bold claim, integrating the numbers in the paper. And by the way, show me how Hunt's paper supports vertebrate evolution.

Bill Cole said...

lutesuite
"Do you not understand that 6 million is greater than 1 million?"

How many adaptions did you get in 1 million years?

Faizal Ali said...

This is what I am reading. Fewer than 10 adaptions can take millions of years.

Where in the paper is that stated?

Please back up this bold claim integrating the numbers in the paper.

The "bold claim" is a direct quote from the paper you are citing in support of your position. The numbers to support it are contained in the body of the paper itself. I admit, I do not have the requisite knowledge to understand the mathematics involved. I am merely assuming that Lynch is able to understand his own calculations well enough to not misrepresent them. You are saying he made an error. So the onus is on you to specify where he got the math wrong.

Bill Cole said...

Lutesuite
I am stating that the large times required for adaptions based on his model are indicative of the sequence. In Hunt's paper he makes a counter claim to Axe's, but the probability problem of 10^10 to 10^64 makes it very unlikely that the molecular machines we see in life are formed through stochastic processes. The Weasel program needs a target, and here we are 31 years later and genetic algorithms still need a target. Wake up and smell the coffee: we don't yet understand the mechanism of common descent, if it exists at all. The computer search analogy you made months ago will not work with long sequences.

Faizal Ali said...

I am stating that the large times required for adaptions based on his model are indicative of the sequence.

Please translate into English.

In Hunt's paper he makes a counter claim to Axe's, but the probability problem of 10^10 to 10^64 makes it very unlikely that the molecular machines we see in life are formed through stochastic processes.

And no one says they are. What I think you are saying is that those numbers mean functional sequences that may be subject to selection cannot arise thru mutations. You seem to have skipped that part where you actually show the calculations that demonstrate this. Hunt's paper provided the calculations that show your claim is wrong. What error do you think he made? Please be specific.

And it would be greatly appreciated if you actually answered my questions rather than engaging in transparent attempts to dodge them. But by now I know that's all I can expect from you.

Faizal Ali said...

The Weasel program needs a target, and here we are 31 years later and genetic algorithms still need a target.

Wrong again:

https://en.wikipedia.org/wiki/Evolved_antenna

Joe G said...

Strange that you never said what it has to do with it.

Or do you agree that similarities show the degree of common design and the differences show the differing design requirements required?

Joe G said...

LoL! The target for the antenna was the specifications required to make it work in the scenario needed.

Bill Cole said...

lutesuite
Joe Felsenstein is an expert here. If you don't intuitively understand why a target is required you don't understand the magnitude of the sequential space problem, but you are not alone, few people do.

Faizal Ali said...

How are winning lottery numbers drawn without a "target", Bill? Do you think all lotteries are fixed?

Joe G said...

All genetic algorithms have a target or else they would never find the solution. And lottery numbers are drawn on a regular basis. The probability is 1 that lottery numbers will be drawn.

Faizal Ali said...

Is this some new creationist strategy that Bill Cole and Joe G are demonstrating? Trying to confuse your opponent by spewing out a series of incoherent non-sequiturs?

Bill Cole said...

lutesuite
You are talking about short vs long sequences. Lotteries are short sequences. Long sequences need a target. You are facing a mathematical combinatorial explosion problem. With amino acids, every time you add one the sequential space multiplies by 20. Pretty soon the number of possibilities is greater than the age of the universe in picoseconds. Joe G blogs on The Skeptical Zone and both of us have seen this debate argued to the ground. You should try to debate there also. Lottos have only 6 balls, so this allows a winner. Above 10 balls, very few winners; above 50 balls, never a winner.
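
The "multiplies by 20" arithmetic can be made concrete with a small sketch. It assumes an age of the universe of about 13.8 billion years (a commonly quoted estimate, not a figure from this thread), and the function and constant names are made up for illustration.

```python
# Size of amino-acid sequence space versus the age of the universe in picoseconds.
# Assumption: age of the universe ~13.8 billion years (a commonly quoted estimate).
ALPHABET_SIZE = 20                       # amino acids, as in the comment
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_PS = 13.8e9 * SECONDS_PER_YEAR * 1e12

def sequence_space(length):
    """Number of distinct sequences of the given length."""
    return ALPHABET_SIZE ** length

# Find the shortest length whose sequence space exceeds the picosecond count.
length = 1
while sequence_space(length) <= AGE_OF_UNIVERSE_PS:
    length += 1

print(f"age of universe: about {AGE_OF_UNIVERSE_PS:.2e} picoseconds")
print(f"20^{length} = {sequence_space(length):.2e} exceeds it at length {length}")
```

This only counts sequences. It says nothing about how many of them are functional or how a population moves between them, which is what the rest of the thread argues about.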

Joe G said...

LoL!@ lutesuite- Just because you are an ignorant troll doesn't mean we are spewing out anything.

Faizal Ali said...

Lottos have only 6 balls, so this allows a winner. Above 10 balls, very few winners; above 50 balls, never a winner.

What if you had millions of machines constantly spitting out combinations of 50 balls, and hundreds of thousands of people holding tickets? How long do you think it would take for a winning number to come up?

judmarc said...

The probability is 1 that lottery numbers will be drawn.

Oh, very good. How profound. And quite meaningless for any useful purpose.

The probability is less than 1 that you will win, just as the probability is less than 1 that you will pass your genes, in some combination with those of a mate, to offspring. In fact the probability that you will win is far, far less than 1, just as the probability is far, far less than 1 that some mutation of yours will, via positive selection or drift, be fixed in the genome millennia from now. But people do win the lottery, and it isn't at all an unusual event, just as mutations are fixed and it isn't at all an unusual event.

It's just math, it's not hard if you use your brain rather than cutting off your reasoning powers with a loud "By the glory of God, this cannot be right!"
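
A minimal sketch of that math, assuming every ticket is an independent random pick and using the 175 million to 1 Powerball odds quoted earlier in the thread. The ticket counts are hypothetical, chosen only to show how the chance that somebody wins grows with the number of tries.

```python
import math

# Assumption: every ticket is an independent random pick.
ODDS = 175_000_000          # "175 million to 1", the Powerball odds quoted earlier

def p_at_least_one_winner(tickets, odds=ODDS):
    """Probability that at least one of `tickets` independent picks wins."""
    # log1p/expm1 keep precision when the per-ticket probability is tiny.
    return -math.expm1(tickets * math.log1p(-1.0 / odds))

# Hypothetical ticket counts, for illustration only.
for tickets in (1, 1_000_000, 175_000_000, 1_000_000_000):
    p = p_at_least_one_winner(tickets)
    print(f"{tickets:>13,} tickets -> P(some winner) = {p:.4f}")
```

The probability that a particular ticket wins stays tiny throughout; what changes is the probability that some ticket wins.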

Bill Cole said...

lutesuite
At 50^50 possible combinations you cannot solve it no matter what you do. You need to understand how large this number is. Sequences are the largest mathematical spaces in the universe. 50^50 is larger than the combined age of the atoms in the universe in picoseconds. This is not a debating issue; it is the reality of DNA and proteins and a show stopper for stochastic processes driving common descent.

Faizal Ali said...

Creationists' minds are so easily boggled by big numbers. It's kind of sad, really.

It's hardly even worth pointing out that Bill has again avoided answering a direct question.

Joe G said...

Evo's minds are boggled by science and that is really sad.

judmarc said...

Science like "Sequences are the largest mathematical spaces in the universe"? Are they Euclidean or projective spaces, Bill?

Joe G said...

No, science like you don't have any way to test the claims your position makes.

Anonymous said...

Up above there somewhere, BC wrote "John was actually the first person to educate me on what a specie is."

I'm glad -- too many people are confused about this. Specie is "money in the form of coins rather than notes."

Perhaps he was thinking about a kind of organism, which is a species. Like deer and sheep, the word is the same in singular and plural forms.

Joe Felsenstein said...

Bill Cole said: The math is troubling here because according to Lynch's 2010 paper 6 mutations take more evolutionary resources than the chimp to man common ancestor split. You are talking about 4 million mutations (10% based on junk DNA predictions) turning into a few thousand fixed and functional mutations.

I believe that the 6 mutations figure that Cole is quoting is for a case where one has mutations that are going to increase fitness, but only when all 6 changes are present. Until then the fitness does not increase, and 6 mutations might be the most that could be expected to occur in such a case.

So to compare it to 4 million DNA changes, one would have to believe that the fitness would not increase until all 4 million had finished happening. A ludicrously irrelevant scenario.

Joe G said...

Well Joe Felsenstein, you still don't have any way to test the claims of your position. For example you don't know how to test the claim that natural selection and drift produced photosynthesis or ATP synthase.

IOW your entire position is a ludicrously irrelevant scenario.

Joe Felsenstein said...

On another issue (if that), Bill Cole quoth: Joe Felsenstein is an expert here. If you don't intuitively understand why a target is required you don't understand the magnitude of the sequential space problem, but you are not alone, few people do.

There are some interesting evolutionary simulations that don't have an explicit target built in, but instead have a simulated physics with some general form of selection. I have raised this issue on blogs, and opponents of evolutionary biology generally duck and dodge and fail to come to grips with this.

The reason the target/no-target issue is raised is that creationists and ID fans have a deeply built-in belief that information can get into genomes only if it is already present elsewhere. For example, in the target string.

But check out the simulations of Karl Sims, or the program breve that enables you to simulate a similar case, or the Boxcar2d system which does something similar (I have left out links because these are easy to find using a browser).

In these cases there are genotypes, but the only target is, say, that the organism move more rapidly to the right. Which genotypes accomplish that is left to the simulated physics.

Analogously, wolves preying on deer may cause them to change in the length of their limbs, the coordination of their movement, their acuity of vision, their alertness. But not because the wolves are carrying around a detailed blueprint of the deer genome!
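
A toy sketch of the no-target idea follows. It is not a reproduction of Sims's creatures, breve, or Boxcar2d: the "physics" rule, the constants, and all function names are invented for illustration. The genotype is a list of pushes, the simulated physics decides how far the creature ends up to the right, and selection only ever sees that distance. No target genotype is stored anywhere.

```python
import random

def distance_moved(genotype, drag=0.8, trip_speed=2.0):
    """Toy physics: each gene is a push; moving too fast makes the creature trip."""
    x, v = 0.0, 0.0
    for force in genotype:
        v = drag * v + force
        if abs(v) > trip_speed:   # tripping resets the speed to zero
            v = 0.0
        x += v
    return x

def evolve(generations=60, pop_size=40, genome_len=20, seed=1):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Selection looks only at performance, never at the genes themselves.
        pop.sort(key=distance_moved, reverse=True)
        best_per_generation.append(distance_moved(pop[0]))
        survivors = pop[: pop_size // 2]
        children = [
            [min(1.0, max(-1.0, gene + rng.gauss(0, 0.1))) for gene in parent]
            for parent in survivors
        ]
        pop = survivors + children
    return best_per_generation

history = evolve()
print(f"best distance, first generation: {history[0]:.2f}")
print(f"best distance, last generation:  {history[-1]:.2f}")
```

Because pushing as hard as possible on every step makes the creature trip, which genotypes move farthest is decided by the simulated physics rather than written into the fitness function.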

Anonymous said...

JG, as I think you know, similarities can result from common ancestry. (Some similarities may result from natural selection in similar environments, e.g. the general shape of whales and true fish.) Differences may result from chance (genetic drift) but may result from responses to different selection in different environments.

Bill Cole said...

Joe
"So to compare it to 4 million DNA changes, one would have to believe that the fitness would not increase until all 4 million had finished happening. A ludicrously irrelevant scenario."

This is not the point I was trying to make, but that is probably down to my faulty explanation. I am using Lynch's paper to show that an adaption requiring 6 adaptions takes lots of evolutionary resources because you are working through the sequence. Based on this, the chimp and man sharing a common ancestor with stochastic processes as a mechanism is very unlikely.

I will look at Sims work thanks for the reference.

Joe Felsenstein said...

Bill Cole said: I am using Lynch's paper to show that an adaption requiring 6 adaptions takes lots of evolutionary resources because you are working through the sequence. Based on this, the chimp and man sharing a common ancestor with stochastic processes as a mechanism is very unlikely.

I lack enough knowledge, sophistication, and subtlety to have a clue what argument Bill Cole is making here.

judmarc said...

No, science like you don't have any way to test the claims your position makes.

Your ignorance of more than a century of tests and confirmation is simply that - ignorance. It certainly doesn't affect the validity of evolution one iota, any more than your ignorance of computer science affects how computers work.

Your choice to keep being ignorant, but please don't expect anyone to take you seriously.

Since you know about science, would you please answer my question: Euclidean or projective spaces? Or the third option, by far the most likely: Bill throws out words without knowing what they mean.

John Harshman said...

Bill is claiming (though it's unclear whether he knows it) that no difference between humans and chimps can be advantageous until the final difference has been fixed. I think you had it right. He's confused about the actual number, whether 40 million, 4 million, or a few thousand. But that doesn't matter, as long as it's more than 6, apparently. He's confused about the difference between selection and drift. He's confused about anything Lynch's paper said. He can't distinguish Lynch from Behe & Snoke. It's a wonder he can find his way out of bed in the morning.

Faizal Ali said...

He also seems to think that, when calculating probabilities, if the denominator is a huge honking number with many zeros at the end, the odds will always be prohibitively small. The value of the numerator makes no difference. That's why he's so besotted with the large number of possible amino acid sequences that exist.

Mikkel Rumraket Rasmussen said...

"At 50^50 possible combinations you cannot solve it no matter what you do. You need to understand how large this number is. Sequences are the largest mathematical spaces in the universe. 50^50 is larger than the combined age of the atoms in the universe in picoseconds. This is not a debating issue; it is the reality of DNA and proteins and a show stopper for stochastic processes driving common descent."

This is irrelevant, since it is only imagined in creationist minds that the evolutionary process must search the entirety of that space in order to find functional sequences, or navigate from one functional sequence to another through vast distances of nonfunctionality, one or a few mutations at a time.

Once you let go of these carefully crafted creationist assumptions, the problem goes away as the imagined issue it was.

John Harshman said...

In other words, since he's supposedly talking about the human/chimp divergence, new sequences arise not by instant assembly of millions of bases but by small changes to existing sequences. And if we're talking about adaptive changes, each step is probably advantageous. So much for Behe & Snoke.

Bill Cole said...

John
This is the explanation of the graph in the Lynch and Abegg 2010 paper (doi:10.1093/molbev/msq020):

FIG. 4. Mean number of generations until establishment for complex adaptive alleles involving d = 2, 3, 4, and 5 sites (denoted in the right margin) for the case in which the intermediate states are neutral. The lower solid line gives the theoretical results for d = 2. A constant mutation rate of u = 10^-8 per site is assumed, and the adaptive allele is assumed to have a selective advantage of s2 = 0.02. The data are based on simulations of 25–100 replicates, and the curved lines are the approximate fits described in the text.

In the graph 5 fixed adaptions is 10^6 generations and 10^11 population size.

John Harshman said...

Why would one assume that all intermediate states are neutral? Why would one assume that only one adaptation can be in process at a time? What makes you think this has anything at all to do with human-chimp divergence?

The whole truth said...

joey g said: "No, science like you don't have any way to test the claims your position makes."

And: "Well Joe Felsenstein, you still don't have any way to test the claims of your position. For example you don't know how to test the claim that natural selection and drift produced photosynthesis or ATP synthase.

IOW your entire position is a ludicrously irrelevant scenario."

I'm curious, joey, what is the scientific test for 'allah-yahoo-yeshoo-holy-spook-did-it'?

Faizal Ali said...

A question for Bill Cole:

A human generation is about 20 years.

The current population of the earth is about 7 billion.

At 20 years/generation, it would take 140 billion years to reach that number.

How is it possible that the population was reached in much less time than that?

(Admittedly, this is not a perfect analogy. But with a bit of thought, it might lead Bill to appreciate one of the errors he is making, which is in assuming that only one adaptation is in the process of evolving at a time.)
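
The two readings of that analogy can be put side by side in a few lines. The serial figure uses only the numbers in the comment; the parallel figure rests on an idealized assumption (the population doubles every generation) that is far faster than real human history and is there only to show the contrast between one-at-a-time and simultaneous processes.

```python
import math

GENERATION_YEARS = 20              # figure from the comment
TARGET_POPULATION = 7_000_000_000  # figure from the comment

# Serial reading: one new person per generation, one after another.
serial_years = TARGET_POPULATION * GENERATION_YEARS

# Parallel reading (idealized assumption): the population doubles every generation.
doublings = math.ceil(math.log2(TARGET_POPULATION))
parallel_years = doublings * GENERATION_YEARS

print(f"serial:   {serial_years:,} years")
print(f"parallel: {doublings} doublings, {parallel_years} years")
```

The gap between the two numbers comes entirely from letting many lineages reproduce at once, which is the point of the analogy for adaptations arising in parallel.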

Mikkel Rumraket Rasmussen said...

"I'm curious, joey, what is the scientific test for 'allah-yahoo-yeshoo-holy-spook-did-it'?"

I think I remember him answering a similar question before. In the IDiot mind, a scientific test of ID is "can evolution produce it? No! Therefore ID".

Mikkel Rumraket Rasmussen said...

Designers can choose to design objects such that they fit into a nesting hierarchy of shared similarities. They can also choose not to. Therefore ID predicts nothing in particular. Does Joe G know the mind of the designer? No.

In contrast, evolution absolutely requires nesting patterns of shared derived characteristics. Therefore evolution is a priori much more likely the explanation for a nesting pattern.

Faizal Ali said...

Gradual evolution predicts many transitional forms which would ruin any attempt to form a nested hierarchy.

I think this falls into the category of "not even wrong."

Faizal Ali said...

Just look at Linnaean classification which is a nested hierarchy-

I know. Isn't that an amazing coincidence? That he was able to classify organisms into the only schema possible under evolution, before the theory of evolution had even been enunciated.

it has nothing to do with evolution- all nice neat sets which transitional forms would ruin by their very nature.

LOL! There's no way to argue with stupidity like this.

Faizal Ali said...

Hee hee. I see Joe G is copying Bill Cole's schtick of citing publications that refute his own claims.

Faizal Ali said...

Gosh, Joe. I'm surprised you can still walk, the way you keep shooting yourself in the foot.

Anonymous said...

Joey's ignorance of real taxonomy is vast.

As taxonomists, we want to place organisms into nice, neat, either-or categories, to this species or that. We want each species to belong to one genus and not another, to one family or the other, not fall confusingly between them.

What do we find when we really look? While many species are distinct, some are not. Most widespread species differ from area to area -- should we call the different forms different species, or not? Some species clearly group together. Others are more different -- but are they different enough to call them a different genus or a different family? Some species originated by hybridization. People try to get rid of the problem by refining the definition of species, etc. However the problem is one of biology, not of words.

We best express the relationships among species (and individuals) in phylogenies (like genealogies) that show relationships among organisms. Phylogenies show "a natural arrangement," to quote Darwin. Phylogenies show nested relationships among the groups.

As taxonomists, we force this variation into the traditional Linnaean style classification system because that's how we file information. It works pretty well. Roughly, the levels of the traditional nested-hierarchy classification system are some of the nodes in the phylogenetic trees (places where branches join together). There are many more nodes in the tree than traditional levels of classification, but that’s OK.

I find the pattern of variation and the way that pattern falls into nested groups is the very best evidence there is for evolution.

Anonymous said...

Lutesuite, I like your concise description of JG's "argument." It isn't even clear if he thinks that our neat classification system or messy reality makes an argument against evolution. :)

Bill Cole said...

John
"Why would one assume that all intermediate states are neutral? Why would one assume that only one adaptation can be in process at a time? What makes you think this has anything at all to do with human-chimp divergence?"
The time and populations for 5 adaptions are outside the current chimp to man theory. I am struggling to see how this cannot be relevant. Please help with this; I certainly may not understand it correctly.

John Harshman said...

Well of course you don't understand it correctly. Try thinking about the questions I asked you. Your "5 adaptations" isn't 5 adaptations. It's a single adaptation requiring 5 mutations, all of which must drift to high frequency so that all are present in a single individual before the adaptation exists. There is no expectation or need that anything of the sort happen during the human-chimp divergence.

Instead, what happens is that a first mutation is slightly advantageous and subsequent mutations improve whatever function there is. (Of course there are many mutations that aren't advantageous, but they aren't selected and very few of them become fixed.)

Faizal Ali said...

Which, to avoid increasing Bill's confusion even further, does not mean that only a few of the mutations that have been fixed in a genome are the result of genetic drift. Correct?

Anonymous said...

Phylogenies do show nested relationships among groups of organisms. Trees within trees within trees. (See "trees" at the Angiosperm phylogeny website, or visit the Tree of Life web project.)

Gradual evolution does mess up the attempt to form a simple nested hierarchical classification system (as both JG and I wrote), because it does produce transitional forms. That is the reason that some taxonomists want to shift only to the more nuanced phylogenetic trees and away from formal Linnaean classification style. Many of us stick with the more formal style because it is handy for storing information, while acknowledging that it is an oversimplification.

JG boldly waves evidence for evolution and cries that it disproves evolution. One of us seems to be unacquainted with logic.

Faizal Ali said...

They shouldn't if evolutionism was true.....

ID is not anti-evolution.



It's like he's not even living in the same universe as the rest of us.

Anonymous said...

I wrote, "Phylogenies do show nested relationships among groups of organisms." Joe G wrote, "They shouldn't if evolutionism was true."

It might be interesting to know what line of "reasoning" leads to this statement, but I doubt JG can articulate it.

Faizal Ali said...

It takes a really special kind of stupid to say that evolution predicts the existence of transitional species, cite examples of such transitional species, and then say that this disproves evolution.

Anonymous said...

JG wrote, "as Darwin said if his concept was true then you wouldn't be able to produce any nested hierarchies."

No, that's not what he wrote. You've misunderstood.

Anonymous said...

JG, visit the Tree of Life Web Project (http://tolweb.org/tree/), click on a picture in the tree, like the frog, and use arrows to go back and forth to see what subtrees are contained in what trees.

Or visit the Angiosperm Phylogeny website and click on trees, then on the tree symbols to see more trees imbedded within the basic tree.

Anonymous said...

JG, what do you think a nested hierarchy is?

John Harshman said...

Sure, you can cut and paste. But do you have any idea what you're cutting and pasting? Apparently not, since you make up the requirement of increasing complexity out of, I hypothesize, the words "lower" and "levels". But this is nothing more than containment, i.e. groups within groups.

As Darwin pointed out and as you have unaccountably ignored even while quoting him, the nested hierarchy is possible because past populations are extinct, and even fossils are a small sample of the finely graded series of populations that have existed.

Joe Felsenstein said...

The one (small) nugget of validity in JoeG's argument is that if we include the actual ancestors of living species, we do have trouble making a hierarchical classification. For example, suppose that Archaeopteryx were actually ancestral to all modern birds. (It almost surely isn't, but allow me the supposition).

Which order, family, and genus of birds should we then place it in? You can't, unless you violate monophyly of these taxa.

Of course we really don't have the fossils of the actual ancestors, at least in groups where only a small percentage of ancient species have been found as fossils. For others, such as hominids, the problem is real. Is, for example Homo ergaster to be regarded as a sister species to Homo sapiens, or what?

Unknown said...

I've always thought a reasonable approach was to treat the MRCA of a clade as the clade itself. I.e. the MRCA of modern birds would have been a species we could refer to as Aves, which when it split turned into a clade Aves. We really don't have a way to distinguish stem line fossils from stem line representatives (and there are statistical arguments that some stem line representatives are possibly stem line fossils), so in practical terms that issue doesn't arise, because we simply treat all of them as stem line representatives.

John Harshman said...

Yes, groups within groups is what a nested hierarchy is. This added feature of increasing complexity is something you made up. I have no idea what "the lower the level the more traits are used" even means, nor why "more traits are used" means "the complexity increases". I don't think you have any idea either.

And of course unguided evolution predicts a nested hierarchy. It's unavoidable if species diverge and changes happen at particular spots on an evolving tree. No guidance is necessary, just changes and divergence. That's why junk DNA shows nested hierarchy.

John Harshman said...

Simon, I don't know what "stem line fossils" and "stem line representatives" are or what the difference is between them.

Unknown said...

And now it's my turn to say sorry, so much of this thread devolved into creationist weirdness that I missed your reply.

By stepwise I simply meant that if you have a time interval [t_0, t_1], then the change in the trait during that time interval follows a lognormal distribution, regardless of the particular values of t_0 and t_1. That is, analogous to the definition of Brownian motion:
- X(t) is continuous almost surely
- Increments are independent
- X(t_1) - X(t_0) ~ ln N(0, t_1 - t_0)
If you place this process on a phylogeny, you are likely going to discretize time at divergence dates, and this ensures that you only need one modeling step per branch WLOG.

Unknown said...

@John: Let's say you have a rooted tree ((A,B),C). There is a lineage running from the common ancestor of all three terminal taxa to the MRCA of A and B. A stem line fossil would be a fossil belonging to a species on that lineage, i.e. belonging to a species that was ancestral to A and B. On the other hand, there could be a fossil species not represented in the tree with these terminal taxa; let's call it F. If a phylogeny including F looked like this: (((A,B),F),C), then F would be more closely related to (A,B) than to C, but not ancestral to (A,B). This would be a stem line representative.

If (A,B) had 3 synapomorphies i, ii, and iii, and the fossil had i and iii but not ii, then both cases provide us with the same information on the sequence of character evolution (i and iii evolved before ii), hence the stem line representative can serve as a stand-in for an actual stem line member. We can sometimes provide clear evidence that a fossil is a stem line representative (usually by finding multiple fossil species that form a clade with their own synapomorphies, which imply that they were not paraphyletic with regard to the crown group), but in a lot of cases we can't differentiate between a fossil belonging to an ancestral species and a fossil that belongs to an extinct sister group.

Estimates of extinction rates lead to estimates of the percentage of fossils that should actually fall on the stem line, and there's a high probability that we actually have quite a few of them, but for any particular fossil the probability that it is not part of an ancestral species is pretty high. So by default we treat every fossil as an SLR and thus propose that it belongs to a sister taxon.
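
The bracket notation used above maps directly onto nested tuples, which gives a small sketch of what "groups within groups" means. The helper names are invented for illustration; the tree is the (((A,B),F),C) example from the comment.

```python
def clades(tree):
    """Return (leaf set of `tree`, list of every clade in it as frozensets)."""
    if isinstance(tree, str):          # a terminal taxon such as "A"
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)               # the group formed at this node
    return leaves, found

def is_nested(groups):
    """True if every pair of groups is either nested or completely disjoint."""
    return all(a <= b or b <= a or not (a & b) for a in groups for b in groups)

tree = ((("A", "B"), "F"), "C")        # (((A,B),F),C) from the comment
_, groups = clades(tree)

for group in groups:
    print(sorted(group))
print("nested hierarchy:", is_nested(groups))
```

Any set of groups read off a tree this way passes the nesting check, while two partly overlapping groups such as {A, B} and {B, C} would fail it.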

John Harshman said...

OK, I understand. But those are really bad terms. A "stem line fossil" is more simply and correctly referred to as an ancestor. A "stem line representative" is a stem-AB. I agree that there's no way to distinguish an ancestor of AB from a stem-AB. Anyway, they're all just terminal taxa, possibly separated by short branches, or perhaps even zero-length branches, from some internal node.

Joe Felsenstein said...

In principle, one can distinguish actual ancestors from their relatives. That is what the stochastic birth-death-fossilization model attempts. Of course the inference is noisy. But it is now being attempted.

I would also add to John's complaint about Simon's terminology the note that those terms are not widely used.

Bill Cole said...

John
"Well of course you don't understand it correctly. Try thinking about the questions I asked you. Your "5 adaptations" isn't 5 adaptations. It's a single adaptation requiring 5 mutations, all of which must drift to high frequency so that all are present in a single individual before the adaptation exists. There is no expectation or need that anything of the sort happen during the human-chimp divergence."

Thanks for the clarification. Do you then believe that all adaptations between chimps and man took less than 5 mutations?

Unknown said...

@Joe: I'd like to postpone discussion of the FBD model a bit, because I've got a manuscript pretty much done and want to submit it to PNAS, naming you as potential editor because you had that position for the Heath et al. paper.
AFAIK the FBD model does not try to distinguish actual ancestors from relatives; rather, it uses all possible combinations of actual ancestors and relatives and weighs them. If you used the model to try to resolve the issue by, say, an ML approach using realistic BD parameters, you'd pretty certainly end up with no fossils identified as ancestors.

On terminology: It's worth noting that there is a distinction between the stem line and the stem group. Both ancestors and SLRs are members of the stem group and thus simply referring to them as stem does not allow us to distinguish the two cases. The stem line consists of the succession of ancestral species leading up to a clade. The stem group consists of the stem line and extinct sister groups of stem line species. Stem groups are always paraphyletic, stem line representatives are generally not. (see e.g. Wägele JW 2005 "Foundations of Phylogenetic Systematics", Pfeil, Munich, pp.111-112). [I should also note that Wägele was nominally my boss until the end of May, so the terminology used in that book carries a bit more weight here].

John Harshman said...

Joe, I'd say that in principle all you can do is estimate the likelihood that a fossil belongs to an ancestral species, but that likelihood estimate would require a number of dubious assumptions and the likelihood would not be high. An observed zero-length branch might have a parametric length greater than zero, given the possibility of reversals.

John Harshman said...

Again, you are confused. There's a big difference between "took 5 mutations to reach their present state" and "required 5 mutations, each but the last being neutral". I'm not sure you are capable of understanding any of this. It seems that many creationists are less interested in understanding than in grabbing onto sound bites to support their beliefs.

Bill Cole said...

John
I am trying to understand your position and assume that this will make sense in the end. Suppose I asked: do you then believe that all adaptations between chimps and man took less than 5 mutations, where all 5 mutations were required for that specific adaptation? Would that support your understanding?

John Harshman said...

You persist in being unclear, probably because you aren't clear on what you mean. If you mean fewer than 5 mutations, all of which are necessary to be present in order to confer any advantage at all, then yes.

Bill Cole said...

John
"You persist in being unclear, probably because you aren't clear on what you mean. If you mean fewer than 5 mutations, all of which are necessary to be present in order to confer any advantage at all, then yes."

I think I got it. Thanks.