Saturday, May 03, 2014

Michael White's misleading history of the human gene

There are many ways of defining the gene but only some of them are reasonable in the 20th and 21st centuries [What Is a Gene?]. By the 1980s most knowledgeable biologists were thinking of a gene as a DNA sequence that's transcribed to produce a functional product.

They were familiar with genes that encode proteins and with a wide variety of genes that produce functional RNAs like ribosomal RNA, transfer RNA, regulatory RNAs, and various catalytic RNAs. It would have been difficult to find many knowledgeable biologists who thought that all genes encoded proteins.

By the 1980s, most knowledgeable biologists were aware of RNA processing. They knew that the primary transcripts of genes could be modified in various ways to produce the final functional form. They knew about alternative splicing. All these things were taught in undergraduate courses and written in the textbooks.

Here's how Michael White views that history in: Your Genes Are Obsolete.
As the century progressed, biologists came to see genes as real physical objects. They discovered that genes have a definite size, that they are linearly arrayed on chromosomes, that individual genes are responsible for specific chemical events in the cell, and that they are made of DNA and written in the language of the Genetic Code. By the time the Human Genome Project was initiated in 1988, researchers knew that a gene was a segment of DNA with a clear beginning and end and that it acted by directing the production of a particular enzyme or other molecule that did a specific job in the cell. As real things, genes are countable, and in 1999 biologists estimated that humans had "80,000 or so" of them.
If he means that knowledgeable researchers knew about genes for functional RNAs (e.g. ribosomal RNA genes) then he is right. If he thinks that knowledgeable researchers thought that all genes encoded proteins, then he's wrong.

As for the number of genes, I've addressed this in False History and the Number of Genes, and Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome.

There may have been researchers who speculated about the number of genes in the human genome but surely the only estimates that count are those from scientists who were knowledgeable about the subject. Those experts expected about 30,000 genes based on genetic load arguments and data from the early 1970s on the amount of DNA that was unique. Most of those researchers were expecting humans to have about the same number of genes as fruit flies, or maybe a few thousand more.

Michael White continues ....
Yet, when the dust from the Human Genome Project cleared, we didn’t have nearly as many genes as we thought. By the latest count, we have 20,805 conventional genes that encode enzymes and other proteins. Our inflated gene count, though, wasn’t the only casualty of the Human Genome Project. The very idea of a gene as a well-defined segment of DNA with a clear functional role has also taken a hit, and as a result, our understanding of our relationship with our genes is changing.
There are about 21,000 protein-encoding genes in our genomes and several thousand more genes that produce functional RNAs. The numbers may be a bit lower than most experts thought, but not by much. No great surprises there unless you count those people who made speculative guesses without knowing the data from the 1960s and 1970s.

And, there weren't many surprises about defining a gene either.
One major challenge to the concept of a gene is the growing evidence that many genes are shapeshifters. Instead of a well-defined segment of DNA that encodes a single protein with a clear function, we should view a gene as "a polyfunctional entity that assumes different forms under different cellular states," according to University of Washington biologist John Stamatoyannopoulos. While researchers have long known that genes are made up of discrete subunits called "exons," they hadn’t realized until recently the degree to which exons are assembled—like Legos—into sometimes thousands of different combinations. With new technologies, biologists are cataloging these various combinations, but in most cases they don’t know whether those combinations all serve the same function, different functions, or no function at all.
Maybe some people didn't know about RNA processing and alternative splicing but many of us did. No surprises there.

We don't know how many genes in the human genome are "shapeshifters" but there's a growing realization that many splice variants are just biological noise due to errors in splicing. Those variants have no biological function. The point is that the definition of "gene" wasn't affected by any discoveries by those who sequenced the human genome or by the ENCODE Consortium.
Our concept of a gene is also challenged by the fact that much of the function in our DNA is located outside of conventionally defined genes. These "non-coding" functional DNA segments regulate when and where conventional protein-coding genes operate. For our biology, non-coding regulatory DNA elements are as consequential as genes, but their properties are even more difficult to define because their function isn’t based on the well-understood Genetic Code and their boundaries are even fuzzier than gene boundaries.
No surprises there either. Knowledgeable researchers have known about regulatory sites since the 1960s. Most of them don't incorporate regulatory regions into the definition of "gene." Every gene is going to be associated with regulatory regions that regulate transcription.

I don't see why this well-known fact makes the definition of "gene" obsolete.
As a result, non-coding regulatory DNA elements are much more difficult to count. One consortium of researchers put the number of regulatory DNA segments in the human genome between 580,000 and 2.9 million, while just last month a different consortium claimed that there are only 43,000. Regardless of how you count them, it’s clear that these non-gene regulatory DNA elements far outnumber conventional genes. It is hard not to wonder, then, what good is the concept of a gene if it doesn’t include most of our functional DNA?
I think it's totally unreasonable to speculate that every gene would have 20 different regulatory sites scattered around the genome as some of those numbers suggest. If there were only a few near the promoter then this is exactly what we've known for decades and there's no reason to redefine a gene.
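
For a rough sense of scale, here is a back-of-the-envelope sketch of the ratios implied by the numbers quoted above. The counts are the ones given in the quoted passage and in this post; dividing by protein-coding genes only, and the function and variable names, are assumptions made for illustration.

```python
# Back-of-the-envelope ratio of claimed regulatory elements to genes.
# Counts are the figures quoted in the post; names are illustrative only.
PROTEIN_CODING_GENES = 21_000  # "about 21,000 protein encoding genes"

REGULATORY_ESTIMATES = {
    "first consortium, low end": 580_000,
    "first consortium, high end": 2_900_000,
    "second consortium": 43_000,
}


def elements_per_gene(elements, genes=PROTEIN_CODING_GENES):
    """Average number of regulatory elements per gene (simple division)."""
    return elements / genes


if __name__ == "__main__":
    for label, count in REGULATORY_ESTIMATES.items():
        ratio = elements_per_gene(count)
        print(f"{label}: about {ratio:.0f} regulatory elements per gene")
```

Under these assumptions the lowest estimate works out to roughly two sites per gene, while the larger estimates imply dozens to more than a hundred. Including the several thousand RNA genes in the denominator would lower every ratio somewhat.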

Finally, I don't know what Michael White was thinking but I've never heard any knowledgeable scientist say that all functional DNA has to be in "genes." So, what's the problem? If the definition of "gene" wasn't made obsolete with the decades-old discoveries of origins of replication, regulatory sequences, telomeres, and centromeres then what's changed in the past decade?

I don't get it. Why are so many prominent scientists saying that we need to redefine the word "gene"?


106 comments:

Jonathan Badger said...

Personally I blame the human genetics types. Because of practical and ethical reasons, in many ways the field is not as advanced as it is in model plants, animals and microbes. So the human data is noisy and confusing. Because people like to think they are special, they assume this noise must be all meaningful signal making us way more complicated than other organisms. To accept that we are not would be devastating to human exceptionalism.

SPARC said...

Unfortunately, splice variant databases contain many dubious entries that may be caused by reverse-transcribed, incompletely processed pre-mRNAs. Thus, the relevance of alternative splicing appears to be overrated (at least to someone who remembers single identical Northern bands from every tissue he analyzed and is now confronted with allegedly more than 10 alternative transcripts of the very same gene).
A recent paper in Genome Biology points in the other direction:
"Background
RNA sequencing has opened new avenues for the study of transcriptome composition. Significant evidence has accumulated showing that the human transcriptome contains in excess of a hundred thousand different transcripts. However, it is still not clear to what extent this diversity prevails when considering the relative abundances of different transcripts from the same gene.

Results
Here we show that, in a given condition, most protein coding genes have one major transcript expressed at significantly higher level than others, that in human tissues the major transcripts contribute almost 85 percent to the total mRNA from protein coding loci, and that often the same major transcript is expressed in many tissues. We detect a high degree of overlap between the set of major transcripts and a recently published set of alternatively spliced transcripts that are predicted to be translated utilizing proteomic data. Thus, we hypothesize that although some minor transcripts may play a functional role, the major ones are likely to be the main contributors to the proteome. However, we still detect a non-negligible fraction of protein coding genes for which the major transcript does not code a protein.

Conclusions
Overall, our findings suggest that the transcriptome from protein coding loci is dominated by one transcript per gene and that not all the transcripts that contribute to transcriptome diversity are equally likely to contribute to protein diversity. This observation can help to prioritize candidate targets in proteomics research and to predict the functional impact of the detected changes in variation studies."

Unknown said...

This comment has been removed by a blog administrator.

Joe Felsenstein said...

@Malia justin's comment is pure spam.

Joe Felsenstein said...

I would count as one not-very-molecularly-informed biologist who accepted the figure of 100,000 genes in the human genome. I was told it in the '60s and did not keep up with counts much after that. I knew the Drosophila figure was lower but did not realize how close to it the human count would be.

One earlier problem with White's account is his statement that

"As the century progressed, biologists came to see genes as real physical objects. They discovered that genes have a definite size, that they are linearly arrayed on chromosomes ...."

The realization that genes are linearly arrayed on chromosomes came in 1911 with the gene mapping work of Sturtevant. By 1919 JBS Haldane had published on his mapping function, which required the assumption of a linear array of genes. In the 1920s, under the leadership of HJ Muller, attempts were made to use the physics of radiation with X-rays to estimate the "target size" of genes. So by that date genes were being assumed to have a linear order on chromosomes and to be of a definite size.

White makes this date sound much later by describing it along with the realization that genes were DNA and had a genetic code, when these happened 30 years later.

Larry Moran said...

I tend to agree with you on this. Many scientists seem to be motivated by a need to make humans look significantly different from all other species. In other words, they don't really understand evolution.

There are two other issues that cloud their judgement: medicine and profit. If your main goal is to cure cancer or found a profit-making technology company then it's in your best interest to make things look as complicated and as revolutionary as possible.

Finally, let's not forget the influence of grants and other sources of funding for basic research. Nobody wants to admit that they have spent bundles of money studying noise and artifact. They certainly don't want to admit that they weren't smart enough to realize that they were on a wild goose chase. They wouldn't get funded in the next competition if they admitted that.

Fortunately for them, the members of the grant panels are in the same position. The committee members aren't going to punish someone who made the same mistakes they, themselves, made.

SRM said...

Finally, I don't know what Michael White was thinking but I've never heard any knowledgeable scientist say that all functional DNA has to be in "genes."

Maybe it's as simple as that. For decades a somewhat sloppy conceptualization regarding the molecular basis of life called genes the "blueprint" for making the organism. When it dawns on writers that there is more to phenotype than genes (coding regions), they think the concept of the gene must then be expanded to include all functional regions of the genome - hence a fuzzying of the concept of gene. This would seem to have the eventual effect of rendering the word "gene" rather meaningless except perhaps to distinguish between functional and non-functional regions of DNA. But there are already words for that and for non-coding regions of DNA such as promoters, TF binding sites, origins of replication, etc.

Another possible factor is that it seems anytime a firm definition is proffered in biology, one will eventually encounter a circumstance that suggests a need for caveats or qualifications to the definition. These exceptions seem to excessively trouble people at times.

Unknown said...

"Another possible factor is that it seems anytime a firm definition is proffered in biology, one will eventually encounter a circumstance that suggests a need for caveats or qualifications to the definition. These exceptions seem to excessively trouble people at times."

I don't think that's right. The issues with definitions arise because there's a tendency to use existing names for new concepts. I really like the Williams definition of gene. It's a very cool concept. But it's no good that it's called gene. The same holds for species definitions. It's not that definitions are problematic. It's that some people find existing concepts not useful for their work, then define some concept that is and call this new concept species.

Basically: If you define some useful concept, don't reuse words already associated with another concept. The whole point of having technical terms is that you have a shorthand for something complex and the people in the same field get it. If you have to start your paper by explaining what you mean by gene or species, then nothing is won.

Anonymous said...

Simon Gunkel: As someone who deals with species concepts a lot -- I'm a plant taxonomist -- I disagree with the idea of replacing "species" with different terms when we have to use a different definition of "species." Basically, a species is a "kind" of organism. The different species concepts, or definitions, result from the need to explain how we're parsing up variation into named units, given that organisms have very different reproductive biology, population genetics, and patterns of variation.

I think of “species” as a term we apply by different standards (species concepts) depending on the organism’s breeding system and patterns of variation. The Biological Species Concept works well for many organisms – use it when you can. If the organisms are asexual or self-fertilizing, or a mix of selfing and outcrossing, or don’t ever meet, or seem to fall into distinct groups although they produce fully fertile offspring in the rare cases when they do meet – adjust to a different species concept. Although I’m sure we have different words we could use for the species named according to different species concepts (biologists are great at generating terminology) those words don’t make our lives better, so we just call them all species – because they are all “kinds”.

We humans like nice clear words with nice simple referents, but reality doesn’t cooperate. So we adjust, sometimes with new words, sometimes with flexibility in the words we already have.

Unknown said...

"The different species concepts, or definitions, result from the need to explain how we're parsing up variation into named units"

But that's the crux. The point of the BSC is not to parse up variation into named units. The point of the BSC is to define natural classes by the rate at which gene flow occurs. If there's little gene flow between two sets of organisms, they will tend to fix different alleles and hence accumulate differences. If there is a lot of gene flow, they will tend to fix the same alleles. That change in dynamics is not dependent on the particulars of the biology of these sets - either there is enough gene flow or there isn't.

Now, I agree that we need some way - and most likely more than one way - of dividing up variation. Concepts that do this are fundamentally different from the BSC and I've previously voiced the opinion that they should be called type concepts rather than species concepts. Biology demands concepts based on the evolutionary dynamics of organisms. It also demands concepts based on other criteria. No concept can serve both demands and for that reason alone, there should be two terms.

On a larger scale, consider categories like "Herbivores" or "Trees". These are useful categories for some questions. But it would be a very bad idea to refer to ecological or morphological descriptors as clades. We do want something that refers to phylogeny and phylogeny alone. And we use clade to indicate that. We might need non-clade descriptors for some questions and quite reasonably don't refer to them as clades. Species and gene didn't get that lucky...

Unknown said...

Larry, I am new to the Sandwalk but I must admit that I like it. I have commented on many blogs, but I have also been blocked from many (e.g. Lifesitenews) simply because I presented evidence that disagreed with them. However, you continue to allow people like Quest to comment (to his intellectual embarrassment) and when you do delete a comment, you are transparent about it. The ID and religious sites are never that open about it.

John Harshman said...

The point of the BSC is to define natural classes by the rate at which gene flow occurs.

Not the case. There are many possible reasons for lack of gene flow between populations, and the BSC considers only one of them: genetically-based isolating mechanisms. Simple allopatry, for example, can result in zero gene flow without dividing the two populations into two species under the BSC. Then again, under the phylogenetic species concept, allopatric populations would easily be separate species in the absence of any other isolating mechanism.

Rosie Redfield said...

After reading James Gleick's The Information, I now teach that genes are primarily informational constructs, not structures. (Well, of course, all of genetics is understanding the intersections between physical molecules and information.)

Tom Mueller said...

Hi Joe

I must be getting old.

Forgive me for rehashing, but that is a prerogative of age.

I still remember when cis-acting regulatory “genes” were still “genes” and not “elements”.

There once were two molecular mechanisms for dominance vs. recessive.

Whacking through Google Scholar indicates the frequency of the old-fashioned pairing “operator gene” diminishes just before the turn of the century.

There must have been a conference.

Your point about Morgan's fly lab is well taken... he and his descendants had a rigorous grasp of genetics that still makes sense to me. I wonder what Barbara McClintock's reaction would be?

I still say: "If you can mutate it and map it - it's a gene."

I realize that modern definitions no longer comply - making me wonder how I am supposed to teach the notion of "regulatory genes".

end of rant...

Tom Mueller said...

@ Simon & John

Please forgive my inadvertent rudeness... I have not had an opportunity to reply to your patient and indulgent responses to my more recent naive inquiries.

I am in debt to both of you... but right now I am up to my ear-lobes in alligators. I should give your replies regarding phylogenetics the considered response they deserve by tomorrow.

best and grateful regards

Robert Byers said...

The error here in classification is that the only option for genetic change is evolution, say evolutionists.
The bible says there are kinds. Fixed and settled.
Yet within the kinds, as shown by people, diversity can be great.
So species is just a moment in time. There are no species, just as people are not species, though often more different than species in biology classifications.
That's why the terms don't work.
There are KINDS but no species. Not species changing until it's a very new thing.
No reason to see it that way.

Tom Mueller said...

Hi again Joe,

The genome may be smaller than expected - but the proteome is an order of magnitude larger than expected! ... over 1 000 000

Meanwhile, alternative splicing/promoters/editing can bring the transcriptome back up to that 100 000 mark.

OK - I was not provocative on purpose, I understand that much of the transcriptome is junk and not functional... just the same, those numbers are impressive.

http://www.piercenet.com/media/Proteome-Complexity-Figure-650px.jpg

Transposable elements, or “selfish genes”, are probably responsible for over 50% of our genome; but not all are necessarily selfish: Alu (for example) may be a symbiont.

If I understand this correctly (please help me out here...) Aligning human/mouse/other critters' genomes indicates that 5% of our genome is under selection. Do I understand this correctly? So how do we get a handle on how much is really functional - even if only serving the role of genomic clean fill as it were?

Tom Mueller said...

Jonathan - I think part of the problem is that the choice of model organisms was, by happenstance, exclusively Ecdysozoa (nematodes and fruit flies), which, if I may rehash yet again, are far removed (i.e. derived) from a deuterostome perspective, which is far more basal. For example, gene regulation via DNA methylation seems far more prevalent in Deuterostomes than Ecdysozoa.

I need to return to John's and Simon's earlier rejoinders on that score before pushing that point any further.

Tom Mueller said...

Oh wow!!!! Hi Rosie, it's an honor.

What is your reaction to my cantankerous rant: "...if you can mutate it and map it, it is a gene"!

Call me cynical, but was perchance the definition of "gene" restricted by James Watson and his crowd to make the HGP more manageable? I notice that the frequency of the term "Operator Gene" coincidentally drops precipitously off the radar of Google Scholar by the 1990s.

John Harshman said...

How do you know which differences between deuterostomes and ecdysozoans are derived in which taxon?

The whole truth said...

Robert, are Homo sapiens members of the ape/primate kind?

Tom Mueller said...

Hi John

Point well taken: I am making some presumptions suggesting the importance of methylation in gene regulation in both Deuterostomes and Locotrophozoans represent neither an apomorphy nor a homoplasy but rather a synapomorphy.

As I mentioned in our last exchange on the other thread which I need to return to… I am becoming more and more convinced that my notion of cladistics (we discussed earlier) is not so incorrect (at least according to Wikipedia) and that in fact Deuterostomes can be considered a basal outgroup to Ecdysozoans and Locotrophozoans no differently than Choanoflagellates can be considered a basal outgroup to Eumetazoa.

I need to get back to you on that.

Piotr Gąsiorowski said...

Tom, what's the advantage of using the term basal outgroup rather than simply sister clade? The adjective "basal" is misleading. It suggests some sort of disparity between sister branches, whereas in fact any node in the family tree can be rotated. If Deuterostomia are "basal" with respect to the rest of Bilateria (whatever you call them -- OK, let's call them Bobs), then Bobs are "basal" with respect to Deuterostomia as well.

By the way, it's "lophotrocho...", not "locotropho...".

Larry Moran said...

Tom Mueller says,

The genome may be smaller than expected - but the proteome is an order of magnitude larger than expected! ... over 1 000 000

There are a large number of known post-translational modifications. In some cases these lead to variation in the final form of functional proteins. For example, there can be proteins that are active in both phosphorylated and non-phosphorylated forms.

However, most proteins have a single modified form that's biologically active whether this be glycosylated, after removing an N-terminal leader sequence, or after forming disulphide bonds. It's very unlikely that each and every protein-encoding gene produces, on average, five different post-translationally modified proteins that are biologically active. That's what would have to happen for your statement to make sense.

Meanwhile, alternative splicing/promoters/editing can bring the transcriptome back up to that 100 000 mark.

OK - I was not provocative on purpose, I understand that much of the transcriptome is junk and not functional... just the same, those numbers are impressive.


That number (100,000 transcripts) is only "impressive" if you assume that it has some biological meaning. If most of those transcripts are junk RNA and the total number of biologically active mRNAs is closer to 25,000, then it looks much less "impressive."

Many people are looking for ways to make humans special but the logic of their argument leaves a lot to be desired. If you are going to count all possible post-translational modifications as a way of making up for the low number of genes in the human genome, then you also have to consider that fruit flies and nematodes also have just as many post-translational modifications.

Thus, the proteome of most other eukaryotes also consists of one million proteins and The Deflated Ego Problem [http://sandwalk.blogspot.ca/2007/05/deflated-ego-problem.html] still exists.

Same for alternative splicing. It doesn't save you because all other eukaryotes also have alternative splicing.

Tom Mueller said...

Hi Piotr

Re: what's the advantage of using the term basal outgroup rather than simply sister clade?

Please consider my target audience, high school students as explained here: http://tinyurl.com/n4e4nbj

I try to avoid technical terms like “plesiomorphic state” even though that is what you and I both understand.

Examine a vertebrate cladogram

I maintain there is an intuitively apparent cogency to the notion that some modern species can still indirectly represent modern ambassadors of some more basal evolutionary stage, a stage that preceded more derived evolutionary stages (Darwin’s descent with accumulated modifications).

That is why cladograms are such powerful tools! Cladograms reveal probable relationships and degrees of relationships between groups of organisms, along with the relative times when different lines branched off (speciation occurred), showing common ancestry.

I like to explain it to my students this way – Let’s describe the Last Common Ancestor to Eumetazoa; it most likely had a dorsal nerve cord, indeterminate radial cleavage during embryogenesis and employed methylation as an important mechanism for gene regulation.

Are Deuterostomes in fact “basal” when compared to Ecdysozoa? From an insect’s POV, aren’t humans less derived? Wouldn’t insect high school students regard us in the same condescending light that human high school students regard jawless fish? … without invoking in so many words your “plesiomorphy”.

I think my way of presenting this to students is not off-track and in fact concurs with http://en.wikipedia.org/wiki/Basal_(phylogenetics)

I understand your criticism that two or more equally basal clades branch off from the root of every cladogram.

My point is trivial – some branches occur earlier than others. That is why some modern species can still indirectly represent modern ambassadors of some more basal evolutionary stage.

Unless I am really missing something here – John Harshman’s and Simon Gunkel’s criticisms of my explanations are possibly misplaced, and my version does in fact make very “substantive claims” in terms easily understood by high school students.

My use of "clade" concurs with http://en.wikipedia.org/wiki/Basal_(phylogenetics)

I also insist that terms like monophyletic can be very misleading. For example, is pseudocoelomy monophyletic or paraphyletic? Well, if both Lophotrochozoans and Ecdysozoans have pseudocoelomate representatives, paraphyletic would be the correct answer.

Unless of course, both versions of pseudocoelomy represent atavisms to an earlier ancestral stage that preceded even the last common coelomate ancestor that gave rise to both Lophotrochozoan and Ecdysozoan lineages.

See where I am going with this?

Ps – thanx for the spelling correction… mind you my splelling mistake does have a certain appeal ;-)

John Harshman said...

Once again, what do you mean by "basal"? I believe I've tried to describe the common misunderstanding of the term, which I suspect you are suffering from.

To repeat, "basal" tends to mean "less speciose of two sister taxa", but with the attached and unsupported assumption that it's also primitive. So, how do you tell whether a particular character has a primitive state in some taxa? Generally, using parsimony. A character state shared with one or several closest outgroups is primitive. Unicellularity in choanoflagellates is estimated to be primitive, because the immediate outgroups to the choanoflagellate-metazoan clade are also unicellular. So, what outgroups to Eumetazoa share whatever you think is primitive in Deuterostomia?

Unknown said...

A few points:

1) Gene counts: It's great that some people had better estimates of the number of human genes before the Human Genome Project, but the larger estimates were widely believed, particularly in the molecular biology circles I was in when the draft genome sequence was published. It sounds like those with better ties to the classical genetics/evolution community believed more accurate estimates, while the molecular bio people, including the leaders of the HGP, were going with the inflated number. My 1998 Lodish et al. text puts the number at 60,000-100,000.

2) The point of my piece is that the definition of a gene was broad at first - something that would have included cis-regulatory elements - but it developed into something much more restricted by the end of the 20th C, at least in the community associated with genome sequencing projects. That more restricted definition, the molecular concept of a gene as an ORF or RNA gene, was a big motivator behind genome sequencing projects - we would solve the organism largely by assigning a function to each gene, first in yeast and other model organisms, and eventually in humans, as described here: http://www.ncbi.nlm.nih.gov/pubmed/15451511

3) Maybe someone here knows the answer: did anyone writing before 2000 expect that there was substantially more conserved regulatory DNA than coding DNA? These days, there is a big emphasis on regulatory genomics, which, as far as I can tell, was not expected 20 years ago.

4) You may not like John Stam’s estimate of 580,000 functional regulatory elements, but it’s not the result of idle speculation, it’s the result of associating DHS with specific genes. You may not believe the result (personally I favor the FANTOM estimate of 43,000), but it’s a serious attempt to put a number on regulatory DNA.

Tom Mueller said...

Hi John.

I answered your question here: http://tinyurl.com/og8yvpp

“I reckon characters or traits are considered “derived” if they are absent in a “basal group”, but present in presumed "later" groups because these traits emerged later on along the cladogram as it were.

I understand “basal group” to be some version of a modern ambassador representing the last common ancestor of the group. The clade jawless fish would be basal to mammals. That is not to say that modern jawless fish are identical to the last common ancestor of jawless fish and mammals.


You suggested back then that

Tom,

I think that's confused. Modern species can't represent the last common ancestor of a group, and jawless fish aren't a clade. I think you are conflating characters and taxa. Taxa are, or are supposed to be, clades. Characters may or may not diagnose clades.

Now, I think you're using "basal" to mean "primitive",…


My reading of Wikipedia indicates I am not off target here: http://en.wikipedia.org/wiki/Basal_(phylogenetics)


The term basal can only be correctly applied to clades of organisms, not to individual traits possessed by the organisms—although it may be misused in this manner in technical literature. While the term "basal" applies to clades, characters or traits are usually considered derived if they are absent in a basal group, but present in other groups. This assumption only holds true if the basal group is a good analogy for the last common ancestor of the group.

For example, Orangutans are a sister group to Homininae AND [emphasis mine] are the basal genus in the family as a whole”

I was most precise and explicitly avoided any conflation with “primitive”… at which point Simon suggested my claims were no longer “substantive”.

I seem to be getting caught between a rock and a hard place here.

Going back to wikipedia I also read: http://en.wikipedia.org/wiki/Clade#Terminology

The relationship between clades can be described in several ways:
- A clade located within a clade is said to be nested within that clade. In the diagram, the hominoid clade, the apes and humans, is nested within the primate clade.
- Two clades are sisters if they have an immediate common ancestor. In the diagram, lemurs and lorises are sister clades.
- A clade A is basal to a clade B if A branches off the lineage leading to B before the first branch leading only to members of B. In the diagram to the right, the strepsirrhine clade, including the lemurs and lorises, is basal to the hominoids, the apes and humans. Some authors have used "basal" differently, using it to mean a clade that is "more primitive" or less species-rich than its sister clade; others consider this usage to be incorrect.


I do NOT employ “basal” as a synonym for “primitive”. As a matter of fact, I was most explicitly stating the contrary! In fact my reading of this last wikipedia explanation states exactly my intent and understanding.

I apologize if my own turn of phrase was less clear.
Best regards

Tom Mueller said...

Hi Larry

It would appear the following link is guilty of hubris and hyperbole:
http://www.piercenet.com/method/overview-post-translational-modification

…the myriad of different post-translational modifications exponentially increases the complexity of the proteome relative to both the transcriptome and genome.

It goes on to say…

… Indeed, it is estimated that 5% of the proteome comprises enzymes that perform more than 200 types of post-translational modifications (4). These enzymes include kinases, phosphatases, transferases and ligases, which add or remove functional groups, proteins, lipids or sugars to or from amino acid side chains, and proteases, which cleave peptide bonds to remove specific sequences or regulatory subunits. Many proteins can also modify themselves using autocatalytic domains, such as autokinase and autoprotolytic domains.

I refer to this link in class - and yes, I did naïvely infer that PTM generates, on average, five different post-translationally modified proteins that are biologically active... at a minimum.

This would not be the first time that my participation on this forum has corrected misconceptions that I have perpetrated in the classroom. Again, I remain in your debt!

I take it you are suggesting these claims to be over the top.

ITMT - I hope you are not suggesting I suffer along with some others from the so-called “Deflated Ego Problem”?!

If anything – I was suggesting that, in evolutionary terms, Ecdysozoa are far more “special” (whatever that is supposed to mean) than humans or other Deuterostomes.

I do suspect that many Ecdysozoa features (such as absence of DNA methylation in gene regulation) represent derived characteristics as compared to what I like to think was probably the “original default setting” as still found in more “basal” Deuterostomes.

Piotr Gąsiorowski said...

You would do well to consult the article referenced by Wikipedia at that point (Krell & Cranston 2004, "Which side of the tree is more basal?")

Unknown said...

@John: You're right. I should have written "maximum rate at which gene flow can occur" and trying to shorten the sentence made it imprecise to the point of being misleading.

Arguably the PSC matches the ISC when they are applied in deep time (because if you have long time extrinsic genetic isolation, speciation is likely to occur). But for short time spans there is that mismatch.

John Harshman said...

...for definitions of "short" that may extend into the millions of years.

Unknown said...

@Tom: Actually, you aren't - The first wiki entry notes under criticism:
"Despite the ubiquity of the usage of the term, some systematists believe it is unnecessary and misleading."
I can see one use for the term and that is referring to nodes. In most cases you can refer to these in terms of ancestors, but if you are comparing two different phylogenies you might want to reference nodes directly.

The problem with the use in the second wiki article is that the phylogeny simply says that there are two adelphotaxa, one of which consists of Strepsirrhini and the other of Haplorhini. It makes little sense to single out either as basal.

John Harshman said...

Tom,

If that's what Wikipedia says, then it's just as confused as you are, I'm afraid. Now you could use "basal" in the way your second quote suggests, as long as you understand that the catarrhines are also basal to the strepsirrhines (assuming that strepsirrhines are themselves a clade). Similarly, one could say that hominines are the basal clade in the family as a whole. That is, either of two sister groups could equally be considered basal to the other. Now, to me, that says that "basal" means nothing useful. I swear that isn't how you're using it. But I'm no longer clear how you're using it.

Further, characters are not "derived" if they're absent in a basal group, even by the meaning Wikipedia apparently intends. You can't polarize any character with reference only to sister taxa. You need an outgroup at a minimum.

So, what exactly do you mean when you say that deuterostomes are basal? What do you mean when you say they could be used as representatives of the ancestor? You are certainly not being clear.

Unknown said...

I'm a paleontologist by trade. "Recently" to me always means "sometime in the Holocene", and I regularly referred to my 24.8 Ma old material as "fairly young". So yeah, I think I would feel uncomfortable with "short" when it's tens of Myr, but for anything that has the millions in single digits I'm happy to use "short".

John Harshman said...

Can we agree that there is no particular period over which we can know that an isolating mechanism will evolve in allopatry, only a rough expected half-life of same-species status? That is, there's clearly a stochastic element in speciation. We know that it would probably happen faster under selection than otherwise, but that just tells us that the expected half-life under selection is shorter than that in the absence of selection. For estimates of the relative time to speciation under selection and in its absence, see Coyne & Orr.

Larry Moran said...

Mike White says,

It's great that some people had better estimates of the number of human genes before the Human Genome Project, but the larger estimates were widely believed, particularly in the molecular biology circles I was in when the draft genome sequence was published.

I don't dispute that. It's not an excuse, it's a problem.

The point of my piece is that the definition of a gene was broad at first- something that would have included cis-regulatory elements ...

It's possible to find some examples of such a definition but most of the textbooks in the 60s, 70s, and 80s excluded regulatory sequences. It's much more common to find textbooks defining a gene as something that encodes a polypeptide and then read a discussion about ribosomal RNA genes and tRNA genes in the same book. You see this in all the early editions of Lewin's books, for example (GENES II, III, IV).

... but it developed into something much more restricted by the end of the 20th C, at least in the community associated with genome sequencing projects.

I think we've established that the group involved in sequencing was not necessarily the best group to define a gene. That's my point.

Maybe someone here knows the answer: did anyone writing before 2000 expect that there was substantially more conserved regulatory DNA than coding DNA?

Not me. Still don't.

These days, there is a big emphasis on regulatory genomics, which, as far as I can tell, was not expected 20 years ago.

The importance of regulation has been emphasized for at least forty years. Back in 1977, Stephen Jay Gould wrote an entire book about it (Ontogeny and Phylogeny). He argued that most visible changes in morphology are due to changes in how genes are regulated. In subsequent years there have been a number of developmental biologists who pushed the idea strongly.

Gould also wrote the following in 1977 ...

... those ignorant of history are not condemned to repeat it: they are merely destined to be confused.

Most of us are well aware of the hype generated by the ENCODE Consortium and others. They think they've discovered regulation on a massive, unprecedented scale. That doesn't make it correct. This is the same group that didn't understand what a gene was and doesn't understand evolution.

They don't have a very good track record.

I agree with you about one thing. I did not expect twenty years ago that so many prominent molecular biologists would actually believe that the average human gene required more than 2000 bp of regulatory DNA (i.e. at least 200 transcription factor binding sites). (Average coding region is 2000 bp and a generous estimate of the size of a transcription factor binding site is 10 bp.)

You may not like John Stam’s estimate of 580,000 functional regulatory elements, but it’s not the result of idle speculation, ...

According to that estimate, there are, on average, 20 functional regulatory elements per gene. We know that the pol-I and pol-III genes don't need this many so that means the average for the other pol-II genes is even higher.

Why would a typical housekeeping gene need more than 20 functional regulatory elements?
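The arithmetic behind the two figures in this comment can be checked in a few lines. This is only a sketch of the back-of-envelope reasoning: every input is a number quoted in the thread, and the implied gene count is simply what the quoted average of 20 elements per gene works out to, not an independent estimate.

```python
# Back-of-envelope arithmetic using only figures quoted in the thread.
REGULATORY_BP_PER_GENE = 2000   # bp of regulatory DNA per gene (quoted claim)
BINDING_SITE_BP = 10            # "generous" size of one binding site, in bp
TOTAL_ELEMENTS = 580_000        # quoted estimate of functional regulatory elements
ELEMENTS_PER_GENE = 20          # average per gene stated in the comment

# How many 10 bp binding sites fit in 2000 bp of regulatory DNA?
sites_per_gene = REGULATORY_BP_PER_GENE // BINDING_SITE_BP

# What gene count does "20 elements per gene" imply for 580,000 elements?
implied_gene_count = TOTAL_ELEMENTS // ELEMENTS_PER_GENE

print(sites_per_gene)       # 200
print(implied_gene_count)   # 29000
```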

If you can't point me to a paper where John Stam... seriously discussed this issue then I'm going with "idle speculation."

You may not believe the result (personally I favor the FANTOM estimate of 43,000), but it’s a serious attempt to put a number on regulatory DNA.

With all due respect, if this is a "serious attempt" then the field is in big trouble. Mike, do you (or anyone else) know of ten examples of genes that have been studied in detail where ten or more essential functional regulatory elements have been identified and characterized?

Tom Mueller said...

Hi Piotr

I love the link – thank you!

Fig. 2, the Polyneoptera are ‘the most basal clade’ of the Neoptera; in Fig. 3 it is the Eumetabola.

As both trees represent exactly the same phylogeny, calling one of the equivalent sides of the tree the most basal side makes no sense.


I agree because we are discussing TWO sister clades. I said as much in our previous exchange when focusing on jawless fish.

Compared to Lampreys - Mammals may have acquired more bells and whistles (and often subsequently lost) that we (as mammals) find particularly interesting. Such thinking betrays a mammal-centric bias on our part. Modern Lampreys have also acquired (and often subsequently lost) many bells and whistles compared to their ancestors that from a Lamprey’s POV are even more interesting.

We seem to agree. Now what about discussing three clades where only two later clades are nested in an earlier clade?

Back to that same Wikipedia article:

A clade located within a clade is said to be nested within that clade. In the diagram, the hominoid clade, the apes and humans, is nested within the primate clade.

A clade A is basal to a clade B if A branches off the lineage leading to B before the first branch leading only to members of B…

And on the other wikipedia article:

…orangutans are a sister group to Homininae and are the basal genus in the family as a whole

That is all I was attempting with Deuterostomes, Ecdysozoans and Lophotrochozoans. Which split occurred first? Which clade is basal and preceded the split of the next two clades?

I contend it STILL makes sense to inquire about the last Common Ancestor to Eumetazoa; either it had a dorsal nerve cord, or it didn’t; either it possessed indeterminate radial cleavage during embryogenesis, or it didn’t; did it employ methylation as an important mechanism for gene regulation or not?

I now recognize I misspoke with: Are Deuterostomes in fact “basal” when compared to Ecdysozoa? From an insect’s POV, aren’t humans less derived? Wouldn’t insect high school students regard us in the same condescending light that human high school students regard jawless fish?

It should have read:

Are Deuterostomes (as a clade) in fact “basal” when compared to Ecdysozoa? From an insect’s POV, aren’t (some human features more ancestral and) less derived? Wouldn’t insect high school students regard us in the same condescending light that human high school students regard jawless fish?

Piotr – I thank you for this opportunity to focus my thoughts and hone my analysis.

I remain in your debt.
dziękuję

Tom Mueller said...

Hi John

I do not understand your contention that "basal" tends to mean "less speciose of two sister taxa"

… that suggestion strikes me most bizarre!

Unless I am misunderstanding you, that suggestion to my way of thinking is identified as Misconception #6

I think your point about “nodes” is right on target! When comparing two sister clades, neither is more basal than the other. I get it!

However, when comparing three different clades; one node must have preceded the other. Only then does it make sense to suggest one clade is “basal”

To my understanding: “basal” and “outgroup” are not synonymous.

How about Chimpanzees, Humans and Orangutans? The “node” that separates the Orangutan lineage from Humans/Chimpanzees antecedes the node separating Humans and Chimpanzees. ITMT, neither lineage is more “speciose” than the other.

So what am I not getting?

Best and grateful regards for your patience and your indulgence.

Tom Mueller said...

@ Piotr & John

One last kick at the can before I go to pick up my kids...

How about this:

Are Deuterostomes (as a clade) in fact “basal” when compared to both Ecdysozoa AND Lophotrochozoa? From an insect’s POV, aren’t some human features more ancestral and less derived? Wouldn’t insect high school students regard humans in the same condescending light that human high school students regard jawless fish?

This final attempt of mine above: is it not akin to discussing nodes separating Chimpanzee, Human and Orangutan on a cladogram?

Best and grateful regards

Unknown said...

We can agree on that. But the probability of two populations being isolated for say - 5Myrs and not ending up with some intrinsic mechanism for isolation is pretty low. And for infinite time it's almost impossible (p=0).

My point there is simply that the ISC and the PSC are decent proxies for each other in deep time. Given that we generally have to use morphotypes as proxies for species anyway, I don't think ignoring the difference in a fossil dataset is going to introduce a large error.

Tom Mueller said...

@ John

oops - I almost forgot

I thought I remembered reading somewhere that Unicellularity in choanoflagellates may be an atavism and that choanoflagellates probably had a multicellular ancestor.

ditto Baker's Yeast.

Maybe my memory is playing tricks on me...

I need to pick this up later, family beckons.

yet again - best and grateful regards. I thank you for your indulgence and your patience in bringing me up to speed.

Tom Mueller said...

@ Simon

What's your take on the recent Elephant Shark data?

http://sandwalk.blogspot.ca/2014/01/can-some-genomes-evolve-more-slowly.html

Is it possible some genomes evolve an order of magnitude slower than others meaning in fact it may take much longer for some lineages to diverge?

SRM said...

I think the term operator gene was common early on particularly in the early studies of the lac operon of E. coli. All of this is before my time, but I can tell you that these days such terminology would be highly unusual and I would never expect to see it in modern manuscripts. The dropping of such terminology would have had nothing to do with the HGP.

John Harshman said...

I don't agree that the probability is low given 5 million years. Depends on the organisms, of course. But I know of populations with fairly high mitochondrial divergence that are not accounted separate species.

I don't know what that would have to do with fossils, though. I don't think you can reliably distinguish either biological or phylogenetic species in fossil samples.

John Harshman said...

Now what about discussing three clades where only two later clades are nested in an earlier clade?

I'm not sure what you mean by that or how it relates to "basal".

… that suggestion strikes me most bizarre!

I agree. But that's what people usually mean, whether they realize it or not. That's why I don't like the term. And it helps if you reify taxonomic ranks. Thus Pongo is said to be the basal genus in Hominidae. But why shouldn't Gorilla/Pan/Homo be considered a single genus and the basal genus in Hominidae? What, objectively, would be the difference? There are only two reasons for considering one of two sister taxa "basal": if you suppose there to be a main line of evolution, in which one taxon diverges from the main line and another exemplifies it, and reification of ranks, in which "genus" means something non-arbitrary. Neither of these being true, there's no point to "basal". Yes, this is misconception #6. And you would appear to have fallen victim to it; again, it makes no more sense to say that Deuterostomia is basal to Lophotrochozoa/Ecdysozoa than to say that Lophotrochozoa/Ecdysozoa is basal to Deuterostomia. They're just sister taxa. It's only the way you choose to give one branch two names and the other just one name that gives any impression that they differ.

Further, you have inconsistently used the "basal" status of a taxon to discuss ancestral character states, even though you deny any validity to such a notion. You do it above when you talk about basal clades and ancestral characters together.

You are correct about nodes; we can sensibly talk of nodes as basal to one another. But we can't apply the same idea to clades. The node separating Deuterostomia from Lophotrochozoa/Ecdysozoa is basal to the node separating Lophotrochozoa from Ecdysozoa. But that says nothing about Deuterostomia being a "basal" clade.
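The distinction between basal nodes and "basal" clades can be made concrete with a small sketch. It assumes a rooted tree written as nested tuples, and the helper names `tips` and `node_depths` are hypothetical, chosen only for this illustration. Each internal node is identified by the set of tips below it, so its depth does not depend on which sister branch is drawn first.

```python
# Rooted tree as nested tuples; each tuple is an internal node.
TREE = ("Deuterostomia", ("Lophotrochozoa", "Ecdysozoa"))

def tips(tree):
    """Return the frozenset of tip names below a subtree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))

def node_depths(tree, depth=0, out=None):
    """Map each internal node (identified by its set of tips) to its
    distance in branches from the root; a smaller depth is a more basal node."""
    if out is None:
        out = {}
    if not isinstance(tree, str):
        out[tips(tree)] = depth
        for child in tree:
            node_depths(child, depth + 1, out)
    return out

depths = node_depths(TREE)
# The same phylogeny drawn with the sister branches swapped.
flipped = node_depths((("Ecdysozoa", "Lophotrochozoa"), "Deuterostomia"))

print(depths[frozenset({"Lophotrochozoa", "Ecdysozoa"})])  # 1
print(depths == flipped)                                   # True
```

The node ordering is a property of the tree, while the two branches leaving the root node are interchangeable, which is why the sketch assigns depths to nodes and not to either sister clade.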

And yes, the sister group of Pongo is definitely more species than Pongo.

I would certainly be interested in any data showing that choanoflagellates are secondarily unicellular.

John Harshman said...

Curse auto-correct: And yes, the sister group of Pongo is definitely more speciose than Pongo.

Unknown said...

Well, there's certainly rate heterogeneity in most phylogenies (actually something I'm currently involved with. I'm looking at calibrations for relaxed molecular clocks in holometabolan insects). I'm very happy with the Graur & Martin paper referenced in that post - it's fun to read even when you're not directly involved with the subject and if you happen to be the paleo-guy in a project consisting mainly of molecular biologists provides a very good justification for your being there...

(One thing to note in Larry's post is that "This is a lot more difficult than it seems because it requires fossils that lie close to the node and accurate dates." isn't quite right. The software used to produce RMC hypotheses supports soft calibration points, which means that you do not generally need fossils that are close to the node and dated very accurately - though it helps - but you can derive probability distributions for the node date from the fossil record, which then take into account uncertainties in these. Using multiple calibration points also helps.)

As far as rates of speciation go, they don't exclusively depend on mutation rates. The basic concept of species selection sensu Stanley is sound and the distribution of diversity in clades doesn't suggest a neutral model (in this case, that would mean that apomorphies don't affect the rate at which allopatric populations are formed and maintained).

Even if mutation rates vary by an order of magnitude, that simply means that we get a larger 95% CI for our estimate. Again, I doubt the match between morphotypes and either the ISC or the PSC is close enough for the error made when using either as a proxy for the other to matter in the fossil record.

Unknown said...

@John: How big would you estimate that probability to be? The example I recall for large divergence of mtDNA without speciation is Limulus p. and I'm not sure about current estimates for the age of the species.

An estimate of the distribution of per-lineage speciation rates in which 0.2/My isn't on the low end seems very low to me (I think a rate of 1/My is the default for simulations using evolver, and at that rate a duration of 5 My or more comes with a probability of 0.67%, which does seem low to me). Alroy (1996) estimated a per-lineage extinction rate of 0.91/Myr in mammals. Mammals are still around, so speciation rates should be higher than that.
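The 0.67% figure is what you get if the time to speciation is treated as an exponential waiting time with a constant per-lineage rate. That model is an assumption made here for illustration (the comment does not state one), and the function name `prob_no_speciation` is hypothetical. Under it, the probability of lasting at least t million years without speciating is exp(-rate × t).

```python
import math

def prob_no_speciation(rate_per_my: float, duration_my: float) -> float:
    """Probability that a lineage lasts at least `duration_my` million years
    without speciating, assuming a constant-rate (exponential) model."""
    return math.exp(-rate_per_my * duration_my)

p_default = prob_no_speciation(1.0, 5.0)  # rate of 1/My, as in the comment
p_slow = prob_no_speciation(0.2, 5.0)     # rate of 0.2/My, for comparison

print(f"{p_default:.2%}")  # 0.67%
print(f"{p_slow:.2%}")     # 36.79%
```

At the slower rate of 0.2/My the same five million years leaves roughly a one-in-three chance of no speciation, so the disagreement in the thread comes down to which rate is realistic for the organisms in question.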

Piotr Gąsiorowski said...

Note, however, that the living representatives of Ponginae (two extant species of Pongo) are survivors of a larger group that includes at least Khoratpithecus, and (more speculatively) a few other fossil genera, such as Sivapithecus and Gigantopithecus.

John Harshman said...

The sister group of Ponginae also has fossil members, so I don't think the orangutans gain much prestige from that.

Robert Byers said...

The truth, the whole truth, and nothing but the truth is that people are a different kind by reason of our soul being created in God's image. We are unique on earth thereby.
We are also unique in being the only KIND to have the same kind of body as a different kind.
All other creatures are within their own kind and specially different. Whatever it was at the beginning.
Yet we can't have our own KIND of body as all biology is off the same rack and yet defines its identity by its kind looks.
So we are renting a body. The primate one.
In order to prove we are made different in our identity.
We can't look like what we really are and so we were given the best body on earth for fun and profit.
It's an equation.

SRM said...

seems like with god all things are possible, except he couldn't figure out a way to make us look the way we really are.

The whole truth said...

Robert, will you tell me where in the bible it says that 'God' created only our "soul" in his image?

Can you provide any scientific evidence to show that humans have souls, and that any or all other life forms don't?

Do conjoined twins have conjoined souls or separate souls? And how do you know?

How many souls does a person with two heads have? And how do you know?

What 'kind' are lichens a member of? How about viruses, and slime molds? And how do you know?

How many 'kinds' are there on the Earth?

Piotr Gąsiorowski said...

Hush! Can't you see Robert is busy inventing a new religion -- a cross between Christianity and Scientology?

Larry Moran said...

SRM says,

I think the term operator gene was common early on particularly in the early studies of the lac operon of E. coli.

I don't think there was ever a time when the majority of molecular biologists used the term "operator gene."

Here's what Jim Watson wrote in the first edition of Molecular Biology of the Gene in 1965.

The functioning of an operon is under the direct control of a specific chromosomal region, the operator. The operator is always located adjacent to the genes whose transcription (functioning?) it controls, so there is a specific operator for each operon.

This was pretty standard in the 1960s. The operator was a sequence of DNA adjacent to the gene. It was not a gene and it was not part of the gene it controlled.

And here's what Benjamin Lewin wrote in the first edition of his book (then called Gene Expression-1) back in 1974.

These characteristics demonstrate that the operator does not code for some product which can diffuse through the cytoplasm to govern the activity of structural genes. It must instead control some integral property of the adjacent zya segment by serving as a recognition element; Jacob and Monod proposed that the interaction of the operator sequence with repressor might prevent transcription of adjacent genes. Whereas both structural and regulatory genes require transcription and translation to fulfil their functions, regulator sites such as the operator function by constituting a nucleotide sequence whose recognition by a protein controls the expression of adjacent genes. All such sites may be defined by their cis dominant, trans recessive control of structural genes. (Lewin's emphasis of "sites" and "genes.")

Tom Mueller said...

Hi John, Hi Piotr

Let’s return to Piotr’s original question:

Tom, what's the advantage of using the term basal outgroup rather than simply sister clade?

To recap: what do I mean by “basal”? I am referring to nodes… and please remember I commenced this line of inquiry with the caveat “advocatus diaboli”

I notice that august publications no less than PNAS can publish articles entitled
Evaluating hypotheses of basal animal phylogeny using complete sequences of large and small subunit rRNA

... which seem to use the term “basal” in conjunction with “clades” along the same lines I do.

I hope I am not confused… that said, perhaps I should in future be more circumspect when bandying technical vocabulary.

Let’s reboot:

Is it possible to construct a phylogenetic tree from molecular data with nested nodes? Yes!

Do some nodes represent earlier branch points than others? Yes!

As a matter of fact, so-called “outgroups” to “root” phylogenetic trees often cannot be presumed but rather need to be deduced. False presumptions about outgroups can really mess up data analysis, if I understand theory correctly. In other words, rooting of trees using outgroups is problematic if our assumptions about the status of outgroups is incorrect.

I hope we agree so far.

I employed the term “basal outgroup” exactly along these lines. Perhaps, I need to be a little more careful how I express myself in class.

I refer you to an activity provided by Professor David Hillis for his textbook Principles of Life

I also have my students finish Hillis’ “Working with Data” worksheet in class: http://tinyurl.com/o6bfp2f

Let’s try this one more time using Vertebrate cladograms.

Humans would be an appropriate “out group” to root Echinoderm Phylogenetic Trees.

Sea Cucumbers would be an appropriate “out group” to root Vertebrate Phylogenetic Trees.

I guess my question would be:

Can jawless fish represent an appropriate “out group” to root a
Vertebrate phylogenetic tree
that included the “remaining” vertebrate clades, Gnathostomata, Osteichthyes, Choanata & Tetrapoda?

BUT on the other hand

Gnathostomata would NOT be an appropriate “out group” to root a phylogenetic tree that included all the “remaining” vertebrate clades as well as the clade that comprised Jawless Fish?!

That was my intent when stating basal outgroup rather than simply sister clade as Piotr suggested.

Furthermore, I humbly suggest that Yes – it is reasonable to speculate which differences between deuterostomes and lophotrochozoans and ecdysozoans are derived and in which taxon…

Please tell me I am not confused, or I will need to seek the nearest brick wall to bounce my head against.

I thank you all for your patience and for your indulgence.

John Harshman said...

An outgroup is an outgroup. If it isn't an outgroup, it isn't an outgroup. "Basal" doesn't enter into it. An outgroup is simply a taxon that isn't a member of the ingroup. And this is all about nodes. There are basal nodes, not basal taxa. One node can be basal to another, but one taxon is not basal to another, unless you want to say something like "mammals are basal to elephants", which sounds weird to me. "Basal animal phylogeny" refers to investigation of relationships around the basal nodes, i.e. the earliest divergences, of metazoans. Jawless fish aren't an outgroup to other vertebrates; the living jawless fish are two successive outgroups. Gnathostomes are not an outgroup to other vertebrates, since there is not one clade of other vertebrates but two living ones that are not sister taxa. An outgroup doesn't have to be the sister clade of the ingroup, but it's usually a good idea.

Imagine an unrooted tree with no distinction between root and branch. You can pick any portion of any branch and declare it to be the root. This will divide the tree into two clades, which we might call the ingroup and the outgroup. And this is exactly what we do in phylogenetic analyses, in most of which transformations are reversible and thus agnostic regarding time: we construct an unrooted tree and then designate part of that tree as an outgroup to root it. If the designated outgroup really is an outgroup, that's good. If it isn't, the tree will contain some incorrect clades, even if the unrooted tree was accurate.
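The rooting step described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular phylogenetics package does it: the unrooted tree is stored as an undirected adjacency map, the internal node labels "x" and "y" and the function names are hypothetical, and the outgroup is a single tip.

```python
# Unrooted tree as an undirected adjacency map. Tip names are taxa;
# "x" and "y" are hypothetical labels for the two internal nodes.
UNROOTED = {
    "Orangutan": ["x"],
    "Gorilla": ["x"],
    "Chimpanzee": ["y"],
    "Human": ["y"],
    "x": ["Orangutan", "Gorilla", "y"],
    "y": ["x", "Chimpanzee", "Human"],
}

def subtree(tree, node, parent):
    """Return the nested-tuple subtree below `node`, as seen from `parent`."""
    children = [n for n in tree[node] if n != parent]
    if not children:  # a tip has no neighbors other than its parent
        return node
    return tuple(subtree(tree, child, node) for child in children)

def root_with_outgroup(tree, outgroup):
    """Place the root on the branch joining a single-tip outgroup to the rest."""
    (neighbor,) = tree[outgroup]  # a tip has exactly one neighbor
    return (outgroup, subtree(tree, neighbor, outgroup))

good = root_with_outgroup(UNROOTED, "Orangutan")
bad = root_with_outgroup(UNROOTED, "Human")

print(good)  # ('Orangutan', ('Gorilla', ('Chimpanzee', 'Human')))
print(bad)   # ('Human', (('Orangutan', 'Gorilla'), 'Chimpanzee'))
```

Both results come from the same unrooted tree. Designating Human as the outgroup produces an Orangutan plus Gorilla grouping that does not appear in the first rooting, which is the kind of incorrect clade a wrongly chosen outgroup introduces even when the unrooted tree is accurate.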

Yes, it is reasonable to speculate which differences among taxa are derived in which taxon. It would however be good to have some kind of reasonable argument to support that speculation. And "deuterostomes are basal" is not such an argument.

Did that help?

W. Benson said...

A reasonable estimate of mammalian gene number by a prominent geneticist in a first-line biological journal was published in the mid-1960s. In 1966 Hermann J. Muller (American Naturalist, 100: 493-517) affirmed
“As for mammals, gene number, as earlier estimated (Muller, 1950), as estimated from more recent data on the frequency of spontaneous mutations at specific loci in relation to the maximum mutational load that could be carried, and from data on the frequencies of X-ray-induced mutations at specific loci as compared with that of total induced lethals, agree on a maximum of not much more than 30,000 [genes].”
If I understand it correctly, Muller’s method detected both regulatory and protein coding genes. If the 10,000 gene “overestimate” is taken as measure of the relative amount of DNA having regulatory function, for each two functional nucleotides in coding genes there should be about one functional unit of regulatory nucleotide. It will be interesting to see how well this 1960s derived estimate holds up.

Tom Mueller said...

Just to set this discussion in historical context. Jacob and Monod in their original paper employed “operator locus” and “operator gene” interchangeably.

https://www.pasteur.fr/ip/resource/filecenter/document/01s-000046-03t/genetic-regulatory.pdf

The 1968 Edition of Levine’s Genetics specifically refers to “operator gene”

http://tinyurl.com/o55pumj

I found 23 hits for “operator gene” in PNAS, the last one being 1973
http://tinyurl.com/mbg3hl8

Searching google scholar, I can find the specific term “operator gene” (albeit with decreasing frequency) in the 1970’s, 1980’s, 1990’s and even beyond.

I can even search the term under the restriction “since 2010”: http://tinyurl.com/l2u8mmp

Finally, the current Wikipedia article on Operon cites “operator gene” no fewer than 3 times.

http://en.wikipedia.org/wiki/Operon

I recognize that Larry is correct when he explains that current consensus favors a more modern vernacular by invoking cis-Regulatory Elements (CREs) as opposed to “genes”

http://en.wikipedia.org/wiki/Cis-regulatory_element

My contention is unambitious – I merely state that it wasn’t always so and consensus on this change in terminology does not yet appear unanimous.

Tom Mueller said...

Hi again John

I think I understand you – it’s just that not everybody seems to agree:
Check out pbs: http://www.pbs.org/wgbh/evolution/change/family/page04.html

For example, how would a scientist figure out evolutionary relationships between sharks, dolphins and wolves? Two fishy-looking animals and a four-legged, furry one.

For this, scientists use something called an "outgroup", a living descendant that still shares many of the ancestor's primitive traits. So we need a living jawless fish, like a lamprey.

How about Wikipedia http://en.wikipedia.org/wiki/Basal_(phylogenetics)

In phylogenetics, basal is the direction of the base (or root) of a rooted phylogenetic tree or cladogram. A basal clade is the earliest clade (of a given taxonomic rank[a]) to branch within a larger clade. More generally, clade A is basal to clade B if B is a subset of the sister group of A.

[a] Without this qualification there must, of course, be two or more equally basal clades branching off from the root of every cladogram.


The wikipedia article goes on to say that : … orangutans are a sister group to Homininae and are the basal genus in the family as a whole.

To me this makes intuitive sense.

Therefore, one of the Jawless Fish taxa could still serve as an outgroup to the “remaining” vertebrate clades (representative of subsequent nodes that occurred later on): Gnathostomata, Osteichthyes, Choanata & Tetrapoda? Why not?

Am I correct, that not everybody is in agreement with you
Or
… am I missing something.

best regards

John Harshman said...

I believe you are probably missing something, but it isn't clear just what. Apparently PBS doesn't agree with me. I am not concerned. PBS isn't a systematist. I am. An outgroup is in no way defined by the particular traits it possesses, whether primitive or derived. It's defined by not being part of the ingroup. Period. Now, we could discuss what would make the best outgroup for a particular analysis, but that's another question, best left for later.

Wikipedia seems to agree with the common definition of "basal" clade, the one I told you about, which as you know I don't like. Intuitive sense is all very nice, but please don't confuse it with reality. Ranks are arbitrary and should not be considered real. And would you really say that actinopterygians are basal to giraffes?

Yes, any clade that isn't part of the ingroup can serve as an outgroup. That's what I said. There could also be multiple outgroups.

Anonymous said...

It seems that you've advocated using "species" for "kinds" that are defined some ways, but using a different word for units of variation (species, I'd call them) that are defined a different way. I just can't see it. Within one modern plant genus, you can find one species that's an obligate outcrosser with a wide, mostly connected range so that the BSC applies. And you can find another species that's entirely clonal (through apomixis or bulbil production) and another that's entirely selfing. Many species are somewhere between selfing and outcrossing, or clonal and outcrossing, sometimes depending on the weather. Each of these breeding systems produces a different pattern of variation and sometimes we have to apply different definitions of species to decide which units to name at the species level. And, of course, sometimes we don't know anything at all about the population genetics of the group.

I don't think it's helpful for most users of taxonomic information to say that Festuca idahoensis (outcrossing) is a species but F. brachyphylla (mainly selfing) and F. prolifera (pseudoviviparous) aren't species but rather . . . fill in the blank with a new word. Which would seem to be an implication of your point.

Larry Moran said...

My contention is unambitious – I merely state that it wasn’t always so and consensus on this change in terminology does not yet appear unanimous

My contention is also unambiguous. The consensus since the 1960s is that operators are DNA binding sites and not genes.

You can always find non-experts who misuse a term and/or don't understand all the ramifications of their definitions. When you're trying to teach students about a subject it's always a good idea to focus on what the consensus of the experts is saying and not what the outliers are saying.

SRM said...

I don't think there was ever a time when the majority of molecular biologists used the term "operator gene."

Thanks. I saw its use several times in older papers, but certainly have no first hand experience of how commonly some terms were used in the 60s and 70s. I knew it would be decidedly uncommon to non-existent these days, in any case.

Unknown said...

Yup, I'm advocating restricting species to the BSC and the ISC. I'm advocating using "type" as a designator for other kinds of sets of organisms. The reason to choose type here is that it has a history of usage as long as species and in fact is used in the taxonomic literature: A holotype is named and diagnostic criteria are given.

By this line of reasoning, we name types. We then erect the hypothesis that the type we have defined and named is a species. The latter part is testable.

This description defines a set of organisms, regardless of gene flow. Species in the BSC and the ISC is not a taxonomic rank, it's a systematic unit. It's a similar difference to that between classical higher ranks like genera, and clades in phylogenetic systematics. Whether a clade is in fact a clade is a testable hypothesis, whether a family is a family or should be regarded as a genus instead is not.

Using maximum possible rates of gene flow means that in some cases individual organisms might be species. If you have exclusively clonal organisms and no mechanism by which a substantial amount of HGT occurs, that's what you end up with. I have absolutely no problem with that - a single-organismed species is no weirder than a single-celled organism. In that case the defined type is not a species, but a clade.

Unknown said...

Robert, even if we accept your proposition that humans have a "soul", how do you know that my cat, my clown fish, my cleaner shrimp and my hermit crab don't have souls? What evidence, other than an often transcribed, translated, mis-transcribed and mis-translated book do you have?

And, really, what is a soul?

John Harshman said...

Simon,

I'm not sure what speciation (or extinction) rates have to do with time of divergence prior to speciation (=BSC speciation), so I'm not sure why you brought those up. I know of some birds that are considered single species with mt divergence between populations of >6%; if I recall, there are some up to 9%.

Unknown said...

Maybe we are talking past each other. If we are talking about the durations for which species persist, I agree with you completely. One of the systematic effects of molecular clocks is that they date divergence (so the dates denote the time the common ancestor split into phylospecies), rather than the split into internodal species, defined through speciation in the BSC sense. There is a time lag between the two (and usually there's also a time lag between divergence and the evolution of apomorphic traits, which allow morphotypes to be used as proxies. And there's usually a further lag between that and the first appearance in the fossil record).
My argument here is basically that the difference between divergence dates and BSC speciation dates tends to be smaller than that between divergence dates and first appearance dates.

I claimed it was rare that divergence dates precede BSC speciations by more than 5Myr. That's a claim you can test using speciation rates, because these imply some bounds. If divergence occurred directly after speciation, then the time to speciation would follow an exponential distribution with the speciation rate as a parameter. This optimistic model leads to low probabilities for divergence times preceding speciation by more than 5Myr.
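
The exponential model mentioned above can be sketched in a few lines of Python. The speciation rates below are hypothetical placeholders, not estimates for any real group, and the helper name `prob_wait_exceeds` is made up for illustration. Under an exponential waiting time with rate r, the probability of waiting longer than t is exp(-r * t).

```python
import math


def prob_wait_exceeds(rate_per_myr, t_myr):
    """P(T > t) for an exponential waiting time with the given rate."""
    return math.exp(-rate_per_myr * t_myr)


# Hypothetical per-lineage speciation rates (events per Myr), for illustration only.
for rate in (0.2, 0.5, 1.0):
    p = prob_wait_exceeds(rate, 5.0)
    print(f"rate = {rate} per Myr: P(wait > 5 Myr) = {p:.4f}")
```

How low the probability comes out depends strongly on the rate assumed, so the argument is only as good as the speciation rate estimate that goes into it.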

SRM said...

And, really, what is a soul?

I don't know but I would be happy to sell mine for the right price. It's a little like iron pyrite. Not worth a dime to me, but if someone insists it is gold, I will be happy to charge the price of gold.

Tom Mueller said...

Larry – you misunderstood me entirely

Read my original post – I said "My contention is unambitious"

... I never said "unambiguous"!

What I tried to say was that there seems to persist a great number of “outliers” even today... but no big deal.

Re: When you're trying to teach students about a subject it's always a good idea to focus on what the consensus of the experts is saying and not what the outliers are saying.

I agree! That was my intent all along! As a matter of fact, that is exactly how I corrected myself in class just the other day.

And for the record – participation on this site has made me a far stronger teacher and I remain in your debt.

Thank you!

John Harshman said...

I don't think speciation rates actually imply bounds. And haven't you just said that divergence begins some unknown time prior to speciation? (I agree, given an allopatric model.) I truly do not understand why in any case speciation rate would affect time to speciation. Perhaps you can explain. Of course, you can't study speciation rate using a tree, only divergence-that-eventually-becomes-permanent rate, which we have agreed is not the same thing.

I also think there are a great many cryptic species in the fossil record, which can cause huge errors when interpreting morphotypes as species. Geographic and temporal sampling are further issues.

Tom Mueller said...

Hi John

I love your metaphor: ...an unrooted tree with no distinction between root and branch

I strongly think that we may be actually saying the same thing but in slightly different words.

I use three different textbooks in my classroom. One of them is Principles of Life by David Hillis et al

The second edition has just been published and the relevant Chapter 16 is open source.

http://tinyurl.com/qye7ekg


Check out page 318 where David Hillis describes a vertebrate cladogram

The earliest branch in the tree represents the common ancestor of the outgroup (lamprey) and the ingroup (the remaining species of vertebrates)… Each clade in the tree is supported by at least one shared derived trait, or synapomorphy… Derived traits are indicated along lineages in which they evolved.

I am particularly keen about the interactive tutorial reproducing Hillis’s landmark experiment in experimental phylogenetics

http://www.pol2e.com/at16.01.html

I also have my students do a modified version of Hillis’s assignment:

http://tinyurl.com/o6bfp2f

My students are able to reconstruct a phylogenetic tree revealing the probable relationships and degrees of relationships between groups of lineages, along with the relative times when different lineages branched off from a common ancestor. Better yet, when I provide some different versions of these data sets, my students are able to deduce the identity of the outgroup, while determining the commonalities of the lineages.

It gets better – I tell my students that scientists are able to reconstruct many extinct gene sequences from naturally occurring organisms (such as the visual pigment protein genes from extinct archosaurs) by using similar techniques. Mind-boggling!

I understand you and Piotr regarding "sister" clade vs. "basal" clade.

Note that I repeated above what I said on an earlier thread : Compared to Lampreys - Mammals may have acquired more bells and whistles (and often subsequently lost) that we (as mammals) find particularly interesting. Such thinking betrays a mammal-centric bias on our part. Modern Lampreys have also acquired (and often subsequently lost) many bells and whistles compared to their ancestors that from a Lamprey’s POV are even more interesting.

By that I meant that lampreys are no more “primitive” than humans.

That said – I take your point and need to modify what I say in class.

What is your reaction to the following:

Not only can humans be employed as an outgroup to root Echinodermata, but humans could also be used as an outgroup to root Jawless fish which are no less derived and no less speciose than the other side of the swivel point representing the common ancestor to jawless fishes and those other vertebrate clades we just examined.

The molecular clock’s effect on the jawless fish lineages would no longer represent the autapomorphies of an outgroup but become instead the synapomorphies of the speciose jawless fishes clade(s) we are now studying.


I hope the above meets with your approval.

John Harshman said...

Jawless fish are not a clade. Extant jawless fish are two successive outgroups to gnathostomes. (Extinct ones just make it worse.) So you can't use a gnathostome to root the tree of jawless fish. You have to use a non-vertebrate.

Jawless fish are much, much less speciose than gnathostomes. It isn't clear to me that you know what "speciose" means. It just means "having a certain (generally large) number of species". "More speciose" means "having more species".

I am not sure what your thought experiment is intended to do. Echinoderms would "stand out" as an outgroup to chordates if there were an outgroup to deuterostomes to enable us to root the deuterostome portion of the tree. Outgroups don't declare themselves; you have to decide a priori what they are. Unless, of course, you assume a molecular clock, in which case the root is the midpoint of the line connecting the two most distant taxa.

I do not understand your question beginning with "What is wrong...".

Robert Byers said...

The soul is just who we are. It's an image reflection of God. The Bible says so and that old Motown song "I'mmmm a soullll mannn...".

I don't know if creatures have souls but certainly not in God's image.
Only we do and so only we can't be represented in the common equation of biology. So we must rent another critter's body. The best one!
That's the reason we look like apes. Because we can't look like our true selves in such a simple blueprint of nature. The others are identified by their looks or their original looks. Their kind's looks.

Tom Mueller said...

OK - it is almost 2 AM and I am exhausted... I just posted and deleted an incoherent post.

That said - I finally see where John is coming from. I had to struggle a little more with his version of the tree metaphor.

John & Piotr - thank you!

One last question - is it possible to address the following question:

The last common ancestor to Deuterostomes, Lophotrochozoans and Ecdysozoans: did it have a dorsal nerve cord or a ventral nerve cord?

I just remember PZ Myers addressed this question

http://scienceblogs.com/pharyngula/2006/01/12/hemichordate-evodevo/

Answer: If Hemichordates provide any indication, either that ancestor had an ambiguous nerve net and dorsal ventral axes evolved independently in two lineages or it had some version of the protostome ventral version of events.

Tom Mueller said...

Hi John - I had hoped you had missed my second post aka the thought experiment. I just deleted it. Please ignore it. It represented a "brain fart".

Regarding my first post... I thought the node that gave rise to jawless fishes preceded the node that gave rise to gnathostomes,

So to my understanding, yes jawless fishes can root subsequent nodes as described by Hillis above

http://www.geol.umd.edu/~jmerck/tassite/eltsysex/sysq6.gif

Tom Mueller said...

Hi John

to recap– above you responded to my earlier suggestion that jawless fish can be used to root some other vertebrates:

Modern species can't represent the last common ancestor of a group, and jawless fish aren't a clade.

I returned with

Can jawless fish represent an appropriate “out group” to root a Vertebrate phylogenetic tree that included the “remaining” vertebrate clades, Gnathostomata, Osteichthyes, Choanata & Tetrapoda?

BUT on the other hand

Gnathostomata would NOT be an appropriate “out group” to root a phylogenetic tree that included all the “remaining” vertebrate clades as well as the clade that comprised Jawless Fish?!

That was my intent when stating basal outgroup rather than simply sister clade as Piotr suggested.


You then repeated

Jawless fish aren't an outgroup to other vertebrates

OK - jawless fish are not ONE Clade. I imagine Jawless Fishes to be like Protostomes which are a combination of two complex clades.

but Hillis does just what you deny!

He uses lampreys as an outgroup – not to all vertebrates – but to Gnathostomata, Osteichthyes, Choanata & Tetrapoda as I just detailed.

Unknown said...

Per lineage speciation rates allow you to give a probability distribution for the time between two speciations. Since divergence happens sometime between the two, the time between divergence and speciation is equal to or less than the time between two speciations.

Again, I agree that divergence precedes speciation by some time most of the time.

I don't think you can generalize that claim about cryptic species - depending on the organismal group you've got a lot of singleton morphotypes for instance.

Temporal sampling might actually be better than thought, with a sizable effect of morphological degradation. I.e. we might have quite a few representatives of a clade preceding the first representative where the apomorphies are preserved. I've got some preliminary results from hexapods that suggest this and I'm planning to repeat that for a couple of other clades to see if that's something general (or possibly if the hexapod data might be a statistical fluke).

John Harshman said...

"Jawless fishes" and "lampreys" are not synonyms. You can use one clade of jawless fishes to root gnathostomes. But you can't use jawless fishes as a group, because they aren't a group. Lampreys and hagfish are successive outgroups. Protostomes, on the other hand, may indeed be a clade. "Basal outgroup" is redundant. If it isn't "basal" (whatever that means), it isn't an outgroup.

John Harshman said...

I think you have that backwards. Divergence is what you see in the branches of a cladogram. You don't see speciation. And it isn't necessary for speciation to occur between divergences (again, given an allopatric model). You could have a ton of divergence without any speciation at all, including multiple lineage splits. You just don't have the data for that.

In what groups are there no cryptic species?

I'm not sure what your point is about temporal sampling. Surely the sampling for hexapods is piss-poor, being highly dependent upon a very few lagerstätten.

Unknown said...

I think you misunderstood me there. My claim is that you have divergence between speciations, which I think should be uncontroversial. So the time interval between speciations constrains the time interval between divergence and speciation.

Cryptic species are cases where a single morphotype contains 2 or more species. Hence if a morphotype is only represented by a single individual in the fossil record, there's no chance of there being several cryptic species in that morphotype.

On the temporal resolution: Methods that assess temporal sampling of particular taxa to derive confidence intervals for first appearance dates (like Strauss and Sadler) use the temporal spacing between documented fossils. If you have a fossil at times A, B and C, the relevant time intervals are A-B and B-C. If B is not well enough preserved to be placed within that taxon, the interval you are left with is A-C, which is larger than either A-B or B-C. This leads to a larger CI for the FAD.

So there are two alternative reasons for this:
1) There is no B
2) There is a B, but it can't be reliably placed within the same taxon as A and C.
These cases can be distinguished and for hexapods 2 seems like a better fit.
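
A rough numerical illustration of why losing B widens the confidence interval: the sketch below uses the one-sided range-extension formula commonly attributed to Strauss and Sadler, which assumes fossil horizons are scattered uniformly at random through the true range. That assumption, the 95% confidence level, and the function name `range_extension_fraction` are illustrative choices rather than anything specified in this thread.

```python
def range_extension_fraction(n_horizons, confidence=0.95):
    """One-sided range extension, as a fraction of the observed range.

    Uses alpha = (1 - C) ** (-1 / (H - 1)) - 1, where H is the number of
    fossil horizons and C the confidence level. Assumes horizons are
    distributed uniformly at random within the true range.
    """
    if n_horizons < 2:
        raise ValueError("need at least two fossil horizons")
    return (1.0 - confidence) ** (-1.0 / (n_horizons - 1)) - 1.0


with_b = range_extension_fraction(3)     # A, B and C all assigned to the taxon
without_b = range_extension_fraction(2)  # B cannot be placed, so only A and C count

print(f"extension with B:    {with_b:.2f} x observed range")
print(f"extension without B: {without_b:.2f} x observed range")
```

The observed range A to C is the same in both cases, but dropping B leaves fewer horizons inside it, so the interval on the first appearance date gets much wider.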

John Harshman said...

But how do you determine the time interval between speciations? Speciation doesn't show up on a cladogram. The nodes on cladograms aren't speciation events; they're divergence events. The times between nodes aren't times between speciations; they're times between divergences. You really can't say when, on a cladogram, the evolution of reproductive isolation happens. It could in fact happen after multiple nodes (divergences) have happened.

If a morphotype is represented by a single individual, you have very bad taxonomic sampling and can't really say much about speciation.

I still don't understand how your point about temporal resolution is relevant to what we're talking about.

Tom Mueller said...

Hi John,

Let me begin by thanking you for your patience and indulgence. I remain in your debt. If ever you are in New Brunswick, let me know. It would be an honor to meet you in person and treat you to a seafood feast on my dime. I have benefited greatly and so have my students.

I did some google-whacking. It turns out that jawless fish may be monophyletic for all the wrong reasons.

Heimberg et al. (11) conclude that the latest common ancestor of all vertebrates may have been phenotypically more complex than living cyclostomes. Interestingly, this echoes the opinion expressed long ago by some paleontologists (13), who supported the theory that lampreys and hagfishes were derived from heavily armored and ossified Paleozoic jawless fishes referred to as ostracoderms…


http://www.pnas.org/content/107/45/19137.full.pdf+html

It appears that Lampreys and hagfish are not successive outgroups and apparently not even a single outgroup for the phylogenetic tree we are discussing. But at least the modern version of cyclostomes (that survive today) do appear to be a single "group".

Unknown said...

I'm not sure where the disagreement is. I agree that the nodes on cladograms are not speciation events. I agree that the dates recovered for nodes track divergence times. How do I determine time between speciations? I track extinction rates, which for extant groups provide a lower bound for speciation rates. I then use speciation rates to bound time to speciation. I wouldn't derive them from a cladogram with crown group leaves.

I do not disagree that morphotypes represented by single specimens are badly resolved. I do note that they lead to a reduced risk of incorrectly assigning several cryptic species to one species. I did not claim that these cases are relevant for our understanding of speciation, but they are relevant for assessing how big the impact of cryptic species on the fossil record may be.

You introduced temporal resolution to this discussion, by stating that "Geographic and temporal sampling are further issues." Since that's something I'm interested in, I commented on this. It's not relevant to the definition of species.

John Harshman said...

I'll take you up on the seafood dinner if I'm ever in NB, which however seems unlikely. But I think you're still confused. The shapes of trees and the characteristics of ancestors are quite different issues. Whether the ancestor of all vertebrates physically resembled a hagfish, lamprey, or bony fish more than the others is irrelevant to the question of relationships or to what's an out group. What's important about the paper you cite isn't the characters but the tree topology inferred. If we accept the tree in that paper, then hagfish and lamprey are a single outgroup. I don't know if it's true, but perhaps it is.

If the tree is (hagfish,(lamprey, gnathostomes)), then the hagfish and lamprey are successive outgroups to gnathostomes. If the tree is ((hagfish, lamprey), gnathostomes), then the hagfish and lamprey are a single outgroup to gnathostomes. Whether there are other, extinct outgroups is irrelevant.
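
The two topologies can be written as nested tuples and compared mechanically. The following is only a sketch. The helper names (`leaves`, `clades`, `outgroup_pattern`) are made up for illustration, and the check is hard-wired to these three taxa.

```python
def leaves(tree):
    """Set of leaf names below a node in a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def clades(tree):
    """Every clade in a rooted tree, each as a frozenset of leaf names."""
    if isinstance(tree, str):
        return {frozenset([tree])}
    found = {frozenset(leaves(tree))}
    for child in tree:
        found |= clades(child)
    return found


def outgroup_pattern(tree):
    """Classify hagfish and lamprey relative to gnathostomes."""
    if frozenset({"hagfish", "lamprey"}) in clades(tree):
        return "single outgroup"
    return "successive outgroups"


successive = ("hagfish", ("lamprey", "gnathostomes"))
single = (("hagfish", "lamprey"), "gnathostomes")

print(outgroup_pattern(successive))
print(outgroup_pattern(single))
```

Only the topology enters the test. No character states, ancestral or derived, are consulted anywhere, which is the point being made above.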

John Harshman said...

I still don't understand how your point about temporal resolution is relevant to what I was talking about when I brought up temporal sampling; I do in fact think that temporal sampling is relevant to the determination and definition of species. I don't understand why speciation rates bound time to speciation, unless you mean that time to speciation can't be greater than the time from common ancestor to the present for extant species. But your statement about "crown group leaves" seems to rule that out. I will agree that the effect of cryptic species on the fossil record of very poorly sampled taxa, specifically, is minimal, but I don't see why we should use very poorly sampled taxa to assess the impact of cryptic species on the fossil record.

Unknown said...

"I do in fact think that temporal sampling is relevant to the determination and definition of species."

How? Neither the ISC, nor the PSC make reference to temporal sampling.

My statement about crown group leaves references some work I've seen that tries to deduce speciation rates directly from a dated cladogram. Such estimates are of course affected by taxon sampling and the fact that they do not include extinct lineages.

Tom Mueller said...

Hi John – I think I am finally wrapping my head around what you are saying.

I had an aha moment when I read:

An outgroup is an outgroup. If it isn't an outgroup, it isn't an outgroup. "Basal" doesn't enter into it. An outgroup is simply a taxon that isn't a member of the ingroup. And this is all about nodes. There are basal nodes, not basal taxa. One node can be basal to another, but one taxon is not basal to another, unless you want to say something like "mammals are basal to elephants", which sounds weird to me.

I had a second aha moment when I read:

Imagine an unrooted tree with no distinction between root and branch. You can pick any portion of any branch and declare it to be the root. This will divide the tree into two clades, which we might call the ingroup and the outgroup. And this is exactly what we do in phylogenetic analyses, in most of which transformations are reversible and thus agnostic regarding time: we construct an unrooted tree and then designate part of that tree as an outgroup to root it. If the designated outgroup really is an outgroup, that's good. If it isn't, the tree will contain some incorrect clades, even if the unrooted tree was accurate.

That helped a lot!

I draw your attention to this diagram:

http://ebooks.bfwpub.com/hillis1e/figures/16_3.gif

I guess part of my problem was your contention that lampreys cannot root vertebrate trees when clearly this is being done all the time. I am intrigued that the “jawless” status of lampreys may represent an atavism invalidating their status as an outgroup to jawed fish.

Tom Mueller said...

On the subject of the status of the last common ancestor to Deuterostomes, Lophotrochozoa and Ecdysozoa.

At the risk of incurring Larry’s wrath, I suspect we need to invoke evodevo. Some common ancestor possessed the common molecular tool-kits later exploited by all three lineages: Lophotrochozoans, Ecdysozoans and Deuterostomes.

If Volker Schmidt is to be believed, that common ancestor was probably coelomate and all three lineages experienced occasional atavistic reversions (perhaps) to an even more ancestral acoelomate or pseudo-coelomate archetype.

http://scienceblogs.com/pharyngula/2006/07/18/diploblasts-and-triploblasts/

Cnidarians have mostly "radial cleavage" (whatever that is now supposed to mean). Possibly Cnidarians are not an outgroup to Deuterostomes and Protostomes? We know that all cnidarians have a bilateral directive axis during embryo development, and some cnidarians appear to have had a triploblastic bilateral ancestor. Check out

http://scienceblogs.com/pharyngula/2006/05/08/hox-genesis/

What about paraphyletic Porifera. Some Porifera have bilateral larva.

That begs the question of Sponges’ status as an “outgroup” to Eumetazoa? What about this controversial paper? http://www.aaas.org/news/science-jelly-not-sponge-base-animal-family-tree

Again, I am eager to hear what you think of these lines of speculation.

You mentioned a while back that you would be interested in any data suggesting that Choanoflagellates were secondarily unicellular. I refer you to

Garcia-Fernàndez raises the possibility that choanoflagellates are also degenerate sponges that have abandoned multicellularity and secondarily lost their ANTP genes.

http://scienceblogs.com/pharyngula/2006/05/08/hox-genesis/


I am very intrigued by choanoflagellates’ signaling and adhesion proteins and again wonder out loud how much of this Choanoflagellate story is secondarily unicellular.

http://scienceblogs.com/pharyngula/2008/03/03/the-choanoflagellate-genome-an/

Back to the subject of evodevo’s molecular toolkit, I am intrigued by some peculiar observations such as Annelids' mouth formation which all exhibit homology (at the molecular level) yet some annelids are apparent protostomes, some apparent deuterostomes and some apparent amphistomes if embryonic morphology is considered as the sole criterion.

I humbly submit there is a story here that needs to be knit together. I believe it possible to address the question of the status of the last common ancestor to Deuterostomia, Ecdysozoa and Lophotrochozoa. I suggest part of the story would invoke a possible transitional status of Brachiopods between Lophotrochozoa and Deuterostomes.

That’s my dream in any case.

Again thank you for your patience and your indulgence.

John Harshman said...

Once again, you must understand that a character state, whether an atavism or not, does not invalidate a lamprey as an outgroup to "jawed fish", if by that you mean gnathostomes. Gnathostomes are a clade and lampreys are outside that clade, no matter what character states may have existed in the common ancestor.

Now, the reason lampreys can't be used to root the tree of vertebrates is that they aren't outside vertebrates; nor can they be used to root the tree of all other vertebrates because they aren't outside other vertebrates: hagfish are either their sister group or the sister group of lampreys plus gnathostomes. What lampreys could be used to do is to root the tree of gnathostomes. And I'm pretty sure that nobody has ever done anything else. Hillis is for some reason ignoring the existence of hagfish; don't know why.

Now, if there were extinct fish with jaws that weren't members of gnathostomes (almost certainly true) and if some of those fish were equally or more closely related to lampreys than to gnathostomes, then lampreys would be unable to root the tree of fish with jaws, though they would still be able to root the tree of gnathostomes. Note: Gnathostomata is a clade with a particular definition, which would be something like the common ancestor of sharks, placoderms, and osteichthyans and all its descendants. That's a crown group, a clade defined with relation to extant taxa, and just about any crown group will exclude some taxa possessing what you think of as the defining characters of the group. These days, taxa don't have defining characters, only phylogenetic definitions.

John Harshman said...

I don't see why evo-devo would incur Larry's wrath. He only complains about the silly idea that evo-devo is an alternative to the usual theories of evolution. It's certainly true that all metazoans share many developmental genes, but I don't see that the implications are particularly weird. It's true that the three main eumetazoan groups share even more genes, but again I can't see any huge inferences there except that those genes were present in the common ancestor, which may imply something about that ancestor's anatomy.

It does seem likely that the common ancestor of deuterostomes, lophotrochozoans, and ecdysozoans had the major features shared by all three groups, including a coelom. Again, nothing weird here.

http://scienceblogs.com/pharyngula/2006/07/18/diploblasts-and-triploblasts/

It's certainly possible that mesoderm and triploblasty arose once, at an earlier part of the tree than generally believed. But that has nothing, once more, to do with the shape of the tree. You're going to have to learn this: you are fairly consistently confusing relationships with character evolution. If cnidarians have mesoderm, that doesn't mean they're eumetazoans, only that mesoderm is not a character limited to eumetazoans.


What about paraphyletic Porifera? Some Porifera have bilateral larvae.

What about them?

What about this controversial paper? http://www.aaas.org/news/science-jelly-not-sponge-base-animal-family-tree

It's interesting, but I'm going to wait for independent confirmation.

You mentioned a while back that you would be interested in any data suggesting that Choanoflagellates were secondarily unicellular. I refer you to

http://scienceblogs.com/pharyngula/2006/05/08/hox-genesis/


Thanks, but that is just unsupported speculation. No evidence is presented. And again remember that character states are not phylogeny. Choanoflagellates could be the sister group of metazoans even if they had once been multicellular. To determine that, we need an outgroup to choanoflagellates plus metazoans, i.e. some other opisthokont. As far as I know, whenever this has been done, choanoflagellates have always been in the same position, the traditional one.

http://scienceblogs.com/pharyngula/2008/03/03/the-choanoflagellate-genome-an/

Note the tree that appears in that page, which does exactly what I suggest above and finds the traditional position for choanoflagellates.

I believe it possible to address the question of the status of the last common ancestor of Deuterostomia, Ecdysozoa and Lophotrochozoa.

Yes, of course it is. You do that using comparative biology: discovering the states of various characters in extant species and mapping them onto phylogenetic trees. You do not make a priori assumptions about what is primitive and derived or about what has been lost and gained.
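As a rough illustration of what "mapping characters onto trees" means in practice, here is a minimal sketch of the bottom-up pass of Fitch parsimony in Python. It is only one of several ways to reconstruct ancestral states, and the tree, taxon names, and character states below are invented for the example.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a binary tree.

    tree   : a tip name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping each tip name to its observed character state
    Returns (set of most-parsimonious states at this node, minimum changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, states)
    right_states, right_changes = fitch(right, states)
    shared = left_states & right_states
    if shared:
        # The two children agree on at least one state: no change needed here.
        return shared, left_changes + right_changes
    # The children disagree: keep both options and count one change.
    return left_states | right_states, left_changes + right_changes + 1


# Invented example: three ingroup taxa and a single outgroup.
ingroup = (("taxon_A", "taxon_B"), "taxon_C")
tree = ("outgroup", ingroup)
states = {
    "taxon_A": "present",
    "taxon_B": "present",
    "taxon_C": "present",
    "outgroup": "absent",
}

print(fitch(ingroup, states))  # state set at the ingroup ancestor
print(fitch(tree, states))     # state set at the root of the whole tree
```

With these made-up states the ancestor of the ingroup is reconstructed as having the character, while the root remains ambiguous with only one outgroup. The direction of change comes out of the tree and the observed states; it is not fed in as an assumption.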

John Harshman said...

Without a decent sample it's impossible even to delimit morphotypes, which are the only guide, however poor, that fossils offer to any species limits. Perhaps it isn't temporal sampling that's needed, just a fair number of individuals. Note that with sampling the many species of Triceratops have been merged into T. horridus as has the supposed separate genus Torosaurus. Better temporal sampling is one way to increase the sample; geographic sampling is another; and increased sampling within a locality is yet another. They all have their limits and biases. It isn't the species concept that this is all relevant to; it's species determination.

Mind you, just about any species concept runs into trouble if you try to extend it very far in time or space.

Tom Mueller said...

Hi John

Yes - I noticed the tree as well. I merely suggested it was open to reinterpretation and revision.

Your point on a priori assumptions is well taken.

Tom Mueller said...

Hi again John

Just when I thought I understood you – I flounder!

Here is the part that frustrated me.

Heimberg et al. (11) conclude that the latest common ancestor of all vertebrates may have been phenotypically more complex than living cyclostomes. Interestingly, this echoes the opinion expressed long ago by some paleontologists (13), who supported the theory that lampreys and hagfishes were derived [emphasis mine] from heavily armored and ossified Paleozoic jawless fishes referred to as ostracoderms,

“…These authors even alluded to the possibility that all jawless vertebrates, fossil and recent, could have been derived, through many character losses, from a common ancestor that was morphologically more similar to a jawed than a jawless vertebrate (13)…”


Here is the diagram http://www.pnas.org/content/107/45/19137/F1.large.jpg

According to my misreading of the first paragraph above, the node that leads to ostracoderms should be relatively proximal to the node representing the last common ancestor, as compared to the node that leads to so-called cyclosomes (i.e., hagfish & lampreys), which should be more distal. The node for cyclosome ancestry should be part of an ingroup to ostracoderms and be relatively more distal from the node representing the last common ancestor. At least that is how I read the paragraph.

Rereading the first paragraph, I pause on the word “echoes”… I think I see the problem. I interpreted that paragraph to suggest the emergence of ancient ostracoderms preceded ancestral cyclosomes. Clearly what was meant was some ostracoderm-like ancestor eventually gave rise to ostracoderms and also gave rise to cyclosomes.

Anonymous said...

Apparently you want to use "type" to refer to both the type specimen on which a scientific name is based and the whole group of organisms (species) that that type specimen supposedly refers to, in cases where the BSC doesn't apply. Therefore, you've got a bigger problem with confused words than you would if you used species the way most people do, which involves different species concepts depending on the biology of the particular species in question. Not an improvement.

John Harshman said...

I don't especially like the terms "proximal" and "distal". How about "younger" and "older" as a substitute? Of course that tree makes no comment about the appearance of the ancestors at any nodes.

Piotr Gąsiorowski said...

Tom,

Sorry for being a pedant again, but just in case lampreys and hagfish should be grouped together, they are cyclostomes, not -somes (they have round mouths, not bodies).

Unknown said...

@John: "It isn't the species concept that this is all relevant to; it's species determination."

We are on the same page here. When we assign fossils to species, we do make errors and all the effects you listed are error sources - I'd add that the morphological incompleteness of specimens is a further source of error and depending on the particular question, temporal constraints on the site are another one.

I would argue that the ISC doesn't run into trouble if you extend it in time. Species determination becomes an issue, but neither the BSC nor the ISC is about making operational units. They are theoretical constructs that are more or less matched by operational units.

@Barbara: Traditional taxonomic ranks are also types and they are (in the best case) clades. A type is any definition of a set of biological phenomena by their properties. Hence we can look at ecological functions and define ecotypes. We can group organisms based on their similarity of morphology into morphotypes. Some work has been done on leaf damage types (most recently http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0094950).

And yes, I'd like species to be used consistently. Something not everybody wants to do.

Tom Mueller said...

@ John

I don't especially like the terms "proximal" and "distal". How about "younger" and "older" as a substitute? Of course that tree makes no comment about the appearance of the ancestors at any nodes.

I hear you John. Back to:

… this echoes the opinion expressed long ago by some paleontologists (13), who supported the theory that lampreys and hagfishes were derived from [emphasis mine] heavily armored and ossified Paleozoic jawless fishes referred to as ostracoderms [sic]

Do you understand my earlier confusion? I initially read the above to suggest that the older node, i.e., “the speciation event” (for lack of a better term considering the thread above) that gave rise to ostracoderms, should have preceded the younger node or “speciation event” that gave rise to lampreys and hagfishes, both now being an ingroup to the ostracoderm clade… If that was not the author’s intent, I think he did a very poor job of communication.

Do you agree that the author could have been clearer in writing this opinion piece? I note it was poorly proof-read.

Best regards

Tom Mueller said...

Hi Piotr

“Pedant” is good. Thank you!

Likewise I gladly offer you the same invitation above. If ever you are in New Brunswick, Canada: let me know and I will treat you to a seafood feast.

My Biology degree is more than a quarter of a century stale! I think the committees who decide curriculum and dictate the content of textbooks are even worse off.

Check out this phylogenetic tree that comes from the Biology textbook currently used in our province.

Some teachers actually teach this as correct!!!! (aside to Larry – this should explain much: students in your class have much to unlearn, a more difficult task than entering your class as a tabula rasa)

Advanced courses in many schools still teach that pseudocoelomates are monophyletic, so too acoelomates; and both are ancestral to coelomates.

I actually put this diagram on my students’ final exam and ask them to identify all the glaring errors.

I don’t want to appear immodest; I am attempting to explain my debt of gratitude to everyone for helping bring me up to speed and do a better job by my students.

Best & grateful regards

John Harshman said...

The problem with the quote, I believe, is that the paper is using "ostracoderm" in two incompatible ways: on the tree as a clade, and in the text as a paraphyletic group characterized by possession of a suite of characters. Or perhaps the problem lies only with "echoes", which you don't seem to have noted. That is, Janvier isn't saying the tree fits the theory, only that there are certain similarities, notably the idea that modern cyclostomes have lost armor and other stuff. This, by the way, is unsurprising. An alternate notion, also unsurprising, would be that the fossil record consists mostly of those groups of jawless fish that adopted heavy armor and that a much greater, unknown fauna of unarmored fish was also around at the same time. Based on the tree, we can come up with hypotheses about character states at the ancestral node, but they are only as good as the assumptions that go into them. Question assumptions.

Tom Mueller said...

@ W. Benson

Thank you for redirecting my attention to Hermann Muller's work. I remember studying this long long ago and even doing some undergrad experiments along these lines.

I remember some classical work estimating the minimum number of mutational hits required to generate the cancer phenotype, which was uncannily prescient as I remember it. Around 5 if I remember correctly...

You just conjured memories - I now recall being taught how Benzer could generate mutants and was the first to identify mutants at the resolution of a single nucleotide.

I guess one issue that still remains unresolved in my head is how a single stretch of DNA represents more than one gene. In other words, one locus is not equivalent to one gene.

One open reading frame can be differentially spliced to generate different proteins. I guess I am asking how ubiquitous this phenomenon is. Another way of asking the same question: by what factor do functional mRNAs outnumber the genes in the genome?

And exactly how much of the transcriptome is indeed functional, on the understanding that ENCODE is over the top?