I recently stumbled on a letter he published in Science back in 1969 (Margoliash, 1969). It's about how you define "homology." This is one of my pet peeves. I've been trying to teach people for years that homology refers to the fact that two genes share a common ancestor. It's a conclusion based on evidence such as sequence similarity. For example, if two genes/proteins are more than 30% identical over their entire length, then you can conclude that they are homologous: they descend from a common ancestor. Don't confuse "similarity" and "homology" because they are two different things. Similarity is the evidence; homology is the conclusion.¹
Homology is like being pregnant. Either you are or you aren't. You can't be 30% pregnant and you can't be 30% homologous.
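The distinction can be sketched in a few lines of Python. This is only an illustration, not a real method for inferring ancestry: the function names are hypothetical, the sequences are assumed to be already aligned with "-" marking gaps, and the 30% cutoff is just the rule of thumb mentioned above (real analyses rely on statistical tests, and there are several conventions for what to count in the denominator). The point is that similarity comes out as a number, while homology comes out as yes or no.

```python
GAP = "-"  # assumed gap character in the pre-aligned sequences


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Similarity: a measurement that can take any value from 0 to 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == GAP and b == GAP)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != GAP)
    return 100.0 * matches / len(columns)


def infer_homology(aligned_a: str, aligned_b: str, threshold: float = 30.0) -> bool:
    """Homology: a yes/no conclusion drawn from the evidence.

    The 30% threshold is the rule of thumb from the text, used here
    as an assumption for illustration only.
    """
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    a = "ACDEFGHIKL"
    b = "ACDEYGHIRL"
    print(percent_identity(a, b))  # a percentage
    print(infer_homology(a, b))    # True or False, never "80% homologous"
```

Note that `infer_homology` never returns a percentage. Two sequences can be 80% similar, but they are either homologous or they are not.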
I knew that the definition of homology had changed over the years, but I didn't know that the dispute over its usage in molecular phylogeny started in the 1960s. Here's the Margoliash letter.
I regret the error in citation (the journal name was given as Nature, rather than Science), which crept in among the 462 references of the review (1) to which Winter, Walsh, and Neurath take exception (Letters, 27 Dec.). In that review, the term homologous was taken to imply, in parallel to universal biological usage, "that the genes coding for the polypeptide chains considered, in all the species carrying these proteins, had at one time a common ancestral gene," and we stated that when this concept is not intended "it would be best to use any of the numerous synonyms of 'similar' and 'similarity' and not appear to be prejudging the issue of evolutionary relations." The "pointed and specific criticism" followed, and was entirely contained in the sentence: "Other definitions may cause confusion and are unlikely to supplant well established biological usages." The "other definitions" referred to the article by Neurath, Walsh, and Winter (2), in which they state, "The term homology as applied to proteins refers to similarity in amino acid sequence," and later, that comparisons of protein structures "must be interpreted on a statistical basis lest we misinterpret random similarities."
On this last score there is no argument. Winter, Walsh, and Neurath will surely agree that in this field erroneous conclusions are likely to arise from the lack of an appropriate statistical distinction between random similarities and similarities of structure greater than can result from random phenomena. An excellent method of performing just such a distinction was published by Fitch (3), and although Neurath, Walsh, and Winter acknowledge it in their article (2), they do not use any acceptable statistical techniques in their comparisons of proteases. Thus, even by their own definition they fail to show "homology."
Homology, in any biological evolutionary context has a generally understood and well-defined meaning, namely the one we have adopted for use in protein primary structure comparisons. One cannot argue that such comparisons represent an area of knowledge separate from evolutionary biology, and that therefore one may use the same words for other meanings, since such protein studies obtain their interest largely in terms of evolutionary concepts and have their major impact in the taxonomic-evolutionary field. Winter, Walsh, and Neurath justify their novel definition of "homology" by maintaining that, without fossil remains, it is not possible to decide whether the structural genes corresponding to a set of present-day proteins are or are not ancestrally related. Apart from the inherent danger of assuming that a problem is insoluble, it may be pointed out that six pages after the definition of "homology," the paper (1) reviewed a statistical method for demonstrating just such ancestral homology. One requires enough primary structures to derive a "statistical phylogenetic tree," as has been possible in the case of cytochrome c (4). From such a tree a simple statistical calculation permits one to approximate the number of residues in a set of proteins that will remain invariant, because of biological necessity, no matter how many species are examined (5). If, in the comparison of any two proteins of this set, the number of identical residues is substantially in excess of the number that remain invariant in the entire set of proteins, then clearly this excess cannot result from functional convergence from different phylogenetic origins, a process yielding analogous structures, and, therefore, it can only be attributed to ancestral homology. In such a procedure, the assumption of the constancy of the genetic code has replaced the fossils of the morphological evolutionist.
Even if one does not accept the validity of such a demonstration, it is difficult to understand why there is an insistence on using the word "homology" for "similarities of protein primary structure greater than random." Any of the over 30 synonyms of "similarity" (6) or a variety of elegant neologisms would do, and prevent an insidious misunderstanding likely to arise in biological literature. Rather than take Alice in her confused trip in Wonderland as a model for logical scientific nomenclature, I prefer to follow the 17th-century poet reacting against a form of debasement of the language then prevalent, and "call a cat a cat" (7).
Department of Molecular Biology,
North Chicago, Illinois 60064
1. C. Nolan and E. Margoliash, Ann. Rev. Biochem. 37, 727 (1968).
2. H. Neurath, K. A. Walsh, W. P. Winter, Science 158, 1638 (1967).
3. W. M. Fitch, J. Mol. Biol. 16, 9 (1966).
4. W. M. Fitch and E. Margoliash, Science 155, 279 (1967).
5. W. M. Fitch and E. Margoliash, Biochem. Genet. 1, 65 (1967).
6. Roget's Thesaurus (St. Martin's Press, New York, 1965).
7. N. Boileau, Satires 1, line 52 (1660). "J'appelle un chat un chat, et Rolet un fripon."
1. Very few people pay attention to me. I appear to be fighting for a lost cause.
Margoliash, E. (1969) Homology: A Definition. Science 163:127