Sunday, 25 November 2007


In cladistic analysis, the inference of homology has been described as at least a two-step procedure. The first step is the hypothesis of correspondence of constituent features between two or more organisms. The second step subjects these character hypotheses to the test of congruence (Rieppel & Kearney, 2002). In this sense, the quality of relationship hypotheses is not determined solely by the method used for phylogenetic inference. The definition and selection of characters constitute a fundamental step, because characters are what the crucial test in phylogenetic analysis is performed on.

A character is a logical relation established between intrinsic attributes of two or more organisms that is rooted in observation (Rieppel, 1988). As Rieppel & Kearney (2002) put it, “a meaningful character is thus based upon a character description that can in itself be evaluated... and potentially rejected”. In morphological data, a good character is classically expected to meet several criteria: topology, connectivity, and a one-to-one relationship between the parts being compared (Rieppel & Kearney, 2002). Other attributes, such as function and “special similarity”, are also relevant to character definition (Rieppel & Kearney, 2002; Agnarsson & Coddington, 2007). Topological correspondence, function, and special similarity can sometimes conflict, in which case quantitative methods for choosing among the criteria are necessary (Agnarsson & Coddington, 2007).

Another problem that morphological data face is coding. Whenever we compare features in different organisms, we assume some correspondence between them (not necessarily topological). A presence-absence coding (see Pleijel, 1995) therefore says nothing about that correspondence, and nothing about the relationships among taxa, because the evidence in a cladistic analysis is the change among character states, not the mere existence of different states (Brower, 2000).
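The contrast between coding schemes can be made concrete with a small sketch. The taxon names, the state labels, and the function `presence_absence_coding` are hypothetical and only illustrate the data structures involved; this is not a recommendation of either scheme.

```python
def presence_absence_coding(multistate):
    """Recode one multistate character as one 0/1 column per observed state.

    multistate -- dict mapping taxon name to a state label (hypothetical data)
    """
    labels = sorted(set(multistate.values()))
    return {
        taxon: {label: int(state == label) for label in labels}
        for taxon, state in multistate.items()
    }


# One character, "shape of some structure", with two alternative states.
shape = {"taxon_A": "round", "taxon_B": "oval", "taxon_C": "oval"}
recoded = presence_absence_coding(shape)
```

In the multistate version, "round" and "oval" are explicitly alternative states of the same feature. In the recoded version each column is scored independently, so the statement that the two conditions correspond to one another is no longer recorded in the matrix.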

In molecular data, the problem of character definition and coding is different. It is widely assumed that similarity is equivalent to homology, but because molecular data also contain homoplasy, similarity cannot simply be read as homology. A criterion for defining molecular character hypotheses is therefore necessary (de Pinna, 1991). Character definition for DNA (or other molecular data) can be treated as a preliminary step, in which sequences are aligned with an algorithm before tree search. Alternatively, it can be treated as part of the search for optimal trees, through direct optimization of the data, in which the character hypotheses change during the search (Wheeler, 1996). The second approach is preferable because it tests the topology directly and explores different possible interpretations of the data, rather than a single alignment.

Finally, congruence is the test that corroborates characters as synapomorphies. The tool that allows us to assess hypotheses of relationship, and concomitantly of character evolution, is parsimony. Parsimony maximizes congruence among characters, and thereby maximizes the propositions of homology (Farris, 1983; Kluge, 1997; Sober, 1998).
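As a rough illustration of how parsimony scores character congruence, the sketch below counts the minimum number of state changes each character requires on a fixed tree, using Fitch's algorithm for unordered characters, and sums them into a tree length. The toy matrix, the taxon names, and the function names are assumptions made for the example, not data from any of the cited works.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one unordered character on a tree.

    tree   -- nested 2-tuples of taxon names, e.g. (("A", "B"), ("C", "D"))
    states -- dict mapping taxon name to its state for this character
    """
    def visit(node):
        if isinstance(node, str):  # terminal taxon
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        shared = left_set & right_set
        if shared:  # descendants agree: no extra step needed here
            return shared, left_steps + right_steps
        # descendants disagree: one change is required at this node
        return left_set | right_set, left_steps + right_steps + 1

    return visit(tree)[1]


def tree_length(tree, matrix):
    """Sum of Fitch steps over all characters (columns) in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_chars)
    )


# Hypothetical matrix: rows are taxa, columns are three binary characters.
matrix = {"A": "111", "B": "110", "C": "001", "D": "000"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
length_1 = tree_length(tree_1, matrix)
length_2 = tree_length(tree_2, matrix)
```

On this toy matrix the first tree requires 4 steps and the second 5. The first two characters are congruent with each other and fit the first tree without homoplasy, so that tree is preferred, and the third character is interpreted as homoplastic on it.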

Agnarsson, I. & Coddington, J. A. 2007. Quantitative tests of primary homology. Cladistics.
Brower, A. 2000. Homology and the inference of systematic relationships: some historical and philosophical perspectives. In Homology and systematics, coding characters for phylogenetic analysis (eds. Scotland, R. & Pennington, R. T.). Taylor & Francis.
de Pinna, M. C. C. 1991. Concepts and tests of homology in the cladistic paradigm. Cladistics.
Farris, J. S. 1983. The logical basis of phylogenetic analysis. In Advances in Cladistics, Vol. 2 (eds. Platnick, N. I. & Funk, V. A.). Columbia University Press, New York.
Kluge, A. 1997. Testability and the refutation and corroboration of cladistic hypotheses. Cladistics.
Pleijel, F. 1995. On character coding for phylogenetic reconstruction. Cladistics.
Rieppel, O. & Kearney, M. 2002. Similarity. Biological Journal of the Linnean Society.
Wheeler, W. C. 1996. Optimization alignment: the end of multiple sequence alignment in phylogenetics? Cladistics.


Homology is correspondence due to shared ancestry. Homology assessment is a crucial step in phylogenetic analyses regardless of the type of data employed, since hypotheses of homology relate observations among taxa.

Every proposition of homology involves two stages, associated with its generation and its testing. The primary homology statement is conjectural and based on similarity. Secondary homology is the outcome of a pattern-detecting analysis, the congruence test, and represents a test of the expectation that the observed match of similarities is potentially part of a retrievable regularity indicative of a general pattern (de Pinna, 1991).

With molecular data, similarity guides primary homology, as it does with morphological characters, but there has been a tendency to believe that the higher the similarity, the more likely the sequences are to be homologous (Salemi and Vandamme, 2003; Patterson, 1988). In that view, the fundamental homology statements are made at the level of individual nucleotide bases. Alternatively, the fundamental homology statement can be made at the level of the sequence itself, because contiguous sequences are the homologous units that transform at prescribed costs among various states (Wheeler, 1999). Sequences themselves are then treated as the fundamental units of homology, and homology assessment is not considered a matter of aligning sequences and counting matches between them. Under “Direct Optimization”, the primary homology hypotheses vary dynamically during tree search, so the resulting hypotheses of homology are tested in conjunction with character congruence through parsimony (Wheeler, 2006). The simultaneous testing of all characters against one another constitutes the most severe test of congruence. In a phylogenetic context, the best alignment is the one that generates the most parsimonious tree when analyzed in conjunction with all relevant data (Philips, 2006).
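The idea of whole sequences transforming into one another "at prescribed costs" can be sketched for the simplest case of two sequences. The function below computes the minimum cost of transforming one unaligned sequence into another by substitutions and insertions or deletions, using standard dynamic programming. It is only a pairwise building block under assumed unit costs: direct optimization as described above works over a tree and reconstructs ancestral sequences, which this sketch does not attempt, and the function name and default costs are illustrative.

```python
def transformation_cost(seq_a, seq_b, substitution=1, indel=1):
    """Minimum cost of transforming seq_a into seq_b.

    substitution -- cost of replacing one base with a different one
    indel        -- cost of inserting or deleting one base
    """
    # prev[j] holds the cost of turning the processed prefix of seq_a
    # into the first j bases of seq_b.
    prev = [j * indel for j in range(len(seq_b) + 1)]
    for i, base_a in enumerate(seq_a, start=1):
        curr = [i * indel]
        for j, base_b in enumerate(seq_b, start=1):
            keep_or_swap = prev[j - 1] + (0 if base_a == base_b else substitution)
            delete = prev[j] + indel
            insert = curr[j - 1] + indel
            curr.append(min(keep_or_swap, delete, insert))
        prev = curr
    return prev[-1]


cost_equal_costs = transformation_cost("ACGT", "AGT")
cost_expensive_indels = transformation_cost("ACGT", "AGT", indel=2)
```

No alignment is stored or counted here. The sequences are compared as wholes, and the only output is the cost of the transformation under the chosen cost regime.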

Homology assessment has been shown to be influenced by the way characters are coded and by differing interpretations of the criteria used to assess homology statements, which shows that organismal variation is often conceptualized as characters and character states in different ways (Scotland and Pennington, 2000). Even so, morphological characters can be delimited more explicitly through detailed observation and the topological criterion, used in conjunction with special quality and intermediate conditions of form (Rieppel and Kearney, 2002). In general, better guidelines for character conceptualization are needed to help solve this problem.

Similarity and conjunction have been proposed as tests of homology, but only congruence serves this purpose. Similarity, as mentioned above, guides the homology conjecture. Conjunction is an indicator of non-homology, but it does not specify in which pairwise comparison the non-homology lies, and it depends on a specific scheme of relationships in order to refute a hypothesis of homology (de Pinna, 1991).
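The basic logic of conjunction can be sketched as a simple check: if two structures hypothesized to be homologous are found together in the same organism, the hypothesis is called into question. The observations, structure names, and function name below are hypothetical, and the sketch deliberately ignores the dependence on a scheme of relationships noted above.

```python
def conjunction_conflicts(structure_x, structure_y, observations):
    """Return the taxa in which both putative homologues occur together.

    observations -- dict mapping taxon name to the set of structures observed in it
    """
    return sorted(
        taxon
        for taxon, structures in observations.items()
        if structure_x in structures and structure_y in structures
    )


# Hypothetical observations for three taxa.
observations = {
    "taxon_A": {"structure_1"},
    "taxon_B": {"structure_2"},
    "taxon_C": {"structure_1", "structure_2"},
}
conflicts = conjunction_conflicts("structure_1", "structure_2", observations)
```

A non-empty result flags co-occurrence, but, as the paragraph above notes, it does not by itself say which pairwise comparison is the non-homologous one.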

On Homology

The importance of homology has been discussed by several authors (Patterson, 1988; Wagner, 1989; de Pinna, 1991; Rieppel & Kearney, 2002; Agnarsson & Coddington, 2007). Homology is thus a crucial foundation of systematics. The simplest meaning of homology is equivalence of parts (de Pinna, 1991). Patterson (1982) stated that homology is equal to synapomorphy, so synapomorphic characters must be homologous. Nevertheless, symplesiomorphic characters can also be homologous (as homologies at a higher level).
There are two kinds of homology: primary homology (conjectures or hypotheses about the common origin of characters) and secondary homology (the tested hypotheses) (de Pinna, 1991).

Pattern or process
One question about homology is the significance of evolutionary process in the identification of homologous characters. Evolutionary events are implicitly bound to the analysis of characters (Lee, 2002). However, Brower (2000) stated that “the similarities - homologies - between taxa represent the only necessary ontological foundation for the construction of cladograms and hypotheses of taxonomic grouping”. Evolutionary assumptions are therefore not necessary for the identification of homologies, although homologous characters can be explained by evolutionary process (Rieppel & Kearney, 2002).
Similarly, some authors claim that circularity in homology (the need for an a priori topology) is undesirable, because the recognition of homologous characters is conditional on mapping them onto an initial topology. Nevertheless, circularity is not a serious problem, because the tested homologies (hypotheses of homology mapped onto a topology) remain hypotheses that are subjected to new analyses (new tests).

An inherent point in the identification of homologous characters is character coding. An inadequate definition of characters biases the identification of homology. Some authors claim that the real problem in homology is character coding. For example, there is the belief that character coding depends on the researcher's knowledge of the taxon, so that the “eyes” of an experienced researcher would discriminate and describe characters and character states better than those of an inexperienced researcher.
One approach used in character coding was morphometric analysis (biometry). However, biometry is not useful for homology, because of the complexity of biological structures and their incompatibility with multivariate statistical analysis (Bookstein, 1994).

The three tests of homology (similarity, conjunction, and congruence) are applied sequentially in the identification of homology. However, similarity is not a test in the Popperian sense; it is a conjecture of homology, that is, primary homology (de Pinna, 1991). Conjunction and congruence (agreement in supporting the same phylogenetic relationships) are the “hard” tests of homology, corresponding to the secondary homology of de Pinna (Rieppel and Kearney, 2002). Although the tests are interdependent, this does not mean that the three reduce to a single test. The result is a corroborated hypothesis of homology, but it is not definitive (“true homology”).
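For the simplest case, agreement between two characters can be checked directly. Two binary characters can both fit a single tree without homoplasy unless all four combinations of states (0/0, 0/1, 1/0, 1/1) occur among the taxa. The sketch below applies that pairwise check. It is a simplification: the congruence test discussed here evaluates all characters together on a tree, and the character data and function name are hypothetical.

```python
def are_compatible(char_x, char_y):
    """Pairwise check for two binary characters scored on the same taxa.

    char_x, char_y -- dicts mapping taxon name to state 0 or 1
    Returns False only when all four state combinations are observed.
    """
    combinations = {(char_x[taxon], char_y[taxon]) for taxon in char_x}
    return len(combinations) < 4


# Hypothetical characters scored for four taxa.
char_1 = {"A": 1, "B": 1, "C": 0, "D": 0}
char_2 = {"A": 1, "B": 1, "C": 1, "D": 0}
char_3 = {"A": 1, "B": 0, "C": 1, "D": 0}

agree = are_compatible(char_1, char_2)
conflict = are_compatible(char_1, char_3)
```

Here `char_1` and `char_2` support nested groups and can both be retained as homologies, whereas `char_1` and `char_3` support conflicting groups, so at least one of them must be homoplastic on any tree.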

Molecular homology
The identification of homology in molecular characters (nucleotide sequences) presents problems not found in other kinds of character data. For example, although each base position presents one of the same four states (A, C, G or T), the number of positions is likely to vary; that is, homologous nucleotide sequences may differ in length (Wheeler, 1996). Patterson (1988) claims that the tests of homology for molecular characters are the same as those for morphological characters. However, the weight of the tests (similarity, conjunction, and congruence) differs: for molecular characters similarity is the most crucial test, whereas for morphological characters it is congruence.
A point of discussion in the identification of homologies in molecular sequences is the need for an alignment to determine homologous sites or fragments. This route is considered inadequate because the alignment is generated using a priori costs and assumptions. Wheeler (2003) proposes a synapomorphy-based alignment method, Implied Alignment (IA), that identifies homologies on the topology using Direct Optimization (DO). Implied alignment thus generates all possible alignments, and all possible homologies are analyzed. This method may be efficient for identifying homologies. Nevertheless, a priori assumptions are inherent to alignments.
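The dependence of an alignment on a priori costs can be shown with a minimal global alignment sketch (Needleman-Wunsch style dynamic programming with a traceback). The same pair of sequences is aligned under two hypothetical cost regimes, and the resulting statements of site-to-site correspondence differ. The sequences, costs, and function name are assumptions for illustration; this is not the implied alignment procedure of Wheeler (2003).

```python
def align(seq_a, seq_b, substitution, indel):
    """Return (cost, aligned_a, aligned_b) for one minimum-cost global alignment."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * indel
    for j in range(1, cols):
        cost[0][j] = j * indel

    def pair_cost(i, j):
        return 0 if seq_a[i - 1] == seq_b[j - 1] else substitution

    for i in range(1, rows):
        for j in range(1, cols):
            cost[i][j] = min(
                cost[i - 1][j - 1] + pair_cost(i, j),  # pair two bases
                cost[i - 1][j] + indel,                # gap in seq_b
                cost[i][j - 1] + indel,                # gap in seq_a
            )

    # Trace back from the last cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + pair_cost(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + indel:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return cost[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


# Same sequences, two hypothetical cost regimes.
cheap_substitutions = align("ACGT", "AGCT", substitution=1, indel=3)
cheap_indels = align("ACGT", "AGCT", substitution=3, indel=1)
```

With cheap substitutions the optimal alignment has no gaps, so every position is paired with the position of the same index. With cheap indels the optimal alignment introduces gaps and pairs different sites. The hypotheses of positional homology therefore follow from the costs chosen before the analysis.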


  • Agnarsson, I., & Coddington, J. A. (2007). Quantitative tests of primary homology. Cladistics, 23, 1-11.
  • Brower, A. V. Z. (2000). Evolution is not an assumption of cladistics. Cladistics, 16, 143-154.
  • de Pinna, M. C. C. (1991). Concepts and tests of homology in the cladistic paradigm. Cladistics, 7, 367-394.
  • Lee, M. S. Y. (2002). Divergent evolution, hierarchy, and cladistics. Zoologica Scripta, 31, 217-219.
  • Patterson, C. (1988). Homology in classical and molecular biology. Molecular Biology and Evolution, 5, 603-625.
  • Rieppel, O., & Kearney, M. (2002). Similarity. Biological Journal of the Linnean Society, 75, 59-82.
  • Wagner, G. P. (1989). The biological homology concept. Annual Review of Ecology and Systematics, 20, 51-69.
  • Wheeler, W. C. (1996). Optimization alignment: the end of multiple sequence alignment in phylogenetics? Cladistics, 12, 1-9.
  • Wheeler, W. C. (2003). Implied alignment: a synapomorphic-based multiple alignment method and its use in cladogram search. Cladistics, 19, 261-268.