Bottom Line:
Different phylogenetic trees often contain conflicting results and contradict significant background data. We show that signal-like patterns in the data set are conflicting and partly not distinct, and that the reported strong support for a "rather surprising result" (monoplacophorans and chitons form a monophylum Serialia) does not exist at the level of primary homologies. Even though a majority of molecular phylogenies are currently being justified with reference to the 'statistical' support of clades in tree topologies, this confidence seems to be unfounded.

Background: Molecular phylogenies are being published at an increasing rate, and many biologists rely on the most recent topologies. However, different phylogenetic trees often contain conflicting results and contradict significant background data. Because it is unclear how reliable traditional knowledge is, a crucial question concerns the quality of newly produced molecular data. The information content of DNA alignments is rarely discussed, as quality statements are mostly restricted to the statistical support of clades. Here we present a case study of a recently published mollusk phylogeny that contains surprising groupings, based on five genes and 108 species. We apply new or rarely used tools for analyzing the information content of alignments and for filtering noise (masking of random-like alignment regions, split decomposition, phylogenetic networks, quartet mapping).

Results: The data are very fragmentary and contain contaminations. We show that signal-like patterns in the data set are conflicting and partly not distinct, and that the reported strong support for a "rather surprising result" (monoplacophorans and chitons form a monophylum Serialia) does not exist at the level of primary homologies. Split-decomposition, quartet mapping and neighbornet analyses reveal conflicting nucleotide patterns and a lack of distinct phylogenetic signal for the deeper phylogeny of mollusks.

Conclusion: Even though a majority of molecular phylogenies are currently being justified with reference to the 'statistical' support of clades in tree topologies, this confidence seems to be unfounded. Contradictions between phylogenies based on different analyses are already a strong indication of unnoticed pitfalls. The use of tree-independent tools for exploratory analyses of data quality is highly recommended. Concerning the new mollusk phylogeny, more convincing evidence is needed.

Figure 7: Visualizing phylogenetic structure of alignments via quartet mapping (Nieselt-Struwe and von Haeseler, 2001). Dots in a corner of a triangle represent high support for only one of the three topologies that can be constructed for a quartet of taxa. Dots in the centre represent a star-like topology, and the rest of the triangle stands for intermediate situations. Red circles indicate placement of the mean fraction of points. In all cases the majority of quartets are near the star-tree region, indicating little or no phylogenetic signal. The studied combinations are: A1–6: Original alignment of Giribet et al. (2006) with all characters. B1–6: Same alignment after exclusion of columns with gaps or missing data. C1–6: Same alignment after masking with the ALISCORE approach. For each alignment, the association of Laevipilina with all six possible variants of pairs of higher mollusc taxa was examined (see text). B = Bivalvia, G = Gastropoda, L = Laevipilina, P = Polyplacophora, S = Scaphopoda.

Mentions:
The analyses of the 28S region of the original alignment were executed three times: (A) with all characters, (B) without gap-containing columns, and (C) with the data after application of ALISCORE masking. Accumulation of dots in triangle corners and absence of dots in the central region of triangles are indications of phylogenetic structure in the data set. In all three analyses (Fig. 7) it is apparent that Laevipilina fits best with the Scaphopoda and Polyplacophora sequences (triangle corners with groups {(L)(S)} and {(L)(P)}), albeit without strong support. Red circles indicate the mean fraction of simplex points, and the radius represents the standard deviation. In all three groups the mean center of simplex points (red dot) is within the star-like tree area, indicating only weak, if any, signal for a single preferred topology. Excluding gap-containing columns and masking the alignment with the ALISCORE approach enhanced the signal, but not beyond the star-tree area.
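The geometry behind such a triangle plot can be illustrated with a minimal sketch. It assumes that each quartet comes with three non-negative support values, one per possible topology, and that these are normalized to sum to one before plotting. The function names `simplex_point` and `summarize` are hypothetical, and taking the standard deviation as the root-mean-square distance from the mean point is an assumption; the text does not specify how the radius was computed.

```python
import math

# Corners of an equilateral triangle with unit side length; each corner
# stands for one of the three possible topologies of a quartet of taxa.
CORNERS = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))
# Centroid of the triangle: equal support for all three topologies (star-like).
CENTRE = (0.5, math.sqrt(3) / 6)


def simplex_point(supports):
    """Map three non-negative support values to a point inside the triangle."""
    total = sum(supports)
    if total == 0:
        return CENTRE  # no support for any topology: treat as star-like
    weights = [s / total for s in supports]
    x = sum(w * cx for w, (cx, _) in zip(weights, CORNERS))
    y = sum(w * cy for w, (_, cy) in zip(weights, CORNERS))
    return (x, y)


def summarize(quartet_supports):
    """Return the mean simplex point, its spread, and its distance to the centre."""
    points = [simplex_point(s) for s in quartet_supports]
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Assumed definition of spread: root-mean-square distance from the mean point.
    sd = math.sqrt(
        sum((p[0] - mean_x) ** 2 + (p[1] - mean_y) ** 2 for p in points) / n
    )
    dist_to_centre = math.hypot(mean_x - CENTRE[0], mean_y - CENTRE[1])
    return (mean_x, mean_y), sd, dist_to_centre
```

In this sketch, a set of quartets whose mean point lies close to `CENTRE` corresponds to the star-like situation described above, whereas a mean point near one of the `CORNERS` would indicate a single preferred topology.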
