



The use of bayesian inference in character evolution studies, focusing on its ability to account for both phylogenetic and mapping uncertainty. How bayesian methods enable the reconstruction of ancestral states and histories of character change with more credibility than traditional methods. It also touches upon sensitivity analysis and the importance of using realistic models in bayesian phylogenetic analysis.




Much recent progress in evolutionary biology is based
on the inference of ancestral states and past transformations
in important traits on phylogenetic trees. These
exercises often assume that the tree is known without
error and that ancestral states and character change
can be mapped onto it exactly. In reality, there is often
considerable uncertainty about both the tree and the
character mapping. Recently introduced Bayesian statistical
methods enable the study of character evolution
while simultaneously accounting for both phylogenetic
and mapping uncertainty, adding much needed credibility
to the reconstruction of evolutionary history.
Evolution is a difficult phenomenon to study. It is rarely
fast enough to be observed directly and only in exceptional
cases is it possible to find physical evidence, such as fossils
or ancient DNA, of past states and events. Fortunately,
evolution leaves its footprint in the distribution of traits
among living things. By studying this footprint, we can
infer how organisms originated through the successive
splitting of ancestral lineages, a process depicted in
phylogenetic trees. Given a phylogenetic tree, we can
also reconstruct the evolutionary history of individual
traits of interest.
The wide range of questions that can be addressed by
the INFERENCE (see Glossary) of ancestral states or paths
of change in key traits on phylogenetic trees is fascinating.
A few examples include the design of vaccines [1], the
reconstruction of ancestral hormone receptors [2] and
ancestral metabolic pathways [3], the inference of ancient
behaviours [4], the identification of past dispersal patterns
[5–7], the study of positive selection in proteins [8], the
discovery of viral infection pathways [9], and the recognition
of character correlation in coevolving lineages [10].
Many of these applications still rely on explicit or
implicit PARSIMONY mapping of characters onto a single
phylogenetic tree. The parsimony method finds the
reconstruction that implies the smallest number of
changes on the given tree; the solution is often intuitively
obvious (Figure Ia in Box 1). The inferred ancestral states
and character changes using parsimony typically reveal
the process of evolution in exhilarating detail.
It has long been recognized that this approach ignores
two important sources of error. First, the parsimony
principle singles out the solution(s) requiring the
minimum amount of change on the given tree, although there is usually a range of alternative reconstructions on the same tree that are almost as likely [11] (MAPPING UNCERTAINTY; Box 1). Second, the tree is almost never known without error [12] (PHYLOGENETIC UNCERTAINTY; Box 2). If there is a range of plausible trees, it is possible that the evolutionary history of a trait could differ depending on the tree. Clearly, ignoring either of these sources of error is potentially misleading.
Glossary
Bayesian inference: theory of statistical inference based on the idea of rational accumulation of scientific knowledge. Statistical models and model parameters are regarded as random variables, and statistical analysis uses data (observations) to update a prior probability distribution on these parameters to a posterior probability distribution.
Bootstrapping (nonparametric): procedure for examining the uncertainty in a statistical estimate by drawing new samples (pseudosamples) from the original sample, and repeating the statistical procedure for each of these new samples. There is also a parametric variant that generates new samples by using a parametric model estimated from the original sample.
Conditional probability: the probability conditioned on (given) some information; we can think of it as a relative probability. In Box 1, the conditional (relative) probability of ancestor B being purple (state 0) is Pr(B = 0) = 0.00024/0.00037 = 0.65. Hence, the conditional probability of it being green is Pr(B = 1) = 1 − Pr(B = 0) = 1 − 0.65 = 0.35.
Inference: to draw conclusions about a statistical model using empirical data.
Likelihood: probability that a particular model (with specific parameter values) produced some observed data. For instance, the likelihood (probability) of the data in Box 1 is L = 0.00037 given the binary Markov model with π_1 = 0.5 and summing over ancestral states. If ancestor B has state 0 (purple), the likelihood is L(B = 0) = 0.00024; if it has state 1 (green), the likelihood is L(B = 1) = L − L(B = 0) = 0.00037 − 0.00024 = 0.00013.
Mapping uncertainty: the error associated with reconstructing the evolution of a character on a given phylogenetic tree.
Markov chain Monte Carlo (MCMC): stochastic simulation technique for generating a sample from a complex distribution that is known up to a normalizing constant. It is widely used to sample Bayesian posterior distributions, where it is based on specially designed Markov models (similar to, but more complex than, the ones used to model evolution; Box 3) and their tendency to move towards a stationary condition.
Maximum likelihood (ML): widely used method of statistical inference that finds the parameter values that maximize likelihood. For instance, when π_1 = 0.5, the ML state of ancestor B (Figure Ib in Box 1) is 0 (purple) because L(B = 0) > L(B = 1). More typically, ML is used to estimate the free parameters of a probability model. For instance, if we vary π_1, we discover that the likelihood of the observed data is maximized when π_1 ≈ 0.20. This is the ML estimate of π_1.
Parsimony: inference principle based on minimizing cost; in evolutionary inference, usually the same as minimizing the number of character changes.
Phylogenetic uncertainty: the uncertainty in reconstructing character evolution owing to error in the phylogenetic estimate.
Posterior (probability distribution): probability distribution describing the knowledge about a model and its parameters after a Bayesian analysis. Can be used as the prior in a subsequent Bayesian analysis.
Prior (probability distribution): probability distribution specifying the knowledge about a model and its parameters before a Bayesian analysis.

Corresponding author: Fredrik Ronquist ([email protected]).
Available online 21 July 2004
www.sciencedirect.com 0169-5347/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.tree.2004.07.
Sensitivity analysis
A simple way of examining the robustness of evolutionary
inference is to look at how sensitive the results are to
slight changes in the analytical conditions, a procedure
known as sensitivity analysis [13–17]. Belshaw and
Quicke [18] recently used this approach extensively in
studying the evolution of a group of parasitic wasps in
which species lay their eggs in either concealed or exposed
hosts. They examined mapping uncertainty by modifying
the parsimony cost of the evolutionary switch from
exposed to concealed hosts relative to the cost of the
opposite switch until their evolutionary reconstruction
changed. Then they assessed phylogenetic uncertainty by
finding the difference in parsimony score between the
preferred tree and the best tree implying a different
evolutionary history. The authors concluded that there
was strong support for two switches in the parsimony
reconstruction: one to exposed hosts and another in the
opposite direction. Other sensitivity-type approaches to
phylogenetic uncertainty include enumerating all possible
phylogenetic trees consistent with some classification
of the studied organisms [19] or simulating
alternative trees according to some plausible tree-generating
process, such as random speciation and extinction
[20], and then mapping the studied character(s) onto each
of these trees.
Although a particular sensitivity analysis can be
illuminating, the procedure is a little bit like measuring
the stability of sand castles by pouring water, blowing and
stepping on them. Without standardization, the fact that
one castle stands and the other falls might be solely
because of differences in the treatment. Parametric
statistical methods, such as likelihood analysis and
Bayesian inference, can potentially add the rigor that
sensitivity analysis lacks.
Mapping uncertainty
Likelihood analysis
The most common parametric approach to mapping uncertainty is based on LIKELIHOODS calculated from an evolutionary probability model. The model of choice for discrete characters is the Markov model (Box 3), exemplified by the Jukes-Cantor model and similar models long used by molecular evolutionists to study nucleotide, amino acid and codon evolution [21]. There are also Markov models for discrete characters with an arbitrary number of states [11,22–25]. For quantitative characters, Brownian motion is a popular evolutionary model [12,26]. Both Markov and Brownian motion models are mathematically convenient, but they are also able to capture many of the known complexities of the evolutionary process.
Given a fixed tree with fixed branch lengths, fixed states for the tips, and a Markov model with fixed parameters (the fixed values of which are often derived from a MAXIMUM LIKELIHOOD analysis), we can calculate the RELATIVE (CONDITIONAL) PROBABILITY of each ancestral state given the observed states at the tips [27–29]. The conditional probabilities indicate some of the uncertainty in the ancestral state assignments (Figure Ib in Box 1), but the potential error in the fixed parameters is not accounted for. For instance, assume that we were mapping a binary character (0 or 1), with the unknown parameter π_1 specifying the rate of 0→1 changes measured as a fraction of the total evolutionary rate (π_0 + π_1, where π_0 is the rate of 1→0 changes; Box 3). For the conditional probabilities to be valid, we have to assume that the value of π_1 is known with certainty, although it is typically an estimate, perhaps a maximum likelihood estimate, associated with some error.
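The conditional-probability calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single hypothetical ancestor with its tips attached directly to it, made-up branch lengths, the binary transition probabilities of Box 3 with the scaling factor set to 1, and the stationary frequencies used as the prior on the ancestral state. The function names are invented for this example.

```python
import math

def transition_prob(i, j, t, pi1, mu=1.0):
    """Probability of ending in state j after time t when starting in state i
    under the binary Markov model (mu = 1 is an assumption)."""
    pi = (1.0 - pi1, pi1)
    decay = math.exp(-mu * t)
    if i == j:
        return pi[j] + (1.0 - pi[j]) * decay
    return pi[j] * (1.0 - decay)

def ancestor_likelihoods(tips, pi1):
    """Likelihood of the tip data for ancestor state 0 and ancestor state 1.

    tips: list of (observed_state, branch_length) pairs, one per descendant.
    """
    pi = (1.0 - pi1, pi1)
    likelihoods = []
    for ancestor_state in (0, 1):
        like = pi[ancestor_state]
        for tip_state, branch_length in tips:
            like *= transition_prob(ancestor_state, tip_state, branch_length, pi1)
        likelihoods.append(like)
    return likelihoods

def conditional_probs(tips, pi1):
    """Relative (conditional) probability of each ancestral state, with pi1 fixed."""
    likelihoods = ancestor_likelihoods(tips, pi1)
    total = sum(likelihoods)
    return [like / total for like in likelihoods]

# Hypothetical data: two tips in state 0 on short branches, one in state 1.
tips = [(0, 0.1), (0, 0.1), (1, 0.5)]
print(conditional_probs(tips, pi1=0.5))
```

Note that `pi1` is held fixed here, which is exactly the limitation the text points out: any error in that estimate is ignored.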
TRENDS in Ecology & Evolution
Figure I. Panels (a) to (d) show the example tree with ancestors A to G; panel (c) plots the prior and posterior probability of π_1.
We are interested in inferring the ancestral states of a character with two states, purple (0) or green (1). The states could, for example, represent particular behaviours, life-history traits or morphological features. The tree and the ages of the nodes are known; the scale is in amount of expected change. Parsimony (Figure Ia) finds the reconstruction requiring the minimum amount of change between green and purple. In this case, there are two changes [marked (i) and (ii)] assuming gains and losses count equally. Ancestors are inferred as being either green or purple, but we do not know how certain these conclusions are. That is, we have not taken mapping uncertainty into account. Likelihood analysis requires that we know the relative rates of 0→1 changes (π_1) and 1→0 changes (π_0). Assuming that these rates are equal (π_0 = π_1 = 0.5), for instance, we can calculate the probability of each ancestor being either green or purple (Figure Ib). The ancestral state is uncertain for ancestors A, B and F because they are on long branches or close to regions of the tree where a state change is likely. In Bayesian inference, π_1 does not have to be fixed. Instead, we specify a prior probability distribution on π_1. In the absence of background information, we can assume that all possible values are equally likely (Figure Ic: prior). This enables us to infer ancestral states while weighting each π_1 value according to its probability given the data (Figure Ic: posterior). In our example, Bayesian inference simply adds a dash more uncertainty to the conditional probability values (Figure Id; the effect is most notable for ancestor C).
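To make the idea of not fixing the rate parameter concrete, the sketch below averages the ancestral-state probability over a uniform prior on the 0→1 rate fraction, using a simple grid approximation instead of MCMC. The single-ancestor setup, the branch lengths, the grid size and the function names are all assumptions made for illustration; the scaling factor of the Markov model is set to 1.

```python
import math

def tip_likelihoods(tips, pi1, mu=1.0):
    """Likelihood of the tip data for ancestor state 0 and state 1, given pi1."""
    pi = (1.0 - pi1, pi1)
    out = []
    for ancestor_state in (0, 1):
        like = pi[ancestor_state]          # stationary frequency as state prior
        for tip_state, t in tips:
            decay = math.exp(-mu * t)
            if tip_state == ancestor_state:
                like *= pi[tip_state] + (1.0 - pi[tip_state]) * decay
            else:
                like *= pi[tip_state] * (1.0 - decay)
        out.append(like)
    return out

def bayes_ancestor_prob(tips, n_grid=999):
    """Posterior probability that the ancestor is in state 0, integrating
    over pi1 with a uniform prior (approximated on an evenly spaced grid)."""
    joint0 = joint1 = 0.0
    for k in range(n_grid):
        pi1 = (k + 1) / (n_grid + 1)       # interior grid points in (0, 1)
        like0, like1 = tip_likelihoods(tips, pi1)
        joint0 += like0                    # uniform prior: equal weight per point
        joint1 += like1
    return joint0 / (joint0 + joint1)

tips = [(0, 0.1), (0, 0.1), (1, 0.5)]      # hypothetical (state, branch length)
fixed0, fixed1 = tip_likelihoods(tips, 0.5)
print("pi1 fixed at 0.5:", fixed0 / (fixed0 + fixed1))
print("pi1 integrated out:", bayes_ancestor_prob(tips))
```

Each grid value of the rate fraction contributes in proportion to how well it explains the data, which is the weighting the box describes.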
estimate. One of the advantages of bootstrapping is that it
can be applied to a wide array of methods for reconstructing
phylogeny, including distance methods, parsimony,
maximum likelihood and even Bayesian inference.
Using bootstrapping to account for phylogenetic uncertainty
in studies of character evolution is straightforward:
simply map characters onto each of the bootstrap
estimates of phylogeny and then use the distribution of
these mappings to describe the effect of the uncertainty.
The approach seems to have been first discussed by
Felsenstein [12] and first applied by Richman and Price
[34]. Ronquist and Liljeblad [35] recently used the
technique extensively in reconstructing the origin of gall
wasps. Simple parsimony reconstruction suggested that
the first gall wasps lived in the Mediterranean and
induced single-chambered galls that were distinct swellings
of the seed-capsules of herbaceous Papaveraceae.
When phylogenetic uncertainty was taken into account, it
turned out that the robustness of these conclusions varied
considerably.
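The procedure of mapping a character onto every tree in a bootstrap sample and summarizing the resulting distribution might look roughly like the following. The trees and tip states are toy values invented for this sketch, trees are assumed to be rooted and binary, and Fitch parsimony with equal costs stands in for whatever mapping method a study would actually use.

```python
from collections import Counter

def fitch_root_and_changes(tree, tip_states):
    """Fitch downpass on a rooted binary tree given as nested 2-tuples of
    tip names. Returns (set of optimal root states, minimum number of changes)."""
    changes = 0

    def downpass(node):
        nonlocal changes
        if not isinstance(node, tuple):            # a tip
            return frozenset([tip_states[node]])
        left, right = downpass(node[0]), downpass(node[1])
        if left & right:                           # children agree on some state
            return left & right
        changes += 1                               # disagreement costs one change
        return left | right

    return downpass(tree), changes

def summarize_over_trees(trees, tip_states):
    """Fraction of sampled trees implying each number of changes."""
    counts = Counter(fitch_root_and_changes(t, tip_states)[1] for t in trees)
    return {n_changes: c / len(trees) for n_changes, c in counts.items()}

# Toy "bootstrap sample": three replicates give one topology, one gives another.
tip_states = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = ((("a", "c"), "b"), ("d", "e"))
print(summarize_over_trees([t1, t1, t1, t2], tip_states))
```

The output is a frequency table over the tree sample, so a conclusion such as "one change" can be reported together with the share of trees that support it.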
Unfortunately, bootstrapping cannot be used to address
mapping uncertainty. Bootstrapping DNA sequences, for
example, will be of little help in understanding how precise the mapping of a single behavioural character might be on each of the possible trees. Bayesian inference, however, can account for both mapping and phylogenetic uncertainty across a heterogeneous dataset. In principle, we only need to expand the probability model to include topology, branch lengths, and other parameters necessary to infer phylogeny from the available data. The posterior probability distribution can no longer be calculated analytically (with pen and paper) because it is so complex, but we can sample from it using stochastic simulation in, for example, so-called MARKOV CHAIN MONTE CARLO (MCMC) techniques [36–40]. If the simulation is run long enough, we obtain a valid sample of the posterior probability distribution. Box 2 gives an example of how phylogenetic and mapping uncertainty are handled in a Bayesian analysis of a discrete character. Huelsenbeck and colleagues [31] first developed this approach, illustrating it with an analysis of the origin of soldier aphids. Parsimony suggested one origin and three losses of the soldier caste, but the Bayesian analysis revealed that this conclusion was uncertain. About a year later, Lutzoni and Pagel
Markov models are used for random processes, in which the probability of change depends only on the current state (the Markov property). They are most easily understood in terms of their instantaneous rate matrix, which describes the transition rates in an infinitesimal amount of time. For a discrete character with two states, 0 or 1, the rate matrix Q is (Eqn I)
\[ Q = \{q_{ij}\} = \begin{pmatrix} -\pi_1 & \pi_1 \\ \pi_0 & -\pi_0 \end{pmatrix} \qquad \text{[Eqn I]} \]
where q_ij refers to the rate in row i and column j of Q. There are two different rates in the off-diagonals: q_01 = π_1 is the rate of 0→1 transitions, and q_10 = π_0 is the rate of 1→0 transitions. The diagonal contains the loss rates. For instance, q_00 = −π_1 is the rate at which the frequency of state 0 changes. The rate is negative because the frequency decreases as the character evolves from 0 to 1. The rate at which the frequency of a state decreases must balance the rate at which it evolves into other states; thus, each row in Q sums to 0.
Markov models usually tend towards an equilibrium condition (stationarity). The probability of being in a particular state i at stationarity (the stationary frequency of the state) is usually denoted π_i and can be determined from the rate matrix Q. In the binary model, the stationary frequencies correspond to the transition rates (scaling disregarded). To illustrate this, I ran three simulations under a two-state Markov model with π_1 = 0.75 (and π_0 = 0.25). Each simulation had 200 independently evolving characters; one was started with all characters in state 0, one with all characters in state 1, and the last one with half the characters in each state. In all cases, ~75% of the characters ended up in state 1 and ~25% in state 0, as predicted by the stationary frequencies (Figure I).
To use a Markov model for simulations or probability calculations, we want to know the transition probabilities over a certain time period, t. These are represented in a matrix denoted P(t), which is obtained by integrating Q over time. For the binary Markov model, we get (Eqn II)
\[ P(t) = \{p_{ij}(t)\} = \begin{pmatrix} \pi_0 + \pi_1 e^{-\mu t} & \pi_1 - \pi_1 e^{-\mu t} \\ \pi_0 - \pi_0 e^{-\mu t} & \pi_1 + \pi_0 e^{-\mu t} \end{pmatrix} \qquad \text{[Eqn II]} \]
where μ is a scaling factor. Each element of the P matrix summarizes the probability of a particular state change over an infinite number of change histories. For instance, p_01(t) is the sum of the probability of one change 0→1, three changes 0→1→0→1, five changes 0→1→0→1→0→1, and so on, over time t.
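A short Python sketch can make the behaviour of the transition probability matrix tangible. It codes the binary P(t) directly, then repeats in miniature the kind of simulation described above (200 characters from three different starting compositions). The scaling factor of 1, the elapsed time of 10 and the random seed are arbitrary choices for this illustration, not values from the article.

```python
import math
import random

def p_matrix(t, pi1, mu=1.0):
    """Transition probabilities of the binary Markov model over time t.
    Row = starting state, column = ending state (mu = 1 is an assumption)."""
    pi0 = 1.0 - pi1
    decay = math.exp(-mu * t)
    return [[pi0 + pi1 * decay, pi1 - pi1 * decay],
            [pi0 - pi0 * decay, pi1 + pi0 * decay]]

def simulate_end_states(start_states, t, pi1, rng):
    """Draw an end state for each independently evolving character."""
    P = p_matrix(t, pi1)
    return [1 if rng.random() < P[s][1] else 0 for s in start_states]

rng = random.Random(1)                      # fixed seed for repeatability
starts = {
    "all in state 0": [0] * 200,
    "all in state 1": [1] * 200,
    "half and half": [0] * 100 + [1] * 100,
}
for label, start_states in starts.items():
    end = simulate_end_states(start_states, t=10.0, pi1=0.75, rng=rng)
    print(label, "-> frequency of state 1:", sum(end) / len(end))
```

At t = 0 the matrix is the identity, and for large t both rows approach the stationary frequencies, which is why the starting composition stops mattering.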
The P matrix can be used to simulate the states at the terminals of an evolutionary tree. We draw a starting state at the root of the tree from the stationary frequencies. Then we use the P matrix for each branch in turn to generate the end state of that branch. To obtain a sample of change histories, we need to go back to the Q matrix and utilize the fact that the waiting time (x) to the next change is exponentially distributed (Eqn III):
\[ \Pr(x) = -q_{ii}\, e^{q_{ii} x} \qquad \text{[Eqn III]} \]
where Pr(x) is the probability density of x and q_ii is the loss rate of the current state (a negative number, so the waiting time is exponential with rate −q_ii). Thus, for a binary character in state 1, where q_11 = −π_0, the waiting time to the next change is distributed as (Eqn IV)
\[ \Pr(x) = \pi_0\, e^{-\pi_0 x} \qquad \text{[Eqn IV]} \]
When there are more than two states, the probability of the change being from i to a particular state j is determined by (Eqn V)
\[ \Pr(j) = \frac{q_{ij}}{\sum_{k \neq i} q_{ik}} \qquad \text{[Eqn V]} \]
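The two ingredients above, an exponential waiting time and a choice among destination states in proportion to their rates, are enough to simulate a change history along a branch. The following sketch does this for an arbitrary rate matrix; the three-state matrix, the branch duration and the function name are hypothetical, and any scaling factor is assumed to be already absorbed into Q.

```python
import random

def simulate_history(Q, start, duration, rng):
    """Simulate one change history along a branch of the given duration.

    Q: instantaneous rate matrix as a list of rows (each row sums to 0).
    Returns a list of (time_of_change, new_state) pairs."""
    state, clock, history = start, 0.0, []
    while True:
        leave_rate = -Q[state][state]
        if leave_rate <= 0.0:               # absorbing state: no further change
            return history
        clock += rng.expovariate(leave_rate)        # waiting time to next change
        if clock >= duration:
            return history
        targets = [j for j in range(len(Q)) if j != state]
        weights = [Q[state][j] for j in targets]    # proportional to q_ij
        state = rng.choices(targets, weights=weights)[0]
        history.append((clock, state))

# Hypothetical three-state rate matrix, for illustration only.
Q = [[-0.30, 0.20, 0.10],
     [0.50, -0.50, 0.00],
     [0.25, 0.25, -0.50]]
print(simulate_history(Q, start=0, duration=5.0, rng=random.Random(3)))
```

The end state of the branch is simply the state after the last recorded change, or the starting state if the history is empty.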
Figure I. Frequency of state 1 plotted against time, with the stationary frequency (π_1) marked.
published similar work [41,42] on the origin of lichenized
fungi, but failed to apply a strict Bayesian approach to
mapping uncertainty. Recently, Huelsenbeck and Rannala
extended Bayesian MCMC techniques to the comparative
analysis of quantitative characters using Brownian
motion models [43].
An old argument is whether or not the character to be
mapped should be included in the phylogenetic analysis
[44–48]. The Bayesian posterior probabilities are always
based on all the available data. Assume that we are
mapping a behavioural trait onto a DNA phylogeny using
a composite model that describes both the evolution of the
DNA sequences and the behavioural character. Think of
the model as a table with many dimensions, each
dimension corresponding to a different parameter and
each cell to a combination of parameter values (Table I in
Box 2). The Bayesian MCMC analysis uses the data, the
model and the prior to estimate the probability of each cell
in the table (the joint probability distribution). The joint
probabilities are the same regardless of whether the DNA
phylogeny is derived first and the behavioural character
mapped on afterwards or if both data sources are
combined in a single analysis. After the analysis, the
investigator is free to focus on any parameter (axis of the
table) of interest by calculating its marginal distribution
(the marginal sums of that dimension in the table).
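The table analogy can be illustrated with a toy example: tabulate the joint distribution from a set of MCMC samples, then sum over the axis that is not of interest. The ten samples below are invented for illustration and only two parameters (tree and ancestral state) are tracked; a real analysis would have many more dimensions.

```python
from collections import Counter

def joint_table(samples):
    """Estimate the joint probability of each cell from MCMC samples.
    Each sample is a tuple of parameter values, e.g. (tree, ancestral_state)."""
    counts = Counter(samples)
    return {cell: c / len(samples) for cell, c in counts.items()}

def marginal(joint, axis):
    """Marginal distribution of one parameter: sum the joint over the others."""
    out = {}
    for cell, prob in joint.items():
        out[cell[axis]] = out.get(cell[axis], 0.0) + prob
    return out

# Made-up samples: (tree topology, state of some ancestor).
samples = ([("T1", 0)] * 5 + [("T1", 1)] * 2 +
           [("T2", 0)] * 1 + [("T2", 1)] * 2)
joint = joint_table(samples)
print("joint:", joint)
print("marginal over trees:", marginal(joint, axis=0))
print("marginal over ancestral state:", marginal(joint, axis=1))
```

The marginal for the ancestral state is the quantity that accounts for phylogenetic uncertainty: it pools the state probabilities across all sampled trees in proportion to their posterior support.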
Character change histories
Bayesian inference can also be used to obtain samples of
character change histories from the posterior distribution
while accounting for both phylogenetic and mapping
uncertainty. Normally, dealing with change histories is a
nuisance because there are infinitely many of them.
Standard probability calculations avoid the problem by
using the transition probability (P) matrix, which sums
(integrates) over all possible realizations of character
change; only the starting and ending states matter
(Box 3). In a seminal paper, Nielsen [49] described how
we can nevertheless sample change histories. The idea is
to simulate character change on a set of MCMC samples,
working backwards. First, we use P matrices to draw a
sample of ancestral states given the observed tip states
and the parameter values of the MCMC sample. We then
simulate the substitution process, one branch at a time,
until we get a realization that is consistent with the fixed
starting and ending states of each branch (Box 4).
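The branch-by-branch step can be sketched as a simple rejection sampler: simulate the binary substitution process from the fixed starting state and keep the first realization that ends in the required state. This is a minimal illustration of the idea as described in the text, not a reproduction of any published implementation; the function name, the cap on attempts and the scaling factor of 1 are assumptions.

```python
import random

def sample_branch_history(start, end, duration, pi1, rng, max_tries=100000):
    """Return the change times of one history on a branch that starts in
    `start` and ends in `end`, for a binary character (rejection sampling)."""
    rate_out = (pi1, 1.0 - pi1)            # rate of leaving state 0, state 1
    for _ in range(max_tries):
        state, clock, change_times = start, 0.0, []
        while True:
            clock += rng.expovariate(rate_out[state])
            if clock >= duration:
                break
            state = 1 - state              # binary character: flip the state
            change_times.append(clock)
        if state == end:                   # consistent with the fixed end state
            return change_times
    raise RuntimeError("no consistent history found within max_tries")

rng = random.Random(11)
print(sample_branch_history(start=0, end=1, duration=1.0, pi1=0.5, rng=rng))
print(sample_branch_history(start=0, end=0, duration=1.0, pi1=0.5, rng=rng))
```

A branch whose end states differ always receives an odd number of changes, and one whose end states agree receives an even number (possibly zero), so multiple changes on a single branch arise naturally.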
Nielsen’s method can be used to generate a range of
plausible scenarios for how evolution might have
occurred. We can also study the nature of character
evolution by comparing the sample of change histories
leading to the observed tip states with the change histories
expected from the model used for mapping [32,49]. The
expected histories are obtained by simulating character
evolution on the MCMC samples without fixing ancestral
and tip states first; this is referred to as a posterior
predictive distribution because it predicts future observations
from the posterior. If the observed and expected
change histories differ, we can reject the mapping model
and learn something about character evolution. For
instance, we might find that there is more rate variation
than expected under an equal-rates model [49], we can
detect positively selected sites using a model with no across-site variation in selection pressure [49], and we can reveal character correlation using a model assuming no correlation [32]. In each of these cases, it would have been easy to use a more sophisticated mapping model, enabling us to obtain a valid sample of change histories and to estimate parameters such as the extent of across-site rate variation. However, Nielsen’s posterior predictive approach enables simple models to be used in addressing evolutionary phenomena that would otherwise have been difficult to model. For instance, evolutionary models that can vary across organism lineages are complicated (but not impossible) to analyze with Bayesian MCMC techniques. With Nielsen’s approach, we can study the basic properties of complicated processes using simple standard models and use these results in designing more realistic models.
Bayesian controversies
Bayesian posterior probabilities have an intuitive interpretation. A tree with a posterior probability of 0.9 has a 90% chance of being true given that the model and the priors are correct. This follows from the definition of posterior probabilities and needs no mathematical proof. Nevertheless, there have been simulation studies reporting a slight Bayesian bias (underestimate) under these conditions [50–53]. This could be because of programming error, but recent analyses suggest that the bias is caused
To illustrate the uncertainty in reconstructing character evolution, it is useful to have a sample of likely character change histories. In the Bayesian approach, we first obtain a sample from the posterior distribution of a phylogenetic analysis (as in Figure Ia in Box 2). For each sampled tree, we draw a sample of ancestral states for the character(s) we want to map (Figure Ia). This is done by pulling conditional probabilities down the tree to obtain downpass probabilities, and then drawing ancestral states one node at a time up the tree from the downpass probabilities adjusted according to the already drawn states [49]. Once we have a sample of ancestral states, we simulate a character change history by drawing waiting times between changes one branch at a time until the drawn history matches the starting and ending states of that branch. This produces a valid sample of character change history from the posterior probability distribution (Figure Ib). A branch can have more than one change, as illustrated by the left descendant of A. By repeating the procedure for each tree in the sample, we can obtain thousands of samples of character change similar to the one in Figure Ib. These samples help reveal how the mapped characters evolve.
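The first stage, drawing ancestral states on one sampled tree, could be sketched as follows for a binary character. The dictionary-based tree representation, the toy tree, the branch lengths and all function names are assumptions for this example, and the rate parameter is treated as the value carried by the current MCMC sample with the scaling factor set to 1.

```python
import math
import random

def p(i, j, t, pi1):
    """Binary Markov transition probability i -> j over time t (mu = 1)."""
    pi = (1.0 - pi1, pi1)
    decay = math.exp(-t)
    return pi[j] + (1.0 - pi[j]) * decay if i == j else pi[j] * (1.0 - decay)

def downpass(node, pi1):
    """Store in node["cond"] the probability of the tips above the node,
    conditional on each possible state at the node."""
    if "state" in node:                                    # tip
        node["cond"] = [1.0 if s == node["state"] else 0.0 for s in (0, 1)]
        return
    node["cond"] = [1.0, 1.0]
    for child, t in node["children"]:
        downpass(child, pi1)
        for s in (0, 1):
            node["cond"][s] *= sum(p(s, k, t, pi1) * child["cond"][k] for k in (0, 1))

def draw(weights, rng):
    """Draw state 0 or 1 in proportion to two non-negative weights."""
    return 0 if rng.random() * sum(weights) < weights[0] else 1

def draw_ancestral_states(root, pi1, rng):
    """Downpass, then draw states from the root towards the tips, adjusting
    each node's downpass probabilities by its parent's already drawn state."""
    downpass(root, pi1)
    pi = (1.0 - pi1, pi1)
    root["drawn"] = draw([pi[s] * root["cond"][s] for s in (0, 1)], rng)
    stack = [root]
    while stack:
        node = stack.pop()
        for child, t in node.get("children", []):
            if "state" in child:
                child["drawn"] = child["state"]            # tips are observed
            else:
                w = [p(node["drawn"], s, t, pi1) * child["cond"][s] for s in (0, 1)]
                child["drawn"] = draw(w, rng)
            stack.append(child)
    return root

# Toy tree: ((tip0, tip0) ancestor, tip1) root, with made-up branch lengths.
inner = {"children": [({"state": 0}, 0.2), ({"state": 0}, 0.2)]}
root = {"children": [(inner, 0.3), ({"state": 1}, 0.5)]}
draw_ancestral_states(root, pi1=0.5, rng=random.Random(0))
print("root:", root["drawn"], "inner ancestor:", inner["drawn"])
```

With the ancestral states fixed in this way, every branch has a known starting and ending state, and a change history can then be simulated on each branch in turn.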
Figure I. Panel (a): draw ancestral states. Panel (b): simulate character change.
14 Maddison, W.P. and Maddison, D.R. (1992) MacClade Version 3: Analysis of Phylogeny and Character Evolution, Sinauer
15 Wheeler, W.C. (1995) Sequence alignment, parameter sensitivity, and the phylogenetic analysis of molecular data. Syst. Biol. 44, 321–
16 Donoghue, M.J. and Ackerly, D.D. (1996) Phylogenetic uncertainties and sensitivity analyses in comparative biology. Philos. Trans. R. Soc. London B 351, 1241–
17 Cunningham, C.W. et al. (1998) Reconstructing ancestral character states: a critical appraisal. Trends Ecol. Evol. 13, 361–
18 Belshaw, R. and Quicke, D.L.J. (2002) Robustness of ancestral state estimates: evolution of life history strategy in ichneumonoid parasitoids. Evolution 51, 450–
19 Losos, J.B. (1994) An approach to the analysis of comparative data when a phylogeny is unavailable or incomplete. Syst. Biol. 43, 117–
20 Martins, E.P. (1996) Conducting phylogenetic comparative studies when the phylogeny is not known. Evolution 50, 12–
21 Felsenstein, J. (2004) Inferring Phylogenies, Sinauer
22 Pagel, M. (1994) Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proc. R. Soc. London B. Biol. Sci. 255, 37–
23 Schultz, T.R. et al. (1996) The reconstruction of ancestral character states. Evolution 50, 504–
24 Schluter, D. et al. (1995) Uncertainty in ancient phylogenies. Nature 377, 108–
25 Lewis, P.O. (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Syst. Biol. 50, 913–
26 Felsenstein, J. (1985) Phylogenies and the comparative method. Am. Nat. 125, 1–
27 Schluter, D. et al. (1997) Likelihood of ancestor states in adaptive radiation. Evolution 51, 1699–
28 Pagel, M. (1999) The maximum likelihood approach to reconstructing ancestral character states of discrete characters on phylogenies. Syst. Biol. 48, 612–
29 Mooers, A.Ø. and Schluter, D. (1999) Reconstructing ancestor states with maximum likelihood: support for one- and two-rate models. Syst. Biol. 48, 623–
30 Schultz, T.R. and Churchill, G.A. (1999) The role of subjectivity in reconstructing ancestral character states: a Bayesian approach to unknown rates, states, and transformation asymmetries. Syst. Biol. 48, 651–
31 Huelsenbeck, J.P. et al. (2000) Accommodating phylogenetic uncertainty in evolutionary studies. Science 288, 2349–
32 Huelsenbeck, J.P. et al. (2003) Stochastic mapping of morphological characters. Syst. Biol. 52, 131–
33 Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–
34 Richman, A.D. and Price, T. (1992) Evolution of ecological differences in the old world leaf warblers. Nature 355, 817–
35 Ronquist, F. and Liljeblad, J. (2001) Evolution of the gall wasp-host plant association. Evolution 55, 2503–
36 Gamerman, D. (1997) Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, Chapman & Hall
37 Lewis, P.O. (2001) Phylogenetic systematics turns over a new leaf. Trends Ecol. Evol. 16, 30–
38 Huelsenbeck, J.P. et al. (2001) Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294, 2310–
39 Huelsenbeck, J.P. et al. (2002) Potential applications and pitfalls of Bayesian inference of phylogeny. Syst. Biol. 51, 673–
40 Holder, M. and Lewis, P.O. (2003) Phylogenetic estimation: traditional and Bayesian approaches. Nat. Rev. Genet. 4, 275–
41 Lutzoni, F. et al. (2001) Major fungal lineages are derived from lichen symbiotic ancestors. Nature 411, 937–
42 Pagel, M. and Lutzoni, F. (2002) Accounting for phylogenetic uncertainty in comparative studies of evolution and adaptation. In Biological Evolution and Statistical Physics (Lässig, M. and Valleriani, A. eds), pp. 151–164, Springer
43 Huelsenbeck, J.P. and Rannala, B. (2003) Detecting correlation between characters in a comparative analysis with uncertain phylogeny. Evolution 57, 1237–
44 Coddington, J. (1988) Cladistic tests of adaptational hypotheses. Cladistics 4, 3–
45 Brooks, D.R. and McClennan, D.A. (1990) Phylogeny, Ecology, and Behavior, University of Chicago Press
46 Armbruster, S.R. (1992) Phylogeny and the evolution of plant-animal interactions. BioScience 42, 12–
47 de Queiroz, K. (1996) Including the characters of interest during tree reconstruction and the problem of circularity and bias in studies of character evolution. Am. Nat. 148, 700–
48 Luckow, M. and Bruneau, A. (1997) Circularity and independence in phylogenetic tests of ecological hypotheses. Cladistics 13, 145–
49 Nielsen, R. (2002) Mapping mutations on phylogenies. Syst. Biol. 51, 729–
50 Wilcox, T.P. et al. (2002) Phylogenetic relationships of the dwarf boas and a comparison of Bayesian and bootstrap measures of phylogenetic support. Mol. Phylog. Evol. 25, 361–
51 Douady, C.J. et al. (2002) Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol. Biol. Evol. 20, 248–
52 Alfaro, M.E. et al. (2003) Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence. Mol. Biol. Evol. 20, 255–
53 Cummings, M.P. et al. (2003) Comparing bootstrap and posterior probability values in the four-taxon case. Syst. Biol. 52, 477–
54 Erixon, P. et al. (2003) Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Syst. Biol. 52, 665–
55 Suzuki, Y. et al. (2002) Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc. Natl. Acad. Sci. U. S. A. 99, 16138–
56 Buckley, T.R. (2002) Model misspecification and probabilistic tests of topology: evidence from empirical data sets. Syst. Biol. 51, 509–
57 Simmons, M.P. et al. (2004) How meaningful are Bayesian support values? Mol. Biol. Evol. 21, 188–
58 Lemmon, A.R. and Moriarty, E.C. (2004) The importance of proper model assumption in Bayesian phylogenetics. Syst. Biol. 53, 216–
59 Nylander, J. et al. (2004) Bayesian phylogenetic analysis of combined data. Syst. Biol. 53, 47–
60 Holmes, S. (2003) Bootstrapping phylogenetic trees: theory and methods. Stat. Sci. 18, 241–
61 Ronquist, F. and Huelsenbeck, J.P. (2003) MrBayes3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–
62 Galtier, N. (2004) Sampling properties of the bootstrap support in molecular phylogeny: influence of nonindependence among sites. Syst. Biol. 53, 38–
63 Efron, B. et al. (1996) Bootstrap confidence levels for phylogenetic trees. Proc. Natl. Acad. Sci. U. S. A. 93, 13429–
64 Sanderson, M.J. and Wojciechowski, M.F. (2000) Improved bootstrap confidence limits in large-scale phylogenies, with an example from Neo-Astragalus (Leguminosae). Syst. Biol. 49, 671–