Phylogenetics Series
Bayesian inference of character evolution
Fredrik Ronquist
Computational Science and Information Technology, Florida State University, Tallahassee, FL 32306–4120, USA
Much recent progress in evolutionary biology is based
on the inference of ancestral states and past transformations
in important traits on phylogenetic trees. These
exercises often assume that the tree is known without
error and that ancestral states and character change
can be mapped onto it exactly. In reality, there is often
considerable uncertainty about both the tree and the
character mapping. Recently introduced Bayesian statistical
methods enable the study of character evolution
while simultaneously accounting for both phylogenetic
and mapping uncertainty, adding much-needed credibility
to the reconstruction of evolutionary history.
Evolution is a difficult phenomenon to study. It is rarely
fast enough to be observed directly and only in exceptional
cases is it possible to find physical evidence, such as fossils
or ancient DNA, of past states and events. Fortunately,
evolution leaves its footprint in the distribution of traits
among living things. By studying this footprint, we can
infer how organisms originated through the successive
splitting of ancestral lineages, a process depicted in
phylogenetic trees. Given a phylogenetic tree, we can
also reconstruct the evolutionary history of individual
traits of interest.
The wide range of questions that can be addressed by
the INFERENCE (see Glossary) of ancestral states or paths
of change in key traits on phylogenetic trees is fascinating.
A few examples include the design of vaccines [1], the
reconstruction of ancestral hormone receptors [2] and
ancestral metabolic pathways [3], the inference of ancient
behaviours [4], the identification of past dispersal patterns
[5–7], the study of positive selection in proteins [8], the
discovery of viral infection pathways [9], and the recognition
of character correlation in coevolving lineages [10].
Many of these applications still rely on explicit or
implicit PARSIMONY mapping of characters onto a single
phylogenetic tree. The parsimony method finds the
reconstruction that implies the smallest number of
changes on the given tree; the solution is often intuitively
obvious (Figure Ia in Box 1). The inferred ancestral states
and character changes using parsimony typically reveal
the process of evolution in exhilarating detail.
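
As a rough illustration of parsimony mapping, the sketch below counts the minimum number of changes for a binary character with a Fitch-style downpass. The tree and tip states are hypothetical (they are not the tree of Box 1), and gains and losses are assumed to cost the same.

```python
# Minimal Fitch-style downpass for one character (illustrative sketch).
# A tree is a nested tuple of two subtrees; a leaf is its observed state (0 or 1).

def fitch(node):
    """Return (candidate state set, minimum number of changes) for a subtree."""
    if isinstance(node, int):          # leaf: the observed state, no changes yet
        return {node}, 0
    left, right = node
    left_set, left_changes = fitch(left)
    right_set, right_changes = fitch(right)
    shared = left_set & right_set
    if shared:                          # children agree on at least one state
        return shared, left_changes + right_changes
    # children disagree: keep both options and count one extra change
    return left_set | right_set, left_changes + right_changes + 1

# Hypothetical seven-tip tree, not the one shown in Box 1.
tree = (((0, 0), 1), ((1, 1), (1, 0)))
root_states, n_changes = fitch(tree)
print(root_states, n_changes)
```

The result is a single most-parsimonious answer with no statement of how much less likely the alternatives are, which is the gap the rest of the article addresses.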
It has long been recognized that this approach ignores
two important sources of error. First, the parsimony
principle singles out the solution(s) requiring the
minimum amount of change on the given tree, although
there is usually a range of alternative reconstructions on
the same tree that are almost as likely [11] (MAPPING
UNCERTAINTY; Box 1). Second, the tree is almost never
known without error [12] (PHYLOGENETIC UNCERTAINTY;
Box 2). If there is a range of plausible trees, it is possible
that the evolutionary history of a trait could differ
depending on the tree. Clearly, ignoring either of these
sources of error is potentially misleading.
Glossary
Bayesian inference: theory of statistical inference based on the idea of rational accumulation of scientific knowledge. Statistical models and model parameters are regarded as random variables, and statistical analysis uses data (observations) to update a prior probability distribution on these parameters to a posterior probability distribution.
Bootstrapping (nonparametric): procedure for examining the uncertainty in a statistical estimate by drawing new samples (pseudosamples) from the original sample, and repeating the statistical procedure for each of these new samples. There is also a parametric variant that generates new samples by using a parametric model estimated from the original sample.
Conditional probability: the probability conditioned on (given) some information; we can think of it as a relative probability. In Box 1, the conditional (relative) probability of ancestor B being purple (state 0) is $\Pr(B=0) = 0.00024/0.00037 = 0.65$. Hence, the conditional probability of it being green is $\Pr(B=1) = 1 - \Pr(B=0) = 1 - 0.65 = 0.35$ (see Box 1).
Inference: to draw conclusions about a statistical model using empirical data.
Likelihood: probability that a particular model (with specific parameter values) produced some observed data. For instance, the likelihood (probability) of the data in Box 1 is $L = 0.00037$ given the binary Markov model with $\pi_1 = 0.5$ and summing over ancestral states. If ancestor B has state 0 (purple), the likelihood is $L(B=0) = 0.00024$; if it has state 1 (green), the likelihood is $L(B=1) = L - L(B=0) = 0.00037 - 0.00024 = 0.00013$ (see Box 1).
Mapping uncertainty: the error associated with reconstructing the evolution of a character on a given phylogenetic tree.
Markov chain Monte Carlo (MCMC): stochastic simulation technique for generating a sample from a complex distribution that is known up to a normalizing constant. It is widely used to sample Bayesian posterior distributions, where it is based on specially designed Markov models (similar to, but more complex than, the ones used to model evolution; Box 3) and their tendency to move towards a stationary condition (see Box 3).
Maximum likelihood (ML): widely used method of statistical inference that finds the parameter values that maximize likelihood. For instance, when $\pi_1 = 0.5$, the ML state of ancestor B (Figure Ib in Box 1) is 0 (purple) because $L(B=0) > L(B=1)$. More typically, ML is used to estimate the free parameters of a probability model. For instance, if we vary $\pi_1$, we discover that the likelihood of the observed data is maximized when $\pi_1 \approx 0.20$. This is the ML estimate of $\pi_1$ (see Figure I in Box 1).
Parsimony: inference principle based on minimizing cost; in evolutionary inference, usually the same as minimizing the number of character changes.
Phylogenetic uncertainty: the uncertainty in reconstructing character evolution owing to error in the phylogenetic estimate.
Posterior (probability distribution): probability distribution describing the knowledge about a model and its parameters after a Bayesian analysis. Can be used as the prior in a subsequent Bayesian analysis.
Prior (probability distribution): probability distribution specifying the knowledge about a model and its parameters before a Bayesian analysis.
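
The arithmetic in the Conditional probability and Likelihood entries can be checked in a few lines. The sketch below uses only the two likelihood values quoted from Box 1; the variable names are illustrative.

```python
# Likelihood values quoted in the glossary (from Box 1, with pi_1 = 0.5).
L_total = 0.00037   # likelihood summed over both states of ancestor B
L_B0 = 0.00024      # likelihood with ancestor B fixed to state 0 (purple)

L_B1 = L_total - L_B0          # likelihood with ancestor B in state 1 (green)
p_B0 = L_B0 / L_total          # conditional (relative) probability of state 0
p_B1 = 1.0 - p_B0              # conditional probability of state 1

print(round(L_B1, 5), round(p_B0, 2), round(p_B1, 2))
```

Because state 0 has the larger likelihood, it is also the ML state of ancestor B, as the Maximum likelihood entry states.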
Corresponding author: Fredrik Ronquist.
Available online 21 July 2004
www.sciencedirect.com 0169-5347/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.tree.2004.07.002
Review TRENDS in Ecology and Evolution Vol.19 No.9 September 2004

Sensitivity analysis

A simple way of examining the robustness of evolutionary inference is to look at how sensitive the results are to slight changes in the analytical conditions, a procedure known as sensitivity analysis [13–17]. Belshaw and Quicke [18] recently used this approach extensively in studying the evolution of a group of parasitic wasps in which species lay their eggs in either concealed or exposed hosts. They examined mapping uncertainty by modifying the parsimony cost of the evolutionary switch from exposed to concealed hosts relative to the cost of the opposite switch until their evolutionary reconstruction changed. Then they assessed phylogenetic uncertainty by finding the difference in parsimony score between the preferred tree and the best tree implying a different evolutionary history. The authors concluded that there was strong support for two switches in the parsimony reconstruction: one to exposed hosts and another in the opposite direction. Other sensitivity-type approaches to phylogenetic uncertainty include enumerating all possible phylogenetic trees consistent with some classification of the studied organisms [19] or simulating alternative trees according to some plausible tree-generating process, such as random speciation and extinction [20], and then mapping the studied character(s) onto each of these trees.

Although a particular sensitivity analysis can be illuminating, the procedure is a little bit like measuring the stability of sand castles by pouring water, blowing and stepping on them. Without standardization, the fact that one castle stands and the other falls might be solely because of differences in the treatment. Parametric statistical methods, such as likelihood analysis and Bayesian inference, can potentially add the rigor that sensitivity analysis lacks.

Mapping uncertainty

Likelihood analysis

The most common parametric approach to mapping uncertainty is based on LIKELIHOODS calculated from an evolutionary probability model. The model of choice for discrete characters is the Markov model (Box 3), exemplified by the Jukes-Cantor model and similar models long used by molecular evolutionists to study nucleotide, amino acid and codon evolution [21]. There are also Markov models for discrete characters with an arbitrary number of states [11,22–25]. For quantitative characters, Brownian motion is a popular evolutionary model [12,26]. Both Markov and Brownian motion models are mathematically convenient, but they are also able to capture many of the known complexities of the evolutionary process.

Given a fixed tree with fixed branch lengths, fixed states for the tips, and a Markov model with fixed parameters (the fixed values of which are often derived from a MAXIMUM LIKELIHOOD analysis), we can calculate the RELATIVE (CONDITIONAL) PROBABILITY of each ancestral state given the observed states at the tips [27–29]. The conditional probabilities indicate some of the uncertainty in the ancestral state assignments (Figure Ib in Box 1), but the potential error in the fixed parameters is not accounted for. For instance, assume that we were mapping a binary character (0 or 1), with the unknown parameter $\pi_1$ specifying the rate of 0 → 1 changes measured as a fraction of the total evolutionary rate ($\pi_0 + \pi_1$, where $\pi_0$ is the rate of 1 → 0 changes; Box 3). For the conditional probabilities to be valid, we have to assume that the value of $\pi_1$ is known with certainty, although it is typically an estimate, perhaps a maximum likelihood estimate, associated with some error.
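
To make the calculation concrete, here is a minimal sketch of how conditional probabilities of the root state could be computed for a binary character with fixed $\pi_1$. It uses the binary transition probabilities of Box 3 with the scaling factor set to 1 (an assumption), and the tree, branch lengths and tip states are hypothetical rather than those of Box 1.

```python
import math

def p_matrix(t, pi1, mu=1.0):
    """Binary-model transition probabilities over time t (mu = 1 is assumed)."""
    pi0 = 1.0 - pi1
    e = math.exp(-mu * t)
    return [[pi0 + pi1 * e, pi1 - pi1 * e],
            [pi0 - pi0 * e, pi1 + pi0 * e]]

def cond_lik(node, pi1):
    """Likelihood of the tip data below `node`, for each state at `node`."""
    if isinstance(node, int):                      # leaf: observed state
        return [1.0 if node == s else 0.0 for s in (0, 1)]
    out = [1.0, 1.0]
    for child, t in node:                          # node = ((child, t), (child, t))
        p = p_matrix(t, pi1)
        below = cond_lik(child, pi1)
        for s in (0, 1):
            out[s] *= p[s][0] * below[0] + p[s][1] * below[1]
    return out

def root_state_probs(tree, pi1):
    """Conditional probability of each root state, plus the total likelihood."""
    lik = cond_lik(tree, pi1)
    joint = [(1.0 - pi1) * lik[0], pi1 * lik[1]]   # root drawn from stationarity
    total = sum(joint)
    return [j / total for j in joint], total

# Hypothetical tree: two tips in state 0 on short branches, one tip in state 1.
tree = ((((0, 0.1), (0, 0.1)), 0.2), (1, 0.5))
probs, likelihood = root_state_probs(tree, pi1=0.5)
print(probs, likelihood)
```

The probabilities are only as good as the fixed value of `pi1`; changing it changes the answer, which is the limitation described above.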

Figure I. Panels (a)–(d): example tree with ancestors A–G; the two parsimony changes (i) and (ii) are marked in (a); panel (c) plots the prior and posterior probability of π1.

Box 1. Mapping uncertainty

We are interested in inferring the ancestral states of a character with two states, purple (0) or green (1). The states could, for example, represent particular behaviours, life-history traits or morphological features. The tree and the ages of the nodes are known; the scale is in amount of expected change. Parsimony (Figure Ia) finds the reconstruction requiring the minimum amount of change between green and purple. In this case, there are two changes [marked (i) and (ii)] assuming gains and losses count equally. Ancestors are inferred as being either green or purple, but we do not know how certain these conclusions are. That is, we have not taken mapping uncertainty into account. Likelihood analysis requires that we know the relative rates of 0 → 1 changes ($\pi_1$) and 1 → 0 changes ($\pi_0$). Assuming that these rates are equal ($\pi_0 = \pi_1 = 0.5$), for instance, we can calculate the probability of each ancestor being either green or purple (Figure Ib). The ancestral state is uncertain for ancestors A, B and F because they are on long branches or close to regions of the tree where a state change is likely. In Bayesian inference, $\pi_1$ does not have to be fixed. Instead, we specify a prior probability distribution on $\pi_1$. In the absence of background information, we can assume that all possible values are equally likely (Figure Ic: prior). This enables us to infer ancestral states while weighting each $\pi_1$ value according to its probability given the data (Figure Ic: posterior). In our example, Bayesian inference simply adds a dash more uncertainty to the conditional probability values (Figure Id; the effect is most notable for ancestor C).
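
The weighting over $\pi_1$ described in Box 1 can be approximated numerically. The sketch below is only an illustration under stated assumptions: a uniform prior evaluated on a grid of 100 values, a scaling factor of 1, and a small hypothetical tree (not the tree in Figure I). It compares the root-state probability at fixed $\pi_1 = 0.5$ with the value averaged over the posterior of $\pi_1$.

```python
import math

def p_matrix(t, pi1, mu=1.0):
    """Binary-model transition probabilities over time t (mu = 1 is assumed)."""
    pi0 = 1.0 - pi1
    e = math.exp(-mu * t)
    return [[pi0 + pi1 * e, pi1 - pi1 * e],
            [pi0 - pi0 * e, pi1 + pi0 * e]]

def cond_lik(node, pi1):
    """Likelihood of the tip data below `node`, for each state at `node`."""
    if isinstance(node, int):
        return [1.0 if node == s else 0.0 for s in (0, 1)]
    out = [1.0, 1.0]
    for child, t in node:
        p = p_matrix(t, pi1)
        below = cond_lik(child, pi1)
        for s in (0, 1):
            out[s] *= p[s][0] * below[0] + p[s][1] * below[1]
    return out

def root_joint(tree, pi1):
    """Joint probability of the data and each root state, for a given pi1."""
    lik = cond_lik(tree, pi1)
    return [(1.0 - pi1) * lik[0], pi1 * lik[1]]

# Hypothetical tree and tip states, for illustration only.
tree = ((((0, 0.1), (0, 0.1)), 0.2), (1, 0.5))

# Uniform prior on pi1, approximated on a grid of midpoints in (0, 1).
grid = [(i + 0.5) / 100 for i in range(100)]
joint = [root_joint(tree, p) for p in grid]
lik = [j[0] + j[1] for j in joint]
norm = sum(lik)
weights = [l / norm for l in lik]      # posterior weight of each grid value

# Average the conditional probability of root state 0 over the posterior.
bayes_p0 = sum(w * j[0] / l for w, j, l in zip(weights, joint, lik))

fixed = root_joint(tree, 0.5)
fixed_p0 = fixed[0] / sum(fixed)       # the fixed-parameter likelihood answer
print(fixed_p0, bayes_p0)
```

A grid is used here purely for transparency; with more parameters the same integral is what MCMC sampling approximates.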

estimate. One of the advantages of bootstrapping is that it can be applied to a wide array of methods for reconstructing phylogeny, including distance methods, parsimony, maximum likelihood and even Bayesian inference.

Using bootstrapping to account for phylogenetic uncertainty in studies of character evolution is straightforward: simply map characters onto each of the bootstrap estimates of phylogeny and then use the distribution of these mappings to describe the effect of the uncertainty. The approach seems to have been first discussed by Felsenstein [12] and first applied by Richman and Price [34]. Ronquist and Liljeblad [35] recently used the technique extensively in reconstructing the origin of gall wasps. Simple parsimony reconstruction suggested that the first gall wasps lived in the Mediterranean and induced single-chambered galls that were distinct swellings of the seed-capsules of herbaceous Papaveraceae. When phylogenetic uncertainty was taken into account, it turned out that the robustness of these conclusions varied considerably.
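
The resampling step of the nonparametric bootstrap can be sketched as follows. The alignment and taxon names are made up, and the tree estimation and character mapping that would be run on each pseudosample are deliberately left out.

```python
import random

def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one pseudosample)."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

# Hypothetical toy alignment; real data sets have far more sites and taxa.
alignment = {
    "taxon1": "ACGTACGTAC",
    "taxon2": "ACGTTCGTAC",
    "taxon3": "ACCTACGAAC",
}

rng = random.Random(1)                  # fixed seed so the sketch is repeatable
replicates = [bootstrap_columns(alignment, rng) for _ in range(100)]

# Each replicate would next be passed to a tree-building method, and the
# character of interest mapped onto the resulting tree (not shown here).
print(replicates[0])
```

Every taxon receives the same resampled columns, so site patterns are preserved while their frequencies vary between replicates.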

Unfortunately, bootstrapping cannot be used to address mapping uncertainty. Bootstrapping DNA sequences, for example, will be of little help in understanding how precise the mapping of a single behavioural character might be on each of the possible trees. Bayesian inference, however, can account for both mapping and phylogenetic uncertainty across a heterogeneous dataset. In principle, we only need to expand the probability model to include topology, branch lengths, and other parameters necessary to infer phylogeny from the available data. The posterior probability distribution can no longer be calculated analytically (with pen and paper) because it is so complex, but we can sample from it using stochastic simulation in, for example, so-called MARKOV CHAIN MONTE CARLO (MCMC) techniques [36–40]. If the simulation is run long enough, we obtain a valid sample of the posterior probability distribution. Box 2 gives an example of how phylogenetic and mapping uncertainty are handled in a Bayesian analysis of a discrete character. Huelsenbeck and colleagues [31] first developed this approach, illustrating it with an analysis of the origin of soldier aphids. Parsimony suggested one origin and three losses of the soldier caste, but the Bayesian analysis revealed that this conclusion was uncertain. About a year later, Lutzoni and Pagel

Box 3. Markov models

Markov models are used for random processes in which the probability of change depends only on the current state (the Markov property). They are most easily understood in terms of their instantaneous rate matrix, which describes the transition rates in an infinitesimal amount of time. For a discrete character with two states, 0 or 1, the rate matrix Q is (Eqn I)

$$Q = \{q_{ij}\} = \begin{pmatrix} -\pi_1 & \pi_1 \\ \pi_0 & -\pi_0 \end{pmatrix}, \qquad \text{[Eqn I]}$$

where $q_{ij}$ refers to the rate in row i and column j of Q. There are two different rates in the off-diagonals: $q_{01} = \pi_1$ is the rate of 0 → 1 transitions, and $q_{10} = \pi_0$ is the rate of 1 → 0 transitions. The diagonal contains the loss rates. For instance, $q_{00} = -\pi_1$ is the rate at which the frequency of state 0 changes. The rate is negative because the frequency decreases as the character evolves from 0 to 1. The rate at which the frequency of a state decreases must balance the rate at which it evolves into other states; thus, each row in Q sums to 0.

Markov models usually tend towards an equilibrium condition (stationarity). The probability of being in a particular state i at stationarity (the stationary frequency of the state) is usually denoted $\pi_i$ and can be determined from the rate matrix Q. In the binary model, the stationary frequencies correspond to the transition rates (scaling disregarded). To illustrate this, I ran three simulations under a two-state Markov model with $\pi_1 = 0.75$ (and $\pi_0 = 0.25$). Each simulation had 200 independently evolving characters; one was started with all characters in state 0, one with all characters in state 1, and the last one with half the characters in each state. In all cases, approximately 75% of the characters ended up in state 1 and approximately 25% in state 0, as predicted by the stationary frequencies (Figure I).

To use a Markov model for simulations or probability calculations, we want to know the transition probabilities over a certain time period, t. These are represented in a matrix denoted P(t), which is obtained by integrating Q over time. For the binary Markov model, we get (Eqn II)

$$P(t) = \{p_{ij}(t)\} = \begin{pmatrix} \pi_0 + \pi_1 e^{-\mu t} & \pi_1 - \pi_1 e^{-\mu t} \\ \pi_0 - \pi_0 e^{-\mu t} & \pi_1 + \pi_0 e^{-\mu t} \end{pmatrix}, \qquad \text{[Eqn II]}$$

where $\mu$ is a scaling factor. Each element of the P matrix summarizes the probability of a particular state change over an infinite number of change histories. For instance, $p_{01}(t)$ is the sum of the probability of one change 0 → 1, three changes 0 → 1 → 0 → 1, five changes 0 → 1 → 0 → 1 → 0 → 1, and so on, over time t.
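
One way to convince oneself of the relationship between the rate matrix and the transition probabilities is to compare the closed form with a direct series evaluation of the matrix exponential of Qt. The sketch below assumes $\pi_0 + \pi_1 = 1$ and an unscaled Q, in which case the scaling factor works out to 1; function names are illustrative.

```python
import math

def p_closed(t, pi1, mu=1.0):
    """Closed-form binary transition probabilities (mu = 1 assumed)."""
    pi0 = 1.0 - pi1
    e = math.exp(-mu * t)
    return [[pi0 + pi1 * e, pi1 - pi1 * e],
            [pi0 - pi0 * e, pi1 + pi0 * e]]

def mat_mul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def p_series(t, pi1, terms=30):
    """exp(Qt) from its power series: I + Qt + (Qt)^2/2! + ..."""
    pi0 = 1.0 - pi1
    qt = [[-pi1 * t, pi1 * t], [pi0 * t, -pi0 * t]]
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, qt)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

closed = p_closed(0.7, pi1=0.75)
series = p_series(0.7, pi1=0.75)
max_diff = max(abs(closed[i][j] - series[i][j]) for i in range(2) for j in range(2))
print(closed, max_diff)
```

As t grows, every row of the matrix approaches the stationary frequencies, consistent with the simulations described in this box.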

The P matrix can be used to simulate the states at the terminals of an evolutionary tree. We draw a starting state at the root of the tree from the stationary frequencies. Then we use the P matrix for each branch in turn to generate the end state of that branch. To obtain a sample of change histories, we need to go back to the Q matrix and utilize the fact that the waiting time (x) to the next change is exponentially distributed (Eqn III):

$$\Pr(x) = -q_{ii}\, e^{q_{ii} x}, \qquad \text{[Eqn III]}$$

where Pr(x) is the probability of x and $q_{ii}$ is the loss rate of the current state. Thus, for a binary character in state 1, for which $q_{11} = -\pi_0$, the waiting time to the next change is distributed as (Eqn IV)

$$\Pr(x) = \pi_0\, e^{-\pi_0 x}. \qquad \text{[Eqn IV]}$$

When there are more than two states, the probability of the change being from i to a particular state j is determined by (Eqn V)

$$\Pr(j) = \frac{q_{ij}}{\sum_{k:\, k \neq i} q_{ik}}. \qquad \text{[Eqn V]}$$
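
The three simulations described in this box can be imitated with a short script. This is a sketch rather than the author's code: it assumes waiting times are exponential with rate equal to the loss rate of the current state, a run time of 10 time units, and a fixed random seed.

```python
import random

def simulate_end_state(state, total_time, pi1, rng):
    """Evolve one binary character for total_time and return its final state."""
    rate_out = {0: pi1, 1: 1.0 - pi1}       # loss rate of each state (-q_ii)
    t = 0.0
    while True:
        t += rng.expovariate(rate_out[state])   # waiting time to the next change
        if t > total_time:
            return state
        state = 1 - state                   # only one other state to move to

def fraction_in_state_1(start_states, total_time, pi1, rng):
    """Fraction of independently evolving characters that end in state 1."""
    ends = [simulate_end_state(s, total_time, pi1, rng) for s in start_states]
    return sum(ends) / len(ends)

rng = random.Random(42)
starts = {
    "all 0": [0] * 200,
    "all 1": [1] * 200,
    "half each": [0] * 100 + [1] * 100,
}
results = {name: fraction_in_state_1(s, 10.0, 0.75, rng) for name, s in starts.items()}
print(results)
```

With 200 characters the fractions scatter around 0.75 by a few percentage points, whatever the starting configuration.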

Figure I. Frequency of state 1 plotted against time for the three simulations, approaching the stationary frequency (π1).

published similar work [41,42] on the origin of lichenized fungi, but failed to apply a strict Bayesian approach to mapping uncertainty. Recently, Huelsenbeck and Rannala extended Bayesian MCMC techniques to the comparative analysis of quantitative characters using Brownian motion models [43].

An old argument is whether or not the character to be mapped should be included in the phylogenetic analysis [44–48]. The Bayesian posterior probabilities are always based on all the available data. Assume that we are mapping a behavioural trait onto a DNA phylogeny using a composite model that describes both the evolution of the DNA sequences and the behavioural character. Think of the model as a table with many dimensions, each dimension corresponding to a different parameter and each cell to a combination of parameter values (Table I in Box 2). The Bayesian MCMC analysis uses the data, the model and the prior to estimate the probability of each cell in the table (the joint probability distribution). The joint probabilities are the same regardless of whether the DNA phylogeny is derived first and the behavioural character mapped on afterwards or if both data sources are combined in a single analysis. After the analysis, the investigator is free to focus on any parameter (axis of the table) of interest by calculating its marginal distribution (the marginal sums of that dimension in the table).
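
The table analogy can be made concrete with a toy joint distribution. The numbers below are invented for illustration and have only two dimensions (tree topology and the state of one ancestor); a real analysis would estimate many more cells from MCMC samples.

```python
from collections import defaultdict

# Hypothetical joint posterior over (tree topology, state of one ancestor).
joint = {
    ("tree1", 0): 0.40,
    ("tree1", 1): 0.10,
    ("tree2", 0): 0.15,
    ("tree2", 1): 0.35,
}

def marginal(joint, axis):
    """Sum the joint probabilities over every dimension except `axis`."""
    sums = defaultdict(float)
    for cell, prob in joint.items():
        sums[cell[axis]] += prob
    return dict(sums)

tree_marginal = marginal(joint, axis=0)    # support for each topology
state_marginal = marginal(joint, axis=1)   # ancestral state, averaged over trees
print(tree_marginal, state_marginal)
```

In this toy table the ancestral state depends strongly on the tree, yet the marginal for the state still gives a single summary that accounts for uncertainty in the topology.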

Character change histories

Bayesian inference can also be used to obtain samples of character change histories from the posterior distribution while accounting for both phylogenetic and mapping uncertainty. Normally, dealing with change histories is a nuisance because there are infinitely many of them. Standard probability calculations avoid the problem by using the transition probability (P) matrix, which sums (integrates) over all possible realizations of character change; only the starting and ending states matter (Box 3). In a seminal paper, Nielsen [49] described how we can nevertheless sample change histories. The idea is to simulate character change on a set of MCMC samples, working backwards. First, we use P matrices to draw a sample of ancestral states given the observed tip states and the parameter values of the MCMC sample. We then simulate the substitution process, one branch at a time, until we get a realization that is consistent with the fixed starting and ending states of each branch (Box 4).

Nielsen’s method can be used to generate a range of plausible scenarios for how evolution might have occurred. We can also study the nature of character evolution by comparing the sample of change histories leading to the observed tip states with the change histories expected from the model used for mapping [32,49]. The expected histories are obtained by simulating character evolution on the MCMC samples without fixing ancestral and tip states first; this is referred to as a posterior predictive distribution because it predicts future observations from the posterior. If the observed and expected change histories differ, we can reject the mapping model and learn something about character evolution. For instance, we might find that there is more rate variation than expected under an equal-rates model [49], we can detect positively selected sites using a model with no across-site variation in selection pressure [49], and we can reveal character correlation using a model assuming no correlation [32].

In each of these cases, it would have been easy to use a more sophisticated mapping model, enabling us to obtain a valid sample of change histories and to estimate parameters such as the extent of across-site rate variation. However, Nielsen’s posterior predictive approach enables simple models to be used in addressing evolutionary phenomena that would otherwise have been difficult to model. For instance, evolutionary models that can vary across organism lineages are complicated (but not impossible) to analyze with Bayesian MCMC techniques. With Nielsen’s approach, we can study the basic properties of complicated processes using simple standard models and use these results in designing more realistic models.

Bayesian controversies

Bayesian posterior probabilities have an intuitive interpretation. A tree with a posterior probability of 0.9 has a 90% chance of being true given that the model and the priors are correct. This follows from the definition of posterior probabilities and needs no mathematical proof. Nevertheless, there have been simulation studies reporting a slight Bayesian bias (underestimate) under these conditions [50–53]. This could be because of programming error, but recent analyses suggest that the bias is caused

Box 4. Generating a Bayesian sample of character change histories

To illustrate the uncertainty in reconstructing character evolution, it is useful to have a sample of likely character change histories. In the Bayesian approach, we first obtain a sample from the posterior distribution of a phylogenetic analysis (as in Figure Ia in Box 2). For each sampled tree, we draw a sample of ancestral states for the character(s) we want to map (Figure Ia). This is done by pulling conditional probabilities down the tree to obtain downpass probabilities, and then drawing ancestral states one node at a time up the tree from the downpass probabilities adjusted according to the already drawn states [49]. Once we have a sample of ancestral states, we simulate a character change history by drawing waiting times between changes one branch at a time until the drawn history matches the starting and ending states of that branch. This produces a valid sample of character change history from the posterior probability distribution (Figure Ib). A branch can have more than one change, as illustrated by the left descendant of A. By repeating the procedure for each tree in the sample, we can obtain thousands of samples of character change similar to the one in Figure Ib. These samples help reveal how the mapped characters evolve.
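
The branch-by-branch simulation step can be sketched as a simple rejection sampler for a binary character. This is an illustration under assumptions, not the published implementation: waiting times are taken to be exponential with rate equal to the loss rate of the current state, the branch length and $\pi_1$ are arbitrary, and the start and end states are assumed to have been drawn already.

```python
import random

def sample_history(start, end, length, pi1, rng, max_tries=100000):
    """Draw change times on one branch until the history ends in `end`.

    Returns a list of (time, new_state) pairs; an empty list means no change.
    """
    rate_out = {0: pi1, 1: 1.0 - pi1}        # loss rate of each state (-q_ii)
    for _ in range(max_tries):
        state, t, changes = start, 0.0, []
        while True:
            t += rng.expovariate(rate_out[state])   # waiting time to next change
            if t >= length:
                break                        # ran off the end of the branch
            state = 1 - state
            changes.append((t, state))
        if state == end:                     # accept only matching histories
            return changes
    raise RuntimeError("no consistent history found; check the inputs")

rng = random.Random(7)
gain = sample_history(start=0, end=1, length=1.0, pi1=0.75, rng=rng)
stasis = sample_history(start=0, end=0, length=1.0, pi1=0.75, rng=rng)
print(gain, stasis)
```

A history from 0 to 1 always contains an odd number of changes and one from 0 to 0 an even number, so a branch with identical end states may still hide two or more changes.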

Figure I. (a) Draw ancestral states; (b) simulate character change. Both panels show the example tree with ancestors A–G.

14 Maddison, W.P. and Maddison, D.R. (1992) MacClade Version 3: Analysis of Phylogeny and Character Evolution, Sinauer
15 Wheeler, W.C. (1995) Sequence alignment, parameter sensitivity, and the phylogenetic analysis of molecular data. Syst. Biol. 44, 321–
16 Donoghue, M.J. and Ackerly, D.D. (1996) Phylogenetic uncertainties and sensitivity analyses in comparative biology. Philos. Trans. R. Soc. London B 351, 1241–
17 Cunningham, C.W. et al. (1998) Reconstructing ancestral character states: a critical appraisal. Trends Ecol. Evol. 13, 361–
18 Belshaw, R. and Quicke, D.L.J. (2002) Robustness of ancestral state estimates: evolution of life history strategy in ichneumonoid parasitoids. Evolution 51, 450–
19 Losos, J.B. (1994) An approach to the analysis of comparative data when a phylogeny is unavailable or incomplete. Syst. Biol. 43, 117–
20 Martins, E.P. (1996) Conducting phylogenetic comparative studies when the phylogeny is not known. Evolution 50, 12–
21 Felsenstein, J. (2004) Inferring Phylogenies, Sinauer
22 Pagel, M. (1994) Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proc. R. Soc. London B. Biol. Sci. 255, 37–
23 Schultz, T.R. et al. (1996) The reconstruction of ancestral character states. Evolution 50, 504–
24 Schluter, D. et al. (1995) Uncertainty in ancient phylogenies. Nature 377, 108–
25 Lewis, P.O. (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Syst. Biol. 50, 913–
26 Felsenstein, J. (1985) Phylogenies and the comparative method. Am. Nat. 125, 1–
27 Schluter, D. et al. (1997) Likelihood of ancestor states in adaptive radiation. Evolution 51, 1699–
28 Pagel, M. (1999) The maximum likelihood approach to reconstructing ancestral character states of discrete characters on phylogenies. Syst. Biol. 48, 612–
29 Mooers, A.Ø. and Schluter, D. (1999) Reconstructing ancestor states with maximum likelihood: support for one- and two-rate models. Syst. Biol. 48, 623–
30 Schultz, T.R. and Churchill, G.A. (1999) The role of subjectivity in reconstructing ancestral character states: a Bayesian approach to unknown rates, states, and transformation asymmetries. Syst. Biol. 48, 651–
31 Huelsenbeck, J.P. et al. (2000) Accommodating phylogenetic uncertainty in evolutionary studies. Science 288, 2349–
32 Huelsenbeck, J.P. et al. (2003) Stochastic mapping of morphological characters. Syst. Biol. 52, 131–
33 Felsenstein, J. (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–
34 Richman, A.D. and Price, T. (1992) Evolution of ecological differences in the old world leaf warblers. Nature 355, 817–
35 Ronquist, F. and Liljeblad, J. (2001) Evolution of the gall wasp-host plant association. Evolution 55, 2503–
36 Gamerman, D. (1997) Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, Chapman & Hall
37 Lewis, P.O. (2001) Phylogenetic systematics turns over a new leaf. Trends Ecol. Evol. 16, 30–
38 Huelsenbeck, J.P. et al. (2001) Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294, 2310–
39 Huelsenbeck, J.P. et al. (2002) Potential applications and pitfalls of Bayesian inference of phylogeny. Syst. Biol. 51, 673–
40 Holder, M. and Lewis, P.O. (2003) Phylogenetic estimation: traditional and Bayesian approaches. Nat. Rev. Genet. 4, 275–

41 Lutzoni, F. et al. (2001) Major fungal lineages are derived from lichen symbiotic ancestors. Nature 411, 937–
42 Pagel, M. and Lutzoni, F. (2002) Accounting for phylogenetic uncertainty in comparative studies of evolution and adaptation. In Biological Evolution and Statistical Physics (Lässig, M. and Valleriani, A. eds), pp. 151–164, Springer
43 Huelsenbeck, J.P. and Rannala, B. (2003) Detecting correlation between characters in a comparative analysis with uncertain phylogeny. Evolution 57, 1237–
44 Coddington, J. (1988) Cladistic tests of adaptational hypotheses. Cladistics 4, 3–
45 Brooks, D.R. and McClennan, D.A. (1990) Phylogeny, Ecology, and Behavior, University of Chicago Press
46 Armbruster, S.R. (1992) Phylogeny and the evolution of plant-animal interactions. BioScience 42, 12–
47 de Queiroz, K. (1996) Including the characters of interest during tree reconstruction and the problem of circularity and bias in studies of character evolution. Am. Nat. 148, 700–
48 Luckow, M. and Bruneau, A. (1997) Circularity and independence in phylogenetic tests of ecological hypotheses. Cladistics 13, 145–
49 Nielsen, R. (2002) Mapping mutations on phylogenies. Syst. Biol. 51, 729–
50 Wilcox, T.P. et al. (2002) Phylogenetic relationships of the dwarf boas and a comparison of Bayesian and bootstrap measures of phylogenetic support. Mol. Phylog. Evol. 25, 361–
51 Douady, C.J. et al. (2002) Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol. Biol. Evol. 20, 248–
52 Alfaro, M.E. et al. (2003) Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence. Mol. Biol. Evol. 20, 255–
53 Cummings, M.P. et al. (2003) Comparing bootstrap and posterior probability values in the four-taxon case. Syst. Biol. 52, 477–
54 Erixon, P. et al. (2003) Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Syst. Biol. 52, 665–
55 Suzuki, Y. et al. (2002) Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc. Natl. Acad. Sci. U. S. A. 99, 16138–
56 Buckley, T.R. (2002) Model misspecification and probabilistic tests of topology: evidence from empirical data sets. Syst. Biol. 51, 509–
57 Simmons, M.P. et al. (2004) How meaningful are Bayesian support values? Mol. Biol. Evol. 21, 188–
58 Lemmon, A.R. and Moriarty, E.C. (2004) The importance of proper model assumption in Bayesian phylogenetics. Syst. Biol. 53, 216–
59 Nylander, J. et al. (2004) Bayesian phylogenetic analysis of combined data. Syst. Biol. 53, 47–
60 Holmes, S. (2003) Bootstrapping phylogenetic trees: theory and methods. Stat. Sci. 18, 241–
61 Ronquist, F. and Huelsenbeck, J.P. (2003) MrBayes3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–
62 Galtier, N. (2004) Sampling properties of the bootstrap support in molecular phylogeny: influence of nonindependence among sites. Syst. Biol. 53, 38–
63 Efron, B. et al. (1996) Bootstrap confidence levels for phylogenetic trees. Proc. Natl. Acad. Sci. U. S. A. 93, 13429–
64 Sanderson, M.J. and Wojciechowski, M.F. (2000) Improved bootstrap confidence limits in large-scale phylogenies, with an example from Neo-Astragalus (Leguminosae). Syst. Biol. 49, 671–