

























































∗We are grateful to Lubos Pástor for useful comments and suggestions as well as Dashan Huang and Minwen Li for outstanding research assistance.
Corresponding Author: Doron Avramov. Email: [email protected]
Bayesian Portfolio Analysis
This paper reviews the literature on Bayesian portfolio analysis. Information about events, macro conditions, asset pricing theories, and security-driving forces can serve as useful priors in selecting optimal portfolios. Moreover, parameter uncertainty and model uncertainty are practical problems encountered by all investors. The Bayesian framework neatly accounts for these uncertainties, whereas standard statistical models often ignore them. We review Bayesian portfolio studies both when asset returns are assumed to be independently and identically distributed and when they are assumed to be predictable through time. We cover a range of applications, from investing in single assets and equity portfolios to mutual and hedge funds. We also outline existing challenges for future work.
1 Introduction
Portfolio selection is one of the most important problems in practical investment management. The first papers in the field go back at least to the mean-variance paradigm of Markowitz (1952), which analytically formalizes the risk-return tradeoff in selecting optimal portfolios. Even though the mean-variance framework is a static one-period model, it has been widely accepted by both academics and practitioners. The later-developed intertemporal Capital Asset Pricing Model of Merton (1973) accounts for the dynamic multi-period nature of investment-consumption decisions. In an intertemporal economy, the overall demand for risky assets consists of both the mean-variance component and a component hedging against unanticipated shocks to time-varying investment opportunities. Empirically, for a wide variety of preferences, hedging demands for risky assets are typically small, even nonexistent [see also Ait-Sahalia and Brandt (2001) and Brandt (2009)].
We review Bayesian studies of portfolio analysis. The Bayesian approach is potentially attractive. First, it can employ useful prior information about quantities of interest. Second, it accounts for estimation risk and model uncertainty. Third, it facilitates the use of fast, intuitive, and easily implementable numerical algorithms with which to simulate otherwise complex economic quantities. There are three building blocks underlying Bayesian portfolio analysis. The first is the formation of prior beliefs, which are typically represented by a probability density function on the stochastic parameters underlying the stock return evolution. The prior density can reflect information about events, macroeconomic news, asset pricing theories, as well as any other insights relevant to the dynamics of asset returns. The second is the formulation of the law of motion governing the evolution of asset returns. The third is the recovery of the predictive distribution of future asset returns, analytically or numerically, incorporating prior information, the law of motion, as well as estimation risk and model uncertainty. The predictive distribution, which integrates out the parameter space, characterizes the entire uncertainty about future asset returns. The Bayesian optimal portfolio rule is obtained by maximizing the expected utility with respect to the predictive distribution.
Zellner and Chetty (1965) pioneer the use of the predictive distribution in decision-making in general. The first applications in finance appear during the 1970s. Such applications are entirely based on uninformative or data-based priors. Bawa, Brown, and Klein (1979) provide an excellent survey of such applications. Jorion (1986) introduces the hyperparameter prior approach in the spirit of the Bayes-Stein shrinkage prior, while Black and Litterman (1992) advocate an informal Bayesian analysis with economic views and equilibrium relations. Recent studies of Pástor (2000) and Pástor and Stambaugh (2000) center prior beliefs around values implied by asset pricing theories. Tu and Zhou (2009) argue that the investment objective itself provides a useful prior for portfolio selection.
Whereas all the above-noted studies assume that asset returns are identically and independently distributed through time, Kandel and Stambaugh (1996), Barberis (2000), and Avramov (2002), among others, account for the possibility that returns are predictable by macro variables such as the aggregate dividend yield, the default spread, and the term spread. Incorporating predictability provides fresh insights into asset pricing in general and Bayesian portfolio selection in particular.
Indeed, we review Bayesian portfolio studies when asset returns are assumed to (i) be independently and identically distributed, (ii) be predictable through time by macro conditions, as well as (iii) exhibit regime shifts and stochastic volatility. We cover a range of applications, from investing in the market portfolio, equity portfolios, and single stocks to investing in mutual funds and hedge funds. We also outline existing challenges for future work.
The paper is organized as follows. Section 2 reviews Bayesian portfolio analysis when asset returns are independent and identically distributed through time. Section 3 surveys studies that account for potential predictability in asset returns. Section 4 discusses alternative return-generating processes. Section 5 outlines ideas for future research, and Section 6 concludes.
2 Asset Allocation when Returns are IID
Consider $N + 1$ investable assets, one of which is riskless and the others are risky. Risky assets may include stocks, bonds, currencies, mutual funds, and hedge funds. Denote by $r_{ft}$ and $r_t$ the returns on the riskless and risky assets at time $t$, respectively. Then $R_t \equiv r_t - r_{ft} 1_N$ is an $N$-dimensional vector of time $t$ excess returns on risky assets, where $1_N$ is an $N$-vector of ones. The joint distribution of $R_t$ is assumed IID through time with mean $\mu$ and covariance matrix $V$.
For analytical insights, it would be beneficial to review the mean-variance framework pioneered by Markowitz (1952). In particular, consider an optimizing investor who chooses at time T portfolio
The Bayesian approach treats $\theta$ as a random quantity. One can only infer its probability distribution function. Following Zellner and Chetty (1965), the Bayesian optimal portfolio is obtained by maximizing the expected utility under the predictive distribution. In particular, the utility maximization is formulated as
$$\hat{w}^{Bayes} = \arg\max_w \int_{R_{T+1}} \tilde{U}(w)\, p(R_{T+1}|\Phi_T)\, dR_{T+1} = \arg\max_w \int_{R_{T+1}} \int_{\mu} \int_{V} \tilde{U}(w)\, p(R_{T+1}, \mu, V|\Phi_T)\, d\mu\, dV\, dR_{T+1}, \tag{8}$$
where $\tilde{U}(w)$ is the utility of holding a portfolio $w$ at time $T + 1$ and $\Phi_T$ is the data available at time $T$. Moreover, $p(R_{T+1}|\Phi_T)$ is the predictive density of the time $T + 1$ return, which integrates out $\mu$ and $V$ from
$$p(R_{T+1}, \mu, V|\Phi_T) = p(R_{T+1}|\mu, V, \Phi_T)\, p(\mu, V|\Phi_T), \tag{9}$$
where $p(\mu, V|\Phi_T)$ is the posterior density of $\mu$ and $V$. To compare the classical and Bayesian formulations in (7) and (8), notice that the expected utility is maximized under the conditional and predictive distributions, respectively. Unlike the conditional distribution, the Bayesian predictive distribution accounts for estimation errors by integrating out the unknown parameter space. The degree of uncertainty about the unknown parameters will thus play a role in the optimal solution.
To gain a better understanding of the Bayesian approach, we consider various specifications for prior beliefs about the unknown parameters. We start with the standard diffuse prior on $\mu$ and $V$. The typical formulation is given by
$$p_0(\mu, V) \propto |V|^{-\frac{N+1}{2}}. \tag{10}$$
Then, assuming that returns on risky assets are jointly normally distributed, the posterior distribution is given by (see, e.g., Zellner (1971))
$$p(\mu, V|\Phi_T) = p(\mu|V, \Phi_T) \times p(V|\Phi_T) \tag{11}$$
with
$$p(\mu|V, \Phi_T) \propto |V|^{-1/2} \exp\{-\tfrac{1}{2}\,\mathrm{tr}[T(\mu - \hat{\mu})(\mu - \hat{\mu})' V^{-1}]\}, \tag{12}$$
$$p(V|\Phi_T) \propto |V|^{-\nu/2} \exp\{-\tfrac{1}{2}\,\mathrm{tr}[V^{-1}(T\hat{V})]\}, \tag{13}$$
where ‘tr’ denotes the trace of a matrix and $\nu = T + N$. Moreover, the predictive distribution obeys the expression
$$p(R_{T+1}|\Phi_T) \propto |\hat{V} + (R_{T+1} - \hat{\mu})(R_{T+1} - \hat{\mu})'/(T + 1)|^{-T/2}, \tag{14}$$
which amounts to a multivariate t-distribution with $T - N$ degrees of freedom.
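The integration in (9) can be mimicked by composition sampling. The following is a minimal sketch for a single risky asset ($N = 1$), assuming that $\hat{\mu}$ and $\hat{V}$ are the sample mean and the maximum likelihood variance (dividing by $T$), which this excerpt does not define. With $N = 1$, (13) reduces to an inverse-gamma density with shape $(T - 1)/2$ and scale $T\hat{V}/2$, and (12) gives $\mu | V \sim N(\hat{\mu}, V/T)$. The function name and the simulated input sample are hypothetical.

```python
import random
import statistics

def predictive_draws(returns, n_draws=20000, seed=0):
    """Draw R_{T+1} from the predictive density for one asset (N = 1)
    under the diffuse prior: draw V, then mu | V, then R | mu, V."""
    rng = random.Random(seed)
    T = len(returns)
    mu_hat = sum(returns) / T
    # Assumption: V_hat is the ML variance (divide by T).
    v_hat = sum((r - mu_hat) ** 2 for r in returns) / T
    shape = (T - 1) / 2.0        # inverse-gamma shape implied by (13), N = 1
    scale = T * v_hat / 2.0      # inverse-gamma scale implied by (13), N = 1
    draws = []
    for _ in range(n_draws):
        v = scale / rng.gammavariate(shape, 1.0)   # V | data
        mu = rng.gauss(mu_hat, (v / T) ** 0.5)     # mu | V, data, as in (12)
        draws.append(rng.gauss(mu, v ** 0.5))      # R_{T+1} | mu, V
    return draws, mu_hat, v_hat

# Hypothetical data: 60 simulated monthly excess returns.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.01, 0.05) for _ in range(60)]
draws, mu_hat, v_hat = predictive_draws(sample)
predictive_variance = statistics.pvariance(draws)
print(predictive_variance / v_hat)  # ratio above one reflects estimation risk
```

The simulated predictive variance exceeds $\hat{V}$ because uncertainty about $\mu$ and $V$ is carried into the draws, which is the sense in which the predictive distribution accounts for estimation errors.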
The problem of estimation error is already recognized by Markowitz (1952). Nevertheless, this problem receives serious attention only during the 1970s. Winkler (1973) and Winkler and Barry (1975) are early examples of Bayesian studies on portfolio choice. Brown (1976, 1978) and Klein and Bawa (1976) lay out independently and clearly the Bayesian predictive density approach, especially Brown (1976), who explains thoroughly the estimation error problem and the associated Bayesian approach. Later, Bawa, Brown, and Klein (1979) provide an excellent review of the literature.
Under the diffuse prior, (10), it is known that the Bayesian optimal portfolio weights are
$$\hat{w}^{Bayes} = \frac{1}{\gamma} \left( \frac{T - N - 2}{T + 1} \right) \hat{V}^{-1} \hat{\mu}. \tag{15}$$
Similar to the classical solution $\hat{w}^{ML}$, an optimizing Bayesian agent holds the portfolio that is also proportional to $\frac{1}{\gamma}\hat{V}^{-1}\hat{\mu}$, with the coefficient of proportion being $(T - N - 2)/(T + 1)$. This coefficient can be substantially smaller than one when $N$ is large relative to $T$. Intuitively, the assets are riskier in a Bayesian framework since parameter uncertainty is an additional source of risk and this risk is accounted for in the portfolio decision. As a result, in the presence of a risk-free security the overall positions in risky assets are generally smaller in the Bayesian versus classical frameworks.
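To see how quickly the coefficient of proportion falls as $N$ grows relative to $T$, one can simply tabulate it. This is a small illustrative sketch; the function name and the chosen values of $T$ and $N$ are hypothetical.

```python
def bayes_diffuse_scale(T, N):
    """Coefficient (T - N - 2) / (T + 1) that multiplies the plug-in rule."""
    return (T - N - 2) / (T + 1)

# Ten years of monthly data and an increasing number of risky assets.
for N in (1, 10, 25, 50):
    print(N, round(bayes_diffuse_scale(120, N), 3))
```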
However, the Bayesian approach based on a diffuse prior does not yield significantly different portfolio decisions compared with the classical framework. In particular, $\hat{w}^{ML}$ is a biased estimator of $w^*$, whereas the classical unbiased estimator is given by
$$\bar{w} = \frac{1}{\gamma} \frac{T - N - 2}{T} \hat{V}^{-1} \hat{\mu}, \tag{16}$$
which is a scalar adjustment of $\hat{w}^{ML}$, and differs from the Bayesian counterpart only by a scalar $T/(T + 1)$. The difference is independent of $N$, and is negligible for all practical sample sizes. Hence, incorporating parameter uncertainty makes little difference if the diffuse prior is used. Indeed, to
where the first term on the right-hand side is the true expected utility based on the true optimal portfolio. Hence, $\rho(w^*, \tilde{w}|\mu, V)$ is the utility loss if one plays the investment game infinitely many times with $\tilde{w}$, whether estimated via a Bayesian or a non-Bayesian approach. In particular, the difference in expected utilities between any two estimated rules, $\tilde{w}_a$ and $\tilde{w}_b$, should be
$$\text{Gain} = E[U(\tilde{w}_a)|\mu, V] - E[U(\tilde{w}_b)|\mu, V]. \tag{19}$$
This is an objective utility gain (loss) of using portfolio strategy $\tilde{w}_a$ versus $\tilde{w}_b$. It is considered to be an out-of-sample measure since it is independent of any single set of observations. If it is, say, 5%, it means that using $\tilde{w}_a$ instead of $\tilde{w}_b$ would yield a 5% gain in the expected utility over repeated use of the estimation strategy. In this case, if $\tilde{w}_a$ is obtained under prior $a$ and $\tilde{w}_b$ is obtained under prior $b$, one could consider prior $a$ to be superior to prior $b$. The loss or gain criterion is widely used in classical statistics to evaluate two estimators. Brown (1976, 1978), Jorion (1986), Frost and Savarino (1986), and Stambaugh (1997), for example, use $\rho(w^*, \hat{w})$ to evaluate portfolio rules.
Still, one cannot compute the loss function since it depends on unknown true parameters. Even so, it is widely used in two major ways. First, alternative estimators can be assessed in simulations with various assumed true parameters. Second, a comparison of alternative estimators can often be made analytically without any knowledge of the true parameters. For example, Kan and Zhou (2007) show that the Bayesian solution $\hat{w}^{Bayes}$ dominates $\bar{w}$ given in Equation (16), by having positive utility gains regardless of the true parameter values. However, the Bayesian solution is in turn dominated by another classical rule,
$$\hat{w}_c = \frac{c}{\gamma} \hat{\Sigma}^{-1} \hat{\mu}, \qquad c = \frac{(T - N - 1)(T - N - 4)}{T(T - 2)}. \tag{20}$$
This calls again for the use of informative priors in Bayesian portfolio analysis.
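The simulation route to evaluating rules can be sketched as follows for a single risky asset ($N = 1$). The sketch assumes the standard mean-variance utility $U(w) = w\mu - \frac{\gamma}{2} w^2 V$ evaluated at the assumed true parameters, and takes $\hat{V}$ to be the maximum likelihood variance; neither definition appears in this excerpt, and the parameter values are hypothetical. All rules are evaluated on the same simulated samples so that the gain in (19) is a difference of averages over identical draws.

```python
import random

def simulate_expected_utility(mu, v, gamma, T, scales, n_sims=5000, seed=0):
    """Average out-of-sample utility of rules w = s * mu_hat / (gamma * v_hat)."""
    rng = random.Random(seed)
    sd = v ** 0.5
    totals = {name: 0.0 for name in scales}
    for _ in range(n_sims):
        sample = [rng.gauss(mu, sd) for _ in range(T)]
        mu_hat = sum(sample) / T
        v_hat = sum((r - mu_hat) ** 2 for r in sample) / T  # ML variance
        plug_in = mu_hat / (gamma * v_hat)
        for name, s in scales.items():
            w = s * plug_in
            # Assumed mean-variance utility at the TRUE parameters.
            totals[name] += w * mu - 0.5 * gamma * w * w * v
    return {name: total / n_sims for name, total in totals.items()}

T, N = 60, 1
scales = {
    "ml": 1.0,                                            # plug-in rule
    "unbiased": (T - N - 2) / T,                          # scalar in (16)
    "bayes_diffuse": (T - N - 2) / (T + 1),               # scalar in (15)
    "rule_c": (T - N - 1) * (T - N - 4) / (T * (T - 2)),  # scalar c in (20)
}
utilities = simulate_expected_utility(0.005, 0.0025, 3.0, T, scales)
gain = utilities["bayes_diffuse"] - utilities["unbiased"]  # gain as in (19)
print(utilities, gain)
```

Under these assumed parameters the ranking of the four rules in the simulation agrees with the dominance results cited above, although a single parameter configuration is of course only an illustration and not a proof.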
The conjugate prior, which retains the same class of distributions, is a natural and common informative prior for any decision-making problem. In our context, the conjugate specification considers a normal prior for $\mu$ (conditional on $V$) and an inverted Wishart prior for $V$. The conjugate prior is given by
$$\mu | V \sim N(\mu_0, \tfrac{1}{\tau} V), \tag{21}$$
$$V \sim IW(V_0, \nu_0), \tag{22}$$
where $\mu_0$ is the prior mean, $\tau$ is a parameter reflecting the prior precision of $\mu_0$, and $\nu_0$ is a similar prior precision parameter on $V$. Under this prior, the posterior distribution of $\mu$ and $V$ obeys the same form as that based on the diffuse prior, except that now the posterior mean of $\mu$ is given by the mixture
$$\tilde{\mu} = \frac{\tau}{T + \tau} \mu_0 + \frac{T}{T + \tau} \hat{\mu}. \tag{23}$$
That is, the posterior mean is simply a weighted average of the prior and sample means. Similarly, $V_0$ can be updated by
$$\tilde{V} = \frac{T + 1}{T(\nu_0 + N - 1)} \left( V_0 + T\hat{V} + \frac{T\tau}{T + \tau} (\mu_0 - \hat{\mu})(\mu_0 - \hat{\mu})' \right), \tag{24}$$
which is a weighted average of the prior variance, sample variance, and deviations of $\hat{\mu}$ from $\mu_0$.
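The two updating formulas translate directly into code. The following is a minimal sketch using plain nested lists for matrices; the function name and all numerical inputs are hypothetical and serve only to illustrate the weighting.

```python
def conjugate_posterior(mu0, tau, V0, nu0, mu_hat, V_hat, T):
    """Posterior mean of mu and updated covariance under the conjugate prior."""
    N = len(mu0)
    w_prior = tau / (T + tau)  # weight on the prior mean
    mu_tilde = [w_prior * m0 + (1.0 - w_prior) * mh
                for m0, mh in zip(mu0, mu_hat)]
    d = [m0 - mh for m0, mh in zip(mu0, mu_hat)]  # mu0 - mu_hat
    factor = (T + 1) / (T * (nu0 + N - 1))
    k = T * tau / (T + tau)
    V_tilde = [[factor * (V0[i][j] + T * V_hat[i][j] + k * d[i] * d[j])
                for j in range(N)] for i in range(N)]
    return mu_tilde, V_tilde

# Hypothetical inputs: two assets, 60 observations, prior worth 60 observations.
mu_tilde, V_tilde = conjugate_posterior(
    mu0=[0.0, 0.0], tau=60, V0=[[0.04, 0.0], [0.0, 0.04]], nu0=10,
    mu_hat=[0.02, 0.01], V_hat=[[0.0025, 0.0005], [0.0005, 0.0036]], T=60)
print(mu_tilde)
print(V_tilde)
```

With $\tau = T$ the prior and the sample receive equal weight, so the posterior mean lies halfway between $\mu_0$ and $\hat{\mu}$.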
Frost and Savarino (1986) provide an interesting application of the conjugate prior, assuming all assets exhibit identical means, variances, and patterned covariances, a priori. They find that such a prior improves ex post performance. This prior is related to the well-known 1/N rule that invests equally across the N assets.
Jorion (1986) introduces hyperparameters $\eta$ and $\lambda$ that underlie the prior distribution of $\mu$. In particular, the hyperparameter prior is formulated as
$$p_0(\mu | \eta, \lambda) \propto |V|^{-1} \exp\{-\tfrac{1}{2} (\mu - \eta 1_N)' (\lambda V)^{-1} (\mu - \eta 1_N)\}. \tag{25}$$
Then, employing diffuse priors on both $\eta$ and $\lambda$ and integrating these parameters out from a suitable distribution, the predictive distribution of the future portfolio return can be obtained following Zellner and Chetty (1965). In particular, Jorion’s optimal portfolio rule is given by
$$w^{PJ} = \frac{1}{\gamma} (\hat{V}^{PJ})^{-1} \hat{\mu}^{PJ}, \tag{26}$$
where $w_e$ denotes the value weights in the stock index and $\gamma$ is the market risk-aversion coefficient. Assume that the true expected excess return $\mu$ is normally distributed with mean $\mu^e$,
$$\mu = \mu^e + \epsilon^e, \qquad \epsilon^e \sim N(0, \tau V), \tag{35}$$
where $\epsilon^e$, the deviation of $\mu$ from $\mu^e$, is normally distributed with zero mean and covariance matrix $\tau V$, with $\tau$ being a scalar indicating the degree of belief in how close $\mu$ is to the equilibrium value $\mu^e$. In the absence of any views on future stock returns, and in the special case of $\tau = 0$, the investor’s portfolio weights must be equal to $w_e$, the weights of the value-weighted index.
Black and Litterman (1992) consider views on the relative performance of stocks that can be represented mathematically by a single vector equation,
$$P\mu = \mu^v + \epsilon^v, \qquad \epsilon^v \sim N(0, \Omega), \tag{36}$$
where $P$ is a $K \times N$ matrix summarizing $K$ views, $\mu^v$ is a $K$-vector summarizing the prior means of the view portfolios, and $\epsilon^v$ is the residual vector. The views may be formed based on news, events, or analysis of the economy and investable assets. The covariance matrix of the residuals, $\Omega$, measures the degree of confidence the investor has in his own views. Applying the Bayesian rule to the beliefs in the market equilibrium relationship and the investor’s own views, as formulated in (35) and (36), Black and Litterman (1992) obtain the Bayesian updated expected returns and risks as
$$\bar{\mu}^{BL} = [(\tau V)^{-1} + P'\Omega^{-1}P]^{-1} [(\tau V)^{-1}\mu^e + P'\Omega^{-1}\mu^v], \tag{37}$$
$$\bar{V}^{BL} = V + [(\tau V)^{-1} + P'\Omega^{-1}P]^{-1}. \tag{38}$$
Replacing $V$ by $\hat{V}$ and plugging these two updated estimates into (6), one obtains the Black and Litterman solution to the portfolio choice problem.
Note that the Black-Litterman expected return, $\bar{\mu}^{BL}$, is a weighted average of the equilibrium expected return and the investor’s views about expected return. Intuitively, the less confident the investor is in his views, the closer $\bar{\mu}^{BL}$ is to the equilibrium value, and so the closer the Black-Litterman portfolio is to $w_e$. This is indeed the case, as shown mathematically by He and Litterman (1999). Hence, the Black-Litterman model tilts the investor’s optimal portfolio away from the market portfolio according to the strength of the investor’s views. Since the market portfolio is a reasonable starting point which takes no extreme positions, any suitably controlled tilt should also yield a portfolio without any extreme positions. This is one of the major reasons making the Black-Litterman model popular in practice.
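The updating formulas for the Black-Litterman mean and covariance can be illustrated with a small example. The sketch below implements them with plain nested lists and a hand-written Gauss-Jordan inverse so that it is self-contained; the helper names, the two-asset inputs, and the single view (asset 1 outperforms asset 2 by 2%) are all hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(A)
    M = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def black_litterman(mu_e, V, tau, P, mu_v, Omega):
    """Updated mean and covariance; vectors are passed as flat lists."""
    tV_inv = inverse([[tau * x for x in row] for row in V])   # (tau V)^{-1}
    PtOi = matmul(transpose(P), inverse(Omega))               # P' Omega^{-1}
    precision = [[a + b for a, b in zip(r1, r2)]
                 for r1, r2 in zip(tV_inv, matmul(PtOi, P))]
    A = inverse(precision)  # [(tau V)^{-1} + P' Omega^{-1} P]^{-1}
    rhs = [x[0] + y[0] for x, y in zip(matmul(tV_inv, [[m] for m in mu_e]),
                                       matmul(PtOi, [[m] for m in mu_v]))]
    mu_bl = [row[0] for row in matmul(A, [[r] for r in rhs])]
    V_bl = [[v + a for v, a in zip(rv, ra)] for rv, ra in zip(V, A)]
    return mu_bl, V_bl

# Hypothetical inputs.
mu_e = [0.05, 0.04]
V = [[0.04, 0.01], [0.01, 0.09]]
P = [[1.0, -1.0]]
mu_bl, V_bl = black_litterman(mu_e, V, 0.05, P, [0.02], [[0.0004]])
print(mu_bl, mu_bl[0] - mu_bl[1])
```

Increasing the entries of $\Omega$ moves the updated mean back toward $\mu^e$, while shrinking them forces the view portfolio’s updated expected return toward $\mu^v$, in line with the intuition described above.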
Whereas the Black-Litterman model is considered to be a Bayesian approach, it is not entirely Bayesian. For one, the data-generating process is not spelled out explicitly. Moreover, the Bayesian predictive density is not used anywhere. Zhou (2009) treats the investor’s views as yet another layer of priors, and combines this and the equilibrium prior with the data-generating process, resulting in a formal Bayesian treatment and an extension of the famous Black and Litterman model.
Pástor (2000) and Pástor and Stambaugh (2000) introduce interesting priors that reflect an investor’s degree of belief in the ability of an asset pricing model to explain the cross-sectional dispersion in expected returns. In particular, let $R_t = (y_t, x_t)$, where $y_t$ contains the excess returns of $m$ non-benchmark positions and $x_t$ contains the excess returns of $K$ ($= N - m$) benchmark positions. Consider a factor model multivariate regression
$$y_t = \alpha + B x_t + u_t, \tag{39}$$
where $u_t$ is an $m \times 1$ vector of residuals with zero means and a non-singular covariance matrix $\Sigma = V_{11} - B V_{22} B'$. Notice that $\alpha$ and $B$ are related to $\mu$ and $V$ through
$$\alpha = \mu_1 - B\mu_2, \qquad B = V_{12} V_{22}^{-1}, \tag{40}$$
where $\mu_i$ and $V_{ij}$ ($i, j = 1, 2$) are the corresponding partitions of $\mu$ and $V$,
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \qquad V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}.$$
A factor-based asset pricing model, such as the three-factor model of Fama and French (1993), implies the restrictions α = 0 for all non-benchmark assets.
To allow for mispricing uncertainty, Pástor (2000) and Pástor and Stambaugh (2000) specify the prior distribution of $\alpha$ as a normal distribution conditional on $\Sigma$,
$$\alpha | \Sigma \sim N\left( 0, \sigma_\alpha^2 \left( \frac{1}{s_\Sigma^2} \Sigma \right) \right),$$
where $s_\Sigma^2$ is a suitable prior estimate for the average diagonal elements of $\Sigma$. The above alpha-Sigma link is also explored by MacKinlay and Pástor (2000) in a classical framework. The magnitude of
Formally, the objective-based prior starts from a prior on $w$,
$$w \sim N(w_0, V_0 V^{-1}/\gamma), \tag{44}$$
where $w_0$ and $V_0$ are suitable prior constants with known values, and then backs out a prior on $\mu$,
$$\mu \sim N\left( \gamma V w_0, \sigma_\rho^2 \left( \frac{1}{s^2} V \right) \right), \tag{45}$$
where $s^2$ is the average of the diagonal elements of $V$. The prior on $V$ can be taken as the usual inverted Wishart distribution.
Using monthly returns on the Fama-French 25 size and book-to-market portfolios and three factors from January 1965 to December 2004, Tu and Zhou (2009) find that the investment performance under the objective-based priors can be significantly different from that under diffuse and asset pricing priors, with differences in terms of annual certainty-equivalent returns greater than 10% in many cases. In terms of the loss function measure, portfolio strategies based on the objective-based priors can substantially outperform both strategies under the alternative priors.
3 Predictable Returns
So far asset returns are assumed to be IID and thus unpredictable through time. However, Keim and Stambaugh (1986), Campbell and Shiller (1988), and Fama and French (1989), among others, identify business cycle variables, such as the aggregate dividend yield and the default spread, that predict future stock and bond returns. Such predictive variables, when incorporated in studies that deal with the time-series and cross-sectional properties of expected returns, provide fresh insights into asset pricing and portfolio selection. In asset pricing, Lettau and Ludvigson (2001) and Avramov and Chordia (2006a) show that factor models with time-varying risk premia and/or risk are reasonably successful relative to their unconditional counterparts. Focusing on portfolio selection, Kandel and Stambaugh (1996) analyze investments when returns are potentially predictable.
In particular, consider a one-period optimizing investor who must allocate funds at time $T$ between the value-weighted NYSE index and one-month Treasury bills. The investor makes portfolio decisions based on estimating the predictive system
$$r_t = a + b' z_{t-1} + u_t, \tag{46}$$
$$z_t = \theta + \rho z_{t-1} + v_t, \tag{47}$$
where $r_t$ is the continuously compounded NYSE return in month $t$ in excess of the continuously compounded T-bill rate for that month, $z_{t-1}$ is a vector of $M$ predictive variables observed at the end of month $t - 1$, $b$ is a vector of slope coefficients, and $u_t$ is the regression disturbance in month $t$. The evolution of the predictive variables is essentially stochastic. Typically a first-order vector autoregression is employed to model that evolution. The residuals in equations (46) and (47) are assumed to obey the normal distribution. In particular, let $\eta_t = [u_t, v_t']'$; then $\eta_t \sim N(0, \Sigma)$, where
$$\Sigma = \begin{bmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{vu} & \Sigma_v \end{bmatrix}. \tag{48}$$
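A minimal simulation sketch of this predictive system with a single predictor ($M = 1$) may help fix ideas. The function name and all parameter values are hypothetical; correlated shocks are built from two independent standard normal draws.

```python
import random

def simulate_predictive_system(a, b, theta, rho, sigma_u, sigma_v,
                               corr_uv, z0, T, seed=0):
    """Simulate excess returns r_t and one predictor z_t with normal shocks."""
    rng = random.Random(seed)
    r, z = [], [z0]
    for _ in range(T):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = sigma_u * e1
        v = sigma_v * (corr_uv * e1 + (1.0 - corr_uv ** 2) ** 0.5 * e2)
        r.append(a + b * z[-1] + u)         # return regression on lagged z
        z.append(theta + rho * z[-1] + v)   # first-order autoregression for z
    return r, z

# Hypothetical monthly parameters with a persistent predictor and
# negatively correlated return and predictor shocks.
r, z = simulate_predictive_system(a=-0.01, b=0.5, theta=0.003, rho=0.9,
                                  sigma_u=0.04, sigma_v=0.002,
                                  corr_uv=-0.8, z0=0.03, T=240)
print(sum(r) / len(r), z[-1])
```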
The distribution of $r_{T+1}$, the time $T + 1$ NYSE excess return, conditional on data and model parameters is $N(a + b' z_T, \sigma_u^2)$. Assuming the inverted Wishart prior distribution for $\Sigma$ and a multivariate normal prior for the intercept and slope coefficients in the predictive system, the Bayesian predictive distribution $P(r_{T+1}|\Phi_T)$ obeys the Student t density. Then, considering a power utility investor with parameter of relative risk aversion denoted by $\gamma$, the optimization formulation is
$$\omega^* = \arg\max_\omega \int_{r_{T+1}} \frac{[(1 - \omega)\exp(r_f) + \omega\exp(r_f + r_{T+1})]^{1-\gamma}}{1 - \gamma}\, P(r_{T+1}|\Phi_T)\, dr_{T+1}, \tag{49}$$
subject to $\omega$ being nonnegative. An analytic solution for the optimal portfolio is not available. However, the problem can easily be solved numerically. In particular, given $G$ independent draws for $R_{T+1}$ from the suitable predictive distribution, the optimal portfolio is found by implementing a constrained optimization code to maximize the quantity
$$\frac{1}{G} \sum_{g=1}^{G} \frac{\{(1 - \omega)\exp(r_f) + \omega\exp(r_f + R_{T+1}^{(g)})\}^{1-\gamma}}{1 - \gamma} \tag{50}$$
subject to $\omega$ being nonnegative. Kandel and Stambaugh (1996) show that even when the statistical evidence on predictability, as reflected through the $R^2$ in the regression (46), is weak, the current values of the predictive variables, $z_T$, can exert a substantial influence on the optimal portfolio.
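The numerical step can be sketched with a simple grid search over the equity weight. This is only an illustration: the draws below come from a normal distribution as a stand-in for the Student t predictive distribution, the weight is additionally capped at 0.99 so that terminal wealth stays strictly positive (an assumption of this sketch), and the function name and parameter values are hypothetical. The sketch requires $\gamma \neq 1$.

```python
import math
import random

def optimal_equity_weight(draws, rf, gamma, grid_size=100):
    """Grid search for the equity weight in [0, 0.99] that maximizes the
    simulated average power utility of terminal wealth."""
    best_omega, best_value = 0.0, -math.inf
    for i in range(grid_size):
        omega = i / grid_size
        total = 0.0
        for r in draws:
            wealth = (1.0 - omega) * math.exp(rf) + omega * math.exp(rf + r)
            total += wealth ** (1.0 - gamma) / (1.0 - gamma)
        value = total / len(draws)
        if value > best_value:
            best_omega, best_value = omega, value
    return best_omega

# Stand-in predictive draws for three hypothetical values of the predictors.
rng = random.Random(0)
pessimistic = [rng.gauss(-0.05, 0.04) for _ in range(2000)]
moderate = [rng.gauss(0.004, 0.04) for _ in range(2000)]
optimistic = [rng.gauss(0.05, 0.04) for _ in range(2000)]
weights = [optimal_equity_weight(d, rf=0.003, gamma=5.0)
           for d in (pessimistic, moderate, optimistic)]
print(weights)
```

Shifting the mean of the draws, as a change in $z_T$ would, moves the chosen weight across the whole admissible range, which illustrates how sensitive the allocation can be to the current values of the predictors.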
that the allocation to equity diminishes with the investment horizon, as stocks appear to be riskier in longer horizons. Accounting for both return predictability and estimation risk, Barberis (2000) shows that investors allocate considerably more heavily to equity the longer their horizon.
One essential question is what are the benefits of using the Bayesian approach in studying asset allocation with predictability?
We describe four major advantages of the Bayesian versus classical approaches. First, unlike in the single-period case wherein estimation risk plays virtually no role, estimation risk does play an important role in long-horizon investment decisions. Barberis (2000) shows that a long-horizon investor who ignores it may overallocate to stocks by a sizeable amount. Second, even when the predictors evolve stochastically, both Kandel and Stambaugh (1996) and Barberis (2000) assume that the initial value of the predictive variables $z_0$ is non-stochastic. With a stochastic initial value, the distribution of future returns conditioned on model parameters no longer obeys a well-known distributional form. Nevertheless, Stambaugh (1999) easily gets around this problem by implementing the Metropolis-Hastings (MH) algorithm, a Markov Chain Monte Carlo procedure introduced by Metropolis et al. (1953) and generalized by Hastings (1970). There are several other powerful numerical Bayesian algorithms, such as the Gibbs sampler and data augmentation [see a review by Chib and Greenberg (1996)], which make the Bayesian approach broadly applicable. The third and fourth advantages pertain to the ability of a Bayesian investor to incorporate model uncertainty as well as consider prior views about the degree of predictability explained by asset pricing models. Both of these important features of the Bayesian approach are explained below.
Indeed, as noted earlier, financial economists have identified economic variables that predict future asset returns. However, the “correct” predictive regression specification has remained an open issue for several reasons. For one, existing equilibrium pricing theories are not explicit about which variables should enter the predictive regression. This aspect is undesirable, as it renders the empirical evidence subject to data overfitting concerns. Indeed, Bossaerts and Hillion (1999) confirm in-sample return predictability, but fail to demonstrate out-of-sample predictability. Moreover, the multiplicity of potential predictors also makes the empirical evidence difficult to interpret. For example, one may find an economic variable statistically significant based on a particular collection of explanatory variables, but often not based on a competing specification. Given that the true set of predictive variables is virtually unknown, the Bayesian methodology of model averaging, described below, is attractive, as it explicitly incorporates model uncertainty in asset allocation decisions.
Bayesian model averaging has been implemented to study heart attacks in medicine, traffic congestion in transportation economics, hot hands in basketball, and economic growth in macroeconomics. In finance, Bayesian model averaging facilitates a flexible modeling of investors’ uncertainty about potentially relevant predictive variables in forecasting models. In particular, it assigns posterior probabilities to a wide set of competing return-generating models (overall, $2^M$ models); then it uses the probabilities as weights on the individual models to obtain a composite weighted model. This optimally weighted model is ultimately employed to investigate asset allocation decisions. Bayesian model averaging contrasts markedly with the traditional classical approach of model selection. At the heart of the model selection approach, one uses a specific criterion (e.g., adjusted $R^2$) to select a single model and then operates as if the model is correct. Implementing model selection criteria, the econometrician views the selected model as the true one with a unit probability and discards the other competing models as worthless, thereby ignoring model uncertainty. Accounting for model uncertainty, Avramov (2002) shows that Bayesian model averaging outperforms, ex post out-of-sample, the classical approach of model selection criteria, generating smaller forecast errors and being more efficient. Ex ante, an investor who ignores model uncertainty suffers considerable utility losses.
The Bayesian weighted predictive distribution of cumulative excess continuously compounded returns averages over the model space, and integrates over the posterior distribution that summarizes the within-model uncertainty about $\Theta_j$, where $j$ is the model identifier. It is given by
$$P(R_{T+K}|\Phi_T) = \sum_{j=1}^{2^M} P(M_j|\Phi_T) \int_{\Theta_j} P(\Theta_j|M_j, \Phi_T)\, P(R_{T+K}|M_j, \Theta_j, \Phi_T)\, d\Theta_j, \tag{55}$$
where $P(M_j|\Phi_T)$ is the posterior probability that model $M_j$ is the correct one. Drawing from the weighted predictive distribution is done in three steps. First, draw the correct model from the distribution of models. Then, conditional upon the model, implement the two steps, noted above, of drawing future returns from the model-specific Bayesian predictive distribution.
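The three steps can be sketched with toy models. In the sketch below each model is a hypothetical stand-in described by a normal posterior for the mean return and a known return volatility, rather than a fitted predictive regression, and the posterior model probabilities are taken as given; the function name and dictionary fields are likewise assumptions of this illustration.

```python
import random

def draw_from_bma(models, posterior_probs, n_draws, seed=0):
    """Three-step draw from a model-averaged predictive distribution."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Step 1: draw a model with probability equal to its posterior weight.
        j = rng.choices(range(len(models)), weights=posterior_probs)[0]
        m = models[j]
        # Step 2: draw the model's parameters from their posterior (toy normal).
        mean = rng.gauss(m["post_mean"], m["post_sd"])
        # Step 3: draw the future return given the model and its parameters.
        draws.append((j, rng.gauss(mean, m["return_sd"])))
    return draws

# Two hypothetical models with posterior probabilities 0.7 and 0.3.
models = [
    {"post_mean": 0.010, "post_sd": 0.002, "return_sd": 0.04},
    {"post_mean": 0.002, "post_sd": 0.004, "return_sd": 0.05},
]
draws = draw_from_bma(models, [0.7, 0.3], n_draws=10000)
share_model_0 = sum(1 for j, _ in draws if j == 0) / len(draws)
print(share_model_0)
```

The resulting draws can be passed to a utility-maximization routine in place of draws from any single model, so that the allocation reflects both within-model and across-model uncertainty.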