









Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Bayesian posterior distribution prior
Typology: Study notes
1 / 17
This page cannot be seen from the preview
Don't miss anything!










Attendance Probability Observed frequency Week 2 and Week 4 θ 11 n 11 = 25 Week 2 but not Week 4 θ 10 n 10 = 7 Week 4 but not Week 2 θ 01 n 01 = 6 Neither week θ 00 n 00 = 13
Suppose that the prior distribution for (θ 11 , θ 10 , θ 01 , θ 00 ) is a Dirichlet distribution with den- sity proportional to θ^311 θ 10 θ 01 θ 002.
(a) Find the prior means and prior variances of θ 11 , θ 10 , θ 01 , θ 00. (b) Find the posterior distribution. (c) Find the posterior means and posterior variances of θ 11 , θ 10 , θ 01 , θ 00. (d) Using the R function hpdbeta which may be obtained from the Web page (or other- wise), find a 95% posterior hpd interval, based on the exact posterior distribution, for θ 00. (e) Find an approximate 95% hpd interval for θ 00 using a normal approximation based on the posterior mode and the partial second derivatives of the log posterior density. Compare this with the exact hpd interval. Hint: To find the posterior mode you will need to introduce a Lagrange multiplier. (f) The population mean number of attendances out of two is μ = 2θ 11 + θ 10 + θ 01. Find the posterior mean of μ and an approximation to the posterior standard deviation of μ.
44.3 50.2 51.7 49.4 50.6 55.0 53.5 48.6 48.8 53. 59.4 51.4 52.0 51.9 51.6 48.3 49.3 54.1 52.4 53.
We regard these as independent samples from a normal distribution with mean μ and variance σ^2 = τ −^1. Find a 95% posterior hpd interval for μ under each of the following two conditions.
(a) The value of τ is known to be 0.1 and our prior distribution for μ is normal with mean 60.0 and standard deviation 20.0. (b) The value of τ is unknown. Our prior distribution for τ is a gamma distribution with mean 0.1 and standard deviation 0.05. Our conditional prior distribution for μ given τ is normal with mean 60.0 and precision 0. 025 τ (that is, standard deviation
40 τ −^1 /^2 ).
i=
yi = 672 and
i=
y i^2 = 16193.
(a) Suppose that the value of τ is known to be 0.04 and that our prior distribution for μ is normal with mean 20 and variance 100. Find the posterior distribution of μ and evaluate a posterior 95% hpd interval for μ. (b) Suppose that we have a gamma(1, 10) prior distribution for τ and our conditional prior distribution for μ given τ is normal with mean 20 and variance (0. 1 τ )−^1. Find the marginal posterior distribution for τ, the marginal posterior distribution for μ and the marginal posterior 95% hpd interval for μ.
Patient 1 2 3 4 5 6 7 8 Before 210 169 187 160 167 176 185 206 After 201 165 166 157 147 145 168 180 Patient 9 10 11 12 13 14 15 Before 173 146 174 201 198 148 154 After 147 136 151 168 179 129 131
We are interested in the effect of the drug on blood pressure. We assume that, given pa- rameters μ, τ, the changes in blood pressure, from before to after, in the n patients are independent and normally distributed with unknown mean μ and unknown precision τ. The fifteen differences are as follows.
Our prior distribution for τ is a gamma(0. 35 , 1 .01) distribution. Our conditional prior dis- tribution for μ given τ is a normal N (0, [0. 003 τ ]−^1 ) distribution.
(a) Find the marginal posterior distribution of τ. (b) Find the marginal posterior distribution of μ. (c) Find the marginal posterior 95% hpd interval for μ. (d) Comment on what you can conclude about the effect of the drug.
f (t) = αρ^2 t exp[−(ρt)^2 ]
for 0 < t < ∞. We will observe a sample of n such lifetimes where n is large.
(a) Find the maximum likelihood estimate of p and hence find the maximum likelihood estimate of θ. (b) Disregarding the prior distribution, find a large-sample approximation to the posterior distribution of θ. (c) Using your approximate posterior distribution for θ, find an approximate 95% hpd interval for θ. (d) Use the exact posterior distribution for p to find the actual posterior probability that θ is inside your approximate hpd interval.
Notes: • The standard normal distribution function Φ(x) =
∫ (^) x −∞ φ(u)^ du^ where^ φ(u) = (2π)−^1 /^2 exp{−u^2 / 2 }.
dl dθ
dl dp
dp dθ and
d^2 l dθ^2
d dθ
dl dp
dp dθ
d dθ
dl dp
dp dθ
dl dp
d^2 p dθ^2
=
d^2 l dp^2
dp dθ
dl dp
d^2 p dθ^2
= φ(θ)
d^2 p dθ^2
= −θφ(θ)
496 506 495 491 488 492 482 495 493 496 487 490 493 495 492 498 491 493 495 489
Assume that, given the values of parameters μ, τ, the weights are independent and each has a normal N (μ, τ ) distribution. The values of μ and τ are unknown. Our prior distribution is as follows. We have a gamma(2, 9) prior distribution for τ and a N (500, (0. 005 τ )−^1 ) conditional prior distribution for μ given τ.
(a) Find the posterior probability that μ < 495. (b) Find the posterior predictive probability that a new packet of rice will contain less than 500g of rice
Observation i 1 2 3 4 5 6 7 8 Age of machine ti (months) 3 6 9 12 15 18 21 24 Number of jams yi 10 13 24 17 20 22 20 23
Our model is as follows. Given the values of two parameters α, β, the number of jams yi on a dat when the machine has age ti months has a Poisson distribution
yi ∼ Poisson(λi)
where loge(λi) = α + βti. Assume that the effect of our prior distribution on the posterior distribution is negligible and that large-sample approximations may be used.
(a) Let the values of α and β which maximise the likelihood be ˆα and β.ˆ Assuming that the likelihood is differentiable at its maximum, show that these satisfy the following two equations
i=
(λˆi − yi) = 0
i=
ti(λˆi − yi) = 0
where loge(ˆλi) = ˆα + βtˆi and show that these equations are satisfied (to a good approximation) by
αˆ = 2. 552 and βˆ = 0. 02638.
(You may use R to help with the calculations, but show your commands). You may assume from now on that these values maximise the likelihood. (b) Find an approximate symmetric 95% posterior interval for α + 24β. (c) Find an approximate symmetric 95% posterior interval for exp(α + 24β), the mean jam-rate per day at age 24 months.
(You may use R to help with the calculations, but show your commands).
Solutions to Questions 9 and 10 of Problems 5 are to be submitted in the Homework Letterbox no later than 4.00pm on Tuesday May 5th.
MacGregor, G.A., Markandu, N.D., Roulston, J.E. and Jones, J.C., 1979. Essential hypertension: effect of an oral inhibitor of angiotensin-converting enzyme. British Medical Journal, 2 , 1106-1109.
Posterior means:
θ 11 :
θ 10 :
θ 01 :
θ 00 :
Posterior variances:
θ 11 :
θ 10 :
θ 01 :
θ 00 :
(d) Posterior distribution for θ 00 is beta(16, 62 − 16). That is beta(16,46).
Using the R command hpdbeta(0.95,16,46) gives 0. 15325 < θ 00 < 0. 36724. (e) The log posterior density is (apart from a constant)
∑^4
j=
(a 1 ,j − 1) log θj.
Add λ(
∑j j=1 θj^ −^ 1) to this and differentiate wrt^ θj^ then set the derivative equal to zero. This gives
a 1 ,j − 1 θˆj
which leads to θˆj = − (a^1 ,j^ −^ 1) λ
However
j=1 θj^ = 1 so
−
j=
(a 1 ,j − 1) λ
so
λ = −
j=
(a 1 ,j − 1)
and θˆj = ∑a^1 ,j^ −^1 a 1 ,k − 4
Hence the posterior mode for θ 00 is
θˆ 00 =^15 58
The second derivatives of the log likelihood are ∂^2 l ∂θ^2 j
a 1 ,j − 1 θ^2 j
and
∂^2 l ∂θj ∂θk
Since the mixed partial second derivatives are zero, the information matrix is diagonal and the posterior variance of θj is approximately
θˆ^2 j a 1 ,j − 1
(a 1 ,j − 1)^2 (a 1 ,j − 1)(
a 1 ,k − 4)^2
(ai,j − 1 (
a 1 ,k − 4)^2
The posterior variance of θ 00 is approximately 15 582
The approximate 95% hpd interval is 0. 2586 ± 1. 96
(f) Approximation based on posterior mode and curvature: Posterior modes: θ 11 :
θ 10 :
θ 01 :
So, approx. posterior mean of μ is
2 ×
Approx. posterior variances:
θ 11 :
θ 10 :
θ 01 :
Since the (approx.) covariances are all zero, the approx. posterior variance of μ is
4 ×
so approx. standard deviation is √ 0 .0377527 = 0. 1943.
N.B. There is an alternative exact calculation, as follows, which is also acceptable. Posterior mean:
2 ×
Posterior covariances:
− a 1 ,j a 1 ,k A^21 (A 1 + 1)
var(μ) = 4var(θ 11 ) + var(θ 10 ) + var(θ 01 ) +4covar(θ 11 , θ 10 ) + 4covar(θ 11 , θ 01 ) + 2covar(θ 10 , θ 01 )
= 4
So the standard deviation is 0. 1040. The difference is quite big!
That is M 1 ± 2. 048 × 0. 68591 That is
051 < μ < 52. 860
(a) We have
P 0 = 0. 01 Pd = nτ = 30 × 0 .04 = 1. 2 P 1 = 0 .01 + 1.2 = 1. 21 M 0 = 20 y¯ = 22. 4
M 1 =
P 0 M 0 + Pd ¯y P 1
Posterior:
μ ∼ N (22. 380 , 1. 21 −^1 ). That is μ ∼ N (22. 380 , 0 .8264).
95% hpd interval: 22. 380 ± 1. 96
60 < μ < 24. 16
(b) We have
d 0 = 2 d 1 = d 0 + n = 32 v 0 = 10 c 0 = 0. 1 c 1 = c 0 + n = 30. 1 m 0 = 20 ¯y = 22. 4
m 1 =
c 0 m 0 + ny¯ c 0 + n
s^2 n =
n
∑^ n
i=
y i^2 −
n
( (^) n ∑
i=
yi
(¯y − m 0 )^2 = (22. 4 − 20)^2 = 5. 76 r^2 = (¯y − m 0 )^2 + s^2 n = 43. 7667
vd =
c 0 r^2 + ns^2 n c 0 + n
v 1 =
d 0 v 0 + nvd d 0 + n
Marginal posterior distribution for τ : d 1 v 1 τ = 11670. 77 τ ∼ χ^232. Marginal posterior distribution for μ:
μ − m 1 √ v 1 /c 1
μ − 22. 392 √
∼ t 32
95% hpd interval: m 1 ± 2. 037
v 1 c 1
That is
That is
16 < μ < 24. 63.
Data: n = 15
y = − 284
y^2 = 6518
¯y = − 18. 9333 s^2 n =
Calculate posterior:
d 0 = 0. 7 v 0 = 2. 02 / 0 .7 = 2. 8857 c 0 = 0. 003 m 0 = 0 d 1 = d 0 + 15 = 15. 7 (¯y − m 0 )^2 = ¯y^2 = 358. 4711 r^2 = (¯y − m 0 )^2 + s^2 n = 434. 5333
vd =
c 0 r^2 + ns^2 n c 0 + n
v 1 =
d 0 v 0 + nvd d 0 + n
c 1 = c 0 + 15
m 1 =
c 0 m 0 + ny¯ c 0 + n
(a) Marginal posterior distribution for τ is gamma(d 1 / 2 , d 1 v 1 /2). That is
gamma(7. 85 , 572 .014).
(Alternatively d 1 v 1 τ ∼ χ^2 d 1. That is 1144. 025 τ ∼ χ^215. 7 ). (b) Marginal posterior for μ:
μ − m 1 √ v 1 /c 1
μ − 18. 9295 √
∼ t 15. 7
(c) We can use R for the critical points of t 15. 7 : ± 2. 1232
qt(0.975,15.7)
95% interval: − 18. 9295 ± 2. 1232
− 23. 61 < μ < − 14. 25.
(d) Comment, e.g., since zero is well outside the 95% interval it seems clear that the drug reduces the blood pressure.
This quadratic equation has two solutions but one is negative and ρ must be positive so
ρˆ =
t^2 i )(2n + 1) 4
t^2 i
The second derivative is ∂^2 g ∂ρ^2
2 n + 1 ρ^2
∑^ n
i=
t^2 i
so the posterior variance is approximately
1 2[(n + 1/2)/ρˆ^2 +
t^2 i ]
Our 95% hpd interval is therefore 0. 0097410 ± 1. 96
900519 × 10 −^8. That is
009190 < ρ < 0. 010292.
So the prior makes no noticeable difference in this case.
From tables, 95% interval, 3. 247 < 2 λ < 20. 48. That is
(b) Prior density prop. to λ^5 −^1 e−λ. Likelihood
L =
i=
e−λλxi xi!
e−^45 λλ
xi ∏ xi!
∝ e−^45 λλ^182.
Posterior density prop. to λ^187 −^1 e−^46 λ. This is a
gamma(187, 46)
distribution. (c) Posterior mean: 147 46
Posterior variance:
187 462
Posterior sd: √ 187 462
95% interval 4. 0652 ± 1. 96 × 0. 29728. That is
(d) Joint prob. of λ, X = m:
46187 Γ(187)
λ^187 −^1 e−^46 λ^
λme−λ m!
Γ(187 + m) 47 187+m
m!
47 187+m Γ(187 + m)
λ187+m−^1 e−^47 λ
Integrate out λ:
Pr(X = m) =
47 187+m
Γ(187 + m) Γ(187)m!
=
(186 + m)! 186!m!
)m
186 + m m
)m
(e) Joint probability (density) of θ, X = m:
θme−θ m!
m!
Γ(1 + m)
05 m+
05 m+ Γ(1 + m)
θm+1−^1 e−^1.^05 θ
Integrate out θ :
Pr(X = m) =
Γ(1 + m) m!
)m
Log posterior probs: “Ordinary”:
log(P 1 ) = log[Γ(187 + 10)] − log[Γ(187)] − log[Γ(11)] + 187 log(46/47) + 10 log(1/47) = log[Γ(197)] − log[Γ(187)] − log(Γ(11)] + 187 log(46) − 197 log(47) = − 5. 079796
lgamma(197) - lgamma(187) - lgamma(11) + 187log(46) - 197log(47) [1] -5.
“Type 2”:
log(P 2 ) = log(0. 05 / 1 .05) + 10 log(1/ 1 .05) = log(0.05) − 11 log(1.05) = − 3. 532424
Hence the predictive probabilities are as follows. “Ordinary”: P 1 = exp(− 5 .079796) = 0. 006221178 “Type 2”: P 2 = exp(− 3 .532424) = 0. 02923396 Hence the posterior probability that this is an ordinary customer is
9 × 0. 006221178 9 × 0 .006221178 + 1 × 0. 02923396
i=
e−λi^ λy ii yi!
Log likelihood:
l = −
λi +
yi log λi −
log(yi!)
= −
λi +
yi(α + βti) −
log(yi!)
Derivatives:
∂λi ∂α
∂α
eα+βti^ = λi ∂λi ∂β
∂β
eα+βti^ = λiti
∂l ∂α
∑ (^) ∂λi ∂α
yi = −
λi +
yi = −
(λi − yi) ∂l ∂β
∑ (^) ∂λi ∂β
yiti = −
λiti +
yiti = −
ti(λi − yi)
At the maximum
∂l ∂α
∂l ∂β
Hence ˆα and βˆ satisfy the given equations. Calculations in R:
y<-c(10,13,24,17,20,22,20,23) t<-seq(3,24,3) lambda<-exp(2.552+0.02638t) sum(lambda-y) [1] 0. sum(t(lambda-y)) [1] -0.
These values seem close to zero but let us try a small change to the parameter values:
lambda<-exp(2.55+0.0264t) sum(lambda-y) [1] -0. sum(t(lambda-y)) [1] -3.
The results are now much further than zero suggesting that the given values are very close to the solutions.
(5 marks)
(b) Second derivatives:
∂^2 l ∂α^2
∑ (^) ∂λi ∂α
λi
∂^2 l ∂β^2
ti
∂λi ∂α
t^2 i λi
∂^2 l ∂α∂β
∑ (^) ∂λi ∂β
tiλi
Variance matrix:
∂^2 l ∂α^2
∂^2 l ∂α∂β ∂^2 l ∂α∂β
∂^2 l ∂β^2
Numerically using R:
lambda<-exp(2.552+0.02638t) d2<-matrix(nrow=2,ncol=2) d2[1,1]<- -sum(lambda) d2[1,2]<- - sum(tlambda) d2[2,1]<- - sum(tlambda) d2[2,2]<- - sum((t^2)lambda) V<- - solve(d2) V [,1] [,2] [1,] 0.038194535 -0. [2,] -0.002136181 0. The mean of α + 24β is 2.552 + 24 × 0 .02638 = 3. 18512. The variance is 0.038194535 + 24^2 ∗ 0 .0001449430 + 2 × 1 × 24 × (− 0 .0021361807) =
Alternative matrix-based calculation in R:
dim(m)<-c(1,2) v<-m%%V%%t(m) v [,1] [1,] 0.
The approximate 95% interval is
That is
(5 marks)
(c) The interval for λ 24 is e^2.^9139 < eα+24β^ < e^3.^4563.
That is
(2 marks)