









Lecture notes for STAT 710: Mathematical Statistics at the University of Wisconsin-Madison, covering the Bayesian method, prior and posterior distributions, and the Bayes formula. The notes explain how to construct the posterior distribution from the joint distribution of $X$ and $\tilde\theta$, and give the Bayes formula for the continuous and discrete cases.










Jun Shao
Department of Statistics, University of Wisconsin, Madison, WI 53706, USA
$X$ is from a population in a parametric family $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$, where $\Theta \subset \mathbb{R}^k$ for a fixed integer $k \ge 1$.
Bayesian method; minimaxity and admissibility; likelihood approach.
Decision rules that minimize the average risk w.r.t. a given probability measure $\Pi$ on $\Theta$ are the optimal rules in the Bayesian approach, which is fundamentally different from the classical frequentist approach that we have been adopting.
$\theta$ is viewed as a realization of a random vector $\tilde\theta \in \Theta$ whose prior distribution is $\Pi$.
Prior distribution: past experience, past data, or a statistician's belief (subjective).
Sample $X \in \mathcal{X}$: from $P_\theta = P_{x|\theta}$, the conditional distribution of $X$ given $\tilde\theta = \theta$.
Posterior distribution: updated prior distribution using the sample $X = x$.
By Theorem 1.7, the joint distribution of $X$ and $\tilde\theta$ is a probability measure on $\mathcal{X} \times \Theta$ determined by
$$P(A \times B) = \int_B P_{x|\theta}(A)\, d\Pi(\theta), \qquad A \in \mathcal{B}_{\mathcal{X}},\ B \in \mathcal{B}_\Theta.$$
The posterior distribution is the conditional distribution $P_{\theta|x}$, whose existence is guaranteed by Theorem 1.7 for almost all $x \in \mathcal{X}$.
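The construction of the joint distribution can be made concrete in a small discrete case, where the integral over $B$ becomes a sum. The following is a minimal sketch under assumed, purely illustrative choices: a three-point prior (`prior`) and a binomial model for $X$ given $\theta$ with `n = 5` trials. None of these values come from the notes.

```python
from math import comb

# Hypothetical setup (not from the notes): theta takes three values with
# prior weights Pi({theta}), and X | theta ~ Binomial(n, theta).
n = 5
prior = {0.2: 0.3, 0.5: 0.4, 0.8: 0.3}

def binom_pmf(x, n, theta):
    """P_{x|theta}({x}) for the assumed binomial model."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def joint_prob(A, B):
    """P(A x B) = sum over theta in B of P_{x|theta}(A) * Pi({theta})."""
    return sum(
        prior[theta] * sum(binom_pmf(x, n, theta) for x in A)
        for theta in B
    )

all_x = range(n + 1)
all_theta = list(prior)

total = joint_prob(all_x, all_theta)       # mass of the whole product space
marginal_theta = joint_prob(all_x, [0.5])  # recovers the prior weight of 0.5
```

Taking $A = \mathcal{X}$ recovers the prior, as the defining formula suggests it should.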
When $P_{x|\theta}$ has a p.d.f., Theorem 4.1 provides a formula for the p.d.f. of the posterior distribution.
Theorem 4.1 (Bayes formula). Assume $\mathcal{P} = \{P_{x|\theta} : \theta \in \Theta\}$ is dominated by a $\sigma$-finite measure $\nu$ and $f_\theta(x) = dP_{x|\theta}/d\nu$ is a Borel function on $(\mathcal{X} \times \Theta, \sigma(\mathcal{B}_{\mathcal{X}} \times \mathcal{B}_\Theta))$. Let $\Pi$ be a prior distribution on $\Theta$. Suppose that $m(x) = \int_\Theta f_\theta(x)\, d\Pi > 0$.
(i) The posterior distribution $P_{\theta|x} \ll \Pi$ and
$$\frac{dP_{\theta|x}}{d\Pi} = \frac{f_\theta(x)}{m(x)}.$$
(ii) If $\Pi \ll \lambda$ and $d\Pi/d\lambda = \pi(\theta)$ for a $\sigma$-finite measure $\lambda$, then
$$\frac{dP_{\theta|x}}{d\lambda} = \frac{f_\theta(x)\,\pi(\theta)}{m(x)}.$$
Result (ii) follows from result (i) and Proposition 1.7(iii).
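Result (ii) can be checked numerically in a case where the answer is known in closed form. The sketch below assumes a single observation $X \mid \theta \sim N(\theta, 1)$ with prior $\theta \sim N(0, 1)$ and $\lambda$ the Lebesgue measure, so the posterior should be $N(x/2, 1/2)$ and $m(x)$ should be the $N(0, 2)$ density. The observed value `x = 1.2`, the grid range, and the step size are illustrative assumptions, and the integrals are approximated by Riemann sums.

```python
from math import exp, pi, sqrt

def normal_pdf(z, mean, var):
    """Density of N(mean, var) evaluated at z."""
    return exp(-(z - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)

# Assumed example: X | theta ~ N(theta, 1), prior theta ~ N(0, 1).
x = 1.2                                      # hypothetical observed value
h = 0.001                                    # grid step for theta
grid = [-10 + i * h for i in range(20001)]   # theta in [-10, 10]

lik = [normal_pdf(x, t, 1.0) for t in grid]  # f_theta(x)
pri = [normal_pdf(t, 0.0, 1.0) for t in grid]  # pi(theta)

# m(x) = integral of f_theta(x) * pi(theta) d(theta), by a Riemann sum
m_x = sum(l * p for l, p in zip(lik, pri)) * h

# Posterior density f_theta(x) * pi(theta) / m(x) on the grid
posterior = [l * p / m_x for l, p in zip(lik, pri)]
post_mean = sum(t * d for t, d in zip(grid, posterior)) * h
```

Under these assumptions `post_mean` should be close to `x / 2` and `m_x` close to `normal_pdf(x, 0.0, 2.0)`, up to discretization error.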
Proof of Theorem 4.1.
$$\int_{\mathcal{X}} m(x)\, d\nu = \int_{\mathcal{X}} \int_\Theta f_\theta(x)\, d\Pi\, d\nu = \int_\Theta \int_{\mathcal{X}} f_\theta(x)\, d\nu\, d\Pi = 1.$$
The second equality follows from Fubini's theorem. Hence $m(x)$ is integrable w.r.t. $\nu$ and $m(x) < \infty$ a.e. $\nu$.
Without loss of generality we may assume $m(x) > 0$. If $m(x) = 0$ for an $x \in \mathcal{X}$, then $f_\theta(x) = 0$ a.s. $\Pi$. Either $x$ should be eliminated from $\mathcal{X}$ or the prior $\Pi$ is incorrect and a new prior should be specified.
For $x \in \mathcal{X}$ with $m(x) < \infty$, define
$$P(B, x) = \frac{1}{m(x)} \int_B f_\theta(x)\, d\Pi, \qquad B \in \mathcal{B}_\Theta.$$
Then $P(\cdot, x)$ is a probability measure on $\Theta$ a.e. $\nu$.
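Both facts used so far, that $m$ integrates to 1 and that $P(\cdot, x)$ has total mass 1, can be verified exactly in a discrete setting where $\nu$ is the counting measure. This is a minimal sketch with an assumed binomial model (`n = 4`) and an assumed three-point prior; exact rational arithmetic avoids rounding error.

```python
from fractions import Fraction
from math import comb

# Hypothetical setup: X | theta ~ Binomial(n, theta), nu = counting measure,
# and a three-point prior Pi on theta.
n = 4
prior = {
    Fraction(1, 4): Fraction(1, 2),
    Fraction(1, 2): Fraction(1, 4),
    Fraction(3, 4): Fraction(1, 4),
}

def f(theta, x):
    """f_theta(x): binomial p.m.f., the density w.r.t. counting measure."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

def m(x):
    """m(x) = integral of f_theta(x) dPi, here a finite sum."""
    return sum(f(theta, x) * w for theta, w in prior.items())

def P(B, x):
    """P(B, x) = (1 / m(x)) * integral over B of f_theta(x) dPi."""
    return sum(f(theta, x) * prior[theta] for theta in B) / m(x)

total_mass = sum(m(x) for x in range(n + 1))                # integral of m d(nu)
posterior_totals = [P(list(prior), x) for x in range(n + 1)]  # P(Theta, x)
```

In this example `total_mass` equals 1 and every entry of `posterior_totals` equals 1, matching the two claims above.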
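By Fubini's theorem, $P(B, \cdot)$ is a measurable function of $x$. Let $P_{x,\theta}$ denote the "joint" distribution of $(X, \tilde\theta)$. For any $A \in \sigma(X)$,
$$\begin{aligned}
\int_{A \times \Theta} I_B(\theta)\, dP_{x,\theta}
&= \int_A \int_B f_\theta(x)\, d\Pi\, d\nu \\
&= \int_A \left[ \int_B \frac{f_\theta(x)}{m(x)}\, d\Pi \right] \left[ \int_\Theta f_\theta(x)\, d\Pi \right] d\nu \\
&= \int_\Theta \int_A \left[ \int_B \frac{f_\theta(x)}{m(x)}\, d\Pi \right] f_\theta(x)\, d\nu\, d\Pi \\
&= \int_{A \times \Theta} P(B, x)\, dP_{x,\theta},
\end{aligned}$$
where the third equality follows from Fubini's theorem. This completes the proof.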
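Bayes formula in the discrete case:
$$P(\tilde\theta = \theta \mid X = x) = \frac{P(X = x \mid \tilde\theta = \theta)\, P(\tilde\theta = \theta)}{\sum_{\theta \in \Theta} P(X = x \mid \tilde\theta = \theta)\, P(\tilde\theta = \theta)}$$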
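The discrete formula translates directly into a few lines of code. The sketch below uses a made-up example: $\theta$ labels a coin as `"fair"` or `"biased"`, each with prior probability 1/2, and the observation is a single toss landing heads, with assumed probabilities of heads 1/2 and 9/10. The function name `discrete_posterior` and all numbers are illustrative.

```python
from fractions import Fraction

def discrete_posterior(prior, likelihood):
    """Apply the discrete Bayes formula.

    prior[theta]      = P(theta~ = theta)
    likelihood[theta] = P(X = x | theta~ = theta) for the observed x
    """
    numerators = {t: likelihood[t] * prior[t] for t in prior}
    denominator = sum(numerators.values())  # sum over all theta in Theta
    return {t: num / denominator for t, num in numerators.items()}

# Hypothetical example: which coin produced the observed head?
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}

posterior = discrete_posterior(prior, likelihood_heads)
```

With these inputs the posterior probability of the biased coin rises from 1/2 to 9/14, and the fair coin falls to 5/14.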
The posterior $P_{\theta|x}$ contains all the information we have about $\theta$.
Statistical decisions and inference should be made based on $P_{\theta|x}$, conditional on the observed $X = x$.
In estimating $\theta$, $P_{\theta|x}$ can be viewed as a randomized decision rule under the approach discussed in §2.
After $X = x$ is observed, $P_{\theta|x}$ is a randomized rule, which is a probability distribution on the action space $\mathcal{A} = \Theta$.
The Bayesian method can be applied iteratively.
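One way to read the remark on iterative application: the posterior after one observation can serve as the prior for the next. The sketch below illustrates this under the assumption that the observations are conditionally independent Bernoulli($\theta$) draws given $\theta$, with an illustrative three-point uniform prior and a made-up data sequence. It compares updating one observation at a time with a single update on all the data.

```python
from fractions import Fraction

# Hypothetical setup: theta in {1/4, 1/2, 3/4} with a uniform prior, and
# observations that are conditionally i.i.d. Bernoulli(theta) given theta.
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {t: Fraction(1, 3) for t in thetas}
data = [1, 0, 1, 1]  # illustrative observations

def likelihood(theta, x):
    """P(X = x | theta) for a single Bernoulli observation."""
    return theta if x == 1 else 1 - theta

def update(current, x):
    """One Bayes step: treat `current` as the prior and observe x."""
    weights = {t: likelihood(t, x) * p for t, p in current.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

# Iterative use: yesterday's posterior is today's prior.
sequential = prior
for x in data:
    sequential = update(sequential, x)

# Single update using the likelihood of the whole sample.
def joint_likelihood(theta):
    out = Fraction(1)
    for x in data:
        out *= likelihood(theta, x)
    return out

weights = {t: joint_likelihood(t) * prior[t] for t in thetas}
total = sum(weights.values())
batch = {t: w / total for t, w in weights.items()}
```

In this example `sequential` and `batch` agree exactly, which is what conditional independence of the observations leads one to expect.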