Statistical Models - Lecture Notes - Mathematics, Study notes of Probability and Statistics

Statistical Models - Lecture Notes - Mathematics, Probability, Statistical Models, parametric model, Statistics, Sufficiency, Sufficient Statistics, Sufficient Partitions, The Factorization Theorem, Minimal Sufficient Statistic

Typology: Study notes

2010/2011

Uploaded on 11/03/2011

sergeybrin

Lecture Notes 5

1 Statistical Models

A statistical model $\mathcal{P}$ is a collection of probability distributions (or a collection of densities). An example of a nonparametric model is

\[ \mathcal{P} = \left\{ p : \int (p''(x))^2 \, dx < \infty \right\}. \]

A parametric model has the form

\[ \mathcal{P} = \left\{ p(x; \theta) : \theta \in \Theta \right\} \]

where $\Theta \subset \mathbb{R}^d$. An example is the set of Normal densities $\{p(x; \theta) = (2\pi)^{-1/2} e^{-(x-\theta)^2/2}\}$. For now, we focus on parametric models. The model comes from assumptions. Some examples:

  • Time until something fails is often modeled by an exponential distribution.
  • Number of rare events is often modeled by a Poisson distribution.
  • Lengths and weights are often modeled by a Normal distribution.

These models are not correct. But they might be useful. Later we consider nonparametric methods that do not assume a parametric model.

2 Statistics

Let $X_1, \ldots, X_n \sim p(x; \theta)$. Let $X^n \equiv (X_1, \ldots, X_n)$. Any function $T = T(X_1, \ldots, X_n)$ is itself a random variable which we will call a statistic. Some examples are:

  • order statistics: $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$,
  • sample mean: $\bar{X} = \frac{1}{n} \sum_i X_i$,
  • sample variance: $S^2 = \frac{1}{n-1} \sum_i (X_i - \bar{X})^2$,
  • sample median: middle value of the order statistics,
  • sample minimum: $X_{(1)}$,
  • sample maximum: $X_{(n)}$,
  • sample range: $X_{(n)} - X_{(1)}$,
  • sample interquartile range: $X_{(.75n)} - X_{(.25n)}$.
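
Most of the statistics in this list can be computed directly from a sample. The following is a minimal Python sketch; the function name `summary_statistics` and the example data are illustrative assumptions, not part of the notes. The interquartile range is left out because the notes do not say how the index $.75n$ is rounded when it is not an integer.

```python
import statistics

def summary_statistics(xs):
    """Return several statistics of a sample xs (a list of at least two numbers)."""
    n = len(xs)
    order = sorted(xs)                       # order statistics X_(1) <= ... <= X_(n)
    mean = sum(xs) / n                       # sample mean
    # sample variance with the 1/(n-1) divisor
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return {
        "order": order,
        "mean": mean,
        "variance": var,
        "median": statistics.median(xs),     # middle value of the order statistics
        "min": order[0],
        "max": order[-1],
        "range": order[-1] - order[0],
    }

# Hypothetical sample, chosen only for illustration.
stats = summary_statistics([3.0, 1.0, 4.0, 1.0, 5.0])
```

Each entry of the returned dictionary is a function of the data alone, which is what makes it a statistic.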

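Example 1 If $X_1, \ldots, X_n \sim \Gamma(\alpha, \beta)$, then $\bar{X} \sim \Gamma(n\alpha, \beta/n)$. Proof:

\[
M_{\bar{X}}(t) = E[e^{t\bar{X}}] = E[e^{\sum_i X_i t/n}] = \prod_i E[e^{X_i (t/n)}]
= [M_X(t/n)]^n = \left[ \left( \frac{1}{1 - \beta t/n} \right)^{\alpha} \right]^n = \left[ \frac{1}{1 - (\beta/n) t} \right]^{n\alpha}.
\]

This is the mgf of $\Gamma(n\alpha, \beta/n)$.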
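
The mgf identity in Example 1 can be spot-checked numerically. This is a small sketch, assuming the scale parameterization used in the proof, where the mgf of $\Gamma(\alpha, \beta)$ is $(1 - \beta t)^{-\alpha}$ for $t < 1/\beta$. The function names and parameter values are illustrative choices.

```python
def gamma_mgf(t, alpha, beta):
    """mgf of Gamma(alpha, beta) with scale beta: (1 - beta*t)^(-alpha), for t < 1/beta."""
    if beta * t >= 1:
        raise ValueError("mgf is finite only for t < 1/beta")
    return (1.0 - beta * t) ** (-alpha)

def mean_mgf(t, n, alpha, beta):
    """mgf of the mean of n independent Gamma(alpha, beta) draws: [M_X(t/n)]^n."""
    return gamma_mgf(t / n, alpha, beta) ** n

# Illustrative parameter values (assumptions, not taken from the notes).
alpha, beta, n = 2.0, 1.5, 5
for t in (-1.0, 0.0, 0.3, 1.0):
    lhs = mean_mgf(t, n, alpha, beta)              # [M_X(t/n)]^n
    rhs = gamma_mgf(t, n * alpha, beta / n)        # mgf of Gamma(n*alpha, beta/n)
    assert abs(lhs - rhs) <= 1e-9 * max(1.0, abs(rhs))
```

Agreement at a few values of $t$ is only a sanity check of the algebra; the proof itself rests on the uniqueness of the mgf.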

Example 2 If $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N(\mu, \sigma^2/n)$.

Example 3 If $X_1, \ldots, X_n$ are iid Cauchy(0,1),

\[ p(x) = \frac{1}{\pi (1 + x^2)} \]

for $x \in \mathbb{R}$, then $\bar{X} \sim$ Cauchy(0,1).

Example 4 If $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ then

\[ \frac{(n-1)}{\sigma^2} S^2 \sim \chi^2_{n-1}. \]

The proof is based on the mgf.

3.1 Sufficient Statistics

Definition: $T$ is sufficient for $\theta$ if the conditional distribution of $X^n \mid T$ does not depend on $\theta$. Thus, $f(x_1, \ldots, x_n \mid t; \theta) = f(x_1, \ldots, x_n \mid t)$.

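Example 6 $X_1, \ldots, X_n \sim$ Poisson($\theta$). Let $T = \sum_{i=1}^n X_i$. Then,

\[
p_{X^n \mid T}(x^n \mid t) = P(X^n = x^n \mid T(X^n) = t) = \frac{P(X^n = x^n \text{ and } T = t)}{P(T = t)}.
\]

But

\[
P(X^n = x^n \text{ and } T = t) =
\begin{cases}
0 & \text{if } T(x^n) \neq t \\
P(X^n = x^n) & \text{if } T(x^n) = t.
\end{cases}
\]

Hence,

\[
P(X^n = x^n) = \prod_{i=1}^n \frac{e^{-\theta} \theta^{x_i}}{x_i!} = \frac{e^{-n\theta} \theta^{\sum x_i}}{\prod (x_i!)} = \frac{e^{-n\theta} \theta^{t}}{\prod (x_i!)}.
\]

Now, $T(x^n) = \sum x_i = t$ and so

\[
P(T = t) = \frac{e^{-n\theta} (n\theta)^t}{t!}
\]

since $T \sim$ Poisson($n\theta$). Thus,

\[
\frac{P(X^n = x^n)}{P(T = t)} = \frac{t!}{\left( \prod x_i! \right) n^t}
\]

which does not depend on $\theta$. So $T = \sum_i X_i$ is a sufficient statistic for $\theta$. Other sufficient statistics are: $T = 3.7 \sum_i X_i$, $T = (\sum_i X_i, X_4)$, and $T(X_1, \ldots, X_n) = (X_1, \ldots, X_n)$.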
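
The conclusion of Example 6 can be checked numerically: the ratio $P(X^n = x^n)/P(T = t)$ should give the same number for every $\theta$. The sketch below assumes nothing beyond the Poisson pmf; the function names and the sample `xs` are hypothetical.

```python
from math import exp, factorial, prod

def poisson_pmf(k, rate):
    """P(K = k) for K ~ Poisson(rate)."""
    return exp(-rate) * rate ** k / factorial(k)

def conditional_given_sum(xs, theta):
    """P(X^n = xs | T = sum(xs)) for iid Poisson(theta), as a ratio of pmfs."""
    n, t = len(xs), sum(xs)
    joint = prod(poisson_pmf(x, theta) for x in xs)   # P(X^n = x^n)
    return joint / poisson_pmf(t, n * theta)          # T ~ Poisson(n * theta)

def closed_form(xs):
    """t! / (prod(x_i!) * n^t), the theta-free expression derived in the example."""
    n, t = len(xs), sum(xs)
    return factorial(t) / (prod(factorial(x) for x in xs) * n ** t)

xs = [2, 0, 3, 1]   # hypothetical observed sample
values = [conditional_given_sum(xs, theta) for theta in (0.5, 1.0, 4.0)]
```

Every entry of `values` agrees with `closed_form(xs)` up to floating-point error, whichever $\theta$ was used.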

3.2 Sufficient Partitions

It is better to describe sufficiency in terms of partitions of the sample space.

Example 7 Let $X_1, X_2, X_3 \sim$ Bernoulli($\theta$). Let $T = \sum_i X_i$.

  x^n             t        p(x|t)
  (0, 0, 0)   →   t = 0    1
  (0, 0, 1)   →   t = 1    1/3
  (0, 1, 0)   →   t = 1    1/3
  (1, 0, 0)   →   t = 1    1/3
  (0, 1, 1)   →   t = 2    1/3
  (1, 0, 1)   →   t = 2    1/3
  (1, 1, 0)   →   t = 2    1/3
  (1, 1, 1)   →   t = 3    1

8 elements → 4 elements
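
The conditional probabilities in Example 7 can be reproduced by enumerating the eight sample points. This is a sketch using exact rational arithmetic; `conditional_table` and the two values of $\theta$ are illustrative assumptions.

```python
from itertools import product
from fractions import Fraction

def conditional_table(theta, n=3):
    """Map each x in {0,1}^n to (t, P(X = x | T = t)) for iid Bernoulli(theta), T = sum."""
    points = list(product((0, 1), repeat=n))
    # joint probability of each sample point
    prob = {x: theta ** sum(x) * (1 - theta) ** (n - sum(x)) for x in points}
    # marginal probability of each value t of T
    p_t = {}
    for x, p in prob.items():
        p_t[sum(x)] = p_t.get(sum(x), 0) + p
    return {x: (sum(x), prob[x] / p_t[sum(x)]) for x in points}

# Two arbitrary parameter values; the tables should coincide if T is sufficient.
table_a = conditional_table(Fraction(1, 4))
table_b = conditional_table(Fraction(2, 3))
```

Both tables are identical, so the conditional distribution given $T$ does not change with $\theta$ for these two values.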

  1. A partition $B_1, \ldots, B_k$ is sufficient if $f(x \mid X \in B)$ does not depend on $\theta$.
  2. A statistic $T$ induces a partition. For each $t$, $\{x : T(x) = t\}$ is one element of the partition. $T$ is sufficient if and only if the partition is sufficient.
  3. Two statistics can generate the same partition: for example, $\sum_i X_i$ and $3 \sum_i X_i$.
  4. If we split any element Bi of a sufficient partition into smaller pieces, we get another sufficient partition.
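
Points 2 to 4 can be made concrete by computing the partition a statistic induces on a finite sample space. The sketch below uses the three-coordinate Bernoulli sample space; the helper name `induced_partition` is a hypothetical choice.

```python
from itertools import product

def induced_partition(statistic, points):
    """Group sample points by the value of statistic; return the blocks as a set of frozensets."""
    blocks = {}
    for x in points:
        blocks.setdefault(statistic(x), set()).add(x)
    return {frozenset(b) for b in blocks.values()}

points = list(product((0, 1), repeat=3))

by_sum = induced_partition(sum, points)                           # T = sum of X_i
by_triple_sum = induced_partition(lambda x: 3 * sum(x), points)   # T = 3 * sum of X_i
by_identity = induced_partition(lambda x: x, points)              # T = (X_1, X_2, X_3)

# Each block of the finer partition sits inside one block of the coarser one.
refines = all(any(fine <= coarse for coarse in by_sum) for fine in by_identity)
```

Here `by_sum` and `by_triple_sum` are the same set of blocks, while `by_identity` splits those blocks into single points.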

Example 8 Let $X_1, X_2, X_3 \sim$ Bernoulli($\theta$). Then $T = X_1$ is not sufficient. Look at its partition:

3.4 Minimal Sufficient Statistics (MSS)

We want the greatest reduction in dimension.

Example 12 $X_1, \ldots, X_n \sim N(0, \sigma^2)$. Some sufficient statistics are:

\[ T(X_1, \ldots, X_n) = (X_1, \ldots, X_n) \]
\[ T(X_1, \ldots, X_n) = (X_1^2, \ldots, X_n^2) \]
\[ T(X_1, \ldots, X_n) = \left( \sum_{i=1}^{m} X_i^2, \ \sum_{i=m+1}^{n} X_i^2 \right) \]
\[ T(X_1, \ldots, X_n) = \sum_{i=1}^{n} X_i^2. \]

Definition: T is a Minimal Sufficient Statistic if the following two statements are true:

  1. T is sufficient and
  2. If $U$ is any other sufficient statistic then $T = g(U)$ for some function $g$. In other words, $T$ generates the coarsest sufficient partition.

Suppose $U$ is sufficient. Suppose $T = H(U)$ is also sufficient. $T$ provides greater reduction than $U$ unless $H$ is a one-to-one transformation, in which case $T$ and $U$ are equivalent.

Example 13 $X \sim N(0, \sigma^2)$. $X$ is sufficient. $|X|$ is sufficient. $|X|$ is MSS. So are $X^2$, $X^4$, $e^{X^2}$.

Example 14 Let $X_1, X_2, X_3 \sim$ Bernoulli($\theta$). Let $T = \sum_i X_i$.

  x^n             t        p(x|t)    u          p(x|u)
  (0, 0, 0)   →   t = 0    1         u = 0      1
  (0, 0, 1)   →   t = 1    1/3       u = 1      1/3
  (0, 1, 0)   →   t = 1    1/3       u = 1      1/3
  (1, 0, 0)   →   t = 1    1/3       u = 1      1/3
  (0, 1, 1)   →   t = 2    1/3       u = 73     1/2
  (1, 0, 1)   →   t = 2    1/3       u = 73     1/2
  (1, 1, 0)   →   t = 2    1/3       u = 91     1
  (1, 1, 1)   →   t = 3    1         u = 103    1

Note that U and T are both sufficient but U is not minimal.
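
The relation $T = g(U)$ in the definition can be tested mechanically on a finite sample space: $T$ is a function of $U$ exactly when equal $u$ values never come with different $t$ values. The sketch below copies the $u$ values from the table in Example 14; the helper name `is_function_of` is an illustrative assumption.

```python
from itertools import product

def is_function_of(t_stat, u_stat, points):
    """True if t_stat = g(u_stat) for some g, i.e. equal u values always give equal t values."""
    seen = {}
    for x in points:
        u, t = u_stat(x), t_stat(x)
        if seen.setdefault(u, t) != t:
            return False
    return True

points = list(product((0, 1), repeat=3))

# u values copied from the table in Example 14
U_VALUES = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1,
    (0, 1, 1): 73, (1, 0, 1): 73, (1, 1, 0): 91, (1, 1, 1): 103,
}

def T(x):
    return sum(x)

def U(x):
    return U_VALUES[x]

t_from_u = is_function_of(T, U, points)   # can T be computed from U?
u_from_t = is_function_of(U, T, points)   # can U be computed from T?
```

$T$ can be recovered from $U$ but not the other way around, because $t = 2$ corresponds to both $u = 73$ and $u = 91$. This matches the statement that $U$ is sufficient but not minimal.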

3.5 How to find a Minimal Sufficient Statistic

Theorem 15 Define

\[ R(x^n, y^n; \theta) = \frac{p(y^n; \theta)}{p(x^n; \theta)}. \]

Suppose that $T$ has the following property:

$R(x^n, y^n; \theta)$ does not depend on $\theta$ if and only if $T(y^n) = T(x^n)$.

Then T is a MSS.

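Example 16 $Y_1, \ldots, Y_n$ iid Poisson($\theta$).

\[
p(y^n; \theta) = \frac{e^{-n\theta} \theta^{\sum y_i}}{\prod y_i!}, \qquad
\frac{p(y^n; \theta)}{p(x^n; \theta)} = \frac{\theta^{\sum y_i - \sum x_i}}{\prod y_i! / \prod x_i!}
\]

which is independent of $\theta$ iff $\sum y_i = \sum x_i$. This implies that $T(Y^n) = \sum Y_i$ is a minimal sufficient statistic for $\theta$.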
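
The criterion of Theorem 15 can be explored numerically for the Poisson case by evaluating the ratio $R$ on a small grid of $\theta$ values. This is a sketch, not a proof: the grid, the tolerance, and the sample points are arbitrary assumptions.

```python
from math import exp, factorial, prod

def poisson_joint(ys, theta):
    """Joint pmf of iid Poisson(theta) observations ys."""
    return prod(exp(-theta) * theta ** y / factorial(y) for y in ys)

def ratio(xs, ys, theta):
    """R(x^n, y^n; theta) = p(y^n; theta) / p(x^n; theta)."""
    return poisson_joint(ys, theta) / poisson_joint(xs, theta)

def ratio_is_constant(xs, ys, thetas=(0.5, 1.0, 2.0, 5.0), tol=1e-9):
    """Numerical check over a small grid of theta values (a check, not a proof)."""
    values = [ratio(xs, ys, th) for th in thetas]
    return all(abs(v - values[0]) <= tol * abs(values[0]) for v in values)

same_sum = ratio_is_constant([1, 2, 3], [0, 0, 6])   # both samples sum to 6
diff_sum = ratio_is_constant([1, 2, 3], [1, 2, 4])   # sums are 6 and 7
```

When the sums agree the ratio reduces to $\prod x_i! / \prod y_i!$ and stays fixed across the grid. When they differ a factor $\theta^{\sum y_i - \sum x_i}$ remains, so the ratio changes with $\theta$.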

The minimal sufficient statistic is not unique. But, the minimal sufficient partition is unique.