Statistics: Populations, Samples, Models, and Data Analysis, Study notes of Mathematical Statistics

University of Wisconsin (UW) - Madison Mathematical Statistics

An introduction to statistics, including the concepts of populations, samples, models, and data analysis. It covers various statistical methods such as descriptive statistics, parametric and nonparametric families, and semi-parametric models. The document also discusses the importance of understanding the distribution of data and transformations to find approximations of distributions.

Typology: Study notes

Pre 2010

Uploaded on 09/02/2009

koofers-user-51g-1 🇺🇸


Leture 17: Populations, samples, models, and statistis

One or a series of random experiments is performed.

Some data from the experiment(s) are collected.

Planning experiments and collecting data (not discussed in the textbook).

Data analysis: extract information from the data, interpret the results, and draw some conclusions.

A descriptive data analysis: summary measures of the data, such as the mean, median, range, standard deviation, etc., and some graphical displays, such as the histogram and box-and-whisker diagram, etc.

It is simple and requires almost no assumptions, but may not allow us to gain enough insight into the problem.
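The summary measures named above can be computed in a few lines. The following is a minimal sketch using Python's standard `statistics` module; the data values and the helper name `describe` are made up for illustration and do not come from the notes.

```python
import statistics

def describe(data):
    """Return the descriptive summary measures listed in the notes."""
    xs = sorted(data)
    return {
        "n": len(xs),                       # sample size
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "range": xs[-1] - xs[0],            # largest minus smallest value
        "sd": statistics.stdev(xs),         # sample standard deviation (divisor n - 1)
    }

# Hypothetical data set, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]
summary = describe(sample)
for name, value in summary.items():
    print(name, value)
```

None of these numbers requires a model for the population, which is why a descriptive analysis needs almost no assumptions.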

We fous on more sophistiated methods of analyzing data:

statistial inferene

and

deision

theory

.

The data set is a realization of a random element defined on a probability space $(\Omega, \mathcal{F}, P)$; $P$ is called the population.

The data set or the random element that produces the data is called a sample from $P$.

The size of the data set is called the sample size.

A population $P$ is known if and only if $P(A)$ is a known value for every event $A \in \mathcal{F}$.

In a statistical problem, the population $P$ is at least partially unknown.

We would like to deduce some properties of $P$ based on the available sample.

Examples 2.1-2.3

A statistical model (a set of assumptions) on the population $P$ in a given problem is often postulated to make the analysis possible or easy.

Although testing the correctness of postulated models is part of statistical inference and decision theory, postulated models are often based on knowledge of the problem under consideration.

Definition 2.1. A set of probability measures $P_\theta$ on $(\Omega, \mathcal{F})$ indexed by a parameter $\theta \in \Theta$ is said to be a parametric family if and only if $\Theta \subset \mathcal{R}^d$ for some fixed positive integer $d$ and each $P_\theta$ is a known probability measure when $\theta$ is known. The set $\Theta$ is called the parameter space and $d$ is called its dimension.

Parametric model: the population $P$ is in a parametric family $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$.

$\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ is identifiable if and only if $\theta_1 \neq \theta_2$ and $\theta_i \in \Theta$ imply $P_{\theta_1} \neq P_{\theta_2}$.

In most cases an identifiable parametric family can be obtained through reparameterization.

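A family of populations $\mathcal{P}$ is dominated by $\nu$ (a $\sigma$-finite measure) if $P \ll \nu$ for all $P \in \mathcal{P}$.

$\mathcal{P}$ can be identified by the family of densities $\{ \frac{dP}{d\nu} : P \in \mathcal{P} \}$ or $\{ \frac{dP_\theta}{d\nu} : \theta \in \Theta \}$.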

Parametric methods: methods designed for parametric models


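Example (The $k$-dimensional normal family).

$$\mathcal{P} = \{ N_k(\mu, \Sigma) : \mu \in \mathcal{R}^k,\ \Sigma \in \mathcal{M}_k \},$$

where $\mathcal{M}_k$ is a collection of $k \times k$ symmetric positive definite matrices. This family is dominated by the Lebesgue measure on $\mathcal{R}^k$. When $k = 1$, $\mathcal{P} = \{ N(\mu, \sigma^2) : \mu \in \mathcal{R},\ \sigma^2 > 0 \}$.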
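Membership in the parameter space of the normal family can be checked numerically. The sketch below assumes that $\mathcal{M}_k$ is the collection of all $k \times k$ symmetric positive definite matrices (the notes only say "a collection") and tests positive definiteness by attempting a Cholesky factorization; the function names are hypothetical.

```python
import math

def is_symmetric_positive_definite(m, tol=1e-12):
    """Check symmetry, then attempt a Cholesky factorization m = L L^T."""
    k = len(m)
    if any(len(row) != k for row in m):
        return False
    for i in range(k):
        for j in range(i):
            if abs(m[i][j] - m[j][i]) > tol:
                return False
    lower = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(lower[i][p] * lower[j][p] for p in range(j))
            if i == j:
                d = m[i][i] - s
                if d <= tol:          # a non-positive pivot: not positive definite
                    return False
                lower[i][i] = math.sqrt(d)
            else:
                lower[i][j] = (m[i][j] - s) / lower[j][j]
    return True

def in_parameter_space(mu, sigma):
    """True if (mu, sigma) indexes a member of the k-dimensional normal family."""
    return len(mu) == len(sigma) and is_symmetric_positive_definite(sigma)

print(in_parameter_space([0.0, 1.0], [[2.0, 0.5], [0.5, 1.0]]))  # valid covariance
print(in_parameter_space([0.0, 1.0], [[1.0, 2.0], [2.0, 1.0]]))  # symmetric, not positive definite
```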

Nonparametric family: $\mathcal{P}$ is not parametric according to Definition 2.1. A nonparametric model: the population $P$ is in a given nonparametric family.

Examples of nonparametric families on $(\mathcal{R}^k, \mathcal{B}^k)$:
(1) The joint c.d.f.'s are continuous.
(2) The joint c.d.f.'s have finite moments of order $\le$ a fixed integer.
(3) The joint c.d.f.'s have p.d.f.'s (e.g., Lebesgue p.d.f.'s).
(4) $k = 1$ and the c.d.f.'s are symmetric.
(5) The family of all probability measures on $(\mathcal{R}^k, \mathcal{B}^k)$.

Nonparametric methods: methods designed for nonparametric models

Semi-parametric models and methods

Statistics and their distributions

Our data set is a realization of a sample (random vector) $X$ from an unknown population $P$. Statistic $T(X)$: a measurable function $T$ of $X$; $T(X)$ is a known value whenever $X$ is known. Statistical analyses are based on various statistics, for various purposes. $X$ itself is a statistic, but it is a trivial statistic. The range of a nontrivial statistic $T(X)$ is usually simpler than that of $X$. For example, $X$ may be a random $n$-vector and $T(X)$ may be a random $p$-vector with a $p$ much smaller than $n$.

$\sigma(T(X)) \subset \sigma(X)$ and the two $\sigma$-fields are the same if and only if $T$ is one-to-one. Usually $\sigma(T(X))$ simplifies $\sigma(X)$, i.e., a statistic provides a "reduction" of the $\sigma$-field.

The "information" within the statistic $T(X)$ concerning the unknown distribution of $X$ is contained in the $\sigma$-field $\sigma(T(X))$. To see this, let $S$ be any other statistic for which $\sigma(S(X)) = \sigma(T(X))$. Then, by Lemma 1.2, $S$ is a measurable function of $T$, and $T$ is a measurable function of $S$. Thus, once the value of $S$ (or $T$) is known, so is the value of $T$ (or $S$). It is not the particular values of a statistic that contain the information, but the generated $\sigma$-field of the statistic. Values of a statistic may be important for other reasons.
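A small sketch may make the point about equivalent statistics concrete. The two statistics below are chosen for illustration and are not from the notes: $T(X) = (\sum X_i, \sum X_i^2)$ and $S(X) = (\bar{X}, S^2)$. Both reduce an $n$-vector to a 2-vector, and for fixed $n > 1$ each is a function of the other, so knowing one value determines the other.

```python
import math

def t_stat(x):
    """T(X) = (sum of the data, sum of squares)."""
    return (sum(x), sum(v * v for v in x))

def s_stat(x):
    """S(X) = (sample mean, sample variance with divisor n - 1)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    return (mean, var)

def s_from_t(t, n):
    """Recover S(X) from T(X) alone (n is assumed known)."""
    total, total_sq = t
    mean = total / n
    return (mean, (total_sq - n * mean * mean) / (n - 1))

def t_from_s(s, n):
    """Recover T(X) from S(X) alone (n is assumed known)."""
    mean, var = s
    return (n * mean, (n - 1) * var + n * mean * mean)

x = [1.0, 2.0, 4.0, 7.0]          # hypothetical data
t_value, s_value = t_stat(x), s_stat(x)
print(t_value, s_value)
print(s_from_t(t_value, len(x)), t_from_s(s_value, len(x)))
```

The numerical values of $T$ and $S$ differ, but neither carries anything the other lacks, which is the sense in which the information sits in the generated $\sigma$-field rather than in the values themselves.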

A statistic $T(X)$ is a random element. If the distribution of $X$ is unknown, then the distribution of $T$ may also be unknown, although $T$ is a known function. Finding the form of the distribution of $T$ is one of the major problems in statistical inference and decision theory.
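When the population is specified, the distribution of a statistic can at least be approximated by simulation. The sketch below is only an illustration: the two populations (Uniform(0, 1) and exponential with rate 1), the choice of the sample median as $T$, and the helper name `simulate_statistic` are all assumptions made for this example. It shows that the same known function $T$ has different distributions under different populations.

```python
import random
import statistics

def simulate_statistic(stat, sampler, n, reps, seed=0):
    """Draw `reps` samples of size n and return the simulated values of stat."""
    rng = random.Random(seed)
    return [stat([sampler(rng) for _ in range(n)]) for _ in range(reps)]

# The same statistic (sample median, n = 5) under two hypothetical populations.
medians_uniform = simulate_statistic(
    statistics.median, lambda rng: rng.random(), n=5, reps=4000
)
medians_expon = simulate_statistic(
    statistics.median, lambda rng: rng.expovariate(1.0), n=5, reps=4000
)

print(statistics.mean(medians_uniform), statistics.stdev(medians_uniform))
print(statistics.mean(medians_expon), statistics.stdev(medians_expon))
```

In a real problem the population is unknown, so a simulation of this kind is available only under a postulated model.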

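$Y_1, ..., Y_n$ are i.i.d. random 2-vectors with $E Y_1 = (0, \sigma^2)$ and variance-covariance matrix

$$\Sigma = \begin{pmatrix} \sigma^2 & E(X_1 - \mu)^3 \\ E(X_1 - \mu)^3 & E(X_1 - \mu)^4 - \sigma^4 \end{pmatrix}.$$

Note that $\bar{Y} = n^{-1} \sum_{i=1}^n Y_i = (\bar{X} - \mu, \tilde{S}^2)$, where $\tilde{S}^2 = n^{-1} \sum_{i=1}^n (X_i - \mu)^2$.

Applying the CLT (Corollary 1.2) to the $Y_i$'s, we obtain that

$$\sqrt{n}\,(\bar{X} - \mu,\ \tilde{S}^2 - \sigma^2) \to_d N_2(0, \Sigma).$$

Since

$$S^2 = \frac{n}{n-1} \left[ \tilde{S}^2 - (\bar{X} - \mu)^2 \right]$$

and $\bar{X} \to_{a.s.} \mu$ (the SLLN), an application of Slutsky's theorem leads to

$$\sqrt{n}\,(\bar{X} - \mu,\ S^2 - \sigma^2) \to_d N_2(0, \Sigma).$$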
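The joint limit for the sample mean and sample variance can be checked by simulation. The sketch below assumes, purely for illustration, that the $X_i$ are exponential with rate 1, so that $\mu = 1$, $\sigma^2 = 1$, $E(X_1 - \mu)^3 = 2$ and $E(X_1 - \mu)^4 = 9$, which gives limiting covariance entries 1, 2 and $9 - 1 = 8$. The sample size, number of replications and function names are arbitrary choices.

```python
import math
import random

def simulate_scaled_pairs(n, reps, seed=1):
    """Simulate sqrt(n) * (Xbar - mu, S^2 - sigma^2) for exponential(1) data."""
    rng = random.Random(seed)
    mu, sigma2 = 1.0, 1.0                      # mean and variance of exponential(1)
    pairs = []
    for _ in range(reps):
        x = [rng.expovariate(1.0) for _ in range(n)]
        xbar = sum(x) / n
        s2 = sum((v - xbar) ** 2 for v in x) / (n - 1)
        pairs.append((math.sqrt(n) * (xbar - mu), math.sqrt(n) * (s2 - sigma2)))
    return pairs

def covariance_matrix(pairs):
    """Empirical 2 x 2 covariance matrix of a list of pairs."""
    m = len(pairs)
    mean_u = sum(u for u, _ in pairs) / m
    mean_v = sum(v for _, v in pairs) / m
    c_uu = sum((u - mean_u) ** 2 for u, _ in pairs) / (m - 1)
    c_vv = sum((v - mean_v) ** 2 for _, v in pairs) / (m - 1)
    c_uv = sum((u - mean_u) * (v - mean_v) for u, v in pairs) / (m - 1)
    return [[c_uu, c_uv], [c_uv, c_vv]]

empirical = covariance_matrix(simulate_scaled_pairs(n=200, reps=1000))
limit = [[1.0, 2.0], [2.0, 8.0]]               # Sigma for the exponential(1) case
print(empirical)
print(limit)
```

The empirical matrix should be close to the limiting one, up to Monte Carlo error and the finite-sample difference between $n$ and $n - 1$.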

Example 2.9 (Order statistics). Let $X = (X_1, ..., X_n)$ with i.i.d. random components. Let $X_{(i)}$ be the $i$th smallest value of $X_1, ..., X_n$. The statistics $X_{(1)}, ..., X_{(n)}$ are called the order statistics. The order statistics form a set of very useful statistics in addition to the sample mean and variance. Suppose that $X_i$ has a c.d.f. $F$ having a Lebesgue p.d.f. $f$. Then the joint Lebesgue p.d.f. of $X_{(1)}, ..., X_{(n)}$ is

$$g(x_1, x_2, ..., x_n) = \begin{cases} n!\, f(x_1) f(x_2) \cdots f(x_n) & x_1 < x_2 < \cdots < x_n \\ 0 & \text{otherwise.} \end{cases}$$

The joint Lebesgue p.d.f. of $X_{(i)}$ and $X_{(j)}$, $1 \le i < j \le n$, is

$$g_{i,j}(x, y) = \begin{cases} \dfrac{n!\, [F(x)]^{i-1} [F(y) - F(x)]^{j-i-1} [1 - F(y)]^{n-j} f(x) f(y)}{(i-1)!\,(j-i-1)!\,(n-j)!} & x < y \\ 0 & \text{otherwise} \end{cases}$$

and the Lebesgue p.d.f. of $X_{(i)}$ is

$$g_i(x) = \frac{n!}{(i-1)!\,(n-i)!} [F(x)]^{i-1} [1 - F(x)]^{n-i} f(x).$$
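The marginal density of the $i$th order statistic is easy to evaluate numerically. The sketch below assumes a Uniform(0, 1) population, chosen only for illustration, for which $F(x) = x$ and $f(x) = 1$ on $[0, 1]$; it checks by a midpoint rule that $g_i$ integrates to 1 and that its mean equals $i/(n+1)$, the known mean of the corresponding Beta$(i, n-i+1)$ distribution. Function names are hypothetical.

```python
import math

def order_stat_pdf(x, i, n, cdf, pdf):
    """Density of the i-th order statistic: n!/((i-1)!(n-i)!) F^(i-1) (1-F)^(n-i) f."""
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    fx = cdf(x)
    return coef * fx ** (i - 1) * (1.0 - fx) ** (n - i) * pdf(x)

def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)

def uniform_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

n, i = 5, 2
density = lambda x: order_stat_pdf(x, i, n, uniform_cdf, uniform_pdf)
total_mass = midpoint_integral(density, 0.0, 1.0)
mean_value = midpoint_integral(lambda x: x * density(x), 0.0, 1.0)
print(total_mass, mean_value, i / (n + 1))
```

Passing a different `cdf` and `pdf` pair gives the density of the $i$th order statistic for another population with a Lebesgue p.d.f.; the integration limits would then need to cover that population's support.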
