Assessing Normality - Lecture Notes | MATH 6010, Study notes of Mathematics

University of Utah (The U), Mathematics

Material Type: Notes; Class: Linear Models; Subject: Mathematics; University: University of Utah; Term: Fall 2004;

ASSESSING NORMALITY

DAVAR KHOSHNEVISAN

1. Histograms

Consider the linear model $Y = X\beta + \varepsilon$. The pressing question is, “is it true that $\varepsilon \sim N_n(0, \sigma^2 I_n)$”?

To answer this, consider the “residuals,”

$\hat\varepsilon = Y - X\hat\beta.$

If $\varepsilon \sim N_n(0, \sigma^2 I_n)$ then one would like to think that the histogram of the $\hat\varepsilon_i$’s should look like a normal pdf with mean 0 and variance $\sigma^2$ (why?). How close is close? It helps to think more generally.
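One way to approach the “why?” above, as a sketch only: assume (the notes do not say so at this point) that $\hat\beta$ is the least-squares estimator and that $X$ has full column rank, and write $H = X(X'X)^{-1}X'$ for the hat matrix, a symbol introduced here for illustration.

```latex
\begin{align*}
\hat\varepsilon &= Y - X\hat\beta = (I_n - H)Y
                 = (I_n - H)(X\beta + \varepsilon)
                 = (I_n - H)\varepsilon
  && \text{since } HX = X, \\
\hat\varepsilon &\sim N_n\bigl(0,\ \sigma^2 (I_n - H)\bigr)
  && \text{since } (I_n - H)(I_n - H)' = I_n - H, \\
\hat\varepsilon_i &\sim N\bigl(0,\ \sigma^2 (1 - h_{ii})\bigr),
  \qquad h_{ii} = H_{ii}.
\end{align*}
```

Under these assumptions each residual is exactly normal with mean 0, but its variance is $\sigma^2(1 - h_{ii})$, which is close to $\sigma^2$ only when $h_{ii}$ is small, and the residuals are correlated with one another. This is one reason to ask “how close is close?”.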

Consider a sample $U_1, \ldots, U_n$ (e.g., $U_i = \hat\varepsilon_i$). We wish to know whether the $U_i$’s are coming from a normal distribution. Again, the first thing to do is to plot the histogram. In R you type,

hist(u,nclass=n)

where u denotes the vector of the samples $U_1, \ldots, U_n$ and n denotes the number of bins in the histogram.

For instance, consider the following exam data:

16.8 9.2 0.0 17.6 15.2 0.0 0.0 10.4 10.4 14.0 11.2 13.6 12.4

14.8 13.2 17.6 9.2 7.6 9.2 14.4 14.8 15.6 14.4 4.4 14.0 14.4 0.0

0.0 10.8 16.8 0.0 15.2 12.8 14.4 14.0 17.2 0.0 14.4 17.2 0.0 0.0

0.0 14.0 5.6 0.0 0.0 13.2 17.6 16.0 16.0 0.0 12.0 0.0 13.6 16.0

8.4 11.6 0.0 10.4 0.0 14.4 0.0 18.4 17.2 14.8 16.0 16.0 0.0 10.0

13.6 12.0 15.2

The command hist(f1.dat,nclass=15) produces Figure 1(a) (see footnote 1). Try this for different values of nclass to see what types of histograms you can obtain. You should always ask, “which one represents the truth the best”? Is there a unique answer?
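The experiment suggested above can be sketched as follows. The vector name f and the bin counts 5, 15 and 30 are choices made for this example, and the grades are typed in from the listing above rather than read from the file f1.dat. Note that R treats nclass as a suggestion only, so the number of bins actually drawn may differ from the number requested.

```r
# Exam grades transcribed from the listing in the notes (0.0 marks a no-show).
f <- scan(text = "
16.8  9.2  0.0 17.6 15.2  0.0  0.0 10.4 10.4 14.0 11.2 13.6 12.4
14.8 13.2 17.6  9.2  7.6  9.2 14.4 14.8 15.6 14.4  4.4 14.0 14.4  0.0
 0.0 10.8 16.8  0.0 15.2 12.8 14.4 14.0 17.2  0.0 14.4 17.2  0.0  0.0
 0.0 14.0  5.6  0.0  0.0 13.2 17.6 16.0 16.0  0.0 12.0  0.0 13.6 16.0
 8.4 11.6  0.0 10.4  0.0 14.4  0.0 18.4 17.2 14.8 16.0 16.0  0.0 10.0
13.6 12.0 15.2
", quiet = TRUE)

# Draw the same data with three different (illustrative) bin counts
# side by side, so the change in shape is easy to compare.
op <- par(mfrow = c(1, 3))
for (k in c(5, 15, 30)) {
  hist(f, nclass = k, main = paste("nclass =", k), xlab = "f")
}
par(op)  # restore the previous plotting layout
```

With few bins the spike of no-show zeros merges into the low grades; with many bins it stands apart, which is one way to see that no single choice of nclass is obviously “the truth”.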

Now the data $U_1, \ldots, U_n$ is probably not coming from a normal distribution if the histogram does not have the “right” shape. Ideally, it would be symmetric, and the tails of the distribution taper off rapidly.

In Figure 1(a), there were many students who did not take the exam in

question. They received a ‘0’ but this grade should probably not contribute

to our knowledge of the distribution of all such grades. Figure 1(b) shows

Date: September 1, 2004.

(1) You can obtain this data freely from the website below: http://www.math.utah.edu/~davar/math6010/2004/notes/f1.dat.


Figure 1 (plots not reproduced here).

(a) Grades: “Histogram of f”; horizontal axis f, vertical axis Frequency.

(b) Censored Grades: “Histogram of f1.censored”; horizontal axis f1.censored, vertical axis Frequency.

(c) QQ-plot of grades: “Normal Q-Q Plot”; horizontal axis Theoretical Quantiles, vertical axis Sample Quantiles.

(d) QQ-plot of censored grades: “Normal Q-Q Plot”; horizontal axis Theoretical Quantiles, vertical axis Sample Quantiles.


This creates two vectors: V$x and V$y. The first contains the values of all q_j’s, and the second all of the U_(j)’s. So now you can compute the correlation coefficient of the qq-plot by typing:

V = qqnorm(u, plot = FALSE)

cor(V$x, V$y)

If you do this for the qq-plot of the grade data, then you will find a correlation of ≈ 0.910. After censoring out the no-show exams, we obtain a correlation of ≈ 0.971. This is a noticeable difference, and it suggests that the grades of the students who actually took the exam are much closer to normal. In fact, one can analyse this procedure statistically, as we shall do later on.
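The comparison just described can be sketched in R as below. The helper name qq_cor is hypothetical, the name f1.censored is borrowed from the label of Figure 1(b), and censoring is assumed here to mean dropping every grade equal to 0. The argument is spelled out as plot.it, which is the full name that plot = FALSE abbreviates.

```r
# Exam grades transcribed from the listing in the notes (0.0 marks a no-show).
f <- scan(text = "
16.8  9.2  0.0 17.6 15.2  0.0  0.0 10.4 10.4 14.0 11.2 13.6 12.4
14.8 13.2 17.6  9.2  7.6  9.2 14.4 14.8 15.6 14.4  4.4 14.0 14.4  0.0
 0.0 10.8 16.8  0.0 15.2 12.8 14.4 14.0 17.2  0.0 14.4 17.2  0.0  0.0
 0.0 14.0  5.6  0.0  0.0 13.2 17.6 16.0 16.0  0.0 12.0  0.0 13.6 16.0
 8.4 11.6  0.0 10.4  0.0 14.4  0.0 18.4 17.2 14.8 16.0 16.0  0.0 10.0
13.6 12.0 15.2
", quiet = TRUE)

# Assumption: "censoring" means removing the no-show grades of 0.
f1.censored <- f[f > 0]

# Hypothetical helper: correlation between the theoretical normal
# quantiles (V$x) and the sample values (V$y) of a normal qq-plot.
qq_cor <- function(u) {
  V <- qqnorm(u, plot.it = FALSE)
  cor(V$x, V$y)
}

r_all      <- qq_cor(f)            # all grades, including no-shows
r_censored <- qq_cor(f1.censored)  # no-shows removed

print(c(all = r_all, censored = r_censored))
```

The two printed numbers can be compared with the values ≈ 0.910 and ≈ 0.971 quoted in the text; small discrepancies would point to a transcription error in the data or to a different censoring rule.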
