Linear Regression Analysis: Inference for Intercept, Slope, and Mean Response, Study notes of Data Analysis & Statistical Methods

University of Oregon (UO), Data Analysis & Statistical Methods

These notes examine the relationship between a response variable and an explanatory variable using linear regression analysis. Topics include the history of regression, the simple linear regression model, estimation of the regression parameters, confidence intervals and significance tests for the slope and intercept, and prediction intervals. The notes also cover the analysis of variance for regression.

Typology: Study notes

Pre 2010

Uploaded on 07/29/2009

koofers-user-jow 🇺🇸


Chapter 10

Inference for Regression

This chapter considers the relationship between a response variable and an explanatory variable using linear regression analysis. It focuses on confidence intervals for the intercept, slope, and mean response, and on significance tests for the intercept and slope.

10.1 Simple Linear Regression

The term “regression” and the general methods for studying relationships now included under this term were introduced in the 1880s by Francis Galton, the renowned British biologist. Galton was engaged in the study of heredity. One of his observations was that the children of tall parents tend to be taller than average but not as tall as their parents. This “regression toward mediocrity” gave these statistical methods their name.

Parents’ height    Children’s height
64.5               65.8
65.5               66.7
66.5               67.2
67.5               67.6
68.5               68.2
69.5               68.9
70.5               69.5
71.5               69.9
72.5               72.2

Simple linear regression model:

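Given $n$ observations on the explanatory variable $x$ and the response variable $y$,

$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$

• Statistical model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where the $\varepsilon_i$ are assumed to be independent $N(0, \sigma)$

• Parameters: $\beta_0$, $\beta_1$, $\sigma$

• Mean response: $E(y_i) = \beta_0 + \beta_1 x_i$

• Population regression line: $\mu_y = \beta_0 + \beta_1 x$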
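
To make the model concrete, the following minimal Python sketch simulates responses from $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$. The function name `simulate_slr` and the parameter values $\beta_0 = 20$, $\beta_1 = 0.7$, $\sigma = 0.5$ are hypothetical choices made only for illustration; they are not estimates from the notes. Here $\sigma$ is treated as a standard deviation, matching the $N(0, \sigma)$ notation above.

```python
import random

def simulate_slr(xs, beta0, beta1, sigma, seed=0):
    """Draw one response y_i = beta0 + beta1 * x_i + eps_i for each x_i.

    The eps_i are independent normal draws with mean 0 and
    standard deviation sigma.
    """
    rng = random.Random(seed)  # private generator so results are reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# x values borrowed from the parents' height column; parameters are hypothetical.
xs = [64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5]
ys = simulate_slr(xs, beta0=20.0, beta1=0.7, sigma=0.5)

for x, y in zip(xs, ys):
    mean_response = 20.0 + 0.7 * x  # point on the population regression line
    print(f"x = {x:5.1f}   mean response = {mean_response:6.2f}   simulated y = {y:6.2f}")
```

Each simulated $y$ scatters around its mean response; setting `sigma=0.0` puts every point exactly on the population regression line.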



Estimating the regression parameters:

Recall the least-squares regression line in Chapter 2:

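$$\hat{y} = b_0 + b_1 x,$$

where

$$b_1 = r\,\frac{s_y}{s_x} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}, \qquad b_0 = \bar{y} - b_1 \bar{x},$$

$$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}), \qquad S_{xx} = \sum (x_i - \bar{x})^2.$$

• $E(b_0) = \beta_0$, $E(b_1) = \beta_1$

• $\mathrm{Var}(b_1) = \dfrac{\sigma^2}{S_{xx}}$

• $\mathrm{Var}(b_0) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)$

• $b_1 \sim N\left( \beta_1, \dfrac{\sigma}{\sqrt{S_{xx}}} \right)$

• $b_0 \sim N\left( \beta_0, \sigma \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}} \right)$

• Predicted response: $\hat{y}_i = b_0 + b_1 x_i$

• Residual: $e_i = y_i - \hat{y}_i = y_i - b_0 - b_1 x_i$

• The estimate of $\sigma^2$: $s^2 = \dfrac{1}{n-2} \sum e_i^2 = \dfrac{1}{n-2} \sum (y_i - \hat{y}_i)^2$. Here $n - 2$ is called the degrees of freedom for $s^2$.

• $E(s^2) = \sigma^2$

• Standard error of $b_1$: $s_{b_1} = \dfrac{s}{\sqrt{S_{xx}}}$

• Standard error of $b_0$: $s_{b_0} = s \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}}$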
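
The estimation formulas above can be checked numerically. The sketch below is one possible implementation in plain Python, applied to the parents' and children's heights from Section 10.1; the function name `least_squares_summary` and the dictionary keys are illustrative choices, not part of the notes.

```python
import math

def least_squares_summary(xs, ys):
    """Return least-squares estimates and standard errors for y = b0 + b1 * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                      # slope
    b0 = ybar - b1 * xbar               # intercept
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    # s^2 divides the residual sum of squares by n - 2 degrees of freedom
    s = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    return {
        "n": n, "xbar": xbar, "ybar": ybar, "Sxy": sxy, "Sxx": sxx,
        "b0": b0, "b1": b1, "s": s,
        "se_b1": s / math.sqrt(sxx),
        "se_b0": s * math.sqrt(1.0 / n + xbar ** 2 / sxx),
    }

parents = [64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5]
children = [65.8, 66.7, 67.2, 67.6, 68.2, 68.9, 69.5, 69.9, 72.2]

fit = least_squares_summary(parents, children)
for name, value in fit.items():
    print(f"{name:>6} = {value:.4f}")
```

The computed `Sxy`, `Sxx`, and `s` should agree with the summary statistics quoted for this data set (41.1, 60, and 0.4998).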

Example 10.1 Consider again the heights (in inches) of parents and their children.

Parents’ height    Children’s height
64.5               65.8
65.5               66.7
66.5               67.2
67.5               67.6
68.5               68.2
69.5               68.9
70.5               69.5
71.5               69.9
72.5               72.2

$\bar{x} = 68.5$, $\bar{y} = 68.4444$, $S_{xy} = 41.1$, $S_{xx} = 60$, $s = 0.4998$

(a) Find the equation of the least-squares regression line.

(b) Give a 90% confidence interval for the slope and the intercept.

(c) Test $H_0: \beta_1 = 0$ against $H_a: \beta_1 > 0$ at the 0.05 significance level.

Solution:

Example 10.2 The relationship between school funding and student performance continues to be a hotly debated political and philosophical issue. Typical of the data available are the following figures, showing the 1991 per-pupil expenditures and average SAT scores for 13 randomly chosen districts in Virginia.

Spending per pupil    Average SAT score
3877                  886
3947                  817
3754                  904
3864                  754
5770                  975
3736                  861
4377                  887
5107                  922
4002                  905
4078                  890
4259                  852
3591                  869
4613                  909

The following statistics can be derived from the above data:

$\bar{x} = 4228.85$, $\bar{y} = 879.308$, $S_{xy} = 254046$, $S_{xx} = 4602526$, $s = 42.74$

(a) Find the equation of the least-squares regression line.

(b) Give a 95% confidence interval for the slope and the intercept.

(c) Test $H_0: \beta_1 = 0$ against $H_a: \beta_1 > 0$ at the 0.05 significance level.

Solution:
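
As a numerical check for an example of this kind, the sketch below works from the summary statistics alone. It assumes the standard forms of the intervals and the test, which are not written out in this part of the notes: the confidence intervals are $b_1 \pm t^* s_{b_1}$ and $b_0 \pm t^* s_{b_0}$, and the test statistic is $t = b_1 / s_{b_1}$ with $n - 2$ degrees of freedom. The function name `slope_intercept_inference` is hypothetical, and the critical values 2.201 and 1.796 (11 degrees of freedom, upper 0.025 and 0.05 points) are taken from a standard $t$ table rather than computed.

```python
import math

def slope_intercept_inference(n, xbar, ybar, sxy, sxx, s, t_star):
    """Estimates, confidence intervals, and slope t statistic from summary statistics.

    t_star is the t critical value with n - 2 degrees of freedom,
    looked up by the caller for the desired confidence level.
    """
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    se_b1 = s / math.sqrt(sxx)
    se_b0 = s * math.sqrt(1.0 / n + xbar ** 2 / sxx)
    return {
        "b0": b0,
        "b1": b1,
        "ci_b1": (b1 - t_star * se_b1, b1 + t_star * se_b1),
        "ci_b0": (b0 - t_star * se_b0, b0 + t_star * se_b0),
        "t_slope": b1 / se_b1,  # statistic for H0: beta1 = 0
    }

# Example 10.2 summary statistics; 2.201 is the table value for a 95% interval, 11 df.
res = slope_intercept_inference(
    n=13, xbar=4228.85, ybar=879.308,
    sxy=254046, sxx=4602526, s=42.74, t_star=2.201,
)
print(f"regression line: y-hat = {res['b0']:.2f} + {res['b1']:.4f} x")
print(f"95% CI for slope:     ({res['ci_b1'][0]:.4f}, {res['ci_b1'][1]:.4f})")
print(f"95% CI for intercept: ({res['ci_b0'][0]:.1f}, {res['ci_b0'][1]:.1f})")

# One-sided test at the 0.05 level: compare with the table value 1.796 (11 df).
reject = res["t_slope"] >= 1.796
print(f"t = {res['t_slope']:.2f}, reject H0: {reject}")
```

The same function applies to Example 10.1 by passing its summary statistics and the critical values for 7 degrees of freedom.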

Confidence intervals for mean response:

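For any specific value of $x$, say $x^*$, the mean of the response is given by

$$\mu_y = \beta_0 + \beta_1 x^*.$$

When a single response $y$ at $x^*$ is predicted by $\hat{y} = b_0 + b_1 x^*$, the prediction error $\hat{y} - y$ has standard deviation

$$\sigma_{\hat{y}} = \sigma \sqrt{1 + \frac{1}{n} + \frac{(\bar{x} - x^*)^2}{S_{xx}}},$$

so that

$$\hat{y} - y \sim N(0, \sigma_{\hat{y}}), \qquad \frac{\hat{y} - y}{s_{\hat{y}}} \sim T(n - 2), \quad \text{where} \quad s_{\hat{y}} = s \sqrt{1 + \frac{1}{n} + \frac{(\bar{x} - x^*)^2}{S_{xx}}}.$$

A level $C$ prediction interval for $y$ is

$$\hat{y} \pm t^* s_{\hat{y}},$$

where $t^*$ is the upper $(1 - C)/2$ critical value of the $T(n - 2)$ distribution.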

Example 10.4 Refer to Example 10.1. If Mark’s parents are 70 inches tall, find a 95% prediction interval for Mark’s height.

Solution:

Example 10.5 Can the next day's highest price of a stock be predicted from today’s closing price? The table below gives the closing prices and the next day's highest prices of a stock on NASDAQ.

Closing price (x)    Highest price, next day (y)
27.94                27.
26.75                27.
26.19                26.
27.19                27.
26.69                28.
27.87                28.
37.06                39.
36.81                37.
36.38                36.
33.50                34.
31.44                33.
33.25                33.
34.56                36.
34.25                35.
33.19                34.
32.00                31.
31.25                31.
30.00                30.
28.31                31.
28.56                29.

It is known that $\bar{x} = 31.16$, $\bar{y} = 32.01$, $S_{xy} = 247.93$, $S_{xx} = 247.157$, $s = 0.9825$.

(a) Find the equation of the least-squares regression line.

(b) Give a 95% confidence interval for the slope and the intercept.

(c) Test $H_0: \beta_1 = 0$ against $H_a: \beta_1 > 0$ at the 0.05 significance level.

(d) Suppose that today’s closing price is $25. Find an 80% prediction interval for tomorrow’s highest price.
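
The prediction interval $\hat{y} \pm t^* s_{\hat{y}}$ can be sketched in Python from summary statistics alone. The version below assumes the prediction standard error $s\sqrt{1 + 1/n + (\bar{x} - x^*)^2 / S_{xx}}$; the function name `prediction_interval` is hypothetical, and the critical values 2.365 (7 degrees of freedom, upper 0.025) and 1.330 (18 degrees of freedom, upper 0.10) come from a standard $t$ table. It is applied to Example 10.4 and to part (d) of Example 10.5 as a check, not as the official solution.

```python
import math

def prediction_interval(x_star, n, xbar, ybar, sxy, sxx, s, t_star):
    """Return (y_hat, (lower, upper)) for a single new response at x_star."""
    b1 = sxy / sxx
    # y_hat = b0 + b1 * x_star, written using b0 = ybar - b1 * xbar
    y_hat = ybar + b1 * (x_star - xbar)
    se_pred = s * math.sqrt(1.0 + 1.0 / n + (xbar - x_star) ** 2 / sxx)
    margin = t_star * se_pred
    return y_hat, (y_hat - margin, y_hat + margin)

# Example 10.4: parents 70 inches tall, 95% interval, 7 degrees of freedom.
mark_hat, mark_pi = prediction_interval(
    70.0, n=9, xbar=68.5, ybar=68.4444, sxy=41.1, sxx=60.0, s=0.4998, t_star=2.365
)
print(f"Example 10.4: y-hat = {mark_hat:.2f}, 95% PI = ({mark_pi[0]:.2f}, {mark_pi[1]:.2f})")

# Example 10.5(d): closing price 25, 80% interval, 18 degrees of freedom.
stock_hat, stock_pi = prediction_interval(
    25.0, n=20, xbar=31.16, ybar=32.01, sxy=247.93, sxx=247.157, s=0.9825, t_star=1.330
)
print(f"Example 10.5(d): y-hat = {stock_hat:.2f}, 80% PI = ({stock_pi[0]:.2f}, {stock_pi[1]:.2f})")
```

Note that a closing price of 25 lies below every observed closing price in the table, so the second interval is an extrapolation and should be read with caution.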

10.2 Analysis of Variance for Regression

Analysis of variance (ANOVA) summarizes information about the sources of variation in the data. It is based on the decomposition

Data = Fit + Residual
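
The "Data = Fit + Residual" idea can be illustrated numerically. The sketch below assumes the usual sum-of-squares form of the decomposition, $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$, which is not spelled out in this part of the notes. The names `sst`, `ssm`, and `sse` (total, model, and error sums of squares) are labels chosen here for illustration, and the data are the heights from Section 10.1.

```python
def anova_decomposition(xs, ys):
    """Return (SST, SSM, SSE) for the least-squares line fitted to (xs, ys)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    fitted = [b0 + b1 * x for x in xs]
    sst = sum((y - ybar) ** 2 for y in ys)                # total variation (Data)
    ssm = sum((f - ybar) ** 2 for f in fitted)            # explained by the line (Fit)
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # left over (Residual)
    return sst, ssm, sse

parents = [64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5]
children = [65.8, 66.7, 67.2, 67.6, 68.2, 68.9, 69.5, 69.9, 72.2]

sst, ssm, sse = anova_decomposition(parents, children)
print(f"SST = {sst:.4f}")
print(f"SSM = {ssm:.4f}")
print(f"SSE = {sse:.4f}")
print(f"SSM + SSE = {ssm + sse:.4f}")
```

Dividing `sse` by $n - 2 = 7$ gives $s^2$, so this decomposition ties back to the estimate of $\sigma^2$ in Section 10.1.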

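Inference for correlation: Let $\rho$ be the population correlation between the variables $x$ and $y$, and let $r$ be the sample correlation, where

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}, \qquad S_{yy} = \sum (y_i - \bar{y})^2.$$

The $t$ statistic for the slope can be written in terms of $r$:

$$\frac{b_1}{s_{b_1}} = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}.$$

Test for a zero population correlation:

Null hypothesis: $H_0: \rho = 0$

t statistic:

$$t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$$

Alternative $H_a$    P-value                   Reject $H_0$ at level $\alpha$
$\rho > 0$           $P(T_{n-2} \ge t)$        if $t \ge t^*_{\alpha}$
$\rho < 0$           $P(T_{n-2} \le t)$        if $t \le -t^*_{\alpha}$
$\rho \ne 0$         $2P(T_{n-2} \ge |t|)$     if $|t| \ge t^*_{\alpha/2}$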
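
The link between the correlation test and the slope test can be verified on the heights data from Section 10.1. This is a minimal sketch: the function name `correlation_and_slope_t` is hypothetical, and the critical value 1.895 (7 degrees of freedom, upper 0.05) is read from a standard $t$ table rather than computed.

```python
import math

def correlation_and_slope_t(xs, ys):
    """Return (r, t computed from r, t computed as b1 / s_b1)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

    # Route 1: t statistic from the sample correlation
    r = sxy / math.sqrt(sxx * syy)
    t_from_r = r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

    # Route 2: t statistic from the slope and its standard error
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    t_from_slope = b1 / (s / math.sqrt(sxx))
    return r, t_from_r, t_from_slope

parents = [64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5]
children = [65.8, 66.7, 67.2, 67.6, 68.2, 68.9, 69.5, 69.9, 72.2]

r, t_from_r, t_from_slope = correlation_and_slope_t(parents, children)
print(f"r = {r:.4f}")
print(f"t from r     = {t_from_r:.3f}")
print(f"t from slope = {t_from_slope:.3f}")

# One-sided test of H0: rho = 0 against Ha: rho > 0 at alpha = 0.05, 7 df.
reject_h0 = t_from_r >= 1.895
print(f"reject H0: {reject_h0}")
```

Both routes give the same statistic, so testing $\rho = 0$ and testing $\beta_1 = 0$ lead to the same decision.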
