Linear Regression in Applied Statistics - Study Guide | STAT 324, Exams of Statistics

Material Type: Exam; Class: Introductory Applied Statistics for Engineers; Subject: STATISTICS; University: University of Wisconsin - Madison; Term: Spring 2000;

Uploaded on 09/02/2009

STAT 324 Discussion #7
TA: Jiale Xu
Webpage: www.stat.wisc.edu/~xujiale/stat324
Office hours: Tue. 2:10-3:20pm and Wed. 2:10-3:00pm
1 Linear Regression
1.1 Simple linear regression
The simple linear regression model is given by

$$y_i = \alpha + \beta x_i + \epsilon_i,$$

where the $\epsilon_i$ are independent and identically distributed $N(0, \sigma^2)$. Here $x$ is called the independent variable (or predictor) and $y$ is called the dependent variable (or response). The coefficients $\alpha$ and $\beta$ can be estimated by the method of least squares.
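As a minimal sketch of what "the method of least squares" produces in the simple model, the slope and intercept can be computed from the standard closed-form expressions and compared with lm(). The simulated data, the seed, the true parameter values and the names x, y, alpha_hat, beta_hat and fit are illustrative assumptions, not part of the handout.

```r
# Simulated data (assumption: true alpha = 2, beta = 0.5, sigma = 1)
set.seed(324)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, mean = 0, sd = 1)

# Closed-form least squares estimates for simple linear regression
beta_hat  <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
alpha_hat <- mean(y) - beta_hat * mean(x)

# The same estimates obtained from lm()
fit <- lm(y ~ x)
print(c(alpha = alpha_hat, beta = beta_hat))
print(coef(fit))
```

The two printed vectors should agree up to rounding, since lm() minimizes the same sum of squared residuals.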
1.2 General model of linear regression
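In most cases, we have several independent variables (predictors), such as temperature, pressure and so on. The general linear model is

$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_{p-1} X_{p-1} + \epsilon.$$

Suppose we have $n$ observations; then the model can be written as

$$y_i = \beta_0 + x_{i1}\beta_1 + \cdots + x_{i,p-1}\beta_{p-1} + \epsilon_i, \qquad i = 1, \dots, n.$$

If we introduce the matrix notation

$$
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X = \begin{pmatrix}
1 & x_{11} & \cdots & x_{1,p-1} \\
1 & x_{21} & \cdots & x_{2,p-1} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & \cdots & x_{n,p-1}
\end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{pmatrix}, \quad
\epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix},
$$

then the linear model can be written as

$$Y = X\beta + \epsilon.$$

Here $X$ is called the design matrix. The least squares estimator of $\beta$ is

$$\hat{\beta} = (X^T X)^{-1} X^T Y.$$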
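The matrix formula can be checked numerically. The sketch below builds a design matrix by hand, solves the normal equations $(X^T X)\beta = X^T Y$ and compares the result with lm(). The simulated data, the choice of two predictors and the names x1, x2, X, beta_hat and fit are assumptions made for illustration only.

```r
# Simulated data with two predictors (assumption: p = 3 coefficients)
set.seed(324)
n  <- 40
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 3 * x2 + rnorm(n)

# Design matrix: a column of ones followed by the predictor columns
X <- cbind(1, x1, x2)

# Least squares estimator: solve (X^T X) beta = X^T Y
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# Compare with lm(); model.matrix(fit) is the design matrix lm() builds
fit <- lm(y ~ x1 + x2)
print(drop(beta_hat))
print(coef(fit))
print(dim(model.matrix(fit)))
```

Solving the linear system with solve(A, b) avoids forming the inverse explicitly, which is usually the numerically safer choice; lm() itself relies on a QR decomposition rather than on this formula.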

2 Linear Regression with R

2.1 Steps

  • Plot the data (xyplot) to see whether there is a linear relationship. You can add a smoother to see the trend, or add the regression line.
  • Look for a transformation (e.g. log). Sometimes the relationship becomes linear after the transformation.
  • Fit the model by

fm1 <- lm(y ~ x, data)

  • Check the result: summary(fm1), coef(fm1), model.matrix(), predict(), fitted(), resid(), confint().
  • Residual analysis: plot the residuals versus the fitted values to check whether the model assumptions are satisfied.
  • Statistical inference: t-test, F-test.
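The steps above can be sketched end to end as follows. This is only an illustration: it assumes the lattice package is installed (it normally ships with R) and uses the built-in cars data set (stopping distance vs. speed) as a stand-in, since the handout's generic y, x and data are not specified.

```r
library(lattice)  # provides xyplot()

# Stand-in data (assumption): built-in cars data, columns speed and dist
data(cars)

# Plot the data with points ("p"), a smoother ("smooth") and
# the least squares regression line ("r")
print(xyplot(dist ~ speed, data = cars, type = c("p", "smooth", "r")))

# Fit the model
fm1 <- lm(dist ~ speed, data = cars)

# Check the result
print(summary(fm1))            # estimates, standard errors, t tests, F test
print(coef(fm1))               # estimated coefficients
print(confint(fm1))            # 95% confidence intervals by default
print(head(model.matrix(fm1))) # first rows of the design matrix

# Residual analysis: residuals versus fitted values
print(xyplot(resid(fm1) ~ fitted(fm1), type = c("p", "smooth"),
             xlab = "Fitted values", ylab = "Residuals"))

# Prediction at a new predictor value (speed = 15 is a hypothetical choice)
print(predict(fm1, newdata = data.frame(speed = 15)))
```

In the residual plot, a roughly flat smoother and an even vertical spread are consistent with the linearity and constant-variance assumptions; a curve or a funnel shape suggests trying a transformation.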

2.2 Examples

  • Duncan (car): The Duncan data frame has 45 rows and 4 columns. It contains data on the prestige and other characteristics of 45 U.S. occupations in 1950. We are interested in the relationship between prestige and education.
  • Plot the data (prestige vs. education) with a smoother or fitted regression line.
  • Fit a simple linear regression model (prestige vs. education), and find the parameter estimates, their standard errors, and p-values.
  • Interpret the estimates and the p-value of the t test in the model.
  • Make a plot of the residuals vs. the fitted values for the fitted model. Use a direct call to xyplot or the "pre-packaged" plot. Check whether the assumptions are satisfied.
  • Some functions used to check the information: coef(), predict(), fitted(), resid(), confint()
  • Predict prestige for a given value of education.
  • Use anova() to perform an F test. Compare the result of the F test to the t test result. What do you find?
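One possible way to work through the Duncan example is sketched below. It assumes that the lattice package is available and that the Duncan data can be read from the carData package (where recent versions of car keep their data sets); the object names fm_duncan, t_value and f_value and the prediction point education = 80 are hypothetical choices. The analysis is skipped if carData is not installed.

```r
library(lattice)  # provides xyplot()

# Assumption: the Duncan data are available from the carData package
if (requireNamespace("carData", quietly = TRUE)) {
  Duncan <- carData::Duncan

  # Scatterplot with points, a smoother and the fitted regression line
  print(xyplot(prestige ~ education, data = Duncan,
               type = c("p", "smooth", "r")))

  # Simple linear regression of prestige on education
  fm_duncan <- lm(prestige ~ education, data = Duncan)
  print(summary(fm_duncan)$coefficients)  # estimates, SEs, t values, p-values
  print(confint(fm_duncan))

  # Residuals versus fitted values
  print(xyplot(resid(fm_duncan) ~ fitted(fm_duncan),
               type = c("p", "smooth"),
               xlab = "Fitted values", ylab = "Residuals"))

  # Predicted prestige at education = 80, with a prediction interval
  print(predict(fm_duncan, newdata = data.frame(education = 80),
                interval = "prediction"))

  # Compare the F test from anova() with the t test for the slope
  t_value <- summary(fm_duncan)$coefficients["education", "t value"]
  f_value <- anova(fm_duncan)[["F value"]][1]
  print(c(t_squared = t_value^2, F = f_value))
} else {
  message("Package carData is not installed; skipping the Duncan example.")
}
```

With a single predictor, the last two printed numbers should agree: the F statistic is the square of the t statistic for the slope, and the two tests give the same p-value.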