Simple Linear Regression - Final Exam | STAT 112, Exams of Economic statistics

Material Type: Exam; Class: Business & Econ Statistics II; Subject: Statistics; University: George Washington University; Term: Spring 2004;


Uploaded on 08/18/2009

Statistics 112 Simple Linear Regression
Fuel Consumption Example
March 1, 2004
E. Bura
Fuel Consumption Case: reducing natural gas transmission fines.
In 1993, the natural gas industry was deregulated. As a consequence, natural gas companies became responsible for acquiring the natural gas needed to heat the homes and businesses they serve.

Natural gas companies place orders for natural gas to be transmitted by pipeline transmission systems to their cities. To place an order, a company needs to predict the city's natural gas need for that period.

To encourage natural gas companies to make accurate predictions and to help control costs, pipeline transmission systems charge, in addition to their usual fees, transmission fines if the order is below or above need. There is of course some leeway; i.e., errors within a small margin go unfined.

Suppose a management consulting firm is responsible for predicting the gas needs of a natural gas company serving a small city. The problem is to predict weekly fuel consumption ($y$) on the basis of average hourly temperature ($x$). For this we observed $y$ and $x$ for eight weeks:
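
| Week | $x$ | $y$ | $x^2$ | $xy$ |
|------|------|------|---------|--------|
| 1 | 28 | 12.4 | 784 | 347.2 |
| 2 | 28 | 11.7 | 784 | 327.6 |
| 3 | 32.5 | 12.4 | 1056.25 | 403 |
| 4 | 39.0 | 10.8 | 1521 | 421.2 |
| 5 | 45.9 | 9.4 | 2106.81 | 431.46 |
| 6 | 57.8 | 9.5 | 3340.84 | 549.1 |
| 7 | 58.1 | 8.0 | 3375.61 | 464.8 |
| 8 | 62.5 | 7.5 | 3906.25 | 468.75 |
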
$$\sum_{i=1}^{n} x_i = 351.8, \qquad \sum_{i=1}^{n} y_i = 81.7, \qquad \sum_{i=1}^{n} x_i^2 = 16874.76, \qquad \sum_{i=1}^{n} x_i y_i = 3413.11$$

The plot of $y$ versus $x$ suggests that the simple linear regression model may provide a good fit to the data. Hence, we hypothesize that

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

with

  1. $E(\varepsilon_i) = 0$
  2. $\mathrm{Var}(\varepsilon_i) = \sigma^2$
  3. $\varepsilon_i \sim N(0, \sigma^2)$
  4. The errors are independent of one another. That is, the data are a random sample.

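The least squares estimates of the parameters of the model, $\beta_0$ and $\beta_1$, are

$$\hat\beta_1 = \frac{SS_{xy}}{SS_{xx}}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x$$

where

$$SS_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$

$$SS_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$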

Also, $\bar y = 10.2125$ and $\bar x = 43.98$. These yield

$$\hat\beta_1 = \frac{SS_{xy}}{SS_{xx}} = -0.1279$$

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x = 10.2125 + 0.1279 \times 43.98 = 15.84$$

The fitted line is given by

$$\hat y_i = 15.84 - 0.1279\, x_i$$
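The arithmetic behind these estimates can be checked directly from the eight observations. The following is a minimal Python sketch, not part of the original notes; the variable names are illustrative, and it uses the shortcut formulas for $SS_{xy}$ and $SS_{xx}$ given above.

```python
# Weekly data from the table: average hourly temperature (x)
# and weekly fuel consumption (y).
x = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
y = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]
n = len(x)

# Column totals used by the shortcut formulas.
sum_x = sum(x)
sum_y = sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# SSxy and SSxx as defined in the notes.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_xx - sum_x ** 2 / n

# Least squares slope and intercept.
b1 = ss_xy / ss_xx
b0 = sum_y / n - b1 * sum_x / n

print(f"SSxy = {ss_xy:.4f}, SSxx = {ss_xx:.3f}")
print(f"slope b1 = {b1:.4f}, intercept b0 = {b0:.2f}")
```

Rounded to the precision used in the notes, the slope and intercept should agree with the fitted line above.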

1 The Meaning of $\hat\sigma = s$

$$\hat\sigma = s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-2}}$$

Since $y \sim N(\beta_0 + \beta_1 x, \sigma^2)$, we expect most of the observed responses (roughly 95%) to fall within $2s$ of the fitted line.
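As an illustration of the $2s$ rule, the sketch below fits the fuel consumption data and counts how many observed responses lie within $2s$ of the fitted line. It is an assumed, minimal Python example rather than anything from the original notes. It uses the centered sums of squares, which are algebraically equivalent to the shortcut formulas, and full-precision coefficients, so the SSE may differ slightly in the later decimals from a value computed with rounded coefficients.

```python
import math

# Fuel consumption data (eight weeks).
x = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
y = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Centered sums of squares and cross products.
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Residuals, SSE, and s = sqrt(SSE / (n - 2)).
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e * e for e in residuals)
s = math.sqrt(sse / (n - 2))

# Count the observations within 2s of the fitted line.
within = sum(1 for e in residuals if abs(e) <= 2 * s)
print(f"SSE = {sse:.4f}, s = {s:.4f}, within 2s: {within} of {n}")
```

With only eight observations the "roughly 95%" figure is a loose guide; the count is best read as a sanity check on the fit.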

In general, to test $H_0: \beta_1 = \beta$ versus one of the alternatives

$$H_1: \beta_1 \neq \beta, \qquad H_1: \beta_1 > \beta, \qquad H_1: \beta_1 < \beta$$

use the test statistic

$$t = \frac{\hat\beta_1 - \beta}{s/\sqrt{SS_{xx}}} \sim t_{n-2}$$

Reject the null at level $\alpha$ if

$$|t| > t_{\alpha/2}(n-2), \qquad t > t_{\alpha}(n-2), \qquad t < -t_{\alpha}(n-2)$$

respectively, according to the alternative being tested. Also, a $100(1-\alpha)\%$ confidence interval for $\beta_1$ is given by

$$\hat\beta_1 \pm t_{\alpha/2}(n-2)\, \frac{s}{\sqrt{SS_{xx}}}$$

Observe that if $\beta_1 = 0$ then the population correlation coefficient, $\rho$, is also equal to zero. Therefore, the t-test for the slope of the model can also be used to test whether $\rho = 0$.
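To make the confidence interval formula concrete, here is a small Python sketch applied to the fuel consumption example. The function name `slope_confidence_interval` is hypothetical. The slope $-0.1279$ and $s = 0.6542$ are the rounded values from these notes, $SS_{xx} = 1404.355$ is computed from the data table, and the critical value $t_{0.025}(6) = 2.447$ is taken from a standard t table rather than from the notes.

```python
import math

def slope_confidence_interval(b1, s, ss_xx, t_crit):
    """Return (lower, upper) for b1 +/- t_crit * s / sqrt(SSxx)."""
    half_width = t_crit * s / math.sqrt(ss_xx)
    return b1 - half_width, b1 + half_width

# Assumed inputs: rounded estimates from the notes, SSxx from the table,
# and t_{0.025}(6) = 2.447 from a standard t table (n - 2 = 6 df).
lower, upper = slope_confidence_interval(
    b1=-0.1279, s=0.6542, ss_xx=1404.355, t_crit=2.447
)
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

An interval that excludes zero points to the same conclusion as a two-sided test of $\beta_1 = 0$ at the matching level.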

2.1 Back to the fuel consumption example

$$SSE = \sum_{i=1}^{n} (y_i - \hat y_i)^2 = 2.5680112$$

and

$$MSE = \frac{SSE}{n-2} = \frac{2.5680112}{6} = 0.4280$$

So, $s = \sqrt{MSE} = 0.6542$.

To test whether $\beta_1 = 0$ versus $\beta_1 \neq 0$, we compute the test statistic

$$t = \frac{\hat\beta_1}{s/\sqrt{SS_{xx}}} = \frac{-0.1279}{0.6542/\sqrt{1404.355}} = -7.33$$

Since $|t| = 7.33 > 5.959 = t_{0.0005}(6)$, the p-value of the test is smaller than twice the area to the right of 5.959. That is, p-value $< 2 \times 0.0005 = 0.001$. This is a highly significant result, so we reject the null in favor of the alternative. The linear model is useful for modelling the mean of fuel consumption, $y$.
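The test statistic and the p-value bound can be reproduced in a few lines. This is a hedged Python sketch using the rounded summary values from the notes. $SS_{xx} = 1404.355$ is computed from the data table, and 5.959 is the upper 0.0005 point of the t distribution with 6 degrees of freedom quoted above.

```python
import math

# Summary values: rounded slope and s from the notes, SSxx from the table.
b1 = -0.1279
s = 0.6542
ss_xx = 1404.355
t_0005_df6 = 5.959  # upper 0.0005 critical value, 6 degrees of freedom

# Standard error of the slope, and the t statistic for H0: beta1 = 0.
se_b1 = s / math.sqrt(ss_xx)
t_stat = (b1 - 0.0) / se_b1

# If |t| exceeds the 0.0005 critical value, the two-sided p-value
# is below 2 * 0.0005 = 0.001.
p_below_0_001 = abs(t_stat) > t_0005_df6
print(f"se(b1) = {se_b1:.5f}, t = {t_stat:.2f}, p < 0.001: {p_below_0_001}")
```

Bounding the p-value with a table entry avoids any dependence on a statistics library; an exact p-value would need the t distribution's CDF.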

3 The coefficient of determination

The coefficient of determination, $R^2$, is a measure of the contribution of $x$, or of the model, in predicting $y$.

$$R^2 = \frac{\text{explained variability}}{\text{total variability in } y \text{ about its mean } \bar y} = \frac{SS_{yy} - SSE}{SS_{yy}} = 1 - \frac{SSE}{SS_{yy}}$$

where $SS_{yy} = \sum_{i=1}^{n} (y_i - \bar y)^2$. In other words, $R^2$ represents the proportion of variability in $y$ explained by the fitted model. In the case of simple linear regression, that is, when the hypothesized model is of the form $y = \beta_0 + \beta_1 x + \varepsilon$, $R^2 = r^2$, where $r$ is the correlation coefficient of $y$ and $x$.

3.1 Back to the fuel consumption example

$$R^2 = 1 - \frac{SSE}{SS_{yy}} = 0.8995$$

Hence, 89.95% of the variability in the $y$ values about their mean $\bar y$ is explained by the fitted simple linear regression model.
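The value of $R^2$, and its equality with $r^2$ in simple linear regression, can be verified from the raw data. The following Python sketch is an illustration under the assumption that the eight observations in the table are the full data set; the variable names are not from the notes.

```python
import math

# Fuel consumption data (eight weeks).
x = [28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5]
y = [12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Fit the line and compute the unexplained variability (SSE).
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Coefficient of determination and the sample correlation coefficient.
r_squared = 1 - sse / ss_yy
r = ss_xy / math.sqrt(ss_xx * ss_yy)

print(f"SSyy = {ss_yy:.4f}, R^2 = {r_squared:.4f}, r = {r:.4f}, r^2 = {r * r:.4f}")
```

The negative sign of `r` reflects the downward slope: fuel consumption falls as temperature rises, while $R^2$ itself is always between 0 and 1.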

4 Estimation and Prediction

The regression model has two main uses:

  • Estimating the mean y value for a specific value of x
  • Predicting a new individual y value for a given x

The standard error of $\hat y$ as an estimator of the mean $y$-value when $x = x_p$ is

$$\sigma_{\hat y} = \sigma \sqrt{\frac{1}{n} + \frac{(x_p - \bar x)^2}{SS_{xx}}}$$

The standard error of $\hat y$ as a predictor of the individual $y$-value when $x = x_p$ is