Review Problems for Midterm Exam #1 - Applied Regression Analysis | STAT 51200, Exams of Statistics

Material Type: Exam; Professor: Jennings; Class: Applied Regression Analysis; Subject: STAT-Statistics; University: Purdue University - Main Campus; Term: Unknown 1989;

Uploaded on 07/30/2009


Statistics 512: Review Problems for First Midterm Exam

Keep for First Exam Review (October 6)

1. Short answer questions. Unless stated otherwise, each part is unrelated.

(a) A polynomial regression model y = β₀ + β₁X + β₂X² + ε fit to a set of data gives b₀ = 2, b₁ = 4, and b₂ = 3. Find the predicted value of the response variable when the explanatory variable is equal to 4.

(b) The MSE for a multiple regression is 40. What is the estimate of the standard deviation of the error term in the model?

(c) For a simple linear regression, the estimate of the slope is 7 with a standard error of 2. Give an estimate of the change in the response variable that you would expect if the explanatory variable increased by 4.

(d) Refer to the previous problem. Assume that the sample size is 36. Give a 95% confidence interval for your estimate.

(e) A multiple regression is run with 40 cases and eight explanatory variables. Give the degrees of freedom for the F-statistic that tests the null hypothesis that the coefficients for the first three explanatory variables are equal to zero.

(f) In a simple linear regression, here are two intervals associated with X = 5: (20, 30) and (15, 35). Which is the prediction interval and which is the confidence interval for the mean response? Explain your answer.

(g) The correlation between two variables U and V is -0.5. What percent of the variation in U can be explained by V using a simple linear regression?

(h) There are numerous ways to estimate a regression line. Describe the method of least squares (with a picture and/or words) and explain the effect of this approach on the sum of squares error, SSE.

(i) Suppose the estimated regression equation is Ŷ = 3 + 5X. Give the estimated regression function if the variable U = (X − 4)/10 were used in place of X.

(j) Explain how a 98% confidence interval for the slope can be used to test H₀: β₁ = 5, and state the significance level α of that test.

(k) Rob Poorman Auto Sales has decided to use R² to select the best model in predicting car demand. Explain when this is and when this is not a reasonable approach.
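The numerical parts of Problem 1 can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not an official solution. All variable and function names are illustrative, and the t critical value for 34 degrees of freedom is hard-coded from a standard t-table (an assumption, since no table is included in this handout).

```python
import math

# (a) Fitted quadratic: y-hat = b0 + b1*x + b2*x^2, with the estimates given in the problem.
def predict_quadratic(b0, b1, b2, x):
    return b0 + b1 * x + b2 * x ** 2

y_hat = predict_quadratic(2, 4, 3, 4)          # 2 + 16 + 48

# (b) The estimated error standard deviation is the square root of the MSE.
s = math.sqrt(40)

# (c) Expected change in the response = slope estimate * change in X.
delta_x = 4
change = 7 * delta_x
se_change = 2 * delta_x                        # the standard error scales by the same factor

# (d) With n = 36, simple linear regression leaves n - 2 = 34 error df.
T_CRIT_34 = 2.032                              # assumed table value of t(0.975; 34)
ci = (change - T_CRIT_34 * se_change, change + T_CRIT_34 * se_change)

# (e) Testing 3 of 8 coefficients with 40 cases: numerator df = 3,
#     denominator df = n - (number of predictors + 1) from the full model.
n_cases, n_predictors, n_tested = 40, 8, 3
df_num = n_tested
df_den = n_cases - (n_predictors + 1)

# (g) In simple linear regression, R^2 is the squared correlation.
r = -0.5
pct_explained = 100 * r ** 2

# (i) U = (X - 4)/10 means X = 10*U + 4; substitute into Y-hat = 3 + 5*X.
b0, b1 = 3, 5
new_intercept = b0 + b1 * 4
new_slope = b1 * 10

print(y_hat, round(s, 3), change, ci, (df_num, df_den), pct_explained,
      (new_intercept, new_slope))
```

The conceptual parts (f), (h), (j), and (k) call for written explanations and are not covered by this sketch.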

2. Refer to the SAS output on the last pages (marked OUTPUT FOR PROBLEM 2). The data are from a study of 78 seventh grade students. The goal is to predict GRADE (average school grade on a scale of 0 to 11) from variables that include IQ (score on an I.Q. test) and GENDER (0 = female, 1 = male).

(a) Using the output for the simple linear regression, does there appear to be a linear relationship between GRADE and IQ? Give a test statistic with degrees of freedom and p-value to support your answer (you may use other evidence as well).

(b) Individual 51 has GRADE = 0.53 and IQ = 103. What value of GRADE is predicted for this individual by the estimated simple linear regression model? The studentized residual (residual divided by its standard error) for this individual is equal to -3.. On that basis, do you consider this observation to be an outlier? Explain.

(c) The variable IQGEN is the product of IQ and GENDER. Examine the output for the model involving these three variables. Write down the estimated regression equation for this model. Also write down the two separate fitted lines for female and male students.

(d) Examine the results of the t-tests for the three regression coefficients as well as the result of the (general linear) F-test labeled “SAMELINE”. The results of this general linear test were produced with the SAS input line “test gender, iqgen;”.

State the null hypotheses tested by each of these four tests and whether that hypothesis is rejected. What apparent conflict do you see between the results of these tests? Explain why such a conflict might arise and suggest one possible action that might be used to eliminate this conflict.
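For parts (b) and (c), the arithmetic can be sketched in Python using the parameter estimates printed in the OUTPUT FOR PROBLEM 2 listing. This is only an illustrative check under that assumption; the function and constant names are hypothetical and not part of the SAS output.

```python
# Estimates (Intercept, iq) copied from the simple linear regression listing.
B0_SLR, B1_SLR = -3.55706, 0.10102

def predict_grade_slr(iq):
    """Predicted GRADE from the fitted simple linear regression."""
    return B0_SLR + B1_SLR * iq

# (b) Individual 51: IQ = 103, observed GRADE = 0.53.
pred_51 = predict_grade_slr(103)     # about 6.85
resid_51 = 0.53 - pred_51            # about -6.32 (observed minus predicted)

# Estimates (Intercept, iq, gender, iqgen) copied from the three-variable listing.
B0, B_IQ, B_GENDER, B_IQGEN = -2.25235, 0.09400, -3.84266, 0.02656

def fitted_line(gender):
    """Return (intercept, slope in IQ) for gender = 0 (female) or 1 (male).

    With IQGEN = IQ * GENDER, the model collapses to a separate line per group:
    intercept = B0 + B_GENDER * gender, slope = B_IQ + B_IQGEN * gender.
    """
    return B0 + B_GENDER * gender, B_IQ + B_IQGEN * gender

# (c) Separate fitted lines.
female_line = fitted_line(0)         # (-2.25235, 0.09400)
male_line = fitted_line(1)           # about (-6.09501, 0.12056)

print(round(pred_51, 3), round(resid_51, 3), female_line, male_line)
```

Whether the residual marks individual 51 as an outlier is a judgment the problem asks you to argue from the studentized residual; the code only supplies the prediction.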

OUTPUT FOR PROBLEM 2

The REG Procedure Model: MODEL Dependent Variable: grade

Analysis of Variance

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1        136.31881     136.31881     51.01   <.
Error              76        203.10809     2.
Corrected Total    77        339.

Root MSE          1.63477    R-Square   0.
Dependent Mean    7.44654    Adj R-Sq   0.
Coeff Var        21.

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   95% Confidence Limits
Intercept    1             -3.55706          1.55176     -2.29     0.0247   -6.64766   -0.
iq           1              0.10102          0.01414      7.14     <.0001    0.07285    0.

The REG Procedure Model: MODEL Dependent Variable: grade

Analysis of Variance

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               3        155.42484      51.80828     20.84   <.
Error              74        184.00205     2.
Corrected Total    77        339.

Root MSE          1.57687    R-Square   0.
Dependent Mean    7.44654    Adj R-Sq   0.
Coeff Var        21.

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1             -2.25235          2.15377     -1.05   0.
iq           1              0.09400          0.02017      4.66   <.
gender       1             -3.84266          3.03670     -1.27   0.
iqgen        1              0.02656          0.02784      0.95   0.

Test sameline Results for Dependent Variable grade

Source         DF   Mean Square   F Value   Pr > F
Numerator       2       9.55302      3.84   0.
Denominator    74       2.
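The SAMELINE statistic can be reproduced from the two Analysis of Variance tables above by comparing the error sums of squares of the reduced model (IQ only) and the full model (IQ, GENDER, IQGEN). The following Python sketch uses only values printed in the listings; the function name is illustrative and not a SAS or library routine.

```python
def general_linear_f(sse_reduced, df_reduced, sse_full, df_full):
    """General linear test: F = [(SSE_R - SSE_F) / (df_R - df_F)] / [SSE_F / df_F]."""
    num_df = df_reduced - df_full
    num_ms = (sse_reduced - sse_full) / num_df   # numerator mean square
    den_ms = sse_full / df_full                  # full-model MSE
    return num_ms / den_ms, num_ms, num_df, df_full

# Error SS and error df from the IQ-only model and from the three-variable model.
f_stat, num_ms, num_df, den_df = general_linear_f(203.10809, 76, 184.00205, 74)

print(round(num_ms, 5), round(f_stat, 2), (num_df, den_df))
```

The numerator mean square and F value agree with the 9.55302 and 3.84 shown in the test output, on 2 and 74 degrees of freedom.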