Material Type: Assignment; Professor: DeCook; Class: 22S - Applied Linear Regression; Subject: Statistics and Actuarial Science; University: University of Iowa; Term: Unknown 1989;
22s: Homework 5
Assigned Wednesday, October 8. Due Wednesday, October 15, at class time.
Dummy variables, One-way ANOVA
Turn in your homework with handwritten or typed responses, and include any relevant plots that you describe.
(a) Draw scatterplots for each pair of quantitative variables, and give the correlation matrix. Here, moral is the dependent variable in our model. Comment on the relationship between each quantitative predictor and the dependent variable (positive or negative, strong or weak), and on the relationship between the two quantitative predictors.
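As a rough illustration of the mechanics in part (a), the sketch below draws a scatterplot matrix and prints a correlation matrix. The data frame `dat` and its values are made up for illustration; only the column names moral, hetero, and mobility come from the assignment.

```r
# Hypothetical toy data: the values are simulated, only the column
# names follow the assignment.
set.seed(1)
n <- 24
dat <- data.frame(
  moral    = rnorm(n, mean = 11, sd = 3),
  hetero   = rnorm(n, mean = 30, sd = 10),
  mobility = rnorm(n, mean = 25, sd = 5)
)

# Keep only the quantitative variables.
quant <- dat[, c("moral", "hetero", "mobility")]

# Scatterplot matrix: one panel per pair of variables.
pairs(quant)

# Pearson correlation matrix, rounded for reading.
cor_mat <- cor(quant)
print(round(cor_mat, 2))
```

With the real data, the sign and size of each entry in the moral row is what the comment in part (a) should address.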
(b) Construct dummy variables to represent the four regions East, Midwest, West, and South (use South as the baseline group). Regress moral on the three other predictors in the data set. Provide the ‘summary’ output from the model fit.
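One possible way to set up the dummy variables in part (b) is sketched below. The data frame `dat` is simulated and the dummy names `east`, `midwest`, and `west` are illustrative choices, not names given in the assignment. South gets no dummy of its own, which is what makes it the baseline group.

```r
# Hypothetical toy data: simulated values, column names from the assignment.
set.seed(1)
n <- 24
dat <- data.frame(
  moral    = rnorm(n, mean = 11, sd = 3),
  hetero   = rnorm(n, mean = 30, sd = 10),
  mobility = rnorm(n, mean = 25, sd = 5),
  region   = rep(c("East", "Midwest", "West", "South"), each = 6)
)

# 0/1 dummy variables for three of the four regions.
# A South observation has all three equal to 0, so South is the baseline.
dat$east    <- as.numeric(dat$region == "East")
dat$midwest <- as.numeric(dat$region == "Midwest")
dat$west    <- as.numeric(dat$region == "West")

# Regress moral on the two quantitative predictors plus the region dummies.
fit_full <- lm(moral ~ hetero + mobility + east + midwest + west, data = dat)
print(summary(fit_full))
```

Each dummy coefficient is then read as the difference in expected moral between that region and South, holding hetero and mobility fixed.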
(c) Perform a partial F-test to see if region is a significant predictor of moral given that hetero and mobility have been accounted for. Give the relevant findings of the test.
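For reference, the partial F-statistic in part (c) can be written as follows. The notation is an assumption and may differ from the course text: $RSS_0$ is the residual sum of squares of the reduced model (moral on hetero and mobility only), $RSS_1$ is that of the full model with the three region dummies added, $q = 3$ is the number of dummy regressors being tested, $k = 5$ is the number of regressors in the full model, and $n$ is the number of observations.

```latex
F_0 = \frac{(RSS_0 - RSS_1)/q}{RSS_1/(n - k - 1)}
    = \frac{(RSS_0 - RSS_1)/3}{RSS_1/(n - 6)},
\qquad
F_0 \sim F_{3,\; n-6} \quad \text{under } H_0:\ \text{all three region coefficients equal } 0 .
```

In R, passing the reduced and the full fitted models to anova() reports this comparison.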
A smoking dummy variable $D_i$ was created with non-smoker = 0 and smoker = 1.
Consider the common-slope model (this is an additive model):
$Y_i = \beta_0 + \beta_{age} x_i + \beta_D D_i + \varepsilon_i$
The parameter estimates from the fitted model are: $\hat{\beta}_0 = 0.3673$, $\hat{\beta}_{age} = 0.2306$, $\hat{\beta}_D = -0.2090$
(a) What is the fitted model for smokers? (Provide it using the values above.)
(b) What is the fitted model for non-smokers?
(c) Based on the fitted values, does it look like smokers or non-smokers have a higher expelled volume at every age? On what did you base your decision?
(d) The individuals in this study ranged in age from 6-22 years old. Does the volume of expelled air go up or down as one gets older for this age group? On what did you base your decision?
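To see how the dummy variable acts in a common-slope model, the short sketch below evaluates the fitted equation at a few ages for both groups. The coefficient values are the ones quoted in the problem; the function name `fitted_volume` and the ages chosen are illustrative.

```r
# Parameter estimates quoted in the problem statement.
b0    <- 0.3673
b_age <- 0.2306
b_D   <- -0.2090

# Fitted value of the common-slope model.
# smoker is the dummy variable: 0 = non-smoker, 1 = smoker.
fitted_volume <- function(age, smoker) {
  b0 + b_age * age + b_D * smoker
}

# Evaluate both groups at a few ages inside the observed range (6 to 22).
ages <- c(6, 14, 22)
fitted_table <- cbind(
  age        = ages,
  non_smoker = fitted_volume(ages, 0),
  smoker     = fitted_volume(ages, 1)
)
print(fitted_table)
```

Because the model is additive, the two fitted lines share the same slope and differ only in their intercepts, so the difference between the two columns is the same at every age.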
(a) Find the means of the 4 groups, and draw parallel boxplots of mobility by region. Perform Levene's test for constant variance.
> levene.test(mobility, region)
Comment on the plot and the test, and provide the plot as well.
(b) In the effects model for ANOVA, we have mentioned that we have an overparameterization, and we need to impose a constraint on the parameters before we do our estimation. The constraint chosen impacts the interpretation of the parameters.
I want you to confirm that identical sums of squares are produced by the following three computational methods. For each, provide the coding you used, the RegSS, and the RSS.
(i) Set $\alpha_4 = 0$ (note that setting $\alpha_4 = 0$ is the same as using South as the baseline group, and you did this already in problem 1). Set up your dummy variables accordingly, and fit the 1-way ANOVA model using the lm() function.
(ii) Set $\alpha_4 = -(\alpha_1 + \alpha_2 + \alpha_3)$. You'll use -1, 0, 1 coding for your dummy variables. Pages 145-146 show an example of this often-used coding system, called deviation regressors. Set up your dummy variables accordingly, and fit the 1-way ANOVA model using the lm() function.
(iii) Use the formulas in the first column of Table 8.1 on p. 147 to calculate the sums of squares directly.
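The three methods can be compared side by side on made-up data, as in the sketch below. The simulated `mobility` values, the dummy names, and the helper `sums_of_squares` are all illustrative assumptions. The direct calculation uses the usual one-way ANOVA definitions (between-group and within-group sums of squares), which is assumed to match the table cited above.

```r
# Hypothetical toy data with unequal group sizes.
set.seed(42)
region <- factor(
  rep(c("East", "Midwest", "West", "South"), times = c(5, 6, 7, 6)),
  levels = c("East", "Midwest", "West", "South")
)
mobility <- rnorm(length(region),
                  mean = c(18, 22, 30, 25)[as.integer(region)], sd = 4)

# 0/1 indicators for each region.
east    <- as.numeric(region == "East")
midwest <- as.numeric(region == "Midwest")
west    <- as.numeric(region == "West")
south   <- as.numeric(region == "South")

# Helper (not from the assignment): RegSS and RSS of a fitted lm.
sums_of_squares <- function(fit) {
  y <- model.response(model.frame(fit))
  c(RegSS = sum((fitted(fit) - mean(y))^2), RSS = sum(resid(fit)^2))
}

# (i) alpha_4 = 0: 0/1 dummies, South is the baseline.
fit_baseline <- lm(mobility ~ east + midwest + west)

# (ii) alpha_4 = -(alpha_1 + alpha_2 + alpha_3): deviation regressors.
# Each regressor is 1 for its own region, -1 for South, 0 otherwise.
fit_deviation <- lm(mobility ~ I(east - south) + I(midwest - south) + I(west - south))

# (iii) Direct calculation from group means and group sizes.
group_means <- as.vector(tapply(mobility, region, mean))
group_sizes <- as.vector(table(region))
direct <- c(
  RegSS = sum(group_sizes * (group_means - mean(mobility))^2),
  RSS   = sum((mobility - ave(mobility, region))^2)
)

print(rbind(baseline  = sums_of_squares(fit_baseline),
            deviation = sums_of_squares(fit_deviation),
            direct    = direct))
```

The rows agree because both codings span the same column space, so the fitted values are the group means in each case; only the individual coefficients and their interpretation change.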
(c) Summarize the sums of squares in an ANOVA table like the one at the bottom of p. 148.
(d) Test the significance of region in the 1-way ANOVA model using the following:
lm.out = lm(mobility ~ region)  # fit the 1-way ANOVA model
anova(lm.out)                   # ANOVA table with the F-test for region