Statistics 512: Problem Set No. 9 - Comparing Regression with Dummy Variables and ANOVA

A problem set from a Statistics 512 course comparing regression with dummy variables and analysis of variance (ANOVA) using SAS. Students run the provided SAS code, compare the ANOVA tables and parameter estimates from different parameterizations, calculate combinations of coefficients, and check assumptions. The problem set also asks for the Tukey multiple comparison method and for hypothesis tests on contrasts.

Statistics 512: Problem Set No. 9
Due November 7, 2008
1. For this problem, the idea is to demonstrate the similarity between regression with dummy
variables and ANOVA. To do this, run the SAS code stat512prob8.sas.
(a) Compare the ANOVA table and parameter results from the GLM analysis and
Parameterization #1. What do the coefficients associated with $X_1$ and $X_2$ (i.e., $b_1$ and
$b_2$) estimate in terms of treatment means? What constraint system does this
parameterization correspond to?
(b) Compare the ANOVA table and parameter results from the GLM analysis and
Parameterization #2. What do the coefficients associated with $X_1$ and $X_2$ (i.e., $b_1$ and
$b_2$) estimate in terms of treatment means? What constraint system does this
parameterization correspond to?
(c) Calculate $b_0 + b_1$ for both parameterizations and show that the answers are the same.
What does this quantity estimate in terms of the treatment means (i.e., why are they
the same)?
(d) Calculate $b_1 - b_2$ for both parameterizations and show that the answers are the same.
What does this quantity estimate in terms of the treatment means (i.e., why are they
the same)?
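The script stat512prob8.sas is not reproduced here, so the following is only a sketch under stated assumptions: three treatment groups with sample means $\bar{Y}_{1\cdot}, \bar{Y}_{2\cdot}, \bar{Y}_{3\cdot}$, and two common codings (indicator coding with the last group as reference, and effect coding with $-1$ for the last group). Whether the script's Parameterization #1 and #2 match these codings, and in which order, is not known from this problem set.

```latex
% Assumed coding A (indicator): X_j = 1 if group j, 0 otherwise, j = 1, 2
\begin{align*}
b_0 &= \bar{Y}_{3\cdot}, &
b_1 &= \bar{Y}_{1\cdot} - \bar{Y}_{3\cdot}, &
b_2 &= \bar{Y}_{2\cdot} - \bar{Y}_{3\cdot}
\end{align*}

% Assumed coding B (effect): X_j = 1 if group j, -1 if group 3, 0 otherwise
\begin{align*}
b_0 &= \tfrac{1}{3}\left(\bar{Y}_{1\cdot} + \bar{Y}_{2\cdot} + \bar{Y}_{3\cdot}\right), &
b_1 &= \bar{Y}_{1\cdot} - b_0, &
b_2 &= \bar{Y}_{2\cdot} - b_0
\end{align*}

% Under either coding the fitted values are the group means, so
\begin{align*}
b_0 + b_1 &= \bar{Y}_{1\cdot}, &
b_1 - b_2 &= \bar{Y}_{1\cdot} - \bar{Y}_{2\cdot}
\end{align*}
```

Both codings fit the same saturated model, so any quantity that can be written in terms of fitted group means does not depend on which coding is used.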
The next three problems use the dataset from Problem 16.11 described on page 725
of KNNL, and continue the analysis begun on Problem Set 8.
2. Use the Tukey multiple comparison method to determine which pairs of machines differ
significantly. Summarize the results.
3. Suppose you want to compare the average of the first two machines with the average of the
last four. Use the estimate and contrast statements in proc glm to test the appropriate
hypothesis. Report the estimated value of this contrast with its standard error; state the null
and alternative hypotheses, the test statistic with degrees of freedom, the p-value and your
conclusion.
4. Check assumptions using the residuals. Turn in the plots/output you used to check the
assumptions and state your conclusions.
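Problems 2 through 4 can share a single proc glm run. The sketch below is one possible setup, not the course's official code: the dataset name `filling` and the variable names `y` and `machine` are hypothetical, and the contrast coefficients assume that `machine` has exactly six levels sorted in the order 1 to 6.

```sas
/* Sketch only: filling, y, and machine are hypothetical names;
   substitute the ones used on Problem Set 8. */
proc glm data=filling;
  class machine;
  model y = machine;

  /* Problem 2: Tukey pairwise comparisons of the machine means */
  means machine / tukey cldiff;

  /* Problem 3: (mu1 + mu2)/2 - (mu3 + mu4 + mu5 + mu6)/4.
     Integer coefficients with divisor=4 give weights 1/2 and -1/4.
     Assumes six machine levels, sorted 1..6. */
  estimate 'avg(1,2) - avg(3-6)' machine 2 2 -1 -1 -1 -1 / divisor=4;
  contrast 'avg(1,2) - avg(3-6)' machine 2 2 -1 -1 -1 -1;

  /* Problem 4: save residuals and fitted values for diagnostics */
  output out=diag r=resid p=pred;
run;
quit;

/* Residuals vs. fitted values: look for unequal spread across machines */
proc sgplot data=diag;
  scatter x=pred y=resid;
  refline 0 / axis=y;
run;

/* Normality: Q-Q plot of the residuals plus formal tests */
proc univariate data=diag normal;
  var resid;
  qqplot resid / normal(mu=est sigma=est);
run;
```

The estimate statement reports the contrast estimate with its standard error and t statistic; the contrast statement reports the equivalent F test on 1 numerator degree of freedom, with F equal to the square of that t.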
The remaining problems use the dataset from Problem 18.15 on page 804 of KNNL.
5. KNNL 18.15 (omit part (e)). Please do not print out all 80 values for part (a); it is sufficient
to plot them in part (b).
6. A rather simple approximation of the Box-Cox procedure is the following:
(a) Compute the mean and standard deviation for each treatment factor level.
(b) Take the log of both the mean and standard deviation.
(c) Fit the regression model $\log(\sigma_i) = \beta_0 + \beta_1 \log(\mu_i) + \epsilon$ using the observed means and
standard deviations as the data for $\mu_i$ and $\sigma_i$, respectively (there are 4 "observations"
in this dataset).
(d) Set $\hat{\lambda} = 1 - b_1$, where $b_1$ is the estimate for $\beta_1$ obtained in (6c).

Use the Helicopter service data to perform this approximation. What value of λ appears reasonable according to this method?
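Steps (a) through (d) might be carried out along the following lines. This is a sketch under assumptions: the response variable name `uses` is hypothetical, while the dataset name `helicopter` and the factor `shift` are taken from the proc transreg call in this problem set.

```sas
/* Sketch only: the response name uses is an assumption. */

/* (a) Mean and standard deviation per shift.
   nway keeps only the per-shift rows (drops the overall row). */
proc means data=helicopter noprint nway;
  class shift;
  var uses;
  output out=stats mean=mu std=sigma;
run;

/* (b) Logs of both summaries */
data logstats;
  set stats;
  logmu = log(mu);
  logsigma = log(sigma);
run;

/* (c) Regress log(sigma) on log(mu); the slope is b1.
   outest= saves the coefficients, with the slope stored
   in a variable named after the regressor (logmu). */
proc reg data=logstats outest=est;
  model logsigma = logmu;
run;
quit;

/* (d) lambda-hat = 1 - b1 */
data lam;
  set est;
  lambda_hat = 1 - logmu;
run;

proc print data=lam;
  var lambda_hat;
run;
```

The rule in (d) comes from a first-order argument: if $\sigma \propto \mu^{\beta_1}$, the standard deviation of $Y^{\lambda}$ is approximately proportional to $\mu^{\lambda - 1 + \beta_1}$, which is constant in $\mu$ when $\lambda = 1 - \beta_1$. With only 4 points in the regression, the resulting value is a rough guide and is usually rounded to a convenient power.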

7. Define a new response variable by adding 1 to the original response. (This avoids 0's, which break the log and reciprocal transformations.) Then use SAS's Box-Cox procedure to determine an appropriate transformation. Proc transreg can be used to perform ANOVA if we tell it that shift is a class variable, as in the following:

    proc transreg data=helicopter;
      model boxcox(usesplus1) = class(shift);
    run;
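The call above presupposes that `usesplus1` already exists in the dataset. A minimal sketch of creating it follows, assuming the original response is named `uses` (a hypothetical name); the lambda grid shown is an arbitrary illustrative choice, not one prescribed by the problem set.

```sas
/* Sketch only: assumes the original response is named uses. */
data helicopter;
  set helicopter;
  /* shift the response away from 0 so that log and
     reciprocal transformations are defined */
  usesplus1 = uses + 1;
run;

proc transreg data=helicopter;
  /* lambda= sets the grid of candidate powers to evaluate */
  model boxcox(usesplus1 / lambda=-2 to 2 by 0.1) = class(shift);
run;
```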

8. KNNL 18.16 (omit the coefficient of correlation in part (b)).
9. Use the Tukey multiple comparison method for differences in means on both the untransformed and transformed Helicopter service data to determine which shifts differ significantly. Summarize and compare the results.
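One way to set up the two Tukey analyses side by side is sketched below. The variable name `uses` is an assumption, and the value of the macro variable `lambda` is a placeholder only, not a result; it should be replaced by the power chosen from the Box-Cox analysis.

```sas
/* Sketch only: uses is an assumed name, and lambda is a
   placeholder, not a recommended value. */
%let lambda = 0.5;

data heli2;
  set helicopter;
  usesplus1 = uses + 1;
  /* power transformation; if the chosen lambda is 0,
     use log(usesplus1) instead */
  ytrans = usesplus1 ** (&lambda);
run;

/* Tukey comparisons on the untransformed response */
proc glm data=heli2;
  class shift;
  model uses = shift;
  means shift / tukey cldiff;
run;
quit;

/* Tukey comparisons on the transformed response */
proc glm data=heli2;
  class shift;
  model ytrans = shift;
  means shift / tukey cldiff;
run;
quit;
```

Comparing which shift pairs are flagged in each output shows how sensitive the conclusions are to the transformation.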