

Material Type: Exam; Class: STATISTICAL METHODS II; Subject: Statistics; University: University of South Carolina - Columbia; Term: Spring 2003;
Typology: Exams


Part I: Answer the two following questions. Five points each.
In performing a regression, ANOVA, or ANCOVA, what four assumptions must be satisfied? The errors must be normally distributed, have mean zero, have equal variance at each treatment level, and be independent.
Define what is meant by the p-value (or empirical significance level) of a test. The probability of observing a test statistic as extreme as, or more extreme than, the one actually observed, if the null hypothesis is true.
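As a rough illustration of this definition, the sketch below computes a two-sided p-value for a z statistic under a standard normal null distribution. The z-test setting and the name `two_sided_p_value` are assumptions made for the example; they do not come from the exam.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Return P(|Z| >= |z|) for a standard normal Z (hypothetical z-test)."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-tailed area.
    return math.erfc(abs(z) / math.sqrt(2))


# Illustrative statistic only; 1.96 is the familiar two-sided 5% cutoff.
p = two_sided_p_value(1.96)
print(round(p, 4))
```

A small p-value means that a statistic this extreme would be unusual if the null hypothesis were true.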
Part II: Answer fifteen of the following sixteen questions. Six points each.
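Complete the model equation for this regression, identifying the parameters used. $y_i = \beta_0 + \beta_{dbh}\,\mathrm{dbh}_i + \beta_{height}\,\mathrm{height}_i + \beta_{age}\,\mathrm{age}_i + \beta_{grav}\,\mathrm{grav}_i + \varepsilon_i$, where $y_i$ = weight of the $i$th tree, $\beta_0$ = intercept, $\beta_{dbh}$ = slope for dbh, $\beta_{height}$ = slope for height, $\beta_{age}$ = slope for age, $\beta_{grav}$ = slope for grav, and $\varepsilon_i$ = error.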
Which of the trees had the most extreme set of x-values? What statistic did you use to tell this? Tree 16, judged by the hat diagonal (leverage). Tree 16 has a hat diagonal of 0.
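To show how a hat diagonal flags extreme x-values, here is a minimal sketch for the simplified case of a single predictor, where the leverage has the closed form h_ii = 1/n + (x_i - x̄)² / Sxx. The exam's model has four predictors, so this is only an analogy. The data values and the name `leverages` are made up for illustration.

```python
def leverages(x):
    """Hat diagonals for simple linear regression with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    # Leverage grows with squared distance from the mean of x.
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


# Made-up predictor values; the last one sits far from the rest.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
h = leverages(x)

# Index of the observation with the largest hat diagonal.
most_extreme = max(range(len(x)), key=lambda i: h[i])
print(most_extreme, [round(v, 2) for v in h])
```

In this one-predictor case the hat diagonals sum to 2, the number of parameters, so a value much larger than 2/n stands out.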
What set of these variables forms the best regression model for predicting the weight of trees? What did you use to tell this? The model with dbh, age, and grav. It has the fewest variables of all the models with C(p) less than p+1, and it has the highest adjusted R^2.
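The two selection criteria named here can be sketched as follows, using the usual textbook formulas for Mallows' C(p) and adjusted R-squared. Here `k` counts the predictors in the candidate model (the p in the answer above), so the model has k + 1 parameters including the intercept. All numbers are hypothetical and are not taken from the exam output.

```python
def mallows_cp(sse_subset, mse_full, n, k):
    """Mallows' C(p) for a subset model with k predictors plus an intercept."""
    return sse_subset / mse_full - (n - 2 * (k + 1))


def adjusted_r2(sse_subset, sst, n, k):
    """Adjusted R-squared: 1 - [SSE / (n - k - 1)] / [SST / (n - 1)]."""
    return 1 - (sse_subset / (n - k - 1)) / (sst / (n - 1))


# Hypothetical values: 20 observations and a 3-predictor candidate model.
n, k = 20, 3
cp = mallows_cp(sse_subset=150.0, mse_full=10.0, n=n, k=k)
adj = adjusted_r2(sse_subset=150.0, sst=1000.0, n=n, k=k)

# A candidate is acceptable by the rule above when C(p) is at most k + 1.
print(cp, cp <= k + 1, round(adj, 4))
```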
What problems with the assumptions are apparent from the residual vs. predicted plot? Suggest a transformation that might fix this problem. The mean of the errors isn’t zero, and the variance increases with the predicted value (horn shape). Take the log of y.
Briefly explain why this model is balanced, factorial, with replications, and fixed effects. It is balanced because each combination occurs exactly 5 times. It is factorial because every combination occurs, and it has replication because each combination occurs more than once. It is fixed effects because we care about these specific conditions.
The DF, SS, MS, and F for oven have been deleted. What should they be? oven + roast + oven*roast = model, so take the differences for the SS and df, then use the MSE to get the F. DF = 1, SS = 14.06596, MS = 14.06596, F = 1.
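The subtraction described in this answer can be sketched as below. It relies on the design being balanced, so that the model sum of squares splits exactly into the two main effects and the interaction. The function name `missing_factor_row` and every number in the example are hypothetical; they are not the values from the exam's ANOVA table.

```python
def missing_factor_row(model, other_main, interaction, mse):
    """Recover (df, SS, MS, F) for a deleted main-effect row.

    model, other_main, and interaction are (df, ss) pairs from the ANOVA table.
    """
    df = model[0] - other_main[0] - interaction[0]
    ss = model[1] - other_main[1] - interaction[1]
    ms = ss / df   # mean square = SS / df
    f = ms / mse   # F statistic uses the error mean square
    return df, ss, ms, f


# Hypothetical table entries, for illustration only.
row = missing_factor_row(
    model=(7, 100.0),
    other_main=(3, 60.0),
    interaction=(3, 20.0),
    mse=5.0,
)
print(row)
```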
What hypothesis is being tested by the contrast c1? That the average gas needed for fresh is equal to the average needed for the three frozen (or formerly frozen) conditions: $\mu_{fresh} = (\mu_{24} + \mu_{12} + \mu_{froz}) / 3$.
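A contrast like this one is a weighted sum of the group means with weights that add to zero. The sketch below uses the weights (3, -1, -1, -1), which express the same hypothesis as the equation above after multiplying through by 3. The group means and the name `contrast_estimate` are made up for illustration.

```python
def contrast_estimate(means, coeffs):
    """Estimate a contrast as the sum of coefficient times group mean."""
    if sum(coeffs) != 0:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coeffs, means))


# Hypothetical mean gas usage for fresh, 24, 12, and froz, in that order.
means = [10.0, 12.0, 11.0, 13.0]
coeffs = [3, -1, -1, -1]

estimate = contrast_estimate(means, coeffs)
print(estimate)
```

An estimate near zero is consistent with the null hypothesis; the formal test compares the estimate with its standard error.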
Is the assumption of equal variances met? Yes. Modified Levene test p-value > 0.05 (it’s 0.4042)
Assume the assumption of equal slopes is met. Is there a significant difference in the 9 week weight in pigs for the different weaning stages? Yes, p-value = 0.0004 for stage.
Assume the assumption of equal slopes is met. What is the estimated standard deviation of the errors in this model for predicting the 9 week weight of pigs? root-mse=3.
Assume the assumption of equal slopes is met. What percent of the variation in 9 week weight in pigs is explained by this model using wwt and stage? R-squared = 0.706675 = 70.6675%
Is the assumption of equal slopes met for this model? Yes. P-value for interaction is > 0.05 (it’s 0.5190)
Does a logistic regression seem to fit this data at α=0.05? Yes. Hosmer and Lemeshow p-value > 0.05 (it’s 0.2027)
Assuming that the logistic regression does fit this data, is the age of the party member related to their chance of survival at an α=0.05 level? Yes. Likelihood ratio test p-value < 0.05 (it’s 0.0186)
What is the predicted chance of a 15 year old surviving the journey? exp(1.8183 + 15(-0.0665)) / (1 + exp(1.8183 + 15(-0.0665))) = 0.
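The calculation in this answer is the inverse logit of the fitted linear predictor. A minimal sketch, using the intercept 1.8183 and slope -0.0665 quoted in the answer; the function name `predicted_probability` is an assumption for the example.

```python
import math


def predicted_probability(intercept, slope, x):
    """Inverse logit: exp(eta) / (1 + exp(eta)) with eta = intercept + slope * x."""
    eta = intercept + slope * x
    return math.exp(eta) / (1 + math.exp(eta))


# Coefficients as quoted in the answer; age 15 as asked in the question.
p_15 = predicted_probability(1.8183, -0.0665, 15)
print(round(p_15, 4))
```

Because the slope is negative, the predicted chance of survival decreases as age increases.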
Why can’t you trust the answer you gave in 15? We would be extrapolating. (Note that there is actually a 15 year old in the data set, though, so this was a typo on the exam! The question would work if I had asked about, say, a 10 year old.)