Lab for Statistical Methods for Research - Fall 2008 | STAT 401, Lab Reports of Statistics

Material Type: Lab; Class: STAT METH FOR RSRCH; Subject: STATISTICS; University: Iowa State University; Term: Fall 2008;

Uploaded on 09/02/2009

Stat 401B Lab 9: Due November 18 Fall 2008
Below are the Olympic Gold Medal 200 m dash times for women and men from 1948 through
2004.
Year   Women’s Time (s)   Men’s Time (s)      Year   Women’s Time (s)   Men’s Time (s)
1948        24.40              21.10          1980        22.03              20.19
1952        23.70              20.70          1984        21.81              19.80
1956        23.40              20.60          1988        21.34              19.75
1960        24.00              20.50          1992        21.81              20.01
1964        23.00              20.30          1996        22.12              19.32
1968        22.50              19.80          2000        21.84              20.09
1972        22.40              20.00          2004        22.05              19.79
1976        22.37              20.23          2008
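
For anyone checking their work outside JMP, the table above can be stacked into one row per gender per year, which is the layout the combined regressions below need. This is only a sketch: the names TIMES, to_long, and rows are illustrative and do not come from the lab, and the Gender coding (0 = female, 1 = male) follows question 2.

```python
# Gold-medal 200 m times in seconds, transcribed from the table above.
# Each tuple is (year, women's time, men's time). The 2008 row is left out
# because its times are what the lab asks us to predict.
TIMES = [
    (1948, 24.40, 21.10), (1952, 23.70, 20.70), (1956, 23.40, 20.60),
    (1960, 24.00, 20.50), (1964, 23.00, 20.30), (1968, 22.50, 19.80),
    (1972, 22.40, 20.00), (1976, 22.37, 20.23), (1980, 22.03, 20.19),
    (1984, 21.81, 19.80), (1988, 21.34, 19.75), (1992, 21.81, 20.01),
    (1996, 22.12, 19.32), (2000, 21.84, 20.09), (2004, 22.05, 19.79),
]


def to_long(times):
    """Stack the wide table into one row per gender per year."""
    rows = []
    for year, women, men in times:
        # Gender coding as in question 2: 0 = female, 1 = male.
        rows.append({"year": year, "gender": 0, "time": women})
        rows.append({"year": year, "gender": 1, "time": men})
    return rows


rows = to_long(TIMES)
```

The stacked data has 30 rows (15 Olympics, two genders), so the combined fits are based on n = 30 observations.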
We want to be able to predict the gold medal time for the 200 m dash at the 2008 Olympics in
Beijing. We also wish to investigate how the winning times have changed over the past 60 years
for both women and men.
1. Consider the combined set of women’s and men’s times. Fit a simple linear regression
with time as the response and year as the explanatory variable.
a) Give the least squares prediction equation.
b) Give an interpretation of the estimated slope coefficient.
c) Why can’t we interpret the estimated intercept within the context of the problem?
d) How much of the variability in time is explained by year?
e) What do you notice about the plot of residuals versus year? What does this indicate?
2. Fit a multiple linear regression with time as the response and (year – 1976), an indicator
variable for gender (Gender = 0 if female, Gender = 1 if male) and a (year – 1976) by
gender interaction term.
a) How much of the variability in time is explained by this model?
b) Give the least squares prediction equation for this model.
c) Interpret each of the estimated parameters within the context of the problem.
d) In 1976, were the predicted times for men and women statistically different? Support
your answer with the appropriate test or confidence interval.
e) Are women’s and men’s times changing at statistically different rates? Support your
answer with the appropriate test or confidence interval.
f) Predict the winning times for women and men at the 2008 Beijing Olympics.
g) Describe the plot of residuals versus year. What does this indicate about the fit of the
model?
3. Fit a multiple linear regression with time as the response and (year – 1976), Gender,
Gender*(year – 1976), (year – 1976)^2, and Gender*(year – 1976)^2.
a) How much of the variability in time is explained by this model?
b) Does Gender*(year – 1976)^2 add significantly to the model? Support your answer
statistically.
c) What is the “best” model for predicting time? Give the prediction equation.
d) Use this “best” model to predict the men’s and women’s 200 m dash times for the
2008 Beijing Olympics. How do these predictions differ from those in 2 f)?
e) Analyze the residuals for the “best” model. What does this analysis indicate about
the conditions of equal standard deviations, identically and normally distributed
errors?
Be sure to turn in the JMP output you used to answer the questions above.
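
The lab expects JMP output, but the three fits can be cross-checked with a short script. The sketch below is one possible way to do that, using ordinary least squares through the normal equations and only the Python standard library. Every name in it (TIMES, design_row, solve, fit, predict, MODEL_1 to MODEL_3) is hypothetical. One assumption to note: the simple regression of question 1 is fitted here on (year – 1976) rather than raw year. That leaves the slope and R^2 unchanged, but the intercept then refers to 1976; the raw-year intercept is b0 – 1976*b1.

```python
# Sketch only: least squares via the normal equations, standard library only.
TIMES = [  # (year, women's time, men's time) in seconds, from the lab's table
    (1948, 24.40, 21.10), (1952, 23.70, 20.70), (1956, 23.40, 20.60),
    (1960, 24.00, 20.50), (1964, 23.00, 20.30), (1968, 22.50, 19.80),
    (1972, 22.40, 20.00), (1976, 22.37, 20.23), (1980, 22.03, 20.19),
    (1984, 21.81, 19.80), (1988, 21.34, 19.75), (1992, 21.81, 20.01),
    (1996, 22.12, 19.32), (2000, 21.84, 20.09), (2004, 22.05, 19.79),
]


def design_row(year, gender, terms):
    """Build one design-matrix row; x is the year centered at 1976."""
    x = year - 1976
    columns = {"const": 1, "x": x, "g": gender, "g*x": gender * x,
               "x^2": x * x, "g*x^2": gender * x * x}
    return [float(columns[t]) for t in terms]


def solve(a, b):
    """Solve the square system a * beta = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - tail) / m[i][i]
    return beta


def fit(terms):
    """Return (coefficients, SSE, R^2) for the model with the given terms."""
    X, y = [], []
    for year, women, men in TIMES:
        X.append(design_row(year, 0, terms))  # Gender = 0: female
        y.append(women)
        X.append(design_row(year, 1, terms))  # Gender = 1: male
        y.append(men)
    k = len(terms)
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    return beta, sse, 1.0 - sse / sst


def predict(beta, terms, year, gender):
    """Predicted time for one year and gender under a fitted model."""
    return sum(b * v for b, v in zip(beta, design_row(year, gender, terms)))


MODEL_1 = ["const", "x"]                                  # question 1
MODEL_2 = ["const", "x", "g", "g*x"]                      # question 2
MODEL_3 = ["const", "x", "g", "g*x", "x^2", "g*x^2"]      # question 3

beta1, sse1, r2_1 = fit(MODEL_1)
beta2, sse2, r2_2 = fit(MODEL_2)
beta3, sse3, r2_3 = fit(MODEL_3)

# Partial F statistic for adding Gender*(year - 1976)^2 to the model
# that already contains the other five terms (1 and n - 6 degrees of freedom).
_, sse_reduced, _ = fit(MODEL_3[:-1])
n = 2 * len(TIMES)
f_stat = (sse_reduced - sse3) / (sse3 / (n - len(MODEL_3)))

# Question 2 f): predictions for Beijing from the interaction model.
women_2008 = predict(beta2, MODEL_2, 2008, 0)
men_2008 = predict(beta2, MODEL_2, 2008, 1)
```

Printing r2_1, r2_2, r2_3, f_stat, and the two predictions gives numbers to compare against the JMP reports; f_stat is referred to an F distribution with 1 and 24 degrees of freedom. The script does not produce standard errors or residual plots, so parts that ask for tests, confidence intervals, or residual analysis still rely on the JMP output.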
