Homework 2 with 23 Practice Problems on Data Analysis I | STAT 528, Assignments of Statistics

Material Type: Assignment; Class: Data Analysis I; Subject: Statistics; University: Ohio State University - Main Campus; Term: Unknown 2006;

Uploaded on 07/22/2009

Statistics 528, Summer 2006
Homework #2
1. Dataset “EX02_011.MTP” gives data on the lean body mass and metabolic rate
for 12 women and 7 men.
a. Make a scatterplot. Use different symbols or colors for women and men.
Do you think the correlation will be about the same for men and women or
quite different for the two groups? Why?
b. Find the correlation (r) for women alone and for men alone.
c. Calculate the mean body mass for the women and for the men. Does the
fact that the men are heavier than the women on average influence the
correlations you calculated for (b)? If so, in what way?
d. Lean body mass was measured in kilograms. How would the correlations
change if we measured body mass in pounds? (There are about 2.2
pounds in a kilogram.)
2. A mutual fund company’s newsletter says, “A well-diversified portfolio includes
assets with low correlations.” The newsletter includes a table of correlations
between the annual returns on various classes of investments. For example, the
correlation between municipal bonds and large-cap stocks is 0.5, and the
correlation between municipal bonds and small-cap stocks is 0.21.
a. Rachel invests heavily in municipal bonds. She wants to diversify by
adding an investment whose returns do not closely follow the returns on
her bonds. Should she choose large-cap stocks or small-cap stocks for this
purpose? Explain your answer.
b. If Rachel wants an investment that tends to increase when the return on
her bonds drops, what kind of correlation should she look for?
3. Keeping water supplies clean requires regular measurement of levels of
pollutants. The measurements are indirect – a typical analysis involves forming a
dye by a chemical reaction with the dissolved pollutant, then passing light through
the solution and measuring its “absorbance.” To calibrate such measurements, the
laboratory measures known standard solutions and uses regression to relate
absorbance to pollutant concentration. This is usually done every day. Dataset
“EX02_040.MTP” contains one series of data on the absorbance for different
levels of nitrates. Nitrates are measured in milligrams per liter of water.
a. Chemical theory says that these data should lie on a straight line. If the
correlation is not at least 0.997, something went wrong and the calibration
procedure is repeated. Plot the data and find the correlation. Must the
calibration be done again?
b. What is the equation of the least-squares line for predicting absorbance
from concentration? If the lab analyzed a specimen with 500 milligrams
of nitrates per liter, what do you expect the absorbance to be? Based on
your plot and the correlation, do you expect your predicted absorbance to
be very accurate?

Note: for this problem, you must use Minitab to perform the regression and paste your Minitab output into your homework.
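
As a cross-check outside Minitab (the note above still requires Minitab output for the submitted work), the calibration steps in the nitrate problem can be sketched in plain Python. The concentration and absorbance values below are hypothetical placeholders, not the contents of “EX02_040.MTP”, and the helper names `mean`, `correlation`, and `least_squares` are illustrative only.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def correlation(x, y):
    """Pearson correlation r between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# HYPOTHETICAL calibration data, assumed for illustration only.
concentration = [0.0, 100.0, 200.0, 300.0, 400.0]  # mg of nitrates per liter
absorbance = [1.0, 21.0, 41.5, 60.5, 81.0]

r = correlation(concentration, absorbance)
intercept, slope = least_squares(concentration, absorbance)

# Rule stated in the problem: repeat the calibration if r is below 0.997.
recalibrate = r < 0.997

# Predict absorbance at a concentration inside the range of the standards.
predicted = intercept + slope * 250.0

print(f"r = {r:.5f}, recalibrate = {recalibrate}")
print(f"absorbance = {intercept:.3f} + {slope:.5f} * concentration")
print(f"predicted absorbance at 250 mg/L = {predicted:.3f}")
```

With real data, the two lists would be replaced by the worksheet columns. A prediction at a concentration outside the range of the measured standards is an extrapolation and deserves less trust than one inside it.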

4. Investors ask about the relationship between returns on investments in the United States and on investments overseas. “TA02_008.MTP” gives the total returns on overseas (column C2) and U.S. (column C3) common stocks over a 30-year period.
a. Make a scatterplot suitable for predicting overseas returns from U.S. returns.
b. Find the correlation and R^2. Describe the relationship between U.S. and overseas returns in words, using r and R^2 to make your description more precise.
c. Find the least-squares regression line of overseas returns on U.S. returns. Draw the line on the scatterplot. What are the predicted return ŷ and the observed return y for 1993?
d. Are you confident that predictions using the regression line will be quite accurate? Why?
5. Gas chromatography is a technique used to detect very small amounts of a substance, for example, a contaminant in drinking water. Laboratories use regression to calibrate such techniques. The data in “EX02_062.MTP” show the results of five measurements for each of four amounts of the substance being investigated. The explanatory variable x is the amount of substance in the specimen, measured in nanograms (ng), units of 10^-9 gram. The response variable y is the reading from the gas chromatograph.
a. Make a scatterplot of these data. The relationship appears to be approximately linear, but the wide variation in the response values makes it hard to see detail in this graph.
b. Compute the least-squares regression line of y on x, and plot this line on your graph.
c. Now compute the residuals and make a plot of the residuals against x. It is much easier to see deviations from linearity in the residual plot. Describe the pattern displayed by the residuals.
6. A study finds that high school students who take the SAT, enroll in an SAT coaching course, and then take the SAT a second time raise their SAT mathematics scores from a mean of 521 to a mean of 561. What factors other than “taking the course causes higher scores” might explain this improvement?
7. Many studies have found that people who drink alcohol in moderation have lower risk of heart attack than either nondrinkers or heavy drinkers. Does alcohol consumption also improve survival after a heart attack? One study followed 1913 people who were hospitalized after severe heart attacks. In the year before their heart attack, 47% of these people did not drink, 36% drank moderately, and 17% drank heavily. After four years, fewer of the moderate drinkers had died. Is this an observational study or an experiment? Why? What are the explanatory and response variables?
8. A manufacturer of food products uses package liners that are sealed at the top by applying heated jaws after the package is filled. The customer peels the sealed pieces apart to open the package. The manufacturer wants to know what effect
1. The Ministry of Health in the Canadian province of Ontario wants to know whether the national health care system is achieving its goals in the province. The ministry conducted the Ontario Health Survey, which interviewed a probability sample of 61,239 adults who live in Ontario.
a. What is the population for this sample survey? What is the sample?
b. The survey found that 76% of males and 86% of females in the sample had visited a general practitioner at least once in the past year. Do you think these estimates are close to the truth about the entire population? Why?
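
For the gas chromatography problem above, the residual step (fit the line, subtract the fitted values, then inspect the residuals against x) can be sketched as follows. This is only a sketch under stated assumptions: the readings are hypothetical, not the values in “EX02_062.MTP”, the sketch uses two readings per amount where the dataset has five, and the function name `fit_line` is illustrative.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# HYPOTHETICAL data, assumed for illustration only.
amount = [0.25, 0.25, 1.0, 1.0, 5.0, 5.0, 20.0, 20.0]            # ng
response = [7.0, 8.0, 29.0, 31.0, 130.0, 134.0, 480.0, 500.0]    # reading

intercept, slope = fit_line(amount, response)

# Residual = observed response minus the response predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(amount, response)]

# A text stand-in for the residual plot: residuals listed against x.
for x, e in zip(amount, residuals):
    print(f"x = {x:6.2f}  residual = {e:8.3f}")
```

For any least-squares line with an intercept, the residuals sum to zero (up to rounding), so the thing to look for in the plot is a systematic pattern, such as curvature or a spread that changes with x, rather than the overall level.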