Practice Final Exam - Basic Applied Statistics | STAT 0200, Exams of Statistics

Material Type: Exam; Class: BASIC APPLIED STATISTICS; Subject: Statistics; University: University of Pittsburgh; Term: Spring 2009;


Uploaded on 09/09/2009

Name:
Practice FINAL
Dr. Nancy Pfenning
Statistics 0200
Spring 2009
The Final will be a closed book exam worth 250 points. You are allowed to bring a
calculator and 2 two-sided sheets of notes.
1. (10 pts.) Select the most appropriate display for each of the following.
(a) Workers’ income and whether they are black, white, Asian, or Hispanic:
(i) bar graph (ii) histogram (iii) pie chart (iv) scatterplot
(v) back-to-back stemplot (vi) side-by-side boxplots
(b) Rent charged and distance from the university for a sample of Oakland apartments:
(i) bar graph (ii) histogram (iii) pie chart (iv) scatterplot
(v) back-to-back stemplot (vi) side-by-side boxplots
(c) Workers’ marital status and whether they smoke or not:
(i) histogram (ii) bar graph (iii) scatterplot (iv) back-to-back stemplot
(v) side-by-side boxplots
2. (5 pts.) Which one of the following is a paired study? (a) Measure level of depression
for a random sample of internet users and for a random sample of non-users
(b) Measure level of depression for a random sample of non-internet users; provide
them with internet use for a year and then measure their level of depression.
3. (5 pts.) In general, which is more likely to contain the unknown population mean?
(a) a 90% confidence interval (b) a 99% confidence interval (c) both the same
4. (25 pts.) Scientists believe people’s ears get larger with age. They measured ear length
in a sample of patients, aged 30 to 93, and found their ears grew about .01 inches a
year.
(a) Was this an experiment or an observational study?
(b) What is the explanatory variable? Is it quantitative or categorical?
(c) What is the response variable? Is it quantitative or categorical?
(d) According to our notation, .01 is denoted by which of the following?
μ, σ, p, β0, β1, χ², n, x̄, s, p̂, b0, b1, α


(e) A Chinese physician recalled his mother’s childhood nagging: “Stretch your ears daily, child, to ensure long life.” In fact, some scientists believe that people with small ears die younger, leaving a population of healthier old people with big ears. What would be the explanatory variable, according to this theory?

5. (25 pts.) Below are a scatterplot and regression output, where the data consist of average viewer ratings (on a scale of 1 to 10) and gross receipts (in millions of dollars) of 15 recent movies:
(a) Considering the appearance of the scatterplot, which of the following is the most reasonable guess for the correlation r?
(i) -.95 (ii) -.6 (iii) -.1 (iv) +.1 (v) +.6 (vi) +.
(b) Use the regression equation to predict the receipts of a movie rated 9 by viewers. UNDERLINE the part of the output that estimates about how far off this prediction would be.
(c) According to the regression output, is there statistical evidence of a relationship between ratings and receipts? Answer yes or no and CIRCLE THE SPECIFIC PART OF THE OUTPUT THAT YOU USE TO DECIDE.
(d) Since estimated receipts for a movie rated 8 are −113 + 36.5(8) = 179, and s = 75.93, a 95% confidence interval for mean receipts of movies rated 8 is roughly 179 ± 2(75.93)/√15 = (140, 218) and a 95% prediction interval for receipts of a particular movie rated 8 is roughly 179 ± 2(75.93) = (27, 331). Are you willing to believe that all movies rated 8 average 300 million in receipts?
(e) If we set up a prediction interval for receipts of a particular movie rated 4 instead of 8, what would be the most noticeable difference:
(i) the interval would be centered at lower values (ii) the interval would be wider (iii) the interval would be narrower

Regression Analysis: receipts versus rating

The regression equation is
receipts = - 113 + 36.5 rating

Predictor   Coef      SE Coef   T       P
Constant    -112.60   84.37     -1.33   0.
rating      36.45     12.00     3.04    0.

S = 75.93   R-Sq = 41.5%   R-Sq(adj) = 37.0%
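The arithmetic quoted in part (d) can be checked with a short script. This is only a sketch: it uses the rounded coefficients and the rough multiplier 2 exactly as the question does, and the variable names are chosen here for illustration.

```python
import math

# Values quoted in part (d) of the question.
b0, b1 = -113.0, 36.5   # rounded regression coefficients
s = 75.93               # regression standard error S from the output
n = 15                  # number of movies in the sample
rating = 8

estimate = b0 + b1 * rating          # predicted receipts, in millions

# Rough 95% margins using the multiplier 2, as the question does.
ci_margin = 2 * s / math.sqrt(n)     # for the MEAN receipts of all movies rated 8
pi_margin = 2 * s                    # for ONE particular movie rated 8

ci = (estimate - ci_margin, estimate + ci_margin)
pi = (estimate - pi_margin, estimate + pi_margin)

print(f"estimate: {estimate:.0f}")
print(f"approx. 95% CI for mean receipts: ({ci[0]:.0f}, {ci[1]:.0f})")
print(f"approx. 95% PI for one movie:     ({pi[0]:.0f}, {pi[1]:.0f})")
```

The confidence interval is narrower than the prediction interval because its margin is divided by √n, while the prediction interval has to cover the spread of individual movies.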

6. (30 pts.) This table from 1995 shows layoffs by ethnic group for U.S. Labor Department employees. We want to use a chi-square test to determine if there is convincing evidence of a relationship between race and layoffs.

Observed           Laid Off   Not Laid Off   Total
African American   130        1380           1510
White              90         2810           2900
Total              220        4190           4410

(a) The data were probably obtained from (i) an observational study (ii) an experiment (iii) it could easily have been either (i) or (ii).
(b) State the appropriate null and alternative hypotheses.
(c) In the “Observed” table, who has a higher proportion of layoffs? (i) African Americans (ii) whites (iii) both the same.
(d) If there were no relationship between race and layoffs, what counts would be expected in each cell? Display them in the table provided.

Expected           Laid Off   Not Laid Off   Total
African American                             1510
White                                        2900
Total              220        4190           4410

(e) Calculate the value X^2 of the chi-square statistic.
(f) How many degrees of freedom are there?
(g) The p-value is (i) very small (ii) somewhat small (iii) borderline (iv) not small at all
(h) Draw your conclusions (circle two of the following):
i. There is evidence of a relationship between race and layoffs.
ii. There is no compelling evidence of a relationship between race and layoffs.
iii. These workers could represent random samples taken from populations where equal proportions were laid off for African Americans and whites.
iv. We do not believe these workers represent random samples taken from populations where equal proportions were laid off for African Americans and whites.
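For checking a hand calculation in parts (d) through (f), the following sketch computes expected counts as (row total × column total) / grand total and sums (observed − expected)² / expected over the four cells. It uses only the observed counts from the table above; the variable names are illustrative.

```python
# Observed counts from the layoff table.
observed = [
    [130, 1380],   # African American: laid off, not laid off
    [90, 2810],    # White: laid off, not laid off
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under "no relationship": row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an r-by-c table: (r - 1) * (c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

for label, row in zip(["African American", "White"], expected):
    print(label, [round(x, 2) for x in row])
print("chi-square:", round(chi_square, 2), "df:", df)
```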

7. (5 pts.) A biologist observes a sample of 81 cells to have mean dividing time 30. minutes and standard deviation 4.9 minutes.
(a) Construct a 95% confidence interval for population mean dividing time μ.
(b) Suppose a sample of only 11 cells had mean 30.5 and standard deviation 4.9. Give three reasons why a 99% confidence interval for μ would be wider than your interval in (a).

[Figure: t distribution (df = 3) with tail areas marked at several t values, including -3.18; the remaining labels are not recoverable.]
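The width comparison in the cell-division question above can be made concrete with a sketch of the margin of error, multiplier × s / √n. Only the margins are computed, since the sample mean in part (a) is truncated in this copy. The multiplier 2 for part (a) is the rough 95% value used elsewhere on this exam, and 3.169 is the standard t-table value for 99% confidence with 10 degrees of freedom; both choices are assumptions about how the course would set up the intervals.

```python
import math

def margin_of_error(s, n, multiplier):
    """Margin of error for a confidence interval for a mean: multiplier * s / sqrt(n)."""
    return multiplier * s / math.sqrt(n)

# Part (a): n = 81, s = 4.9, rough 95% multiplier of 2 (assumed).
margin_a = margin_of_error(s=4.9, n=81, multiplier=2)

# Part (b): n = 11, s = 4.9, 99% confidence.
# 3.169 is the t-table value for df = 10 with 0.005 in each tail.
T_99_DF10 = 3.169
margin_b = margin_of_error(s=4.9, n=11, multiplier=T_99_DF10)

print(f"margin in (a): {margin_a:.2f}")
print(f"margin in (b): {margin_b:.2f}")
```

The second margin is larger for each of the reasons the question asks about: a higher confidence level, a smaller sample, and a t multiplier in place of z.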
8. (20 pts.) Researchers at the University of California Earthquake Center want to compare intensities of earthquakes at San Francisco Bay and at the Los Angeles Basin. The data, assumed to be independent random samples from the two areas, represent intensity on the Richter scale of tremors felt during several months in 1994.
(a) If researchers want to test if mean intensities differ between the two regions, state the appropriate null and alternative hypotheses.
(b) Use the two-sample t statistic in the output to characterize the p-value: (i) extremely small (ii) somewhat small (iii) borderline (iv) not small at all
(c) Is there statistically significant evidence of a difference?
(d) Would you expect a confidence interval for difference in population means to contain zero?
(e) Are your conclusions still valid if one of the data sets is very skewed?
(f) Circle the part of the output that would help you decide if the Rule of Thumb for use of a pooled procedure is satisfied.

Two-sample T for SF vs LA

     N    Mean   StDev   SE Mean
SF   14   2.59   1.02    0.
LA   16   2.13   1.09    0.

Difference = mu SF - mu LA
Estimate for difference: 0.
95% CI for difference: (-0.339, 1.248)
T-Test of difference = 0 (vs not =): T-Value = 1.
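Several values in the output above are cut off in this copy, so the sketch below recomputes a two-sample t statistic from the rounded summary statistics only. It assumes the unpooled standard error √(s₁²/n₁ + s₂²/n₂); whether the original output used a pooled or unpooled procedure is not stated, and results from rounded inputs may differ slightly from the software's. The ratio of standard deviations is included because pooled-procedure rules of thumb are usually phrased in terms of it; the course's exact threshold is not given here.

```python
import math

# Summary statistics as printed in the output (rounded).
n_sf, mean_sf, sd_sf = 14, 2.59, 1.02
n_la, mean_la, sd_la = 16, 2.13, 1.09

diff = mean_sf - mean_la

# Unpooled standard error of the difference in sample means (an assumption).
se_diff = math.sqrt(sd_sf**2 / n_sf + sd_la**2 / n_la)
t_stat = diff / se_diff

# Ratio of larger to smaller sample standard deviation.
sd_ratio = max(sd_sf, sd_la) / min(sd_sf, sd_la)

print(f"difference: {diff:.2f}")
print(f"standard error: {se_diff:.3f}")
print(f"t statistic: {t_stat:.2f}")
print(f"SD ratio: {sd_ratio:.2f}")
```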

9. (10 pts.) Below are boxplots for students’ scores on 3 different versions of an exam. Tell whether each of the following would make the data more convincing or less convincing that in general the three exams do not share the same level of difficulty.
(a) if sample means differed more: (i) more convincing (ii) less convincing
(b) if the boxes were taller, with longer whiskers: (i) more convincing (ii) less convincing
(c) if sample sizes were larger: (i) more convincing (ii) less convincing

10. (30 pts.) Three groups of six guinea pigs each were injected, respectively, with small, medium, and large doses of a new tranquilizer, and the following are data for the number of minutes Xi (for i = 1, 2, 3) it took them to fall asleep:

small:  n1 = 6, x̄1 = 22.5, s1 = 2.168
medium: n2 = 6, x̄2 = 20.0, s2 = 1.414
large:  n3 = 6, x̄3 = 13.0, s3 = 2.098

(a) Is this study an experiment? Circle one of the following:
i. Yes, because more than one dosage level is tested.
ii. Yes, because there is a treatment involved (injecting with various amounts of tranquilizer) for the units.
iii. No, this is an observational study on a random sample of guinea pigs.
iv. No, because there is no control group of guinea pigs which receive no injection.
(b) Under which dosage level did sample guinea pigs fall asleep fastest?
(c) Is it safe to assume that population standard deviations are equal (as required in an analysis of variance)? Answer yes or no, and circle the values above that you use to decide.
(d) State the appropriate null and alternative hypotheses.
(e) Complete the blanks in the ANOVA table below.
(f) Circle two of the following which are correct conclusions:
i. Mean time to fall asleep may be the same for all guinea pigs given the three dosage levels.
ii. Not all the mean times to fall asleep are the same for populations of guinea pigs on the three dosage levels.
iii. There is a relationship between dosage level and time it takes to fall asleep.
iv. There is no relationship between dosage level and time it takes to fall asleep.
(g) Which of these would tend to lead to less convincing evidence? (You may circle anywhere from none to all three.)
i. if sample sizes were larger
ii. if sample standard deviations were larger
iii. if sample means were more different

Source   DF     Sum of Squares   Mean Square   F         P-value
Group    ____   291.0            ____          ____      0.
Error    ____   55.5             ____          XXXXXXX   XXXXXXX
Total    XX     XXXXX            XXXXXXX       XXXXXXX   XXXXXXX

(h) Which of these is the best way to assign guinea pigs to treatments?
i. with eyes closed select 6 guinea pigs to receive small doses, then 6 guinea pigs to receive medium doses; the rest receive large doses
ii. use random numbers from a table or generated by software
iii. give small doses to the 6 least active guinea pigs, and large doses to the 6 most active
iv. give small doses to the 6 most active guinea pigs, and large doses to the 6 least active
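The ANOVA table in the guinea pig question can be rebuilt from the group summary statistics alone, which is a useful check on a hand calculation for part (e). This sketch uses the standard one-way ANOVA formulas: the group sum of squares is Σ nᵢ(x̄ᵢ − x̄)², the error sum of squares is Σ (nᵢ − 1)sᵢ², and F is the ratio of the two mean squares. The dictionary layout and names are illustrative.

```python
# Summary statistics from the question: (n, sample mean, sample SD) per dose.
groups = {
    "small":  (6, 22.5, 2.168),
    "medium": (6, 20.0, 1.414),
    "large":  (6, 13.0, 2.098),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Variation between group means (Group row) and within groups (Error row).
ss_group = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
ss_error = sum((n - 1) * s ** 2 for n, _, s in groups.values())

df_group = len(groups) - 1          # number of groups minus 1
df_error = n_total - len(groups)    # total sample size minus number of groups

ms_group = ss_group / df_group
ms_error = ss_error / df_error
f_stat = ms_group / ms_error

print(f"Group: DF={df_group}, SS={ss_group:.1f}, MS={ms_group:.1f}, F={f_stat:.1f}")
print(f"Error: DF={df_error}, SS={ss_error:.1f}, MS={ms_error:.2f}")
```

The computed sums of squares agree with the 291.0 and 55.5 printed in the table, which suggests the summary statistics and the table describe the same data.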

iii. 95% is the probability that the confidence interval produced from this sample contains the proportion of all smokers whose desire to quit would be increased by the warnings.
iv. 95% is the probability that the proportion of all smokers whose desire to quit would be increased by the warnings falls in the interval.

(d) Based on your confidence interval in (c), what would be the outcome of a test of Ho : p = .5 vs Ha : p < .5 where p refers to the population proportion whose desire to quit would be increased by the warnings?