







Material Type: Exam; Class: BASIC APPLIED STATISTICS; Subject: Statistics; University: University of Pittsburgh; Term: Spring 2009;
Typology: Exams








Name:
Dr. Nancy Pfenning Statistics 0200 Spring 2009
The Final will be a closed book exam worth 250 points. You are allowed to bring a calculator and 2 two-sided sheets of notes.
(a) Workers’ income and whether they are black, white, Asian, or Hispanic:
    (i) bar graph (ii) histogram (iii) pie chart (iv) scatterplot (v) back-to-back stemplot (vi) side-by-side boxplots
(b) Rent charged and distance from the university for a sample of Oakland apartments:
    (i) bar graph (ii) histogram (iii) pie chart (iv) scatterplot (v) back-to-back stemplot (vi) side-by-side boxplots
(c) Workers’ marital status and whether they smoke or not:
    (i) histogram (ii) bar graph (iii) scatterplot (iv) back-to-back stemplot (v) side-by-side boxplots
(a) Was this an experiment or an observational study?
(b) What is the explanatory variable? Is it quantitative or categorical?
(c) What is the response variable? Is it quantitative or categorical?
(d) According to our notation, .01 is denoted by which of the following? μ, σ, p, β0, β1, χ², n, x̄, s, p̂, b0, b1, α
(e) A Chinese physician recalled his mother’s childhood nagging: “Stretch your ears daily, child, to ensure long life.” In fact, some scientists believe that people with small ears die younger, leaving a population of healthier old people with big ears. What would be the explanatory variable, according to this theory?
(a) Considering the appearance of the scatterplot, which of the following is the most reasonable guess for the correlation r? (i) -.95 (ii) -.6 (iii) -.1 (iv) +.1 (v) +.6 (vi) +.
(b) Use the regression equation to predict the receipts of a movie rated 9 by viewers. UNDERLINE the part of the output that estimates about how far off this prediction would be.
(c) According to the regression output, is there statistical evidence of a relationship between ratings and receipts? Answer yes or no and CIRCLE THE SPECIFIC PART OF THE OUTPUT THAT YOU USE TO DECIDE.
(d) Since estimated receipts for a movie rated 8 are −113 + 36.5(8) = 179, and s = 75.93, a 95% confidence interval for mean receipts of movies rated 8 is roughly 179 ± 2(75.93)/√15 = (140, 218), and a 95% prediction interval for receipts of a particular movie rated 8 is roughly 179 ± 2(75.93) = (27, 331). Are you willing to believe that all movies rated 8 average 300 million in receipts?
(e) If we set up a prediction interval for receipts of a particular movie rated 4 instead of 8, what would be the most noticeable difference: (i) the interval would be centered at lower values (ii) the interval would be wider (iii) the interval would be narrower
Regression Analysis: receipts versus rating
The regression equation is
receipts = - 113 + 36.5 rating

Predictor   Coef      SE Coef   T       P
Constant    -112.60   84.37     -1.33   0.
rating      36.45     12.00     3.04    0.

S = 75.93 R-Sq = 41.5% R-Sq(adj) = 37.0%
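
The arithmetic in part (d) can be checked with a short script. This is only a sketch of the rough "plus or minus 2 standard errors" intervals quoted in the question, not an exact interval calculation: it ignores how far the rating is from the mean rating. The sample size n = 15 is an assumption read off the √15 in part (d), and the function names are illustrative.

```python
import math

# Values copied from the regression output above.
intercept = -112.60
slope = 36.45
s = 75.93

# Assumption: n = 15, inferred from the sqrt(15) used in part (d).
n = 15


def predict_receipts(rating):
    """Predicted receipts from the fitted line."""
    return intercept + slope * rating


def rough_intervals(rating):
    """Rough 95% intervals using the multiplier 2, as in part (d).

    Returns (confidence interval for the mean, prediction interval
    for an individual), each as a (low, high) tuple.
    """
    center = predict_receipts(rating)
    ci_margin = 2 * s / math.sqrt(n)  # margin for the mean response
    pi_margin = 2 * s                 # margin for one individual
    ci = (center - ci_margin, center + ci_margin)
    pi = (center - pi_margin, center + pi_margin)
    return ci, pi


ci, pi = rough_intervals(8)
print("estimate:", round(predict_receipts(8), 1))
print("rough CI:", tuple(round(v) for v in ci))
print("rough PI:", tuple(round(v) for v in pi))
```

Calling `rough_intervals(4)` instead shows how the center of the interval moves with the rating, while these rough margins stay the same.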
Observed           Laid Off   Not Laid Off   Total
African American   130        1380           1510
White              90         2810           2900
Total              220        4190           4410
(a) The data were probably obtained from (i) an observational study (ii) an experiment (iii) it could easily have been either (i) or (ii).
(b) State the appropriate null and alternative hypotheses.
(c) In the “Observed” table, who has a higher proportion of layoffs? (i) African Americans (ii) whites (iii) both the same.
(d) If there were no relationship between race and layoffs, what counts would be expected in each cell? Display them in the table provided.

Expected           Laid Off   Not Laid Off   Total
African American   ____       ____           1510
White              ____       ____           2900
Total              220        4190           4410
(e) Calculate the value X^2 of the chi-square statistic.
(f) How many degrees of freedom are there?
(g) The p-value is (i) very small (ii) somewhat small (iii) borderline (iv) not small at all
(h) Draw your conclusions (circle two of the following):
    i. There is evidence of a relationship between race and layoffs.
    ii. There is no compelling evidence of a relationship between race and layoffs.
    iii. These workers could represent random samples taken from populations where equal proportions were laid off for African Americans and whites.
    iv. We do not believe these workers represent random samples taken from populations where equal proportions were laid off for African Americans and whites.
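
Parts (d) through (f) follow a fixed recipe: each expected count is (row total × column total) / grand total, the statistic sums (observed − expected)² / expected over the cells, and the degrees of freedom are (rows − 1)(columns − 1). A minimal sketch of that recipe, using the counts from the “Observed” table; the function and variable names are illustrative.

```python
# Counts copied from the "Observed" table.
observed = {
    "African American": {"Laid Off": 130, "Not Laid Off": 1380},
    "White": {"Laid Off": 90, "Not Laid Off": 2810},
}


def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / grand_total
              for col in col_totals}
        for row in table
    }


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (table[row][col] - expected[row][col]) ** 2 / expected[row][col]
        for row in table
        for col in table[row]
    )


expected = expected_counts(observed)
statistic = chi_square(observed)
n_rows = len(observed)
n_cols = len(next(iter(observed.values())))
degrees_of_freedom = (n_rows - 1) * (n_cols - 1)

for row, cells in expected.items():
    print(row, {col: round(value, 1) for col, value in cells.items()})
print("chi-square:", round(statistic, 2), "df:", degrees_of_freedom)
```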
(a) Construct a 95% confidence interval for population mean dividing time μ. (b) Suppose a sample of only 11 cells had mean 30.5 and standard deviation 4.9. Give three reasons why a 99% confidence interval for μ would be wider than your interval in (a).
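
For part (b), the interval has the form x̄ ± t*·s/√n, so its width depends on the multiplier t*, the standard deviation s, and the sample size n. The sketch below works the 99% interval for the sample of 11 cells. The multiplier 3.169 is the standard t-table value for 99% confidence with 10 degrees of freedom; it is hard-coded here as an assumption rather than computed, and the function name is illustrative.

```python
import math


def t_interval(mean, sd, n, t_multiplier):
    """Confidence interval mean +/- t* * sd / sqrt(n)."""
    margin = t_multiplier * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Sample described in part (b).
n = 11
mean = 30.5
sd = 4.9

# Assumption: t* = 3.169 from a t table (99% confidence, df = n - 1 = 10).
t_star_99 = 3.169

low, high = t_interval(mean, sd, n, t_star_99)
print("99% CI:", (round(low, 2), round(high, 2)))
```

Swapping in a smaller multiplier, a smaller standard deviation, or a larger n shows how each one narrows the interval.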
[Figure: t distribution for a sample, df = 3, with -3.18 marked on the axis]

(a) If researchers want to test if mean intensities differ between the two regions, state the appropriate null and alternative hypotheses.
(b) Use the two-sample t statistic in the output to characterize the p-value: (i) extremely small (ii) somewhat small (iii) borderline (iv) not small at all
(c) Is there statistically significant evidence of a difference?
(d) Would you expect a confidence interval for difference in population means to contain zero?
(e) Are your conclusions still valid if one of the data sets is very skewed?
(f) Circle the part of the output that would help you decide if the Rule of Thumb for use of a pooled procedure is satisfied.
Two-sample T for SF vs LA
     N    Mean   StDev   SE Mean
SF   14   2.59   1.02    0.
LA   16   2.13   1.09    0.
Difference = mu SF - mu LA
Estimate for difference: 0.
95% CI for difference: (-0.339, 1.248)
T-Test of difference = 0 (vs not =): T-Value = 1.
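
The two-sample t statistic can be recomputed from the N, Mean, and StDev columns. The sketch below assumes the unpooled (separate-variances) standard error, which is a guess about how this output was produced. It also prints the ratio of the larger to the smaller sample standard deviation, which one common rule of thumb compares with 2 before pooling; whether that is the course's exact rule is likewise an assumption.

```python
import math

# Summary statistics copied from the output above.
n_sf, mean_sf, sd_sf = 14, 2.59, 1.02
n_la, mean_la, sd_la = 16, 2.13, 1.09


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic and its standard error."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error, standard_error


t_value, standard_error = two_sample_t(mean_sf, sd_sf, n_sf,
                                       mean_la, sd_la, n_la)

# Ratio of larger to smaller sample standard deviation
# (assumed rule of thumb: pooling is reasonable if this is below 2).
sd_ratio = max(sd_sf, sd_la) / min(sd_sf, sd_la)

print("t:", round(t_value, 2))
print("SE of difference:", round(standard_error, 3))
print("SD ratio:", round(sd_ratio, 2))
```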
(a) if sample means differed more: (i) more convincing (ii) less convincing
(b) if the boxes were taller, with longer whiskers: (i) more convincing (ii) less convincing
(c) if sample sizes were larger: (i) more convincing (ii) less convincing
small:  n1 = 6, x̄1 = 22.5, s1 = 2.168
medium: n2 = 6, x̄2 = 20.0, s2 = 1.414
large:  n3 = 6, x̄3 = 13.0, s3 = 2.098
(a) Is this study an experiment? Circle one of the following:
    i. Yes, because more than one dosage level is tested.
    ii. Yes, because there is a treatment involved (injecting with various amounts of tranquilizer) for the units.
    iii. No, this is an observational study on a random sample of guinea pigs.
    iv. No, because there is no control group of guinea pigs which receive no injection.
(b) Under which dosage level did sample guinea pigs fall asleep fastest?
(c) Is it safe to assume that population standard deviations are equal (as required in an analysis of variance)? Answer yes or no, and circle the values above that you use to decide.
(d) State the appropriate null and alternative hypotheses.
(e) Complete the blanks in the ANOVA table below.
(f) Circle two of the following which are correct conclusions:
    i. Mean time to fall asleep may be the same for all guinea pigs given the three dosage levels.
    ii. Not all the mean times to fall asleep are the same for populations of guinea pigs on the three dosage levels.
    iii. There is a relationship between dosage level and time it takes to fall asleep.
    iv. There is no relationship between dosage level and time it takes to fall asleep.
(g) Which of these would tend to lead to less convincing evidence? (You may circle anywhere from none to all three.)
    i. if sample sizes were larger
    ii. if sample standard deviations were larger
    iii. if sample means were more different

Source   DF     Sum of Squares   Mean Square   F         P-value
Group    ____   291.0            ____          ____      0.
Error    ____   55.5             ____          XXXXXXX   XXXXXXX
Total    XX     XXXXX            XXXXXXX       XXXXXXX   XXXXXXX

(h) Which of these is the best way to assign guinea pigs to treatments?
    i. with eyes closed select 6 guinea pigs to receive small doses, then 6 guinea pigs to receive medium doses; the rest receive large doses
    ii. use random numbers from a table or generated by software
    iii. give small doses to the 6 least active guinea pigs, and large doses to the 6 most active
    iv. give small doses to the 6 most active guinea pigs, and large doses to the 6 least active
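
The ANOVA table entries can be rebuilt from the three groups' summary statistics alone: the group sum of squares is Σ nᵢ(x̄ᵢ − x̄)², the error sum of squares is Σ (nᵢ − 1)sᵢ², each mean square is a sum of squares divided by its degrees of freedom, and F is their ratio. A minimal sketch under those standard one-way ANOVA formulas; the function name and dictionary layout are illustrative, and the result can be compared with the two Sum of Squares values printed in the table.

```python
# (n, sample mean, sample standard deviation) for each dosage level,
# copied from the summary statistics above.
groups = {
    "small": (6, 22.5, 2.168),
    "medium": (6, 20.0, 1.414),
    "large": (6, 13.0, 2.098),
}


def anova_from_summary(groups):
    """One-way ANOVA quantities computed from group summaries."""
    total_n = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * mean for n, mean, _ in groups.values()) / total_n

    ss_group = sum(n * (mean - grand_mean) ** 2
                   for n, mean, _ in groups.values())
    ss_error = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

    df_group = len(groups) - 1
    df_error = total_n - len(groups)

    ms_group = ss_group / df_group
    ms_error = ss_error / df_error

    return {
        "df_group": df_group,
        "df_error": df_error,
        "ss_group": ss_group,
        "ss_error": ss_error,
        "ms_group": ms_group,
        "ms_error": ms_error,
        "f": ms_group / ms_error,
    }


table = anova_from_summary(groups)
for name, value in table.items():
    print(name, round(value, 2))
```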
iii. 95% is the probability that the confidence interval produced from this sample contains the proportion of all smokers whose desire to quit would be increased by the warnings.
iv. 95% is the probability that the proportion of all smokers whose desire to quit would be increased by the warnings falls in the interval.
(d) Based on your confidence interval in (c), what would be the outcome of a test of Ho : p = .5 vs Ha : p < .5 where p refers to the population proportion whose desire to quit would be increased by the warnings?