Various statistical analysis problems involving confidence intervals, hypothesis testing, and z-scores. Topics covered include interpreting confidence intervals, identifying valid statements, and calculating z-scores for normal distributions.
Multiple choice and fill in the blank. 4 points/part; each choice and blank counts as a separate part. For multiple choice, circle the letter of the one best answer. For fill in the blank, no partial credit; work need not be shown.
(1) A phone-in poll conducted by the athletic department reported that 83% of those who called in thought Ernie Kent was their number one choice for basketball coach. The number 83% is what? (a) A statistic (b) A sample. (c) A parameter. (d) A population. (e) Both (1b) and (1d). (f) Both (1c) and (1a). (g) None of the above.
(2) You can find the Excite Poll online at poll.excite.com. You simply click on a response to become part of the sample. The poll question for June 19, 2005, was “Do you prefer watching first-run movies at a movie theater, or waiting until they are available on home video or pay-per-view?” In all, 8896 people responded, with only 13% (1118 people) saying they preferred theaters. You can conclude that: (a) American adults strongly prefer watching movies at home. (b) American adults who visit excite.com strongly prefer watching movies at home. (c) Everyone who visits excite.com strongly prefers watching movies at home. (d) Most people who visit excite.com prefer watching movies at home. (e) American adults who use the internet strongly prefer watching movies at home. (f) The sample is too small to draw any conclusions. (g) None of the above.
You can’t conclude anything from the sample, not even about the preferences of people who visit excite.com, because the sample is a voluntary response sample.
(3) Consider the following scatterplot:
Date: 5 May 2009.
[Scatterplot not reproduced. Horizontal axis ticks run from 5 to 20; vertical axis ticks run from 2 to 14.]
The association of the variables plotted is: (a) Clearly positive with no outliers. (b) Clearly positive with one or more outliers. (c) Neither clearly positive nor clearly negative. (d) Clearly negative with no outliers. (e) Clearly negative with one or more outliers. (f) Not defined. (g) None of the above.
(4) Choose a young person (age 19 to 25) at random and ask “In the past seven days, how many days did you watch television?” Call the response X for short. Here is a probability model for the response:
Days X:      0     1     2     3     4     5     6     7
Probability: 0.04  0.03  0.06  0.08  0.09  0.08  0.05  0.57
Then P(2 < X < 6) = 0.25
Solution: P(2 < X < 6) = P(X = 3) + P(X = 4) + P(X = 5) = 0.08 + 0.09 + 0.08 = 0.25.
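The sum in problem (4) can be checked with a short sketch. The dictionary name `model` is just an illustrative choice; the probabilities are the ones given in the problem.

```python
# Probability model from problem (4): days of TV watching in the past week.
model = {0: 0.04, 1: 0.03, 2: 0.06, 3: 0.08, 4: 0.09, 5: 0.08, 6: 0.05, 7: 0.57}

# A valid probability model must sum to 1 (up to floating-point rounding).
total = sum(model.values())

# P(2 < X < 6) uses strict inequalities, so only X = 3, 4, 5 contribute.
p = sum(prob for days, prob in model.items() if 2 < days < 6)

print(f"total = {total:.2f}, P(2 < X < 6) = {p:.2f}")
```

Note that the endpoints 2 and 6 are excluded; including them would give a different answer.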
(5) Research doctors test drugs by prescribing different amounts and observing the effects on their patients. One question we could ask here is: “Does the amount of drug prescribed determine the length of the recovery time?” The explanatory variable is: (a) The drug being tested. (b) The amount of drug prescribed. (c) The length of the recovery time. (d) The effects on the patients. (e) The disease the patients had. (f) None of the above.
(6) A maker of fabric for clothing is setting up a new line to “finish” the raw fabric. The line will use either metal rollers or natural-bristle rollers to raise the surface of the fabrics; a dyeing cycle time of either 30 minutes or 40 minutes; and a temperature of either 150°C or 175°C. An experiment will compare all combinations of these choices. Seven specimens of fabric will be subjected to each treatment and scored for quality. How many factors are there in this experiment? (a) 1.
(b) 2. (c) 3. (d) 4. (e) 6. (f) 7. (g) 8. (h) None of the above.
The factors are type of roller, dyeing cycle time, and temperature.
(7) Let μ be the mean score of students at Central Community College (CCC) on the Survey of Study Habits and Attitudes (SSHA), a questionnaire on which scores range from 0 to 200. The test is administered to a simple random sample of CCC students, and the following 95% confidence interval for the mean μ is computed: 87 ≤ μ ≤ 106. Which of the following statements gives a valid interpretation of this interval? (a) 95% of the sample of CCC students would score between 87 and 106 on the SSHA. (b) 95% of all CCC students would score between 87 and 106 on the SSHA.
(c) If the procedure were repeated many times, approximately 95% of the resulting confidence intervals would contain the mean SSHA score of all CCC students. (d) If the procedure were repeated many times, approximately 95% of the resulting confidence intervals would contain the mean SSHA score of the sample of CCC students. (e) The mean SSHA score of all CCC students is 95. (f) The sample contained at least 95% of all CCC students. (g) Nothing, because of a poor choice of sampling method.
Notes: Choices (7a) and (7b) are not related to the meaning of a confidence interval. If a large sample is used, the confidence interval will contain much less than 95% of the data. Choice (7d) is incorrect because every confidence interval, not just 95% of them, contains the mean of the sample it was derived from.
(8) From the University of Oregon, 26 students are randomly selected. They are then asked the total cost of their textbooks for the term. What is the sample? (a) The total cost of their textbooks for the term. (b) All University of Oregon students. (c) The textbooks bought by the 26 randomly selected students. (d) The 26 randomly selected students. (e) Impossible to tell from the information given. (f) None of the above.
(9) An experiment to test whether gingko extract improves memory and concentration used as subjects 230 healthy people over 60 years old. These subjects were randomly assigned to two groups. Those in one group received pills containing gingko extract, while those in the other received placebo pills which looked and tasted the same. All subjects took a battery of tests for learning and memory before the treatment started, and again after six weeks. Neither the subjects nor the experimenters knew which subject got which treatment until after the experiment was complete. The experiment found no evidence that gingko
extract improves memory and concentration. What was the purpose of administering gingko extract to only half the subjects? (a) To prevent fraud. (b) To ensure that when the outcomes for the subjects are compared, the experimenters don’t know whether any particular subject received gingko extract or not.
(c) To allow for comparison of the outcomes for subjects given gingko extract with the outcomes for subjects not given gingko extract but otherwise treated alike. (d) To test whether gingko extract can be used as a placebo in future experiments. (e) There is no good reason. (f) To save money by using only half as much gingko extract. (g) None of the above.
(10) A 90% z confidence interval for the mean reading achievement score for third grade students in the city of Megalopolis is computed to be (46.3, 57.9). The sample mean x̄ was (a) 5.8 (b) 46.3 (c) 52.1 (d) 57.9 (e) 12.7 (f) The answer cannot be determined from the information given. (g) None of the above
The sample mean is always at the midpoint (center) of a z confidence interval.
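As a quick check of the midpoint rule for problem (10), here is a minimal sketch. The variable names are illustrative only.

```python
# Endpoints of the 90% z confidence interval from problem (10).
lower, upper = 46.3, 57.9

# A z interval has the form (mean - margin, mean + margin),
# so the sample mean is the midpoint and the margin is half the width.
sample_mean = (lower + upper) / 2
margin = (upper - lower) / 2

print(f"sample mean = {sample_mean:.1f}, margin of error = {margin:.1f}")
```

The half-width, 5.8, is the margin of error, which is why it appears among the distractors.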
(11) A biologist is studying Rocky Mountain spotted frogs. He says, “The evidence indicates that the correlation between the size of the frog and the size of the stream it lives in is essentially zero.” Which of the following is the correct interpretation of this statement? (a) Large frogs tend to live in large streams and small frogs tend to live in small streams. (b) Large frogs tend to live in small streams and small frogs tend to live in large streams. (c) Most or all of the frogs are small and live in small streams.
(d) Large frogs are just as likely to live in small streams as in large streams, and the same is true of small frogs. (e) The biologist does not understand statistics.
(12) A researcher wants to study the relationship between color and a certain genetic mutation in Rocky Mountain spotted frogs. He chooses a stream which looks like good frog habitat and collects for study all the frogs he can net on this stream. The frogs he collected form: (a) A population. (b) A simple random sample. (c) A stratified random sample. (d) A convenience sample. (e) A voluntary response sample. (f) A systematic random sample.
(13) If you buy a ticket from the Outer Slobbovia National Lottery, the probability of winning $50 is 1/200. Which of the following is true? (a) If you and 199 friends each buy a ticket, one of you will win. (b) If this is your 200th time playing the lottery, you have a better chance of winning. (c) If you buy 200 tickets, then you will win $50. (d) If you buy 2,000,000 tickets, 10,000 of them will be winning tickets. (e) None of the above.
(14) Two variables in a study are said to be confounded if what?
(a) One cannot distinguish their effects on a response variable. (b) They are highly correlated. (c) They do not have a normal distribution. (d) One of them is a placebo. (e) Both are lurking variables. (f) The statistician conducting the study is incompetent.
(15) Assume that your blood alcohol concentration twenty minutes after consuming two drinks is normally distributed with mean 0.076 and standard deviation 0.016. In Oregon, you are legally intoxicated if your blood alcohol level is at or above 0.08. If one blood sample is taken twenty minutes after you consume two drinks, what is the probability that you will be legally intoxicated? (a) 0.3707 (b) 0.3841 (c) 0.7500 (d) 0.2500 (e) 0.6293 (f) None of the above.
Solution using Table A: The z-score of the legal intoxication level is
z = (0.08 − 0.076)/0.016 = 0.25.
Table A gives P(z < 0.25) ≈ 0.5987. We want P(z ≥ 0.25) ≈ 1 − 0.5987 = 0.4013.
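The tail probability in problem (15) can also be computed without Table A. This is a sketch using `statistics.NormalDist` from the Python standard library, and the variable names are illustrative.

```python
from statistics import NormalDist

# Blood alcohol concentration model from problem (15).
bac = NormalDist(mu=0.076, sigma=0.016)

# z-score of the legal limit 0.08.
z = (0.08 - bac.mean) / bac.stdev

# P(BAC >= 0.08) is the upper tail, i.e. 1 minus the cumulative probability.
p_intoxicated = 1 - bac.cdf(0.08)

print(f"z = {z:.2f}, P(BAC >= 0.08) = {p_intoxicated:.4f}")
```

The result agrees with the Table A calculation to four decimal places because the z-score here is exactly 0.25 and needs no rounding.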
(16) Let μ be the mean dollar amount spent by customers at a particular store during 2002. Using the customers who visit the store during its annual inventory reduction sale, the following 95% confidence interval for the mean μ is computed: $83 ≤ μ ≤ $98. Which of the following statements gives a valid interpretation of this interval? (a) If the procedure were repeated many times, approximately 95% of the resulting confidence intervals would contain the mean dollar amount spent by all customers in 2002.
(b) If the procedure were repeated many times, approximately 95% of the resulting confidence intervals would contain the mean dollar amount spent by the sample of customers. (c) 95% of the sample of customers spent between $83 and $98. (d) 95% of all customers in 2002 spent between $83 and $98.
(e) The sample contained at least 95% of all customers in 2002. (f) The mean dollar amount spent by all customers in 2002 is $95. (g) Nothing, because of a poor choice of sampling method.
The sample is a convenience sample.
(17) The following least-squares regression line is used to predict the score y on a final exam from the score x on the midterm exam:
ŷ = 12 + 0.8x
Mary got a score of 90 on the midterm and a score of 80 on the final exam. Her predicted final exam score and its residual are: (a) 90 and 10 (b) 84 and 4 (c) 84 and −4 (d) 90 and −10 (e) 80 and −4 (f) 80 and 0 (g) None of the above
Solution: The predicted final exam score is 12 + (0.8)(90) = 84. The residual is obtained by subtracting the predicted score from the actual score, giving 80 − 84 = −4.
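The prediction and residual in problem (17) can be reproduced with a few lines. The function name `predict_final` is a hypothetical helper, not part of the exam.

```python
def predict_final(midterm: float) -> float:
    """Least-squares line from problem (17): predicted final = 12 + 0.8 * midterm."""
    return 12 + 0.8 * midterm

midterm, actual_final = 90, 80

predicted = predict_final(midterm)
# Residual = observed value minus predicted value (not the other way round).
residual = actual_final - predicted

print(f"predicted = {predicted:.0f}, residual = {residual:.0f}")
```

A negative residual means the observed point lies below the regression line.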
(18) At a party there are 30 students over age 21 and 20 students under 21. You choose at random 3 of those over 21 and separately choose at random 2 of those under 21 to interview about attitudes towards alcohol. We can say the following about your sample: (a) The sample is a simple random sample because every student at the party has an equal chance of being chosen. (b) The sample is not a simple random sample because some students are more likely to be chosen than others. (c) The sample is not a simple random sample because not every group of five students has an equal chance of being chosen. (d) The sample is not random at all, as breaking the students into two age groups introduces bias to the results. (e) The sample is a simple random sample because every group of five students has an equal chance of being chosen. (f) None of the above.
(19) The number of LTD boardings at Eugene Station at 5:00 pm on a weekday varies with mean 174.43 and standard deviation 47.5. Let x̄ denote the mean number of LTD boardings at Eugene Station at 5:00 pm in a given 5-day work week. Assuming that a 5-day work week can be treated as a simple random sample of all weekdays (not really true), what is the probability that x̄ ≤ 150? (a) 0.8749 (b) 0.0051 (c) 0.3050 (d) 0.6950 (e) 0.9949 (f) None of the above.
Solution using Table A: The standard deviation for the mean of the values on a simple random sample of 5 days is 47.5/√5 ≈ 21.2426. The z-score of the number 150 is
z = (150 − 174.43)/21.2426 ≈ −1.1500.
To use Table A, round to −1.15, and find P(z < −1.15) ≈ 0.1251.
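The calculation for problem (19) can be sketched in code as follows. It treats the sample mean as normally distributed, as the solution does, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 174.43, 47.5, 5  # values given in problem (19)

# Standard deviation of the sample mean of n independent observations.
se = sigma / sqrt(n)

# z-score of 150 on the sampling distribution of the mean.
z = (150 - mu) / se

# Lower-tail probability P(mean <= 150) under the standard normal curve.
p = NormalDist().cdf(z)

print(f"se = {se:.4f}, z = {z:.4f}, P = {p:.4f}")
```

Dividing by √n is the key step: using 47.5 itself as the standard deviation would answer a question about a single day rather than a five-day mean.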
(20) Consider the following scatterplot and regression line.
[Scatterplot with regression line not reproduced. Horizontal axis: x, ticks from 5 to 25; vertical axis: y, ticks from 5 to 25.]
What is the approximate correlation between the variables? Circle one.
Solution: The correlation is clearly negative and not close to zero, but the points are not so close to the regression line that it can be close to −1 either. (The points plotted actually all have integer coordinates, and the correlation is −0.721937, giving r^2 ≈ 0.521193.)
(21) Is it possible to have more than 68% of a population above the mean?
(a) Yes, an extreme outlier could cause this. (b) Yes, where I’m from, everyone is above average. (c) No, the mean is always close to the median. (d) No, 68% of the population is always within one standard deviation of the mean. (e) No, otherwise the mean would be useless as a measure of center.
Example: consider a population of four individuals taking Math 618. Their final exam scores (out of 200) are 70, 130, 137, and 143. The mean is 120, and three out of the four people in the population, that is, 75% > 68%, scored above the mean. Note that there is no reason to assume that the population is normally distributed. Moreover, even when 68% of the population is within one standard deviation of the mean, it is still possible for more than 68% of the population to be above the mean. In the example above, the standard deviation is about 33.754, so all but the lowest score are within one standard deviation of the mean.
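The four-score example can be verified directly. This sketch uses `statistics.stdev`, which divides by n − 1 and reproduces the value 33.754 quoted above.

```python
from statistics import mean, stdev

scores = [70, 130, 137, 143]  # final exam scores from the example

m = mean(scores)
s = stdev(scores)  # sample standard deviation (n - 1 in the denominator)

# Fraction of scores strictly above the mean.
frac_above = sum(x > m for x in scores) / len(scores)

# Fraction of scores within one standard deviation of the mean.
frac_within = sum(abs(x - m) <= s for x in scores) / len(scores)

print(f"mean = {m}, sd = {s:.3f}, above = {frac_above:.0%}, within 1 sd = {frac_within:.0%}")
```

The single low score of 70 pulls the mean down far enough that the other three scores all sit above it.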
(22) A party host gives a door prize to one guest chosen at random. There are 42 men over 18, 19 men under 18, 25 women over 18, and 16 women under 18. What is the probability that the door prize goes to a guest under 18?
Probability: 35/102 ≈ 0.343137
Solution: There are 42 + 19 + 25 + 16 = 102 people at the party, and 19 + 16 = 35 of them are under 18. So the answer is 35/102 ≈ 0.343137.
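The counting argument for problem (22) can be written out as a short sketch. The dictionary layout is an illustrative choice; `Fraction` keeps the answer exact.

```python
from fractions import Fraction

# Guest counts from problem (22), keyed by (sex, age group).
guests = {
    ("men", "over 18"): 42,
    ("men", "under 18"): 19,
    ("women", "over 18"): 25,
    ("women", "under 18"): 16,
}

total = sum(guests.values())
under_18 = sum(count for (_, age), count in guests.items() if age == "under 18")

# Each guest is equally likely to win, so the probability is a simple ratio.
p = Fraction(under_18, total)

print(f"{p} = {float(p):.6f}")
```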
(23) Seventy-five seventeen-year-old girls were asked how many servings of fruit per day they ate. The figure below is a histogram of their answers.
[Histogram not reproduced. Horizontal axis: servings of fruit per day, 0 to 8; vertical axis: number of subjects, ticks from 0 to 18.]
a. Which of the following describes the distribution? (a) The distribution is skewed left and has no outliers. (b) The distribution is skewed right and has no outliers. (c) The distribution is symmetric and has no outliers. (d) The distribution is skewed right and skewed left and has no outliers. (e) The distribution is skewed left and has at least one outlier. (f) The distribution is skewed right and has at least one outlier. (g) The distribution is symmetric and has at least one outlier.
(h) The distribution is skewed right and skewed left and has at least one outlier.
Find the following and fill in the blanks:
b. The median: 2 Solution: There are 75 observations. So the median is the 38th observation. Of these observations, 14 are zero, 14 + 12 = 26 are zero or one, and 14 + 12 + 17 = 43 are zero, one, or two. So the 38th observation is 2.
c. Percentage of the 75 girls who claimed to eat more than 6 servings per day: about 5.333%
Solution: There are 75 observations, and 2 + 2 = 4 are seven or eight. So the proportion is 4/75 ≈ 0.05333, or about 5.333%.
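Parts b and c can be checked with a small sketch. It uses only the bar heights quoted in the solutions (14, 12, and 17 for zero, one, and two servings, and 2 each for seven and eight); the remaining bars are not needed and are not assumed here.

```python
n = 75  # number of girls surveyed

# Bar heights quoted in the solution for 0, 1, and 2 servings.
low_counts = {0: 14, 1: 12, 2: 17}

# With an odd number of observations, the median is observation (n + 1) / 2.
median_position = (n + 1) // 2

# Walk up the cumulative counts until the median position is reached.
running = 0
median = None
for servings in sorted(low_counts):
    running += low_counts[servings]
    if running >= median_position:
        median = servings
        break

# Part c: girls reporting 7 or 8 servings, as a percentage of all 75.
more_than_six = 2 + 2
percent = 100 * more_than_six / n

print(f"median position = {median_position}, median = {median}, percent = {percent:.3f}%")
```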
(24) The five-number summary of the final exam scores of the 28 students in Professor Greenbottle’s section of Math 251 is
and the mean is 64 and the standard deviation (correct to three decimal places) is 19.318. Since the final exam is supposed to be worth twice as much as a midterm, Professor Greenbottle multiplies every exam score by 2.
(a) What is the new first quartile? New first quartile: 110
Solution: The new first quartile is 110, twice the old one. Since the scores were doubled, if one fourth of the old ones were 55 or less, then one fourth of the new ones will be 110 or less. (b) What is the new standard deviation? New standard deviation: 38.636
Solution: The new standard deviation is 2(19.318) = 38.636, twice the old standard deviation. The standard deviation is a measure of the spread of the distribution. Since the scores are doubled, the spread is twice as large.
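The effect of doubling every score can be illustrated numerically. The scores below are hypothetical, chosen only for illustration, since the actual 28 exam scores are not given; the point is that quartiles and the standard deviation both scale by the same factor.

```python
from statistics import quantiles, stdev

# Hypothetical scores (NOT the actual class data, which is not given).
scores = [31, 42, 55, 58, 64, 70, 77, 85, 94]
doubled = [2 * x for x in scores]

# First quartile before and after doubling.
q1_old = quantiles(scores, n=4)[0]
q1_new = quantiles(doubled, n=4)[0]

# Standard deviation before and after doubling.
sd_old = stdev(scores)
sd_new = stdev(doubled)

print(f"Q1: {q1_old} -> {q1_new}")
print(f"sd: {sd_old:.3f} -> {sd_new:.3f}")

# Applying the same rule to the exam's stated standard deviation.
new_exam_sd = 2 * 19.318
print(f"new exam sd = {new_exam_sd:.3f}")
```

Adding a constant to every score would behave differently: it would shift the quartiles but leave the standard deviation unchanged.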
(25) If scores on a certain test are normally distributed with mean 352.7 and standard deviation 112.6, for what number L are 90% of the scores below L?
Solution: This solution uses Table A. The number in Table A closest to 0.90 is 0.8997. This is the approximate probability P(z < 1.28). So L should be the number whose z-score is 1.28. From z = (L − μ)/σ we solve for L to get L = zσ + μ ≈ (1.28)(112.6) + 352.7 = 496.828. Using a calculator to find the number z0 such that P(z < z0) = 0.90 will give a slightly more accurate answer.
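The "slightly more accurate answer" mentioned for problem (25) can be obtained with the inverse normal CDF. This sketch uses `NormalDist.inv_cdf` from the Python standard library and compares it with the Table A value.

```python
from statistics import NormalDist

mu, sigma = 352.7, 112.6  # parameters from problem (25)

# Table A approach: nearest tabulated z for cumulative probability 0.90 is 1.28.
L_table = 1.28 * sigma + mu

# Direct approach: invert the normal CDF at 0.90.
L_exact = NormalDist(mu=mu, sigma=sigma).inv_cdf(0.90)

print(f"Table A: L = {L_table:.3f}")
print(f"inv_cdf: L = {L_exact:.3f}")
```

The two answers differ by less than 0.2 points, because 1.28 is only a two-decimal approximation to the exact 90th-percentile z-score.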
Long answer. Points as indicated; work MUST be shown.
(26) (6 points.) Explain clearly what it means for a point in a scatterplot to be influential for the correlation.
Solution: An influential point for the correlation is a point which, if omitted, would markedly change the correlation. (No credit will be given for merely saying that the omission would change the correlation. That is usually true for every point.)
(27) (20 points. Not all the steps have the same point value.) Bottles of a popular cola are supposed to contain 600 milliliters (ml) of cola. There is some variation from bottle to bottle because the filling machinery is not perfectly precise. An inspector who suspects the bottler is underfilling measures the contents of 6 bottles. The results in ml are:
Assume that the volumes of the contents are normally distributed, that the sample can be treated as a simple random sample of all bottles of this cola, that the standard deviation is known to be 1.5 ml, and that the appropriate statistical procedure is safe to use. Is this convincing evidence that the mean content of cola bottles is less than the advertised 600 ml, at significance level α = 0.10? Answer using the following steps. (1) State the hypotheses you will test. (2) Calculate the test statistic. (3) Give a P-value (or give two values between which P lies). (4) Draw the appropriate conclusion, expressing it in words appropriate for the context of the problem.
Solution: (1) Let μ be the mean contents, in ml, of all bottles filled by this bottler. Then the hypotheses are: H0: μ = 600. Ha: μ < 600. (2) We calculate the z test statistic:
z = (x̄ − 600)/(σ/√n).
For the data given, we find x̄ ≈ 599.0167, so
z ≈ (599.0167 − 600)/(1.5/√6) ≈ −1.606.
(3) Using Table A: We have to round to −1.61. The corresponding entry in Table A is
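The test statistic and a one-sided P-value for problem (27) can be sketched as follows. The code starts from the sample mean 599.0167 quoted in the solution rather than the six raw measurements, and it computes the P-value from the unrounded z, so it may differ slightly from a Table A lookup at −1.61.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 599.0167  # sample mean quoted in the solution
mu0 = 600         # hypothesized mean under H0
sigma = 1.5       # known population standard deviation (ml)
n = 6             # number of bottles measured
alpha = 0.10      # significance level from the problem

# z test statistic for a mean with known sigma.
z = (x_bar - mu0) / (sigma / sqrt(n))

# Ha is mu < 600, so the P-value is the lower-tail area below z.
p_value = NormalDist().cdf(z)

print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print("reject H0 at alpha = 0.10" if p_value < alpha else "do not reject H0 at alpha = 0.10")
```

Because the alternative is one-sided, only the lower tail is used; a two-sided alternative would double this P-value.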
(28) (10 points.) The masses of adult dragons are known to vary normally with a standard deviation of 0.24 tons. At considerable personal risk, you have managed to determine the masses of six adult dragons. They have mean 1.80 tons. Assuming that these six dragons may be treated as a simple random sample, and that the appropriate statistical procedure is safe to use, find a 99% confidence interval for the mean mass of adult dragons. (Remember that showing your work includes showing the appropriate critical value!)
Solution: From Table C, the relevant critical value is z* = 2.576. (Use the column with 99% at the bottom, or note that the upper tail probability must be (1/2)(1 − 0.99) = 0.005.) The confidence interval is x̄ ± z*σ/√n, which is 1.80 ± (2.576)(0.24)/√6 tons. This works out to 1.80 ± 0.2524 tons. In interval form, this is (1.5476, 2.0524).
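The confidence interval in problem (28) can be reproduced with a short sketch. Here the critical value is computed with `NormalDist.inv_cdf` instead of being read from Table C, so it carries a few more digits than 2.576.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 1.80, 0.24, 6  # values from problem (28), in tons
confidence = 0.99

# Upper-tail probability is (1 - confidence) / 2, so z* is the
# quantile of the standard normal at 1 - that tail area.
tail = (1 - confidence) / 2
z_star = NormalDist().inv_cdf(1 - tail)

margin = z_star * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"z* = {z_star:.3f}, margin = {margin:.4f}")
print(f"99% CI: ({lower:.4f}, {upper:.4f})")
```

Raising the confidence level widens the interval, since a larger z* is needed to capture more of the sampling distribution.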