Sample Test Questions
- As we keep tossing a coin (as n increases), which of the following happens? A. The sample proportion, $\hat{p}$, gets smaller. B. The sample proportion, $\hat{p}$, gets closer to $\pi$, the population proportion. C. The standard deviation of the sample proportion, $\sigma_{\hat{p}}$, gets smaller. D. All of the above will happen. E. Only two of the above will happen.
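As a rough illustration of the coin-tossing question above, the sketch below evaluates the usual formula for the standard deviation of a sample proportion, $\sqrt{\pi(1 - \pi)/n}$, at a few sample sizes. The function name `sd_sample_proportion` and the chosen values of n are illustrative assumptions, not part of the original question.

```python
import math

def sd_sample_proportion(pi: float, n: int) -> float:
    """Standard deviation of the sample proportion for n independent trials."""
    return math.sqrt(pi * (1 - pi) / n)

# Fair coin (pi = 0.5): print the spread of p-hat for growing n.
# The sample sizes below are arbitrary illustrative choices.
for n in (10, 100, 1_000, 10_000):
    print(n, round(sd_sample_proportion(0.5, n), 4))
```

Each tenfold increase in n divides the standard deviation by $\sqrt{10}$, so the spread shrinks but never reaches zero.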
- Why use (or report) an average of several observations instead of just one? A. You could have made a mistake with one, but it's less likely you'd make the same mistake with several. B. The average is less biased than any individual observation. C. An average can't be an outlier. D. Averages are less variable than the individual observations. E. An individual observation can't represent the mean of a whole population.
Confidence level | Lower limit | Upper limit
90% | 10. | 11.
95% | 10. | 11.
99% | 9. | 12.
- We haven't done this exactly in class, but using the chart at the bottom of the review sheet, what is the correct range of the p-value for testing $H_0: \mu = 10$ vs. $H_A: \mu \neq 10$? A. p-value > 0.10 B. 0.10 > p-value > 0.05 C. 0.05 > p-value > 0.01 D. p-value < 0.01
- Which of the following is FALSE? A. If I reject at the 5% level, I will always reject at the 10% level. B. A test of hypotheses can never prove the null to be true. C. Assuming the data is normal and we are given the population standard deviation, we use a t-test if the sample size is small. D. A random sample is always necessary. E. All of the above statements are true; none are false.
- What is the 79th percentile for the standard normal, $Z \sim N(0, 1^2)$? A. 0.79 B. 0.7852 C. 0.2148 D. 0.81 E. -0.
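A percentile of a normal distribution is the inverse of its cumulative distribution function. The sketch below is one way to check a table lookup, using `statistics.NormalDist` from the Python standard library; the helper name `percentile` is an illustrative assumption.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution

def percentile(p: float) -> float:
    """Value z with P(Z <= z) = p."""
    return Z.inv_cdf(p)

z79 = percentile(0.79)
print(round(z79, 2))
```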
- Let $X \sim N(25, 4^2)$. What is $P(20 < X < 26)$? A. 0.4931 B. 0.7043 C. 0.4013 D. 0.8413 E. 0.
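The probability that a normal variable falls in an interval is the difference of two cumulative probabilities. A minimal sketch for the question above, assuming the distribution is N(25, 4^2) as stated; `prob_between` is a hypothetical helper name.

```python
from statistics import NormalDist

def prob_between(dist: NormalDist, lo: float, hi: float) -> float:
    """P(lo < X < hi) for a normal random variable X."""
    return dist.cdf(hi) - dist.cdf(lo)

X = NormalDist(mu=25, sigma=4)
p_interval = prob_between(X, 20, 26)
print(round(p_interval, 4))
```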
- Suppose the standard deviation of some population, $\sigma$, is 36. How large a sample would you need for the standard deviation of the mean, $\sigma_{\bar{X}}$, to be half as large? A. 2 B. 4 C. 6 D. 9 E. 18
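The standard deviation of a sample mean is $\sigma/\sqrt{n}$, so halving it relative to $\sigma$ means solving $\sigma/\sqrt{n} = \sigma/2$. The sketch below simply evaluates that formula; the function names are illustrative assumptions.

```python
import math

def sd_of_mean(sigma: float, n: int) -> float:
    """Standard deviation of the mean of n independent observations."""
    return sigma / math.sqrt(n)

def n_for_reduction(factor: int) -> int:
    """Sample size that divides the SD by `factor`: sigma/sqrt(n) = sigma/factor."""
    return factor ** 2

n_half = n_for_reduction(2)
print(n_half, sd_of_mean(36, n_half))
```

Because n enters through a square root, cutting the spread by a factor k costs k^2 observations.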
8. Let $\bar{X}_9 \sim N(5, 2^2)$. What is the range of the middle 90% of these $\bar{X}_9$'s? In other words, what are $x_a$ and $x_b$ such that $P(x_a < \bar{X}_9 < x_b) = 0.90$ (centered at the mean, $\mu = 5$)?
A. (1.71, 8.29) B. (-1.645, 1.645) C. (-8.29, 8.29) D. (-1.28, 1.28) E. (2.44, 7.56)
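A central interval puts half of the leftover probability in each tail, so its endpoints are two quantiles of the distribution. The sketch below computes the middle 90% for the N(5, 2^2) distribution in the question above, and the middle 95% for comparison; `central_interval` is a hypothetical helper name.

```python
from statistics import NormalDist

def central_interval(mean: float, sd: float, coverage: float) -> tuple[float, float]:
    """Endpoints (lo, hi) with P(lo < X < hi) = coverage, centered at the mean."""
    tail = (1 - coverage) / 2
    dist = NormalDist(mean, sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

lo90, hi90 = central_interval(5, 2, 0.90)
lo95, hi95 = central_interval(5, 2, 0.95)
print(round(lo90, 2), round(hi90, 2))
print(round(lo95, 2), round(hi95, 2))
```

The mean and standard deviation are the same in both calls; only the coverage changes, which is what moves the endpoints.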
- If I had asked for the middle 95% instead, which of the following would be true? A. The interval would be wider since the standard deviation would be larger. B. The interval would be narrower since the standard deviation would be smaller. C. The interval would be wider since it covers more of the possible observations. D. The interval would be narrower since it's more accurate. E. The interval would be the same since the mean, $\mu$, and standard deviation, $\sigma$, would not change.
- When is a sample size of 30 not enough to say the sampling distribution is approximately normal? A. when the data is categorical and the true proportion of successes, $\pi$, is less than 15% B. when the data is already normal C. when the data is highly skewed D. All of the above are true statements. E. Exactly two of the above are true statements.
- Let $\hat{p}_{42} \sim N(0.7, 0.071^2)$. What is $P(\hat{p}_{42} < 0.5)$? A. 0.5 B. 0 C. 0.9976 D. 0.0024 E. 0.
12. Why do we call the distribution of the sample mean, $\bar{X}_n$, a sampling distribution?
A. because it's the distribution of the sample of random observations B. because we must take a sample just to get one random observation C. because we sample from the distribution to find the sample mean D. because the distribution is only of a sample, not the whole population E. because we can't get the distribution of the whole population of sample means, only samples
- Let $\hat{p}_{24} \sim N(0.4, 0.1^2)$. What is the maximum sample proportion, some $p^*$, you most likely would observe? Define a rare event (one that "most likely won't happen") as something with a probability of 0.001 or less. In other words, what is $p^*$ such that $P(\hat{p}_{24} > p^*) = 0.001$? A. 0.308 B. 0.7 C. 0.708 D. 0.43 E. 0.
- Is it reasonable to think you could get a sample proportion of 25% or less? How likely is this occurrence? Use the same distribution as above. A. 0.5596 B. 0.4404 C. 0.25 D. 0.15 E. 0.
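The two questions above use the same N(0.4, 0.1^2) sampling distribution in opposite directions: one asks for a cutoff given a tail probability, the other for a tail probability given a cutoff. A minimal sketch of both, with variable names chosen here for illustration:

```python
from statistics import NormalDist

p_hat = NormalDist(mu=0.4, sigma=0.1)  # sampling distribution from the question

# Cutoff p* with P(p_hat > p*) = 0.001, i.e. the 99.9th percentile.
p_star = p_hat.inv_cdf(1 - 0.001)

# Probability of a sample proportion of 25% or less.
p_low = p_hat.cdf(0.25)

print(round(p_star, 3), round(p_low, 4))
```

A printed z-table rounds z to two decimals, so a hand calculation may differ from the software value in the third decimal place.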
- While he was a prisoner of the Germans during WWII, John Kerrich tossed a coin 10,000 times. He got 5067 heads. If we say that these tosses represent a simple random sample from the population of all possible tosses of his coin, is there reason to believe that his coin was biased (gave too many heads to be fair)? Well, how likely is it to get at least this proportion of heads from a fair coin? (NOTE: The true proportion of heads for a fair coin is $\pi = 0.5$, and the standard deviation for this many tosses is $\sigma_{\hat{p}} = \sqrt{\pi(1 - \pi)/n} = 0.005$.)
A. 0.5; it'll happen half of the time B. 0.09; not very likely, but plausible C. 0.34; fairly likely, so it's believable D. 0.067; rare, but it could happen E. 0.005; pretty rare, it most likely isn't a fair coin
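The Kerrich calculation can be sketched in three steps: compute the sample proportion, standardize it using the fair-coin standard deviation given in the note, and take the upper-tail normal probability. The variable names below are illustrative assumptions; the counts come from the question.

```python
import math
from statistics import NormalDist

n = 10_000      # number of tosses
heads = 5_067   # observed heads
pi0 = 0.5       # proportion of heads for a fair coin

p_hat = heads / n
sd = math.sqrt(pi0 * (1 - pi0) / n)   # standard deviation of p-hat under a fair coin
z = (p_hat - pi0) / sd                # how many SDs above 0.5 the result lies
tail = 1 - NormalDist().cdf(z)        # P(p-hat >= observed) for a fair coin

print(round(z, 2), round(tail, 4))
```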
- Why is the Central Limit Theorem so important in the study of statistics? A. It allows us to use the normal distribution for any kind of data. B. It tells us that any data can be approximately normal if we take a large enough sample. C. It tells us that any sample mean can be approximately normal. D. It tells us that any sample mean will be unbiased. E. None of the above are true statements of the CLT.
- Let $X \sim N(10, 3^2)$. What is $P(X > 18)$? A. 0.9962 B. 0.9971 C. 2.67 D. 0.0038 E. 0.
- Had we taken a sample of size $n = 25$ from the population above, what would the probability have been for $\bar{X}_{25}$?
A. More than for X, since more $\bar{X}_{25}$'s are closer to the mean, $\mu$. B. Less than for X, since fewer $\bar{X}_{25}$'s would be that far from the mean, $\mu$. C. Less than for X, since fewer $\bar{X}_{25}$'s are above the mean, $\mu$. D. The same as for X, since the mean, $\mu$, is the same for both. E. You can't say without calculating the probability.
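To compare a single observation with a mean of 25 observations, the sketch below evaluates the same upper-tail event for both. It assumes the event of interest is still "greater than 18" and that the population is N(10, 3^2), as in the preceding question; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 10, 3, 25

single = NormalDist(mu, sigma)                      # one observation X
mean_of_25 = NormalDist(mu, sigma / math.sqrt(n))   # sample mean of 25 observations

p_single = 1 - single.cdf(18)
p_mean = 1 - mean_of_25.cdf(18)

print(round(p_single, 4), p_mean)
```

The mean is unchanged, but the standard deviation drops from 3 to 3/5, so 18 sits many more standard deviations above the center for the sample mean than for a single observation.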
- Ok, let's say you just got a job as a lab tech, and you're going to be doing different tests on possible new drugs that your company is creating. Of course, the reason you got the job is because they know you have an excellent knowledge of how statistics works, and they're sure you will do the job right! You need to find statistical evidence that your company's new wonder drug actually works better than Brand X, which is the best-selling product on the market today. Now Brand X claims their 'effectiveness' rating is 8, out of a possible 10. You, however, are skeptical that this is true and decide to test their product along with yours. Let's call yours Brand A, and let $\mu_A$ be your product's true mean effectiveness rating and $\mu_X$ be the true mean effectiveness rating for Brand X. First of all, what hypotheses should you test? A. $H_0: \mu_A = \mu_X$ vs. $H_A: \mu_A \neq \mu_X$ B. $H_0: \mu_A = \mu_X$ vs. $H_A: \mu_A < \mu_X$ C. $H_0: \mu_A = \mu_X$ vs. $H_A: \mu_A > \mu_X$ D. $H_0: \mu_A = 10$ vs. $H_A: \mu_A > 8$ E. $H_0: \mu_A = 8$ vs. $H_A: \mu_A > 8$
- Same scenario: How are you going to go about getting the data to test your hypotheses? A. Take random samples of both drugs and give them to the first 50 people who have a headache. B. Take two random samples of people with headaches and give one group Brand A and the other Brand X. C. Take one random sample of people with headaches and give every other one Brand A and the rest Brand X. D. Take two random samples of people with headaches and give each person one tablet of each Brand. E. Take a couple of aspirin yourself because all of these people are giving you a headache!
- Same scenario still: Let's say you decide to test $H_0: \mu_A = \mu_X$ vs. $H_A: \mu_A < \mu_X$ since you've decided to use time until the headache is gone, i.e., you're testing which drug works faster. Knowing what you do about Type I and Type II errors, what $\alpha$-level should you use in your test? Pick the answer that is most correct! A. Use $\alpha = 0.10$ because you want to reject as much as possible. B. Use $\alpha = 0.01$ because you want to reject as much as possible. C. Use $\alpha = 0.10$ because you don't want to claim there is insufficient evidence when your brand is really faster. D. Use $\alpha = 0.01$ because you don't want to claim there is insufficient evidence when your brand is really faster. E. Use $\alpha = 0.10$ because you don't want to claim your brand is better if it really isn't any faster.
- Ok, your output from your test of hypotheses gives you a p-value = 0.018. What can you conclude? A. At the 5% and 10% levels, you conclude your brand gets rid of headaches faster. B. At the 1% level, you conclude your brand gets rid of headaches faster. C. At the 1% level, you conclude your brand takes longer to get rid of headaches. D. Both A. and C. are correct conclusions. E. None of the above are correct conclusions.
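The decision rule behind the question above is a comparison: reject the null at level $\alpha$ when the p-value is below $\alpha$. A minimal sketch, where the function name `decisions` and the default list of levels are illustrative assumptions:

```python
def decisions(p_value: float, levels=(0.01, 0.05, 0.10)) -> dict[float, bool]:
    """Map each significance level to True if H0 is rejected at that level."""
    return {alpha: p_value < alpha for alpha in levels}

result = decisions(0.018)
for alpha, reject in result.items():
    print(alpha, "reject H0" if reject else "fail to reject H0")
```

Failing to reject at a given level only means the evidence is insufficient at that level; it does not support the opposite alternative.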
- Now the guy in the office next door is jealous of all the attention you've been getting, so he decides to run his own little experiment. He takes 10 samples and calculates 90% confidence intervals for the true mean time it takes for Brand A to stop a headache. From these 10 confidence intervals, he finds 3 of them don't contain the mean of Brand X, 15, their supposed true mean time. He thinks this is substantial proof that Brand A is better. What's really going on? A. He's correct. Brand A is obviously better. B. He's obviously miscalculated since all 10 intervals should contain 15. C. He's obviously miscalculated since all 10 intervals should NOT contain 15. D. He didn't take random samples so his results are skewed. E. He merely had approximately 10% of the intervals not contain the true mean, 15.
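A small simulation can show how often 90% confidence intervals miss the true mean purely by chance. The sketch below assumes the true mean really is 15 and uses a known-sigma z-interval; the values sigma = 4, n = 30, the number of repetitions, and the seed are all hypothetical choices made for illustration, not figures from the scenario.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage_rate(true_mean, sigma, n, level, reps, seed=0):
    """Fraction of simulated z-intervals (known sigma) that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # critical value for the level
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width < true_mean < xbar + half_width:
            hits += 1
    return hits / reps

# Hypothetical settings: sigma, n, and reps are assumptions for this sketch.
rate = coverage_rate(true_mean=15, sigma=4, n=30, level=0.90, reps=5_000)
print(rate)
```

With only 10 intervals, the number that miss will vary quite a bit from sample to sample, so a handful of misses is not by itself evidence that the true mean differs from 15.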
- So we're still worried about this jealous guy. Now he's doing a hypothesis test. You KNOW that there is no evidence that Brand A is better than Brand X. You've tested it a zillion times. Obviously, to you, your company's product is only just as good. But the boss really wants to say it's better, and the guy next door wants to make him happy. Which of the following would lead them, the boss and your neighbor, to the wrong conclusion, but the one they want? Remember, the null is that the brands are the same; the alternative is that Brand A is better. A. a Type I error B. a Type II error C. a test with a very small $\alpha$-level D. switching the null and alternative hypotheses E. It is impossible to claim Brand A is better because it really is only just as good.
- If confidence intervals can tell us the same thing that a hypothesis test can, why would we ever need to run hypothesis tests anyway? A. There's no reason; it's just a different way to analyze data. B. Hypothesis tests are more accurate because you are testing an exact value for ___ or ___. C. Hypothesis tests can test two samples, but confidence intervals are only for one sample. D. Hypothesis tests can have smaller p-values since you can run one-sided tests ( > or < ), but confidence intervals are only equivalent to two-sided tests. E. Exactly two of the above are true.
- Which of the following BEST describes what 95% confidence means in a 95% confidence interval for $\mu$ of (7.8, 9.4)? A. There is a 95% probability that $\mu$ is between 7.8 and 9.4. B. In repeated sampling, $\mu$ will fall between 7.8 and 9.4 about 95% of the time. C. In repeated sampling, about 95% of the observations will fall between 7.8 and 9.4. D. In repeated sampling, about 95% of the observations will fall within the confidence interval. E. In repeated sampling, the confidence intervals will contain $\mu$ about 95% of the time.
$H_0: \pi = 0.5$ vs. $H_A: \pi < 0.5$, $\hat{p} = 0.4$ and p-value = 0.
- Looking at the graph above, what would have happened if we had gotten a sample proportion, $\hat{p} = 0.30$, instead? A. The conclusion would have been exactly the same. B. The value of the test statistic would have increased. C. The value of the p-value would have decreased. D. The value of the p-value would have increased. E. The probability of making a Type I error would have decreased.
$H_0: \mu = 12$ vs. $H_A: \mu > 12$, $\bar{x} = 15$ and p-value = 0.029
- Which of the following is the best definition of the p-value in terms of the test represented above? A. The p-value = 0.029 says that 97.1% of the time we will get sample means of 15 or more when the true mean is only 12. B. The p-value = 0.029 says that 97.1% of the time we will get sample means of 12 or more when the true mean is only 15. C. The p-value = 0.029 says that there is a 2.9% chance that the true mean is only 15. D. The p-value = 0.029 says that there is a 2.9% chance that the true mean is greater than 12. E. The p-value = 0.029 says that 2.9% of the time we will get sample means of 15 or more when the true mean is only 12.
- Which of the following defines the significance level of a hypothesis test, $\alpha$? A. how often we make a Type I error. B. how often we reject $H_0$. C. how often $H_0$ is false. D. how often $H_0$ is true. E. Exactly two of the above (excluding D.)
- $Z \sim N(0, 1^2)$. What is $z^*$ such that $P(-z^* < Z < z^*) = 0.25$? A. 0.675 B. 0.625 C. 0.5987 D. 1.15 E. 0.
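For the last question, a central probability of 0.25 leaves 0.375 in each tail, so the upper endpoint is the 62.5th percentile of the standard normal. The sketch below assumes the intended interval is symmetric about zero, from $-z^*$ to $z^*$; `central_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def central_z(coverage: float) -> float:
    """z* with P(-z* < Z < z*) = coverage for a standard normal Z."""
    return NormalDist().inv_cdf(0.5 + coverage / 2)

z_star = central_z(0.25)
print(round(z_star, 4))
```

A common slip is to look up 0.25 or 0.625 as if it were the answer; 0.625 is the cumulative probability to look up, not the z-value itself.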