Statistical Inference: Type I Error, Hypothesis Testing, and Confidence Intervals - Prof. , Exams of Data Analysis & Statistical Methods

Covers various concepts in statistical inference, including Type I errors, hypothesis testing, and confidence intervals. It includes examples and calculations for determining p-values, confidence intervals, and the appropriate hypothesis tests for given scenarios. The document also discusses the relationship between confidence intervals and hypothesis tests.

Typology: Exams

Pre 2010

Uploaded on 02/13/2009

koofers-user-yog-1

STAT303 Sec 508-

Spring 2007

Exam

Form A

Instructor: Julie Hagen Carroll

October 10, 2007

  1. Don’t even open this until you are told to do so.
  2. Please PRINT your name in the blanks provided.
  3. There are 20 multiple-choice questions on this exam, each worth 5 points. There is partial credit. Please mark your answers clearly. Multiple marks will be counted wrong.
  4. You will have 60 minutes to finish this exam.
  5. If you have questions, please write out what you are thinking on the back of the page so that we can discuss it after I return it to you.
  6. If you are caught cheating or helping someone to cheat on this exam, you both will receive a grade of zero on the exam. You must work alone.
  7. This exam is worth the same as a regular exam (this may differ from section to section).
  8. Good luck!
  1. Suppose that it is commonly assumed that the mean flight time from College Station to Houston is 25 minutes. However, you believe that the true average is greater than this. You randomly choose 30 flights to take and record their times. The mean of your sample is 27. What would be a Type I error?

A. Concluding that the mean is greater than 25 when it really is 25 minutes
B. Concluding that the mean is not greater than 25 when the true mean is 24
C. Concluding that the mean is not greater than 25 when the true mean is 26
D. Concluding that the true mean is 24 when the true mean is 25
E. Two of the above are true.

  2. The following are confidence intervals for π1 − π2 computed from the same data:

90% CI = (0.02, 0.09)
95% CI = (0.01, 0.12)
99% CI = (−0.03, 0.14)

Based on the intervals above, if we were to test H0: π1 = π2 vs. HA: π1 ≠ π2, what would be the corresponding p-value?

A. p-value > 0.10
B. 0.10 > p-value > 0.05
C. 0.05 > p-value > 0.01
D. p-value < 0.01
E. You need a test statistic value to determine the p-value.
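The reasoning behind the confidence-interval question above can be sketched in code. A two-sided test at level α rejects H0 exactly when the 100(1 − α)% interval excludes the null value, so a set of nested intervals brackets the p-value. This sketch is not part of the original exam; the function name `p_value_bracket` and the dictionary layout are assumptions made for illustration.

```python
def p_value_bracket(intervals, null_value=0.0):
    """Bracket a two-sided p-value from confidence intervals.

    `intervals` maps alpha (1 - confidence level) to a (low, high) tuple.
    Hypothetical helper, written only for this illustration.
    """
    lower, upper = 0.0, 1.0
    for alpha, (low, high) in intervals.items():
        if low <= null_value <= high:
            # Interval contains the null value: fail to reject at alpha,
            # so the p-value is at least alpha.
            lower = max(lower, alpha)
        else:
            # Interval excludes the null value: reject at alpha,
            # so the p-value is below alpha.
            upper = min(upper, alpha)
    return lower, upper


# Intervals from the question, keyed by alpha = 1 - confidence level.
exam_intervals = {
    0.10: (0.02, 0.09),
    0.05: (0.01, 0.12),
    0.01: (-0.03, 0.14),
}
bracket = p_value_bracket(exam_intervals)
print(bracket)
```

Only the 99% interval contains 0, so the bracket is bounded by the two most stringent levels.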

  3. An insurance company is conducting a study comparing the average number of accidents for females and males. The company wants to show that, on average, females have fewer accidents than males to justify lower rates for females. What is the appropriate hypothesis?

A. H0: μ_female = μ_male vs. HA: μ_female ≠ μ_male
B. H0: π_female = π_male vs. HA: π_female ≠ π_male
C. H0: μ_female = μ_male vs. HA: μ_female > μ_male
D. H0: μ_female = μ_male vs. HA: μ_female < μ_male
E. H0: π_female = π_male vs. HA: π_female < π_male

  4. A bank wonders whether omitting the annual credit card fee for customers who charge at least $5000 in a year would increase the amount charged on its credit card. The bank makes this offer to a simple random sample of 500 existing credit card customers. The bank then compares the amount charged this year with the amount charged last year for each of these customers. What type of test should be used to analyze this study?

A. A two-sample test of proportions
B. A one-sample t-test
C. A two-sample t-test since the standard deviation is unknown
D. A pooled t-test
E. A paired t-test

  5. Suppose we want to test whether the proportion of patients who come down with a cold during their hospital stay is the same for patients taking Echinacea every day and patients on a placebo drug. One herb company wants to prove that Echinacea lowers the rate at which patients catch a cold, so we set up the hypotheses H0: π1 = π2 and HA: π1 > π2, where π1 is the proportion of people taking the placebo who get a cold during their hospital stay and π2 is the proportion of people taking Echinacea who get a cold. The resulting p-value is 0.2171. What does that mean in the context of the problem?

A. The probability that Echinacea doesn’t keep you from catching a cold is 0.2171.
B. The probability that we find a difference in proportions at least this small assuming that Echinacea doesn’t keep you from catching colds is 0.2171.
C. Under repeated sampling, we would find that patients taking Echinacea every day had the same rate of sickness as patients on a placebo 21.71% of the time, assuming Echinacea actually doesn’t keep you from catching a cold.
D. Under repeated sampling, we would find that patients taking Echinacea every day had at least this much lower rate of sickness about 21.71% of the time, assuming that Echinacea doesn’t keep you from catching a cold.
E. Two of the above are true.

  6. Which of the following best describes the relationship between a (1 − α) × 100% confidence interval for μ1 − μ2 and a 2-sided test of hypotheses for μ1 = μ2 some value?

A. There is no relationship between confidence intervals and hypothesis tests.
B. If μ1 or μ2 fall within the confidence interval, we would reject the null.
C. If μ1 or μ2 fall within the confidence interval, we would fail to reject the null.
D. If the confidence interval contains 0, we would reject the null.
E. If the confidence interval contains 0, we would fail to reject the null.

  7. The purpose of pairing in an experiment is to

A. make the samples independent.
B. increase the degrees of freedom of the t-test so the test has more power.
C. match the observations so that there is less chance of making an error.
D. filter out the variability between the subjects.
E. None of the above are correct.

  8. Suppose you tested the null hypothesis H0: μ1 = μ2 against the alternative HA: μ1 ≠ μ2 and got an average difference in means of 0.5682 with a corresponding p-value of 0.0432. If you were to create a 95% confidence interval for the difference between the two means using the same data, which of the following would be true?

A. The confidence interval would include 0, since 0.0432 < 0.5682.
B. It would be impossible to tell whether the confidence interval would include positive numbers or negative numbers, since we don’t know the value of the test statistic.
C. Under repeated sampling, 95% of the time the confidence interval for the difference in means would include 0.5682.
D. The confidence interval would not include 0.
E. Two of the above are true.

  9. Let μ denote the mean gas mileage of all cars when additive is used. When additive is not used, cars have a mean gas mileage of 18.25. Does using additive improve gas mileage? Test the hypotheses H0: μ = 18.25 vs. HA: μ > 18.25 at α = 0.05. A car manufacturer took a sample of 10 cars and found a sample mean of 18.92 with a standard deviation of 7.47. What do you conclude about using additive?

A. We have evidence to say that using additive improves gas mileage since our p-value is less than α.
B. Since our p-value is less than α, we do not have evidence to say that using additive improves gas mileage.
C. With such a large p-value, we do not have evidence to say that using additive improves gas mileage.
D. We have a large p-value, so we need to increase n in order to increase our power.
E. Two of the above are true.
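The arithmetic behind the gas-mileage question above can be checked with a short script. This is only a sketch under stated assumptions: it computes the one-sample t statistic from the numbers given in the question and compares it with a critical value of 1.833, which is the upper 5% point for 9 degrees of freedom as read from a standard t table (the exam does not say which table students used).

```python
import math

mu0 = 18.25      # hypothesized mean mileage without additive
x_bar = 18.92    # sample mean with additive
s = 7.47         # sample standard deviation
n = 10           # sample size

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu0) / standard_error

# Assumed critical value: upper 5% point of the t distribution with
# n - 1 = 9 degrees of freedom, taken from a standard t table.
t_critical = 1.833

reject_h0 = t_stat > t_critical
print(f"t = {t_stat:.3f}, reject H0 at alpha = 0.05: {reject_h0}")
```

The statistic falls well below the critical value, which corresponds to a large p-value for this small, highly variable sample.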

  10. In the hypothesis test above, what did we have to assume for the test to be valid?

A. The sample was random.
B. The data was normal.
C. The true standard deviation was known.
D. All of the above are necessary assumptions for the test above.
E. Only two of the above are necessary assumptions for the test above.

  11. In the testing procedure for a two-sided HA (≠), we rejected H0 at the α = 0.05 level. Using the same data and set of hypotheses,

A. H0 might not be rejected at the α = 0.01 level.
B. H0 might not be rejected at the α = 0.10 level.
C. the p-value will always be less than α.
D. Two of the above are true.
E. All of the above are true.

  12. We want to see whether there’s a difference in GPRs for men and women at A&M. We ran a hypothesis test and found the p-value to be 0.32. Which of the following is the best conclusion?

A. Women are 32% smarter than men.
B. Men are 32% smarter than women.
C. There’s no difference between men and women.
D. There’s no difference in the average GPR for men and women.
E. There’s no difference in the proportion of 4.0’s in men and women.

  13. Suppose we test H0: μ1 = μ2 vs. HA: μ1 ≠ μ2 with a sample of size 15 for the first population and a sample of size 20 for the second population, both of which are normal, and got t = 3.6. Which of the following is the correct range of the p-value for our test?

A. 0.0025 > p-value > 0.001
B. 0.005 > p-value > 0.002
C. 0.001 > p-value > 0.0005
D. 0.002 > p-value > 0.001
E. 0.005 > p-value > 0.0025
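One way to check the p-value range in the two-sample question above without a printed table is to integrate the t density numerically. The sketch below is illustrative only. It assumes the conservative degrees of freedom, min(n1, n2) − 1, which is a common textbook convention but is not stated in the exam, and the helper names `t_pdf` and `t_upper_tail` are made up for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the area above 0 is 0.5.
    return 0.5 - total * h / 3


n1, n2 = 15, 20
df = min(n1, n2) - 1          # assumed conservative choice: smaller n minus 1
p_value = 2 * t_upper_tail(3.6, df)
print(f"df = {df}, two-sided p-value = {p_value:.4f}")
```

A different degrees-of-freedom convention (for example, the pooled n1 + n2 − 2) would give a smaller p-value, so the choice matters when matching a table range.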

  14. Suppose we tested H0: π = 0.5 vs. HA: π ≠ 0.5 and found our test statistic to be −1.87. What is our p-value?

A. P(Z > −1.87) = 0.9693
B. P(Z < −1.87) = 0.0307
C. 2 × P(Z < −1.87) = 0.0614
D. P(Z > 1.87) = 0.0307
E. 2 × P(Z > 1.87) = 0.0614
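The tail areas quoted in the options above can be reproduced with Python's standard library. This is a minimal sketch, assuming the test statistic is compared against a standard normal distribution as the Z notation in the options suggests.

```python
from statistics import NormalDist

z = -1.87
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Two-sided alternative: double the area in the tail beyond |z|.
one_tail = standard_normal.cdf(-abs(z))
p_value = 2 * one_tail
print(f"one tail = {one_tail:.4f}, two-sided p-value = {p_value:.4f}")
```

Because the normal curve is symmetric, doubling the lower tail below −1.87 gives the same value as doubling the upper tail above 1.87. The doubled value may differ from a table-based answer in the last digit, since tables round the one-tail area first.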

  15. Suppose we wanted to find out whether chocolate and vanilla ice cream had the same number of calories for the 16 different brands of ice cream selected above. We then create a 95% confidence interval for the difference of the mean number of calories in the different brands: (17.34, 52.39), where we took chocolate − vanilla. How might we interpret this confidence interval?

A. Under repeated sampling, 95% of the time the true mean difference between calories in chocolate ice cream and calories for vanilla ice cream would fall between 17.34 and 52.39.
B. Under repeated sampling, 95% of the time, the difference between the sample means would fall in the interval we calculate.
C. The probability that the difference in mean calories for chocolate ice cream and calories for vanilla ice cream falls between 17.34 and 52.39 is 0.95.
D. The probability that this confidence interval contains the true mean difference between calories in chocolate ice cream and vanilla ice cream is 0.95.
E. Under repeated sampling, 95% of the time, this type of confidence interval will contain the true mean difference between calories in chocolate ice cream and calories in vanilla ice cream.

  16. Suppose we suspect that strawberry ice cream has fewer calories than chocolate. We then want to test H0: μc = μs vs. HA: μc > μs, where μc is the mean number of calories in different brands of chocolate ice cream and μs is the mean number of calories in different brands of strawberry ice cream. Suppose that when we ran the test, we found an average difference in the number of calories of 9.42. What would be a good interpretation of α in the context of this problem?

A. α is the probability (over the long run) we conclude that strawberry ice cream has fewer calories than chocolate ice cream when in fact it does.
B. α is the probability (over the long run) we conclude that strawberry ice cream and chocolate ice cream have the same number of calories when in fact they do.
C. α is the probability (over the long run) we conclude that the number of calories in chocolate ice cream is greater than the number of calories in strawberry ice cream when actually they have the same number of calories.
D. α is the probability (over the long run) we conclude that strawberry and chocolate ice cream have the same number of calories when in reality chocolate ice cream has more calories than strawberry.
E. α is the probability that over the long run, we find a difference in calories of 9.42 or more just by chance, assuming that chocolate and strawberry ice cream have the same number of calories.

  17. Suppose you wanted to find out whether there is a difference between the proportion of people 55 and over who voted for an increase in Social Security taxes and the proportion of people under 55 who voted for it. What would your null and alternative hypotheses be?

A. H0: μ = 55 vs. HA: μ ≠ 55
B. H0: π = 55 vs. HA: π ≠ 55
C. H0: μ1 = μ2 vs. HA: μ1 > μ2
D. H0: μ1 = μ2 vs. HA: μ1 ≠ μ2
E. H0: π1 = π2 vs. HA: π1 ≠ π2

  18. The Computer-Assisted Hypnosis Scale (CAHS) is designed to measure a person’s susceptibility to hypnosis. CAHS scores range from 0 (no susceptibility) to 12 (highest possible susceptibility). A study at the University of Texas reported that their undergraduates had a mean CAHS score of μ = 11.2. Suppose that you want to verify that undergraduates at A&M are less susceptible to hypnosis than t-sips. Which of the following situations best describes a Type II error? (Hint: write out the alternative hypothesis in words and then use the definition for Type II error.)

A. Finding significant statistical evidence that the mean CAHS score for Aggies is less than 11.2 when the true mean is 11.5.
B. Finding significant statistical evidence that the mean CAHS score for Aggies is less than 11.2 when the true mean is 4.6.
C. Not finding significant statistical evidence that the mean CAHS score for Aggies is less than 11.2 when the true mean is 11.5.
D. Not finding significant statistical evidence that the mean CAHS score for Aggies is less than 11.2 when the true mean is 4.6.
E. Two of the above are true.

  19. In which of the following situations can you NOT use the normal approximation to conduct the hypothesis test?

A. n = 80, H0: π ≥ 0.9
B. n = 20, H0: π ≥ 0.5
C. n = 60, H0: π = 0.8
D. Two of the above would be invalid for the normal approximation test.
E. None of the above would be valid for the normal approximation test.
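The sample-size check behind the normal-approximation question above can be written as a small function. The exam does not state which rule the course uses, so this sketch assumes the common rule of thumb that both expected counts, n·π0 and n·(1 − π0), must be at least 10. The function name `normal_approx_ok` and the `minimum` parameter are hypothetical.

```python
def normal_approx_ok(n, pi0, minimum=10):
    """Check the assumed rule of thumb: n*pi0 and n*(1 - pi0) both >= minimum."""
    return n * pi0 >= minimum and n * (1 - pi0) >= minimum


# (n, hypothesized proportion) pairs from options A, B, and C.
cases = [(80, 0.9), (20, 0.5), (60, 0.8)]
results = [normal_approx_ok(n, pi0) for n, pi0 in cases]
for (n, pi0), ok in zip(cases, results):
    print(f"n = {n}, pi0 = {pi0}: counts {n * pi0:.0f} and {n * (1 - pi0):.0f}, ok = {ok}")
```

A stricter or looser threshold (some texts use 5 or 15) would change which cases pass, so the `minimum` argument is left adjustable.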

  20. Suppose you tested H0: μ1 = μ2 vs. HA: μ1 ≠ μ2. Your data consisted of two samples with x̄1 = 10 and x̄2 = 12 and the resulting p-value was 0.806. Which of the following is the best interpretation of the p-value for this test?

A. There is an 80.6% chance that the two true means are equal.
B. There is an 80.6% chance of seeing at least this big of a difference in sample means when the true means are equal.
C. If we took many samples from these same populations, 80.6% of the time we would fail to reject H0.
D. If we took many samples from these same populations, 80.6% of the time we would see at least this big of a difference in true means when the sample means are equal.
E. If we took many samples from these same populations, 80.6% of the time we would see at least this big of a difference in sample means when the true means are equal.

  21. Why must we know the sampling distribution of our statistic of interest in order to test hypotheses?

A. Because the sampling distribution must be finite in order to look up the values on the chart.
B. Because we don’t know which chart/table to use until we know what the distribution of the test statistic is.
C. Because we cannot create the test statistic in the first place unless we know what the mean and variance of the statistic are.
D. Because the mean of the statistic of interest is always hypothesized to be 0.
E. Two of the above are true.

1A, 2C, 3D, 4E, 5D, 6E, 7D, 8D, 9E, 10E, 11A, 12D, 13B, 14C, 15E, 16C, 17E, 18D, 19A, 20E, 21B