Statistical Practice Questions: Two-Sample Hypothesis Testing & Confidence Intervals, Exams of Statistics

Answers to practice questions for a statistics final exam focusing on two-sample hypothesis testing and confidence intervals. It includes calculations for various statistical tests such as t-tests, z-tests, and chi-square tests, as well as discussions on concepts like standard errors, degrees of freedom, and confidence intervals.

Typology: Exams

Pre 2010

Uploaded on 09/07/2009

koofers-user-7ri

Stat 20 Fall 06 A. Adhikari

ANSWERS TO PRACTICE QUESTIONS FOR THE FINAL

  1. The box has 155 tickets. Each ticket has two parts: the left side shows the 1/0 that will be the result if the patient gets assigned to the treatment group, and the right side shows the 1/0 that will be the result if the patient gets assigned to the control group. We see a simple random sample of 78 of the left sides, and the remaining 77 right sides. The null hypothesis states that the proportion of 1’s on the left side of all 155 tickets is equal to the proportion of 1’s on the right side of all 155 tickets (that is, the treatment has no effect).

  2. z = 3.46. This is the usual “two sample” calculation in the context of the randomized experiment. The SE for the difference is 7.21%.

  3. (ii)

$\binom{100}{30}\,(0.3)^{30}\,(0.7)^{70}$
  1. 7.97%. Expect 30 blue tickets, SE 4.58, z = 0.11.
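The 7.97% figure can be checked numerically. The sketch below assumes the box model implied by the quoted numbers (100 draws, chance 0.3 of a blue ticket on each draw, which gives the expected value 30 and SE 4.58); it compares the exact binomial chance of exactly 30 blue tickets with the normal approximation over 29.5 to 30.5. The names `n`, `p`, `k`, and `normal_cdf` are illustrative.

```python
from math import comb, erf, sqrt

# Assumed box model: 100 draws, chance 0.3 of a blue ticket on each draw
# (inferred from the quoted expected value 30 and SE 4.58).
n, p, k = 100, 0.3, 30

expected = n * p
se = sqrt(n * p * (1 - p))

# Exact binomial chance of exactly k blue tickets.
exact = comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Normal approximation with the continuity correction: area over 29.5 to 30.5.
z = 0.5 / se
approx = normal_cdf(z) - normal_cdf(-z)

print(f"expected = {expected:.0f}, SE = {se:.2f}, z = {z:.2f}")
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The 7.97% in the answer comes from reading the normal table at z = 0.10; using the unrounded z gives a slightly larger area.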

  2. (i). The chance of hitting the theoretical probability exactly will decrease.

  3. (ii) At least 79. This is the 80th percentile of the midterm distribution.

  4. 85.8.

  5. 78.82%. The rms error is 8.43.

  6. (ii) There’s a perfectly good random sample but the two sets of responses are dependent because they are obtained from the same people. There is no information about the nature of the dependence.

  7. The box has 15,000 tickets, one for each patient. Each ticket shows blood pressure. The average and SD of the box are unknown. The null hypothesis says that the average of the box is equal to 120.

  8. The average of the box is less than 120.

  9. t (degrees of freedom = 5) = −1.2.

  10. The data support the investigators’ belief. (Or, the data do not support the conclusion that the population average is lower than 120.)

  11. 94.21%. Binomial n = 5, p = 0.2, k = 0, 1, 2. Add the three terms.
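As a quick check of the 94.21% answer, the three binomial terms can be added directly. This is a minimal sketch using only the values quoted in the answer (n = 5, p = 0.2, k = 0, 1, 2); the variable names are illustrative.

```python
from math import comb

n, p = 5, 0.2

# Binomial chance of exactly k successes, for k = 0, 1, 2.
terms = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(3)]
total = sum(terms)

for k, term in enumerate(terms):
    print(f"k = {k}: {term:.5f}")
print(f"total = {total:.4%}")
```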

  12. 8.075%. I expect to lose $0 give or take $35.36.

    1. Expect 80 hits with an SE of 8, and z = −1.3. You get 69.6, and should use the continuity correction to see that 69 is a better answer than 70.
  13. (ii) is approximately 25%. This is an “Are you awake?” question. With hundreds of throws, the distribution of the number of hits is roughly normal and centered at 20%. Therefore on a single day the chance of “more than 20% hits” is about 50%. The throws are independent, so the answer is 0.5 × 0.5 = 0.25.

  14. (ii) equal to 15.5%. It’s the center of the interval; recall the method of construction. This is another “Are you awake?” question.

  15. (ii) goes from 12.25% to 18.75%. The z is 2.6. The SE for the percent is 1.25%, because the distance between the center and each end of the 95%-CI must be 2 times the SE for the percent. [See if you can find the sample size (at least to a pretty good approximation) and the number of senior citizens in the sample. Those numbers are not necessary for the problems here, but it’s instructive to find them.]
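The interval arithmetic, and one way to approach the bracketed exercise, can be sketched as follows. The sample-size estimate assumes the SE for the percent was computed as sqrt(p(1 − p)/n) from the sample percent, with no correction factor; that assumption and the variable names are not from the original.

```python
center = 15.5  # sample percent (center of the interval)
se = 1.25      # SE for the percent
z = 2.6

low, high = center - z * se, center + z * se
print(f"interval: {low:.2f}% to {high:.2f}%")

# Rough sample size, ASSUMING SE% = 100 * sqrt(p * (1 - p) / n)
# with p estimated by the sample percent and no correction factor.
p = center / 100
n_est = p * (1 - p) / (se / 100) ** 2
count_est = p * n_est
print(f"approximate sample size: {n_est:.0f}")
print(f"approximate number of senior citizens in the sample: {count_est:.0f}")
```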

  1. 180 through 220. In 400 tosses of a fair coin you expect 200 heads give or take 10. It’s a two-sided test so the critical values of z are ±2.05.
    1. It’s 96% of 300.
    1. According to our earlier calculation, the students who will conclude that the coin is fair will be those who get 180 to 220 heads inclusive. In 400 tosses of a coin which lands heads with chance 0.6, the expected number of heads is 240 with an SE of $\sqrt{400 \times 0.6 \times 0.4} \approx 9.8$. Convert 179.5 and 220.5 to standard units to get z = −6.17 and −1.99 respectively. The area in that region is essentially equal to the area to the left of −2, which is 2.275%. We have shown that 2.275% of the students will conclude that the coin is fair, which is the wrong conclusion for this coin. But the remaining 100% − 2.275% = 97.725% of the students will conclude, correctly, that the coin is not fair.

If you used 95% instead of 95.45% as the area in the range ±2, that’s OK. You should still get 293 as your answer.
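The calculation for the coin that lands heads with chance 0.6 can be reproduced with a short script. This is a sketch under the numbers quoted above (400 tosses, acceptance range 180 to 220 heads, 300 students); it uses the exact normal CDF rather than the table value at z = −2, so the percentage differs slightly from 2.275%.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n, p_true = 400, 0.6
students = 300

expected = n * p_true
se = sqrt(n * p_true * (1 - p_true))

# Acceptance range 180..220 with the continuity correction.
z_low = (179.5 - expected) / se
z_high = (220.5 - expected) / se

# Chance that a student wrongly concludes the coin is fair.
accept = normal_cdf(z_high) - normal_cdf(z_low)
correct = round((1 - accept) * students)

print(f"SE = {se:.2f}, z from {z_low:.2f} to {z_high:.2f}")
print(f"chance of concluding 'fair' = {accept:.3%}")
print(f"students concluding 'not fair' = {correct}")
```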

  1. 15.56%. You can draw a chart (which is pretty big; there are 90 possibilities because you remove the diagonal from a 10 × 10 grid). Or you can find the chances of SS, TT, II separately and add up.
  2. 30%. The easiest way is to reason by symmetry, as you do for “What’s the chance that the second card dealt from a standard deck is an ace?” Or you can use the chart to see that the fraction is 27/90.
  3. 33.33%. Given that a vowel gets used up on the second draw, only 9 tickets are possibilities for the first. Of these, 3 are T’s.
  4. (ii)
  5. (iii). The distribution of the sample is clearly non-normal.
  6. (iii). It’s the histogram of all possible sample averages and all their probabilities.
  7. I’ll accept anything in the range 24% to 26%. There are 120 draws with replacement from a box which contains one ticket marked $1, four marked −$1.5, and one marked $6. The expected sum of the draws is $20 with an SE of $30.23 (use the old formula from Chapter 4 to find the SD of the box). So z is just around −0.67.
  8. 97.98%. This is about the number of bets my friend wins, so the box has 4 tickets marked 1 and 2 tickets marked 0. My friend has to win at least 70 bets (if he wins 70 then I win 50, so he wins 20 more). His expected number of wins is 80 with an SE of 5.16. Convert 69.5 to standard units to get z = −2.03.
  9. 88.87%. This is 1 minus the chance that we both win the same number of bets, that is, 1 minus the chance that my friend wins 6 bets. Use the binomial formula with n = 12, p = 4/6 and k = 6. Yes, that’s exactly the same as using the binomial formula with n = 12, p = 2/6, and k = 6.
  10. (ii), because the sample sizes are equal. If you want to calculate anything, compute the SE for the percent in both cases; they’re almost equal, even if you use the correction factor. Those are estimates based on the sample percents so the exact SEs will be slightly different, and we’ll never know what they are. But for them to differ by a factor of 2, something hugely unlikely has to have happened: namely that two very similar random samples have come out of two hugely different populations. Don’t bet on it. By the way, the square root law works on sample sizes, not population sizes.
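Returning to the 88.87% answer above (the chance that the two players do not win the same number of the 12 bets), the binomial formula can be evaluated directly. The sketch also confirms the remark that p = 4/6 and p = 2/6 give the same value at k = 6; variable names are illustrative.

```python
from math import comb

n, k = 12, 6

# Chance my friend wins exactly 6 of 12 bets (win chance 4/6 per bet).
tie_friend = comb(n, k) * (4 / 6) ** k * (2 / 6) ** (n - k)

# Same event counted from my side (win chance 2/6 per bet).
tie_me = comb(n, k) * (2 / 6) ** k * (4 / 6) ** (n - k)

not_tie = 1 - tie_friend
print(f"chance of a tie = {tie_friend:.4%}")
print(f"chance of no tie = {not_tie:.4%}")
```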

  1. −2.42% to 6.42%. The difference is estimated as 2% give or take the SE for the difference which is 2.21%. Note: differences can be negative, so there’s no problem with the use of the normal curve even though the SE is bigger than the expected value.
  2. (i) are equal, because the confidence interval contains 0. This means the data show that 0 is a reasonable value for the difference between the population percents of Republicans. If you want, you can do the usual two-sample test for the difference. You will find that you are essentially re-doing the calculation in the previous problem.
  3. 37.5%. It’s 3/8. Either use binomial n = 3, p = 0.5, k = 1; or work out the chances of BGG, GBG, GGB and add up.
  4. She should use the χ^2 test. The null hypothesis says that her belief is correct (i.e. that her model is good), and this determines the box model. The box represents all possible families with 3 children; there are 263 draws at random with replacement. The tickets show the number of boys: 0, 1, 2, and 3. In the previous problem you figured out that according to the investigator the proportion of 1’s in the box should be 0.375. Using the same method you can figure out the other proportions in the box: 0.125 0’s, 0.375 2’s, and 0.125 3’s.
  5. χ^2 (d.f.=3) is 2.78. Remember to work in counts, not in proportions or percents.
  6. Between 30% and 50%.
  7. The data support the investigator’s belief.
  8. I’ll accept anything in the range 54% to 56%.

The regression effect tells you, without any calculation, that the answer must be bigger than 50%. Here are the calculations.

The z corresponding to the 40th percentile of the standard normal curve is −0.25. So the 40th percentile of final scores is −0.25 × 12 + 70 = 67. The given midterm score is z = −0.25 in standard units and the corresponding regression estimate of the final score is 68.2. The r.m.s. error is 9.6. Use the normal curve to find the percent over 67: now z is −0.125, and 0.125 is halfway in between 0.1 and 0.15 on your table. Hence the range of answers; not surprisingly, the actual answer is 55%.
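The steps above can be sketched in a few lines. The correlation r = 0.6 is not quoted in this answer; it is inferred here from the regression estimate 68.2 and the r.m.s. error 9.6, and should be treated as an assumption. The rounded table value z = −0.25 is used for the 40th percentile, as in the text.

```python
from statistics import NormalDist

final_avg, final_sd = 70, 12
z_pct = -0.25  # rounded table value for the 40th percentile
r = 0.6        # ASSUMED: inferred from the estimate 68.2 and r.m.s. error 9.6

# 40th percentile of the final scores.
cutoff = final_avg + z_pct * final_sd

# Regression estimate for a midterm score at z = -0.25, and its r.m.s. error.
estimate = final_avg + r * z_pct * final_sd
rms_error = (1 - r**2) ** 0.5 * final_sd

# Percent of this group scoring above the cutoff on the final.
z = (cutoff - estimate) / rms_error
chance = 1 - NormalDist().cdf(z)

print(f"cutoff = {cutoff:.1f}, estimate = {estimate:.1f}, r.m.s. error = {rms_error:.1f}")
print(f"z = {z:.3f}, chance above cutoff = {chance:.1%}")
```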

  1. Roughly normal, center 20%, spread 1.96% (with the correction factor).
  2. 64% (63.9984% if you are careful about the non-replacement). It’s the chance of “the first one is not a freshman, and the second one is not a freshman.”
  3. 32% (more carefully, 32.0032% reasoning as above).
  4. There are three bars, centered 0, 1, and 2. The areas of the bars are respectively 64%, 32%, and 4%.
  5. 0.3333. It’s 1/3, and yes, it is that easy. For what follows, draw a tree diagram: the first stage consists of the die (D1, D2, D3), and the second stage consists of the color on the first roll (R, B).
  1. 0.5 = [(1/3) × (1/6)] + [(1/3) × (1/2)] + [(1/3) × (5/6)]. That is, red and blue are equally likely. To understand the discussion of independence in a later problem, it will be useful to notice that the calculation and the answer in this problem would have been exactly the same if the question had asked for the chance that the second roll shows a red face.
  2. 0.556 = 5/9 = [(1/3) × (5/6)] / 0.5 by Bayes’ rule. So given that a red face appeared, the chance that Die 3 was rolled has gone up from its prior value of 1/3 to 5/9.
  3. (ii) No, because if the first roll shows a red face then Die 3 is more likely than either of the others, which makes red more likely than blue.

Contrast this with our usual statement that “successive rolls of a die are independent”. They are, if you know which die you’re rolling. But if the die itself is unknown because it was picked randomly, then information about the first few rolls can provide information about the die, as above, which in turn can affect the probabilities of events in future rolls.
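The dependence between rolls can be made concrete with exact fractions. This sketch uses the chances of red quoted in the calculations above (1/6, 1/2, 5/6 for the three dice, each die picked with chance 1/3); the labels `D1`, `D2`, `D3` follow the tree diagram, and the final quantity, the chance that the second roll is red given that the first was, is an added illustration rather than one of the original answers.

```python
from fractions import Fraction as F

prior = {"D1": F(1, 3), "D2": F(1, 3), "D3": F(1, 3)}
red = {"D1": F(1, 6), "D2": F(1, 2), "D3": F(5, 6)}

# Unconditional chance that a roll shows red.
p_red = sum(prior[d] * red[d] for d in prior)

# Bayes' rule: chance of each die given that the first roll was red.
posterior = {d: prior[d] * red[d] / p_red for d in prior}

# Chance the second roll is red given the first was red
# (rolls are independent once the die is known).
p_red2_given_red1 = sum(posterior[d] * red[d] for d in prior)

print("P(red) =", p_red)
print("P(D3 | red) =", posterior["D3"])
print("P(second red | first red) =", p_red2_given_red1)
```

Since the conditional chance of red on the second roll differs from the unconditional 1/2, the two rolls are not independent when the die is unknown.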

  1. 5%.

You can’t use the “SE for the difference” formula because the proportion “for” and the proportion “against” are seriously dependent. After all, if you know one then you can find the other; the correlation is −1. You have to look more carefully at what is being estimated.

Let p be the population proportion of voters for the proposition. Then the margin of victory is p − (1 − p) = 2p − 1. So its estimate is 2p̂ − 1, where p̂ is the sample proportion of “for” voters. In our sample the observed value of p̂ is 0.53 and the estimate is (2 × 0.53) − 1 = 0.06 as stated in the problem. Now think of properties of standard deviation: the −1 will not affect the SE, but the factor of 2 will. So the SE of our estimate is 2 times the SE of the sample proportion “for”, that is, $2 \times \sqrt{0.53 \times 0.47 / 400} \approx 0.05$.

You can see this in another way, as follows. Construct the 68%-confidence interval for the proportion “for”. That’s $0.53 \pm \sqrt{0.53 \times 0.47 / 400}$, which is (50.5%, 55.5%). The margins of victory corresponding to the two endpoints are respectively 1% and 11%. So (1%, 11%) is a 68%-confidence interval for the margin of victory. The SE for the margin of victory must be half the width of the interval, which is 5%.
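Both routes to the 5% can be checked numerically. This is a minimal sketch using the sample size 400 and sample proportion 0.53 quoted above; the variable names are illustrative.

```python
from math import sqrt

n, p_hat = 400, 0.53

# SE of the sample proportion "for".
se_p = sqrt(p_hat * (1 - p_hat) / n)

# Route 1: the margin 2p - 1 has twice the SE of p.
margin = 2 * p_hat - 1
se_margin = 2 * se_p

# Route 2: map the endpoints of the 68% interval for p through 2p - 1.
low_p, high_p = p_hat - se_p, p_hat + se_p
low_m, high_m = 2 * low_p - 1, 2 * high_p - 1
half_width = (high_m - low_m) / 2

print(f"estimated margin = {margin:.2%}, SE = {se_margin:.2%}")
print(f"68% interval for the margin: {low_m:.1%} to {high_m:.1%}")
print(f"half-width = {half_width:.2%}")
```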