Statistical Inference Homework #2 - Prof. Guoqing Diao, Study Guides, Projects, Research of Statistics

The second homework assignment for STAT 554, a course on statistical inference. It includes problems on confidence intervals, hypothesis testing, and power calculations.

Uploaded on 12/09/2008

STAT 554: HW #2
due March 3, 2008
As with HW #1, put all of your answers on the answer sheet (posted on the course web site). Express all
confidence intervals in the form (lower confidence bound, upper confidence bound), as opposed to (point
estimate ±δ). Also, for the first problem, round the confidence bounds to the nearest tenth, and for all of
the sample size computations report the exact integer value. For the power values requested in the second
problem, round to the nearest thousandth. Otherwise, round answers to two significant digits, unless specific
instructions are given to do something different. The points total to 13.5, but I’ll truncate scores at 13 for
this assignment.
1) Suppose that the percent moisture was measured for batches of a dozen seeds from the wheat heads of
six randomly selected wheat plants (by weighing, drying, and then reweighing each batch of seeds), and that
the following measurements were obtained: 62.7, 66.3, 60.6, 63.0, 62.7, 63.7.
(a) (0.5 point) Assume (near) normality and give a 95% confidence interval for the mean percent moisture
for all wheat plants grown in the same conditions as those in the sample.
(b) (0 points) Now give a 90% confidence interval (and note that it is narrower).
(c) (0.5 point) Now give a 99% confidence interval (and note that it is the widest interval of the three).
(d) (1 point) Suppose one is disappointed that the 95% confidence interval of part (a) is as wide as it is, and
wants to supplement the sample with additional observations and produce a 95% confidence interval that
has a half-width no greater than 0.5 (making the total width of the interval less than or equal to 1.0). If
one merely combines new data with the original six observations to produce a single sample and forms
the confidence interval from that sample in the usual way, it’s not clear how many new observations
should be added since the width of the resulting interval depends upon the sample standard deviation,
and that value will not be known until the new observations are taken. However, if Stein’s two-stage
procedure is used, one can predetermine the width of the confidence interval which results from adding
a given number of new observations to the sample. Here’s how Stein’s method works: determine the
sample standard deviation $s_0$ from an initial sample of $n_0$ observations, then add $n_1$ more observations
to produce a combined sample of size $n^* = n_0 + n_1$ and compute the sample mean $\bar{x}^*$ from this combined
sample. A confidence interval follows from the fact that
$$\sqrt{n^*}\,\frac{\bar{X}^* - \mu_X}{S_0} \sim T_{n_0 - 1},$$
and this result means that once the value $s_0$ is obtained from the initial sample, the width of the final
interval that will result from adding $n_1$ observations can be determined before those data values are
obtained. I want you to use the initial sample of six observations to determine the fewest number
of additional observations that will result in a 95% confidence interval formed using Stein’s two-stage
procedure having a half-width no greater than 0.5. Also, give the resulting confidence interval if the
sample mean of the $n_1$ additional observations is 61.38.
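The two-stage recipe in part (d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's own solution: the function names are made up for this sketch, and the t critical value (upper 0.975 quantile with 5 degrees of freedom) is typed in from a standard t table rather than computed.

```python
import math
import statistics

# First-stage sample from problem 1 (percent moisture).
initial = [62.7, 66.3, 60.6, 63.0, 62.7, 63.7]
n0 = len(initial)
s0 = statistics.stdev(initial)  # sample standard deviation, divisor n0 - 1

# Assumption: upper 0.975 quantile of the t distribution with n0 - 1 = 5
# degrees of freedom, copied from a standard t table (not computed here).
T_975_DF5 = 2.570582


def stein_total_size(s0, t_crit, half_width, n0):
    """Smallest total size n* with t_crit * s0 / sqrt(n*) <= half_width."""
    needed = math.ceil((t_crit * s0 / half_width) ** 2)
    return max(n0, needed)  # the first stage is already in hand


def stein_interval(initial, new_mean, n1, t_crit):
    """Two-stage interval: combined mean, spread taken from stage one only."""
    n_star = len(initial) + n1
    combined_mean = (sum(initial) + n1 * new_mean) / n_star
    half = t_crit * statistics.stdev(initial) / math.sqrt(n_star)
    return combined_mean - half, combined_mean + half


n_star = stein_total_size(s0, T_975_DF5, half_width=0.5, n0=n0)
n1 = n_star - n0  # number of additional observations
lower, upper = stein_interval(initial, new_mean=61.38, n1=n1, t_crit=T_975_DF5)
print(n_star, n1, round(lower, 1), round(upper, 1))
```

Note that the half-width uses only $s_0$ from the first six observations, which is why it can be fixed before any second-stage data are collected.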
2) $X_1, X_2, \ldots, X_n$ are iid $N(\mu, 400.0)$ random variables. Consider using a $z$ test to do a test about the
(unknown) mean.
(a) (1.5 points) If $n = 25$, what is the power of a size $\alpha = 0.05$ $z$ test of $H_0: \mu = 200$ vs. $H_1: \mu \neq 200$
against the alternative $\mu = 195$?
(b) (1.5 points) For a size $\alpha = 0.05$ $z$ test of $H_0: \mu = 200$ vs. $H_1: \mu \neq 200$, what sample size is needed in
order for the power of the test against the alternative µ= 195 to be approximately 0.80? (Hint: The
value of the power gets a contribution from each of the two portions of the rejection region. Initially,
assume that the contribution from one of the tails is so small that it’s negligible. Then determine the
sample size that makes the contribution from the other tail equal to 0.80. Once you get a sample size
this way, you can wrap things up by confirming that the contribution that was initially assumed to be
negligible really is negligible.)
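The hint in part (b) can be sketched as follows. This is a minimal illustration, assuming that 400.0 in $N(\mu, 400.0)$ is the variance (so $\sigma = 20$); the function names are hypothetical and only the Python standard library is used.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_z_power(mu0, mu1, sigma, n, alpha=0.05):
    """Return the (lower-tail, upper-tail) contributions to the power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is N(shift, 1).
    shift = (mu1 - mu0) * math.sqrt(n) / sigma
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)      # P(Z < -z_crit)
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)   # P(Z > z_crit)
    return lower_tail, upper_tail


def approx_sample_size(mu0, mu1, sigma, power=0.80, alpha=0.05):
    """Sample size from the dominant tail alone, ignoring the other tail."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_crit + z_power) * sigma / abs(mu1 - mu0)) ** 2)


# Assumption: variance 400.0, hence sigma = 20.
sigma = math.sqrt(400.0)

low25, up25 = two_sided_z_power(200, 195, sigma, n=25)
print("power at n = 25:", round(low25 + up25, 3))

n_needed = approx_sample_size(200, 195, sigma)
low, up = two_sided_z_power(200, 195, sigma, n=n_needed)
print("n:", n_needed, "dominant tail:", low, "neglected tail:", up)
```

Printing both tail contributions at the chosen sample size is the confirmation step the hint asks for: the neglected tail should be tiny compared with the rounding used for the power.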
3) Suppose that you observe 11 black outcomes in 15 trials with my toy roulette wheel. Let $p$ be the
probability of obtaining a black outcome, and consider a test of $H_0: p = 9/19$ against $H_1: p \neq 9/19$.
(a) (1.5 points) Report the p-value which results from an exact test which I described in class (and describe in the course notes). (That is, obtain the p-value using the appropriate binomial distribution and not an approximation, and do it using the method described in the notes.)
(b) (0 points) Report the p-value which results from using the normal approximation to the binomial distribution (employing a continuity correction) to double a one-tail probability.
Now use the results from the 15 trials to form (possibly approximate) 95% confidence intervals for $p$, rounding confidence bounds to the place corresponding to the second significant digit of the estimated standard error of the sample proportion. (Note: See the homework web page of the STAT 554 web site for a link to Minitab commands which can be used to compute 4 of the 5 intervals requested below. You should be able to modify the commands to also compute the other interval that is requested.)
(c) (0.5 point) Use the simple approximation (the Wald interval) given on p. 79 of the class notes.
(d) (0.5 point) Use the modification of the simple approximation which incorporates a continuity correction.
(e) (0.5 point) Use the Agresti-Coull interval (described in class, and on the course web site (see the homework web page)).
(f) (0.5 point) Use the more complex approximation (the Wilson interval) given on the bottom half of p. 80.
(g) (0 points) Use the exact formula (the Clopper-Pearson interval) given near the bottom of p. 80.
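Three of the intervals named above (Wald, Agresti-Coull, Wilson) have simple closed forms, sketched here in Python rather than Minitab. The formulas on pp. 79–80 of the class notes are not reproduced in this assignment, so the code uses the usual textbook forms, which may differ in detail from the notes. In particular, the Agresti-Coull version below adds $z^2/2$ successes and $z^2/2$ failures, and some presentations add exactly 2 of each instead. The continuity-corrected and Clopper-Pearson intervals are left out, and the rounding rule from the problem statement is not applied.

```python
import math
from statistics import NormalDist


def _z(conf):
    """Two-sided standard normal critical value for confidence level conf."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def wald(x, n, conf=0.95):
    """Simple approximation: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = _z(conf)
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson(x, n, conf=0.95):
    """Wilson (score) interval, textbook form."""
    z = _z(conf)
    p = x / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def agresti_coull(x, n, conf=0.95):
    """Agresti-Coull: Wald form applied to the adjusted counts (assumed variant)."""
    z = _z(conf)
    n_adj = n + z * z
    p_adj = (x + z * z / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half


# 11 black outcomes in 15 trials, as in problem 3.
for name, fn in [("Wald", wald), ("Agresti-Coull", agresti_coull), ("Wilson", wilson)]:
    lo, hi = fn(11, 15)
    print(f"{name}: ({lo:.4f}, {hi:.4f})")
```

With a sample this small the three intervals differ noticeably, and the Wald interval is centered at the sample proportion while the other two are pulled toward 1/2.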

4) Consider the situation described in Exercise 11 on p. 277 of Statistical Concepts and Methods. Give 95% confidence intervals using each of the methods indicated below, rounding confidence bounds to the place corresponding to the second significant digit of the estimated standard error of the sample proportion. (Note that now that the sample size isn’t nearly as small as it was in the preceding problem, the various intervals aren’t so different.)
(a) (0 points) Use the simple approximation given on p. 79 of the class notes.
(b) (0 points) Use the modification of the simple approximation which incorporates a continuity correction.
(c) (0 points) Use the A-C interval presented in class.
(d) (0 points) Use the more complex approximation given on the bottom half of p. 80.
(e) (0.5 point) Use the exact formula given near the bottom of p. 80.

5) (1.5 points) Consider the computer failure data given on the homework web page of the course web site. Counting a week with no failures as a success and a week with one or more failures as a failure, does the time-ordered data provide significant evidence that the weekly outcomes should not be considered to be the result of an iid Bernoulli process? Report the p-value which results from an appropriate two-tailed test. (In general, I’d use a two-tailed test for a computer failure situation like this one because it could be that the outcomes are positively correlated (if the cause for a failure isn’t properly treated, then it could be that another failure is likely to follow) or negatively correlated (if a failure results in maintenance which decreases the chance of a failure the following week).)

6) (1.5 points) Consider the computer survey example described on pp. 86–88 of the class notes. If 250 of the 1000 records of the 1988 machines are checked, and 14% of those checked required warranty service, can it be safely assumed that more than 10% of all of the 1000 1988 machines required service? Respond to this query by reporting an appropriate p-value.
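Because the 250 records are drawn without replacement from a finite population of 1000, one way to approach this question is an exact upper-tail hypergeometric probability. The sketch below is an assumption about a reasonable method, not necessarily the one on pp. 86–88 of the class notes. It takes 14% of 250 to mean 35 machines and uses the boundary null hypothesis that exactly 100 of the 1000 machines (10%) required service; the function name is hypothetical.

```python
import math
from math import comb
from statistics import NormalDist


def hypergeom_upper_tail(pop_size, pop_successes, sample_size, observed):
    """P(X >= observed) when sampling without replacement."""
    total = comb(pop_size, sample_size)
    top = min(pop_successes, sample_size)
    favorable = sum(
        comb(pop_successes, k) * comb(pop_size - pop_successes, sample_size - k)
        for k in range(observed, top + 1)
    )
    return favorable / total  # exact integers, converted to float at the end


# Assumptions: 35 = 14% of 250 checked records; 100 = 10% of 1000 machines.
p_exact = hypergeom_upper_tail(1000, 100, 250, 35)

# Normal approximation with a finite population correction and a
# continuity correction, for comparison only.
mean = 250 * 0.10
sd = math.sqrt(250 * 0.10 * 0.90 * (1000 - 250) / (1000 - 1))
p_approx = 1 - NormalDist().cdf((35 - 0.5 - mean) / sd)

print(p_exact, p_approx)
```

The finite population correction matters here because a quarter of the population is sampled; ignoring it would overstate the standard deviation and inflate the p-value.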

7) (1.5 points) Do part (b) of Exercise 20 on pp. 278–279 of Statistical Concepts and Methods, rounding each confidence bound to the nearest tenth.

A just-for-fun problem (for people with a lot of spare time): Go get a die from your backgammon set and roll it 10^6 times, counting the number of aces that you observe. After you’re done (perhaps by sometime in July), determine the p-value which results from a test of $H_0: P(\text{ace}) = 1/6$ against $H_1: P(\text{ace}) \neq 1/6$. (Feel free to make use of some sort of a large sample approximation.)

Extra Credit Problem

This problem pertains to the situation of the first problem, involving the six moisture measurements. If you choose to do the extra credit problem, turn it in on paper not attached to the answer sheet for the rest of the problems, and show work to justify your answer. This problem is worth one point, and you are to work entirely on your own, not discussing the problem with anyone or getting help from anyone. Based on the original sample of size six, what coverage probability is associated with the interval ($\bar{x} \pm 1.5$)?