Statistical Analysis Homework: Comparing Two Groups and Hypothesis Testing - Prof. Peter H | Assignments Statistics

Stat 502

Homework 2

Assigned 10/9/08

Due 10/16/08

1. (Voles): Researchers studying 24 wild prairie voles are interested in the effects of food supplements on reproductive success. The researchers randomly assigned 12 voles to receive supplements (group B), with the remaining 12 voles needing to forage to obtain all of their food (group A). At the end of the mating season the number of pups for each of the 24 females was recorded.

(a) Make a histogram and boxplots for each of the two groups. Comment on the differences.

(b) Compute the means and medians of each group. Comment on the differences.

(c) Plot the density of the appropriate t-distribution if one were to use the ordinary two-sample t-test to evaluate differences between the groups. Obtain the corresponding p-value. Write down the assumptions which validate the use of this p-value, and comment on whether or not they are met for these data.

(d) Make a histogram of the randomization distribution of the t-statistic, and compute the corresponding p-value. Write down the assumptions which validate the use of this p-value, and comment on whether or not they are met for these data.
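One way to approach part (d) is sketched below in Python, using only the standard library. The vole data are not reproduced in this assignment text, so `group_a` and `group_b` hold made-up placeholder counts, and the function names `t_stat` and `randomization_test` are illustrative choices rather than anything prescribed by the course. The sketch approximates the randomization distribution by repeatedly shuffling the group labels instead of enumerating all possible assignments, and it uses a two-sided p-value.

```python
import random
import statistics


def t_stat(a, b):
    """Pooled-variance two-sample t-statistic, with mean(b) - mean(a) on top."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(b) - statistics.mean(a)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


def randomization_test(a, b, n_perm=2000, seed=0):
    """Monte Carlo approximation to the randomization distribution of t."""
    rng = random.Random(seed)
    observed = t_stat(a, b)
    pooled = list(a) + list(b)
    na = len(a)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-assign group labels at random
        null_stats.append(t_stat(pooled[:na], pooled[na:]))
    # Two-sided p-value: share of shuffled statistics at least as extreme.
    p_value = sum(abs(t) >= abs(observed) for t in null_stats) / n_perm
    return observed, null_stats, p_value


# ASSUMPTION: placeholder pup counts for illustration only, NOT the vole data.
group_a = [1, 2, 0, 3, 2, 1, 4, 2, 3, 1, 2, 0]
group_b = [3, 4, 2, 5, 3, 6, 4, 2, 5, 3, 4, 7]

observed, null_stats, p_value = randomization_test(group_a, group_b)
print(f"observed t = {observed:.3f}, randomization p-value = {p_value:.4f}")
```

The list `null_stats` is what would be passed to a histogram routine. The only probabilistic ingredient here is the random assignment of voles to groups, which is the point to weigh when comparing the assumptions in (c) and (d).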

2. (Null distribution of p-values): Recall that a p-value is a function of the data, and so before the experiment is run it is a random variable.

(a) Consider the one-sample t-test for evaluating evidence against $H_0: \mu = \mu_0$. Derive the distribution of the p-value under the null hypothesis. Using the result, show that the type I error of a level-$\alpha$ test is $\alpha$.

(b) Now consider the hypothesis $H_0: \mu_A = \mu_B$. Via simulation, compute the null distribution of the p-value based on the two-sample t-test under each of the following six experimental scenarios.

i. $Y_{1,A}, \ldots, Y_{10,A} \sim$ i.i.d. normal(1,1), $Y_{1,B}, \ldots, Y_{10,B} \sim$ i.i.d. normal(1,1)

ii. $Y_{1,A}, \ldots, Y_{10,A} \sim$ i.i.d. normal(1,1), $Y_{1,B}, \ldots, Y_{10,B} \sim$ i.i.d. normal(1,3)

iii. $Y_{1,A}, \ldots, Y_{10,A} \sim$ i.i.d. Poisson(1), $Y_{1,B}, \ldots, Y_{10,B} \sim$ i.i.d. Poisson(1)

iv. $Y_{1,A}, \ldots, Y_{10,A} \sim$ i.i.d. normal(1,1), $Y_{1,B}, \ldots, Y_{10,B} \sim$ i.i.d. normal(3,1)

v. $Y_{1,A}, \ldots, Y_{10,A} \sim$ i.i.d. normal(1,1), $Y_{1,B}, \ldots, Y_{10,B} \sim$ i.i.d. normal(3,3)

vi. $Y_{1,A}, \ldots, Y_{10,A} \sim$ i.i.d. Poisson(1), $Y_{1,B}, \ldots, Y_{10,B} \sim$ i.i.d. Poisson(3)

More specifically, for each of the above scenarios,

• simulate 1000 (or more) runs of the 10-sample experiment;

• compute the p-value for each of the 1000 runs;

• plot the empirical distribution of the 1000 p-values (using, for example, a histogram) and compare to the distribution from part (a).

Write a sentence or two about what you have learned from each of these simulations (hint: you should have learned something about robustness of the t-test, power of the t-test and about whether your calculation in part (a) was correct).
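The simulation loop described above might be organized as in the following Python sketch, which assumes NumPy and SciPy are available. The helper name `simulate_p_values` and the dictionary layout are illustrative. Only scenarios i, iii and iv are shown: the assignment text does not say whether the second argument of normal(·,·) is a variance or a standard deviation, and those three scenarios are the ones where the distinction does not matter (the second argument is 1). The pooled-variance test is requested via `equal_var=True`, and histogram counts are printed instead of plotted to keep the example free of plotting dependencies.

```python
import numpy as np
from scipy import stats


def simulate_p_values(draw_a, draw_b, n_runs=2000, seed=0):
    """Run the two-group experiment n_runs times; return the t-test p-values."""
    rng = np.random.default_rng(seed)
    p_values = np.empty(n_runs)
    for r in range(n_runs):
        y_a = draw_a(rng)
        y_b = draw_b(rng)
        # Ordinary (pooled-variance) two-sample t-test, two-sided.
        p_values[r] = stats.ttest_ind(y_a, y_b, equal_var=True).pvalue
    return p_values


n = 10  # observations per group, as in the assignment

# ASSUMPTION: only scenarios whose normal(., 1) spread is unambiguous are listed.
scenarios = {
    "i": (lambda rng: rng.normal(1, 1, n), lambda rng: rng.normal(1, 1, n)),
    "iii": (lambda rng: rng.poisson(1, n), lambda rng: rng.poisson(1, n)),
    "iv": (lambda rng: rng.normal(1, 1, n), lambda rng: rng.normal(3, 1, n)),
}

results = {}
for name, (draw_a, draw_b) in scenarios.items():
    p = simulate_p_values(draw_a, draw_b)
    results[name] = p
    counts, _ = np.histogram(p, bins=10, range=(0, 1))
    print(name, "share of p < 0.05:", np.mean(p < 0.05), "bin counts:", counts)
```

Roughly equal bin counts suggest a flat p-value distribution, while counts piled into the first bin indicate that the test rejects often. The remaining scenarios can be added to `scenarios` once the parameterization of normal(1,3) and normal(3,3) has been settled.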

3. (t-distribution):

(a) Recall that we defined the $\chi^2_m$ distribution as the distribution of a sum of $m$ squared standard normal random variables. Thus if $X \sim \chi^2_m$ then we can think of $X$ as being represented by $X = Z_1^2 + \cdots + Z_m^2$ where $Z_1, \ldots, Z_m \sim$ i.i.d. normal(0,1). Use this to derive the distribution of $X_1 + X_2$, where $X_1 \sim \chi^2_{m_1}$, $X_2 \sim \chi^2_{m_2}$, and $X_1$ and $X_2$ are independent.

(b) Let $Y_1, \ldots, Y_n \sim$ i.i.d. normal($\mu, \sigma^2$), and let $Z_i = (Y_i - \mu)/\sigma$.

i. What is the distribution of $Z_1, \ldots, Z_n$?

ii. Write out $(Z_i - \bar{Z})$ in terms of the $Y$'s, $\mu$ and $\sigma^2$.

iii. Use the fact that $\sum_{i=1}^{n} (Z_i - \bar{Z})^2 \sim \chi^2_{n-1}$ to derive the distribution of $\sum_{i=1}^{n} (Y_i - \bar{Y})^2/\sigma^2$.

(c) Let $Y_{1,A}, \ldots, Y_{n_A,A} \sim$ i.i.d. normal($\mu_A, \sigma^2$) and $Y_{1,B}, \ldots, Y_{n_B,B} \sim$ i.i.d. normal($\mu_B, \sigma^2$). Use the results in (a) and (b) to obtain the distribution of $(n_A - 1)s_A^2/\sigma^2 + (n_B - 1)s_B^2/\sigma^2$. Indicate how you are using the results from (a) and (b). Note that $(n_A - 1)s_A^2/\sigma^2 + (n_B - 1)s_B^2/\sigma^2$ is equal to $(n_A + n_B - 2)s_p^2/\sigma^2$.

(d) Use the above results to derive the distribution of the two-sample t-statistic under $H_0: \mu_A = \mu_B$.
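A derivation for part (a) can be sanity-checked numerically. The Python sketch below (standard library only) builds chi-square draws directly from the definition as sums of squared standard normals, adds two independent ones, and reports the sample mean and variance of the sum. It relies on the standard fact that a $\chi^2_m$ variable has mean $m$ and variance $2m$. The degrees of freedom `m1 = 3` and `m2 = 5`, the helper name `chi2_draw`, and the number of draws are arbitrary illustrative choices. Matching two moments is only a consistency check, not a proof.

```python
import random
import statistics


def chi2_draw(rng, m):
    """One chi-square(m) draw, built as a sum of m squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))


rng = random.Random(0)
m1, m2, n_draws = 3, 5, 20000  # ASSUMPTION: arbitrary illustrative values

# X1 + X2 with X1 ~ chi-square(m1) and X2 ~ chi-square(m2), drawn independently.
sums = [chi2_draw(rng, m1) + chi2_draw(rng, m2) for _ in range(n_draws)]

sample_mean = statistics.fmean(sums)
sample_var = statistics.variance(sums)
print(f"sample mean = {sample_mean:.3f}  (compare with m1 + m2 = {m1 + m2})")
print(f"sample var  = {sample_var:.3f}  (compare with 2(m1 + m2) = {2 * (m1 + m2)})")
```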

4. (Sample size): A researcher needs to estimate the mean circumference of a population of small trees in the Cascade mountains. The researcher would like to construct a confidence interval for the mean, wants this interval to contain the true mean with probability at least 0.95, and would like to estimate the mean with a precision of about 1 cm, i.e. the length of the confidence interval should be no more than 1 cm. Studies from other regions suggest the standard deviation in circumference is about 7 cm. Write out a formula that gives the width of a 95% confidence interval as a function of the sample size n and the data. Make some assumptions that allow you to make a graph of sample size versus width of the confidence interval, where the sample size ranges from n = 2 to a value such that the width is less than or equal to 1 cm. How many trees do you recommend the researcher sample? What assumptions are you making for this recommendation?
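One possible set of simplifying assumptions is sketched below in Python (standard library only): the standard deviation is treated as known and equal to the 7 cm suggested by other regions, and the interval is the normal-based one, so its width is $2 z_{0.975}\,\sigma/\sqrt{n}$. These are assumptions made for illustration, not the required answer. An interval based on the t-distribution and the sample standard deviation would be somewhat wider at the same n. The function names `ci_width` and `smallest_n` are illustrative.

```python
import math
from statistics import NormalDist


def ci_width(n, sigma=7.0, level=0.95):
    """Width of a normal-based confidence interval, treating sigma as known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. the 0.975 quantile
    return 2 * z * sigma / math.sqrt(n)


def smallest_n(target_width=1.0, sigma=7.0, level=0.95):
    """Smallest n >= 2 whose interval width does not exceed target_width."""
    n = 2
    while ci_width(n, sigma, level) > target_width:
        n += 1
    return n


n_needed = smallest_n()

# A few (n, width) pairs of the kind that would go on the requested graph.
for n in (2, 10, 100, 500, n_needed):
    print(n, round(ci_width(n), 3))
```

Under these assumptions the search stops at n = 753. Evaluating `ci_width` over the whole range from 2 to `n_needed` gives the points for the graph of width against sample size.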
