
Material Type: Exam; Professor: Ane; Class: Statistical Methods for Bioscience I; Subject: STATISTICS; University: University of Wisconsin - Madison; Term: Fall 2007;

1(a) df = 10 + (13 − 1) − 2 = 20, since one zero is dropped in the second sample. With a 2-sided test, .02 < p < .05, and we conclude that the two groups have different variances (reject "$\sigma_1 = \sigma_2$").

(b) Using the t-test that does not require equal variances, we get $s^2_{\bar y_1 - \bar y_2} = \frac{8.06^2}{10} + [\ldots] = 6.82$, giving $t = 1.398$. Using df = 9 and a 2-sided test, we get .10 < p < .20. We fail to reject "$\mu_1 = \mu_2$": there is no significant difference between the average sizes of the two Lyrebird species.

2(a) The Mann-Whitney test p-value is 0.028 because the ranks do not change. The t-test p-value is smaller than 0.006 because only $\bar y_1 - \bar y_2$ changes (it increases). The sample sizes (4) and the variability ($s_1$ and $s_2$) stay the same, so t increases and the p-value decreases.

(b) Data set 3 (top). Since the ranks are the same in data set 3 and in data set 1, the p-value from the Mann-Whitney test is 0.028, which is < .05. With data set 4 the rank sums would be 23 and 13, and 13 is too big to get significance at the .05 level (the table says we need 10 or less). Note: data set 3 returns p > .05 with a t-test because the variability in sample "b" is large. Data set 4 also returns p > .05 with a t-test because the difference in sample means is small.

(c) For data set 3, the Mann-Whitney test would be most appropriate, because we are not sure the data are normally distributed (presence of an 'outlier' in sample "b") and the sample size in each sample is small (4). For data set 4, a t-test would be most appropriate, because the normal distribution seems appropriate in each sample and the t-test is more powerful than the Mann-Whitney test when both can be used.

3(a) Smaller with a one-sided test in the correct direction. The formula for n is $n = \sigma^2 (z_{\alpha/2} + z_\beta)^2 / (\mu_0 - \mu_a)^2$ for a 2-sided test, while for a one-sided test we replace $z_{\alpha/2} = z_{.015} = 2.17$ by $z_\alpha = z_{.03} = 1.88$. Since $z_\alpha$ is smaller, n is also smaller for a one-sided test.

(b) Using $n = \sigma^2 (z_{\alpha/2} + z_\beta)^2 / (\mu_0 - \mu_a)^2$ we get $25 = 104\,(2.17 + z_\beta)^2 / (0 - 5)^2$, i.e. $z_\beta = 0.281$, and with Table A, $\beta = .3897$, so the power is 61%. One can get the same result by determining the rejection region: outside $0 \pm 2.17 \times \sqrt{104/25} = 0 \pm 2.17 \times 2.04 = \pm 4.426$. Then the power is $P\{Z < -9.426/2.04\} + P\{Z > -0.574/2.04\} = 0 + P\{Z > -0.281\} = 1 - 0.3897 = 0.61$.

4(a) $\hat p_A = 63/220 = .2864$, $\hat p_C = 59/80 = .7375$ and $\hat p = (63 + 59)/300 = .4067$. We can approximate the binomial with a normal distribution because $220\,\hat p = [\ldots]$. The test statistic is $z = (\hat p_C - \hat p_A) / \sqrt{.4067 \times .5933 \times (1/220 + 1/80)} = 7.034$. Using the normal distribution we get $p < 2 \times 2.9 \times 10^{-7}$, which is far smaller than .0001 and far smaller than $\alpha = .10$. We strongly reject $p_A = p_C$: there is strong evidence that the locus is linked to some genetic region affecting flowering time, the C allele being linked to early flowering. Assumptions include random sampling of plants and independence among them.

(b) Observations are clearly paired: each leaf observation is paired with the root observation made on the same plant. Therefore, one would use a two-sided paired-sample t-test, provided that the distribution of the difference (leaf expression − root expression) is not too far from a normal distribution. Note: there is no need to assume that $\sigma_1 = \sigma_2$ and no need to assume normality of gene expression. Only the normality of the expression difference is needed.

5(a) false: we mostly use a t-distribution. The normal distribution is only used when the variance is known.
true
true because z-quantiles are smaller than t-quantiles.
false: the confidence interval is centered at the sample mean ($\bar y$, which is random) while the non-rejection region is centered at the null hypothesis value $\mu_0$ (not random, known before collecting the data).
true
true: they both have the same length, which is twice $z_{\alpha/2}\,\sigma/\sqrt{n}$.

Summary of grades:
[Figure: histogram of exam grades (horizontal axis from 20 to 100, vertical axis "Frequency" from 0 to 40), with the values 76, 83 and 89 labeled.]
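Solution 2(b) relies on a table saying that, with two samples of size 4, a rank sum of 10 or less is needed for significance at the .05 level. As a rough cross-check, the Python sketch below enumerates all possible rank assignments and computes the exact two-sided p-value. It assumes there are no ties among the 8 observations; the function name `rank_sum_two_sided_p` is illustrative and is not part of the exam or its tables.

```python
from fractions import Fraction
from itertools import combinations


def rank_sum_two_sided_p(w, n1=4, n2=4):
    """Exact two-sided p-value for an observed rank sum w in the sample
    of size n1, assuming no ties (every rank assignment equally likely)."""
    ranks = range(1, n1 + n2 + 1)
    # Rank sum of the first sample for every way of choosing its n1 ranks.
    sums = [sum(c) for c in combinations(ranks, n1)]
    lower = sum(1 for s in sums if s <= w)  # assignments at or below w
    upper = sum(1 for s in sums if s >= w)  # assignments at or above w
    # Double the smaller tail, capped at 1.
    return min(Fraction(1), 2 * Fraction(min(lower, upper), len(sums)))


for w in (10, 11, 13):
    p = rank_sum_two_sided_p(w)
    print(w, p, float(p))
```

Under these assumptions a rank sum of 10 gives 2/70 (about 0.0286, consistent with the 0.028 quoted in the solution), a rank sum of 11 already gives 4/70 (above .05), and a rank sum of 13 gives 14/70 = 0.2, which agrees with the statement that 13 is too big.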
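The two routes to the power in solution 3(b), and the quantiles compared in 3(a), can be re-computed with Python's standard library. This is only a sketch for checking the arithmetic: it uses the values stated in the solution (variance 104, n = 25, null mean 0, alternative mean 5, table value 2.17), and the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# Quantiles from 3(a): two-sided vs one-sided critical values at alpha = .03.
alpha = 0.03
z_two_sided = Z.inv_cdf(1 - alpha / 2)  # about 2.17
z_one_sided = Z.inv_cdf(1 - alpha)      # about 1.88

# Values stated in 3(b).
sigma2, n = 104.0, 25
mu0, mua = 0.0, 5.0
z_crit = 2.17  # table value used in the solution

# Route 1: solve the sample-size formula for z_beta, then power = 1 - beta.
z_beta = sqrt(n * (mu0 - mua) ** 2 / sigma2) - z_crit
beta = 1 - Z.cdf(z_beta)  # P(Z > z_beta)
power_formula = 1 - beta

# Route 2: rejection region for the sample mean, evaluated under mua.
se = sqrt(sigma2 / n)            # about 2.04
cutoff = z_crit * se             # about 4.426
power_region = Z.cdf((mu0 - cutoff - mua) / se) + (1 - Z.cdf((mu0 + cutoff - mua) / se))

print(round(z_two_sided, 2), round(z_one_sided, 2))
print(round(z_beta, 3), round(power_formula, 2), round(power_region, 2))
```

Both routes land at a power of about 0.61. The tiny difference between them is the lower-tail term that the solution rounds to 0.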
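The pooled two-proportion z-test in solution 4(a) can be reproduced in a few lines. The sketch below uses only the counts given in the solution (63 of 220 and 59 of 80); the variable names are illustrative, and the p-value is computed directly from the normal distribution rather than from the table bound quoted in the solution.

```python
from math import sqrt
from statistics import NormalDist

# Counts stated in solution 4(a).
x_a, n_a = 63, 220
x_c, n_c = 59, 80

p_a = x_a / n_a                      # about .2864
p_c = x_c / n_c                      # .7375
p_pool = (x_a + x_c) / (n_a + n_c)   # about .4067

# Standard error under the null hypothesis p_A = p_C (pooled proportion).
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_c))
z_stat = (p_c - p_a) / se            # about 7.03

# Two-sided p-value from the standard normal distribution.
p_value = 2 * NormalDist().cdf(-abs(z_stat))

print(round(p_a, 4), round(p_c, 4), round(p_pool, 4))
print(round(z_stat, 3), p_value)
```

The computed p-value is well below the bound of 2 × 2.9 × 10⁻⁷ used in the solution, which is what one would expect if that bound comes from the last row of a normal table.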