Stat 571 - Second Midterm Exam Solutions - Prof. Cecile M. Ane, Exams of Data Analysis & Statistical Methods

Solutions to the Stat 571 second midterm exam held on November 20, 2007. The exam covers topics such as Levene's test, the t-test, the Mann-Whitney test, sample size calculation, and hypothesis testing. Students are expected to understand concepts related to statistical analysis, variance, mean, and confidence intervals.


Stat 571 Second Midterm Exam November 20, 2007
Name:
• The exam is open book and open notes.
• Do all your work in the spaces provided. If you need additional space for your work, indicate clearly where the additional work can be found.
• The parts within a problem are not necessarily sequential.
• To receive full credit, you must show your work.
• Do not dwell too long on any one question. Answer as many questions as you can.
For instructor’s use:
Problem   Points
1         25
2         20
3         20
4         20
5         15
Total     100
1. A researcher in Australia studies two different species of Lyrebird, a genus known for their ability to mimic
almost any sound. However, he is more interested in differences in size, as measured from head to tail, so
he collects information on 10 birds of the first species, and 13 of the second. The birds are captured in
such a way that they can be assumed to be independent of one another, and all of the measurements are
distinct (none are tied). There is reason to believe that the measurements can be assumed approximately
normal.
(a) As a first step in the analysis, the researcher performs Levene’s test to determine whether the variances
of the two groups might be assumed equal. His computed statistic works out to be 2.367. Determine
the appropriate degrees of freedom for the test, and compute the p-value associated with this statistic.
What would you conclude about the variances of the two groups?
(b) Based on your conclusion from (a), perform an appropriate two-sided t-test to compare the mean sizes
of the two groups. The summary statistics are as follows: $n_1 = 10$, $n_2 = 13$, $s_1 = 8.06$, $s_2 = 2.05$,
$\bar{x}_1 = 86.65$, $\bar{x}_2 = 90.30$. Compute the statistic, the p-value, and make a conclusion using $\alpha = 0.01$.
Note: the adjusted degree of freedom is 9.90, in case you wish to use it.
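
The arithmetic behind problem 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instructor's solution: it takes Levene's statistic to follow an F distribution with (k − 1, N − k) degrees of freedom, uses the fact that an F(1, v) variable is the square of a t(v) variable, and computes both the pooled and the unequal-variance (Welch) t statistics so that either conclusion from part (a) can be followed. The helper `t_two_sided_p` is a hypothetical name introduced here; it integrates the t density numerically with Simpson's rule so that the fractional degrees of freedom can be handled without external libraries.

```python
import math


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value P(|T| > |t_stat|) for Student's t with df degrees of
    freedom (df may be fractional), using Simpson's rule on the density."""
    # Log of the normalizing constant of the t density.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps is even, as Simpson's rule requires
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|), by symmetry
    return 1.0 - central_mass


# Summary statistics quoted in the problem.
n1, n2 = 10, 13
s1, s2 = 8.06, 2.05
xbar1, xbar2 = 86.65, 90.30

# Part (a): k = 2 groups, N = 23 observations, so df = (1, 21).
levene_df = (2 - 1, n1 + n2 - 2)
levene_p = t_two_sided_p(math.sqrt(2.367), levene_df[1])

# Part (b), unequal-variance (Welch) version.
v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
t_welch = (xbar1 - xbar2) / math.sqrt(v1 + v2)
welch_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
welch_p = t_two_sided_p(t_welch, welch_df)

# Part (b), pooled-variance version with n1 + n2 - 2 degrees of freedom.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
t_pooled = (xbar1 - xbar2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
pooled_p = t_two_sided_p(t_pooled, n1 + n2 - 2)

print(f"Levene: df = {levene_df}, p = {levene_p:.3f}")
print(f"Welch:  t = {t_welch:.3f}, df = {welch_df:.2f}, p = {welch_p:.3f}")
print(f"Pooled: t = {t_pooled:.3f}, df = {n1 + n2 - 2}, p = {pooled_p:.3f}")
```

The Welch degrees of freedom computed this way agree with the adjusted value of 9.90 given in the problem's note, which is a useful check on the summary statistics. A statistics table or library would normally replace the numerical integration.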


2. Data will be collected to compare two treatments, called treatments "a" and "b". Since data collection is quite expensive, only 4 independent observations will be collected for each treatment (for a total of 8 observations). Samples from "a" and "b" treatments will be collected independently.

(a) The figure below shows two data sets, with boxplots and dotplots superposed. With data set 1 (top panel) the Mann-Whitney test gives a p-value p = 0.028 and the t-test gives p = 0.006.

With data set 2 (bottom panel), the Mann-Whitney test p-value is: [ ] smaller than 0.028  [ ] 0.028  [ ] larger than 0.028

The t-test p-value is: [ ] smaller than 0.006  [ ] 0.006  [ ] larger than 0.006

Justify briefly.

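[Figure: boxplots with superposed dotplots for treatments a and b. Top panel: data set 1. Bottom panel: data set 2. The panels share a horizontal axis with ticks every 0.5 starting at 10.0.]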

(b) Suppose now that the Mann-Whitney test returns p < 0.05 while the t-test returns p > 0.05. Which of "data set 3" (top panel) and "data set 4" (bottom panel) would give such results? [ ] data set 3 (top)  [ ] data set 4 (bottom)

[Figure: boxplots with superposed dotplots for treatments a and b. Top panel: data set 3. Bottom panel: data set 4. The panels share a horizontal axis with ticks every 0.5 starting at 10.0.]

(c) For data set 3, which test would be most appropriate: [ ] Mann-Whitney test  [ ] t-test? Why?

For data set 4, which test would be most appropriate: [ ] Mann-Whitney test  [ ] t-test? Why?
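
With only 4 observations per treatment, the exact Mann-Whitney (rank-sum) p-value can be found by brute force, since there are just C(8, 4) = 70 ways to assign the pooled ranks to treatment "a". The sketch below does that enumeration. The function name `exact_rank_sum_p` and both sets of measurements are hypothetical (they are not read off the exam's figures), and the code assumes no ties. It is meant only to show why a value near 0.028 arises when the two samples are completely separated, and how the p-value depends on ranks rather than on the measurements themselves.

```python
from itertools import combinations


def exact_rank_sum_p(sample_a, sample_b):
    """Exact two-sided p-value for the rank-sum (Mann-Whitney) test,
    assuming no ties, by enumerating every split of the pooled ranks."""
    pooled = sorted(sample_a + sample_b)
    n_a, n_total = len(sample_a), len(pooled)
    ranks = {value: r for r, value in enumerate(pooled, start=1)}
    observed = sum(ranks[v] for v in sample_a)
    expected = n_a * (n_total + 1) / 2  # mean rank sum under the null
    sums = [sum(c) for c in combinations(range(1, n_total + 1), n_a)]
    # Count rank sums at least as far from the null mean as the observed one.
    extreme = sum(abs(s - expected) >= abs(observed - expected) for s in sums)
    return extreme / len(sums)


# Hypothetical data: every "a" value lies below every "b" value.
a_separated = [10.1, 10.4, 10.6, 10.9]
b_separated = [11.3, 11.6, 11.9, 12.2]
p_separated = exact_rank_sum_p(a_separated, b_separated)

# Hypothetical data: one "a" value now falls among the "b" values.
a_overlap = [10.1, 10.4, 10.6, 11.5]
p_overlap = exact_rank_sum_p(a_overlap, b_separated)

print(f"completely separated: p = {p_separated:.4f}")  # 2/70
print(f"one overlapping point: p = {p_overlap:.4f}")   # 4/70
```

Statistical software may report slightly different values if it uses a normal approximation or a continuity correction instead of the exact distribution.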

(b) RNA expression of gene "pollux" is measured in 20 lotus plants. From each plant, two RNA expressions are measured: one from a root extract and one from a leaf extract. Which specific test would you use to know if "pollux" is expressed at a different level in roots and in leaves? Explain your choice, but don’t do any calculation. You may make assumptions, but say what they are.

  1. Answer the following questions as true or false. If false, explain why. For all questions, assume a one-sample, two-sided test about a population mean with a fixed and pre-determined sample size.

(a) If you are able to assume the data is normally distributed, then a confidence interval will always be computed using an appropriate quantile from the standard normal distribution. [ ] true  [ ] false

(b) As $\alpha$ increases, the length of a $100(1 - \alpha)\%$ confidence interval will decrease. [ ] true  [ ] false

(c) If you assume the variance is known, the confidence interval will have a shorter length than if you assume the variance is unknown, in the case where $\sigma$ happens to be equal to $s$. [ ] true  [ ] false

For the following three questions, additionally assume normality, independence, and known variance. The "not-rejection region" is defined as those values of the sample mean that would not cause rejection of the null hypothesis.

(d) The not-rejection region for an $\alpha = 0.05$ level test will contain exactly the same numbers as a 95% confidence interval based on a particular set of observed data. [ ] true  [ ] false

(e) If you perform an hypothesis test by creating a $100(1 - \alpha)\%$ CI around the sample mean of a particular set of observed data, and reject if the value of $\mu$ under the null hypothesis is not inside the interval, this would give exactly the same result (in terms of reject/not-reject) as performing an hypothesis test by computing a test statistic and p-value, and rejecting if the p-value < $\alpha$. [ ] true  [ ] false

(f) The length of the not-rejection region for an $\alpha$-level test is the same as the length of a $100(1 - \alpha)\%$ confidence interval based on a particular set of observed data. [ ] true  [ ] false
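
The known-variance setting of the last three questions is easy to explore numerically. The sketch below uses made-up values for the sample mean, the null mean, the standard deviation, and the sample size (all hypothetical, chosen only for illustration). For several values of α it builds the z confidence interval centered at the sample mean and the not-rejection region centered at the null mean, then compares their lengths and the reject/not-reject decisions obtained from the interval and from the p-value. It is an exploration aid under these assumptions, not an answer key.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(center, sigma, n, alpha):
    """Interval center +/- z * sigma / sqrt(n) for a known standard deviation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return center - half_width, center + half_width


def z_test_p(xbar, mu0, sigma, n):
    """Two-sided p-value of the one-sample z test of H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical inputs: observed mean, null mean, known sigma, sample size.
xbar, mu0, sigma, n = 10.8, 10.0, 2.0, 25
p_value = z_test_p(xbar, mu0, sigma, n)

rows = []
for alpha in (0.01, 0.05, 0.10):
    ci_lo, ci_hi = z_interval(xbar, sigma, n, alpha)     # centered at xbar
    reg_lo, reg_hi = z_interval(mu0, sigma, n, alpha)    # centered at mu0
    reject_by_ci = not (ci_lo <= mu0 <= ci_hi)
    reject_by_p = p_value < alpha
    rows.append((alpha, ci_hi - ci_lo, reg_hi - reg_lo, reject_by_ci, reject_by_p))
    print(f"alpha={alpha:.2f}  CI=({ci_lo:.3f}, {ci_hi:.3f})  "
          f"region=({reg_lo:.3f}, {reg_hi:.3f})  "
          f"reject by CI: {reject_by_ci}  reject by p: {reject_by_p}")
```

Note that the interval and the region are centered at different points, which is worth keeping in mind when comparing which numbers each one contains.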