Midterm 2 Solutions - Statistical Methods 2011 | STAT 3005, Exams of Data Analysis & Statistical Methods

Material Type: Exam; Class: Statistical Methods; Subject: Statistics; University: Virginia Polytechnic Institute And State University; Term: Fall 2011;

Typology: Exams

2012/2013

Uploaded on 04/19/2013

sswagner
STAT 3005_Statistical Methods Midterm 2 Solutions (100 pts+5 pts) Fall 2011
Tuesday, Nov 8, 2011 (9:30AM – 10:45AM)
Part I: Definition (Note: Put your answer above the line) (50 pts)
1. (5 pts)
a. A social scientist uses the General Social Survey to study how much time per day people
spend watching TV, denoted by TVWATCH. Its units are in hours. The sampled observations
of TVWATCH are 1.5, 4.45, 10.50, etc. Then, TVWATCH is _A_.
A. Continuous Random Variable
B. Discrete Random Variable
b. Suppose we are interested in measuring people's addiction to watching TV, denoted by TVADDIC. The rule below shows how TVADDIC is obtained from TVWATCH.
$$
\mathrm{TVADDIC} =
\begin{cases}
\text{SEVERE} & \text{if TVWATCH is between 15 and 24} \\
\text{OK} & \text{if TVWATCH is between 5 and 15} \\
\text{NOT ADDICTED} & \text{if TVWATCH is between 0 and 5}
\end{cases}
$$
Then, TVADDIC is _B_.
A. Continuous Random Variable
B. Discrete Random Variable
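The step from a continuous variable to a discrete one can be sketched in code. The function name `tv_addic` is hypothetical, and the handling of the boundary values 5 and 15 is an assumption, since the exam does not say which category a boundary value belongs to.

```python
def tv_addic(tv_watch: float) -> str:
    """Map daily TV hours (continuous) to one of three categories (discrete)."""
    if not 0 <= tv_watch <= 24:
        raise ValueError("TVWATCH must be between 0 and 24 hours")
    if tv_watch >= 15:   # assumption: exactly 15 hours counts as SEVERE
        return "SEVERE"
    if tv_watch >= 5:    # assumption: exactly 5 hours counts as OK
        return "OK"
    return "NOT ADDICTED"


# The sampled observations from part (a) each map to one of three values.
categories = [tv_addic(x) for x in (1.5, 4.45, 10.50)]
print(categories)
```

However many distinct values TVWATCH takes, the output can only be one of three labels, which is why TVADDIC is discrete.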
2. (10 pts)
Below are some statements about certain distributions.
a. Select the statement(s) that is/are correct for the Normal distribution. _A C E_
b. Which statement(s) is/are correct for the Binomial distribution? _B C F_
c. Which statement(s) is/are correct for the Bernoulli distribution? _B D F_
d. Which statement(s) is/are correct for the t distribution? _A D E_
A. It is symmetric.
B. It can be symmetric, left skewed, or right skewed.
C. It has two parameters.
D. It has only one parameter.
E. It has infinite possible values.
F. It is binary.

3. (5 pts)
The Central Limit Theorem tells us that the sampling distribution of the sample mean is approximately normal. Which of the following conditions is necessary for the theorem to be valid? _A_
A. The sample size has to be large.
B. We have to be sampling from a normal population.
C. The population has to be symmetric.
D. Population variance has to be small.
E. Both A and C
4. (5 pts)
A random sample of size n = 30 is taken from a population of size N = 300. Which statement is generally correct? _B_
A. $\mu$ is an estimate of $\bar{X}$; $\sigma$ is an estimate of $s$.
B. $\bar{X}$ is an estimate of $\mu$; $s$ is an estimate of $\sigma$.
C. $\mu$ is an estimate of $\bar{X}$; $s$ is an estimate of the standard deviation of the sample mean.
D. $\bar{X}$ is an estimate of $\mu$; $s$ is an estimate of the standard deviation of the sample mean.
E. $\bar{X}$ is an estimate of $\mu$; $s$ is the standard error of the sample mean.
5. (5 pts)
Which of the following is NOT correct? _B_
The standard error of a statistic describes
A. the standard deviation of the sampling distribution of that statistic.
B. the standard deviation of the sample data measurements.
C. how close that statistic falls to the parameter that it estimates.
D. the variability in the values of the statistic for repeated random samples of size n.

Part II: Application (Note: Keep numbers to two decimal places) (50 pts)

1. (25 pts)
Suppose that the duration of human pregnancy can be described by a Normal distribution with a mean of 266 days and a standard deviation of 16 days.
a. (4 pts) What's the probability that a pregnancy lasts between 260 and 270 days?
Let $Y$ denote the duration of human pregnancy, so $Y \sim N(266, 16^2)$.
$$
P(260 \le Y \le 270) = P\left(\frac{260 - 266}{16} \le \frac{Y - 266}{16} \le \frac{270 - 266}{16}\right) = P(-0.38 \le Z \le 0.25) = 0.5987 - 0.3520 = 0.2467 \approx 0.25
$$
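The table lookup in part (a) can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the exam.

```python
from statistics import NormalDist

pregnancy = NormalDist(mu=266, sigma=16)

# Probability computed directly from the N(266, 16^2) distribution,
# without rounding the z-scores.
p_exact = pregnancy.cdf(270) - pregnancy.cdf(260)

# Table-style probability: use the two-decimal z-scores from the solution.
z = NormalDist()  # standard normal
p_table = z.cdf(0.25) - z.cdf(-0.38)

print(f"exact: {p_exact:.4f}, table-style: {p_table:.4f}")
```

The table-style value is slightly larger than the direct value because the lower z-score, -0.375, is rounded to -0.38 before the lookup.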

b. (4 pts) At least how many days should the longest 25% of all pregnancies last?
We need to compute the 75th percentile of $Y$, i.e., $Y_{.75}$.
$$
Z_{.75} = \frac{Y_{.75} - 266}{16} = 0.68 \;\Rightarrow\; Y_{.75} = 0.68 \times 16 + 266 = 276.88
$$
Therefore, the longest 25% of all pregnancies last at least 276.88 days.
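The percentile in part (b) can be checked the same way. This sketch assumes the standard-library `NormalDist.inv_cdf` method as the inverse of the z-table; the variable names are illustrative.

```python
from statistics import NormalDist

pregnancy = NormalDist(mu=266, sigma=16)

# 75th percentile: 25% of pregnancies last at least this long.
y_75_exact = pregnancy.inv_cdf(0.75)

# Table-based value from the worked solution, using z = 0.68.
y_75_table = 0.68 * 16 + 266

print(f"exact: {y_75_exact:.2f}, table-based: {y_75_table:.2f}")
```

The two values differ in the first decimal place because a printed z-table only resolves the 75th percentile of Z to two decimals (0.67 or 0.68).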

c. (3 pts) Suppose a certain obstetrician is currently providing prenatal care to 60 pregnant women. Let $\bar{y}$ represent the mean length of their pregnancies. What is the distribution of $\bar{y}$? Specify the distribution, mean, and standard error.
$Y$ is exactly normal $\Rightarrow$ $\bar{Y}$ is exactly normal.
$$
E(\bar{Y}) = E(Y) = 266, \qquad SE(\bar{Y}) = \frac{SD(Y)}{\sqrt{n}} = \frac{16}{\sqrt{60}} \approx 2.07
$$
Therefore, $\bar{Y} \sim N\left(266, \frac{16^2}{60} = 4.27\right)$.

d. (4 pts) What is the probability that the mean duration of the 60 patients' pregnancies in part (c) will be between 260 and 270 days?
From part (c), we know that $\bar{Y} \sim N(266, 16^2/60)$. Therefore,
$$
P(260 \le \bar{Y} \le 270) = P\left(\frac{260 - 266}{16/\sqrt{60}} \le \frac{\bar{Y} - 266}{16/\sqrt{60}} \le \frac{270 - 266}{16/\sqrt{60}}\right) = P(-2.90 \le Z \le 1.94) = 0.9738 - 0.0019 = 0.9719 \approx 0.97
$$
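The calculation for the sample mean follows the same pattern as part (a), with the standard error in place of the standard deviation. A minimal sketch, assuming the standard-library `NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 60
se = 16 / sqrt(n)                         # standard error of the sample mean
mean_dist = NormalDist(mu=266, sigma=se)  # sampling distribution of the mean

p_mean = mean_dist.cdf(270) - mean_dist.cdf(260)
print(f"SE = {se:.2f}, P(260 <= mean <= 270) = {p_mean:.4f}")
```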

e. (5 pts) How does your answer in part (d) compare to your answer in part (a)? Does this relationship make sense? Why or why not?
The answer in part (d) (0.97) is much larger than the answer in part (a) (0.25). This makes sense: the sample mean is less variable than an individual value, so for a fixed range around the population mean (260 to 270 days in this case) we expect the mean to fall in that range with much higher probability.

f. (5 pts) The duration of human pregnancies may not always follow a Normal distribution. Suppose that we made a mistake and the correct distribution for human pregnancies is in fact skewed. Does that change your answers in (a) and (d)? Briefly explain why or why not for each.
For part (a), the answer changes: the z-table calculation is valid only if the population distribution is normal. For part (d), the answer stays approximately the same because of the Central Limit Theorem. Since the sample size is 60 (≥ 30), the sampling distribution of the mean is approximately normal even when the population distribution is not.
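A small simulation can illustrate the argument in part (f). The population below is purely hypothetical: a left-skewed distribution (282 minus an exponential variable with mean 16), chosen only because it has mean 266 and standard deviation 16 like the normal model. The seed, the number of repetitions, and all names are assumptions made for the sketch.

```python
import random
from statistics import fmean

rng = random.Random(2011)   # fixed seed so the run is repeatable
n, reps = 60, 5000


def draw() -> float:
    # Hypothetical left-skewed population with mean 266 and SD 16.
    return 282 - rng.expovariate(1 / 16)


def share_in_range(values, low=260, high=270):
    # Fraction of values that fall between low and high (inclusive).
    return sum(low <= v <= high for v in values) / len(values)


singles = [draw() for _ in range(reps)]                         # individuals
means = [fmean(draw() for _ in range(n)) for _ in range(reps)]  # means of 60

p_single = share_in_range(singles)  # compare with 0.25 from part (a)
p_mean = share_in_range(means)      # compare with 0.97 from part (d)
print(f"individual: {p_single:.3f}, mean of {n}: {p_mean:.3f}")
```

Under this assumed population, the share for sample means should stay close to the normal-based 0.97, while the share for individual pregnancies is governed by the skewed shape rather than by the z-table.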

2. (25 pts)
A pollster is interested in determining the 99% confidence interval for the true proportion of Americans who are voting for a particular candidate. In a sample of 400 Americans who are registered to vote, she finds that 311 are in favor of her candidate, Road Runner. The other 89 are in favor of the opposition's candidate, Wile E. Coyote.
a. (5 pts) Calculate the 99% confidence interval for the proportion that supports Road Runner. Interpret the 99% confidence interval in the context of the problem.
Let $p$ denote the population proportion that supports Road Runner,

(1 − 0.8311, 1 − 0.7239) = (0.1689, 0.2761) ≈ (0.17, 0.28)

e. (15 pts) Suppose the true proportion that supports Road Runner is 70%. Specify the Population distribution, Sampling distribution, and Data distribution. (Hint: Let X = 0 denote that the person votes for Wile E. Coyote, and let X = 1 denote that the person votes for Road Runner.)

Population Distribution
X       0        1
P(x)    0.3      0.7

Data Distribution
X       0        1
P(x)    0.2225   0.7775

Sampling Distribution
$\hat{p} \sim N\left(0.7, \frac{0.21}{400} = 0.000525\right)$
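The worked steps for the confidence interval itself are not part of this preview, so the following is only a sketch. It assumes the usual large-sample (Wald) interval, $\hat{p} \pm Z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, and uses illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

n, successes = 400, 311
p_hat = successes / n                        # sample proportion, 0.7775
z_crit = NormalDist().inv_cdf(1 - 0.01 / 2)  # critical value for 99% confidence
se_hat = sqrt(p_hat * (1 - p_hat) / n)       # estimated standard error

# Assumed Wald interval for Road Runner's share.
low, high = p_hat - z_crit * se_hat, p_hat + z_crit * se_hat
# Complement interval for Wile E. Coyote's share.
coyote = (1 - high, 1 - low)

# Variance of p-hat if the true proportion is 0.7, as in part (e).
var_true = 0.7 * 0.3 / n

print(f"Road Runner: ({low:.4f}, {high:.4f})")
print(f"Wile E. Coyote: ({coyote[0]:.4f}, {coyote[1]:.4f})")
print(f"Var(p-hat) at p = 0.7: {var_true:.6f}")
```

Note that the data distribution uses the observed shares 311/400 and 89/400, while the sampling distribution uses the assumed true proportion 0.7.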

Extra Credit

  1. (5 pts) By setting the equation for the margin of error in a $100(1 - \alpha)\%$ confidence interval for a proportion to be at most some fixed value, say $m$, derive the formula for the minimal sample size, $n$, that is needed to guarantee a margin of error of at most $m$. Answer:

$$
MOE = CV \times SE = Z_{1-\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}
$$

To guarantee a margin of error of at most $m$, we set $MOE \le m$:

$$
\Rightarrow Z_{1-\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \le m
$$

$$
\Rightarrow n \ge \frac{Z_{1-\alpha/2}^{2}\,\hat{p}\hat{q}}{m^{2}}
$$