Midterm 2 (Version A), Slides of Statistics

For a 1 carat diamond, we divide the price by 100. The distributions and some sample statistics are shown below.

        0.99 carats   1 carat
Mean    $44.51        $56.81

Sta 101 Dr. Mukherjee
Fall 2016 November 10, 2016
Midterm 2 (Version A)
Last Name: First Name:
Section: 8:30 10:05 11:45 1:25 3:05 Team Name:
I hereby state that I have not communicated with or gained information in any way from my classmates
during this exam, and that all work is my own.
Signature :
Any potential violation of Duke’s policy on academic integrity will be reported to the Undergraduate
Conduct Board. All work on this exam must be your own.
1. You have 75 minutes to complete the exam.
2. Show all your work on the open ended questions in order to get partial credit. No credit will be given for open
ended questions where no work is shown, even if the answer is correct.
3. Mark the answers to the multiple choice questions by filling in the bubbles provided below. If you choose more
than one answer, you will not receive any credit for that question. No partial credit will be given for these
questions.
4. You are allowed a calculator (however, you may not share a calculator with another student during the exam),
one 8½” × 11” sheet of notes (cheat sheet) with writing on both sides, a pen or a pencil, a dictionary, and to ask
questions to me and the TA.
5. You are not allowed a cell phone, even if you intend to use it as a calculator or for checking the time, music
device or headphones, notes (other than your cheat sheet), books, or other resources, and to communicate with
anyone other than myself and the TA during the exam.
6. Write clearly. Short answers are best!
Good luck!
                   Q 1     Q 2     Q 3     MC Q 4 - 13   Total
Points earned      xxxxx   xxxxx   xxxxx   xxxxx         xxxxx
Available points   20      25      25      30            100

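(c) (8) Conduct the hypothesis test.

i. Compute the test statistic value (2 pts for SE, 1 pt for correct value)

$$T = \frac{(\bar{x}_{0.99} - \bar{x}_1) - (\mu_{0.99} - \mu_1)}{\sqrt{\frac{s_{0.99}^2}{n_{0.99}} + \frac{s_1^2}{n_1}}} = \frac{(44.51 - 56.81) - 0}{\sqrt{\frac{13.32^2}{23} + \frac{16.13^2}{23}}} = -2.82$$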

ii. Compute the p-value for this test. (1 pt for d.f., 2 pts for p-value)

$df = 23 - 1 = 22$

p-value $= P(|T_{22}| > 2.82) = 0.01$

iii. State your decision and conclusion for this test using $\alpha = 0.05$. (1 pt for correct decision, 1 pt for statement) Since p-value < 0.05, reject $H_0$. The data provide convincing evidence that the average standardized prices of 0.99 carat and 1 carat diamonds are different.
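The test statistic and degrees of freedom in part (c) can be checked with a few lines of Python. This is only a sketch: the function name `two_sample_t` is hypothetical, the summary statistics (means 44.51 and 56.81, standard deviations 13.32 and 16.13, n = 23 per group) are the values this key appears to use, and the degrees of freedom follow the conservative min(n1, n2) − 1 rule shown in the key.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Return (SE, T, df) for a two-sample test of a difference in means.

    df uses the conservative rule min(n1, n2) - 1, as in the answer key.
    """
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)    # standard error of the difference
    t_stat = ((mean1 - mean2) - null_diff) / se  # point estimate minus null value, over SE
    df = min(n1, n2) - 1
    return se, t_stat, df

# Assumed summary statistics: 0.99 carat group first, 1 carat group second.
se, t_stat, df = two_sample_t(44.51, 13.32, 23, 56.81, 16.13, 23)
print(f"SE = {se:.2f}, T = {t_stat:.2f}, df = {df}")  # SE = 4.36, T = -2.82, df = 22
```

The p-value is not computed here because the Python standard library has no t distribution; a statistics package or a t table would be needed for that step.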

(d) (4) Suppose the 95% confidence interval for $\mu_{0.99} - \mu_1$ is $(-21.36, -3.23)$. FILL IN and CIRCLE the correct responses.

We are 95% confident that the average standardized price of 0.99 carat diamonds is ______ dollars more / less to ______ dollars more / less than the average standardized price of 1 carat diamonds. (1 point each) 3.23, less, 21.36, less

2. (25) Action on Aquatic Environment.

In a study that investigates the impact of the Danish Action Plan for the Aquatic Environment, which addresses pollution of the Danish water resources, the concentration of nitrogen (measured in g/m^3) was measured in a particular river in 1998, 2003, and 2011. Six measurements were randomly taken in each year. A summary of the nitrogen concentration is provided. We want to evaluate the relationship between the two variables using an ANOVA at the 5% significance level.

Year   x̄      s       n
1998   5.55   0.486   6
2003   5.19   0.362   6
2011   4.05   0.673   6

(a) (3) What are the hypotheses for evaluating the relationship between the two variables?

(1.5) $H_0$: Average nitrogen concentration does not vary across years.
(1.5) $H_A$: Average nitrogen concentration does vary across years; at least one mean is different from the rest (at least two means are different).

(b) (6) What are the necessary conditions for ANOVA? Circle all that apply.

○ Independence
○ 5 expected counts for each group
○ np ≥ 10 and n(1 − p) ≥ 10
○ Constant variance
○ Approximate normality
○ n ≥ 30 for all groups

1 pt each, treat as T/F

(c) (7) Complete the following ANOVA table. Show any work in the space provided below, and insert final values in the table.

            degrees of freedom   Sum Sq     Mean Sq    F value    p-value
Year        XXXXXXXX             7.39       XXXXXXXX   XXXXXXXX   0.
Residuals   XXXXXXXX             XXXXXXXX   XXXXXXXX
Total       XXXXXXXX             11.49

(1 pt for each blank to be filled in)

$df_G = 3 - 1 = 2$, $df_T = (6 + 6 + 6) - 1 = 17$, $df_E = 17 - 2 = 15$
$SSG = 7.39$, $SSE = 11.49 - 7.39 = 4.1$
$MSG = 7.39 / 2 = 3.695$, $MSE = 4.1 / 15 = 0.2733$
$F = 3.695 / 0.2733 = 13.52$
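The arithmetic for the ANOVA table in part (c) can be sketched in Python as follows. The function name `anova_table` and the dictionary keys are hypothetical; the inputs (SSG = 7.39, SST = 11.49, three years, 18 measurements) and the group standard deviations come from the question. The last lines are an optional cross-check that rebuilds SSE from the group standard deviations.

```python
def anova_table(ss_group, ss_total, k, n_total):
    """Fill in a one-way ANOVA table from SSG, SST, k groups, and total sample size."""
    df_group = k - 1
    df_total = n_total - 1
    df_error = df_total - df_group
    ss_error = ss_total - ss_group
    ms_group = ss_group / df_group
    ms_error = ss_error / df_error
    return {
        "df_group": df_group,
        "df_error": df_error,
        "df_total": df_total,
        "ss_error": ss_error,
        "ms_group": ms_group,
        "ms_error": ms_error,
        "f_value": ms_group / ms_error,
    }

table = anova_table(ss_group=7.39, ss_total=11.49, k=3, n_total=18)
for name, value in table.items():
    print(f"{name}: {value:.4g}")

# Cross-check: SSE should also equal the sum of (n_i - 1) * s_i^2 over the groups.
group_sds = [0.486, 0.362, 0.673]
ss_error_from_sds = sum((6 - 1) * s**2 for s in group_sds)
print(f"SSE from group SDs: {ss_error_from_sds:.2f}")
```

The F value printed here uses the unrounded MSE, so it can differ from the hand calculation in the later decimal places while still rounding to 13.52.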

3. (25) Perils of Living Dangerously in the Slasher Horror Film. The slasher horror film has been deplored based on claims that it depicts eroticized violence against predominately female characters as punishment for sexual activities. To test this assertion, a quantitative content analysis was conducted to examine the extent to which gender differences are evident in the association between character survival and engagement in sexual activities. Information pertaining to gender, engagement in sexual activities, and survival was coded for film characters from a simple random sample of 50 English-language, North American slasher films released between 1960 and 2009.^1

(a) (20) Suppose we want to conduct a hypothesis test to evaluate whether the survival rate of female characters who engage in sexual activity is different from that of female characters who do not.

i. (2) State the null and alternative hypotheses for this test. (2 pts) $H_0: p_{pres} - p_{abs} = 0$ vs. $H_A: p_{pres} - p_{abs} \neq 0$; the opposite is ok too.

ii. (8) Calculate the test statistic for this hypothesis test. (8 pts: 3 pts for pooled p-hat, 3 pts for SE (0 if calculated using p-hats), 2 pts for Z (full credit if computed correctly using wrong SE))

$$\hat{p}_{pool} = \frac{11 + 39}{83 + 139} \approx 0.2252 \approx 0.23$$

$$SE = \sqrt{\frac{0.23 \times 0.77}{83} + \frac{0.23 \times 0.77}{139}} = 0.0584$$

$$Z = \frac{(0.133 - 0.281) - 0}{0.0584} = -2.53$$

iii. (2) Compute the p-value for this test. (2 pts for p-value) p-value = 0.

iv. (2) Using your result from (iii), is the survival of female characters in slasher films associated with sexual activity? Yes. No. (Circle one, assume $\alpha = 0.05$) (2 pts if consistent with p-value)

v. (6) Could this same test be used for testing association between sexual activity and survival for male characters? Explain your answer by checking appropriate conditions. If the answer is NO, what alternative testing method should be used? (1 pt for No, 3 pts for showing S/F condition fails, 2 pts for randomization test)
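For parts (ii) and (iii) above, a pooled two-proportion z-test can be sketched with the Python standard library. The function name `two_proportion_z_test` is hypothetical, and the counts (11 of 83 and 39 of 139) are read from the pooled-proportion calculation in the key. Because the code works from unrounded proportions, its results differ slightly from the hand calculation, which rounds the pooled proportion to 0.23.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for p1 - p2 = 0 using the pooled proportion for the SE."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled estimate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))  # both tails
    return p_pool, se, z, p_value

# Assumed counts: sexual activity present (11 of 83), absent (39 of 139).
p_pool, se, z, p_value = two_proportion_z_test(11, 83, 39, 139)
print(f"p_pool = {p_pool:.4f}, SE = {se:.4f}, Z = {z:.2f}, p-value = {p_value:.4f}")
```

The pooled proportion is used only because the null hypothesis claims the two proportions are equal; a confidence interval would use the separate sample proportions instead.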

^1 Welsh, Andrew. “On the perils of living dangerously in the slasher horror film: Gender differences in the association between sexual activity and survival.” Sex Roles 62.11-12 (2010): 762-773.

(b) (3) Compute the standard error used for computing the 95% confidence interval for $p_{pres} - p_{abs}$ for females. (2 pts for using p-hats, 1 pt for correct value)

$$SE = \sqrt{\frac{0.133 \times 0.867}{83} + \frac{0.281 \times 0.719}{139}} = 0.0533$$
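The unpooled standard error in part (b) can be reproduced as below. This is a sketch under stated assumptions: the function name `diff_proportion_ci` is hypothetical, the rounded sample proportions 0.133 and 0.281 are taken from the key, and the interval itself goes one step beyond what the question asks for.

```python
import math
from statistics import NormalDist

def diff_proportion_ci(p1, n1, p2, n2, level=0.95):
    """Return (SE, (lower, upper)) for p1 - p2 using the unpooled standard error."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% level
    diff = p1 - p2
    return se, (diff - z_star * se, diff + z_star * se)

se_ci, (lower, upper) = diff_proportion_ci(0.133, 83, 0.281, 139)
print(f"SE = {se_ci:.4f}")  # SE = 0.0533
print(f"95% CI for p_pres - p_abs: ({lower:.3f}, {upper:.3f})")
```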

7. (3) In a warehouse, employees have asked management to play music to relieve the boredom of the job. The manager wants to know whether efficiency is affected by the music. A random sample of 15 workers was selected. Their average efficiency score was 30.47 before the installation of the music system and 38 after its installation. We want to evaluate whether average efficiency score is affected by the music. What other information is required?
(a) Standard deviation of efficiency scores after the installation of the music system: $s_{after}$
(b) Standard deviation of efficiency scores before the installation of the music system: $s_{before}$
(c) Standard deviation of efficiency scores before and after the installation of the music system: $s_{after}$ and $s_{before}$
(d) Standard deviation of differences in efficiency scores before and after: $s_{diff}$

8. A variety of studies suggest that 10% of the world population is left-handed. It is also claimed that artists are more likely to be left-handed. In order to test this claim we take a random sample of 40 art students at a college and find that 6 of them (15%) are left-handed. Which of the following is the correct set-up for calculating the p-value for this test?
(a) Randomly sample 40 non-art students, and record the number of left-handed students in the sample. Repeat this many times and calculate the proportion of samples where at least 15% of the students are left-handed.
(b) Roll a 10-sided die 40 times and record the proportion of times you get a 1. Repeat this many times, and calculate the number of simulations where the sample proportion is 10% or more.
(c) Roll a 10-sided die 40 times and record the proportion of times you get a 1. Repeat this many times, and calculate the number of simulations where the sample proportion is 15% or more.
(d) In a bag place 40 chips, 6 red and 34 blue. Randomly sample 40 chips, with replacement, and record the proportion of red chips in the sample. Repeat this many times, and calculate the proportion of samples where at least 10% of the chips are red.

9. Which of the following is false about bootstrapping?
(a) Bootstrap distributions that are extremely skewed or have isolated clumps of values may yield unreliable confidence intervals.
(b) Bootstrap distributions are constructed by sampling with replacement from the original sample, while sampling distributions are constructed by sampling with replacement from the population.
(c) A bootstrap confidence interval constructed based on a biased sample will yield an unbiased estimate for the population parameter of interest.
(d) The endpoints of a 95% bootstrap confidence interval are the cutoff values for the top and bottom 2.5% of the bootstrap distribution.

10. (3) A random sample of 600 24-35 year-old unemployed Americans yielded an average unemployment of 13 weeks. In order to construct a bootstrap confidence interval based on this sample a statistician took 100 bootstrap samples and recorded their means. The standard deviation of these means was found to be 0.5. Which of the following is the correct 95% bootstrap interval?

(a) $13 \pm 1.96 \times 0.5$

(b) $13 \pm 1.96 \times \frac{0.5}{\sqrt{600}}$

(c) $13 \pm 1.96 \times \frac{0.5}{\sqrt{100}}$

(d) $13 \pm 1.96 \times \sqrt{\frac{0.5 \times 0.5}{600}}$
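The die-rolling scheme described in the left-handedness question above can be sketched as a short simulation. This is an illustration only: the function name `simulated_p_value`, the number of repetitions, and the seed are assumptions. Each simulated sample rolls a 10-sided die 40 times (a 1 stands for a left-handed student under the null value of 10%), and the estimated p-value is the share of simulations with at least 6 ones, that is, a sample proportion of 15% or more.

```python
import random

def simulated_p_value(n=40, sides=10, observed_count=6, reps=10_000, seed=1):
    """Estimate P(at least observed_count ones in n rolls of a fair die) by simulation."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    at_least_observed = 0
    for _ in range(reps):
        ones = sum(1 for _ in range(n) if rng.randint(1, sides) == 1)
        if ones >= observed_count:  # sample proportion of 15% or more
            at_least_observed += 1
    return at_least_observed / reps

p_sim = simulated_p_value()
print(f"simulated p-value: {p_sim:.3f}")
```

Counting successes rather than comparing proportions avoids any floating-point concerns with the 15% cutoff.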

Answer questions 11 to 13 based on the information below.

Hepatitis C causes about 10,000 deaths each year in the US, but often lies undetected for years after infection. A study from University of Texas Southwestern Medical Center examined whether the risk of Hepatitis C was related to whether people had tattoos and to where they got their tattoos. The data from this study can be summarized in a two-way table, as follows:

                    Hepatitis C   No Hepatitis C   Total
Tattoo, parlor      17            35               52
Tattoo, elsewhere   12            53               65
No tattoo           22            491              513
Total               51            579              630

11. (3) If in fact having Hepatitis C is independent of having a tattoo (and where one got their tattoo), how many people with no tattoos would you expect to have Hepatitis C? Choose the closest answer.

(a) 22
(b) 42 → 513*51/630
(c) 47
(d) 491

12. (3) Which of the following is the appropriate test for evaluating the relationship between Hepatitis C and having a tattoo (and where one got their tattoo)?

(a) Z-test
(b) T-test
(c) chi-square test of independence
(d) chi-square test of goodness of fit

13. (3) Based on your decision in question # 12, what is the degrees of freedom for the above test?

(a) 2
(b) 3
(c) 4
(d) none of the above.
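The expected counts under independence for the tattoo table can be computed as (row total × column total) / grand total. The sketch below does this for every cell and, assuming a chi-square test of independence is the test being applied, also computes the statistic and its degrees of freedom, (rows − 1) × (columns − 1). The function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(observed):
    """Return (expected counts, chi-square statistic, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count for each cell: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df

# Rows: tattoo from parlor, tattoo from elsewhere, no tattoo.
# Columns: Hepatitis C, no Hepatitis C.
observed = [[17, 35], [12, 53], [22, 491]]
expected, statistic, df_chi = chi_square_independence(observed)
print(f"expected Hepatitis C count, no tattoo: {expected[2][0]:.1f}")  # 41.5
print(f"chi-square statistic = {statistic:.2f}, df = {df_chi}")
```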

[Standard normal probability table: negative Z, indexed by the second decimal place of Z]