This document covers hypothesis testing and confidence intervals through several case studies: comparing Michelson's measurements of the speed of light to Stigler's value, analyzing the proportion of identity theft complaints in Alaska, investigating the incidence of autism spectrum disorder in Arizona, and evaluating the economic dynamism of middle-income countries. It demonstrates how to state and check assumptions, calculate sample statistics and test statistics, and interpret p-values and confidence intervals to draw conclusions about population parameters.

Typology: Exams

2023/2024


STAT 200: Introduction to Statistics
Homework #5 Solutions

1. (3 points): Stephen Stigler determined in 1977 that the speed of light is 299,710.5 km/sec. In 1882, Albert Michelson had collected measurements on the speed of light ("Student t-distribution," 2013). Is there evidence to show that Michelson’s data is different from Stigler’s value of the speed of light?

a.) State the random variable
For this problem, the random variable will be:
x = speed of light measured by Albert Michelson

b.) State the population parameter
The population parameter will be:
μ = mean speed of light measured by Albert Michelson

c.) State the hypotheses
The hypotheses for this experiment are given by:
$H_0: \mu = 299{,}710.5\ \text{km/s}$
$H_A: \mu \neq 299{,}710.5\ \text{km/s}$

2. (3 points): According to the February 2008 Federal Trade Commission report on consumer fraud and identity theft, 23% of all complaints in 2007 were for identity theft. In that year, Alaska had 321 complaints of identity theft out of 1,432 consumer complaints ("Consumer fraud and," 2008). Does this data provide enough evidence to show that Alaska had a lower proportion of identity theft than 23%?

a.) State the type I error in this case, consequences of this error type for this situation, and the appropriate alpha level to use.
In this situation, the Type I error is concluding that the proportion of complaints from identity theft in Alaska is less than 23% when it is actually 23%. One consequence of this error is that the Federal Trade Commission (FTC) would think that identity theft isn’t as big a problem as it really is. Thus, the FTC may not put as much effort into stopping or investigating identity theft in Alaska as it should.

b.) State the type II error in this case, consequences of this error type for this situation, and the appropriate alpha level to use.
Type II error: concluding that the proportion of complaints from identity theft in Alaska is 23% when it is actually less than 23%. One consequence of this error is that the Federal Trade Commission would put more effort into Alaska than it needs to. Thus, resources that could be used elsewhere would be wasted in Alaska.

The best alpha level in this case would be 1%, since a type I error looks to have worse consequences than a type II error.

v.) Conclusion
Since the p-value is greater than the level of significance (i.e. [p-value = 0.2998] > [α = 0.05]), we fail to reject $H_0$.

vi.) Interpretation (do not skip this part! This is the “so what” of the entire hypothesis test).
There is not enough evidence to show that the proportion of complaints due to identity theft in Alaska is less than 23%.

4. (3 points): In 2008, there were 507 children in Arizona out of 32,601 who were diagnosed with Autism Spectrum Disorder (ASD) ("Autism and developmental," 2008). Nationally 1 in 88 children are diagnosed with ASD ("CDC features -," 2013). Is there sufficient data to show that the incidence of ASD is higher in Arizona than nationally? Why or why not? Test at the 1% level.

We should start by writing down what we know (which is always a great place to start):
x = 507
n = 32,601
p = 1/88 = 0.0114 (or 1.14%)
α = 0.01

To fully address this problem, we should follow the six-step process presented in the textbook.

i.) State the random variable and the parameter in words.
The random variable is given by:
x = number of children in Arizona in 2008 that were diagnosed with Autism Spectrum Disorder (ASD)
The parameter of interest is given by:
p = proportion of children in Arizona in 2008 that were diagnosed with Autism Spectrum Disorder (ASD)

ii.) State the null and alternative hypotheses and the level of significance
The hypotheses for this experiment are given by:
$H_0: p = \frac{1}{88} = 0.0114$
$H_A: p > \frac{1}{88} = 0.0114$
The level of significance is α = 0.01.

iii.)
State and check the assumptions for a hypothesis test

a) A simple random sample of the 32,601 diagnoses of children was taken in 2008. The study was conducted by the CDC, so this assumption is probably true.

b) There are 32,601 diagnoses in this sample. The diagnosis of one Arizona child doesn’t affect the diagnosis of the next one. There are only two outcomes: either the Arizona child has ASD or they do not. The chance that one Arizona child has ASD does not change. Thus, the conditions for the binomial distribution are satisfied.

c) In this case $p = \frac{1}{88} = 0.0114$ and n = 32,601.
$np = 32601 \cdot \frac{1}{88} = 370.47 \geq 5$ and $nq = 32601 \cdot \left(1 - \frac{1}{88}\right) = 32{,}230.5 \geq 5$.
Thus, the sampling distribution for $\hat{p}$ is approximately normal; this means we will use a z-test.

iv.) Find the sample statistic, test statistic, and p-value

The sample proportion is given by:
x = 507
n = 32,601
$\hat{p} = \frac{x}{n} = \frac{507}{32{,}601} = 0.0156$

The test statistic is given by:
$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}} = \frac{0.0156 - 0.0114}{\sqrt{\frac{0.0114(1 - 0.0114)}{32601}}} = 7.134$

The p-value associated with this problem (going back to homework 4 for how to compute the p-value from a z-statistic) is given by:
=1 - NORM.S.DIST(z, cumulative)
=1 - NORM.S.DIST(7.134, TRUE) = $4.866 \times 10^{-13}$

v.) Conclusion
Since the p-value is less than the level of significance (i.e. [p-value = $4.866 \times 10^{-13}$] < [α = 0.01]), we reject $H_0$.

vi.) Interpretation (do not skip this part! This is the “so what” of the entire hypothesis test).
There is enough evidence to show that the proportion of Arizona children in 2008 with ASD is more than the national proportion.

5. (3 points): The economic dynamism, which is the index of productive growth in dollars, for countries that are designated by the World Bank as middle-income is given in Table 1 ("SOCR data 2008," 2013). Countries that are considered high-income have a mean economic dynamism of 60.29. Does the data show that the mean economic dynamism of middle-income countries is less than the mean for high-income countries?
Why or why not? Test at the 5% level.

25.8057 37.4511 51.9150 43.6952 47.8506 43.7178 58.0767
41.1648 38.0793 37.7251 39.6553 42.0265 48.6159 43.8555
49.1361 61.9281 41.9543 44.9346 46.0521 48.3652 43.6252
50.9866 59.1724 39.6282 33.6074 21.6643
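As a cross-check, the summary statistics used later in this problem can be recomputed directly from the values listed above. The following is a minimal sketch in Python, not the Excel workflow these solutions rely on; the variable names (`data`, `mu0`, `t_stat`) are assumptions chosen for this illustration.

```python
import math
import statistics

# Economic dynamism values for the 26 middle-income countries listed above
data = [
    25.8057, 37.4511, 51.9150, 43.6952, 47.8506, 43.7178, 58.0767,
    41.1648, 38.0793, 37.7251, 39.6553, 42.0265, 48.6159, 43.8555,
    49.1361, 61.9281, 41.9543, 44.9346, 46.0521, 48.3652, 43.6252,
    50.9866, 59.1724, 39.6282, 33.6074, 21.6643,
]

n = len(data)
mean = statistics.mean(data)   # sample mean
s = statistics.stdev(data)     # sample standard deviation (divides by n - 1)

# Mean for high-income countries, taken from the problem statement
mu0 = 60.29

# One-sample t statistic: (sample mean - hypothesized mean) / standard error
t_stat = (mean - mu0) / (s / math.sqrt(n))

print(n, round(mean, 2), round(s, 2), round(t_stat, 2))
```

The mean and standard deviation round to 43.87 and 9.07, and the t statistic is strongly negative, which is what a "less than" alternative hypothesis would look for.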
[Figure: Box and Whisker for Economic Dynamism]
[Figure: Normal Probability Plot for Economic Dynamism]

iv.) Find the sample statistic, test statistic, and p-value

Sample mean and standard deviation:
$\bar{x} = \$43.87$
$s = \$9.07$
n = 26

Test statistic:
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{43.87 - 60.29}{9.07 / \sqrt{26}} = -9.228$

p-value:
To get the p-value from Excel, we use the T.DIST function:

Syntax: T.DIST(x, deg_freedom, cumulative)

The T.DIST function syntax has the following arguments:
X: Required. The numeric value at which to evaluate the distribution.
Deg_freedom: Required. An integer indicating the number of degrees of freedom.
Cumulative: Required. A logical value that determines the form of the function. If cumulative is TRUE, T.DIST returns the cumulative distribution function; if FALSE, it returns the probability density function.

The function to put into Excel is:
=T.DIST(-9.228, 26-1, TRUE) = $7.900 \times 10^{-10}$

v.) Conclusion
Since the p-value is less than the significance level (i.e. $7.900 \times 10^{-10} < 0.05$), we reject $H_0$.

vi.) Interpretation
There is enough evidence to show that the mean economic dynamism for a middle-income country is less than 60.29, the mean for high-income countries.

iv.) Find the sample statistic, test statistic, and p-value

Sample mean and standard deviation:
$\bar{x} = 26.33$ mm
$s = 9.77$ mm
n = 9

Test statistic:
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{26.33 - 18.125}{9.77 / \sqrt{9}} = 2.5198$

p-value:
To get the p-value from Excel, we use the T.DIST function. T.DIST returns the area to the left, but we are looking for the area to the right (our alternative hypothesis is “greater than”), so we take 1 minus the area to the left. In Excel:
=1-T.DIST(2.5198, 9-1, TRUE) = 0.0179

v.) Conclusion
Since the p-value is greater than the significance level (i.e. 0.0179 > 0.01), we fail to reject $H_0$.

vi.)
Interpretation
At the 0.01 (or 1%) level, there is not quite enough evidence to show that the mean forward and backward sway of elderly people is more than 18.125 mm, the sway of younger people. However, if we increased our level of significance to 0.05 (the 5% level), we would conclude that the mean sway of elderly people is more than that of younger people.

7. (3 points): Suppose you compute a confidence interval with a sample size of 100. What will happen to the confidence interval if the sample size decreases to 80?

A confidence interval will become wider if the sample size is decreased.

8. (3 points): In 2013, Gallup conducted a poll and found a 95% confidence interval of $0.52 < p < 0.60$, where p is the proportion of Americans who believe it is the government’s responsibility for health care. Give the statistical interpretation.

We are 95% confident that the proportion of Americans who believe it is the government’s responsibility for health care is between 52% and 60%.

9. (3 points): In 2008, there were 507 children in Arizona out of 32,601 who were diagnosed with Autism Spectrum Disorder (ASD) ("Autism and developmental," 2008). Find the proportion of ASD in Arizona with a confidence level of 99%.

This is a confidence interval about a proportion. Thus, we will use the standard normal distribution.

i.) State the random variable and the parameter in words.
x = number of children in Arizona in 2008 that were diagnosed with Autism Spectrum Disorder (ASD)
p = proportion of children in Arizona in 2008 that were diagnosed with Autism Spectrum Disorder (ASD)

ii.) State and check the assumptions

a. A simple random sample of the 32,601 diagnoses of children was taken in 2008. The study was conducted by the CDC, so this assumption is probably true.

b. There are 32,601 diagnoses in this sample. The diagnosis of one Arizona child doesn’t affect the diagnosis of the next one. There are only two outcomes: either the Arizona child has ASD or they do not.
The chance that one Arizona child has ASD does not change. Thus, the conditions for the binomial distribution are satisfied.

c. In this case, $\hat{p} = \frac{x}{n} = \frac{507}{32{,}601} = 0.0156$ and n = 32,601.
Thus, $n\hat{p} = 32601 \cdot \frac{507}{32{,}601} = 507 \geq 5$ and $n\hat{q} = 32601 \cdot \frac{32{,}601 - 507}{32{,}601} = 32{,}094 \geq 5$.
Thus, the sampling distribution for $\hat{p}$ is approximately normal.

iv.) Find the sample statistic and confidence interval

The sample proportion is given by:
x = 507
n = 32,601
$\hat{p} = \frac{x}{n} = \frac{507}{32{,}601} = 0.0156$

Confidence Interval:
First, we need to determine the value for $z_c$, the critical value where C = 1 - α. If we use Table A.1 in the back of the Kozak textbook, we find this value is 2.575.

Table A.1: Normal Critical Values for Confidence Levels

Confidence Level, C    Critical Value, z_c
99%                    2.575
98%                    2.33
95%                    1.96
90%                    1.645
80%                    1.28

You might actually want to know where this value came from, so here is how you can find it in Excel:
Since we are looking at the 99% confidence interval, we have an area of 1 - 0.99 = 0.01 outside of our confidence interval; however, half of that area is on each side of the interval. Thus, it goes from 0.005 to 0.995.

a. A simple random sample of economic dynamism for 26 middle-income countries was taken. The problem doesn’t mention how the sample was taken. Thus, this assumption may not have been met.

b. Recall from question 5: The population of the economic dynamism for all middle-income countries is normally distributed or the sample size is 30 or more. The sample size is 26. The histogram looks somewhat bell shaped, there is one outlier (but it is not far outside 1.5*IQR), and the normal probability plot does appear linear. Thus, this assumption is probably met (nothing is ever “perfect” in real life).

iv.)
Find the sample statistic and confidence interval

Also from question 5:
Sample mean and standard deviation:
$\bar{x} = \$43.87$
$s = \$9.07$
n = 26

Confidence Interval:
First, we need to determine the value for $t_c$, the critical value where C = 1 - α. If we use Table A.2 in the back of the Kozak textbook, we look in the 95% column down to degrees of freedom of n - 1 = 26 - 1 = 25 and find the value of $t_c = 2.060$.

You might actually want to know where this value came from, so here is how you can find it in Excel:
Since we are looking at the 95% confidence interval, we have an area of 1 - 0.95 = 0.05 outside of our confidence interval; however, half of that area is on each side of the interval. Thus, it goes from 0.025 to 0.975.
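In other words, $t_c$ is the point on the t-distribution with 25 degrees of freedom that has an area of 0.975 to its left. As an illustration only, the sketch below locates that point in Python using just the standard library: it integrates the t density numerically (Simpson's rule) and then searches by bisection. This is a swapped-in technique, not the Excel approach these solutions use, and the function names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical helpers written for this example.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Area to the left of x, via Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(confidence: float, df: int) -> float:
    """Two-sided critical value: the point with area (1 + C) / 2 to its left."""
    target = (1 + confidence) / 2
    lo, hi = 0.0, 20.0
    for _ in range(60):  # bisection: halve the bracket until it is tiny
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% confidence with n - 1 = 26 - 1 = 25 degrees of freedom
tc = t_critical(0.95, 26 - 1)
print(round(tc, 2))
```

The result agrees with the Table A.2 value of 2.060 to within rounding. A statistics library would normally be the more practical choice; the hand-rolled integration here only serves to show where the table entry comes from.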