


A statistics assignment focusing on hypothesis testing for two samples. The assignment covers concepts such as left-skewed distributions, p-values, Type I and Type II errors, and hypothesis testing procedures for means using the two-sample independent t-test. The document also includes examples of applying these concepts to real data sets, including one on body balance and another on the NHANES cholesterol study.
Typology: Exercises
Uploaded on 04/03/2019



Awin Gray (s3073521)
September 29, 2017
(a) A left-skewed distribution will typically have a median greater than its mean, because extreme observations in the lower tail pull the mean towards the low values while the median is largely unaffected.
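As a small illustration of the point in (a), the sketch below compares the mean and median of a made-up left-skewed sample. The values are hypothetical and are not taken from the assignment's data; only the Python standard library is assumed.

```python
from statistics import mean, median

# Hypothetical left-skewed sample: most values cluster near 10,
# with one extreme observation in the lower tail.
values = [2, 8, 9, 10, 10, 11, 11, 12]

sample_mean = mean(values)      # pulled down by the low value 2
sample_median = median(values)  # middle of the sorted data, barely affected

print("mean:", sample_mean)
print("median:", sample_median)
print("median > mean:", sample_median > sample_mean)
```

Removing the single low value would bring the mean much closer to the median, which is one quick way to see that the gap is driven by the lower tail.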
(b) The p-value is the probability of observing results at least as extreme as those actually observed, given that the null hypothesis is true. A Type I error occurs when a researcher rejects the null hypothesis when the null hypothesis is true, which is a false positive. A Type II error occurs when the researcher fails to reject a false null hypothesis, which is a false negative.
These two error types describe the two ways in which a decision between the null hypothesis and the alternative hypothesis can be wrong.
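The two error types can be made concrete with a small simulation. The sketch below is a simplified illustration and not part of the assignment: it assumes normally distributed data with a known standard deviation and uses a two-sided one-sample z-test (not the t-test used later in the assignment) so that only the standard library is needed. The function names, sample size, effect size, and seed are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: mu = mu0 when sigma is known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=20,
                   alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if z_test_p_value(sample, mu0, sigma) < alpha:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type I error, so the rate should be near alpha.
type_i_rate = rejection_rate(true_mu=0.0)

# H0 is false (true mean is 0.5): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mu=0.5)

print("estimated Type I error rate:", type_i_rate)
print("estimated Type II error rate:", type_ii_rate)
```

Under these assumptions the Type I rate is controlled by the chosen significance level, while the Type II rate depends on the sample size and on how far the true mean is from the null value.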
(c)
Figure 1: box-and-whiskers plot
(a) There is an outlier observed for the Elderly category while there is no outlier in the Young category as seen in Figure 1. The lower fence, the first quartile, the median, the third quartile and the upper fence for the elderly are all greater than the corresponding percentiles for the young individuals.
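For reference, the quantities read off a box-and-whiskers plot can be computed directly. The sketch below uses hypothetical values (not the assignment's sway data) and assumes the common 1.5 × IQR rule for the fences; quartile conventions differ between software packages, so the numbers need not match a given plot exactly.

```python
from statistics import quantiles

# Hypothetical measurements for one group, with one unusually large value.
values = [14, 17, 19, 20, 21, 22, 24, 25, 50]

# Quartiles using the "inclusive" interpolation method (one of several conventions).
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences under the 1.5 * IQR rule; points beyond them are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in values if x < lower_fence or x > upper_fence]

print("Q1, median, Q3:", q1, q2, q3)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```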
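(b) It is assumed that the data in each group are approximately normally distributed when constructing the confidence interval and that there are no extreme outliers. It is also assumed that the observations are independent of each other. These assumptions seem to be tenable. A pooled-variance interval would additionally assume that the two populations from which the data are drawn have the same variance; the Welch procedure reported in the R output does not require equal variances.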
R-output:

        Welch Two Sample t-test

data:  sway$FBSway by sway$Age
t = 2.3035, df = 10.971, p-value = 0.
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  0.3627401 16.
sample estimates:
mean in group Elderly   mean in group Young
             26.33333                18.
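The 95% confidence interval for the difference in means (Elderly minus Young) is (0.3627, 16.0539). Observe that 0 is not included in the confidence interval, so the difference in mean FBSway between the two groups is statistically significant at the 5% level.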
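As a consistency check on the reported numbers, the sketch below back-calculates the standard error and the critical value implied by the t statistic and the confidence interval. It relies only on values quoted above and on the fact that a t-based interval is symmetric about the point estimate; the variable names are illustrative.

```python
# Values as reported in the R output and the sentence above.
t_stat = 2.3035
ci_low, ci_high = 0.3627401, 16.0539

# A t interval is estimate +/- critical_value * standard_error,
# so the estimate is the midpoint and the margin is the half-width.
estimate = (ci_low + ci_high) / 2
half_width = (ci_high - ci_low) / 2

# t = estimate / standard_error under H0: difference = 0.
standard_error = estimate / t_stat

# Implied two-sided 95% critical value for about 11 degrees of freedom.
critical_value = half_width / standard_error

print("estimated difference:", round(estimate, 4))
print("standard error:", round(standard_error, 4))
print("implied critical value:", round(critical_value, 4))
```

The implied critical value comes out close to 2.20, which agrees with the 97.5th percentile of a t distribution with roughly 11 degrees of freedom, so the reported statistic and interval are mutually consistent.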
(e) The sample data are inconsistent with the null hypothesis, so we reject the null hypothesis in favor of the alternative hypothesis. There is a difference in the proportions of high cholesterol across the age groups. Therefore, it can be concluded that age is associated with having high cholesterol.
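The assignment does not show which procedure produced this conclusion, so the following is only a sketch of one common way to compare two proportions: a pooled two-proportion z-test. The counts are hypothetical and are not the NHANES figures, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 60 of 200 older and 30 of 200 younger
# participants classified as having high cholesterol.
z, p_value = two_proportion_z_test(60, 200, 30, 200)

print("z statistic:", round(z, 3))
print("reject H0 at the 5% level:", p_value < 0.05)
```

With more than two age groups, a chi-square test of independence on the full contingency table would be the usual generalization of this comparison.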
(f) If the design used a different sampling method, for example simple random sampling or systematic sampling, the measures of central tendency and dispersion could differ noticeably from those produced by the study’s sampling method. Such a difference could alter the degree of association between age and high cholesterol.
Using an independent-samples t-test to test for a difference in mean cholesterol level between the old and the young might give results that contradict the comparison of proportions. In that case the cholesterol variable would be analysed as measured, with no recoding into high and not-high categories.