


Instructions for performing two-sample hypothesis tests in R using the 't.test' function. The lab covers independent-samples t-tests for the difference in population means, one-sided and two-sided tests, and the test for homogeneity of variances. Students are expected to read the document, complete the lab exercises, and write out null and alternative hypotheses, significance levels, and conclusions.



Lab 3 STAT 3000
kudzu.df=read.csv("kudzu.csv")
kudzu.df

Notice that the kudzu data are in two columns (with fairly long headings) and the first column contains only 20 numbers and then a sequence of "NA". That is because for the first sample, there are only 20 observations, whereas the second column or sample has a total of 25 observations. To make the dataset more manageable, rename the column headings to something shorter:

names(kudzu.df)=c("without","with")
kudzu.df

Notice that the names have now been changed. See page 384 of the textbook for a description of this dataset.
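The "NA" padding and the renaming step can be tried on a small stand-in data frame. This is only a sketch: the numbers and the long column names below are made up and are not the kudzu data, and 'demo.df' is a hypothetical name.

```r
# Hypothetical stand-in for kudzu.csv: two samples of unequal size.
# All values are made up for illustration only.
shorter <- c(38.2, 39.1, 37.5, 40.3, NA, NA)   # 4 observations, padded with NA
longer  <- c(41.0, 42.4, 40.8, 43.1, 41.9, 42.2)
demo.df <- data.frame(long.heading.without = shorter,
                      long.heading.with = longer)

# Rename the columns, as in the lab.
names(demo.df) <- c("without", "with")

# Count the real (non-missing) observations in each column.
n.obs <- colSums(!is.na(demo.df))
n.obs

# mean() needs na.rm=TRUE to skip the NA padding.
mean.without <- mean(demo.df$without, na.rm = TRUE)
mean.without
```

Unlike 'mean', the 't.test' function drops missing values from each sample on its own, so the padded column can be passed to it directly.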
By default, the 't.test' function performs a two-sided hypothesis test in which the null difference in means is zero. To change these defaults, specify a few more options in the command. For example, assuming the population variances are unequal, to perform a one-sided hypothesis test to determine whether the difference between the mean pulp yield with treatment and the mean pulp yield without treatment is less than 5 (this implies the null hypothesis is that the difference in means is greater than or equal to 5):
t.test(x=kudzu.df$with,y=kudzu.df$without, alt="less",mu=5,var.equal=FALSE)

Notice the ordering of the samples in the R command. This is important: if the order were switched, we would have to use 'mu=-5' and 'alt="greater"'. If we are using a 0.05 level of significance, would we have sufficient evidence to reject the null hypothesis given this information? What if you wanted to test whether the population means are significantly different? Use the following command:

t.test(x=kudzu.df$with,y=kudzu.df$without, var.equal=FALSE)

Notice that here the ordering of the samples could be switched with no effect on the results, because the test is two-sided with a null difference equal to zero. Now would you reject the null hypothesis? Does the two-sided confidence interval for the difference in population means support this decision? Does assuming the population variances are equal change the p-value?
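The effect of switching the sample order can be checked on simulated data. The sketch below uses made-up normal samples (not the kudzu data) with hypothetical names 'with.trt' and 'without.trt', and spells out 'alternative', which 'alt' abbreviates through R's partial argument matching.

```r
# Made-up samples standing in for the two kudzu columns.
set.seed(1)
with.trt    <- rnorm(25, mean = 45, sd = 3)
without.trt <- rnorm(20, mean = 42, sd = 4)

# H0: mu_with - mu_without >= 5   vs   Ha: mu_with - mu_without < 5
t1 <- t.test(x = with.trt, y = without.trt,
             alternative = "less", mu = 5, var.equal = FALSE)

# Same test with the samples switched: the sign of mu and the direction flip.
t2 <- t.test(x = without.trt, y = with.trt,
             alternative = "greater", mu = -5, var.equal = FALSE)

# The p-values agree and the t statistics differ only in sign.
all.equal(t1$p.value, t2$p.value)
all.equal(unname(t1$statistic), -unname(t2$statistic))

# Two-sided test of no difference: Welch (unequal) vs. pooled (equal) variances.
welch  <- t.test(x = with.trt, y = without.trt, var.equal = FALSE)
pooled <- t.test(x = with.trt, y = without.trt, var.equal = TRUE)
c(welch = welch$p.value, pooled = pooled$p.value)

# At the 0.05 level, H0 is rejected exactly when this 95% interval excludes 0.
welch$conf.int
```

The last line ties the two-sided test to its confidence interval: the default 'conf.level=0.95' matches a 0.05 significance level, so the two always lead to the same decision.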
golf.df=read.csv("golfball.csv")
names(golf.df)=c("golfer","old","new")
golf.df
Lab Assignment 3

Instructions: When performing hypothesis tests for the labs, you should write out the null and alternative hypotheses, whether you reject the null or not based on the results, and a conclusion for the test (like we did in class).

1.) For a set of 20 trucks, two different types of tires (standard and new) were placed randomly on either the right or left front wheels. The tire manufacturer would like to determine if the new tires wear more slowly than the standard tires. After a set amount of drive time over similar road conditions, the reductions in tread depths of the tires are measured. These data are on the course website in the file called 'tires.csv'.

a. Perform the appropriate statistical test to address the manufacturer's question. Be sure to fully document the type of test(s) you perform, the hypotheses, and the significance levels, and be sure to summarize your findings. Be sure to discuss any assumptions you make.

2.) The viscosity of oil after it has been used in an engine over a period of time may change from its initial value because the high temperature inside the engine can cause the oil to break down. An experiment was conducted to compare the effect of two different engines on oil viscosity. Various samples of the same type of oil with a constant viscosity were used, some in engine 1 and some in engine 2, and the engines were run under identical operating conditions. The resulting values of the oil viscosities after having been used in the engines are given in the dataset called 'oil.csv' on the course website.

a. Is there reason to believe that the true variability of oil viscosity is different after being run in the different engines?

b. Is there any evidence that the engines have different effects on oil viscosity?

Be sure to document any and all statistical methods used to address these questions (e.g., type of test(s) you perform, specific hypotheses, significance levels, assumptions made, and summary of findings).
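For a question about variability such as 2a, one common base-R option is the F test in 'var.test'. The sketch below is an assumption about how such a check might look, not the required method; the viscosities are simulated, are not the values in 'oil.csv', and the names 'engine1' and 'engine2' are hypothetical.

```r
# Made-up viscosities, NOT the values in oil.csv.
set.seed(2)
engine1 <- rnorm(12, mean = 10.5, sd = 0.3)
engine2 <- rnorm(15, mean = 10.2, sd = 0.6)

# F test of H0: sigma1^2 / sigma2^2 = 1 against a two-sided alternative.
# The F test assumes both samples come from normal populations.
f.out <- var.test(x = engine1, y = engine2)
f.out$statistic   # ratio of sample variances, var(engine1) / var(engine2)
f.out$p.value

# One possible convention: let the variance test guide var.equal in the t-test.
equal.var <- f.out$p.value >= 0.05
t.test(x = engine1, y = engine2, var.equal = equal.var)
```

Whichever approach is used, the hypotheses, the significance level, and the normality assumption behind the F test still need to be written out as the instructions ask.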