Lecture Notes for Statistical Methods II | STAT 516, Study notes of Data Analysis & Statistical Methods

Material Type: Notes; Class: STATISTICAL METHODS II; Subject: Statistics; University: University of South Carolina - Columbia; Term: Unknown 1989;


Uploaded on 10/01/2009

> # This example shows the analysis for the Latin Square experiment
> # using the productivity data example we looked at in class
>
> # Entering the data and defining the variables:
>
> ##########
> ##
> # Reading the data into R:
>
> my.datafile <- tempfile()
> cat(file=my.datafile, "
+ 1 A 1 1 6.3
+ 2 B 1 2 9.8
+ 24 C 5 4 9.6
+ 25 D 5 5 11.0
+ ", sep=" ")
>
> options(scipen=999) # suppressing scientific notation
>
> musicprod <- read.table(my.datafile, header=FALSE,
+     col.names=c("OBS", "MUSIC", "DAY", "TIME", "PRODUCT"))
>
> # Note we could also save the data columns into a file and use a command such as:
> # musicprod <- read.table(file = "z:/stat_516/filename.txt", header=FALSE,
> #     col.names = c("OBS", "MUSIC", "DAY", "TIME", "PRODUCT"))
>
> attach(musicprod)
>
> # The data frame called musicprod is now created,
> # with five variables, OBS, MUSIC, DAY, TIME, and PRODUCT.
> ##
> #########
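
A Latin square design requires every treatment to appear exactly once in each row and once in each column. The sketch below checks that property with table(). The layout is a hypothetical 5 x 5 cyclic square built only for illustration. It happens to agree with the four data rows visible above, but it is not claimed to be the design used in class, and the names `trt`, `layout`, and `is.latin` are assumptions.

```r
# Hypothetical 5 x 5 cyclic Latin square (illustration only, not the class data)
trt <- LETTERS[1:5]
layout <- expand.grid(DAY = 1:5, TIME = 1:5)
layout$MUSIC <- trt[((layout$DAY + layout$TIME - 2) %% 5) + 1]

# Count how often each music type occurs on each day and at each time
by.day  <- table(layout$MUSIC, layout$DAY)
by.time <- table(layout$MUSIC, layout$TIME)

# In a Latin square every one of these counts equals 1
is.latin <- all(by.day == 1) && all(by.time == 1)
print(is.latin)
```

The same two table() calls could be applied to musicprod$MUSIC, musicprod$DAY, and musicprod$TIME once all 25 rows have been read in.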
>
> ############################################################################
>
> # lm() and anova() will do a standard analysis of variance
> # We specify our (qualitative) factors with the factor() function:
>
> # Making MUSIC, DAY, TIME factors:
>
> MUSIC <- factor(MUSIC)
> DAY <- factor(DAY)
> TIME <- factor(TIME)
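
The factor() step matters because DAY and TIME are coded as the numbers 1 to 5. Left numeric, lm() would fit a single slope (1 degree of freedom) rather than a separate effect for each level (4 degrees of freedom). A minimal sketch on simulated data follows. The variables `day` and `y` are hypothetical and unrelated to the productivity data.

```r
# Simulated data for illustration only
set.seed(1)
day <- rep(1:5, each = 5)   # numeric codes 1..5, five observations each
y   <- rnorm(25)

# Numeric predictor: one slope, so 1 degree of freedom
df.numeric <- anova(lm(y ~ day))["day", "Df"]

# Factor predictor: one effect per level, so 5 - 1 = 4 degrees of freedom
df.factor <- anova(lm(y ~ factor(day)))["factor(day)", "Df"]

print(c(numeric = df.numeric, factor = df.factor))
```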
>
> # The lm statement specifies that PRODUCT is the response
> # and MUSIC, DAY, TIME are the factors
> # MUSIC is the treatment factor here, and TIME and DAY are the row and column factors.
> # The ANOVA table is produced by the anova() function
>
> musicprod.fit <- lm(PRODUCT ~ MUSIC + DAY + TIME);
> anova(musicprod.fit)
Analysis of Variance Table

Response: PRODUCT
          Df Sum Sq Mean Sq F value    Pr(>F)
MUSIC      4 56.314  14.079 12.2750 0.0003341 ***
DAY        4 41.362  10.341  9.0159 0.0013326 **
TIME       4 42.922  10.731  9.3559 0.0011356 **
Residuals 12 13.763   1.147
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> # Each F-test has a P-value below 0.01, so there is a significant effect of
> # music type on mean productivity. We also see a significant row (TIME) effect
> # and column (DAY) effect.
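
As a check on the table, the F statistic and P-value for MUSIC can be rebuilt by hand from the printed sums of squares. This is a sketch that assumes the standard Latin square degrees of freedom, t - 1 for each factor and (t - 1)(t - 2) for error with t = 5. The variable names are illustrative.

```r
# Values copied from the ANOVA table in these notes
t.levels <- 5
ss.music <- 56.314
ss.resid <- 13.763

df.trt <- t.levels - 1                      # 4
df.err <- (t.levels - 1) * (t.levels - 2)   # 12

ms.music <- ss.music / df.trt               # mean square for MUSIC
mse      <- ss.resid / df.err               # mean squared error

# F statistic and its upper-tail probability under F(4, 12)
f.music <- ms.music / mse
p.music <- pf(f.music, df.trt, df.err, lower.tail = FALSE)

print(round(c(F = f.music, p = p.music), 7))
```

Replacing ss.music with the DAY or TIME sum of squares gives the other two tests in the same way.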
>
> ############################################################################
>
> # The sample mean productivity values for each music type, listed from smallest to largest:
>
> sort( tapply(PRODUCT, MUSIC, mean) )
A B E D C
7.96 9.20 10.10 11.28 12.22
>
> # Now, which of these means are significantly different?
>
> # Tukey's procedure tells us which pairs of music types are significantly
> # different:
>
> # Tukey CIs for pairwise treatment mean differences:

> TukeyHSD(aov(musicprod.fit),conf.level=0.95)$MUSIC
     diff         lwr        upr p adj
B-A  1.24 -0.91893738 3.39893738    0.
C-A  4.26  2.10106262 6.41893738    0.
D-A  3.32  1.16106262 5.47893738    0.
E-A  2.14 -0.01893738 4.29893738    0.
C-B  3.02  0.86106262 5.17893738    0.
D-B  2.08 -0.07893738 4.23893738    0.
E-B  0.90 -1.25893738 3.05893738    0.
D-C -0.94 -3.09893738 1.21893738    0.
E-C -2.12 -4.27893738 0.03893738    0.
E-D -1.18 -3.33893738 0.97893738    0.
>
> # NOTE: The CIs which do NOT contain zero indicate the treatment means
> # that are significantly different at (here) the 0.05 experimentwise significance level.
>
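
The common half-width of these intervals can be reproduced from the ANOVA output. The sketch below assumes the usual balanced-design Tukey formula, q * sqrt(MSE / n), with 5 treatments, 12 error degrees of freedom, and n = 5 observations per music type. The means and the residual sum of squares are copied from the output above, and the variable names are illustrative.

```r
# Treatment means and residual sum of squares copied from the notes
means <- c(A = 7.96, B = 9.20, C = 12.22, D = 11.28, E = 10.10)
mse <- 13.763 / 12
n.per.trt <- 5

# Studentized range critical value for 5 means and 12 error df
q.crit <- qtukey(0.95, nmeans = 5, df = 12)
half.width <- q.crit * sqrt(mse / n.per.trt)

# All pairwise differences, labelled like the TukeyHSD rows (e.g. "B-A")
diffs <- outer(means, means, "-")
idx <- which(lower.tri(diffs), arr.ind = TRUE)
pair.diff <- diffs[lower.tri(diffs)]
names(pair.diff) <- paste(names(means)[idx[, "row"]],
                          names(means)[idx[, "col"]], sep = "-")

# A pair differs significantly when its interval excludes zero
significant <- abs(pair.diff) > half.width
print(round(half.width, 4))
print(names(pair.diff)[significant])
```

Under these assumptions the flagged pairs are those whose printed intervals exclude zero: C-A, D-A, and C-B.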