



































An introduction to Analysis of Variance (ANOVA), a method for testing the hypothesis that there is no difference between two or more population means. It explains why variance is important when interested in means, introduces notation, and discusses within-groups variance and F-tests. The document also covers assumption checking, including homogeneity of variance and normality assumptions, and provides R code for ANOVA. Finally, it discusses what to do with a significant ANOVA result.




































ANOVA
Analysis of variance (ANOVA) is a method for testing the hypothesis that there is no difference between two or more population means (usually at least three)
Often used for testing the hypothesis that there is no difference between a number of treatments
The t-test, which is based on the standard error of the difference between two means, can only be used to test differences between two means
With more than two means, could compare each mean with each other mean using t-tests
Conducting multiple t-tests can lead to severe inflation of the Type I error rate (false positives) and is NOT RECOMMENDED
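To make the inflation concrete, here is a minimal sketch of how the chance of at least one false positive grows with the number of pairwise t-tests. It assumes the tests are independent and each is run at level 0.05, which is a simplification (pairwise tests on the same groups are not truly independent); the function name `familywise_error_rate` is made up for illustration.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise tests, P(at least one Type I error)).

    Assumption for illustration: the pairwise tests are independent,
    so the probability of no false positive is (1 - alpha) ** n_tests.
    """
    n_tests = comb(n_groups, 2)            # every mean compared with every other mean
    fwer = 1.0 - (1.0 - alpha) ** n_tests  # complement of "no false positive at all"
    return n_tests, fwer

for k in (3, 5, 10):
    n_tests, fwer = familywise_error_rate(k)
    print(f"{k} groups: {n_tests} t-tests, P(at least one false positive) = {fwer:.3f}")
```

Even with only three groups the error rate under this assumption is well above the nominal 0.05, and it climbs quickly as groups are added.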
ANOVA is used to test for differences among several means without increasing the Type I error rate
The ANOVA uses data from all groups to estimate standard errors, which can increase the power of the analysis
Three groups tightly spread about their respective means: the variability within each group is relatively small
Easy to see that there is a difference between the means of the three groups
To distinguish between the groups, the variability between (or among) the groups must be greater than the variability of, or within, the groups
If the within-groups variability is large compared with the between-groups variability, any difference between the groups is difficult to detect
To determine whether or not the group means are significantly different, the variability between groups and the variability within groups are compared
One-Way ANOVA
When there is only one qualitative variable which denotes the groups and only one measurement variable (quantitative), a one-way ANOVA is carried out
For a one-way ANOVA the observations are divided into $I$ mutually exclusive categories, giving the one-way classification
Each of the populations is Normally distributed with the same variance (homogeneity of variance)
The observations are sampled independently, the groups under consideration are independent
ANOVA is robust to moderate violations of its assumptions, meaning that the probability values ($P$ values) computed in an ANOVA are sufficiently accurate even if the assumptions are violated
Consider $I$ groups, whose means we want to compare
Let $n_i$ be the sample size of group $i$
For the simulated verbal IQ and genotype data, $I = 3$, representing the three possible genotypes at the particular locus of interest. Each person in this data set, as well as having a genotype, also has a verbal IQ score
Want to examine if the mean verbal IQ score is the same across the 3 genotype groups
Null hypothesis is that the mean verbal IQ is the same in the three genotype groups
Remember assumption that the population variances of the three groups are the same
Under this assumption, the three variances of the three groups all estimate this common value
True population variance = $\sigma^2$
Within-groups variance = within-groups mean square = error mean square = $s_W^2$
For groups with equal sample size this is given by the average of the variances of the groups, where $s_i^2$ is the sample variance of group $i$:
$$s_W^2 = \frac{s_1^2 + s_2^2 + \cdots + s_I^2}{I}$$
For unequal sample sizes, the variances are weighted by their degrees of freedom:
$$s_W^2 = \frac{\sum_{i=1}^{I} (n_i - 1)\, s_i^2}{\sum_{i=1}^{I} (n_i - 1)}$$
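As a rough illustration of the weighting by degrees of freedom, the following sketch pools the group variances using only the Python standard library. The function name `within_groups_variance` and the data are made up for illustration; they are not the simulated verbal IQ data.

```python
from statistics import variance

def within_groups_variance(groups):
    """Pooled within-groups variance (within-groups mean square).

    Each group's sample variance is weighted by its degrees of freedom (n_i - 1).
    With equal sample sizes this reduces to the plain average of the variances.
    """
    weighted_sum = sum((len(g) - 1) * variance(g) for g in groups)
    total_df = sum(len(g) - 1 for g in groups)
    return weighted_sum / total_df

# Hypothetical data: equal sample sizes, group variances 1, 4 and 1
equal_groups = [[4.0, 5.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0]]
print(within_groups_variance(equal_groups))    # average of the three variances

# Hypothetical data: unequal sample sizes, so the larger group counts for more
unequal_groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
print(within_groups_variance(unequal_groups))
```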
If the null hypothesis is true, the three groups can be considered as random samples from the same population (equal variances are assumed and, because the null hypothesis is true, the population means are equal)
The three means are three observations from the same sampling distribution of the mean
The sampling distribution of the mean has variance $\sigma^2 / n$, where $\sigma^2$ is the population variance and $n$ is the common group sample size
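This gives a second method of obtaining an estimate of the population variance
The observed variance of the treatment means is an estimate of $\sigma^2 / n$ and is given by
$$s_{\bar{y}}^2 = \frac{\sum_{i=1}^{I} (\bar{y}_i - \bar{y})^2}{I - 1}$$
where $\bar{y}_i$ is the mean of group $i$ and $\bar{y}$ is the overall mean of all observations
For equal sample sizes, the between-groups variance is then given by:
$$s_B^2 = n\, s_{\bar{y}}^2 = \frac{n \sum_{i=1}^{I} (\bar{y}_i - \bar{y})^2}{I - 1}$$
For unequal sample sizes, the between-groups variance is given by:
$$s_B^2 = \frac{\sum_{i=1}^{I} n_i (\bar{y}_i - \bar{y})^2}{I - 1}$$
The F-ratio is the between-groups variance divided by the within-groups variance $s_W^2$:
$$F = \frac{s_B^2}{s_W^2}$$
Another way of thinking about this ratio: under the null hypothesis both variances estimate $\sigma^2$, so $F$ should be close to 1, while real differences between the group means inflate $s_B^2$ and make $F$ large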
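The comparison of between-groups and within-groups variance can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the function name `one_way_anova_f` and the data are hypothetical, and the between-groups variance uses the unequal-sample-size form (group sizes as weights), which also covers equal sizes.

```python
from statistics import mean, variance

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way classification."""
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-groups mean square: squared deviations of the group means
    # from the overall mean, weighted by group size, over I - 1
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    df_between = n_groups - 1

    # Within-groups mean square: group variances weighted by their df
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    df_within = n_total - n_groups

    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical data, three groups of three observations each
groups = [[4.0, 5.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0]]
f_ratio, df_between, df_within = one_way_anova_f(groups)
print(f_ratio, df_between, df_within)
```

For these made-up numbers the between-groups mean square is 37 and the within-groups mean square is 2, so the ratio is far above 1.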
The F distribution is the continuous distribution of the ratio of two estimates of variance
The F distribution has two parameters: degrees of freedom numerator (top) and degrees of freedom denominator (bottom)
The validity of the F-test is based on the requirement that the populations from which the variances were taken are Normal
In the ANOVA, a one-sided F-test is used. Why not two-sided in this case? Differences between the group means can only inflate the between-groups variance, so only large values of F are evidence against the null hypothesis
For the simulated genotype, verbal IQ data:
Numerator degrees of freedom: $I - 1 = 3 - 1 = 2$, since 3 genotype groups, AA, AG, GG
Denominator degrees of freedom: slightly more complicated as the sample sizes are not all equal; the sum of $(n_i - 1)$ over the three groups
One-sided P value, "statistically significant" at 0.05 level
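For the special case of 2 numerator degrees of freedom (three groups), the one-sided P value has a simple closed form, so it can be sketched without a statistics library. The F value and denominator degrees of freedom below are hypothetical, not the values from the verbal IQ output, and the function name `f_upper_tail_df1_2` is made up for illustration. For other numerator degrees of freedom a library routine would be needed.

```python
def f_upper_tail_df1_2(f_value, df_within):
    """One-sided P value P(F > f_value) for an F distribution with
    2 numerator and df_within denominator degrees of freedom.

    Closed form valid ONLY when the numerator df is 2:
        P(F > f) = (1 + 2 f / df_within) ** (-df_within / 2)
    """
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)

# Hypothetical test statistic: F = 18.5 on (2, 6) degrees of freedom
p_value = f_upper_tail_df1_2(18.5, 6)
print(f"one-sided P value = {p_value:.5f}")
print("significant at the 0.05 level" if p_value < 0.05 else "not significant at the 0.05 level")
```

Only the upper tail is used because large values of F are the ones that indicate differences between the group means.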
Homogeneity of variance = homoscedasticity
The dependent variable (quantitative measurement) should have the same variance in each category of the independent variable (qualitative variable)
Needed since the denominator of the F-ratio is the within-group mean square, which is the average of the group variances
ANOVA is robust for small to moderate departures from homogeneity of variance, especially with equal sample sizes for the groups
Rule of thumb: the ratio of the largest to the smallest group variance should be 3:1 or less, but be careful, the more unequal the sample sizes the smaller the differences in variances which are acceptable
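The rule of thumb is easy to check by hand, but a short sketch shows the idea. The helper name `variance_ratio` and the data are hypothetical; the 3:1 threshold is the one quoted above and should be tightened when sample sizes are unequal.

```python
from statistics import variance

def variance_ratio(groups):
    """Ratio of the largest to the smallest group sample variance."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Hypothetical data with group variances 1, 4 and 1
groups = [[4.0, 5.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0]]
ratio = variance_ratio(groups)
within_rule_of_thumb = ratio <= 3.0   # 3:1 threshold from the rule of thumb

print(f"largest/smallest variance ratio = {ratio:.2f}")
print("within the 3:1 rule of thumb" if within_rule_of_thumb else "exceeds the 3:1 rule of thumb")
```

This is only a quick screen and does not replace a formal check of the homogeneity of variance assumption.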