

This document describes the relationship between analysis of variance (ANOVA) and experimental design. It explains how the total sum of squares decomposes into between-groups and within-groups components for a one-way ANOVA, and it introduces the F-test of the null hypothesis of equal group means against the alternative that at least one mean differs. It includes an example calculation and an analysis of variance table.


Not all data analyzed by ANOVA are from a designed experiment. On the other hand, some designed experiments lead to data for which ANOVA methods are inappropriate. However, there is a strong historical connection between ANOVA and Experimental Design.
The key idea behind an analysis of variance involves a decomposition of the total sum of squares. For a one-way ANOVA (possibly arising from a completely randomized design) the decomposition is

$$\sum_{i=1}^{k}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{k}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{i\cdot})^2 + \sum_{i=1}^{k}\sum_{j=1}^{n}(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})^2,$$

or TSS = SSW + SSB.
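The notes state the decomposition without proof. As a sketch of the missing step (the standard argument, not taken from the original), add and subtract the group mean inside the square, writing $\bar{Y}_{i\cdot}$ for the mean of group $i$ and $\bar{Y}_{\cdot\cdot}$ for the overall mean:

```latex
\begin{aligned}
\sum_{i=1}^{k}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{\cdot\cdot})^2
 &= \sum_{i=1}^{k}\sum_{j=1}^{n}\bigl[(Y_{ij}-\bar{Y}_{i\cdot})+(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})\bigr]^2 \\
 &= \sum_{i=1}^{k}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{i\cdot})^2
  + \sum_{i=1}^{k}\sum_{j=1}^{n}(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})^2
  + 2\sum_{i=1}^{k}(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{i\cdot}).
\end{aligned}
```

The cross term vanishes because deviations from a group's own mean sum to zero within that group, that is, $\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{i\cdot}) = 0$ for every $i$. What remains is SSW plus SSB.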
As an example, consider three groups with the following data: Group 1 has $Y_{1j}$ values of 1, 2, and 3, Group 2 has $Y_{2j}$ values of 5, 3, and 4, and Group 3 has $Y_{3j}$ values of 6, 7, and 5. The overall sample mean is

$$\bar{Y}_{\cdot\cdot} = \frac{1}{kn}\sum_{i=1}^{k}\sum_{j=1}^{n} Y_{ij} = 36/9 = 4.$$

Then TSS is

$$\sum_{i=1}^{k}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{\cdot\cdot})^2 = (1-4)^2 + (2-4)^2 + \cdots + (5-4)^2 = 30.$$
The group means are $\bar{Y}_{1\cdot} = 2$, $\bar{Y}_{2\cdot} = 4$, and $\bar{Y}_{3\cdot} = 6$, so SSW and SSB are

$$\sum_{i=1}^{k}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{i\cdot})^2 = (1-2)^2 + (2-2)^2 + (3-2)^2 + (5-4)^2 + \cdots + (5-6)^2 = 6,$$

and

$$\sum_{i=1}^{k}\sum_{j=1}^{n}(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{k} n(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})^2 = 3(2-4)^2 + 3(4-4)^2 + 3(6-4)^2 = 24.$$
Thus TSS = SSW + SSB, or 30 = 6 + 24, partitions the total sum of squares about the overall mean into two parts: one within groups (due to error, or effects not accounted for by the model) and one between groups (measuring the differences among the sample means). Since each group here has $n$ observations, each group contributes $n - 1$ degrees of freedom to the within-groups sum of squares, for a total of $k(n - 1)$ degrees of freedom for SSW. SSB is the sum of squares of the $k$ sample means about their (overall) mean, so it has $k - 1$ degrees of freedom. For the example data above, $k(n - 1) = 3(2) = 6$ and $k - 1 = 3 - 1 = 2$. We can summarize this information in an analysis of variance table:
Source            SS    df    MS    F
Between groups    24     2    12    12
Within groups      6     6     1
Total             30     8
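The entries of the table can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original notes: the function name `one_way_anova` and the dictionary keys are illustrative choices, and each mean square is taken to be its sum of squares divided by its degrees of freedom. Group sizes are read from the data, so the same code also covers groups of unequal size.

```python
# Example data from the text: k = 3 groups with n = 3 observations each.
groups = [[1, 2, 3], [5, 3, 4], [6, 7, 5]]


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F.

    `one_way_anova` is a hypothetical helper name used for illustration.
    """
    all_values = [y for g in groups for y in g]
    total_n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / total_n
    group_means = [sum(g) / len(g) for g in groups]

    # Total, within-groups and between-groups sums of squares.
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    ssw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))

    df_between = k - 1
    df_within = total_n - k  # equals k(n - 1) when every group has n values
    msb = ssb / df_between
    msw = ssw / df_within
    return {
        "TSS": tss, "SSW": ssw, "SSB": ssb,
        "df_between": df_between, "df_within": df_within,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name:>10}: {value}")
```

For the example data this gives TSS = 30, SSW = 6, SSB = 24, and F = 12, in agreement with the table.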
To test the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$ against the alternative hypothesis $H_a$: some $\mu_i$'s differ, we compare the F statistic to an F distribution with numerator df $= k - 1 = 3 - 1 = 2$ and denominator df $= k(n - 1) = 3(2) = 6$. When group sample sizes are unequal we replace $n$ in the expressions by $n_i$, the sample size in the $i$th group, so the within-groups degrees of freedom become $\sum_{i=1}^{k}(n_i - 1)$.
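The notes do not carry the comparison through to a p-value. As a hedged sketch, the upper-tail probability can be computed by hand in this example because the numerator df is 2: in that special case $P(F > f) = (1 + 2f/d_2)^{-d_2/2}$, where $d_2$ is the denominator df. The function name below is hypothetical, and the formula is an assumption specific to numerator df = 2. For other degrees of freedom a statistics library would be the usual route, for example `scipy.stats.f.sf`.

```python
def f_upper_tail_numerator_df_2(f_stat, df2):
    """P(F > f_stat) for an F distribution with numerator df = 2.

    Assumption: this closed form is valid ONLY when the numerator
    degrees of freedom equal 2, as in the three-group example.
    """
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


# Values taken from the analysis of variance table in the text.
f_stat = 12.0
df1, df2 = 2, 6

p_value = f_upper_tail_numerator_df_2(f_stat, df2)
print(f"F = {f_stat}, df = ({df1}, {df2}), p-value = {p_value:.4f}")
```

Here the p-value is $(1 + 4)^{-3} = 1/125 = 0.008$, so the example data would lead to rejecting $H_0$ at the usual 5% level.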
The model for ANOVA with one grouping factor is

$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij},$$

where $\mu$ is the population grand mean, $\alpha_i = \mu_i - \mu$ is the treatment effect for the $i$th group, and $\varepsilon_{ij} = Y_{ij} - \mu - \alpha_i$ is the random error for $Y_{ij}$. Using this notation we can write the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ in the alternate form $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$. In performing ANOVA for a completely randomized experiment we assume that 1) random samples have been taken from each of the $k$ populations, 2) the errors $\varepsilon_{ij}$ have a normal distribution with mean 0, and 3) the errors $\varepsilon_{ij}$ have a common variance $\sigma^2$. It can be shown that

$$E(\mathrm{MSW}) = \sigma^2 \quad\text{and}\quad E(\mathrm{MSB}) = \sigma^2 + \frac{n\sum_{i=1}^{k}\alpha_i^2}{k-1}.$$

If $H_0$ is true, all $\alpha_i$ terms equal zero, both mean squares estimate $\sigma^2$, and MSW and MSB should be nearly equal. Thus $F = \mathrm{MSB}/\mathrm{MSW}$ tends to be near 1 when $H_0$ is true and tends to exceed 1 when $H_0$ is false.
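The expected mean squares can be checked informally by simulation. The sketch below is not from the original notes: the function names, the choices $\mu = 4$, $\sigma = 1$, $n = 3$, the treatment effects $(-1, 0, 1)$, the number of replications, and the seed are all illustrative assumptions. It draws data from the model with normal errors and averages MSW and MSB over many simulated data sets.

```python
import random


def mean_squares(groups):
    """Return (MSW, MSB) for k groups of equal size n."""
    k = len(groups)
    n = len(groups[0])
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    group_means = [sum(g) / n for g in groups]
    ssw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    ssb = sum(n * (m - grand_mean) ** 2 for m in group_means)
    return ssw / (k * (n - 1)), ssb / (k - 1)


def average_mean_squares(alphas, n=3, mu=4.0, sigma=1.0, reps=20000, seed=0):
    """Average MSW and MSB over data simulated from Y_ij = mu + alpha_i + eps_ij.

    All default values are illustrative assumptions, not from the notes.
    """
    rng = random.Random(seed)
    total_msw = 0.0
    total_msb = 0.0
    for _ in range(reps):
        groups = [[mu + a + rng.gauss(0.0, sigma) for _ in range(n)]
                  for a in alphas]
        msw, msb = mean_squares(groups)
        total_msw += msw
        total_msb += msb
    return total_msw / reps, total_msb / reps


# Under H0 (all alpha_i = 0) theory gives E(MSW) = E(MSB) = sigma^2 = 1.
msw_h0, msb_h0 = average_mean_squares([0.0, 0.0, 0.0])

# With alpha = (-1, 0, 1): E(MSB) = 1 + 3 * (1 + 0 + 1) / (3 - 1) = 4.
msw_h1, msb_h1 = average_mean_squares([-1.0, 0.0, 1.0])

print(f"H0 true : average MSW = {msw_h0:.3f}, average MSB = {msb_h0:.3f}")
print(f"H0 false: average MSW = {msw_h1:.3f}, average MSB = {msb_h1:.3f}")
```

The simulated averages should land close to the theoretical values (1 and 1 under $H_0$, 1 and 4 under the alternative), up to Monte Carlo error. The treatment effects are chosen to sum to zero so that $\mu$ remains the grand mean, as the model requires.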