












Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
The concept of multiple comparisons in anova, which is used when we reject the null hypothesis that all treatment means are equal and want to identify which ones are significantly different. The overall error rate, planned and unplanned comparisons, and various methods for carrying out multiple comparisons, such as the bonferroni and newman-keuls procedures. It also includes examples and calculations for pairwise comparisons and interpreting the results.
Typology: Study notes
1 / 20
This page cannot be seen from the preview
Don't miss anything!













When we carry out an ANOVA on k treatments, we test
H 0 : μ 1 = · · · = μk versus Ha : H 0 is false
Assume we reject the null hypothesis, i.e. we have some evidence that not all treatment means are equal. Then we could for example be interested in which ones are the same, and which ones differ. For this, we might have to carry out some more hypothesis tests.
This procedure is referred to as multiple comparisons.
1
We will be conducting, say, T different tests, and we become concerned about the overall error rate (sometimes called the “family-wise” error rate).
Overall error rate = Pr(reject at least one H 0 | all H 0 are true)
= 1 − [ 1 − Pr(reject first | first H 0 is true)]T^ if independent
≤ T × Pr(reject first | first H 0 is true) generally
There are two different types of multiple comparisons procedures:
Sometimes we already know in advance what questions we want to answer. Those comparisons are called planned (or a priori) comparisons.
Sometimes we do not know in advance what questions we want to answer, and the judgement about which group means will be studied the same depends on the ANOVA outcome. Those comparisons are called unplanned (or a posteriori) comparisons.
3
Planned comparisons: adjust for just those tests that are planned.
Unplanned comparisons: adjust for all possible comparisons.
The investigator wants to answer three specific questions:
Planned comparisons! 7
60 65 70 75 response
treatment
l lll (^) l l (^) l ll l
l (^) l l ll l l l l l
ll^ ll^ lll^ l ll
l lll lll ll l
l l l l l
ll ll l
C
G
F
G+F
S
Source SS Df MS F-value p-value
Between treatment 1077.3 4 269.3 49.4 < 0.
Within treatment 245.5 45 5.
9
Compare the control to the others.
Test
μC –
μG + μF + μG+F + μS 4
Look at
y^ ¯C –
y¯G + ¯yF + ¯yG+F + ¯yS 4
Source SS Df MS F-value p-value
Between treatment 1077.3 4 269.3 49.4 < 0.
Within treatment 245.5 45 5.
Source SS Df MS F-value p-value
Between treatment 832.3 1 832.3 152.4 < 0.
Within treatment 245.5 45 5. 13
Source SS Df MS F-value p-value
Between treatment 48.1 1 48.1 8.82 0.
Within treatment 245.5 45 5.
Source SS Df MS F-value p-value
Between treatment 196.9 2 98.4 18.0 < 0.
Within treatment 245.5 45 5.
Two comparisons c 1 = (c1,1,... , c1,k) and c 2 = (c2,1,... , c2,k) are orthogonal if and only if
∑^ k
t = 1
where nt is the number of observations in treatment group t.
15
source SS df MS F-value
Treatment 1077.3 4 269.3 49.
Control versus sugars 832.3 1 832.3 152.
Pure sugars versus mixed sugar 48.1 1 48.1 8.
Among pure sugars 196.9 2 98.4 18.
Within treatment 245.5 45 5.
Total 1322.8 49
Suppose we are comparing k treatment groups.
Suppose ANOVA indicates that you reject H 0 : μ 1 = · · · = μk
What next?
Which of the μ’s are different from which others?
Consider testing H 0 : μi = μj for all pairs i,j.
There are
(k 2
= k^ (k 2 − 1 )such pairs.
k = 5 −→
(k 2
19
Suppose we have 10 treatment groups, and so 45 pairs.
If we perform 45 t-tests at the significance level α = 0.05, we’d expect to reject 5% × 45 ≈ 2 of them, even if all of the means were the same.
Let α = Pr(reject at least one pairwise test | all μ’s the same)
≤ (no. tests) × Pr(reject test #1 | μ’s the same)
The Bonferroni correction:
Use α′^ = α/(no. tests) as the significance level for each test.
For example, with 10 groups and so 45 pairwise tests, we’d use α′^ = 0.05 / 45 ≈ 0.0011 for each test.
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 Coagulation Time
Diet A
Diet B
Diet C
Diet D
Combined
21
Comparison p-value α′′^ = α k = 0.05 6 = 0.
A vs B 0.
A vs C < 0.
A vs D 1.
B vs C 0.
B vs D < 0.
C vs D < 0.
σ ˆ = 2.
n = 10 for each group
SE = 2.34 ×
2 /10 = 1.05 for each comparison.
df = 45, α′^ = 0.005 −→ t = 2.
Groups with
| Y¯i· − Y¯j·| > 2.69 × 1. = 2.
are deemed different.
Bonferroni-corrected CIs:
( Y¯i· − Y¯j·) ± 2.
25
HSD = “Honest significant difference”
Reject H 0 : μi = μj if
| Y¯i· − Y¯j·| > Qα(k, df) ×
MW/n
We’re assuming equal sample sizes (n) for the treatment groups.
k = no. treatment groups; df = n · k – k
Qα(k, df) = 1 – α quantile of the “Studentized range distribution.”
We won’t go into where Qα(k, df) comes from. Suffice it to say: it’s an adjustment not unlike the Bonferroni correction, and it can be calculated using qtukey() in R. Alternatively, the function TukeyHSD() will do the whole thing.
Taking α = 0.05, k = 5, df = 45,
Qα(k, df) = 4.02. qtukey(0.95, 5, 45)
Groups with
| Y¯i· − Y¯j·| > 4.02 ×
are deemed different.
27
Uncorrected:
Each interval, individually, had (in advance) a 95% chance of covering the true mean differ- ence.
Corrected:
(In advance) there was a greater than 95% chance that all of the intervals would cover there re- spective parameters.
Sorted sample means:
Table of differences:
F G S C G+F 0.2 1.3 6.1 12. F 1.1 5.9 11. G 4.8 10. S 6. 31
From the ANOVA table:
MW = 5.46 n = 10 for each group
MW/10 = 0.739 df = 45
The qi (using df=45 and α = 0.05):
q 2 q 3 q 4 q 5 2.85 3.43 3.77 4.
Ri = qi ×
Table of differences:
F G S C G+F 0.2 1.3 6.1 12. F 1.1 5.9 11. G 4.8 10. S 6.
Ri = qi ×
33
Sorted sample means:
G+F F G S C
58.0 58.2 59.3 64.1 70.
Interpretation:
G+F ≈ F ≈ G < S < C
Table of differences:
C A B E D 3.3 10.4 11.1 19. C 7.1 7.8 15. A 0.7 8. B 8.
Ri = qi ×
37
Sorted sample means:
D C A B E
29.6 32.9 40.0 40.7 48.
Interpretation:
{D, C, A, B} < E and D < {A, B}
For the Tukey and Newman-Keuls methods, we assumed that the numbers of responses in each treatment group were the same.
What to do if they vary?
39
10 20 30 40 50 60 70
J
I
H
G
F
E
D
C
B
A
resp
l l l l l
l ll l^ l
l l l l l
l l (^) l l l
l l^ l l l
l l l l l
l ll^ l l
l l l l l
l (^) l l ll
l ll l l
10 20 30 40 50 60 70
F−J
A−E
resp
l l^ l l ll l^ l ll ll l lll l ll lll l ll
l (^) l l l (^) l lll l l l l lll l lll (^) l lllll