










Notation and procedures for a one-way analysis of variance (ANOVA). The document gives formulas for group statistics, group statistics from summary statistics, the ANOVA table, descriptive statistics, variance estimates and confidence intervals, tests for homogeneity of variances, user-supplied contrasts, polynomial contrasts, and weighted contrasts and statistics. It also covers the Student-Newman-Keuls (SNK), Tukey, Tukey B, Duncan, Scheffé, LSD, and modified LSD procedures.











For post hoc range tests and pairwise multiple comparisons, see Appendix 10.
The following notation is used throughout this chapter unless otherwise stated:
$X_{lj}$   Value of the $j$th observation in group $l$
$w_{lj}$   Weight for the $j$th observation in group $l$
$W_{l,j}$   Sum of weights of the first $j$ cases in group $l$
$W_l$   Sum of weights of all cases in group $l$
$k$   Number of groups, determined as the maximum group value minus the minimum plus one
$k'$   Number of nonempty groups
$n_l$   Number of cases in group $l$
$W$   Sum of weights of cases in all groups
Computation of Group Statistics
A weighted version of the Young-Cramer (1971) algorithm is used to compute recursively the corrected sum of squares for each group.
$$SSQ_{l,i} = SSQ_{l,i-1} + w_{li}\left(X_{li} - \frac{\sum_{j=1}^{i-1} w_{lj} X_{lj}}{W_{l,i-1}}\right)^{2} \frac{W_{l,i-1}}{W_{l,i}}$$
The initial value is 0; the value for each group after the last observation has been processed is the corrected sum of squares.
$$SS_l = SSQ_{l,n_l}$$
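The recursive computation can be sketched in a few lines of Python. This is a minimal illustration of a weighted one-pass update of the corrected sum of squares, not the original implementation; the function name `weighted_group_stats` and the choice to skip cases with non-positive weight are assumptions made for the example.

```python
def weighted_group_stats(values, weights):
    """One pass over a group's cases.

    Returns (W, T, mean, SS, variance): sum of weights, weighted sum,
    weighted mean, corrected sum of squares, and SS / (W - 1).
    """
    W = 0.0     # running sum of weights for the cases seen so far
    mean = 0.0  # running weighted mean of the cases seen so far
    ss = 0.0    # running corrected sum of squares; the initial value is 0
    for x, w in zip(values, weights):
        if w <= 0:
            continue  # assumption: cases with non-positive weight are skipped
        W_prev = W
        W += w
        delta = x - mean                      # deviation from the previous mean
        ss += w * delta * delta * W_prev / W  # recursive update of the sum of squares
        mean += (w / W) * delta               # update the running mean
    total = mean * W
    variance = ss / (W - 1.0) if W > 1.0 else float("nan")
    return W, total, mean, ss, variance
```

For example, `weighted_group_stats([1, 3], [2, 1])` treats the value 1 as if it had been observed twice, so the result matches the unweighted statistics of the data 1, 1, 3.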
The sum and mean for each group are
$$T_l = \sum_{i=1}^{n_l} X_{li}\, w_{li}, \qquad \bar{T}_l = T_l / W_l$$
The variance is
$$S_l^2 = SS_l / (W_l - 1)$$
The grand sum is
$$G = \sum_{i=1}^{k} T_i$$
Group Statistics from Summary Statistics
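With matrix data input, the user supplies the sum of weights in each group ($W_l$), the means ($\bar{T}_l$), and the standard deviations ($S_l$). From these, the group sums and corrected sums of squares follow by inverting the definitions above: $T_l = W_l \bar{T}_l$ and $SS_l = S_l^2 (W_l - 1)$.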
$$F = \frac{\text{Mean Square Between}}{\text{Mean Square Within}} = \frac{\mathrm{BSSM}}{\mathrm{WSSM}}$$
The significance level is obtained from the F distribution with the numerator and denominator degrees of freedom.
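As a hedged sketch, the F ratio can be computed directly from per-group summary statistics. The ANOVA table itself did not survive in this copy, so the between- and within-groups sums of squares below use the standard one-way definitions, which is an assumption, as is the function name `anova_from_summary`. The p-value is left out because it requires an F distribution routine.

```python
def anova_from_summary(W, means, sds):
    """F ratio from per-group sums of weights, means, and standard deviations.

    Assumption: BSS = sum W_l (mean_l - grand mean)^2 and
    WSS = sum S_l^2 (W_l - 1), the usual one-way ANOVA definitions.
    """
    groups = [(w, m, s) for w, m, s in zip(W, means, sds) if w > 0]
    k_nonempty = len(groups)                      # number of nonempty groups
    W_total = sum(w for w, _, _ in groups)        # sum of weights in all groups
    G = sum(w * m for w, m, _ in groups)          # grand sum
    grand_mean = G / W_total
    wss = sum(s * s * (w - 1.0) for w, _, s in groups)
    bss = sum(w * (m - grand_mean) ** 2 for w, m, _ in groups)
    bssm = bss / (k_nonempty - 1)                 # mean square between
    wssm = wss / (W_total - k_nonempty)           # mean square within
    return {
        "BSSM": bssm,
        "WSSM": wssm,
        "F": bssm / wssm,
        "df": (k_nonempty - 1, W_total - k_nonempty),
    }
```

With two groups of three cases, means 1 and 3, and unit standard deviations, the sketch gives BSSM = 6, WSSM = 1, and F = 6 on (1, 4) degrees of freedom.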
Descriptive Statistics
Sample size          $W_q$
Mean                 $\bar{T}_q$
Standard deviation   $S_q$
Standard error       $S_q / \sqrt{W_q}$
95% Confidence Interval for the Mean
$$\bar{T}_q \pm t_{W_q - 1}\, S_q / \sqrt{W_q}$$
where $t_{W_q - 1}$ is the upper 2.5% critical value for the t distribution with $W_q - 1$ degrees of freedom.
Pooled Standard Deviation
$$S_p = \sqrt{\mathrm{WSSM}}$$
Standard Error
$$\text{Standard error} = \sqrt{\mathrm{WSSM} / W}$$
95% Confidence Interval for the Mean
$$\bar{G} \pm t_{W - k'} \sqrt{\mathrm{WSSM} / W}$$
where $\bar{G} = G / W$ is the grand mean and $t_{W - k'}$ is the upper 2.5% critical value for the t distribution with $W - k'$ degrees of freedom.
Between-Groups Component of Variance (Snedecor and Cochran 1967)
$$\hat{\sigma}_b^2 = \frac{(\mathrm{BSSM} - \mathrm{WSSM})\, W\, (k' - 1)}{W^2 - \sum_{i=1}^{k} W_i^2}$$
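A minimal sketch of the between-groups variance component follows. It uses the standard one-way random-effects estimator, the difference of the two mean squares divided by an effective group size; the helper name `between_component_variance` is hypothetical, and returning a negative estimate unchanged is an assumption because the source does not say how that case is handled.

```python
def between_component_variance(bssm, wssm, W):
    """Between-groups variance component for a one-way random-effects layout.

    bssm, wssm: mean squares between and within groups.
    W: per-group sums of weights; empty groups (weight 0) are ignored.
    """
    W = [w for w in W if w > 0]
    k_nonempty = len(W)
    W_total = sum(W)
    # Effective group size: (W - sum(W_i^2) / W) / (k' - 1).
    n0 = (W_total - sum(w * w for w in W) / W_total) / (k_nonempty - 1)
    return (bssm - wssm) / n0  # may be negative; returned as-is (assumption)
```

For equal group sizes the effective size `n0` reduces to the common group size, so with BSSM = 6, WSSM = 1, and two groups of three cases the estimate is 5/3.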
Standard Error of the Mean $= \sqrt{V(\bar{G})}$ (Brownlee 1965)
where $V(\bar{G})$ is the estimated variance of the grand mean under the random-effects model, computed from $W$, the $W_i$, $k'$, BSSM, and WSSM.
Let $C_1$ through $C_k$ be the coefficients for a particular contrast. If the sum of the coefficients is not 0, a warning is printed and the contrast number is starred. For each contrast the following are printed.
Value of the Contrast
$$V = \sum_{i=1}^{k} C_i \bar{T}_i$$
Standard Error (pooled variance)
$$SE = S_p \sqrt{\sum_{i=1}^{k} C_i^2 / W_i}$$
t Value
$$t = V / SE$$
Degrees of Freedom
$$W - k'$$
Two-tailed significance level based on the t distribution with $W - k'$ degrees of freedom
Standard Error (separate variances)
$$SE = \sqrt{\sum_{i=1}^{k} C_i^2 S_i^2 / W_i}$$
t Value
$$t = V / SE$$
Degrees of Freedom (Brownlee 1965)
$$df = \frac{\left(\sum_{i=1}^{k} C_i^2 S_i^2 / W_i\right)^2}{\sum_{i=1}^{k} \left(C_i^2 S_i^2 / W_i\right)^2 / (W_i - 1)}$$
Two-tailed significance level based on the t distribution with df degrees of freedom
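The two sets of contrast statistics can be illustrated with a short Python sketch. It assumes every group is nonempty and has more than one case, takes the pooled variance as the within-groups mean square computed from the group standard deviations, and uses the Satterthwaite-type approximation for the separate-variance degrees of freedom. The function name `contrast_tests` and the dictionary keys are hypothetical, and significance levels are omitted because they need a t distribution routine.

```python
import math

def contrast_tests(coefs, means, sds, W):
    """Contrast value with pooled-variance and separate-variance t statistics."""
    k = len(coefs)
    W_total = sum(W)
    value = sum(c * m for c, m in zip(coefs, means))

    # Pooled variance: within-groups mean square on W - k degrees of freedom.
    wssm = sum(s * s * (w - 1.0) for s, w in zip(sds, W)) / (W_total - k)
    se_pooled = math.sqrt(wssm * sum(c * c / w for c, w in zip(coefs, W)))

    # Separate variances: each group contributes C_i^2 S_i^2 / W_i.
    terms = [c * c * s * s / w for c, s, w in zip(coefs, sds, W)]
    se_separate = math.sqrt(sum(terms))
    df_separate = sum(terms) ** 2 / sum(
        t * t / (w - 1.0) for t, w in zip(terms, W)
    )

    return {
        "value": value,
        "pooled": {"se": se_pooled, "t": value / se_pooled, "df": W_total - k},
        "separate": {"se": se_separate, "t": value / se_separate, "df": df_separate},
    }
```

When the group sizes and standard deviations are equal, the two standard errors coincide, and for two groups the separate-variance degrees of freedom equal the pooled value.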
If the specified degree of the polynomial (NP) is less than or equal to 0, or greater than 5, a message is printed and the procedure is terminated. If the degree of the polynomial specified is greater than the number of nonempty groups, it is set to k ′ − 1. If the sums of the weights in each group are equal, only the WEIGHTED contrasts will be generated. For unequal sample sizes with equal spacing between groups, both WEIGHTED and UNWEIGHTED contrasts are computed. For
The F statistic for the qth degree contrast is computed as
$$F = \frac{\left[\sum_{i=1}^{k} c_{i,q}\, \bar{T}_i\right]^2}{\mathrm{WSSM} \sum_{i=1}^{k} c_{i,q}^2 / W_i}$$
where WSSM is the mean square within. The significance level is obtained from the F distribution with 1 and W − k′ degrees of freedom.
The contrast for the qth degree polynomial component is computed from the following recursive relations:
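$$d_{i,q} = (i - A'_q)\, d_{i,q-1} - C'_q\, d_{i,q-2} \qquad \text{for } q = 1, \dots, NP;\ i = 1, \dots, k$$
with initial values
$$d_{i,-1} = 0, \qquad d_{i,0} = 1$$
$$A'_q = \frac{\sum_{i=1}^{k} i\, W_i\, d_{i,q-1}^2}{\sum_{i=1}^{k} W_i\, d_{i,q-1}^2}$$
$$C'_q = \begin{cases} \dfrac{\sum_{i=1}^{k} i\, W_i\, d_{i,q-1}\, d_{i,q-2}}{\sum_{i=1}^{k} W_i\, d_{i,q-2}^2} & \text{for } q \ge 2 \\[2ex] 0 & \text{for } q = 1 \end{cases}$$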
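The recursion is the usual three-term recurrence for polynomials that are orthogonal with respect to the group weights at the equally spaced points 1, ..., k. The sketch below is an illustration under that reading, not the original code; the function name `weighted_poly_values` is hypothetical, and it assumes all `k` groups are nonempty and that `NP` is at most `k - 1`.

```python
def weighted_poly_values(W, NP):
    """Values d[q-1][i-1] of the degree-q orthogonal polynomial at i = 1..k.

    W: sums of weights of the k groups (all assumed positive).
    NP: highest polynomial degree, assumed to satisfy 1 <= NP <= k - 1.
    """
    k = len(W)
    idx = range(1, k + 1)
    d_prev2 = [0.0] * k  # d_{i,-1} = 0
    d_prev1 = [1.0] * k  # d_{i,0} = 1
    out = []
    for q in range(1, NP + 1):
        den1 = sum(w * d * d for w, d in zip(W, d_prev1))
        A = sum(i * w * d * d for i, w, d in zip(idx, W, d_prev1)) / den1
        if q >= 2:
            num = sum(
                i * w * d1 * d2
                for i, w, d1, d2 in zip(idx, W, d_prev1, d_prev2)
            )
            C = num / sum(w * d * d for w, d in zip(W, d_prev2))
        else:
            C = 0.0  # the q = 1 step has no second-order term
        d_q = [(i - A) * d1 - C * d2 for i, d1, d2 in zip(idx, d_prev1, d_prev2)]
        out.append(d_q)
        d_prev2, d_prev1 = d_prev1, d_q
    return out
```

For three equally weighted groups, the first two components come out as (-1, 0, 1) and (1/3, -2/3, 1/3), which are proportional to the familiar linear and quadratic contrasts.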
The test for the contribution of the qth degree orthogonal polynomial component is based on
$$F = D_q / \mathrm{WSSM}$$
The appropriate range of values for the tests are
SNK      $R_r = S_{r,f}, \quad r = 2, \dots, k'$
TUKEY    $R_r = S_{k',f}$
TUKEYB   $R_r = \left(S_{r,f} + S_{k',f}\right) / 2$
For the DUNCAN procedure, alphas of 0.01, 0.05, and 0.10 can be used. The ranges ($D_{r,f}$) are generated using the algorithm of Gebhardt (1966).
DUNCAN   $R_r = D_{r,f}, \quad r = 2, \dots, k'$
The Scheffé, LSD, and modified LSD procedures all use critical points from the F distribution.
SCHEFFE  $R_r = \sqrt{2 (k' - 1)\, F_{1-\alpha}(k' - 1, f)}$
LSD      $R_r = \sqrt{2\, F_{1-\alpha}(1, f)}$
MODLSD   $R_r = \sqrt{2\, F_{1-\alpha'}(1, f)}$
where
$$\alpha' = \frac{2\alpha}{k'(k' - 1)}$$
Compute the multiplier of the ranges for the difference of means i and j.
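$$M_{i,j} = \begin{cases} S_p \sqrt{\dfrac{1}{2}\left(\dfrac{1}{n_i} + \dfrac{1}{n_j}\right)} & \text{(default)} \\[2ex] S_p \sqrt{\dfrac{1}{n_h}} & \text{(harmonic mean for all groups)} \end{cases}$$
where the harmonic mean of the group sizes is
$$n_h = \frac{k}{\sum_{l=1}^{k} 1 / n_l}$$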
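The multiplier can be sketched as follows, assuming the default uses the sizes of the two groups being compared and the alternative replaces both by the harmonic mean of all group sizes. The function names `harmonic_mean_size` and `range_multiplier` and the flag `use_harmonic_mean` are hypothetical.

```python
import math

def harmonic_mean_size(n):
    """Harmonic mean of the group sizes: k / sum(1 / n_l)."""
    return len(n) / sum(1.0 / x for x in n)

def range_multiplier(s_p, n, i, j, use_harmonic_mean=False):
    """Multiplier of the range for comparing the means of groups i and j.

    s_p: pooled standard deviation; n: list of group sizes;
    i, j: zero-based positions of the two groups in n.
    """
    if use_harmonic_mean:
        return s_p * math.sqrt(1.0 / harmonic_mean_size(n))
    return s_p * math.sqrt(0.5 * (1.0 / n[i] + 1.0 / n[j]))
```

When all groups have the same size the two options coincide, which is why equal sample sizes always allow homogeneous subsets to be formed.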
If the sample sizes in all groups are equal, or the harmonic mean for all groups has been selected, or the multiple comparison procedure is SNK or DUNCAN, homogeneous subsets are established as follows:
The means are sorted into ascending order, $\bar{T}_{(1)} \le \dots \le \bar{T}_{(k')}$. Indices $i$ and $q$ such that
$$\bar{T}_{(q)} - \bar{T}_{(i)} \le R_{q-i+1}\, M_{q,i} \qquad (*)$$
are systematically searched for and
$$\left\{\bar{T}_{(i)}, \dots, \bar{T}_{(q)}\right\}$$
is considered a homogeneous subset. The search procedure is as follows: At each step $t$, the value of $i$ is incremented by 1 (the starting value is 1), and $q = k'$. The value of $q$ is then decremented by one until $(*)$ is true. Call this value $q_t$. If $q_t > q_{t-1}$ and $(*)$ is true,
$$\left\{\bar{T}_{(i)}, \dots, \bar{T}_{(q_t)}\right\}$$
is considered homogeneous. Otherwise $i$ is incremented and the next step is done. The procedure terminates when $i = k'$ or $q_t = k'$.
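One possible reading of the search procedure is sketched below. The means are assumed to be sorted in ascending order already, positions are zero-based, and the critical difference (range value times multiplier) is supplied by the caller through the hypothetical callback `critical_diff(i, q)`. This is an interpretation of the steps described above rather than the original code.

```python
def homogeneous_subsets(sorted_means, critical_diff):
    """Return (i, q) index pairs of homogeneous subsets of sorted group means.

    sorted_means: group means in ascending order.
    critical_diff(i, q): largest difference between means at positions i and q
        that still counts as homogeneous (assumed non-negative).
    """
    k = len(sorted_means)
    subsets = []
    q_prev = -1  # right end of the last subset that was accepted
    for i in range(k):
        q = k - 1
        # Move the right end down until the subset passes the range test.
        while q > i and sorted_means[q] - sorted_means[i] > critical_diff(i, q):
            q -= 1
        if q > q_prev:
            # The subset extends further right than the previous one: keep it.
            subsets.append((i, q))
            q_prev = q
        if q == k - 1:
            break  # the largest mean has been reached
    return subsets
```

With means 1, 2, 3, 10 and a constant critical difference of 2.5, the first three means form one subset and the last mean stands alone, giving `[(0, 2), (3, 3)]`.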
the statistic if any one group has zero standard deviation. Moreover, sample sizes of all groups have to be greater than or equal to zero.
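In Brown and Forsythe (1974a, 1974b), a test statistic for equal means was proposed. The statistic has the following form:
$$F_{BF} = \frac{\sum_{l=1}^{k} W_l \left(\bar{T}_l - \bar{G}\right)^2}{\sum_{l=1}^{k} \left(1 - W_l / W\right) S_l^2}$$
where $\bar{G} = G / W$ is the grand mean.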
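The statistic has an approximate F distribution with $(k - 1)$ and $f$ degrees of freedom, where
$$\frac{1}{f} = \sum_{l=1}^{k} \frac{c_l^2}{W_l - 1}$$
and
$$c_l = \frac{\left(1 - W_l / W\right) S_l^2}{\sum_{l=1}^{k} \left(1 - W_l / W\right) S_l^2}$$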
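A short sketch of the Brown-Forsythe statistic and its approximate denominator degrees of freedom, computed from group summaries. The function name `brown_forsythe` is hypothetical, and the code assumes every group has more than one case and that at least one group has a nonzero standard deviation. The p-value is omitted because it needs an F distribution routine.

```python
def brown_forsythe(W, means, sds):
    """Brown-Forsythe statistic for equal means.

    Returns (F_BF, numerator df, approximate denominator df f).
    """
    k = len(W)
    W_total = sum(W)
    grand_mean = sum(w * m for w, m in zip(W, means)) / W_total
    numerator = sum(w * (m - grand_mean) ** 2 for w, m in zip(W, means))
    # Each group's variance is weighted by (1 - W_l / W).
    parts = [(1.0 - w / W_total) * s * s for w, s in zip(W, sds)]
    denominator = sum(parts)
    c = [p / denominator for p in parts]
    f = 1.0 / sum(cl * cl / (w - 1.0) for cl, w in zip(c, W))
    return numerator / denominator, k - 1, f
```

For two groups of five cases with means 3 and 1 and unit standard deviations, the sketch returns a statistic of 10 on 1 and 8 degrees of freedom, the same as the ordinary F test in this balanced case.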
When we look at the denominator of $F_{BF}$, we can see that it tries to estimate the 'pooled variance' by
$$S_{pool}^2 = \sum_{l=1}^{k} \omega_l S_l^2$$
where
$$\omega_l = \frac{1 - W_l / W}{k - 1}$$
We can see that the weight of i-th group decreases as sample size increases. Moreover, the weight tends to ‘average out’ the differences in sample sizes. This can be visualized in the following example.
$n_i$       10     20     30     40     50
$w_i^*$    .233   .217   .200   .183   .167
$n_i/N$    .067   .133   .200   .267   .333
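The weights in the example can be reproduced with a few lines of Python, assuming each weight is one minus the group's share of the total sample size, divided by the number of groups minus one. The function name `bf_weights` is hypothetical.

```python
def bf_weights(n):
    """Brown-Forsythe variance weights (1 - n_i / N) / (k - 1) for group sizes n."""
    N = sum(n)
    k = len(n)
    return [(1.0 - ni / N) / (k - 1) for ni in n]

sizes = [10, 20, 30, 40, 50]
weights = bf_weights(sizes)
shares = [ni / sum(sizes) for ni in sizes]
print([round(w, 3) for w in weights])  # Brown-Forsythe weights
print([round(s, 3) for s in shares])   # proportional weights n_i / N
```

The weights sum to 1 and vary far less across groups than the proportional weights do, which is the averaging effect described above.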
With such a weighting scheme, the pooled variance will not be dominated by a group with a very large sample size, which prevents over- or underestimation of the pooled variance. Under the assumption of unequal variances, $S_{pool}^2$ no longer has a chi-square distribution. However, by using Satterthwaite's method (1946), one can show that $S_{pool}^2$ (with proper scaling) has an approximate chi-square distribution with $f$ degrees of freedom. The Brown-Forsythe statistic cannot be computed if all groups have zero standard deviation or if any group has a sample size less than or equal to 1. When only some groups have zero standard deviations, the statistic can be computed but the approximation may not work.
Brown, M. B. and A. B. Forsythe (1974a). "The small sample behavior of some statistics which test the equality of several means", Technometrics, v16, 129-
Brown, M. B. and A. B. Forsythe (1974b). "Robust tests for the equality of variances", J. American Statistical Association, v69, 364-367.
Brownlee, K. A. 1965. Statistical theory and methodology in science and engineering. New York: John Wiley & Sons, Inc.
Duncan, D. B. 1955. Multiple range and multiple F tests. Biometrics, 11: 1–42.
Eisenhart, C. 1947. Significance of the largest of a set of sample estimates of variance. In: Techniques of Statistical Analysis, C. Eisenhart, M. W. Hastay, and N. A. Wallis, eds. New York: McGraw-Hill.