One-Way Analysis of Variance (ANOVA) Notation and Procedures, Study notes of Mathematical Statistics

Notation and procedures for conducting a one-way analysis of variance (ANOVA) study. It includes formulas for group statistics, group statistics from summary statistics, the ANOVA table, descriptive statistics, variance estimates and confidence intervals, tests for homogeneity of variances, user-supplied contrasts, polynomial contrasts, and weighted contrasts and statistics. The document also covers the Student-Newman-Keuls (SNK), Tukey, Tukey's b (TUKEYB), Duncan, Scheffé, LSD, and modified LSD procedures.

Typology: Study notes

2011/2012

Uploaded on 10/31/2012

sangawar
ONEWAY
For post hoc range tests and pairwise multiple comparisons, see Appendix 10.
Notation
The following notation is used throughout this chapter unless otherwise stated:
$X_{lj}$   Value of the jth observation in group l
$w_{lj}$   Weight for the jth observation in group l
$W_{l,j}$   Sum of weights of the first j cases in group l
$W_l$   Sum of weights of all cases in group l
$k$   Number of groups, determined as maximum group value minus minimum group value plus one
$k'$   Number of nonempty groups
$n_l$   Number of cases in group l
$W$   Sum of weights of cases in all groups
Group Statistics
Computation of Group Statistics
A weighted version of the Young-Cramer (1971) algorithm is used to compute
recursively the corrected sum of squares for each group.
$$SSQ_{l,i} = SSQ_{l,i-1} + \frac{w_{li}\left(X_{li}\,W_{l,i-1} - \sum_{j=1}^{i-1} w_{lj} X_{lj}\right)^{2}}{W_{l,i}\,W_{l,i-1}}$$

The initial value is 0; the value for each group after the last observation has been processed is the corrected sum of squares.

$$SS_l = SSQ_{l,n_l}$$
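The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the function name `weighted_corrected_ss` is hypothetical, and skipping the update while the previous weight sum is zero (so that the first case contributes nothing) is an assumption made to avoid a 0/0 term.

```python
def weighted_corrected_ss(values, weights):
    """Corrected sum of squares for one group via the weighted updating recursion.

    Returns (SS_l, W_l, T_l): corrected sum of squares, sum of weights,
    and weighted sum of the observations.
    """
    ssq = 0.0      # SSQ_{l,0}: the initial value is 0
    w_sum = 0.0    # W_{l,i-1}: sum of weights of the cases seen so far
    wx_sum = 0.0   # sum over the cases seen so far of w_lj * X_lj
    for x, w in zip(values, weights):
        if w_sum > 0:
            # Assumption: the update is skipped for the first case,
            # where W_{l,i-1} = 0 and the term would be 0/0.
            ssq += w * (x * w_sum - wx_sum) ** 2 / ((w_sum + w) * w_sum)
        w_sum += w
        wx_sum += w * x
    return ssq, w_sum, wx_sum


ss, w_total, t_total = weighted_corrected_ss([1.0, 2.0, 4.0], [1.0, 2.0, 1.0])
print(ss, w_total, t_total)
```

With weights 1, 2, 1 on the values 1, 2, 4, the weighted mean is 9/4 and the direct formula gives the same corrected sum of squares as the recursion.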

The sum and mean for each group are

$$T_l = \sum_{i=1}^{n_l} X_{li} w_{li}, \qquad \bar{T}_l = \frac{T_l}{W_l}$$

The variance is

$$S_l^2 = \frac{SS_l}{W_l - 1}$$

The grand sum is

$$G = \sum_{i=1}^{k} T_i$$

Group Statistics from Summary Statistics

With matrix data input, the user supplies the sum of weights in each group ($W_l$), the means ($\bar{T}_l$), and the standard deviations ($S_l$). From these,

$$F = \frac{BSSM}{WSSM} = \frac{\text{Mean Square Between}}{\text{Mean Square Within}}$$

The significance level is obtained from the F distribution with numerator and denominator degrees of freedom.
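As a rough illustration, the F ratio can be computed directly from the group summary statistics. The definitions of BSSM and WSSM are not shown in this excerpt, so the sketch below assumes the standard one-way ANOVA forms: between-groups sum of squares divided by $k' - 1$ and within-groups sum of squares divided by $W - k'$. The function name `anova_f_from_summary` is hypothetical.

```python
def anova_f_from_summary(weights, means, sds):
    """F ratio and degrees of freedom from per-group W_l, means, and S_l.

    Assumes every listed group is nonempty, so len(weights) plays the role of k'.
    """
    k = len(weights)
    w_total = sum(weights)
    # Grand mean as the weighted mean of the group means (G / W).
    grand_mean = sum(w * m for w, m in zip(weights, means)) / w_total
    between_ss = sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means))
    within_ss = sum((w - 1) * s ** 2 for w, s in zip(weights, sds))
    df_between = k - 1
    df_within = w_total - k
    bssm = between_ss / df_between   # mean square between (assumed definition)
    wssm = within_ss / df_within     # mean square within (assumed definition)
    return bssm / wssm, df_between, df_within


f_value, df1, df2 = anova_f_from_summary([3, 3], [1.0, 3.0], [1.0, 1.0])
print(f_value, df1, df2)
```

The significance level would then come from the upper tail of the F distribution with `df1` and `df2` degrees of freedom, which needs a statistics library and is left out here.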

Basic Statistics

Descriptive Statistics

Sample size: $W_q$
Mean: $\bar{T}_q$
Standard deviation: $S_q$
Standard error: $S_q / \sqrt{W_q}$

95% Confidence Interval for the Mean

$$\bar{T}_q \pm t_{W_q - 1}\, S_q / \sqrt{W_q}$$

where $t_{W_q - 1}$ is the upper 2.5% critical value for the t distribution with $W_q - 1$ degrees of freedom.

Variance Estimates and Confidence Interval for Mean

Fixed-Effects Model

Pooled Standard Deviation

$$S_p = \sqrt{WSSM}$$

Standard Error

$$\text{Standard error} = \sqrt{WSSM / W}$$

95% Confidence Interval for the Mean

$$\bar{G} \pm t_{W - k'} \sqrt{WSSM / W}$$

where $\bar{G} = G / W$ is the grand mean and $t_{W - k'}$ is the upper 2.5% critical value for the t distribution with $W - k'$ degrees of freedom.

Random-Effects Model

Between-Groups Component of Variance (Snedecor and Cochran 1967)

$$\hat{\sigma}_{\beta}^{2} = \frac{(BSSM - WSSM)\, W\, (k' - 1)}{W^{2} - \sum_{i=1}^{k} W_i^{2}}$$

Standard Error of the Mean $= \sqrt{V(\bar{G})}$ (Brownlee 1965)

where

$$V(\bar{G}) = \frac{\left(\sum_{i=1}^{k} W_i^{2}\right) (k' - 1) (BSSM - WSSM)}{W \left(W^{2} - \sum_{i=1}^{k} W_i^{2}\right)} + \frac{WSSM}{W}$$
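A short sketch of the two random-effects quantities may help check the algebra. It assumes the between-groups component is $(BSSM - WSSM)\,W\,(k'-1) / (W^2 - \sum W_i^2)$ and that the variance of the grand mean is that component times $\sum W_i^2 / W^2$ plus $WSSM / W$. The function name `random_effects_estimates` is hypothetical, and every listed group is assumed nonempty.

```python
import math


def random_effects_estimates(group_weights, bssm, wssm):
    """Between-groups variance component and standard error of the grand mean."""
    k = len(group_weights)                      # assumed equal to k' here
    w = sum(group_weights)
    sum_sq = sum(wi ** 2 for wi in group_weights)
    between_var = (bssm - wssm) * w * (k - 1) / (w ** 2 - sum_sq)
    var_grand_mean = (sum_sq * (k - 1) * (bssm - wssm)) / (w * (w ** 2 - sum_sq)) + wssm / w
    # math.sqrt raises ValueError if the estimated variance is negative,
    # which can happen when BSSM is much smaller than WSSM.
    return between_var, math.sqrt(var_grand_mean)


component, se_mean = random_effects_estimates([5, 5, 5], bssm=11.0, wssm=1.0)
print(component, se_mean)
```

For equal group sizes the expressions reduce to the familiar balanced forms: the component is (BSSM − WSSM) divided by the common group size, and the variance of the grand mean is BSSM / W.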

User-Supplied Contrasts

Let $C_1$ through $C_k$ be the coefficients for a particular contrast. If the sum of the coefficients is not 0, a warning is printed and the contrast number is starred. For each contrast the following are printed.

Value of the Contrast

$$V = \sum_{i=1}^{k} \bar{T}_i C_i$$

Pooled Variance Statistics

Standard Error

$$SE = S_p \sqrt{\sum_{i=1}^{k} C_i^{2} / W_i}$$

t Value

$$t = V / SE$$

Degrees of Freedom

$$W - k'$$

Two-tailed significance level based on the t distribution with $W - k'$ degrees of freedom

Separate Variance Statistics

Standard Error

$$SE = \sqrt{\sum_{i=1}^{k} C_i^{2} S_i^{2} / W_i}$$

t Value

$$t = V / SE$$

Degrees of Freedom (Brownlee 1965)

$$df = \frac{\left(\sum_{i=1}^{k} C_i^{2} S_i^{2} / W_i\right)^{2}}{\sum_{i=1}^{k} \left(C_i^{2} S_i^{2} / W_i\right)^{2} / (W_i - 1)}$$

Two-tailed significance level based on the t distribution with df degrees of freedom
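The contrast statistics can be sketched as follows. This is an illustrative reading of the formulas in this section, with a hypothetical function name `contrast_statistics`. The significance levels are omitted because they require t distribution tail probabilities from a statistics library.

```python
import math


def contrast_statistics(coefs, means, sds, weights, pooled_sd):
    """Contrast value with pooled-variance and separate-variance statistics."""
    value = sum(c * m for c, m in zip(coefs, means))

    # Pooled-variance standard error: S_p * sqrt(sum C_i^2 / W_i).
    se_pooled = pooled_sd * math.sqrt(sum(c ** 2 / w for c, w in zip(coefs, weights)))

    # Separate-variance standard error: sqrt(sum C_i^2 S_i^2 / W_i).
    terms = [c ** 2 * s ** 2 / w for c, s, w in zip(coefs, sds, weights)]
    se_separate = math.sqrt(sum(terms))

    # Approximate degrees of freedom for the separate-variance t statistic.
    df_separate = sum(terms) ** 2 / sum(t ** 2 / (w - 1) for t, w in zip(terms, weights))

    return {
        "value": value,
        "t_pooled": value / se_pooled,
        "t_separate": value / se_separate,
        "df_separate": df_separate,
    }


result = contrast_statistics([1, -1], [5.0, 3.0], [2.0, 2.0], [4, 4], pooled_sd=2.0)
print(result)
```

With two groups of equal size and equal standard deviations, the separate-variance degrees of freedom equal the pooled value $W - k'$, which is a quick sanity check on the formula.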

Polynomial Contrasts (Speed 1976)

If the specified degree of the polynomial (NP) is less than or equal to 0, or greater than 5, a message is printed and the procedure is terminated. If the degree of the polynomial specified is greater than the number of nonempty groups, it is set to k ′ − 1. If the sums of the weights in each group are equal, only the WEIGHTED contrasts will be generated. For unequal sample sizes with equal spacing between groups, both WEIGHTED and UNWEIGHTED contrasts are computed. For

The F statistic for the qth degree contrast is computed as

$$F = \frac{\left[\sum_{i=1}^{k} \left(\bar{T}_i - \bar{G}\right) c_{i,q}\right]^{2}}{\left(\sum_{i=1}^{k} c_{i,q}^{2} / W_i\right) WSSM}$$

where WSSM is the mean square within. The significance level is obtained from the F distribution with 1 and W − k′ degrees of freedom.

WEIGHTED Contrasts and Statistics (Emerson 1968; Robson 1959)

The contrast for the qth degree polynomial component is computed from the following recursive relations:

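$$d_{i,q} = \left(i - A'_q\right) d_{i,q-1} - C'_q\, d_{i,q-2}$$

for $q = 1, \ldots, NP$ and $i = 1, \ldots, k$, with initial values

$$d_{i,-1} = 0, \qquad d_{i,0} = 1$$

$$A'_q = \frac{\sum_{i=1}^{k} i\, W_i\, d_{i,q-1}^{2}}{\sum_{i=1}^{k} W_i\, d_{i,q-1}^{2}}$$

$$C'_q = \begin{cases} \dfrac{\sum_{i=1}^{k} i\, W_i\, d_{i,q-1}\, d_{i,q-2}}{\sum_{i=1}^{k} W_i\, d_{i,q-2}^{2}} & \text{for } q \ge 2 \\ 0 & \text{for } q = 1 \end{cases}$$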
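The recursive relations can be sketched as a three-term recurrence over the group index. The sketch assumes the initial values $d_{i,-1} = 0$ and $d_{i,0} = 1$ and equally spaced levels $1, \ldots, k$. The function name `weighted_poly_contrasts` is hypothetical.

```python
def weighted_poly_contrasts(weights, degree):
    """Weighted orthogonal polynomial contrast coefficients d_{i,q}, q = 1..degree."""
    k = len(weights)
    levels = range(1, k + 1)
    d_prev2 = [0.0] * k   # d_{i,q-2}, starting from the assumed d_{i,-1} = 0
    d_prev1 = [1.0] * k   # d_{i,q-1}, starting from the assumed d_{i,0} = 1
    contrasts = []
    for q in range(1, degree + 1):
        a_q = (sum(i * w * d * d for i, w, d in zip(levels, weights, d_prev1))
               / sum(w * d * d for w, d in zip(weights, d_prev1)))
        if q == 1:
            c_q = 0.0
        else:
            c_q = (sum(i * w * d1 * d2
                       for i, w, d1, d2 in zip(levels, weights, d_prev1, d_prev2))
                   / sum(w * d2 * d2 for w, d2 in zip(weights, d_prev2)))
        d_cur = [(i - a_q) * d1 - c_q * d2
                 for i, d1, d2 in zip(levels, d_prev1, d_prev2)]
        contrasts.append(d_cur)
        d_prev2, d_prev1 = d_prev1, d_cur
    return contrasts


linear, quadratic = weighted_poly_contrasts([1, 1, 1], degree=2)
print(linear, quadratic)
```

With equal weights and three groups, the linear and quadratic coefficients are proportional to the textbook values (−1, 0, 1) and (1, −2, 1). With unequal weights, the coefficients of different degrees are orthogonal with respect to the weights.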

The test for the contribution of the qth degree orthogonal polynomial component is based on

$$F = D_q / WSSM$$

The appropriate ranges for the tests are

SNK: $R_r = S_{r,f}$, $r = 2, \ldots, k'$

TUKEY: $R_r = S_{k',f}$

TUKEYB: $R_r = \left(S_{r,f} + S_{k',f}\right) / 2$

For the DUNCAN procedure, alphas of 0.01, 0.05, and 0.10 can be used. The ranges ($D_{r,f}$) are generated using the algorithm of Gebhardt (1966).

DUNCAN: $R_r = D_{r,f}$, $r = 2, \ldots, k'$

The Scheffé, LSD, and modified LSD procedures all use critical points from the F distribution. Any $\alpha \le 0.5$ can be used.

SCHEFFE: $R_r = \sqrt{2 (k' - 1)\, F_{1-\alpha}(k' - 1, f)}$

LSD: $R_r = \sqrt{2\, F_{1-\alpha}(1, f)}$

MODLSD: $R_r = \sqrt{2\, F_{1-\alpha'}(1, f)}$

where

$$\alpha' = \frac{2\alpha}{k'(k' - 1)}$$

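Compute the multiplier of the ranges for the difference of means $i$ and $j$.

$$M_{i,j} = S_p \sqrt{\frac{1}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \quad \text{(default)}$$

$$M_{i,j} = S_p \sqrt{\frac{1}{k} \sum_{l=1}^{k} \frac{1}{n_l}} \quad \text{(harmonic mean for all groups)}$$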
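The two multipliers can be sketched as below, assuming the default form uses the average of the two reciprocal sample sizes and the alternative uses the harmonic mean of all group sizes. Both function names are hypothetical.

```python
import math


def range_multiplier(pooled_sd, n_i, n_j):
    """Default multiplier for the difference of means i and j."""
    return pooled_sd * math.sqrt(0.5 * (1.0 / n_i + 1.0 / n_j))


def range_multiplier_harmonic(pooled_sd, sizes):
    """Multiplier based on the harmonic mean of all group sizes."""
    # sum(1/n_l) / k is the reciprocal of the harmonic mean of the sizes.
    return pooled_sd * math.sqrt(sum(1.0 / n for n in sizes) / len(sizes))


print(range_multiplier(2.0, 4, 4))
print(range_multiplier_harmonic(2.0, [4, 4, 4]))
```

When all groups have the same size the two multipliers coincide, so the choice only matters for unbalanced designs.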

Establishment of Homogeneous Subsets

If the sample sizes in all groups are equal, or the harmonic mean for all groups has been selected, or the multiple comparison procedure is SNK or DUNCAN, homogeneous subsets are established as follows:

The means are sorted into ascending order from $\bar{T}_{(1)}$ to $\bar{T}_{(k')}$. Values of $i$ and $q$ such that

$$\bar{T}_{(q)} - \bar{T}_{(i)} \le R_{q-i+1}\, M_{q,i} \qquad (*)$$

are systematically searched for and

$$\left\{\bar{T}_{(i)}, \ldots, \bar{T}_{(q)}\right\}$$

is considered a homogeneous subset. The search procedure is as follows: At each step $t$, the value of $i$ is incremented by 1 (the starting value is 1), and $q = k'$. The value of $q$ is then decremented by one until $(*)$ is true. Call this value $q_t$. If $q_t > q_{t-1}$ and $(*)$ is true,

$$\left\{\bar{T}_{(i)}, \ldots, \bar{T}_{(q_t)}\right\}$$

is considered homogeneous. Otherwise $i$ is incremented and the next step is done. The procedure terminates when $i = k$ or $q_t = k$.
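One possible reading of the search procedure is sketched below. It is not the original implementation. The names `homogeneous_subsets`, `range_for`, and `multiplier` are hypothetical, the multiplier is taken as a single constant (as in the equal-size or harmonic-mean case), a single mean is treated as trivially homogeneous, and the comparison is made against the largest $q$ found so far so that subsets contained in an earlier one are not reported again.

```python
def homogeneous_subsets(sorted_means, range_for, multiplier):
    """Search for homogeneous subsets among means sorted in ascending order.

    range_for(r) returns the range R_r for a span of r means;
    multiplier is the constant M applied to every range.
    """
    k = len(sorted_means)
    subsets = []
    largest_q = 0  # largest q found so far (1-based); 0 means none yet
    for i in range(1, k + 1):
        q = k
        # Decrement q until the span from mean i to mean q is within range.
        while q > i and (sorted_means[q - 1] - sorted_means[i - 1]
                         > range_for(q - i + 1) * multiplier):
            q -= 1
        if q > largest_q:
            subsets.append(sorted_means[i - 1:q])
            largest_q = q
        if q == k:
            break  # the largest mean has been reached
    return subsets


# Constant range (as for TUKEY) with made-up numbers for illustration.
print(homogeneous_subsets([1.0, 2.0, 10.0, 11.0], lambda r: 3.0, 1.0))
```

In the illustration the two low means and the two high means form separate subsets because every span that crosses the gap exceeds the range.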

Since the weight used in the Welch statistic is $\omega_l = W_l / S_l^2$, one cannot compute the statistic if any one group has zero standard deviation. Moreover, sample sizes of all groups have to be greater than or equal to zero.

Brown-Forsythe Test

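In Brown and Forsythe (1974a, 1974b), a test statistic for equal means was proposed. The statistic has the following form:

$$F_{BF} = \frac{\sum_{l=1}^{k} W_l \left(\bar{T}_l - \bar{G}\right)^{2}}{\sum_{l=1}^{k} \left(1 - W_l / W\right) S_l^{2}}$$

The statistic has an approximate F distribution with $(k - 1)$ and $f$ degrees of freedom, where

$$\frac{1}{f} = \sum_{l=1}^{k} \frac{c_l^{2}}{W_l - 1}$$

and

$$c_l = \frac{\left(1 - W_l / W\right) S_l^{2}}{\sum_{l=1}^{k} \left(1 - W_l / W\right) S_l^{2}}$$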
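A minimal sketch of the Brown-Forsythe statistic and its denominator degrees of freedom follows. It assumes the grand mean is the weighted mean of the group means, and the function name `brown_forsythe` is hypothetical.

```python
def brown_forsythe(weights, means, sds):
    """Brown-Forsythe statistic with its numerator and denominator degrees of freedom."""
    k = len(weights)
    w_total = sum(weights)
    # Assumption: grand mean = weighted mean of the group means (G / W).
    grand_mean = sum(w * m for w, m in zip(weights, means)) / w_total
    numerator = sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means))
    parts = [(1.0 - w / w_total) * s ** 2 for w, s in zip(weights, sds)]
    denominator = sum(parts)
    c = [p / denominator for p in parts]
    # 1 / f = sum of c_l^2 / (W_l - 1)
    f_df = 1.0 / sum(cl ** 2 / (w - 1) for cl, w in zip(c, weights))
    return numerator / denominator, k - 1, f_df


stat, df_num, df_den = brown_forsythe([3, 3], [1.0, 3.0], [1.0, 1.0])
print(stat, df_num, df_den)
```

For two groups of equal size and equal standard deviation, the statistic and the denominator degrees of freedom agree with the usual ANOVA F test, which is a convenient check.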

When we look at the denominator of $F_{BF}$, we can see that it tries to estimate the 'pooled variance' by

$$S_{pool}^{2} = \sum_{l=1}^{k} \omega_l^{*} S_l^{2}$$

where

$$\omega_l^{*} = \frac{W - W_l}{W (k - 1)}$$

We can see that the weight of the $i$-th group decreases as its sample size increases. Moreover, the weight tends to 'average out' the differences in sample sizes. This can be seen in the following example.

n_i      10     20     30     40     50
ω_i*    .233   .217   .200   .183   .167
n_i/N   .067   .133   .200   .267   .333
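The weights in the example can be reproduced with a few lines of Python, assuming $\omega_l^{*} = (W - W_l) / (W (k - 1))$ and unit case weights so that $W_l = n_l$. The function name `brown_forsythe_weights` is hypothetical.

```python
def brown_forsythe_weights(sizes):
    """Weights applied to the group variances in the pooled-variance estimate."""
    total = sum(sizes)
    k = len(sizes)
    return [(total - n) / (total * (k - 1)) for n in sizes]


sizes = [10, 20, 30, 40, 50]
bf_weights = brown_forsythe_weights(sizes)
shares = [n / sum(sizes) for n in sizes]
for n, w, s in zip(sizes, bf_weights, shares):
    # Columns: sample size, Brown-Forsythe weight, share of the total sample.
    print(f"{n:3d}  {w:.3f}  {s:.3f}")
```

The weights sum to 1 and vary much less across groups than the sample-size shares do.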

With such a weighting scheme, the pooled variance will not be dominated by a group with a very large sample size, which prevents over- or underestimation of the pooled variance. Under the assumption of unequal variances, $S_{pool}^{2}$ no longer has a chi-square distribution. However, by using Satterthwaite's method (1946), one can show that $S_{pool}^{2}$ (with proper scaling) has an approximate chi-square distribution with $f$ degrees of freedom. The Brown-Forsythe statistic cannot be computed if all groups have zero standard deviation or any group has a sample size less than or equal to 1. If some groups have zero standard deviations, the statistic can be computed but the approximation may not work.

References

Brown, M. B., and Forsythe, A. B. 1974a. The small sample behavior of some statistics which test the equality of several means. Technometrics, 16: 129-

Brown, M. B., and Forsythe, A. B. 1974b. Robust tests for the equality of variances. Journal of the American Statistical Association, 69: 364–367.

Brownlee, K. A. 1965. Statistical theory and methodology in science and engineering. New York: John Wiley & Sons, Inc.

Duncan, D. B. 1955. Multiple range and multiple F tests. Biometrics, 11: 1–42.

Eisenhart, C. 1947. Significance of the largest of a set of sample estimates of variance. In: Techniques of Statistical Analysis, C. Eisenhart, M. W. Hastay, and N. A. Wallis, eds. New York: McGraw-Hill.