























An overview of nonparametric methods for testing location parameters, specifically the Kruskal-Wallis and Wilcoxon rank sum tests. The notes cover the purpose and assumptions of these tests, their test statistics, hypotheses, and decision rules. The Kruskal-Wallis test is designed to deal with non-normality and is most sensitive to population distributions that are equal up to a location shift. The Wilcoxon rank sum test, also known as the Mann-Whitney test, assumes samples are taken from two independent, continuous populations and tests for median differences. Examples and computational details are included.
Typology: Study notes
























BTRY 6010 & ILRST 6100
Nonparametric methods
Assumptions of the Kruskal-Wallis (KW) test:
Independent observations (within and between groups).
Data are continuous & from independent populations.

The test is designed to:
deal with non-normality by converting data to ranks;
be most sensitive to population distributions that are equal up to a location shift (same except location, e.g., medians).

$H_0$: All population distributions are identical. (Hence there is equality of means, variances, everything!)
$H_a$: All distributions are not identical, and at least one tends to give larger observations than at least one other.
The KW test asks (without assuming normality!): Can the differences among the medians shown in these boxplots be reasonably explained as random variation from identical distributions?
Test details:
The actual test

Sort $y_{ij}$ for $j = 1,\dots,n_i$, $i = 1,\dots,K$ into increasing order.
For $j = 1,\dots,n_i$, $i = 1,\dots,K$: define $R_{ij}$ = rank assigned to $y_{ij}$.
Compute the average rank for group $i$:

$$\bar{T}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} R_{ij}$$

Basic test statistic (assumes no tied data values):

$$H = \frac{12}{n_{tot}(n_{tot}+1)} \sum_{i=1}^{K} n_i \left( \bar{T}_i - \frac{n_{tot}+1}{2} \right)^2$$

$H$ compares the average group rank to the value that we expect to observe under $H_0$, namely $(n_{tot}+1)/2$.

Computationally convenient form:

$$H = \frac{12}{n_{tot}(n_{tot}+1)} \sum_{i=1}^{K} n_i \bar{T}_i^2 - 3(n_{tot}+1)$$
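The ranking steps and the two forms of the statistic can be sketched in a few lines of Python. This is only an illustration under the no-ties assumption stated above: the function names and the small data set are hypothetical and do not come from the course materials.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) upward; all values must be distinct."""
    ordered = sorted(values)
    if len(set(ordered)) != len(ordered):
        raise ValueError("tied values: the basic statistic assumes no ties")
    position = {v: r for r, v in enumerate(ordered, start=1)}
    return [position[v] for v in values]


def kruskal_wallis_h(groups):
    """Return (basic form, convenient form) of H for a list of samples."""
    pooled = [y for g in groups for y in g]
    n_tot = len(pooled)
    pooled_ranks = ranks_no_ties(pooled)

    # Split the pooled ranks back into groups and average them.
    sizes, mean_ranks, start = [], [], 0
    for g in groups:
        group_ranks = pooled_ranks[start:start + len(g)]
        sizes.append(len(g))
        mean_ranks.append(sum(group_ranks) / len(g))
        start += len(g)

    scale = 12.0 / (n_tot * (n_tot + 1))
    expected = (n_tot + 1) / 2.0  # average rank expected under H0
    h_basic = scale * sum(n * (t - expected) ** 2
                          for n, t in zip(sizes, mean_ranks))
    h_conv = scale * sum(n * t ** 2
                         for n, t in zip(sizes, mean_ranks)) - 3 * (n_tot + 1)
    return h_basic, h_conv


# Hypothetical data: three groups of three observations, no ties.
groups = [[1.2, 3.4, 2.2], [4.1, 5.0, 3.9], [6.3, 7.7, 5.8]]
h_basic, h_conv = kruskal_wallis_h(groups)
print(round(h_basic, 4), round(h_conv, 4))  # both forms give 7.2
```

Here the average group ranks are 2, 5, and 8 against an expected value of 5, so both forms agree at $H = 7.2$, as the algebra says they must.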
$H_0$: The $K$ groups (i.e., populations) are identical.
$H_a$: All distributions are not identical, and at least one tends to give larger observations than at least one other. (A simple version: at least one population has a different median than the others.)
A manufacturer of concrete bridge supports is interested in determining the effect of varying the sand content on the strength of the supports. Five supports are made for each of five different amounts of sand in the concrete mix and each is tested for compression resistance.

Percent Sand
15%   20%   25%   30%   35%
  7    17    14    20     7
  7    12    18    24    10
 10    11    18    22    11
 15    18    19    19    15
  9    19    19    23    11
Standard ANOVA: $K = 5$, $n_{tot} = 25$
[ANOVA table: SSB, SSE, SST]
There is strong evidence against $H_0$: the p-value is very small (much less than $\alpha = 0.05$). Hence we conclude that the mean degree of resistance is not the same for the five levels of sand.
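Because the numeric entries of the ANOVA table did not survive, the sums of squares can be recomputed from the data. The sketch below assumes the strength values are read column-wise from the Percent Sand table, five supports per sand level, with the row that appears twice in the extracted text counted once. The function name is hypothetical.

```python
# Compression resistance by sand percentage, read column-wise from the
# Percent Sand table (assumption: the repeated row is a duplicate).
sand = {
    "15%": [7, 7, 10, 15, 9],
    "20%": [17, 12, 11, 18, 19],
    "25%": [14, 18, 18, 19, 19],
    "30%": [20, 24, 22, 19, 23],
    "35%": [7, 10, 11, 15, 11],
}


def one_way_anova(groups):
    """Return (SSB, SSE, SST, F) for a list of samples."""
    all_obs = [y for g in groups for y in g]
    n_tot, k = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n_tot
    means = [sum(g) / len(g) for g in groups]

    # Between-group, within-group, and total sums of squares.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    sst = sum((y - grand_mean) ** 2 for y in all_obs)

    f_stat = (ssb / (k - 1)) / (sse / (n_tot - k))
    return ssb, sse, sst, f_stat


ssb, sse, sst, f_stat = one_way_anova(list(sand.values()))
print(f"SSB={ssb:.1f}  SSE={sse:.1f}  SST={sst:.1f}  F={f_stat:.2f}")
# SSB=486.4  SSE=163.6  SST=650.0  F=14.87
```

With 4 and 20 degrees of freedom, an F statistic near 14.9 lies far above the tabled 5% critical value of about 2.87, which agrees with the very small p-value reported on the slide.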
Three independent samples of ministers selected from three specific religious denominations (Catholic, Methodist, and Pentecostal).
Response variable Y = mental illness awareness score
Are there differences between the distributions of these scores across the three denominations?
[Figure: distribution of the residual scores with a normal quantile plot]
Shapiro-Wilk: p = 0.
Levene’s Test for Minister data:
The p-value of Levene’s test > $\alpha = 0.05$.
This suggests that we cannot rule out the equal variances assumption.
Only an approximate p-value can be obtained when using chi-square table (Table 8 in O&L):
H = 3.2532 < 4.
p-value > 0.
Note: The standard ANOVA that assumes normality leads to a p-value of 0.1683. The ANOVA F test and KW test nearly always lead to different quantitative results but frequently come to the same conclusion, especially when variances are comparable and normality is not severely violated.
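A table only brackets the p-value, but for this example the chi-square approximation can be evaluated directly. With $K = 3$ denominations the reference distribution has $K - 1 = 2$ degrees of freedom, and for even degrees of freedom the upper-tail probability has a closed form. The sketch below is an illustration of that calculation, not a reproduction of the slide: the helper name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive, even df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


H = 3.2532          # KW statistic reported for the minister data
K = 3               # three denominations
p_value = chi2_upper_tail_even_df(H, K - 1)
print(round(p_value, 4))  # about 0.1966

# The tabled upper 10% point for 2 degrees of freedom is about 4.605.
print(round(chi2_upper_tail_even_df(4.605, 2), 3))  # about 0.1
```

The approximate p-value of about 0.20 leads to the same conclusion as the ANOVA p-value of 0.1683: no significant difference at $\alpha = 0.05$.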
The Wilcoxon rank sum (WRS) test is equivalent to the Mann-Whitney test, so it is also called MWW.
Assumes samples are taken from two independent, continuous populations (possibly non-normal).
$H_0$: The two population distributions are identical. (So means, variances, & every other feature are equal.)
$H_a$: Distributions are identical except location shift. (Example: median or $Q_1$ or $Q_3$ differs.)
($H_a$, another form: Not equal. Too broad to be useful!)
Intuition for location shift $H_a$: WRS tests for median differences.
WRS is appropriate for comparing skewed or heavy-tailed distributions. (ANOVA is not!)
Standard form of the WRS test

Compute the rank sum test statistic, labeling as “population 1” the population that has the smaller sample size.

Hypotheses:
$H_0$: The two populations are identical.
Three possibilities for $H_a$ (one-sided & two-sided):
$H_a$: Population 1 is shifted to the right of population 2.
$H_a$: Population 1 is shifted to the left of population 2.
$H_a$: Populations 1 & 2 have different location parameters.

In practice: we rarely use the one-sided version of this test. As suggested on Slide 12, the two-sided version of the test is equivalent to the KW test (for the case of $K$ = 2 groups).
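The equivalence with the KW test for two groups can be checked numerically. The sketch below assumes no ties and uses the usual large-sample normal approximation for the rank sum of sample 1, which is not shown on the slide. The function name and the data are hypothetical. With two groups, the KW statistic equals the square of the standardized rank sum.

```python
import math


def rank_sum_test(sample1, sample2):
    """Return (rank sum of sample1, normal-approximation z, KW H)."""
    pooled = sorted(sample1 + sample2)
    if len(set(pooled)) != len(pooled):
        raise ValueError("tied values: this sketch assumes no ties")
    rank = {v: r for r, v in enumerate(pooled, start=1)}

    n1, n2 = len(sample1), len(sample2)
    n_tot = n1 + n2
    t1 = sum(rank[v] for v in sample1)          # rank sum, population 1
    t2 = n_tot * (n_tot + 1) / 2.0 - t1          # rank sum, population 2

    # Mean and variance of the rank sum under H0 (no ties).
    mean_t1 = n1 * (n_tot + 1) / 2.0
    var_t1 = n1 * n2 * (n_tot + 1) / 12.0
    z = (t1 - mean_t1) / math.sqrt(var_t1)

    # Kruskal-Wallis statistic for the same two groups.
    h = (12.0 / (n_tot * (n_tot + 1))) * (t1 ** 2 / n1 + t2 ** 2 / n2) \
        - 3 * (n_tot + 1)
    return t1, z, h


# Hypothetical data; population 1 is the smaller sample.
sample1 = [1.1, 2.3, 4.0]
sample2 = [3.2, 5.6, 6.1, 7.4]
t1, z, h = rank_sum_test(sample1, sample2)
print(t1, round(z, 4), round(h, 4), round(z ** 2, 4))
# 7 -1.7678 3.125 3.125
```

Because $H = z^2$, referring $H$ to a chi-square distribution with 1 degree of freedom gives the same p-value as the two-sided normal approximation for the rank sum.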
[Table: the raw data]
[Table: the data, now sorted on logPPM, with a column of ranks added in]
$n_{background} = 7$, $n_{site} = 7$
Notice: Ranks are the same whether we rank items on PPM or logPPM.
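The ranks agree because the logarithm is strictly increasing, so it preserves the ordering of positive values. A minimal check, using made-up PPM values rather than the course data:

```python
import math


def ranks(values):
    """Rank distinct values from 1 (smallest) to n (largest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for r, i in enumerate(order, start=1):
        out[i] = r
    return out


# Hypothetical positive concentrations (not the data from the slides).
ppm = [12.0, 3.5, 150.0, 48.0, 0.9, 7.2, 20.0]
log_ppm = [math.log(v) for v in ppm]

print(ranks(ppm))      # [4, 2, 7, 6, 1, 3, 5]
print(ranks(log_ppm))  # [4, 2, 7, 6, 1, 3, 5]
```

Any rank-based statistic therefore takes the same value on either scale, which is one reason rank tests are insensitive to skewness that a log transformation would remove.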