Mathematics study material (type: manuals, projects, research), 2021
Shared on Docsity on 16/09/2021 by thiago-leitao-ribeir
Published on STAT 414 / 415 (https://onlinecourses.science.psu.edu/stat414)

The Wilcoxon Signed Rank Test for a Median

Developed in 1945 by the statistician Frank Wilcoxon, the signed rank test was one of the first "nonparametric" procedures developed. It is considered a nonparametric procedure because we make only two simple assumptions about the underlying distribution of the data, namely that:

(1) the random variable $X$ is continuous
(2) the probability density function of $X$ is symmetric

Then, upon taking a random sample $X_1, X_2, \ldots, X_n$, we are interested in testing the null hypothesis:

$$H_0 : m = m_0$$

against any of the possible alternative hypotheses:

$$H_A : m > m_0 \quad \text{or} \quad H_A : m < m_0 \quad \text{or} \quad H_A : m \neq m_0$$

As we often do, let's motivate the procedure by way of example.

Example

Let $X_i$ denote the length, in centimeters, of a randomly selected pygmy sunfish, $i = 1, 2, \ldots, 10$. If we obtain the following data set:

5.0 3.9 5.2 5.5 2.8 6.1 6.4 2.6 1.7 4.3

can we conclude that the median length of pygmy sunfish differs significantly from 3.7 centimeters?

Solution. We are interested in testing the null hypothesis $H_0 : m = 3.7$ against the alternative hypothesis $H_A : m \neq 3.7$. In general, the Wilcoxon signed rank test procedure requires five steps. We'll introduce each of the steps as we apply them to the data in this example.

Step #1. In general, calculate $X_i - m_0$ for $i = 1, 2, \ldots, n$. In this case, we have to calculate $X_i - 3.7$ for $i = 1, 2, \ldots, 10$.
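As a minimal sketch of Step #1, the differences for the ten sunfish lengths can be computed directly. The variable names (`lengths`, `m0`, `differences`) are illustrative choices and not part of the original lesson; rounding to one decimal is an assumption made only to hide floating-point noise.

```python
# Sunfish lengths (cm) from the example and the hypothesized median m0.
lengths = [5.0, 3.9, 5.2, 5.5, 2.8, 6.1, 6.4, 2.6, 1.7, 4.3]
m0 = 3.7

# Step 1: signed differences X_i - m0.
# Rounding to one decimal place only suppresses floating-point noise.
differences = [round(x - m0, 1) for x in lengths]

# Print each observation next to its signed difference.
for x, d in zip(lengths, differences):
    print(f"{x:4.1f}  {d:+5.1f}")
```

Seven of the ten differences are positive and three are negative, which is what later determines which ranks contribute to the test statistic.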

Step #2. In general, calculate the absolute value of $X_i - m_0$, that is, $|X_i - m_0|$ for $i = 1, 2, \ldots, n$. In this case, we have to calculate $|X_i - 3.7|$ for $i = 1, 2, \ldots, 10$.

Step #3. Determine the rank $R_i$, $i = 1, 2, \ldots, n$, of the absolute values (in ascending order) according to their magnitude. In this case, the value of 0.2 is the smallest, so it gets rank 1. The value of 0.6 is the next smallest, so it gets rank 2. We continue ranking the data in this way until we have assigned a rank to each of the data values.
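Steps #2 and #3 can be sketched in a few lines. This is an illustration under the assumption that there are no tied absolute differences, which holds for the sunfish data; the names `abs_diffs`, `order`, and `ranks` are hypothetical.

```python
lengths = [5.0, 3.9, 5.2, 5.5, 2.8, 6.1, 6.4, 2.6, 1.7, 4.3]
m0 = 3.7

# Step 2: absolute differences |X_i - m0| (rounded to suppress float noise).
abs_diffs = [round(abs(x - m0), 1) for x in lengths]

# Step 3: rank the absolute differences in ascending order (1 = smallest).
# This simple ranking assumes no ties, which is true for this data set.
order = sorted(range(len(abs_diffs)), key=lambda i: abs_diffs[i])
ranks = [0] * len(abs_diffs)
for rank, i in enumerate(order, start=1):
    ranks[i] = rank

for a, r in zip(abs_diffs, ranks):
    print(f"{a:4.1f}  rank {r}")
```

The smallest absolute difference, 0.2, receives rank 1 and the next smallest, 0.6, receives rank 2, matching the description in Step #3.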

Step #4. Determine the value of $W$, the sum of the ranks that belong to positive differences:

$$W = \sum_{i=1}^{n} Z_i R_i$$

where $Z_i = 1$ if $X_i - m_0$ is positive and $Z_i = 0$ if $X_i - m_0$ is negative.

Step #5. Determine if the observed value of $W$ is extreme in light of the assumed value of the median under the null hypothesis. That is, calculate the P-value associated with $W$, and make a decision about whether to reject or not to reject. Whoa, nellie! We're going to have to take a break from this example before we can finish, as we first have to learn something about the distribution of $W$.

The Distribution of W

As is always the case, in order to find the distribution of the discrete random variable $W$, we need:

(1) to find the range of possible values of $W$, that is, we need to specify the support of $W$
(2) to determine the probability that $W$ takes on each of the values in the support

Let's tackle the support of $W$ first. Well, the smallest that $W$ could be is 0. That would happen if each observation $X_i$ fell below the value of the median $m_0$ specified in the null hypothesis, thereby causing $Z_i = 0$ for $i = 1, 2, \ldots, n$:

$$W = \sum_{i=1}^{n} Z_i R_i = 0$$

The largest that $W$ could be is $n(n+1)/2$. That would happen if each observation fell above the value of the median $m_0$ specified in the null hypothesis, thereby causing $Z_i = 1$ for $i = 1, 2, \ldots, n$:

$$W = \sum_{i=1}^{n} Z_i R_i = \sum_{i=1}^{n} R_i$$

and therefore $W$ reduces to the sum of the integers from 1 to $n$:

$$W = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$$

So, in summary, $W$ is a discrete random variable whose support ranges between 0 and $n(n+1)/2$.

Now, if we have a small sample size $n$, such as we do in the above example, we could use the exact probability distribution of $W$ to calculate the P-values for our hypothesis tests. Errr.... first we have to determine the exact probability distribution of $W$. Doing so is very doable. It just takes some thinking and perhaps a bit of tedious work. Let's make our discussion concrete by considering a very small sample size, $n = 3$, say. In that case, the possible values of $W$ are the integers 0, 1, 2, 3, 4, 5, 6. Now, each of the three data points would be assigned a rank $R_i$ of either 1, 2, or 3, and depending on whether the data point fell above or below the hypothesized median $m_0$, each of the three possible ranks 1, 2, or 3 would remain either a positive signed rank or become a negative signed rank. In this case, because we are considering such a small sample size, we can easily enumerate each of the possible outcomes, as well as the sum $W$ of the positive ranks, to see how each arrangement results in one of the possible values of $W$.

There we have it. We're just about done with finding the exact probability distribution of $W$ when $n = 3$. All we have to do is recognize that under the null hypothesis, each of the eight possible sign arrangements is equally likely. Therefore, we can use the classical approach to assigning the probabilities. That is:

P(W = 0) = 1/8, because there is only one way that W = 0
P(W = 1) = 1/8, because there is only one way that W = 1
P(W = 2) = 1/8, because there is only one way that W = 2
P(W = 3) = 2/8, because there are two ways that W = 3
P(W = 4) = 1/8, because there is only one way that W = 4
P(W = 5) = 1/8, because there is only one way that W = 5
P(W = 6) = 1/8, because there is only one way that W = 6

And, just to make sure that we haven't made an error in our calculations, we can verify that the sum of the probabilities over the support 0, 1, ..., 6 is indeed 1/8 + 1/8 + ... + 1/8 = 1.

Hmmm. That was easy enough. Let's do the same thing for a sample size of $n = 4$. Well, in that case, the possible values of $W$ are the integers 0, 1, 2, ..., 10. Now, each of the four data points would be assigned a rank $R_i$ of either 1, 2, 3, or 4, and depending on whether the data point fell above or below the hypothesized median $m_0$, each of the four possible ranks 1, 2, 3, or 4 would remain either a positive signed rank or become a negative signed rank. Again, because we are considering such a small sample size, we can easily enumerate each of the possible outcomes, as well as the sum $W$ of the positive ranks, to see how each arrangement results in one of the possible values of $W$.

Again, under the null hypothesis, each of the 16 possible arrangements is equally likely, so we can use the classical approach to assigning the probabilities:

P(W = 0) = 1/16, because there is only one way that W = 0
P(W = 1) = 1/16, because there is only one way that W = 1
P(W = 2) = 1/16, because there is only one way that W = 2
P(W = 3) = 2/16, because there are two ways that W = 3
and so on...
P(W = 9) = 1/16, because there is only one way that W = 9
P(W = 10) = 1/16, because there is only one way that W = 10

Do you want to do the calculation for the case where $n = 5$? The enumeration then involves $2^5 = 32$ equally likely arrangements.
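The enumeration argument for small samples can be sketched in code. The function name `exact_w_distribution` is hypothetical, and the sketch assumes there are no ties, so the ranks are exactly 1, 2, ..., n. It treats each of the $2^n$ assignments of $Z_i \in \{0, 1\}$ as equally likely, as the null hypothesis implies.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def exact_w_distribution(n):
    """Exact null distribution of W for ranks 1..n (no ties assumed).

    Every assignment of Z_i in {0, 1} is treated as equally likely, so each
    of the 2**n arrangements has probability 1 / 2**n.
    """
    counts = Counter()
    for signs in product((0, 1), repeat=n):
        # W is the sum of the ranks whose indicator Z_i equals 1.
        w = sum(z * rank for z, rank in zip(signs, range(1, n + 1)))
        counts[w] += 1
    total = 2 ** n
    return {w: Fraction(c, total) for w, c in sorted(counts.items())}


# Print the distributions discussed in the text (fractions appear reduced).
for n in (3, 4, 5):
    dist = exact_w_distribution(n)
    print(n, {w: str(p) for w, p in dist.items()})
```

Brute-force enumeration doubles in cost with every extra observation, so it is only a teaching device for very small $n$; it is not how the lesson's exact tables were necessarily produced.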

Theorem. When the null hypothesis is true, for large $n$:

$$W' = \frac{\sum_{i=1}^{n} Z_i R_i - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$

follows an approximate standard normal distribution $N(0, 1)$.

Proof. Because the Central Limit Theorem is at work here, the approximate standard normal distribution part of the theorem is trivial. Our proof therefore reduces to showing that the mean and variance of $W$ are:

$$E(W) = \frac{n(n+1)}{4} \quad \text{and} \quad Var(W) = \frac{n(n+1)(2n+1)}{24}$$

respectively. To find $E(W)$ and $Var(W)$, note that $W = \sum_{i=1}^{n} Z_i R_i$ has the same distribution as $U = \sum_{i=1}^{n} U_i$ where:

$$U_i = \begin{cases} 0 & \text{with probability } 1/2 \\ i & \text{with probability } 1/2 \end{cases}$$

In case that claim was less than obvious, consider this intuitive, hand-waving kind of argument:

$W$ and $U$ are both sums of a subset of the numbers 1, 2, ..., $n$.
Under symmetry, an equally likely chance of getting assigned either a + or a − is equivalent to having an equally likely chance of being included in the sum or not.

At any rate, we therefore have:

$$E(W) = E(U) = \sum_{i=1}^{n} E(U_i)$$

and:

$$Var(W) = Var(U) = \sum_{i=1}^{n} Var(U_i)$$

because the $U_i$'s are independent under the null hypothesis. Now:

$$E(U_i) = 0\left(\frac{1}{2}\right) + i\left(\frac{1}{2}\right) = \frac{i}{2}$$

and:

$$Var(U_i) = E(U_i^2) - [E(U_i)]^2 = \left[0^2\left(\frac{1}{2}\right) + i^2\left(\frac{1}{2}\right)\right] - \left(\frac{i}{2}\right)^2 = \frac{i^2}{2} - \frac{i^2}{4} = \frac{i^2}{4}$$

and therefore:

$$E(W) = \sum_{i=1}^{n} \frac{i}{2} = \frac{1}{2} \cdot \frac{n(n+1)}{2} = \frac{n(n+1)}{4}$$

$$Var(W) = \sum_{i=1}^{n} \frac{i^2}{4} = \frac{1}{4} \times \frac{n(n+1)(2n+1)}{6} = \frac{n(n+1)(2n+1)}{24}$$

Therefore, in summary, under the null hypothesis, we have that:

$$W' = \frac{\sum_{i=1}^{n} Z_i R_i - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$

follows an approximate standard normal distribution, as was to be proved.
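The mean and variance in the theorem can be checked numerically, and the standardized statistic $W'$ can be evaluated with the standard library alone. The sketch below is only an illustration: all function names are hypothetical, the brute-force check covers only small $n$, and the inputs $n = 30$ and $W \le 200$ are borrowed from the large-sample example at the end of this lesson, where a half-unit continuity correction is used.

```python
import math
from itertools import product


def w_mean(n):
    """E(W) = n(n+1)/4 under the null hypothesis."""
    return n * (n + 1) / 4


def w_variance(n):
    """Var(W) = n(n+1)(2n+1)/24 under the null hypothesis."""
    return n * (n + 1) * (2 * n + 1) / 24


def brute_force_moments(n):
    """Mean and variance of W over all 2**n equally likely sign arrangements."""
    values = [
        sum(z * rank for z, rank in zip(signs, range(1, n + 1)))
        for signs in product((0, 1), repeat=n)
    ]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def standard_normal_cdf(z):
    """Phi(z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lower_tail_z(w_obs, n):
    """Standardized W for P(W <= w_obs), with a half-unit continuity correction."""
    return (w_obs + 0.5 - w_mean(n)) / math.sqrt(w_variance(n))


# The closed-form moments agree with brute-force enumeration for small n.
for n in range(1, 9):
    mean, var = brute_force_moments(n)
    assert math.isclose(mean, w_mean(n))
    assert math.isclose(var, w_variance(n))

# Two-sided normal-approximation P-value for n = 30 and W <= 200.
z = lower_tail_z(200, 30)
p_two_sided = 2 * standard_normal_cdf(z)
print(round(z, 4), round(p_two_sided, 4))
```

The continuity correction adds 0.5 because a lower-tail probability $P(W \le w)$ for an integer-valued statistic is being approximated by a continuous curve; for an upper tail the correction would be subtracted instead.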

Let's return to our example now to complete our work.

Example (continued)

Let $X_i$ denote the length of a randomly selected pygmy sunfish, $i = 1, 2, \ldots, 10$. If we obtain the following data set:

5.0 3.9 5.2 5.5 2.8 6.1 6.4 2.6 1.7 4.3

can we conclude that the median length of pygmy sunfish differs significantly from 3.7 centimeters?

Solution. Recall that we are interested in testing the null hypothesis $H_0 : m = 3.7$ against the alternative hypothesis $H_A : m \neq 3.7$. The last time we worked on this example, we got as far as determining that $W = 40$ for the given data set. Now, we just have to use what we know about the distribution of $W$ to complete our hypothesis test. Well, in this case, with $n = 10$, our sample size is fairly small so we can use the exact distribution of $W$. From the upper and lower percentiles of the Wilcoxon signed rank statistic when $n = 10$, the tail probability is $P(W \ge 40) = 0.116$.

Therefore, our P-value is 2 × 0.116 = 0.232. Because our P-value is large, we cannot reject the null hypothesis. There is insufficient evidence at the 0.05 level to conclude that the median length of pygmy sunfish differs significantly from 3.7 centimeters.

Notes

A couple of notes are worth mentioning before we take a look at another example:

(1) Our textbook authors define $W$ as the sum of all of the ranks, as opposed to just the sum of the positive ranks:

$$W = \sum_{i=1}^{n} R_i$$

That is perfectly fine, but not the most typical way of defining $W$.
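Going back to the sunfish example, the observed statistic and its exact P-value can be reproduced with a short sketch. It assumes no ties, defines $W$ as the sum of the positive ranks (the lesson's definition, not the textbook's), and counts subsets of the ranks with a small dynamic program in place of a printed table. The function names are hypothetical.

```python
lengths = [5.0, 3.9, 5.2, 5.5, 2.8, 6.1, 6.4, 2.6, 1.7, 4.3]
m0 = 3.7

# Signed differences, then ranks of their absolute values (no ties assumed).
diffs = [round(x - m0, 1) for x in lengths]
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
ranks = [0] * len(diffs)
for rank, i in enumerate(order, start=1):
    ranks[i] = rank

# W: the sum of the ranks attached to positive differences.
w_obs = sum(r for d, r in zip(diffs, ranks) if d > 0)


def count_subsets_by_sum(n):
    """counts[w] = number of subsets of {1, ..., n} whose elements sum to w."""
    max_w = n * (n + 1) // 2
    counts = [0] * (max_w + 1)
    counts[0] = 1  # the empty subset
    for rank in range(1, n + 1):
        # Iterate downward so each rank is used at most once per subset.
        for w in range(max_w, rank - 1, -1):
            counts[w] += counts[w - rank]
    return counts


def exact_two_sided_p(w, n):
    """Two-sided exact P-value: twice the smaller tail, capped at 1."""
    counts = count_subsets_by_sum(n)
    total = 2 ** n
    upper = sum(counts[w:]) / total       # P(W >= w)
    lower = sum(counts[: w + 1]) / total  # P(W <= w)
    return min(1.0, 2 * min(upper, lower))


n = len(lengths)
upper_tail = sum(count_subsets_by_sum(n)[w_obs:]) / 2 ** n
p_value = exact_two_sided_p(w_obs, n)
print(w_obs, round(upper_tail, 3), round(p_value, 3))
```

Counting subsets by their sum avoids listing all $2^{10} = 1024$ sign arrangements explicitly, which keeps the same idea practical for somewhat larger samples.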

Because we have a large sample ($n = 30$), we can use the normal approximation to the distribution of $W$. In this case, our P-value is defined as two times the probability that $W \le 200$. Therefore, using a half-unit correction for continuity, our transformed signed rank statistic is:

$$W' = \frac{200.5 - \frac{30(31)}{4}}{\sqrt{\frac{30(31)(61)}{24}}} = -0.6581$$

Therefore, upon using a normal probability calculator (or table), we get that our P-value is:

$$P \approx 2 \times P(W' < -0.66) = 2(0.2546)$$

Because our P-value is large, we cannot reject the null hypothesis. There is insufficient evidence at the 0.05 level to conclude that the median age of the onset of diabetes differs significantly from 45 years.

By the way, we can even be lazier and let Minitab do all of the calculation work for us: under the Stat menu, we select Nonparametrics, and then 1-Sample Wilcoxon.

Source URL: https://onlinecourses.science.psu.edu/stat414/node/319

Links:
[1] https://onlinecourses.science.psu.edu/stat414/sites/onlinecourses.science.psu.edu.stat414/files/lesson48/ExactW_Table.pdf