













If a WEIGHT variable is specified, it is used to replicate a case as many times as indicated by the weight value rounded to the nearest integer. If the workspace requirements are exceeded and sampling has been selected, a random sample of cases is chosen for analysis using the algorithm described in SAMPLE. For the RUNS test, if sampling is specified, it is ignored. The tests are described in Siegel (1956).
Note: Before SPSS version 10.1.3, the WEIGHT variable was used to replicate a case as many times as indicated by the integer portion of the weight value. The case then had a probability of additional inclusion equal to the fractional portion of the weight value.
If the (lo, hi) specification is used, each integer value in the lo to hi range is designated a cell. Otherwise, each distinct value encountered is considered a cell.
If (lo, hi) has been selected, every observed value is truncated to an integer and, if it is in the lo to hi range, it is included in the frequency count for the corresponding cell. Otherwise, a count of the frequency of occurrence of the distinct values is obtained.
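If none or EQUAL is specified, the expected frequency of each cell is

$$E_i = \frac{N}{k} = \frac{\text{number of observations [in range]}}{\text{number of cells}}$$

If expected values $E'_i$ are supplied by the user, for example as proportions,

$$E_i = \frac{E'_i N}{\sum_{j=1}^{k} E'_j}$$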
If there are cells with expected values less than 5, the number of such cells and the minimum expected value are printed. If the number of user-supplied expected frequencies is not equal to the number of cells generated, or if an expected value is less than or equal to zero, the test terminates with an error message.
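The test statistic and its degrees of freedom are

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad df = k - 1$$

where $O_i$ and $E_i$ are the observed and expected frequencies of cell $i$.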
The significance level is from the chi-square distribution with k − 1 degrees of freedom.
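The cell construction, expected frequencies, and statistic described above can be sketched as follows. This is a minimal illustration, not the SPSS implementation: the function name `chi_square_one_sample` and its parameters are hypothetical, case weights are ignored, and `lo` and `hi` are assumed to be integers.

```python
import math
from collections import Counter


def chi_square_one_sample(values, expected=None, lo=None, hi=None):
    """Return (chi_square, df, expected_counts) for a one-sample chi-square test.

    `expected` (hypothetical parameter) holds user-supplied expected values,
    one per cell, in ascending cell order. Weights are not handled.
    """
    if lo is not None and hi is not None:
        # Truncate each value to an integer and keep those inside [lo, hi].
        truncated = [math.trunc(v) for v in values]
        kept = [v for v in truncated if lo <= v <= hi]
        cells = list(range(lo, hi + 1))
    else:
        # Every distinct value encountered is its own cell.
        kept = list(values)
        cells = sorted(set(kept))
    counts = Counter(kept)
    observed = [counts.get(c, 0) for c in cells]
    n, k = len(kept), len(cells)
    if expected is None:
        exp = [n / k] * k  # equal expected frequencies
    else:
        if len(expected) != k or any(e <= 0 for e in expected):
            raise ValueError("need one positive expected value per cell")
        total = sum(expected)
        exp = [e * n / total for e in expected]  # rescale so they sum to N
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, exp))
    return chi2, k - 1, exp
```

Calling `chi_square_one_sample(data, lo=1, hi=3)` treats 1, 2, and 3 as cells even if one of them never occurs, which is why a cell's observed count can be zero in that mode.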
For Poisson, the mean is

$$\lambda = \frac{1}{N} \sum_{i=1}^{N} X_i$$
The test is not done if, for the uniform, not all data are within the user-specified range or, for the Poisson, the data are not all non-negative integers. If the variance of the normal or the mean of the Poisson is zero, the test is also not done.
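For Uniform

$$F_0(X_i) = \frac{X_i - \min}{\max - \min}$$

For Poisson

$$F_0(X_i) = \sum_{l=0}^{X_i} \frac{e^{-\lambda} \lambda^{l}}{l!}$$

For Normal

$$F_0(X_i) = \Phi\left(\frac{X_i - \bar{X}}{S}\right)$$

where $\Phi$ is the standard normal cdf and $\bar{X}$ and $S$ are the mean and standard deviation of the normal distribution.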
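The three theoretical cumulative distribution functions can be sketched with the standard library alone. The function names are hypothetical, and the normal cdf is computed through `math.erf`, which is one common way to evaluate it rather than necessarily the routine SPSS uses.

```python
import math


def uniform_cdf(x, lo, hi):
    """Uniform cdf on [lo, hi]; x is assumed to lie inside the range."""
    return (x - lo) / (hi - lo)


def poisson_cdf(x, lam):
    """Sum of e^(-lam) * lam^l / l! for l = 0..x, for a non-negative integer x."""
    if x < 0:
        return 0.0
    term = math.exp(-lam)  # the l = 0 term
    total = term
    for l in range(1, int(x) + 1):
        term *= lam / l  # build each term from the previous one
        total += term
    return total


def normal_cdf(x, mean, sd):
    """Normal cdf evaluated through the error function."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Building each Poisson term from the previous one avoids computing large factorials directly.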
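For the Uniform and Normal, two differences are computed:

$$D_i = \hat{F}(X_{i-1}) - F_0(X_i)$$

$$\tilde{D}_i = \hat{F}(X_i) - F_0(X_i), \qquad i = 1, \ldots, N$$

For the Poisson:¹

$$D_i = \begin{cases} \hat{F}(X_{i-1}) - F_0(X_i - 1) & X_i > 0 \\ 0 & X_i = 0 \end{cases}$$

$$\tilde{D}_i = \hat{F}(X_i) - F_0(X_i), \qquad i = 1, \ldots, N$$

where $\hat{F}$ is the empirical cdf of the sorted observations and $F_0$ is the theoretical cdf.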
The maximum positive, negative, and absolute differences are printed.
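For the uniform and normal cases, the two differences can be computed in one pass over the sorted data. The sketch below is an illustration under stated assumptions: the names `ks_differences` and `ks_extremes` are hypothetical, tied observations are grouped so that the index runs over distinct values, and the empirical cdf just below the smallest value is taken as 0.

```python
def ks_differences(data, cdf):
    """Return a list of (D_i, D~_i) pairs, one per distinct sorted value.

    `cdf` is any callable giving the theoretical cdf F0(x).
    """
    xs = sorted(data)
    n = len(xs)
    diffs = []
    i = 0
    while i < n:
        j = i
        while j < n and xs[j] == xs[i]:
            j += 1  # j ends one past the last copy of this value
        f0 = cdf(xs[i])
        d = i / n - f0        # empirical cdf just before this value, minus F0
        d_tilde = j / n - f0  # empirical cdf at this value, minus F0
        diffs.append((d, d_tilde))
        i = j
    return diffs


def ks_extremes(diffs):
    """Return (most positive, most negative, largest absolute) difference."""
    flat = [v for pair in diffs for v in pair]
    return max(flat), min(flat), max(abs(v) for v in flat)
```

Passing `lambda x: x` as `cdf` tests the data against a uniform distribution on the unit interval.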
The test statistic is

$$Z = \sqrt{N}\, \max_i \left( |D_i|, |\tilde{D}_i| \right)$$
The two-tailed probability level is estimated using the first three terms of the Smirnov (1948) formula.
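$$p = \begin{cases} 1 & \text{if } 0 \le Z < 0.27 \\ 1 - \dfrac{2.506628}{Z} \left( Q + Q^{9} + Q^{25} \right) & \text{if } 0.27 \le Z < 1 \end{cases}$$

where $Q = e^{-1.233701\, Z^{-2}}$.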
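A sketch of the statistic and its two-tailed probability follows. The small-Z branch uses the three-term expression with $Q = e^{-1.233701 Z^{-2}}$ given here. The branch for Z of 1 or more is not shown in this text, so the code falls back on the standard alternating Kolmogorov series with $Q = e^{-2Z^2}$; that branch, the 0.27 cutoff, and the function names are assumptions of this example.

```python
import math


def ks_z(n, max_abs_diff):
    """Z = sqrt(N) times the largest absolute difference."""
    return math.sqrt(n) * max_abs_diff


def smirnov_two_tailed_p(z):
    """Approximate two-tailed probability for the Kolmogorov-Smirnov Z."""
    if z < 0.27:
        # Assumed cutoff below which the probability is reported as 1.
        return 1.0
    if z < 1.0:
        q = math.exp(-1.233701 / (z * z))
        return 1.0 - (2.506628 / z) * (q + q ** 9 + q ** 25)
    # Assumption: standard alternating series for larger Z (not from the text).
    q = math.exp(-2.0 * z * z)
    return max(0.0, 2.0 * (q - q ** 4 + q ** 9 - q ** 16))
```

The two branches agree closely at Z = 1, which is a useful sanity check on the constants 2.506628 (the square root of 2π) and 1.233701 (π² / 8).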
¹ This algorithm applies to SPSS 7.0 and later releases.
For each of the data points, in the sequence in the file, the difference
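$$D_i = X_i - \text{CUTPOINT}$$

is computed. If $D_i \ge 0$, the difference is considered positive, otherwise negative. The number of times the sign changes, that is, $D_i \ge 0$ and $D_{i+1} < 0$, or $D_i < 0$ and $D_{i+1} \ge 0$, as well as the number of positive $(n_p)$ and negative $(n_a)$ signs, are determined. The number of runs $R$ is the number of sign changes plus one, and its sampling distribution is approximately normal with

$$\mu_r = \frac{2 n_p n_a}{n_p + n_a} + 1$$

$$\sigma_r = \sqrt{\frac{2 n_p n_a \left( 2 n_p n_a - n_a - n_p \right)}{\left( n_p + n_a \right)^2 \left( n_p + n_a - 1 \right)}}$$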
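The two-sided significance level is based on

$$Z = \frac{R - \mu_r}{\sigma_r}$$

unless $n < 50$; then

$$Z_c = \begin{cases} \dfrac{R - \mu_r + 0.5}{\sigma_r} & \text{if } R - \mu_r \le -0.5 \\ \dfrac{R - \mu_r - 0.5}{\sigma_r} & \text{if } R - \mu_r \ge 0.5 \\ 0 & \text{if } |R - \mu_r| < 0.5 \end{cases}$$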
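The run count and its normal approximation can be sketched as below. The function name `runs_test` is hypothetical, and the code assumes that n in the continuity-correction rule is the total number of cases, which the text does not state explicitly.

```python
import math


def runs_test(data, cutpoint):
    """Return (runs, n_p, n_a, z, two_sided_p) for a one-sample runs test."""
    signs = [x - cutpoint >= 0 for x in data]  # True means a positive difference
    n_p = sum(signs)
    n_a = len(signs) - n_p
    if n_p == 0 or n_a == 0:
        raise ValueError("all cases fall on one side of the cutpoint")
    # Number of runs = number of sign changes plus one.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n_p + n_a
    mu = 2.0 * n_p * n_a / n + 1.0
    var = 2.0 * n_p * n_a * (2.0 * n_p * n_a - n_a - n_p) / (n ** 2 * (n - 1))
    sigma = math.sqrt(var)
    diff = runs - mu
    if n < 50:  # assumption: n is the total number of cases
        if diff <= -0.5:
            z = (diff + 0.5) / sigma
        elif diff >= 0.5:
            z = (diff - 0.5) / sigma
        else:
            z = 0.0
    else:
        z = diff / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal probability
    return runs, n_p, n_a, z, p
```

A strictly alternating sequence such as `[1, 5, 2, 6, 3, 7]` with cutpoint 4 gives the maximum possible number of runs for its length.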
$n_1$: number of observations in category 1
$n_2$: number of observations in category 2
$p$: test probability
$N = n_1 + n_2$
$m = \min(n_1, n_2)$
$p^* = p$ if $m = n_1$, $1 - p$ if $m = n_2$
Two-tailed exact probability is

$$2 \left( \sum_{i=0}^{m} \binom{N}{i} (p^*)^{i} (1 - p^*)^{N-i} \right) - \binom{N}{m} (p^*)^{m} (1 - p^*)^{N-m}$$
If an approximate probability is reported, the following algorithm is used:
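$$Z_1 = \frac{n_1 + 0.5 - Np}{\sqrt{Np(1 - p)}}$$

$$Z_2 = \frac{n_1 - 0.5 - Np}{\sqrt{Np(1 - p)}}$$

where $P(Z_i)$ denotes the probability that a standard normal variate is less than or equal to $Z_i$. Then,

$$P(X \le n_1) = P(Z_1)$$

$$P(X \ge n_1) = 1 - P(Z_2)$$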
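and the two-tailed approximate probability is twice the smaller of these two one-tailed probabilities, $2 \min\left( P(Z_1),\, 1 - P(Z_2) \right)$.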
If $n_p + n_n \le 25$, the exact probability of $r$ or fewer “successes” occurring in $n_p + n_n$ trials, when $p = 0.5$ and $r = \min(n_p, n_n)$, is calculated recursively from the binomial

$$p(X \le r) = \sum_{i=0}^{r} \binom{n_p + n_n}{i} (0.5)^{n_p + n_n}$$

If $n_p + n_n > 25$, the significance level is based on the normal approximation

$$Z_c = \frac{\max(n_p, n_n) - 0.5\,(n_p + n_n) - 0.5}{0.5 \sqrt{n_p + n_n}}$$

Here $n_p$ and $n_n$ are the numbers of positive and negative differences.
A two-tailed significance level is printed.
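Both branches of the sign test can be sketched in a few lines. The function name `sign_test` is hypothetical. The code assumes that pairs with equal values are dropped and that the two-tailed exact level is twice the one-tailed binomial sum (capped at 1), and it uses `math.comb` directly instead of the recursive evaluation mentioned in the text.

```python
import math


def sign_test(x, y):
    """Return (n_p, n_n, two_tailed_p) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # assumption: ties dropped
    n_p = sum(d > 0 for d in diffs)
    n_n = len(diffs) - n_p
    n = n_p + n_n
    if n <= 25:
        r = min(n_p, n_n)
        # P(X <= r) for a binomial with n trials and p = 0.5.
        one_tail = sum(math.comb(n, i) for i in range(r + 1)) * 0.5 ** n
        return n_p, n_n, min(1.0, 2.0 * one_tail)  # assumption: doubled tail
    z = (max(n_p, n_n) - 0.5 * n - 0.5) / (0.5 * math.sqrt(n))
    return n_p, n_n, math.erfc(abs(z) / math.sqrt(2.0))
```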
For each case, the difference
$$D_i = X_i - Y_i$$

is computed, as well as the absolute value of $D_i$. All nonzero absolute differences are then sorted into ascending order, and ranks are assigned. In the case of ties, the average rank is used. The sums of the ranks corresponding to positive differences $(S_p)$ and negative differences $(S_n)$ are calculated. The average positive rank is

$$\bar{X}_p = \frac{S_p}{n_p}$$

and the average negative rank is

$$\bar{X}_n = \frac{S_n}{n_n}$$

where $n_p$ is the number of cases with positive differences and $n_n$ the number with negative differences.
The test statistic is²

$$Z = \frac{\min(S_p, S_n) - n(n+1)/4}{\sqrt{\dfrac{n(n+1)(2n+1)}{24} - \displaystyle\sum_{j=1}^{l} \frac{t_j^3 - t_j}{48}}}$$

where

$n$: number of cases with non-zero differences
$l$: number of ties
$t_j$: number of elements in the $j$-th tie, $j = 1, \ldots, l$
For large sample sizes the distribution of Z is approximately standard normal. A two-tailed probability level is printed.
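The ranking with average ranks for ties and the tie-corrected statistic can be sketched as follows. The function name `wilcoxon_signed_rank` is hypothetical, and the code assumes the usual variance of n(n+1)(2n+1)/24 reduced by the sum of (t³ − t)/48 over tied sets.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (S_p, S_n, Z) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # nonzero differences only
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and abs(diffs[order[j]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for idx in order[i:j]:
            ranks[idx] = avg
        t = j - i
        tie_term += t ** 3 - t  # zero for untied values
        i = j
    s_p = sum(r for r, d in zip(ranks, diffs) if d > 0)
    s_n = sum(r for r, d in zip(ranks, diffs) if d < 0)
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (min(s_p, s_n) - n * (n + 1) / 4.0) / math.sqrt(var)
    return s_p, s_n, z
```

Because the numerator uses the smaller rank sum, Z from this sketch is never positive; the two-tailed level depends only on its magnitude.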
For each of the $N$ cases, the $k$ variables specified may take on only one of two possible values. If more than two values, or only one, are encountered, a message is printed and the test is not done. The first value encountered is designated a “success”, and for each case the number of variables that are “successes” is counted.
² This algorithm applies to SPSS 7.0 and later releases.
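The Friedman test statistic is

$$\chi^2 = \frac{\dfrac{12}{Nk(k+1)} \displaystyle\sum_{l=1}^{k} C_l^2 - 3N(k+1)}{1 - \dfrac{\sum T}{Nk\left( k^2 - 1 \right)}}$$

where $C_l$ is the sum of ranks for variable $l$ and $\sum T$ is the same as in Kendall’s coefficient of concordance (see Lehmann, 1985, p. 265).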
$N$, $k$, and $l$ are the same as in Friedman, above. The coefficient of concordance is

$$W = \frac{12 \displaystyle\sum_{l=1}^{k} C_l^2 - 3 N^2 k (k+1)^2}{N^2 k \left( k^2 - 1 \right) - N \sum T}$$

where

$$\sum T = \sum_{i=1}^{N} \sum_{l=1}^{k} \left( t^3 - t \right)$$

with $t$ = number of variables tied at each tied rank for each case.
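The Friedman statistic and Kendall's coefficient of concordance share the same rank sums and tie term, so they can be computed together. The sketch below assumes the standard tie-corrected forms, with values ranked within each case and the sum of t³ − t accumulated over every tied set in every case; the function names are hypothetical.

```python
def average_ranks(row):
    """Rank one case's values, averaging ranks for ties.

    Returns (ranks, sum of t^3 - t over tied sets in this case).
    """
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    tie_sum = 0.0
    i = 0
    while i < len(row):
        j = i
        while j < len(row) and row[order[j]] == row[order[i]]:
            j += 1
        for idx in order[i:j]:
            ranks[idx] = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i
        tie_sum += t ** 3 - t
        i = j
    return ranks, tie_sum


def friedman_and_kendall(data):
    """data: list of N cases, each a list of k values. Returns (chi2, W)."""
    n, k = len(data), len(data[0])
    col_sums = [0.0] * k  # C_l: sum of ranks for variable l
    total_t = 0.0
    for row in data:
        ranks, tie_sum = average_ranks(row)
        total_t += tie_sum
        for l, r in enumerate(ranks):
            col_sums[l] += r
    sum_c2 = sum(c * c for c in col_sums)
    chi2 = (12.0 / (n * k * (k + 1)) * sum_c2 - 3.0 * n * (k + 1)) / (
        1.0 - total_t / (n * k * (k * k - 1)))
    w = (12.0 * sum_c2 - 3.0 * n * n * k * (k + 1) ** 2) / (
        n * n * k * (k * k - 1) - n * total_t)
    return chi2, w
```

With these forms the two statistics satisfy χ² = N(k − 1)W, which gives a quick consistency check on any implementation.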
⁴ This algorithm applies to SPSS 7.0 and later releases.
If the median value is not specified by the user, the combined data from both samples are sorted and the median calculated.
$$Md = \begin{cases} \left( X_{N/2} + X_{N/2+1} \right) / 2 & \text{if } N \text{ is even} \\ X_{(N+1)/2} & \text{otherwise} \end{cases}$$

where $X_N$ is the largest value and $X_1$ the smallest. The number of cases in each of the two groups that exceed the median is counted. These counts will be denoted $g_1$ and $g_2$, and the corresponding sample sizes $n_1$ and $n_2$.
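The chi-square statistic is

$$\chi_c^2 = \frac{\left( \left| g_1 (n_2 - g_2) - g_2 (n_1 - g_1) \right| - N/2 \right)^2 N}{(g_1 + g_2)\,(n_1 + n_2 - g_1 - g_2)\, n_1 n_2}$$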
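For the Mann-Whitney statistic $U$, the test statistic corrected for ties is

$$Z = \frac{U - n_1 n_2 / 2}{\sqrt{\dfrac{n_1 n_2}{N(N-1)} \left( \dfrac{N^3 - N}{12} - \displaystyle\sum_i T_i \right)}}$$

which is distributed approximately as a standard normal. Here $N = n_1 + n_2$ and $T_i = (t_i^3 - t_i)/12$, where $t_i$ is the number of observations in the $i$-th set of ties. A two-tailed significance level is printed.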
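For each of the two groups separately, the data are sorted into ascending order, from $X_1$ to $X_{n_i}$, and the empirical cdf for group $i$ is computed as

$$\hat{F}_i(X) = \begin{cases} 0 & -\infty < X < X_1 \\ j / n_i & X_j \le X < X_{j+1} \\ 1 & X_{n_i} \le X < \infty \end{cases}$$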
For all of the $X_j$ values in the two groups, the difference between the two groups is

$$D_j = \hat{F}_1(X_j) - \hat{F}_2(X_j)$$

where $\hat{F}_1(X_j)$ is the cdf for the group with the larger sample size. The maximum positive, negative, and absolute differences are also computed.
The test statistic (Smirnov, 1948) is

$$Z = \max_j \left| D_j \right| \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
and the significance level is calculated using the Smirnov approximation described in the K-S one sample test.
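The two empirical cdfs and the resulting statistic can be sketched with `bisect`. The function name `ks_two_sample_z` is hypothetical, and the differences are evaluated at every distinct pooled value. When the groups are the same size, the first argument is treated as the larger group, which is an arbitrary choice of this example.

```python
import math
from bisect import bisect_right


def ks_two_sample_z(sample_a, sample_b):
    """Return (max positive diff, max negative diff, Z) for two samples."""
    # The first cdf belongs to the group with the larger sample size.
    if len(sample_a) >= len(sample_b):
        first, second = sorted(sample_a), sorted(sample_b)
    else:
        first, second = sorted(sample_b), sorted(sample_a)
    n1, n2 = len(first), len(second)
    diffs = []
    for x in sorted(set(first) | set(second)):
        f1 = bisect_right(first, x) / n1   # share of group 1 at or below x
        f2 = bisect_right(second, x) / n2  # share of group 2 at or below x
        diffs.append(f1 - f2)
    max_abs = max(abs(d) for d in diffs)
    z = max_abs * math.sqrt(n1 * n2 / (n1 + n2))
    return max(diffs), min(diffs), z
```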
All observations from the two samples are pooled and sorted into ascending order. The number of changes in the group numbers corresponding to the ordered data are counted. The number of runs ( R ) is the number of group changes plus one. If there are ties involving observations from the two groups, both the minimum and maximum number of runs possible are calculated.
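If $n_1 + n_2$, the total sample size, is less than or equal to 30, the one-sided significance level is exactly calculated from

$$P(r \le R) = \frac{2}{\binom{n_1 + n_2}{n_1}} \sum_{r=2}^{R} \binom{n_1 - 1}{r/2 - 1} \binom{n_2 - 1}{r/2 - 1}$$

when $R$ is even. When $R$ is odd

$$P(r \le R) = \frac{1}{\binom{n_1 + n_2}{n_1}} \sum_{r=2}^{R} \left[ \binom{n_1 - 1}{k - 1} \binom{n_2 - 1}{k - 2} + \binom{n_1 - 1}{k - 2} \binom{n_2 - 1}{k - 1} \right]$$

where

$$r = 2k - 1.$$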
For sample sizes greater than 30, the normal approximation is used (see RUNS test described previously).
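The exact run-count probabilities can be sketched with `math.comb`. This example makes an interpretive assumption: the probability of each individual run count r uses the even-r term when r is even and the odd-r term (with r = 2k − 1) when r is odd, and the cumulative probability sums those terms over every r from 2 to R. The function names are hypothetical, and both groups are assumed to be non-empty.

```python
from math import comb


def runs_probability(r, n1, n2):
    """P(exactly r runs) for n1 and n2 pooled observations, r >= 2."""
    total = comb(n1 + n2, n1)
    if r % 2 == 0:
        k = r // 2
        return 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1) / total
    k = (r + 1) // 2  # r = 2k - 1
    return (comb(n1 - 1, k - 1) * comb(n2 - 1, k - 2)
            + comb(n1 - 1, k - 2) * comb(n2 - 1, k - 1)) / total


def runs_cdf(big_r, n1, n2):
    """One-sided level P(r <= R): sum the exact terms for r = 2..R."""
    return sum(runs_probability(r, n1, n2) for r in range(2, big_r + 1))


def count_runs(labels):
    """Number of runs in a sequence of group labels ordered by data value."""
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)
```

Summing `runs_probability` over every attainable run count gives 1, which is a convenient check on the binomial terms.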
If the median value is not specified by the user, the combined data from all groups are sorted and the median is calculated.
$$Md = \begin{cases} \left( X_{N/2} + X_{N/2+1} \right) / 2 & \text{if } N \text{ is even} \\ X_{(N+1)/2} & \text{if } N \text{ is odd} \end{cases}$$

where $X_N$ is the largest value and $X_1$ the smallest. The number of cases in each of the groups that exceed the median is counted and the following table is formed.

Group     1       2       3       ...   k
LE Md     O_11    O_12    O_13    ...   O_1k    R_1
GT Md     O_21    O_22    O_23    ...   O_2k    R_2
          n_1     n_2     n_3     ...   n_k     N
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{\left( O_{ij} - E_{ij} \right)^2}{E_{ij}}$$

where

$$E_{ij} = \frac{R_i n_j}{N}$$
The significance level is from the chi-square distribution with $k - 1$ degrees of freedom, where $k$ is the number of nonempty groups. A message is printed if any cell has an expected value less than one, or more than 20% of the cells have expected values less than five.
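The 2 × k table and its chi-square can be sketched as below. The function name `k_sample_median_chi2` is hypothetical, the median is passed in rather than computed, and the code assumes at least one case on each side of the median so that no expected value is zero.

```python
def k_sample_median_chi2(groups, md):
    """Return (chi_square, df, smallest expected value) for the median test."""
    groups = [g for g in groups if len(g) > 0]  # nonempty groups only
    k = len(groups)
    n_j = [len(g) for g in groups]
    observed = [
        [sum(v <= md for v in g) for g in groups],  # row 1: LE Md
        [sum(v > md for v in g) for g in groups],   # row 2: GT Md
    ]
    row_totals = [sum(row) for row in observed]
    big_n = sum(n_j)
    chi2 = 0.0
    min_expected = float("inf")
    for i in range(2):
        for j in range(k):
            e = row_totals[i] * n_j[j] / big_n  # E_ij = R_i * n_j / N
            min_expected = min(min_expected, e)
            chi2 += (observed[i][j] - e) ** 2 / e
    return chi2, k - 1, min_expected
```

The smallest expected value is returned so that a caller can apply the warning rules about small expected values.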
Observations from all $k$ nonempty groups are jointly sorted and ranked, with the average rank being assigned in the case of ties. The number of tied scores in a set of ties, $t_i$, is also found, and the sum of $T_i = t_i^3 - t_i$ is accumulated. For each group the sum of ranks, $R_i$, as well as the number of observations, $n_i$, is obtained.
The test statistic unadjusted for ties is

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

where $N$ is the total number of observations. Adjusted for ties, the statistic is

$$H' = \frac{H}{1 - \sum_{i=1}^{m} T_i / \left( N^3 - N \right)}$$
where $m$ is the total number of tied sets. The significance level is based on the chi-square distribution with $k - 1$ degrees of freedom.
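The joint ranking, rank sums, and tie adjustment can be sketched in one function. The name `kruskal_wallis` is hypothetical, and the code assumes the standard Kruskal-Wallis form, with the adjusted statistic obtained by dividing by one minus the sum of the tie terms over N³ − N.

```python
def kruskal_wallis(groups):
    """Return (H, tie-adjusted H, df) for a list of groups of observations."""
    groups = [g for g in groups if len(g) > 0]  # nonempty groups only
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    big_n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_total = 0.0
    i = 0
    while i < big_n:
        j = i
        while j < big_n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for _, gi in pooled[i:j]:
            rank_sums[gi] += avg
        t = j - i
        tie_total += t ** 3 - t  # T_i; zero when the value is untied
        i = j
    h = 12.0 / (big_n * (big_n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)) - 3.0 * (big_n + 1)
    h_adj = h / (1.0 - tie_total / (big_n ** 3 - big_n))
    return h, h_adj, len(groups) - 1
```

If every observation is tied, the adjustment divides by zero, so a caller would need to treat that case separately.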