Fisher Exact Test - Mathematics and Statistics - Study Notes, Study notes of Mathematical Statistics

In this study material file, you will learn about: Fisher Exact Test, Significance Levels, Computations, Background, algorithm, Table Rearrangement, One-Tailed

Typology: Study notes

2011/2012

Uploaded on 10/31/2012

sangawar



Appendix 5: Significance Levels for Fisher’s Exact Test

The procedure described in this appendix is used to calculate the exact one-tailed and two-tailed significance levels of Fisher’s exact test for a 2 × 2 table under the assumption of independence of rows and columns and conditional on the marginal totals. All cell counts are rounded to the nearest integer.

Background

Consider the following observed 2 × 2 table:

$$
\begin{array}{cc|c}
n_1 & n_2 & n_1 + n_2 \\
n_3 & n_4 & n_3 + n_4 \\
\hline
n_1 + n_3 & n_2 + n_4 & N
\end{array}
$$

Conditional on the observed marginal totals, the values of the four cell counts can be expressed in terms of the observed count of the first cell, $n_1$, only. Under the hypothesis of independence, the count of the first cell, $N_1$, follows a hypergeometric distribution, with the probability of $N_1 = n_1$ given by

$$
\mathrm{Prob}(N_1 = n_1) = \frac{(n_1 + n_2)!\,(n_3 + n_4)!\,(n_1 + n_3)!\,(n_2 + n_4)!}{N!\,n_1!\,n_2!\,n_3!\,n_4!}
$$

where $N_1$ ranges from $\max(0,\, n_1 - n_4)$ to $\min(n_1 + n_2,\, n_1 + n_3)$ and $N = n_1 + n_2 + n_3 + n_4$. The exact one-tailed significance level $p_1$ is defined as

$$
p_1 =
\begin{cases}
\mathrm{Prob}(N_1 \ge n_1) & \text{if } n_1 > E(N_1) \\
\mathrm{Prob}(N_1 \le n_1) & \text{if } n_1 \le E(N_1)
\end{cases}
$$

where $E(N_1) = (n_1 + n_2)(n_1 + n_3) / N$. The exact two-tailed significance level $p_2$ is defined as the sum of the one-tailed significance level $p_1$ and the probabilities of all points on the other side of the sample space of $N_1$ whose probabilities are not greater than the probability of $N_1 = n_1$.

Note: This algorithm applies to SPSS 6.1.2 and later releases.
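The definitions of $p_1$ and $p_2$ can be evaluated directly by enumerating the feasible values of $N_1$. The following Python sketch does this with exact fractions, so that the "not greater than" comparison is free of floating-point ties. The function names are illustrative and not part of SPSS, and the sketch assumes $N > 0$.

```python
from fractions import Fraction
from math import comb


def first_cell_distribution(n1, n2, n3, n4):
    """Map each feasible first-cell count k to Prob(N1 = k), as exact fractions."""
    row1, col1, total = n1 + n2, n1 + n3, n1 + n2 + n3 + n4
    lo, hi = max(0, n1 - n4), min(row1, col1)
    denom = comb(total, row1)
    # C(col1, k) * C(total - col1, row1 - k) / C(total, row1) equals the
    # factorial formula in the text when k = n1.
    return {k: Fraction(comb(col1, k) * comb(total - col1, row1 - k), denom)
            for k in range(lo, hi + 1)}


def fisher_by_definition(n1, n2, n3, n4):
    """Return (p1, p2) straight from the definitions; assumes N > 0."""
    probs = first_cell_distribution(n1, n2, n3, n4)
    total = n1 + n2 + n3 + n4
    expected = Fraction((n1 + n2) * (n1 + n3), total)
    if n1 > expected:
        same_side = [k for k in probs if k >= n1]
    else:
        same_side = [k for k in probs if k <= n1]
    p1 = sum(probs[k] for k in same_side)
    # Other-side points count only if they are no more probable than n1.
    other = sum(p for k, p in probs.items()
                if k not in same_side and p <= probs[n1])
    return float(p1), float(p1 + other)
```

For the table with $n_1 = 8$, $n_2 = 2$, $n_3 = 1$, $n_4 = 5$, `fisher_by_definition(8, 2, 1, 5)` gives $p_1 = 196/8008 \approx 0.0245$ and $p_2 = 280/8008 \approx 0.0350$.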

Computations

To begin the computation of the two significance levels $p_1$ and $p_2$, the counts in the observed 2 × 2 table are rearranged. Then the exact one-tailed and two-tailed significance levels are computed using the CDF.HYPER cumulative distribution function.

Table Rearrangement

The following steps are used to rearrange the table:

  1. Check whether $n_1 > E(N_1)$, which can be done by checking whether $n_1 n_4 > n_2 n_3$. If so, rearrange the table so that the first cell contains the minimum of $n_2$ and $n_3$, maintaining the row and column totals; otherwise, rearrange the table so that the first cell contains the minimum of $n_1$ and $n_4$, again maintaining the row and column totals.
  2. Without loss of generality, we assume that the count of the first cell is $n_1$ after the above rearrangement. Calculate the first row total, the first column total, and the overall total, and name them SAMPLE, HITS, and TOTAL, respectively.
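The rearrangement steps above might be sketched in Python as follows. The sketch reads "maintaining the row and column totals" as swapping whole rows and/or whole columns, which is an assumption about the intended operation. The function name `rearrange` is illustrative only.

```python
def rearrange(n1, n2, n3, n4):
    """Return (first_cell, SAMPLE, HITS, TOTAL) for the rearranged 2x2 table."""
    total = n1 + n2 + n3 + n4
    if n1 * n4 > n2 * n3:
        # Observed first cell exceeds its expectation: move min(n2, n3) first.
        if n2 <= n3:
            a, b, c, d = n2, n1, n4, n3   # swap the two columns
        else:
            a, b, c, d = n3, n4, n1, n2   # swap the two rows
    else:
        # Otherwise move min(n1, n4) into the first cell.
        if n1 <= n4:
            a, b, c, d = n1, n2, n3, n4   # table already in the desired form
        else:
            a, b, c, d = n4, n3, n2, n1   # swap both rows and columns
    sample = a + b   # first row total
    hits = a + c     # first column total
    return a, sample, hits, total
```

Under this reading, the first cell of the rearranged table never exceeds its expected value, since swapping one pair of rows or columns reverses the sign of $n_1 n_4 - n_2 n_3$. A lower-tail cumulative probability of the first cell then corresponds to the one-tailed level.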

One-Tailed Significance Level

The following steps are used to calculate the one-tailed significance level:

  1. If TOTAL = 0, set the one-tailed significance level $p_1$ equal to 1; otherwise, obtain $p_1$ by using the CDF.HYPER cumulative distribution function with arguments $n_1$, SAMPLE, HITS, and TOTAL.
  2. Also calculate the probability of the first cell count equal to $n_1$ by finding the difference between $p_1$ and the value obtained from CDF.HYPER with $n_1 - 1$, SAMPLE, HITS, and TOTAL as its arguments, provided that $n_1 > 0$. Call this probability PEXACT.
  3. If $n_1 = 0$, set PEXACT = $p_1$. PEXACT will be used in the next step to find the points for which the probabilities are not greater than PEXACT.
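The three steps above can be sketched in Python with a small stand-in for CDF.HYPER. The helper `cdf_hyper` below is our own lower-tail hypergeometric CDF built on `math.comb`. Its argument order follows the order in which the text lists the arguments and is not claimed to match the SPSS function signature.

```python
from math import comb


def cdf_hyper(q, sample, hits, total):
    """Stand-in for CDF.HYPER: P(X <= q) for a hypergeometric count X."""
    lo = max(0, sample + hits - total)   # smallest feasible count
    hi = min(sample, hits)               # largest feasible count
    if q < lo:
        return 0.0
    num = sum(comb(hits, k) * comb(total - hits, sample - k)
              for k in range(lo, min(q, hi) + 1))
    return num / comb(total, sample)


def one_tailed(first_cell, sample, hits, total):
    """Return (p1, PEXACT) for the rearranged table."""
    # Step 1: an empty table gets p1 = 1.
    p1 = 1.0 if total == 0 else cdf_hyper(first_cell, sample, hits, total)
    if first_cell > 0:
        # Step 2: point probability as a difference of two CDF values.
        pexact = p1 - cdf_hyper(first_cell - 1, sample, hits, total)
    else:
        # Step 3: with a zero first cell the CDF is the point probability.
        pexact = p1
    return p1, pexact
```

For a rearranged table with first cell 1, SAMPLE = 6, HITS = 9, and TOTAL = 16, `one_tailed(1, 6, 9, 16)` returns $p_1 = 196/8008$ and PEXACT $= 189/8008$, up to floating-point rounding.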