9.3 A Permutation F-Test, Exercises of Statistics

Typology: Exercises

2022/2023

Uploaded on 02/28/2023

marylen

9.3 A Permutation F-Test

  • The data setup is the same as for Friedman’s Test. That is, we have $k$ treatments in either $b$ blocks from an RCBD or $b$ subjects from an SRMD.
  • Assume $\mu$ is the overall mean, $\tau_i$ is the $i$th treatment effect, $\beta_j$ is the $j$th block (subject) effect, and $\epsilon_{ij}$ is the random error of the observation. The linear model for an RCBD or SRMD is

$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \qquad \epsilon_{ij} \sim \text{IID } N(0, \sigma^2). \tag{20}$$
  • In the traditional analysis of variance (ANOVA) F-test, we are testing the null hypothesis of equality (no differences) in treatment effects

$$H_0: \tau_1 = \tau_2 = \cdots = \tau_k$$

against the alternative hypothesis

$$H_1: \tau_i \neq \tau_j \text{ for some } i \neq j.$$

That is, if $H_1$ is true, then not all treatment effects are equal.

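  • To compare $k \geq 3$ treatment means, the test statistic is

$$F = \frac{SS_{Trt}/(k-1)}{SSE/[(k-1)(b-1)]} = \frac{MS_{Trt}}{MSE}$$

where $N = kb$ is the total number of observations in the data set, so that the error degrees of freedom are $(k-1)(b-1) = N - k - b + 1$.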

  • The RCBD or SRMD total sum of squares ($SS_{Total}$) is partitioned into 3 components:

$$SS_{Total} = SS_{Trt} + SS_{Block} + SSE$$

  • Formulas to calculate $SS_{Total}$, $SS_{Trt}$ and $SS_{Block}$ are

$$SS_{Total} = \sum_{i=1}^{k} \sum_{j=1}^{b} y_{ij}^2 - \frac{y_{\cdot\cdot}^2}{kb}, \qquad SS_{Trt} = \sum_{i=1}^{k} \frac{y_{i\cdot}^2}{b} - \frac{y_{\cdot\cdot}^2}{kb}, \qquad SS_{Block} = \sum_{j=1}^{b} \frac{y_{\cdot j}^2}{k} - \frac{y_{\cdot\cdot}^2}{kb}$$

$SSE = SS_{Total} - SS_{Trt} - SS_{Block}$, where $\frac{y_{\cdot\cdot}^2}{kb}$ is the correction factor.

Dot notation: $y_{\cdot\cdot}$ = the sum of all of the responses, $y_{i\cdot}$ = sum of responses for treatment $i$, and $y_{\cdot j}$ = sum of responses for block $j$.
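
As a hedged illustration, the sums of squares above can be computed directly from the treatment and block totals. The sketch below uses the RCBD responses from Table 4.4.3 (Higgins) that appear in the R code of this section. The variable names (`cf`, `ss_total`, and so on) are illustrative assumptions, and the error mean square divides SSE by $(k-1)(b-1)$, the denominator degrees of freedom of the F-distribution for this design.

```r
# Responses from the RCBD example (4 treatments x 6 blocks), treatment-major order
y <- c(120,208,199,194,177,195, 207,188,181,164,155,175,
       122,137,177,177,160,138, 128,128,160,142,157,179)
k <- 4; b <- 6
treatment <- factor(rep(1:k, each = b))
block     <- factor(rep(1:b, times = k))

cf       <- sum(y)^2 / (k * b)                         # correction factor y..^2 / (kb)
ss_total <- sum(y^2) - cf
ss_trt   <- sum(tapply(y, treatment, sum)^2) / b - cf  # uses treatment totals y_i.
ss_block <- sum(tapply(y, block, sum)^2) / k - cf      # uses block totals y_.j
sse      <- ss_total - ss_trt - ss_block

ms_trt <- ss_trt / (k - 1)
mse    <- sse / ((k - 1) * (b - 1))
f_obs  <- ms_trt / mse
round(c(SSTrt = ss_trt, SSBlock = ss_block, SSE = sse, F = f_obs), 3)
```

These values should agree with the parametric `aov` table in the R output for this example (sums of squares of about 5408, 2817 and 8664, and an F value of about 3.121).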

  • If the residual errors are approximately normally distributed with equal variances, then the test statistic $F$ has an F-distribution with $k-1$ degrees of freedom for the numerator and $(k-1)(b-1)$ degrees of freedom for the denominator.
  • In this case, the experimenter compares the F-statistic to the $F[k-1, (k-1)(b-1)]$ distribution to determine a p-value for the test.
  • However, if the assumptions are violated (that is, the residual errors are not normally distributed with constant variance), then a permutation F-test may be appropriate.
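
A minimal sketch of the parametric p-value calculation, assuming the F value of 3.121 reported in the R output for the RCBD example in this section ($k = 4$ treatments, $b = 6$ blocks). The variable names are illustrative; `pf` is the F-distribution function in base R.

```r
k <- 4; b <- 6              # treatments and blocks in the RCBD example
f_obs <- 3.121              # F value reported in the parametric output
df1 <- k - 1                # numerator degrees of freedom
df2 <- (k - 1) * (b - 1)    # denominator degrees of freedom
p_value <- pf(f_obs, df1, df2, lower.tail = FALSE)  # upper-tail area
round(p_value, 4)
```

The result should be close to the Pr(>F) value of 0.0575 shown for treatment in the parametric output.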

The Steps in the Permutation F -Test (Monte-Carlo Approach)

  • Calculate the F -statistic from the original data. Call this Fobs.
  • Generate a large number Prep of permutations in which the observations are permuted within each block. That is, we randomly reassign the treatment labels to the observations within each block.
  • For each permutation, calculate the F -statistic.
  • Find the proportion of this set of Prep permutation F -statistics that are ≥ Fobs. This is the p-value for the Permutation F -test.
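
The steps above can be sketched in base R as follows. This is an illustrative implementation under stated assumptions, not the code used to produce the output in this section: the helper `rcbd_f`, the seed, and the choice of 5000 permutations are all hypothetical, and the data are the RCBD responses from Table 4.4.3 (Higgins).

```r
y <- c(120,208,199,194,177,195, 207,188,181,164,155,175,
       122,137,177,177,160,138, 128,128,160,142,157,179)
treatment <- factor(rep(1:4, each = 6))
block     <- factor(rep(1:6, times = 4))

# Hypothetical helper: treatment F-statistic for an RCBD from the sums of squares
rcbd_f <- function(y, trt, blk) {
  k <- nlevels(trt); b <- nlevels(blk)
  cf     <- sum(y)^2 / (k * b)
  ss_trt <- sum(tapply(y, trt, sum)^2) / b - cf
  ss_blk <- sum(tapply(y, blk, sum)^2) / k - cf
  sse    <- sum(y^2) - cf - ss_trt - ss_blk
  (ss_trt / (k - 1)) / (sse / ((k - 1) * (b - 1)))
}

# Step 1: F-statistic from the original data
f_obs <- rcbd_f(y, treatment, block)

# Steps 2-3: permute the responses within each block and recompute F each time
set.seed(1)       # assumed seed, only for reproducibility
P_rep <- 5000     # assumed number of permutations
f_perm <- replicate(P_rep, rcbd_f(ave(y, block, FUN = sample), treatment, block))

# Step 4: proportion of permutation F-statistics at least as large as f_obs
p_value <- mean(f_perm >= f_obs)
p_value
```

Because this is a Monte Carlo estimate, the p-value changes slightly with the seed and with the number of permutations.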

R code for Permutation F-Test for RCBD

# RCBD data from Table 4.4.3 (Higgins, page 130)

library(lmPerm)

# Enter vector of responses
y <- c(120,208,199,194,177,195,207,188,181,164,155,175,
       122,137,177,177,160,138,128,128,160,142,157,179)

treatment <- c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4)
block <- c(rep(c(1,2,3,4,5,6),4))

treatment <- as.factor(treatment)
block <- as.factor(block)

rcbd <- data.frame(y,block,treatment)

# Parametric F-test for a RCBD
summary(aov(y~treatment+block,rcbd))

# Permutation Test for RCBD
summary(aovp(y~treatment+block,rcbd))

R output for Permutation F-Test for RCBD

> # Parametric F-test for a RCBD

            Df Sum Sq Mean Sq F value Pr(>F)      <-- P-value for the
treatment    3   5408  1802.8   3.121 0.0575 .    <-- parametric F-Test
block        5   2817   563.4   0.975 0.
Residuals   15   8664   577.

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> # Permutation Test for RCBD

            Df R Sum Sq R Mean Sq Iter Pr(Prob)     <-- P-value for the
treatment    3   5408.3   1802.78 1848  0.07251 .   <-- Permutation F-Test
block        5   2816.8    563.37  573  0.
Residuals   15   8664.2    577.

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Example of Page’s Test (from Table 4.5.2 in Introduction to Modern Nonparametric Statistics by J. Higgins).

A researcher reported the scores of 36 children who performed a certain task as part of an experiment. The children formed 12 blocks of three children such that the children in the same block were of similar age and the same gender. In each block, one child was congenitally blind (Blind), one child could see but his/her eyes were covered to block sight (Blindfolded), and one child could see without being blindfolded (Seeing). The results are shown in the following table (where a:b means a years and b months old).

Age    Gender  Block  Blind  Blindfolded  Seeing
5:7    F        1     0      0            0
6:0    M        2     0      8            1
6:4    F        3     0      0            8
6:6    M        4     0      0            8
6:11   F        5     1      2            0
7:9    F        6     8      8            8
7:11   F        7     8      5            8
8:0    F        8     8      6            8
8:5    F        9     0      8            8
8:6    F       10     8      8            8
8:10   F       11     8      3            8
9:6    M       12     8      8            8

The researchers wanted to test the null hypothesis of identical results against the alternative that blind children tend to score lower than blindfolded children, and that blindfolded children tend to score lower than seeing children.

9.5 Cochran’s Q Test

7.4 Cochran’s Q Test

  • Example 7.8

R code for Cochran’s Q Test Example

# Cochran's Q Test

library(statmod)
library(doBy)
library(ade4)
library(RVAideMemoire)

y <- c(1,0,0,0, 1,1,1,1, 0,0,0,0, 0,1,1,1, 1,1,1,1, 1,0,0,1,
       1,0,1,1, 1,0,0,1, 1,0,0,0, 1,0,0,0, 1,1,1,1)

blocks <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7,
            8,8,8,8,9,9,9,9,10,10,10,10,11,11,11,11)

treatment <- c(rep(c("MajOp","ActPIP","SubjPIP","SemiPIP"),11))

cochran.qtest(y~treatment|blocks,alpha=.05,p.method="fdr")

SAS code for Cochran’s Q Test Example

  • You can use SAS to calculate Cochran’s Q statistic and an approximate p-value based on a chi-squared distribution.
  • This p-value may not be very accurate when the number of treatments or blocks (or subjects) is not very large.
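
For illustration, the chi-squared approximation can also be evaluated in base R. This sketch assumes the Q value of 7.6957 reported in the SAS output for this example and $k - 1 = 3$ degrees of freedom for the $k = 4$ treatments. The variable names are illustrative.

```r
k <- 4                       # number of treatments in the example
Q <- 7.6957                  # Cochran's Q reported in the SAS output
p_asym <- pchisq(Q, df = k - 1, lower.tail = FALSE)  # upper-tail chi-squared area
round(p_asym, 4)
```

The result should be close to the Prob value of 0.0527 shown in the SAS output.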

DM 'LOG;CLEAR;OUT;CLEAR;';

OPTION PS=60 LS=72 NODATE NONUMBER;

*** Cochran's Q Test Example ***;
********************************;

*** Analysis by entering ranks within blocks ***;

DATA in;
  DO patient = 1 TO 11;
    DO treatment = 'MajOp', 'ActP ', 'SubjP', 'SemiP';
      INPUT y @@; OUTPUT;
    END;
  END;
CARDS;
1 0 0 0
1 1 1 1
0 0 0 0
0 1 1 1
1 1 1 1
1 0 0 1
1 0 1 1
1 0 0 1
1 0 0 0
1 0 0 0
1 1 1 1
;
PROC FREQ DATA=in;
  TABLE patient*treatment*y / NOPRINT CMH;
  TITLE 'Cochran''s Q Test';
RUN;

SAS output for Cochran’s Q Test

Cochran’s Q Test

The FREQ Procedure

Summary Statistics for treatment by y Controlling for patient

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic   Alternative Hypothesis    DF    Value     Prob
    1       Nonzero Correlation        1    0.0261    0.
    2       Row Mean Scores Differ     3    7.6957    0.0527  <---
    3       General Association        3    7.6957    0.

Total Sample Size = 44

R code for Permutation Test for Cochran’s Q

  • We can use a Monte Carlo approach to find a p-value for Cochran’s Q Test instead of using the asymptotic chi-square distribution.
  • The idea is to generate randomizations of the 0’s and 1’s within each block or subject, and calculate the Q test statistic for each randomization.
  • A large number of randomizations Prep would be performed to generate the empirical cumulative distribution function (ECDF) of the Q statistic assuming $H_0$ is true. This ECDF can then be used to find the p-value. That is, find the proportion of Q values out of the Prep values in the ECDF that are ≥ Q from the observed data.
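
A minimal base R sketch of this idea, using the 0/1 responses from the Cochran’s Q example (11 subjects, 4 treatments). It is an illustration under stated assumptions rather than the code behind the output in this section: the helper `cochran_q`, the seed, and the use of 5000 randomizations are hypothetical, and the names `Cvec` and `Rvec` simply mirror the column and row totals shown in the output.

```r
y <- c(1,0,0,0, 1,1,1,1, 0,0,0,0, 0,1,1,1, 1,1,1,1, 1,0,0,1,
       1,0,1,1, 1,0,0,1, 1,0,0,0, 1,0,0,0, 1,1,1,1)
X <- matrix(y, nrow = 11, byrow = TRUE)   # rows = subjects, columns = treatments

# Hypothetical helper: Cochran's Q from column (treatment) and row (subject) totals
cochran_q <- function(X) {
  k <- ncol(X)
  Cvec <- colSums(X)
  Rvec <- rowSums(X)
  N <- sum(X)
  (k - 1) * (k * sum(Cvec^2) - N^2) / (k * N - sum(Rvec^2))
}

Q0 <- cochran_q(X)    # observed statistic

set.seed(1)           # assumed seed, only for reproducibility
P_rep <- 5000         # assumed number of randomizations
# Shuffle the 0's and 1's within each row (subject), then recompute Q
Q_perm <- replicate(P_rep, cochran_q(t(apply(X, 1, sample))))

p_value <- mean(Q_perm >= Q0)   # proportion of randomized Q values >= observed Q
c(Q0 = Q0, p_value = p_value)
```

For these data the numerator and denominator of Q are 177 and 23, matching the `Qnumer` and `Qdenom` values in the R output, so `Q0` equals 177/23. The Monte Carlo p-value varies slightly with the seed and the number of randomizations.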

R output for Permutation Test for Cochran’s Q

> # Permutation Test for Cochran's Q

> sum(Cvec)
[1] 25
> N <- sum(Rvec)
[1] 25

> Qdenom
[1] 23
> Qnumer
[1] 177

> # Cochran's Q Test Statistic for Observed Data
> Q0
[1] 7.

> # Calculate p-value
[1] 0.

R code for Permutation Test for Cochran’s Q

# Permutation Test for Cochran's Q

library(gtools)

# Enter the number of permutations to take
Prep = 50000