



This document describes the methods for conducting linear hypothesis tests based on estimated parameters in complex sample (CS) models. It covers the Wald chi-square test, the Wald F test, and the adjusted Wald chi-square test, along with their assumptions and distributions. It also discusses individual tests and multiple comparison tests.




This document describes the methods used for conducting linear hypothesis tests based on
the estimated parameters in CS models.
Required input is a set of linear hypotheses, parameter estimates, and their covariance
matrix estimated for the complex sample design. Some methods require an estimate of the
parameter covariance matrix under the simple random sampling assumption as well. Also
needed is the number of degrees of freedom for the complex sample design; typically this
will be the difference between the number of primary sampling units and the number of
strata in the first stage of sampling.
Given consistent estimates of the above constructs, no additional restrictions are imposed on
the complex sample design.
$p$  Number of regression parameters in the model.
$r$  Number of linear hypotheses considered.
$L$  Generalized linear hypothesis matrix with $r$ rows and $p$ columns.
$K$  Hypothesis value vector with $r$ elements.
$B$  Vector of $p$ unknown population parameters.
$\hat{B}$  Vector of $p$ estimated population parameters (solution).
$\hat{V}(\hat{B})$  Estimated covariance matrix for $\hat{B}$ given the complex sample design.
$\nu$  The number of sampling design degrees of freedom.
Hypothesis Testing
Given matrix $L$ with $r$ rows and $p$ columns, and vector $K$ with $r$ elements, the following
test of generalized linear hypothesis is performed:
$$H_0: LB = K$$
It is assumed that $LB$ is estimable.
Wald chi-square test
The Wald $X^2$ statistic proposed by Koch et al. (1975) is defined by
$$X^2 = (L\hat{B} - K)'\left(L\hat{V}(\hat{B})L'\right)^{-}(L\hat{B} - K).$$
The asymptotic distribution of the $X^2$ test statistic is chi-square with $r_I$ degrees of freedom,
where $r_I = \mathrm{rank}\left(L\hat{V}(\hat{B})L'\right)$. If $r_I < r$, $\left(L\hat{V}(\hat{B})L'\right)^{-}$
is a generalized inverse such that Wald tests are effective for a restricted set of hypotheses
$L_I B = K_I$ containing a particular subset $I$ of independent rows from $H_0$.
Wald F test
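The Wald F statistic suggested by Fellegi (1980) is computed by the formula
$$F = \frac{(\nu - r_I + 1)\,X^2}{\nu\,r_I}.$$
This statistic is asymptotically approximated by the F-distribution $F(r_I, \nu - r_I + 1)$, where
$\nu$ is the number of sampling design degrees of freedom. The statistic is undefined if
$\nu < r_I$. See Korn and Graubard (1990) for the properties of this statistic.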
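The conversion from the Wald chi-square statistic to the Wald F statistic is simple arithmetic, and can be sketched as follows. The function name `wald_f` and its argument names are illustrative assumptions, not part of the source; the formula coded is $F = (\nu - r_I + 1) X^2 / (\nu r_I)$ with reference distribution $F(r_I, \nu - r_I + 1)$.

```python
def wald_f(chi_square, r_i, nu):
    """Convert a Wald chi-square statistic into the Wald F statistic.

    chi_square: value of the Wald chi-square statistic X^2
    r_i:        rank of L V(B) L' (numerator degrees of freedom)
    nu:         sampling design degrees of freedom
    Returns (F, numerator df, denominator df).
    """
    if nu < r_i:
        # The denominator degrees of freedom nu - r_i + 1 would not be positive.
        raise ValueError("Wald F is undefined when nu < r_i")
    df2 = nu - r_i + 1
    f_value = df2 * chi_square / (nu * r_i)
    return f_value, r_i, df2


# Example: X^2 = 10 with r_I = 2 and nu = 20 design degrees of freedom.
f_value, df1, df2 = wald_f(10.0, 2, 20)
print(f_value, df1, df2)  # 4.75 2 19
```

When $\nu$ is large relative to $r_I$, the factor $(\nu - r_I + 1)/\nu$ is close to 1 and $F$ is close to $X^2 / r_I$.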
Adjusted Wald chi-square test
The Wald $X^2_{srs}$ statistic under the simple random sampling assumption is given by the following
expression:
$$X^2_{srs} = (L\hat{B} - K)'\left(L\hat{V}_{srs}(\hat{B})L'\right)^{-}(L\hat{B} - K).$$
[...]
$$X^2 = \frac{(l'\hat{B} - k)^2}{l'\hat{V}(\hat{B})\,l}$$
[...] $F = X^2$ and $F_{adj} = X^2_{adj}$.
The asymptotic distribution used for test statistics $X^2$ and $X^2_{adj}$ is the chi-square distribution
with 1 degree of freedom. Test statistics $F$ and $F_{adj}$ are approximated by the F-distribution
[...] is not positive.
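For a hypothesis with a single row, $l'B = k$, the general Wald formula reduces to a scalar ratio, $(l'\hat{B} - k)^2 / (l'\hat{V}(\hat{B})\,l)$. The sketch below computes it with plain Python lists. The function name, argument names, and the example numbers are made up for illustration, and raising an error when the quadratic form $l'\hat{V}(\hat{B})\,l$ is not positive is an assumption about how the undefined case might be handled.

```python
def individual_wald_chi_square(l, b_hat, v_hat, k):
    """Wald chi-square statistic for one row hypothesis l'B = k.

    l:     list of p contrast coefficients
    b_hat: list of p parameter estimates
    v_hat: p-by-p covariance matrix of b_hat (list of lists)
    k:     hypothesized value
    """
    p = len(l)
    # Estimated linear combination l'B.
    estimate = sum(l[m] * b_hat[m] for m in range(p))
    # Estimated variance of the linear combination, l'V l.
    variance = sum(l[m] * v_hat[m][n] * l[n] for m in range(p) for n in range(p))
    if variance <= 0:
        # Assumption: treat a non-positive variance as an undefined statistic.
        raise ValueError("l'V l is not positive; the statistic is undefined")
    return (estimate - k) ** 2 / variance


# Hypothetical example: test B1 - B2 = 0.
chi_square = individual_wald_chi_square(
    l=[1.0, -1.0],
    b_hat=[2.0, 0.5],
    v_hat=[[0.25, 0.05], [0.05, 0.25]],
    k=0.0,
)
print(round(chi_square, 3))  # 5.625
```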
P-values
Given a value of test statistic T and a corresponding cumulative distribution function G as
specified above, the p-value p of the given test is computed as p = 1 − G ( T ).
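As a minimal sketch of $p = 1 - G(T)$, the case where $G$ is the chi-square distribution with 1 degree of freedom can be evaluated with the standard library alone, because $1 - G(T) = \mathrm{erfc}(\sqrt{T/2})$ in that case. The function name is an assumption made for illustration; other reference distributions (chi-square with more degrees of freedom, or F) would need a statistics library and are not covered here.

```python
import math


def chi_square_1df_p_value(t):
    """Upper-tail p-value 1 - G(t) for a chi-square statistic with 1 df."""
    if t < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    # For Z standard normal, P(Z^2 > t) = P(|Z| > sqrt(t)) = erfc(sqrt(t / 2)).
    return math.erfc(math.sqrt(t / 2.0))


print(chi_square_1df_p_value(0.0))  # 1.0
print(chi_square_1df_p_value(5.625))
```

Computing the upper tail directly, rather than subtracting a cumulative probability from 1, avoids loss of precision when the p-value is very small.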
Multiple comparison tests
In addition to the testing methods mentioned in the previous section, the hypothesis
$H_0$ can also be tested using the multiple row hypotheses testing technique. Let
$l_i'$ be the $i$-th row vector of matrix $L$, and $k_i$ be the $i$-th element of vector $K$. The $i$-th row
hypothesis is $H_{0i}: l_i'B = k_i$. Testing $H_0$ is the same as testing multiple hypotheses
$\{H_{0i}\}_{i=1}^{R}$ simultaneously, where $R$ is the number of non-redundant row hypotheses. A
hypothesis $H_{0i}$ is redundant if there exists another hypothesis $H_{0j}$ ($j \neq i$) such that
$l_i = c\,l_j$, $k_i = c\,k_j$, $c \neq 0$.
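The redundancy condition can be checked by comparing each augmented row $(l_i, k_i)$ with the rows already kept: two rows are redundant when one is a nonzero multiple of the other. The sketch below is one possible way to count $R$. The function names, the tolerance `tol`, and the choice to keep the first row of each proportional group are assumptions made for illustration.

```python
def is_multiple(a, b, tol=1e-12):
    """Return True if a = c * b for some nonzero scalar c."""
    n = len(a)
    # Proportional vectors have all 2x2 cross products equal to zero.
    for m in range(n):
        for q in range(m + 1, n):
            if abs(a[m] * b[q] - a[q] * b[m]) > tol:
                return False
    # With c != 0, a is the zero vector exactly when b is.
    a_is_zero = all(abs(x) <= tol for x in a)
    b_is_zero = all(abs(x) <= tol for x in b)
    return a_is_zero == b_is_zero


def non_redundant_rows(l_matrix, k_vector, tol=1e-12):
    """Indices of row hypotheses kept after dropping redundant ones."""
    kept = []
    for i, (row, k) in enumerate(zip(l_matrix, k_vector)):
        candidate = list(row) + [k]
        duplicates = any(
            is_multiple(candidate, list(l_matrix[j]) + [k_vector[j]], tol)
            for j in kept
        )
        if not duplicates:
            kept.append(i)
    return kept


# Row 1 is twice row 0, so only rows 0 and 2 remain and R = 2.
L = [[1.0, -1.0, 0.0], [2.0, -2.0, 0.0], [0.0, 1.0, -1.0]]
K = [0.0, 0.0, 0.0]
print(non_redundant_rows(L, K))  # [0, 2]
```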
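For each individual hypothesis $H_{0i}$, the tests described in the previous section can be
performed. Let $p_i$ denote the p-value for testing $H_{0i}$, and let $p_i^*$ denote the adjusted
p-value. At significance level $\alpha$, the conclusions from multiple testing are:
reject $H_{0i}: l_i'B = k_i$ if $p_i^* < \alpha$;
reject $H_0: LB = K$ if $\min_i p_i^* < \alpha$.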
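The decision step can be sketched as a comparison of adjusted p-values against a chosen significance level. The name `alpha`, its default of 0.05, and the function name are assumptions introduced for this illustration: each row hypothesis is rejected when its adjusted p-value falls below `alpha`, and the overall hypothesis is rejected when the smallest adjusted p-value does.

```python
def multiple_test_decisions(adjusted_p_values, alpha=0.05):
    """Return (per-row rejections, overall rejection) at level alpha."""
    reject_rows = [p < alpha for p in adjusted_p_values]
    # The overall hypothesis LB = K is rejected if any row is rejected.
    reject_overall = min(adjusted_p_values) < alpha
    return reject_rows, reject_overall


rows, overall = multiple_test_decisions([0.03, 0.20, 0.60], alpha=0.05)
print(rows, overall)  # [True, False, False] True
```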
There are different methods to adjust p-values. Five methods are provided here. Please note
that if the adjusted p-value is bigger than 1, it is set to 1 in all the methods.
LSD (Least Significant Difference)
The adjusted p-values are the same as the original p-values: $p_i^* = p_i$.
Bonferroni
The adjusted p-values are $p_i^* = R\,p_i$.
Sidak
The adjusted p-values are $p_i^* = 1 - (1 - p_i)^R$.
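The three single-step adjustments (unchanged, $R\,p_i$, and $1 - (1 - p_i)^R$) can be sketched in a few lines. The function name and the method labels `"lsd"`, `"bonferroni"`, and `"sidak"` are assumptions for illustration, and the input is assumed to contain only the p-values of the $R$ non-redundant row hypotheses.

```python
def adjust_single_step(p_values, method):
    """Single-step p-value adjustment; results are capped at 1."""
    r = len(p_values)  # R: number of non-redundant row hypotheses
    if method == "lsd":
        adjusted = list(p_values)
    elif method == "bonferroni":
        adjusted = [r * p for p in p_values]
    elif method == "sidak":
        adjusted = [1.0 - (1.0 - p) ** r for p in p_values]
    else:
        raise ValueError("unknown method: " + method)
    # Any adjusted p-value bigger than 1 is set to 1.
    return [min(1.0, a) for a in adjusted]


p = [0.01, 0.04, 0.5]
print(adjust_single_step(p, "lsd"))
print(adjust_single_step(p, "bonferroni"))
print(adjust_single_step(p, "sidak"))
```

For small p-values the Sidak adjustment is slightly smaller than the Bonferroni one, since $1 - (1 - p)^R \le R\,p$.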
Sequential Bonferroni test (Holm)
In the sequential test, the p-values are first ordered from the smallest to the biggest, and then
adjusted depending on the order. Let the ordered p-values be
$p_{(1)} \le p_{(2)} \le \cdots \le p_{(R)}$ with
corresponding hypotheses being $H_{0(1)}, H_{0(2)}, \ldots, H_{0(R)}$.
The adjusted p-value of $p_{(i)}$ is
$$p^*_{(i)} = \begin{cases} R\,p_{(1)}, & i = 1 \\ \max\left((R - i + 1)\,p_{(i)},\; p^*_{(i-1)}\right), & i \ge 2. \end{cases}$$
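The sequential Bonferroni (Holm) adjustment can be sketched as a single pass over the p-values in ascending order, carrying forward the running maximum so that adjusted values never decrease with rank. The function name is an assumption for illustration; results are returned in the original input order and capped at 1.

```python
def holm_adjust(p_values):
    """Sequential Bonferroni (Holm) adjusted p-values, in input order."""
    r = len(p_values)
    # Indices of the p-values sorted from smallest to biggest.
    order = sorted(range(r), key=lambda idx: p_values[idx])
    adjusted = [0.0] * r
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        candidate = (r - rank + 1) * p_values[idx]
        # For rank 1 this is R * p_(1); later ranks take the max with the previous value.
        running_max = max(candidate, running_max)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


print(holm_adjust([0.04, 0.01, 0.03]))
```

With these inputs the ordered candidates are 3 × 0.01, 2 × 0.03, and 1 × 0.04; the last is raised to the previous value of 0.06 by the maximum rule.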
Sequential Sidak test
The adjusted p-value of $p_{(i)}$ is
$$p^*_{(i)} = \begin{cases} 1 - (1 - p_{(1)})^{R}, & i = 1 \\ \max\left(1 - (1 - p_{(i)})^{R - i + 1},\; p^*_{(i-1)}\right), & i \ge 2. \end{cases}$$
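The sequential Sidak adjustment follows the same ordered pass as the sequential Bonferroni test, with the candidate value $1 - (1 - p_{(i)})^{R - i + 1}$ at rank $i$. A minimal sketch, with a function name chosen here for illustration:

```python
def sequential_sidak_adjust(p_values):
    """Sequential Sidak adjusted p-values, returned in input order."""
    r = len(p_values)
    order = sorted(range(r), key=lambda idx: p_values[idx])
    adjusted = [0.0] * r
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        exponent = r - rank + 1
        candidate = 1.0 - (1.0 - p_values[idx]) ** exponent
        # Enforce monotonicity: never fall below the previous adjusted value.
        running_max = max(candidate, running_max)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


print(sequential_sidak_adjust([0.04, 0.01, 0.03]))
```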
References
Rao, J. N. K., and Scott, A. J. (1984), “On chi-squared tests for multiway contingency tables
with cell proportions estimated from survey data”, Annals of Statistics, volume 12, pages
46-60.
Rao, J. N. K., and Thomas, D. R. (2003), “Analysis of categorical response data from
complex surveys: an appraisal and update”, in Analysis of Survey Data, ed. R. Chambers
and C. Skinner. New York: John Wiley & Sons.
Thomas, D. R., and Rao, J. N. K. (1987), “Small-sample comparisons of level and power for
simple goodness-of-fit statistics under cluster sampling”, Journal of the American
Statistical Association, volume 82, pages 630-636.
Wright, S. P. (1992), “Adjusted P-values for simultaneous inference”, Biometrics, volume 48, pages 1005–