









These notes address correlated data in statistics, specifically in the context of cluster randomized studies. They cover the central issue for testing, estimation, and sample size calculations: responses within a cluster may not be independent. They also introduce the variance inflation factor (VIF) and its role in measuring the increase in variance due to within-cluster correlation.
Typology: Study notes










Correlated Data Bios 662
Michael G. Hudgens, Ph.D. [email protected]
http://www.bios.unc.edu/~mhudgens
2007-11-14 12:
Clustered Data
Central Issue
Continuous response model
\[
i = 1, 2, \ldots, t; \qquad j = 1, 2, \ldots, c; \qquad k = 1, 2, \ldots, m
\]
\[
\bar{Y}_{ij} = \frac{1}{m} \sum_{k=1}^{m} Y_{ijk}
\]
Continuous response model
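\[
\begin{aligned}
\mathrm{Var}(\bar{Y}_{ij}) &= E(\bar{Y}_{ij}^2) - \mu_i^2 \\
&= m^{-2} E\left\{ \left( \sum_{k=1}^{m} Y_{ijk} \right)^2 \right\} - \mu_i^2 \\
&= m^{-2} E\left\{ \sum_{k=1}^{m} Y_{ijk}^2 + \sum_{k \neq k'} Y_{ijk} Y_{ijk'} \right\} - \mu_i^2 \\
&= m^{-2} \left\{ m\sigma^2 + m(m-1)\rho\sigma^2 \right\} \\
&= \frac{\sigma^2}{m} \left\{ 1 + (m-1)\rho \right\}
\end{aligned}
\]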
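The closed form above can be checked numerically. The following is a minimal sketch, assuming an exchangeable covariance structure within a cluster (variance σ² for every response and correlation ρ between any two distinct responses), which is what the derivation implies. The function names are hypothetical and chosen only for illustration.

```python
def var_cluster_mean(sigma2: float, rho: float, m: int) -> float:
    """Variance of the mean of m correlated responses, by brute force.

    Assumes an exchangeable covariance matrix: sigma2 on the diagonal,
    rho * sigma2 everywhere off the diagonal.
    """
    cov = [[sigma2 if k == l else rho * sigma2 for l in range(m)]
           for k in range(m)]
    # Var(mean) = (1 / m^2) * (sum of all entries of the covariance matrix)
    return sum(sum(row) for row in cov) / m**2


def var_closed_form(sigma2: float, rho: float, m: int) -> float:
    """Closed form from the derivation: (sigma^2 / m) * {1 + (m - 1) * rho}."""
    return sigma2 / m * (1 + (m - 1) * rho)


if __name__ == "__main__":
    for m in (2, 5, 10, 100):
        brute = var_cluster_mean(4.0, 0.05, m)
        closed = var_closed_form(4.0, 0.05, m)
        print(m, brute, closed)
```

With ρ = 0 both functions reduce to the familiar σ²/m, so the factor in braces is exactly the inflation caused by the within-cluster correlation.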
Variance Inflation Factor (VIF)
Continuous response model
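\[
z_{\text{ignore}} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sigma \sqrt{1/n_1 + 1/n_2}}
\]
\[
z_{\text{true}} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sigma \sqrt{(1/n_1 + 1/n_2)\,\mathrm{VIF}}} = \frac{z_{\text{ignore}}}{\sqrt{\mathrm{VIF}}}
\]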
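To illustrate how much the naive statistic overstates the evidence, here is a small sketch that computes both versions. It assumes VIF = 1 + (m − 1)ρ with a common cluster size m, as in the variance derivation earlier in these notes. The function names and the numbers in the example are hypothetical.

```python
import math


def vif(m: int, rho: float) -> float:
    """Variance inflation factor for clusters of size m with correlation rho."""
    return 1 + (m - 1) * rho


def z_ignore(ybar1: float, ybar2: float, sigma: float, n1: int, n2: int) -> float:
    """Two-sample z statistic that treats all responses as independent."""
    return (ybar1 - ybar2) / (sigma * math.sqrt(1 / n1 + 1 / n2))


def z_true(ybar1: float, ybar2: float, sigma: float,
           n1: int, n2: int, m: int, rho: float) -> float:
    """z statistic with the variance inflated by the VIF."""
    return z_ignore(ybar1, ybar2, sigma, n1, n2) / math.sqrt(vif(m, rho))


if __name__ == "__main__":
    # Hypothetical data: 200 subjects per arm, in clusters of size 10, rho = 0.05
    print(z_ignore(10.5, 10.0, 2.0, 200, 200))           # about 2.5
    print(z_true(10.5, 10.0, 2.0, 200, 200, 10, 0.05))   # about 2.08
```

In this made-up example even a modest ρ of 0.05 shrinks the statistic noticeably, because the VIF grows with the cluster size as well as with the correlation.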
Effect of correlation
Sample Size when t = 2
\[
n = 2 \left\{ \frac{z_{1-\alpha/2} + z_{1-\beta}}{\Delta} \right\}^2
\]
\[
n = 2 \left\{ \frac{z_{1-\alpha/2} + z_{1-\beta}}{\Delta} \right\}^2 \left\{ 1 + (m-1)\rho \right\}
\]
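A sample size calculation along these lines might be sketched as follows. The sketch rests on several assumptions that are not stated explicitly in the notes: Δ is taken to be the standardized difference (μ₁ − μ₂)/σ, n is the number of individuals per group, and the clustered requirement is the independent-data requirement multiplied by VIF = 1 + (m − 1)ρ. The function name and defaults are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, alpha: float = 0.05, power: float = 0.80,
                m: int = 1, rho: float = 0.0) -> float:
    """Individuals per group for a two-group comparison (t = 2).

    delta : assumed standardized difference (mu1 - mu2) / sigma
    m     : cluster size (m = 1 means no clustering)
    rho   : within-cluster correlation
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # z_{1 - alpha/2}
    z_beta = NormalDist().inv_cdf(power)            # z_{1 - beta}
    n_independent = 2 * ((z_alpha + z_beta) / delta) ** 2
    # Inflate by the VIF to account for within-cluster correlation
    return n_independent * (1 + (m - 1) * rho)


if __name__ == "__main__":
    n0 = n_per_group(0.5)                      # independent responses
    n1 = n_per_group(0.5, m=10, rho=0.05)      # clusters of size 10
    print(math.ceil(n0), math.ceil(n1), math.ceil(n1 / 10))
```

Under these assumptions the last printed value, n/m rounded up, is the number of clusters needed per group.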
Variance Inflation Factors
                        ρ
   m    0.001    0.01    0.02    0.05     0.1
   2    1.001    1.01    1.02    1.05     1.1
   5    1.004    1.04    1.08    1.20     1.4
  10    1.009    1.09    1.18    1.45     1.9
 100    1.099    1.99    2.98    5.95    10.9
1000    1.999   10.99   20.98   50.95   100.9
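The tabulated values can be regenerated directly from VIF = 1 + (m − 1)ρ. The sketch below does so for a subset of the ρ values in the table; the function and variable names are hypothetical.

```python
def vif(m: int, rho: float) -> float:
    """Variance inflation factor: 1 + (m - 1) * rho."""
    return 1 + (m - 1) * rho


RHOS = [0.001, 0.01, 0.02, 0.05]     # within-cluster correlations
SIZES = [2, 5, 10, 100, 1000]        # cluster sizes m

if __name__ == "__main__":
    print("m".rjust(6) + "".join(f"{rho:>9}" for rho in RHOS))
    for m in SIZES:
        row = "".join(f"{vif(m, rho):9.3f}" for rho in RHOS)
        print(f"{m:>6}" + row)
```

Reading across a row shows the effect of ρ at a fixed cluster size, while reading down a column shows that even a very small ρ produces a large inflation once clusters are big.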