

A series of simulation study problems on statistical inference and model comparison. Topics include estimators and their asymptotics, standard errors under heteroskedasticity, error structure in PD/PK models, testing variance components, and analysis of rate statistics. Students are expected to compare location estimators, covariance estimators under heteroskedasticity, and the performance of competing models.
Typology: Assignments


Recall from Exercise 8.13 the estimator $\tilde\mu$ that minimizes $\sum_i |X_i - \mu|^{3/2}$. The asymptotics for this estimator should follow
$$\sqrt{n}\,(\tilde\mu - \mu) \approx \mathrm{Normal}(0,\ a/b)$$
where $a = \int |x| f(x)\,dx$ is estimated by $\frac{1}{n}\sum_i |X_i - \tilde\mu|$ and $b = \int |x|^{1/2} f'(x)\,dx$ is estimated by the analogous sample quantity in $|X_i - \tilde\mu|$.
Compare this estimator to other location estimators; see where asymptotics apply.
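One way to set up this comparison is sketched below, using only the Python standard library. The function names, the golden-section minimizer, the standard normal default and the sample sizes are all illustrative assumptions, not part of the exercise. The plug-in variance assumes the standard M-estimator form E[ψ²]/(E[ψ′])² with ψ(x) = sign(x)|x|^{1/2}; the exact form of b should be checked against the text.

```python
import random
import statistics

def l32_location(xs, tol=1e-10):
    """Minimize sum |x_i - mu|^(3/2) over mu by golden-section search."""
    def loss(mu):
        return sum(abs(x - mu) ** 1.5 for x in xs)
    lo, hi = min(xs), max(xs)          # convex loss: minimizer lies in the data range
    g = (5 ** 0.5 - 1) / 2
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = loss(c), loss(d)
    while hi - lo > tol:
        if fc < fd:                    # minimum is in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = loss(c)
        else:                          # minimum is in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = loss(d)
    return (lo + hi) / 2

def plugin_variance(xs, mu):
    """Assumed plug-in for the variance of sqrt(n)(mu_tilde - mu): a_hat / b_hat."""
    r = [abs(x - mu) for x in xs if x != mu]   # skip exact ties to avoid 0 ** -0.5
    a_hat = statistics.fmean(r)
    b_hat = statistics.fmean(0.5 * v ** -0.5 for v in r) ** 2
    return a_hat / b_hat

def compare(n=50, reps=300, seed=1, draw=lambda rng: rng.gauss(0.0, 1.0)):
    """Return n * MSE about 0 for three location estimators (truth assumed 0)."""
    rng = random.Random(seed)
    sq = {"mean": [], "median": [], "L3/2": []}
    for _ in range(reps):
        xs = [draw(rng) for _ in range(n)]
        sq["mean"].append(statistics.fmean(xs) ** 2)
        sq["median"].append(statistics.median(xs) ** 2)
        sq["L3/2"].append(l32_location(xs) ** 2)
    return {name: n * statistics.fmean(v) for name, v in sq.items()}

if __name__ == "__main__":
    print(compare())
```

Passing a different `draw` function (for example a heavy-tailed or contaminated sampler, still centered at 0) and varying `n` shows where the scaled mean squared error settles near the asymptotic value and where it does not.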
The sandwich covariance estimate in Chapter 9 generalizes work by Halbert White (among others) on the effect of heteroskedasticity (unequal error variances) in multiple regression. Under the standard homoskedastic assumption $\mathrm{Var}(e_i) = \sigma^2$, the covariance matrix of the parameter estimates is the usual $\sigma^2 (X^T X)^{-1}$; what is sought is an estimator that remains consistent under heteroskedasticity. One proposed estimator is
$$H_1 = \frac{n}{n-p}\,(X^T X)^{-1} (X^T \Omega_1 X) (X^T X)^{-1}$$
where $\Omega_1 = \mathrm{diag}\{\hat e_i^2\}$ and $\hat e_i$, $i = 1, \ldots, n$, are the residuals. A second estimator applies a different correction,
$$H_2 = (X^T X)^{-1} (X^T \Omega_2 X) (X^T X)^{-1}$$
where $\Omega_2 = \mathrm{diag}\{\hat e_i^2 / (1 - (P_X)_{ii})\}$ and $P_X$ is the projection (hat) matrix. Compare these estimators with the usual one.
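A minimal sketch of such a comparison follows, restricted to simple linear regression (intercept plus one covariate) so that the 2×2 inverse can be written out by hand with the standard library. It assumes the diagonal weights are squared residuals, as in White's estimator; the function names, the design, and the error model with standard deviation growing in x are hypothetical choices for illustration.

```python
import random
import statistics

def slope_variances(x, y):
    """Return (slope, usual, H1, H2) variance estimates for y = b0 + b1 * x."""
    n, p = len(x), 2
    sx, sxx = sum(x), sum(v * v for v in x)
    det = n * sxx - sx * sx
    # Entries of (X^T X)^{-1} for the design with columns [1, x]
    i00, i01, i11 = sxx / det, -sx / det, n / det
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    b0 = i00 * sy + i01 * sxy
    b1 = i01 * sy + i11 * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Leverages (P_X)_ii = [1, x_i] (X^T X)^{-1} [1, x_i]^T
    lev = [i00 + 2 * i01 * xi + i11 * xi * xi for xi in x]

    def sandwich(weights):
        # Slope entry of (X^T X)^{-1} X^T diag(w) X (X^T X)^{-1}
        return sum(w * (i01 + i11 * xi) ** 2 for w, xi in zip(weights, x))

    usual = sum(e * e for e in resid) / (n - p) * i11
    h1 = n / (n - p) * sandwich([e * e for e in resid])
    h2 = sandwich([e * e / (1 - h) for e, h in zip(resid, lev)])
    return b1, usual, h1, h2

def simulate(n=40, reps=500, seed=2):
    """Monte Carlo slope variance versus the average of each estimate."""
    rng = random.Random(seed)
    x = [(i + 1) / n for i in range(n)]
    slopes, est = [], [[], [], []]
    for _ in range(reps):
        # Assumed heteroskedastic truth: error s.d. increases with x
        y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.2 + 2.0 * xi) for xi in x]
        b1, *v = slope_variances(x, y)
        slopes.append(b1)
        for store, val in zip(est, v):
            store.append(val)
    return statistics.variance(slopes), [statistics.fmean(s) for s in est]

if __name__ == "__main__":
    mc_var, (usual, h1, h2) = simulate()
    print(mc_var, usual, h1, h2)
```

The first returned value is the simulated variance of the slope, which serves as the benchmark; the three averages show how far the usual, H1 and H2 estimates sit from it. Replacing the error model with a constant standard deviation gives the homoskedastic baseline.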
In pharmacodynamic/pharmacokinetic models, the response – often a chemical concentration – must be nonnegative. Two routes are commonly used for fitting these nonlinear regression models:
Choose one of these four (that is, three values of θ and log-normal) as the truth and compare the performance of these models. Include as another (fifth) competitor a model with no heteroskedasticity.
Consider the balanced ($n_i = n$) one-way ANOVA case $Y_{ij} = \mu + \alpha_i + e_{ij}$, where $e_{ij} \sim \mathrm{Normal}(0, \sigma^2)$ and, independently, $\alpha_i \sim \mathrm{Normal}(0, \sigma_a^2)$. We want to test the hypothesis $H: \sigma_a^2 = 0$. Two approaches are considered.
a) The usual F-test:
$$F = \frac{SSA/(a-1)}{SSE/[a(n-1)]}$$
where $SSA = n\sum_i (\bar y_{i\cdot} - \bar y_{\cdot\cdot})^2$ and $SSE = \sum_{ij} (y_{ij} - \bar y_{i\cdot})^2$; reject $H$ if $F$ is too big.
b) Likelihood ratio test: the log-likelihood under the alternative can be written as
$$\ell(\mu, \sigma_a^2, \sigma^2) = -\frac{an}{2}\log(2\pi) - \frac{1}{2}\log\left[(\sigma^2)^{a(n-1)}(\sigma^2 + n\sigma_a^2)^{a}\right] - \frac{1}{2}\left[\frac{SSE}{\sigma^2} + \frac{SSA}{\sigma^2 + n\sigma_a^2} + \frac{an(\bar y_{\cdot\cdot} - \mu)^2}{\sigma^2 + n\sigma_a^2}\right]$$
and you should be able to write and maximize the likelihood under the hypothesis. Reject $H$ if the difference in log-likelihoods is too large.
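Both statistics can be computed from SSA and SSE alone, as in the sketch below. It assumes closed-form maximizers worked out here rather than given in the assignment: under the alternative, $\hat\sigma^2 = SSE/[a(n-1)]$ and $\widehat{\sigma^2 + n\sigma_a^2} = SSA/a$ when the latter is larger, otherwise the maximum sits on the boundary $\sigma_a^2 = 0$; under $H$, $\hat\sigma^2 = (SSA + SSE)/(an)$. Function names, group counts and the use of simulated critical values are illustrative assumptions.

```python
import math
import random

def anova_stats(groups):
    """F statistic and log-likelihood difference for balanced one-way data."""
    a, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / a
    ssa = n * sum((m - grand) ** 2 for m in means)
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    s2_within = sse / (a * (n - 1))    # maximizer of sigma^2 when sigma_a^2 > 0
    f_stat = (ssa / (a - 1)) / s2_within
    tau = ssa / a                      # maximizer of sigma^2 + n * sigma_a^2
    if tau <= s2_within:               # boundary case: fitted sigma_a^2 is 0
        return f_stat, 0.0
    s2_null = (ssa + sse) / (a * n)    # maximizer of sigma^2 under H
    diff = 0.5 * (a * n * math.log(s2_null)
                  - a * (n - 1) * math.log(s2_within) - a * math.log(tau))
    return f_stat, diff

def rejection_rates(a=5, n=4, sigma_a=1.0, reps=1000, level=0.05, seed=3):
    """Rejection rates of both tests using critical values simulated under H."""
    rng = random.Random(seed)

    def draw(sa):
        groups = []
        for _ in range(a):
            alpha = rng.gauss(0.0, sa)
            groups.append([alpha + rng.gauss(0.0, 1.0) for _ in range(n)])
        return anova_stats(groups)

    null = [draw(0.0) for _ in range(reps)]
    alt = [draw(sigma_a) for _ in range(reps)]
    rates = []
    for k in (0, 1):                   # 0: F statistic, 1: log-likelihood difference
        crit = sorted(s[k] for s in null)[int((1 - level) * reps)]
        rates.append(sum(s[k] > crit for s in alt) / reps)
    return rates

if __name__ == "__main__":
    print(rejection_rates())
```

In the balanced case the log-likelihood difference is an increasing function of SSA/SSE whenever it is positive, so with simulated critical values the two rejection rates should essentially coincide. The comparison becomes informative when each test is instead referred to its nominal reference distribution (the F distribution and a chi-square-type approximation).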
In evaluating the performance of public health programs, statistics are often cited as rates of incidence of disease per unit of population, say, deaths per 10,000. Where only aggregate data are available, the aggregation units can differ greatly in size. For example, in North Carolina, disease prevalence data are available for each county either as counts or as rates per 1,000 or some similar unit. (There are some very large counties in NC, e.g. Mecklenburg and Wake, as well as many very small ones.) So the true model may be that the rate in county $i$ follows $\lambda_i = \beta_0 + \beta_1\,\mathrm{income}_i$ with income as the single covariate, while the data are available only by county, so the observed count $Y_i$ in county $i$ may be Poisson with mean $\mathrm{pop}_i \lambda_i$. Compare some different methods of analysis:
Yi on income.
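The Poisson rate model above can be simulated before touching real data, as in the following sketch. The two methods compared here (unweighted and population-weighted least squares of the observed rate on income) are stand-ins chosen for illustration and may not match the methods listed in the assignment; the population range, income range, coefficients and function names are likewise hypothetical.

```python
import math
import random
import statistics

def poisson(rng, mean):
    """Knuth's multiplication method; adequate for moderate means (well below ~700)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def wls_slope(x, y, w):
    """Slope of the weighted least-squares line of y on x with weights w."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx

def simulate(counties=50, reps=200, beta0=0.5, beta1=0.02, seed=4):
    """Mean and s.d. of the estimated income slope under each method."""
    rng = random.Random(seed)
    income = [rng.uniform(20.0, 60.0) for _ in range(counties)]
    # Assumed populations (in rate units), log-uniform so sizes vary widely
    pop = [math.exp(rng.uniform(0.0, math.log(100.0))) for _ in range(counties)]
    out = {"unweighted": [], "pop-weighted": []}
    for _ in range(reps):
        rate = [poisson(rng, p * (beta0 + beta1 * inc)) / p
                for p, inc in zip(pop, income)]
        out["unweighted"].append(wls_slope(income, rate, [1.0] * counties))
        out["pop-weighted"].append(wls_slope(income, rate, pop))
    return {k: (statistics.fmean(v), statistics.stdev(v)) for k, v in out.items()}

if __name__ == "__main__":
    print(simulate())
```

Weighting by population is motivated by the variance of the observed rate, $\lambda_i / \mathrm{pop}_i$, being roughly inversely proportional to county size; the returned standard deviations indicate how much precision each method gives for the same simulated counties.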
The file ncc03q4.dat in the ’rfiles’ directory holds NC County data as of the end of 2003. The columns are:
Choose one of the two income measures.