Homework Assignment: Statistical Inference and Model Comparison - Prof. John Monahan, Assignments of Statistics

North Carolina State University (NCSU), Statistics

Prof. John Monahan

A series of simulation study problems related to statistical inference and model comparison. Topics include: estimators and their asymptotics, standard errors under heteroskedasticity, error structure in pd/pk models, testing variance components, and analysis of rate statistics. Students are expected to compare different estimators, estimators under heteroskedasticity, and model performances.

Typology: Assignments

Pre 2010

Uploaded on 03/18/2009

koofers-user-qex 🇺🇸


Homework #7 – Simulation Study Problems

ST790R

01 November 2008

1 Least Three-halves Estimator

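Recall from Exercise 8.13 the estimator $\tilde{\mu}$ that minimizes $\sum_i |X_i - \mu|^{3/2}$. The asymptotics for this estimator should follow

$$\sqrt{n}\,(\tilde{\mu} - \mu) \approx \mathrm{Normal}(0,\, a/b)$$

where $a = \int |x|\, f(x)\,dx$ is estimated by $\sum_i |X_i - \tilde{\mu}|$ and $b = \left[\int |x|^{1/2} f'(x)\,dx\right]^2$ is estimated by $\left[\sum_i |X_i - \tilde{\mu}|^{-1/2}/2\right]^2$.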

Compare this estimator to other location estimators; see where asymptotics apply.
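One way to start such a comparison is sketched below, using only the Python standard library. The function names, the ternary-search minimizer, the standard normal sampling distribution, and the sample sizes are all illustrative assumptions, not part of the assignment. The plug-in variance also assumes that the sums defining the estimates of $a$ and $b$ are averaged over $n$, which the text does not state.

```python
import random
import statistics

def three_halves_estimate(xs, tol=1e-9):
    """Minimize sum |x - mu|**1.5 over mu by ternary search (the loss is convex)."""
    def loss(mu):
        return sum(abs(x - mu) ** 1.5 for x in xs)
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if loss(m1) < loss(m2):
            hi = m2  # minimizer cannot lie to the right of m2
        else:
            lo = m1  # minimizer cannot lie to the left of m1
    return 0.5 * (lo + hi)

def plug_in_variance(xs, mu):
    """Estimate a/b, ASSUMING the sums in the text are averaged over n.

    Fails if some observation equals mu exactly (division by zero).
    """
    n = len(xs)
    a_hat = sum(abs(x - mu) for x in xs) / n
    b_hat = (sum(0.5 * abs(x - mu) ** -0.5 for x in xs) / n) ** 2
    return a_hat / b_hat

def compare_estimators(n=50, reps=200, seed=1):
    """Return n * (Monte Carlo variance) of three location estimators under N(0, 1)."""
    rng = random.Random(seed)
    draws = {"mean": [], "median": [], "three_halves": []}
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draws["mean"].append(statistics.fmean(xs))
        draws["median"].append(statistics.median(xs))
        draws["three_halves"].append(three_halves_estimate(xs))
    return {name: n * statistics.variance(vals) for name, vals in draws.items()}

if __name__ == "__main__":
    print(compare_estimators())
```

Swapping the `rng.gauss` call for a heavier-tailed generator, and comparing the scaled Monte Carlo variance with the average of `plug_in_variance` across replicates, is one way to see where the asymptotic approximation holds.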

2 Standard Errors under Heteroskedasticity

The sandwich covariance estimate in Chapter 9 is a generalization of some work by Halbert White (among others) on the effect of heteroskedasticity (different variances) in multiple regression. Under standard (homoskedastic) assumptions with $\mathrm{Var}(e_i) = \sigma^2$, the covariance matrix of the parameter estimates is the usual $\sigma^2 (X^T X)^{-1}$; sought is a consistent estimator under heteroskedasticity. One proposed estimator is

$$H_1 = \frac{n}{n-p}\,(X^T X)^{-1}(X^T \Omega_1 X)(X^T X)^{-1}$$

where $\Omega_1 = \mathrm{diag}\{\hat{e}_i\}$ and $\hat{e}_i$, $i = 1, \ldots, N$, are residuals. A second estimator does a different correction

$$H_2 = (X^T X)^{-1}(X^T \Omega_2 X)(X^T X)^{-1}$$

where $\Omega_2 = \mathrm{diag}\{\hat{e}_i/(1 - (P_X)_{ii})\}$. Compare these estimators with the usual.
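The three covariance estimates can be sketched for the special case of simple linear regression (intercept plus one slope, so $p = 2$), where the 2x2 algebra can be written out without a linear algebra library. Two assumptions are made here that go beyond the text: the diagonal weights use squared residuals, as in White's standard heteroskedasticity-consistent forms, whereas the text writes $\hat{e}_i$ unsquared; and all function names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sandwich_covariances(x, y):
    """Return (beta, usual, H1, H2) for the model y = b0 + b1 * x + e."""
    n, p = len(x), 2
    rows = [(1.0, xi) for xi in x]  # design matrix rows
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(2)] for i in range(2)]
    bread = inv2(xtx)  # (X^T X)^{-1}
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(2)]
    beta = [sum(bread[i][k] * xty[k] for k in range(2)) for i in range(2)]
    resid = [yi - (beta[0] + beta[1] * xi) for xi, yi in zip(x, y)]
    # Leverages (P_X)_ii = x_i^T (X^T X)^{-1} x_i
    lev = [sum(r[i] * bread[i][j] * r[j] for i in range(2) for j in range(2))
           for r in rows]

    def sandwich(weights):
        meat = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
                 for j in range(2)] for i in range(2)]
        return matmul2(matmul2(bread, meat), bread)

    s2 = sum(e * e for e in resid) / (n - p)
    usual = [[s2 * v for v in row] for row in bread]
    # ASSUMPTION: squared residuals on the diagonal of Omega_1 and Omega_2.
    h1 = [[n / (n - p) * v for v in row]
          for row in sandwich([e * e for e in resid])]
    h2 = sandwich([e * e / (1.0 - h) for e, h in zip(resid, lev)])
    return beta, usual, h1, h2
```

In a simulation, these would be called once per replicate on data generated with a known variance pattern, and the diagonal entries compared with the Monte Carlo variance of the slope estimate.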

3 Error Structure in PD/PK models

In pharmacodynamic/pharmacokinetic models, the response – often a chemical concentration – must be nonnegative. Two routes are commonly used for fitting these nonlinear regression models:

• Using generalized least squares with error variance proportional to the mean, or its square: $Y_j \sim \mathrm{Normal}(g_j, \sigma^2 g_j^{2\theta})$, where $\theta$ may be 0, 1/2, or 1.

• Fitting a log-normal model: $\log(Y_j) \sim \mathrm{Normal}(\log(g_j), \sigma^2)$

Choose one of these four (that is, three values of $\theta$ and log-normal) as the truth and compare the performance of these models. Include as another (fifth) competitor a model with no heteroskedasticity.
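The data-generating step for the candidate truths might look like the following sketch. The exponential-decay mean curve, its parameter values, and every function name are hypothetical placeholders; the assignment does not specify the form of $g_j$. Fitting the competing models is left out.

```python
import math
import random

def mean_curve(times, dose=10.0, k=0.3):
    """Hypothetical mean response g_j = dose * exp(-k * t_j); not from the assignment."""
    return [dose * math.exp(-k * t) for t in times]

def draw_power_of_mean(g, sigma, theta, rng):
    """Y_j ~ Normal(g_j, sigma^2 * g_j^(2 * theta)); draws can come out negative."""
    return [gj + sigma * gj ** theta * rng.gauss(0.0, 1.0) for gj in g]

def draw_lognormal(g, sigma, rng):
    """log(Y_j) ~ Normal(log(g_j), sigma^2); draws are always positive."""
    return [math.exp(math.log(gj) + sigma * rng.gauss(0.0, 1.0)) for gj in g]

def gls_weights(g, theta):
    """Weights proportional to 1 / Var(Y_j) under the power-of-mean model."""
    return [gj ** (-2.0 * theta) for gj in g]

if __name__ == "__main__":
    rng = random.Random(0)
    g = mean_curve([0.5, 1.0, 2.0, 4.0, 8.0])
    print(draw_power_of_mean(g, 0.1, 0.5, rng))
    print(draw_lognormal(g, 0.1, rng))
```

Because the normal models can produce negative responses while the log-normal cannot, a simulation needs a stated rule for negative draws before a log-normal fit is attempted on them.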


4 Testing Variance Components – Balanced

Consider the balanced ($n_i = n$) one-way ANOVA case of $Y_{ij} = \mu + \alpha_i + e_{ij}$ where $e_{ij} \sim \mathrm{Normal}(0, \sigma^2)$ and, independently, $\alpha_i \sim \mathrm{Normal}(0, \sigma_a^2)$. We want to test the hypothesis $H: \sigma_a^2 = 0$. Two approaches are considered.

a) The usual F-test:

$$F = \frac{SSA/[a-1]}{SSE/[a(n-1)]}$$

where $SSA = n \sum_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$, $SSE = \sum_{ij} (y_{ij} - \bar{y}_{i\cdot})^2$, and reject $H$ if $F$ is too big.

b) Likelihood Ratio Test: the log-likelihood under the alternative can be written as

$$\ell(\mu, \sigma_a^2, \sigma^2) = -\frac{an}{2}\log(2\pi) - \frac{1}{2}\log\left[(\sigma^2)^{a(n-1)}(\sigma^2 + n\sigma_a^2)^{a}\right] - \frac{1}{2}\left[\frac{SSE}{\sigma^2} + \frac{SSA}{\sigma^2 + n\sigma_a^2} + \frac{an(\bar{y}_{\cdot\cdot} - \mu)^2}{\sigma^2 + n\sigma_a^2}\right]$$

and you should be able to write and maximize the likelihood under the hypothesis. Reject $H$ if the difference in log-likelihoods is too large.
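Both test statistics can be computed from the sufficient statistics, as in the sketch below. It assumes the usual Gaussian log-likelihood constants ($-\tfrac{an}{2}\log 2\pi$ and factors of $\tfrac12$) and uses closed-form maximizers that respect the constraint $\sigma_a^2 \ge 0$: under the alternative, $\hat\sigma^2 = SSE/[a(n-1)]$ and $\hat\sigma^2 + n\hat\sigma_a^2 = SSA/a$ when that gives a nonnegative $\hat\sigma_a^2$, and otherwise the fit collapses to the null fit $\hat\sigma^2 = (SSA + SSE)/(an)$. The function names are hypothetical.

```python
import math

def loglik(mu, s2a, s2, a, n, ybar, ssa, sse):
    """Log-likelihood of the balanced one-way random effects model."""
    tau = s2 + n * s2a
    return (-0.5 * a * n * math.log(2.0 * math.pi)
            - 0.5 * (a * (n - 1) * math.log(s2) + a * math.log(tau))
            - 0.5 * (sse / s2 + ssa / tau + a * n * (ybar - mu) ** 2 / tau))

def f_and_lrt(groups):
    """Return (F, 2 * log-likelihood difference) for a list of equal-sized groups.

    Requires SSE > 0, i.e. some within-group variation.
    """
    a, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    ybar = sum(means) / a
    ssa = n * sum((m - ybar) ** 2 for m in means)
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    f_stat = (ssa / (a - 1)) / (sse / (a * (n - 1)))

    s2_null = (ssa + sse) / (a * n)  # MLE of sigma^2 when sigma_a^2 = 0
    s2_alt = sse / (a * (n - 1))     # unconstrained MLE of sigma^2
    tau_alt = ssa / a                # unconstrained MLE of sigma^2 + n * sigma_a^2
    if tau_alt > s2_alt:
        s2a_alt = (tau_alt - s2_alt) / n
    else:
        # Boundary case: the constrained maximum coincides with the null fit.
        s2_alt, s2a_alt = s2_null, 0.0
    lrt = 2.0 * (loglik(ybar, s2a_alt, s2_alt, a, n, ybar, ssa, sse)
                 - loglik(ybar, 0.0, s2_null, a, n, ybar, ssa, sse))
    return f_stat, lrt
```

Since the null value lies on the boundary of the parameter space, the likelihood ratio statistic is exactly zero in a substantial share of replicates, so simulated critical values are a natural way to calibrate the rule "too large" in a comparison with the F-test.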

5 Analysis of Rate Statistics

In evaluating the performance of public health programs, often the statistics are cited in terms of rates of incidence of disease per unit, say, deaths per 10,000. In cases where only aggregate data are available, sometimes the aggregation units are very different in size. For example, in North Carolina, disease prevalence data are available for each county either in terms of counts or in terms of rates per 1,000 or some similar unit. (There are some very large counties in NC, e.g. Mecklenburg and Wake, as well as many very small ones.) So the true model may be that the rate in county $i$ follows $\lambda_i = \beta_0 + \beta_1\,\mathrm{income}_i$ with a single covariate of income, and the data are only available by county, so the number observed $Y_i$ in county $i$ may be Poisson with rate $\mathrm{pop}_i \lambda_i$. Compare some different methods of analysis:

• Simple linear regression of the rate $Y_i/\mathrm{pop}_i$ on income.

• Simple linear regression of the square root of the incidence $\sqrt{Y_i}$ on income.

• Generalized least squares of the rate $Y_i/\mathrm{pop}_i$ with variance proportional to the reciprocal of $\mathrm{pop}_i$.

• Poisson regression of the incidence $Y_i$ with rate $\mathrm{pop}_i \exp\{\beta_0 + \beta_1\,\mathrm{income}_i\}$.

• Finally, you might try to fit the true model – a Poisson regression.

The file ncc03q4.dat in the ’rfiles’ directory holds NC County data as of the end of 2003. The columns are:

county name (character)
population
median household income
per capita personal income

Choose one of the two income measures.
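A single simulation replicate covering the first three methods could be sketched as follows. Everything here is illustrative: the function names are hypothetical, populations are assumed to be expressed in units small enough that the Poisson means stay modest (the simple sampler below breaks down for very large means), and the two Poisson regressions are omitted because they need an iterative fitting routine.

```python
import math
import random

def poisson_draw(mean, rng):
    """Knuth's multiplication method; adequate only while exp(-mean) does not underflow."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def weighted_line_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def one_replicate(pop, income, beta0, beta1, rng):
    """Simulate counts Y_i ~ Poisson(pop_i * (beta0 + beta1 * income_i)) and fit."""
    counts = [poisson_draw(p * (beta0 + beta1 * inc), rng)
              for p, inc in zip(pop, income)]
    rates = [c / p for c, p in zip(counts, pop)]
    ones = [1.0] * len(pop)
    return {
        "ols_rate": weighted_line_fit(income, rates, ones),
        "ols_sqrt_count": weighted_line_fit(income, [math.sqrt(c) for c in counts], ones),
        # Variance proportional to 1 / pop_i means weights proportional to pop_i.
        "gls_rate": weighted_line_fit(income, rates, pop),
    }
```

The square-root regression estimates coefficients on a different scale from $\beta_0$ and $\beta_1$, so a comparison across methods would have to be made on fitted rates or predictions rather than on raw coefficients.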
