

Guidelines for model selection in statistical analysis for both association and prediction in the context of biostatistics and epidemiology. It covers the importance of regression coefficient estimates, variable selection, use of automated procedures, area under the ROC curve, and goodness of fit tests. The document also discusses model selection strategies for association and prediction, including confirmatory and exploratory methods, and presents suggestions for future studies.
Typology: Lab Reports


| Is each of the following important | for ASSOCIATION? | for PREDICTION? |
|---|---|---|
| regression coefficient estimates | Yes | No |
| what variables are in the model | Yes | No* |
| the use of automated procedures | No | Yes |
| the area under the ROC curve | No | Yes |
| goodness of fit tests | No | Yes |

* However, the choice of this set of variables may depend on what data is available or what type of data we want to use in our prediction.
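
To make the "area under the ROC curve" row concrete, the following is a minimal sketch of how that quantity can be computed for a prediction model, using the rank (Mann-Whitney) formulation rather than any particular statistics package. The function name `auc` and the labels and scores are hypothetical illustrations, not data from this document.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    This equals the probability that a randomly chosen case (label 1)
    gets a higher predicted score than a randomly chosen non-case
    (label 0), with ties counted as one half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for s_case in cases:
        for s_non in non_cases:
            if s_case > s_non:
                wins += 1.0
            elif s_case == s_non:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical outcomes and predicted probabilities (illustration only).
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
print(round(auc(labels, scores), 3))  # 6 of 9 case/non-case pairs are ordered correctly
```

Note that the calculation uses only the predicted scores and the outcomes. It never looks at individual regression coefficients, which is consistent with the table: for prediction, the quality of the predictions matters rather than the coefficient estimates themselves.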
| | Confirmatory [can be used to test hypotheses] | Exploratory [can generate hypotheses, but cannot test them] |
|---|---|---|
| Main Association Variable | include whether or not significant; include in form consistent with prior hypothesis | include whether or not significant; include in best-fitting form |
| Interactions with Main Association Variable | include only if specified in prior hypothesis; include only in form consistent with prior hypothesis | include only if specified in prior hypothesis; explore functional forms and choose one that fits well [but include main effects if interaction is included] |
| Adjustment Variables (e.g. confounders, precision variables) | include only if specified in prior hypothesis; include in as rich a form as possible to minimize residual confounding; statistical significance is irrelevant | examine possible confounders to see if controlling for them changes coefficient of interest; examine different forms, and choose richer model when there is a difference in the coefficient of interest** |
| Presentation | explanation of what was controlled, and how; adjusted odds ratios, with CIs; possibly unadjusted odds ratios, with CIs; sometimes partially adjusted odds ratios | explanation of what was controlled, and how; adjusted odds ratios, with CIs; possibly unadjusted odds ratios, with CIs; sometimes partially adjusted odds ratios; brief description of model selection and suggestions for future studies of hypotheses that were generated |
** F-tests should only be used to compare functional form (e.g. splines versus linear) if the splines are created with the main association variable. If they are created with an adjustment variable, then the choice of form should be made based on the coefficient of interest only. If this coefficient is different with different forms of adjustment, choose the richer one.
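
The exploratory rule for adjustment variables (check whether controlling for a candidate confounder changes the coefficient of interest, and keep the richer adjustment when it does) can be sketched as follows. This is an illustration under stated assumptions, not the procedure prescribed here: it swaps in stratified 2x2 tables and the Mantel-Haenszel odds ratio in place of a fitted regression model, the counts are made up, and the 10% change threshold is a hypothetical cutoff (the guideline itself names no threshold).

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a * d) / (b * c)


def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% Wald confidence interval for a single-table odds ratio."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio, adjusted for the stratifying variable."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts, one (a, b, c, d) table per level of a candidate confounder.
strata = [(40, 10, 8, 2), (2, 8, 10, 40)]

# Collapsing over the strata gives the unadjusted (crude) table.
pooled = tuple(sum(column) for column in zip(*strata))

crude = odds_ratio(*pooled)
crude_lo, crude_hi = wald_ci(*pooled)
adjusted = mantel_haenszel_or(strata)

# Hypothetical cutoff: treat a relative change above 10% as "a difference".
THRESHOLD = 0.10
keep_adjustment = abs(crude / adjusted - 1) > THRESHOLD

print(f"unadjusted OR = {crude:.2f} (95% CI {crude_lo:.2f} to {crude_hi:.2f})")
print(f"adjusted OR   = {adjusted:.2f}")
print("keep the adjustment variable:", keep_adjustment)
```

In this made-up data the odds ratio is 1 within each stratum, while the collapsed table gives a much larger value, so the adjusted estimate would be the one to report, alongside the unadjusted one as the presentation guidance suggests. The decision uses only the change in the estimate of interest, not the statistical significance of the adjustment variable. A confidence interval for the adjusted estimate is omitted from this sketch.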