


These notes explore model selection strategies for binomial data, focusing on the balance between model complexity and interpretability. They cover indications of collinearity and numerical instability, model building strategies using univariate analysis and multiple logistic regression, and automated variable selection. They also emphasize the importance of assessing assumptions and identifying confounding variables.



Note: These notes are a revision of P. K. Choudhary's lecture notes at the University of Texas at Dallas. I have edited them for our class and have added a section on automated variable selection.
MODEL SELECTION
Competing goals:
Issue: How to select a parsimonious (simple) model that fits the data well?
Indications of collinearity:
Indications of numerical instability:
Model building strategy: (seven or fewer explanatory variables)
Step 1: Use univariate analysis to identify important covariates - the ones that are at least moderately associated with response - one covariate at a time.
Step 2: Fit a multiple logistic regression model using the variables selected in step 1.
Step 2 (alternate): Build the main effects model.
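Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in class: it fits logistic models by Newton-Raphson in pure Python, screens each covariate with a likelihood-ratio test against the intercept-only model, and then refits with the covariates that pass. The function names (`fit_logistic`, `univariate_screen`, `fit_selected`) are hypothetical, and the 0.25 screening level is an assumption standing in for "at least moderately associated"; the notes do not state a cutoff. Each covariate is assumed to be numeric or 0/1 coded, so each test has 1 degree of freedom.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_logistic(X, y, iters=25, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x1 + ... by Newton-Raphson.

    X: list of covariate rows (no intercept column); y: list of 0/1.
    Returns (coefficients, log_likelihood); coefficients[0] is the intercept.
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for r, yi in zip(rows, y):
            eta = sum(bj * xj for bj, xj in zip(beta, r))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * r[j]          # score vector
                for k in range(p):
                    hess[j][k] += w * r[j] * r[k]    # information matrix
        step = solve(hess, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    loglik = 0.0
    for r, yi in zip(rows, y):
        eta = sum(bj * xj for bj, xj in zip(beta, r))
        loglik += yi * eta - math.log1p(math.exp(eta))
    return beta, loglik

def univariate_screen(columns, y, alpha=0.25):
    """Step 1: test each covariate alone against the intercept-only model.

    columns: dict mapping covariate name -> list of values.
    alpha is an assumed screening level, not one taken from the notes.
    """
    _, ll_null = fit_logistic([[] for _ in y], y)
    keep = []
    for name, values in columns.items():
        _, ll_one = fit_logistic([[v] for v in values], y)
        g = max(2.0 * (ll_one - ll_null), 0.0)       # likelihood-ratio statistic
        p_value = math.erfc(math.sqrt(g / 2.0))      # chi-square(1) upper tail
        if p_value < alpha:
            keep.append(name)
    return keep

def fit_selected(columns, y, names):
    """Step 2: multiple logistic regression on the screened covariates."""
    X = [[columns[n][i] for n in names] for i in range(len(y))]
    return fit_logistic(X, y)
```

A typical call would be `names = univariate_screen(columns, y)` followed by `beta, loglik = fit_selected(columns, y, names)`. The sketch has no safeguards for separated data or collinear covariates, in which case the Newton step can fail or the estimates can diverge; a statistical package should be used for real analyses.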
Model Building Strategy: Automatic stepwise selection procedure (10 or more explanatory variables)
The use of automated explanatory variable selection is somewhat controversial. The Northeast SAS Users Paper 222-26 by Shtatland, Cain, and Barton seems to be a reasonable attempt to balance over- and under-parameterization of models chosen by blindly applying automated selection procedures. We will apply their procedure to the Pima Indian diabetes study found in the SAS file pima logistic modelbuilding.sas.
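The details of that procedure are not reproduced in these notes. As a hedged illustration of how over- and under-parameterization can be traded off among a sequence of candidate models, the sketch below ranks models by AIC and BIC computed from their maximized log-likelihoods. Using information criteria for this comparison is an assumption of the example, and the function names (`aic`, `bic`, `rank_models`) are hypothetical.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: -2 log L + 2 k."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian (Schwarz) information criterion: -2 log L + k log n."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def rank_models(candidates, n_obs):
    """Rank candidate models from smallest to largest AIC.

    candidates: list of (name, loglik, n_params), where n_params counts
    every estimated coefficient, including the intercept.
    Returns a list of (name, aic, bic) tuples.
    """
    scored = [(name, aic(ll, k), bic(ll, k, n_obs)) for name, ll, k in candidates]
    return sorted(scored, key=lambda row: row[1])
```

Smaller values are better for both criteria. BIC penalizes each extra parameter by log n rather than 2, so once n exceeds about 8 it tends to favor smaller models than AIC does. Neither criterion replaces the diagnostics step: the top-ranked model still needs to be validated.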
Diagnostics: Validate your model as we have previously discussed. Model building is iterative. The previous steps may have yielded several candidate models from which to choose.