

Material Type: Lab; Class: CAT ANALYSIS EPIDEM; Subject: Epidemiology; University: University of Washington - Seattle; Term: Autumn 2008;


Biostat/Epi 536 Discussion Session 4 – October 21, 2008
Breslow and McCann (Cancer Research 31, 2098-2103, December 1971) use a series of logistic models to analyze 2-year survival probabilities for 246 children with neuroblastoma. Age and extent of tumor spread at diagnosis (i.e., tumor stage) are found to be important factors in determining chances for survival.
We have been tasked with replicating part of the Breslow/McCann analysis, modeling the probability of survival (msurv=1) while adjusting for tumor stage (stg) and age group (ageg). We load the data into Stata and use the list command to display it.
Given that age is categorized as 0-11, 12-23, and 24+ months and stage is categorized as Stage I, Stage II, Stage III, Stage IV, and Stage IV-S, how many distinct covariate patterns might we expect there to be in the fitted model?
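As a rough check on the counting, one can enumerate every stage-by-age combination. The sketch below is written in Python rather than Stata, and the label strings are illustrative assumptions, not the coding used in the data set. It counts the patterns that are possible in principle, which need not match the patterns that actually occur in the data.

```python
from itertools import product

# Category labels as described in the text (the strings themselves are illustrative).
age_groups = ["0-11", "12-23", "24+"]
stages = ["I", "II", "III", "IV", "IV-S"]

# Every possible (stage, age group) pair is one potential covariate pattern.
possible_patterns = list(product(stages, age_groups))

print(len(possible_patterns))
```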
We apply the following commands in Stata:
gen freq = 1
collapse (count) freq (sum) msurv, by(stg ageg)
then display the data a second time:
What does freq represent in the current display? What does msurv represent (notice that it has been redefined)?
How many children are there within the stg=IV, ageg=12-23 covariate pattern? How many of these children survived?
Why might the number of observations in the current display differ from the number of distinct covariate patterns determined above? What is the number of distinct covariate patterns in the fitted model?
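One way to think about the gap between possible and observed patterns is to compare the full cross-classification with the combinations that actually contain children. The Python sketch below uses a small, hypothetical list of (stg, ageg) pairs, not the neuroblastoma data, and the label strings are assumptions made for illustration.

```python
from itertools import product

age_groups = ["0-11", "12-23", "24+"]
stages = ["I", "II", "III", "IV", "IV-S"]

# Hypothetical (stg, ageg) pairs, one per child; NOT the neuroblastoma data.
toy_children = [
    ("I", "0-11"), ("I", "0-11"), ("II", "12-23"),
    ("IV", "24+"), ("IV", "24+"), ("IV-S", "0-11"),
]

possible = set(product(stages, age_groups))  # all stage-by-age combinations
observed = set(toy_children)                 # combinations with at least one child

# Combinations with no children never appear as rows after collapsing.
empty_cells = possible - observed

print(len(possible), len(observed), len(empty_cells))
```

In this toy example the collapsed data would have one row per observed combination, so any empty stage-by-age cell reduces the number of rows below the number of possible patterns.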
Data in the form of display I are referred to as “binary data.” Data in the form of display II are referred to as “binomial data.” Notice that, with binary data, the outcome (msurv) takes a value of 0 or 1. With binomial data, the outcome (together with freq, in this case) gives the proportion of children who survived.
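The step from binary to binomial form can be sketched outside Stata as well. The Python example below mimics the effect of the collapse command on a handful of hypothetical records; the rows are invented for illustration and are not the study data.

```python
from collections import defaultdict

# Hypothetical binary records (stg, ageg, msurv), one per child; NOT the real data.
binary_rows = [
    ("I", "0-11", 1), ("I", "0-11", 1), ("I", "0-11", 0),
    ("IV", "12-23", 0), ("IV", "12-23", 1),
    ("IV", "24+", 0),
]

# Mimic: collapse (count) freq (sum) msurv, by(stg ageg)
counts = defaultdict(lambda: {"freq": 0, "msurv": 0})
for stg, ageg, msurv in binary_rows:
    cell = counts[(stg, ageg)]
    cell["freq"] += 1       # number of children with this covariate pattern
    cell["msurv"] += msurv  # number of those children who survived

# Binomial form: one row per covariate pattern, with the survival proportion.
for (stg, ageg), cell in sorted(counts.items()):
    proportion = cell["msurv"] / cell["freq"]
    print(stg, ageg, cell["freq"], cell["msurv"], round(proportion, 3))
```

Each printed row corresponds to one covariate pattern: freq plays the role of the binomial denominator and the redefined msurv the number of survivors.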