












Material Type: Notes; Class: Categorical Data Analysis I; Subject: Statistics; University: Penn State - Main Campus; Term: Unknown 1989;
Typology: Study notes













Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r > 2 categories. (Note: The word polychotomous is sometimes used, but this word does not exist!) When analyzing a polytomous response, it’s important to note whether the response is ordinal (consisting of ordered categories) or nominal (consisting of unordered categories). Some types of models are appropriate only for ordinal responses; other models may be used whether the response is ordinal or nominal. If the response is ordinal, we do not necessarily have to take the ordering into account, but it often helps if we do. Using the natural ordering can
If the response variable is polytomous and all the potential predictors are discrete as well, we could describe the multiway contingency table by a loglinear model. But fitting a loglinear model has two disadvantages:
In ungrouped form, the response occupies a single column of the dataset, but in grouped form the response occupies r columns. Most computer programs for polytomous logistic regression can handle grouped or ungrouped data. Whether the data are grouped or ungrouped, we will imagine the response to be multinomial. That is, the “response” for row i, $y_i = (y_{i1}, y_{i2}, \ldots, y_{ir})^T$, is assumed to have a multinomial distribution with index $n_i = \sum_{j=1}^{r} y_{ij}$ and parameter $\pi_i = (\pi_{i1}, \pi_{i2}, \ldots, \pi_{ir})^T$. If the data are grouped, then $n_i$ is the total number of “trials” in the ith row of the dataset, and $y_{ij}$ is the number of trials in which outcome j occurred. If the data are ungrouped, then $y_i$ has a 1 in the position corresponding to the outcome that occurred and 0’s elsewhere, and $n_i = 1$. Note, however, that if the data are ungrouped, we do not have to actually create a dataset with columns of 0’s and 1’s; a single column containing the response level 1, 2, ..., r is sufficient.
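The relationship between the two layouts can be illustrated with a small sketch. The function name `group_responses`, the covariate patterns "a" and "b", and the toy data are all hypothetical; the sketch only shows how ungrouped rows (each with n_i = 1) collapse into grouped count vectors with r columns.

```python
def group_responses(rows, r):
    """Collapse ungrouped rows (covariate_pattern, response_level) into
    grouped counts: one vector (y_i1, ..., y_ir) per covariate pattern."""
    counts = {}
    for pattern, level in rows:
        # each ungrouped row is a single multinomial trial (n_i = 1)
        vec = counts.setdefault(pattern, [0] * r)
        vec[level - 1] += 1
    return counts

# hypothetical ungrouped data: (covariate pattern, response level in 1..r)
ungrouped = [("a", 1), ("a", 3), ("a", 1), ("b", 2), ("b", 2), ("b", 3)]

grouped = group_responses(ungrouped, r=3)

# the multinomial index n_i is the row sum of the grouped counts
n = {pattern: sum(y) for pattern, y in grouped.items()}
```

Either layout carries the same information for the likelihood; the grouped form simply stores one row per covariate pattern.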
Describing polytomous responses by a sequence of binary models. In some cases, it makes sense to “factor” the response into a sequence of binary choices and model them with a sequence of ordinary logistic models. For example, consider the study of the effects of radiation exposure on mortality. The four-level response can be modeled in three stages:
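[Figure: tree diagram of the three stages. Stage 1: Population splits into Alive and Dead. Stage 2: Dead splits into Non-cancer and Cancer. Stage 3: Cancer splits into Other cancer and Leukemia.]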
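As a rough sketch of this factoring, the code below turns one four-category count vector into three binomial (events, trials) pairs, one per stage. It assumes the stages are dead vs. alive in the whole population, cancer vs. non-cancer among deaths, and leukemia vs. other cancer among cancer deaths, as the stage labels suggest. The function name and the counts are made up for illustration and are not from the radiation study.

```python
def factor_into_stages(alive, noncancer, other_cancer, leukemia):
    """Split a four-category count vector into three binomial (events, trials) pairs."""
    dead = noncancer + other_cancer + leukemia
    cancer = other_cancer + leukemia
    return [
        (dead, alive + dead),  # stage 1: dead vs. alive, whole population
        (cancer, dead),        # stage 2: cancer vs. non-cancer, among deaths
        (leukemia, cancer),    # stage 3: leukemia vs. other cancer, among cancer deaths
    ]

# made-up counts for a single exposure group
counts = dict(alive=900, noncancer=70, other_cancer=25, leukemia=5)
stages = factor_into_stages(**counts)

# multiplying the stage-wise conditional proportions recovers the
# marginal proportion of leukemia deaths in the population
p_leukemia = 1.0
for events, trials in stages:
    p_leukemia *= events / trials
```

Each (events, trials) pair could then be passed to an ordinary binary logistic regression for its stage.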
Baseline-category logit model. Suppose that $y_i = (y_{i1}, y_{i2}, \ldots, y_{ir})^T$ has a multinomial distribution with index $n_i = \sum_{j=1}^{r} y_{ij}$ and parameter $\pi_i = (\pi_{i1}, \pi_{i2}, \ldots, \pi_{ir})^T$.
When the response categories 1, 2, ..., r are unordered, the most popular way to relate $\pi_i$ to covariates is through a set of $r - 1$ baseline-category logits. Taking $j^*$ as the baseline category, the model is
$$\log\left(\frac{\pi_{ij}}{\pi_{ij^*}}\right) = x_i^T \beta_j, \qquad j \neq j^*.$$
If $x_i$ has length p, then this model has $(r - 1) \times p$ free parameters, which we can arrange as a matrix or a vector. For example, if the last category is the baseline ($j^* = r$), the coefficients are
$$\beta = [\beta_1, \beta_2, \ldots, \beta_{r-1}]$$
or
$$\mathrm{vec}(\beta) = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{r-1} \end{bmatrix}.$$
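To see how the coefficients determine the category probabilities, note that with the last category as baseline the logits invert to π_ij = exp(x_i^T β_j) / (1 + Σ_k exp(x_i^T β_k)) for j < r, with the remaining probability going to category r. The sketch below implements that inversion. The function name and the coefficient values are hypothetical, chosen only to illustrate the calculation.

```python
import math

def baseline_logit_probs(x, betas):
    """Category probabilities under a baseline-category logit model.

    x     : covariate vector of length p
    betas : list of r-1 coefficient vectors (each of length p), one per
            non-baseline category; the baseline (last) category has an
            implicit linear predictor of zero.
    """
    etas = [sum(b_k * x_k for b_k, x_k in zip(beta, x)) for beta in betas]
    etas.append(0.0)                 # baseline: log(pi_r / pi_r) = 0
    m = max(etas)                    # subtract the max for numerical stability
    expo = [math.exp(e - m) for e in etas]
    total = sum(expo)
    return [e / total for e in expo]

# hypothetical coefficients: r = 3 categories, p = 2 (intercept + one covariate)
betas = [[0.5, -1.0], [-0.2, 0.3]]
x = [1.0, 2.0]

probs = baseline_logit_probs(x, betas)

# sanity check: log(pi_j / pi_r) reproduces the linear predictors x^T beta_j
logits = [math.log(p / probs[-1]) for p in probs[:-1]]
```

Changing the baseline category changes the coefficients but not the fitted probabilities, since only differences between linear predictors matter.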
Comments on this model
length of $x_i$, so the degrees of freedom are $df = (N - p)(r - 1)$. The corresponding Pearson statistic is
$$X^2 = \sum_{i=1}^{N} \sum_{j=1}^{r} r_{ij}^2,$$
where $r_{ij} = (y_{ij} - \hat\mu_{ij}) / \sqrt{\hat\mu_{ij}}$ is the Pearson residual. If the model is true, both are approximately distributed as $\chi^2_{df}$ provided that
Overdispersion. Overdispersion means that the actual covariance matrix of $y_i$ exceeds that specified by the multinomial model,
$$V(y_i) = n_i \left[ \mathrm{Diag}(\pi_i) - \pi_i \pi_i^T \right].$$
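For concreteness, the multinomial covariance matrix can be computed directly from n_i and π_i. This is a minimal sketch; the function name and the values n_i = 10 and π_i = (0.5, 0.3, 0.2) are illustrative assumptions, not values from the notes.

```python
def multinomial_cov(n, pi):
    """Covariance matrix n * (Diag(pi) - pi pi^T) of a multinomial count vector."""
    r = len(pi)
    return [
        [n * ((pi[j] if j == k else 0.0) - pi[j] * pi[k]) for k in range(r)]
        for j in range(r)
    ]

# illustrative index and probability vector
V = multinomial_cov(10, [0.5, 0.3, 0.2])
```

The diagonal entries are the binomial variances n_i π_ij (1 − π_ij), the off-diagonal entries are negative, and each row sums to zero because the counts are constrained to add up to n_i.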
It is reasonable to think that overdispersion is present if
$\left[ \mathrm{Diag}(\pi_i) - \pi_i \pi_i^T \right].$
The usual estimate for $\sigma^2$ is
$$\hat\sigma^2 = \frac{X^2}{(N - p)(r - 1)},$$
which is approximately unbiased if $(N - p)(r - 1)$ is large. Introducing a scale parameter does not alter the estimate of $\beta$ (which then becomes a quasilikelihood estimate), but it does alter our
                         Primary Food Choice
Lake      Sex  Size     Fish  Inv.  Rept.  Bird  Other
Hancock   M    small       7     1      0     0      5
               large       4     0      0     1      2
          F    small      16     3      2     2      3
               large       3     0      1     2      3
Oklawaha  M    small       2     2      0     0      1
               large      13     7      6     0      0
          F    small       3     9      1     0      2
               large       0     1      0     1      0
Trafford  M    small       3     7      1     0      1
               large       8     6      6     3      5
          F    small       2     4      1     1      4
               large       0     1      0     0      0
George    M    small      13    10      0     2      2
               large       9     0      0     1      2
          F    small       3     9      1     0      1
               large       8     1      0     0      1
Because the usual primary food choice of alligators appears to be fish, we’ll use fish as the baseline category; the four logit equations will then describe the log-odds that alligators select other primary food types instead of fish.
Entering the data. When the data are grouped, as they are in this example, SAS expects the response categories 1, 2, ..., r to appear in a single column of the dataset, with another column containing the frequency or count. That is, the data should look like this:

Hancock male small fish 7
Hancock male small invert 1
Hancock male small reptile 0
Hancock male small bird 0
Hancock male small other 5
Hancock male large fish 4
Hancock male large invert 0
Hancock male large reptile 0
Hancock male large bird 1
Hancock male large other 2
--lines omitted--
George female large bird 0
George female large other 1

The lines that have a frequency of zero are not actually used in the modeling, because they contribute nothing to the loglikelihood. You can include them if you want to, but it’s not necessary.
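The reshaping from the cross-classified table to this one-row-per-category layout can be sketched in a few lines. The counts are copied from the alligator table earlier in these notes; the names `FOODS`, `WIDE` and `to_long` are hypothetical helpers, and this is an illustration of the layout rather than the SAS data step itself.

```python
FOODS = ["fish", "invert", "reptile", "bird", "other"]

# (lake, sex, size) -> counts in the order of FOODS
WIDE = {
    ("Hancock", "male", "small"): [7, 1, 0, 0, 5],
    ("Hancock", "male", "large"): [4, 0, 0, 1, 2],
    ("Hancock", "female", "small"): [16, 3, 2, 2, 3],
    ("Hancock", "female", "large"): [3, 0, 1, 2, 3],
    ("Oklawaha", "male", "small"): [2, 2, 0, 0, 1],
    ("Oklawaha", "male", "large"): [13, 7, 6, 0, 0],
    ("Oklawaha", "female", "small"): [3, 9, 1, 0, 2],
    ("Oklawaha", "female", "large"): [0, 1, 0, 1, 0],
    ("Trafford", "male", "small"): [3, 7, 1, 0, 1],
    ("Trafford", "male", "large"): [8, 6, 6, 3, 5],
    ("Trafford", "female", "small"): [2, 4, 1, 1, 4],
    ("Trafford", "female", "large"): [0, 1, 0, 0, 0],
    ("George", "male", "small"): [13, 10, 0, 2, 2],
    ("George", "male", "large"): [9, 0, 0, 1, 2],
    ("George", "female", "small"): [3, 9, 1, 0, 1],
    ("George", "female", "large"): [8, 1, 0, 0, 1],
}

def to_long(wide, drop_zeros=False):
    """Return one (lake, sex, size, food, count) row per profile and food type."""
    rows = []
    for (lake, sex, size), counts in wide.items():
        for food, count in zip(FOODS, counts):
            if drop_zeros and count == 0:
                continue  # zero-frequency rows add nothing to the loglikelihood
            rows.append((lake, sex, size, food, count))
    return rows

long_rows = to_long(WIDE)                      # all 16 x 5 rows
nonzero_rows = to_long(WIDE, drop_zeros=True)  # rows that enter the fit

# total count for each food type, for checking against the table margins
food_totals = {food: 0 for food in FOODS}
for *_, food, count in long_rows:
    food_totals[food] += count
```

Comparing `len(long_rows)` with `len(nonzero_rows)` shows how many zero-frequency lines could be omitted from the input file.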
Response Profile

Ordered               Total
Value     food        Frequency
1         bird        13
2         fish        94
3         invert      61
4         other       32
5         reptile     19

Logits modeled use food='fish' as the reference category.
NOTE: 24 observations having zero frequencies or weights were excluded since they do not contribute to the analysis.
Deviance and Pearson Goodness-of-Fit Statistics

Criterion   DF   Value     Value/DF   Pr > ChiSq
Deviance    40   50.2637   1.2566     0.
Pearson     40   52.5643   1.3141     0.

Number of unique profiles: 16

There are N = 16 profiles (unique combinations of lake, sex and size) in this dataset. The saturated model, which fits a separate multinomial distribution to each profile, has 16 × 4 = 64 free parameters. The current model has an intercept, three lake coefficients, one sex coefficient and one size coefficient for each of the four logit equations, for a total of 24 parameters. Therefore, the overall fit statistics have 64 − 24 = 40 degrees of freedom.
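The degrees-of-freedom bookkeeping and the Value/DF column can be reproduced with a short calculation. The sketch below also computes upper-tail chi-square probabilities using a closed form that holds only for even degrees of freedom; that helper is an assumption made to keep the example dependency-free and is not how SAS computes its p-values. The deviance and Pearson values are taken from the output above, and the Pearson ratio is the scale estimate X²/((N − p)(r − 1)) defined in the overdispersion discussion.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total

n_profiles, r = 16, 5             # unique profiles, response categories
p = 1 + 3 + 1 + 1                 # intercept, lake, sex, size per logit equation

saturated_params = n_profiles * (r - 1)
model_params = p * (r - 1)
df = saturated_params - model_params          # same as (N - p)(r - 1)

deviance, pearson = 50.2637, 52.5643          # from the SAS output
scale = pearson / df                          # estimate of sigma^2

p_deviance = chi2_sf_even_df(deviance, df)
p_pearson = chi2_sf_even_df(pearson, df)
```

A scale estimate only modestly above 1, as here, gives little reason to suspect serious overdispersion.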
Output pertaining to the significance of covariates:

Testing Global Null Hypothesis: BETA=

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   66.4974      20   <.
Score              59.4616      20   <.
Wald               51.2336      20   0.

Type III Analysis of Effects

                 Wald
Effect   DF   Chi-Square   Pr > ChiSq
lake     12   36.2293      0.
size      4   15.8873      0.
sex       4    2.1850      0.

The first section (global null hypothesis) tests the fit of the current model against a null or intercept-only model. The null model has four parameters (one for each logit equation). Therefore the comparison has 24 − 4 = 20 degrees of freedom. This test is highly significant, indicating that at least one of the covariates has an effect on food choice. The next section (Type III analysis of effects) shows the change in fit resulting from discarding any one of the covariates (lake, sex or size) while keeping the others in the model. For example, consider the test for lake. Discarding lake is equivalent to setting three coefficients to zero in each of the four logit
size  large   other     1   -0.2906   0.4599   0.3992   0.
size  large   reptile   1    0.5570   0.6466   0.7421   0.
sex   female  bird      1    0.6064   0.6888   0.7750   0.
sex   female  invert    1    0.4630   0.3955   1.3701   0.
sex   female  other     1    0.2526   0.4663   0.2933   0.
sex   female  reptile   1    0.6275   0.6852   0.8387   0.

Odds Ratio Estimates

                                       Point       95% Wald
Effect                      food       Estimate    Confidence Limits
lake Oklawaha vs Hancock    bird        0.324      0.031    3.
lake Oklawaha vs Hancock    invert     14.786      3.983   54.
lake Oklawaha vs Hancock    other       0.477      0.111    2.
lake Oklawaha vs Hancock    reptile     4.058      0.829   19.
lake Trafford vs Hancock    bird        1.938      0.369   10.
lake Trafford vs Hancock    invert     18.846      4.899   72.
lake Trafford vs Hancock    other       2.206      0.697    6.
lake Trafford vs Hancock    reptile     6.900      1.369   34.
lake George vs Hancock      bird        0.563      0.118    2.
lake George vs Hancock      invert      5.933      1.749   20.
lake George vs Hancock      other       0.465      0.152    1.
lake George vs Hancock      reptile     0.323      0.031    3.
size large vs small         bird        2.076      0.578    7.
size large vs small         invert      0.263      0.117    0.
size large vs small         other       0.748      0.304    1.
size large vs small         reptile     1.745      0.492    6.
sex female vs male          bird        1.834      0.475    7.
sex female vs male          invert      1.589      0.732    3.
sex female vs male          other       1.287      0.516    3.
sex female vs male          reptile     1.873      0.489    7.

How do we interpret them? Recall that there are four logit equations to predict the log-odds of