Multinomial Logistic Regression Model - Categorical Data | STAT 544, Study notes of Statistics

Pennsylvania State University - Abington Statistics

Material Type: Notes; Class: Categorical Data Analysis I; Subject: Statistics; University: Penn State - Main Campus; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 09/24/2009

koofers-user-ti0 🇺🇸


Stat 544, Lecture 19


Multinomial Logistic Regression Models

Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r > 2 categories. (Note: The word polychotomous is sometimes used, but this word does not exist!) When analyzing a polytomous response, it’s important to note whether the response is ordinal (consisting of ordered categories) or nominal (consisting of unordered categories). Some types of models are appropriate only for ordinal responses; other models may be used whether the response is ordinal or nominal. If the response is ordinal, we do not necessarily have to take the ordering into account, but it often helps if we do. Using the natural ordering can

• lead to a simpler, more parsimonious model and
• increase power to detect relationships with other variables.

If the response variable is polytomous and all the potential predictors are discrete as well, we could describe the multiway contingency table by a loglinear model. But fitting a loglinear model has two disadvantages:

• It has many more parameters, and many of them are not of interest. The loglinear model describes the joint distribution of all the variables, whereas the logistic model describes only the conditional distribution of the response given the predictors.
• The loglinear model is more complicated to interpret. In the loglinear model, the effect of a predictor X on the response Y is described by the XY association. In a logit model, however, the effect of X on Y is a main effect.

If you are analyzing a set of categorical variables, and one of them is clearly a “response” while the others are predictors, I recommend that you use logistic rather than loglinear models.

In ungrouped form, the response occupies a single column of the dataset, but in grouped form the response occupies r columns. Most computer programs for polytomous logistic regression can handle grouped or ungrouped data. Whether the data are grouped or ungrouped, we will imagine the response to be multinomial. That is, the “response” for row i, $y_i = (y_{i1}, y_{i2}, \ldots, y_{ir})^T$, is assumed to have a multinomial distribution with index $n_i = \sum_{j=1}^{r} y_{ij}$ and parameter $\pi_i = (\pi_{i1}, \pi_{i2}, \ldots, \pi_{ir})^T$. If the data are grouped, then $n_i$ is the total number of “trials” in the ith row of the dataset, and $y_{ij}$ is the number of trials in which outcome j occurred. If the data are ungrouped, then $y_i$ has a 1 in the position corresponding to the outcome that occurred and 0’s elsewhere, and $n_i = 1$. Note, however, that if the data are ungrouped, we do not have to actually create a dataset with columns of 0’s and 1’s; a single column containing the response level 1, 2, ..., r is sufficient.
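The link between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data (r = 3 response levels, two covariate patterns labelled "a" and "b") and the function name `group_responses` are hypothetical and do not come from the notes.

```python
def group_responses(patterns, levels, r):
    """Collapse ungrouped rows (one response level in 1..r per row) into
    grouped count vectors y_i = (y_i1, ..., y_ir), one per covariate pattern."""
    grouped = {}
    for pattern, level in zip(patterns, levels):
        counts = grouped.setdefault(pattern, [0] * r)
        counts[level - 1] += 1  # level j is counted in position j of the vector
    return grouped


# Hypothetical ungrouped data: each row is a single trial (n_i = 1).
patterns = ["a", "a", "b", "a", "b", "b"]   # covariate pattern of each row
levels = [1, 3, 2, 1, 2, 3]                 # observed response level of each row

grouped = group_responses(patterns, levels, r=3)

# After grouping, the multinomial index n_i is the row total.
n = {pattern: sum(counts) for pattern, counts in grouped.items()}

print(grouped)  # {'a': [2, 0, 1], 'b': [0, 2, 1]}
print(n)        # {'a': 3, 'b': 3}
```

Both layouts carry the same information about the multinomial response; the grouped form simply stores one count vector per covariate pattern instead of one row per trial.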

Describing polytomous responses by a sequence of binary models. In some cases, it makes sense to “factor” the response into a sequence of binary choices and model them with a sequence of ordinary logistic models. For example, consider the study of the effects of radiation exposure on mortality. The four-level response can be modeled in three stages:

[Figure: tree diagram of the staged response]
Stage 1: Population → Alive or Dead
Stage 2: Dead → Non-cancer or Cancer
Stage 3: Cancer → Other cancer or Leukemia
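A small sketch may help show how three binary stages determine the four outcome probabilities. It assumes that each stage has its own logit (which in a fitted model would be a linear function of the covariates); the function names and the numeric logits below are hypothetical.

```python
import math


def inv_logit(eta):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def staged_probabilities(eta_dead, eta_cancer, eta_leukemia):
    """Combine three binary-stage logits into four outcome probabilities.

    eta_dead:     log-odds of dead vs. alive                         (stage 1)
    eta_cancer:   log-odds of cancer vs. non-cancer, among the dead  (stage 2)
    eta_leukemia: log-odds of leukemia vs. other cancer,
                  among cancer deaths                                (stage 3)
    """
    p_dead = inv_logit(eta_dead)
    p_cancer = inv_logit(eta_cancer)
    p_leuk = inv_logit(eta_leukemia)
    return {
        "alive": 1.0 - p_dead,
        "non-cancer death": p_dead * (1.0 - p_cancer),
        "other cancer": p_dead * p_cancer * (1.0 - p_leuk),
        "leukemia": p_dead * p_cancer * p_leuk,
    }


# Hypothetical logits: with all three equal to 0, every stage is a 50/50 split.
probs = staged_probabilities(0.0, 0.0, 0.0)
print(probs)
```

Because each stage conditions on the outcome of the previous one, the four probabilities always sum to one, whatever the three logits are.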

Baseline-category logit model. Suppose that $y_i = (y_{i1}, y_{i2}, \ldots, y_{ir})^T$ has a multinomial distribution with index $n_i = \sum_{j=1}^{r} y_{ij}$ and parameter $\pi_i = (\pi_{i1}, \pi_{i2}, \ldots, \pi_{ir})^T$.

When the response categories 1, 2, ..., r are unordered, the most popular way to relate $\pi_i$ to covariates is through a set of r − 1 baseline-category logits. Taking $j^*$ as the baseline category, the model is

$$\log\left(\frac{\pi_{ij}}{\pi_{ij^*}}\right) = x_i^T \beta_j, \qquad j \neq j^*.$$

If $x_i$ has length p, then this model has (r − 1) × p free parameters, which we can arrange as a matrix or a vector. For example, if the last category is the baseline ($j^* = r$), the coefficients are

$$\beta = [\beta_1, \beta_2, \ldots, \beta_{r-1}] \quad \text{or} \quad \mathrm{vec}(\beta) = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{r-1} \end{bmatrix}.$$
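To make the model concrete, the logits can be inverted to give category probabilities: with the last category as baseline, each non-baseline probability is exp(x^T β_j) divided by 1 plus the sum of those exponentials, and the baseline takes the remainder. The sketch below assumes this standard inversion; the coefficients, the covariate vector, and the function name are hypothetical.

```python
import math


def baseline_category_probs(x, betas):
    """Category probabilities under a baseline-category logit model.

    x:     covariate vector of length p (include a leading 1 for an intercept)
    betas: list of r - 1 coefficient vectors, one per non-baseline category;
           the baseline is taken to be the last category (j* = r).
    """
    etas = [sum(xk * bk for xk, bk in zip(x, beta)) for beta in betas]
    denom = 1.0 + sum(math.exp(eta) for eta in etas)  # the 1 is exp(0) for the baseline
    probs = [math.exp(eta) / denom for eta in etas]
    probs.append(1.0 / denom)  # baseline category comes last
    return probs


# Hypothetical coefficients: r = 3 categories, p = 2 (intercept and one covariate).
betas = [[0.5, -1.0], [-0.2, 0.3]]
x = [1.0, 2.0]
probs = baseline_category_probs(x, betas)

n_free = len(betas) * len(x)             # (r - 1) * p free parameters
logit_1 = math.log(probs[0] / probs[2])  # recovers x^T beta_1 = 0.5 - 2.0 = -1.5

print(probs, n_free, logit_1)
```

Taking the log of any fitted probability over the baseline probability returns the corresponding linear predictor, which is exactly the model equation read in reverse.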

Comments on this model

• The kth element of $\beta_j$ can be interpreted as: the increase in log-odds of falling into category j versus category $j^*$ resulting from a one-unit increase in the kth covariate, holding the other covariates constant.
• Removing the kth covariate from the model is equivalent to simultaneously setting r − 1 coefficients to zero (one in each of the r − 1 logit equations).
• Any of the categories can be chosen to be the baseline. The model will fit equally well, achieving the same likelihood and producing the same fitted values. Only the values and interpretation of the coefficients will change.
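The last comment can be checked numerically. The sketch below assumes the usual re-parameterization: to switch baselines, subtract the new baseline's coefficient vector from every category's vector (the old baseline's vector being zero). All coefficients and helper names are hypothetical.

```python
import math


def probs_from_etas(etas):
    """Normalize exponentiated linear predictors; the baseline has eta = 0."""
    total = sum(math.exp(eta) for eta in etas)
    return [math.exp(eta) / total for eta in etas]


def linear_predictors(x, betas):
    """Compute x^T beta_j for every category j."""
    return [sum(xk * bk for xk, bk in zip(x, beta)) for beta in betas]


def rebaseline(betas, new_base):
    """Re-express coefficient vectors relative to a different baseline.

    betas: list of r coefficient vectors, with the current baseline's vector
    all zeros. Subtracting the new baseline's vector from every vector leaves
    all differences beta_j - beta_k, and hence the fitted probabilities, unchanged.
    """
    ref = betas[new_base]
    return [[b - r0 for b, r0 in zip(beta, ref)] for beta in betas]


# Hypothetical model: r = 3, p = 2, baseline is the last category (zero vector).
betas_last = [[0.5, -1.0], [-0.2, 0.3], [0.0, 0.0]]
betas_first = rebaseline(betas_last, new_base=0)  # category 1 becomes the baseline

x = [1.0, 2.0]
p_last = probs_from_etas(linear_predictors(x, betas_last))
p_first = probs_from_etas(linear_predictors(x, betas_first))
max_diff = max(abs(a - b) for a, b in zip(p_last, p_first))

print(betas_first)
print(max_diff)
```

The coefficients change, but the fitted probabilities agree up to floating-point error, which is why the choice of baseline affects interpretation only.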

length of $x_i$, so the degrees of freedom are df = (N − p)(r − 1). The corresponding Pearson statistic is

$$X^2 = \sum_{i=1}^{N} \sum_{j=1}^{r} r_{ij}^2,$$

where $r_{ij} = (y_{ij} - \hat{\mu}_{ij}) / \sqrt{\hat{\mu}_{ij}}$ is the Pearson residual. If the model is true, both are approximately distributed as $\chi^2_{df}$ provided that

• no more than 20% of the $\mu_{ij}$’s are below 5.0, and
• none are below 1.0.

In practice this is often not satisfied, so there may be no way to assess the overall fit of the model. However, we may still apply a $\chi^2$ approximation to $\Delta G^2$ and $\Delta X^2$ to compare nested models, provided that (N − p)(r − 1) is large relative to Δdf.
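A rough sketch of these calculations for grouped data follows. It assumes the fitted means are $\hat{\mu}_{ij} = n_i \hat{\pi}_{ij}$ and uses the standard deviance formula $G^2 = 2 \sum_i \sum_j y_{ij} \log(y_{ij} / \hat{\mu}_{ij})$, which is not shown in this excerpt. The counts, fitted probabilities, and function name are hypothetical.

```python
import math


def fit_statistics(y, pi_hat):
    """Pearson X^2 and deviance G^2 for grouped multinomial data.

    y:      list of N count vectors (one per row of the grouped dataset)
    pi_hat: list of N fitted probability vectors from some model
    Also reports the two rule-of-thumb checks on the fitted means.
    """
    x2 = 0.0
    g2 = 0.0
    fitted = []
    for counts, probs in zip(y, pi_hat):
        n_i = sum(counts)
        for y_ij, p_ij in zip(counts, probs):
            mu_ij = n_i * p_ij
            fitted.append(mu_ij)
            x2 += (y_ij - mu_ij) ** 2 / mu_ij   # squared Pearson residual
            if y_ij > 0:                        # zero counts add nothing to G^2
                g2 += 2.0 * y_ij * math.log(y_ij / mu_ij)
    share_below_5 = sum(mu < 5.0 for mu in fitted) / len(fitted)
    any_below_1 = any(mu < 1.0 for mu in fitted)
    return x2, g2, share_below_5, any_below_1


# Hypothetical data: one row with n = 20 and fitted probabilities (0.5, 0.25, 0.25),
# so the fitted means are 10, 5 and 5.
x2, g2, share_below_5, any_below_1 = fit_statistics([[12, 4, 4]], [[0.5, 0.25, 0.25]])
print(x2, g2, share_below_5, any_below_1)
```

Here the Pearson statistic is 4/10 + 1/5 + 1/5 = 0.8, the deviance is close to it, and neither rule-of-thumb condition is violated.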

Overdispersion. Overdispersion means that the actual covariance matrix of $y_i$ exceeds that specified by the multinomial model,

$$V(y_i) = n_i \left[ \mathrm{Diag}(\pi_i) - \pi_i \pi_i^T \right].$$

It is reasonable to think that overdispersion is present if

• the data are grouped ($n_i$’s are greater than 1),
• $x_i$ already contains all covariates worth considering, and
• the overall $X^2$ is substantially larger than its degrees of freedom (N − p)(r − 1).

In this situation, it may be worthwhile to introduce a scale parameter $\sigma^2$, so that

$$V(y_i) = n_i \sigma^2 \left[ \mathrm{Diag}(\pi_i) - \pi_i \pi_i^T \right].$$

The usual estimate for $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{X^2}{(N - p)(r - 1)},$$

which is approximately unbiased if (N − p)(r − 1) is large. Introducing a scale parameter does not alter the estimate of β (which then becomes a quasilikelihood estimate), but it does alter our

                           Primary Food Choice
Lake       Sex   Size     Fish   Inv.   Rept.   Bird   Other
Hancock    M     small       7      1       0      0       5
                 large       4      0       0      1       2
           F     small      16      3       2      2       3
                 large       3      0       1      2       3
Oklawaha   M     small       2      2       0      0       1
                 large      13      7       6      0       0
           F     small       3      9       1      0       2
                 large       0      1       0      1       0
Trafford   M     small       3      7       1      0       1
                 large       8      6       6      3       5
           F     small       2      4       1      1       4
                 large       0      1       0      0       0
George     M     small      13     10       0      2       2
                 large       9      0       0      1       2
           F     small       3      9       1      0       1
                 large       8      1       0      0       1

Because the usual primary food choice of alligators appears to be fish, we’ll use fish as the baseline category; the four logit equations will then describe the log-odds that alligators select other primary food types instead of fish.

Entering the data. When the data are grouped, as they are in this example, SAS expects the response categories 1, 2, ..., r to appear in a single column of the dataset, with another column containing the frequency or count. That is, the data should look like this:

    Hancock  male    small  fish     7
    Hancock  male    small  invert   1
    Hancock  male    small  reptile  0
    Hancock  male    small  bird     0
    Hancock  male    small  other    5
    Hancock  male    large  fish     4
    Hancock  male    large  invert   0
    Hancock  male    large  reptile  0
    Hancock  male    large  bird     1
    Hancock  male    large  other    2
    --lines omitted--
    George   female  large  bird     0
    George   female  large  other    1

The lines that have a frequency of zero are not actually used in the modeling, because they contribute nothing to the loglikelihood. You can include them if you want to, but it’s not necessary.
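The reshaping from the wide table to this one-row-per-category layout can be sketched as follows. This is an illustrative Python helper, not part of the SAS workflow in the notes; the function name `to_long` and the food labels' spelling as a Python list are assumptions for the example, while the counts are the two Hancock male rows from the table.

```python
FOODS = ["fish", "invert", "reptile", "bird", "other"]


def to_long(wide_rows, drop_zeros=False):
    """Expand grouped rows (one count per food type) into one record per
    (lake, sex, size, food) combination with its frequency."""
    records = []
    for lake, sex, size, counts in wide_rows:
        for food, count in zip(FOODS, counts):
            if drop_zeros and count == 0:
                continue  # zero-frequency lines add nothing to the loglikelihood
            records.append((lake, sex, size, food, count))
    return records


# The two Hancock male rows from the table of primary food choice.
wide_rows = [
    ("Hancock", "male", "small", [7, 1, 0, 0, 5]),
    ("Hancock", "male", "large", [4, 0, 0, 1, 2]),
]

long_rows = to_long(wide_rows)
nonzero_rows = to_long(wide_rows, drop_zeros=True)

for record in long_rows[:3]:
    print(*record)
```

Dropping the zero-frequency records shortens the dataset without changing the fit, which matches the remark that such lines are optional.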

Response Profile

Ordered Value   food      Total Frequency
1               bird      13
2               fish      94
3               invert    61
4               other     32
5               reptile   19

Logits modeled use food=’fish’ as the reference category.

NOTE: 24 observations having zero frequencies or weights were excluded since they do not contribute to the analysis.
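The response profile can be cross-checked against the data table. The short Python sketch below transcribes the 16 rows of counts from the table of primary food choice and recomputes the column totals and the number of zero cells; the variable names are arbitrary.

```python
FOODS = ["fish", "invert", "reptile", "bird", "other"]

# Counts transcribed from the table of primary food choice, one row per
# lake x sex x size profile, in the column order Fish, Inv., Rept., Bird, Other.
COUNTS = [
    [7, 1, 0, 0, 5], [4, 0, 0, 1, 2], [16, 3, 2, 2, 3], [3, 0, 1, 2, 3],    # Hancock
    [2, 2, 0, 0, 1], [13, 7, 6, 0, 0], [3, 9, 1, 0, 2], [0, 1, 0, 1, 0],    # Oklawaha
    [3, 7, 1, 0, 1], [8, 6, 6, 3, 5], [2, 4, 1, 1, 4], [0, 1, 0, 0, 0],     # Trafford
    [13, 10, 0, 2, 2], [9, 0, 0, 1, 2], [3, 9, 1, 0, 1], [8, 1, 0, 0, 1],   # George
]

# Total frequency of each food type across all 16 profiles.
totals = {food: sum(row[j] for row in COUNTS) for j, food in enumerate(FOODS)}

# Cells with a zero count correspond to the excluded zero-frequency observations.
zero_cells = sum(count == 0 for row in COUNTS for count in row)

print(totals)      # {'fish': 94, 'invert': 61, 'reptile': 19, 'bird': 13, 'other': 32}
print(zero_cells)  # 24
```

The totals match the frequencies in the response profile, and the 24 zero cells match the number of observations SAS reports as excluded.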

Deviance and Pearson Goodness-of-Fit Statistics

Criterion   DF   Value     Value/DF   Pr > ChiSq
Deviance    40   50.2637   1.2566     0.
Pearson     40   52.5643   1.3141     0.

Number of unique profiles: 16

There are N = 16 profiles (unique combinations of lake, sex and size) in this dataset. The saturated model, which fits a separate multinomial distribution to each profile, has 16 × 4 = 64 free parameters. The current model has an intercept, three lake coefficients, one sex coefficient and one size coefficient for each of the four logit equations, for a total of 24 parameters. Therefore, the overall fit statistics have 64 − 24 = 40 degrees of freedom.
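The degrees-of-freedom arithmetic and the Value/DF column can be reproduced with a short calculation. The sketch below uses only numbers quoted in the output and text; the function name is hypothetical.

```python
def fit_degrees_of_freedom(n_profiles, n_categories, n_coef_per_logit):
    """Degrees of freedom for the overall fit statistics:
    saturated-model parameters minus current-model parameters."""
    n_logits = n_categories - 1
    saturated = n_profiles * n_logits
    current = n_coef_per_logit * n_logits
    return saturated, current, saturated - current


# Alligator example: N = 16 profiles, r = 5 food types, and per logit equation
# 1 intercept + 3 lake + 1 sex + 1 size coefficients.
saturated, current, df = fit_degrees_of_freedom(16, 5, 1 + 3 + 1 + 1)

# Value/DF as reported in the output. For the Pearson statistic this ratio
# has the same form as the scale estimate X^2 / ((N - p)(r - 1)).
deviance_ratio = 50.2637 / df
pearson_ratio = 52.5643 / df

print(saturated, current, df)                             # 64 24 40
print(round(deviance_ratio, 4), round(pearson_ratio, 4))  # 1.2566 1.3141
```

Both ratios are somewhat above 1, so the Pearson ratio of about 1.31 is the value one would use as the scale estimate if overdispersion were judged to be present.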

Output pertaining to the significance of covariates:

Testing Global Null Hypothesis: BETA=

Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   66.4974      20   <.
Score              59.4616      20   <.
Wald               51.2336      20   0.

Type III Analysis of Effects

Effect   DF   Wald Chi-Square   Pr > ChiSq
lake     12   36.2293           0.
size      4   15.8873           0.
sex       4    2.1850           0.

The first section (global null hypothesis) tests the fit of the current model against a null or intercept-only model. The null model has four parameters (one for each logit equation). Therefore the comparison has 24 − 4 = 20 degrees of freedom. This test is highly significant, indicating that at least one of the covariates has an effect on food choice.

The next section (Type III analysis of effects) shows the change in fit resulting from discarding any one of the covariates (lake, sex or size) while keeping the others in the model. For example, consider the test for lake. Discarding lake is equivalent to setting three coefficients to zero in each of the four logit

size   large    other     1   -0.2906   0.4599   0.3992   0.
size   large    reptile   1    0.5570   0.6466   0.7421   0.
sex    female   bird      1    0.6064   0.6888   0.7750   0.
sex    female   invert    1    0.4630   0.3955   1.3701   0.
sex    female   other     1    0.2526   0.4663   0.2933   0.
sex    female   reptile   1    0.6275   0.6852   0.8387   0.

Odds Ratio Estimates

Effect                     food      Point Estimate   95% Wald Confidence Limits
lake Oklawaha vs Hancock   bird       0.324           0.031    3.
lake Oklawaha vs Hancock   invert    14.786           3.983   54.
lake Oklawaha vs Hancock   other      0.477           0.111    2.
lake Oklawaha vs Hancock   reptile    4.058           0.829   19.
lake Trafford vs Hancock   bird       1.938           0.369   10.
lake Trafford vs Hancock   invert    18.846           4.899   72.
lake Trafford vs Hancock   other      2.206           0.697    6.
lake Trafford vs Hancock   reptile    6.900           1.369   34.
lake George vs Hancock     bird       0.563           0.118    2.
lake George vs Hancock     invert     5.933           1.749   20.
lake George vs Hancock     other      0.465           0.152    1.
lake George vs Hancock     reptile    0.323           0.031    3.
size large vs small        bird       2.076           0.578    7.
size large vs small        invert     0.263           0.117    0.
size large vs small        other      0.748           0.304    1.
size large vs small        reptile    1.745           0.492    6.
sex female vs male         bird       1.834           0.475    7.
sex female vs male         invert     1.589           0.732    3.
sex female vs male         other      1.287           0.516    3.
sex female vs male         reptile    1.873           0.489    7.

How do we interpret them? Recall that there are four logit equations to predict the log-odds of

• birds versus fish,
• invertebrates versus fish,
• other versus fish, and
• reptiles versus fish.

The intercepts give the estimated log-odds for the reference group lake=Hancock, size=small, sex=male. For example, the estimated log-odds of birds versus fish in this group is −2.4633; the estimated log-odds of invertebrates versus fish is −2.0744; and so on.

The lake effect is characterized by three dummy coefficients in each of the four logit equations. The estimated coefficient for the Lake Oklawaha dummy in the bird-versus-fish equation is −1.1256. This means that alligators in Lake Oklawaha are less likely to choose birds over fish than their colleagues in Lake Hancock are. In other words, relative to fish, birds appear to be a less common food choice in Lake Oklawaha than in Lake Hancock. The estimated odds ratio of exp(−1.1256) = 0.32 is the same for alligators of both sexes and both sizes, because this is a model with main effects but no interactions.
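The link between coefficients and the reported odds ratios is a plain exponentiation, which can be verified with a few of the estimates quoted in the text and in the parameter-estimate rows. The dictionary layout below is only for illustration.

```python
import math

# Coefficients quoted in the text and in the parameter-estimate rows of the output.
coefficients = {
    ("lake Oklawaha vs Hancock", "bird"): -1.1256,
    ("size large vs small", "other"): -0.2906,
    ("size large vs small", "reptile"): 0.5570,
    ("sex female vs male", "bird"): 0.6064,
}

# An odds ratio is the exponential of the corresponding log-odds coefficient.
odds_ratios = {key: math.exp(beta) for key, beta in coefficients.items()}

# Intercepts are log-odds for the reference group (Hancock, small, male),
# so exponentiating gives the odds of each food type versus fish in that group.
odds_bird_vs_fish = math.exp(-2.4633)
odds_invert_vs_fish = math.exp(-2.0744)

for (effect, food), ratio in odds_ratios.items():
    print(f"{effect:26s} {food:8s} {ratio:.3f}")
```

The printed values agree with the point estimates in the odds ratio table (0.324, 0.748, 1.745 and 1.834), and both reference-group odds are well below 1, consistent with fish being the dominant food choice.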
