Discrete Response Models: Multinomial, Conditional and Nested Logit, Study notes of Introduction to Econometrics

University of Southern California (USC), Introduction to Econometrics

Lecture notes on discrete response models with more than two unordered outcomes, focusing on the multinomial, conditional and nested logit models. These models describe the distribution of a choice taking non-negative integer values, such as travel mode or employment status, in terms of covariates. The notes develop a model for the conditional probability of choice j given the covariates and link it to utility maximization.

Typology: Study notes

Pre 2010

Uploaded on 11/08/2009

koofers-user-8li 🇺🇸


Econ 513, USC, Fall 2005

Lecture 15. Discrete Response Models: Multinomial, Conditional and Nested Logit Models

Here we focus again on models for discrete choice with more than two outcomes. We assume that the outcome of interest, the choice y, takes on non-negative integer values between zero and J; y ∈ {0, 1, ..., J}. Unlike the ordered case there is no particular meaning to the ordering. Examples are travel modes (bus/train/car), employment status (employed/unemployed/out of the labor force), marital status (single/married/divorced/widowed) and many others.

We wish to model the distribution of y in terms of covariates. In some cases we will distinguish between covariates xi that vary by units (individuals or firms), and covariates xij that vary by choice (and possibly individual). Examples of the first type include individual characteristics such as age or education. An example of the second type is the cost associated with the choice, for example the cost of commuting by bus/train/car. This distinction only arises from the economics (or general scientific) substance of the problem. McFadden developed the interpretation of these models through utility maximizing choice behavior. In that case we may be willing to put restrictions on the way covariates affect choices: costs of a particular choice affect the utility of that choice, but not the utilities of other choices.

The strategy is to develop a model for the conditional probability of choice j given the covariates. Suppose the model is Pr(y = j|x) = Pj(x; θ). Then the log likelihood function is

$$L(\theta) = \sum_{i=1}^{N} \sum_{j=0}^{J} 1\{y_i = j\} \cdot \ln P_j(x_i; \theta).$$
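The log likelihood above can be sketched in a few lines of Python. This is only an illustration: the function names, the toy model `constant_probs` and the data are assumptions made here, not part of the notes. The point it shows is that the indicator 1{yi = j} picks out a single term of the inner sum, so only the probability of the observed choice enters.

```python
import math

def log_likelihood(y, x, prob, theta):
    """Sum over i of ln P_{y_i}(x_i; theta).

    prob(x_i, theta) must return the list [P_0, ..., P_J] for one unit.
    The indicator 1{y_i = j} selects exactly one term per unit.
    """
    return sum(math.log(prob(x_i, theta)[y_i]) for y_i, x_i in zip(y, x))

# Hypothetical toy model: probabilities do not depend on x, and theta is
# simply the probability vector itself.
def constant_probs(x_i, theta):
    return theta

y = [0, 2, 1, 2]          # observed choices for four units
x = [None] * 4            # covariates are unused in this toy model
ll = log_likelihood(y, x, constant_probs, [0.2, 0.3, 0.5])
print(ll)
```

Any model for Pj(x; θ), such as the logit models in the following sections, can be passed in as `prob`.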

I. Multinomial Logit

Suppose we only have individual specific covariates. Then we can model the response probability as

$$\Pr(y = j \mid x) = \frac{\exp(x'\beta_j)}{1 + \sum_{l=1}^{J} \exp(x'\beta_l)},$$

for choices j = 1, ..., J and

$$\Pr(y = 0 \mid x) = \frac{1}{1 + \sum_{l=1}^{J} \exp(x'\beta_l)},$$

for the first choice. This is a direct extension of the binary response logit model. It leads to a very well-behaved likelihood function and is easy to estimate. More interestingly it can be viewed as a special case of the following conditional logit.
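A minimal sketch of these response probabilities, assuming covariates and coefficients are stored as plain Python lists (the function name `mnl_probs` and the numbers are illustrative assumptions). Setting the score of choice 0 to zero is what produces the "1" in the denominator.

```python
import math

def mnl_probs(x, betas):
    """Multinomial logit probabilities for choices 0..J.

    x     : list of K covariates for one individual
    betas : list of J coefficient vectors beta_1..beta_J; the coefficient
            vector of choice 0 is normalized to zero.
    """
    scores = [0.0] + [sum(xk * bk for xk, bk in zip(x, b)) for b in betas]
    m = max(scores)                          # subtract the max for stability
    expd = [math.exp(s - m) for s in scores]
    total = sum(expd)
    return [e / total for e in expd]

# With J = 1 this collapses to the binary logit.
x = [1.0, 0.5]
p = mnl_probs(x, [[0.2, -1.0]])
print(p)
```

Subtracting the largest score before exponentiating leaves the probabilities unchanged and only guards against overflow.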


II. Conditional Logit

Suppose all covariates vary by choice (and possibly also by individual, but that is not essential here). Then McFadden proposed the conditional logit model:

$$\Pr(y_i = j \mid x_{i0}, \dots, x_{iJ}) = \frac{\exp(x_{ij}'\beta)}{\sum_{l=0}^{J} \exp(x_{il}'\beta)}, \qquad \text{for } j = 0, \dots, J.$$

The multinomial logit model can be viewed as a special case of this. Suppose we have a vector of individual characteristics xi with dimension K. Then define for each choice j the vector of covariates xij as the vector of dimension K × (J + 1), with all zeros other than the elements K × j + 1 to K × (j + 1) which are equal to xi:

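$$x_{i0} = \begin{pmatrix} x_i \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad x_{ij} = \begin{pmatrix} 0 \\ \vdots \\ x_i \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad x_{iJ} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ x_i \end{pmatrix}.$$

With $\beta = (\beta_0', \beta_1', \dots, \beta_J')'$ this gives $x_{ij}'\beta = x_i'\beta_j$, and the normalization $\beta_0 = 0$ reproduces the multinomial logit probabilities of Section I.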
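The stacking construction can be checked numerically. The sketch below, with made-up numbers and helper names (`dot`, `softmax`, `stack`) that are assumptions of this example, builds the vectors xij from xi, stacks the βj into one long β, and confirms that the conditional logit probabilities coincide with the multinomial logit ones.

```python
import math

def dot(a, b):
    return sum(ak * bk for ak, bk in zip(a, b))

def softmax(scores):
    m = max(scores)
    expd = [math.exp(s - m) for s in scores]
    total = sum(expd)
    return [e / total for e in expd]

def stack(x_i, j, J):
    """Return x_ij: a K*(J+1) vector of zeros whose j-th block equals x_i."""
    K = len(x_i)
    x_ij = [0.0] * (K * (J + 1))
    x_ij[K * j:K * (j + 1)] = x_i
    return x_ij

x_i = [1.0, 0.5]
betas = [[0.0, 0.0], [0.2, -1.0], [-0.4, 0.3]]    # beta_0 normalized to zero
J = len(betas) - 1
beta_stacked = [b for beta_j in betas for b in beta_j]

# Conditional logit with stacked covariates versus multinomial logit.
cl = softmax([dot(stack(x_i, j, J), beta_stacked) for j in range(J + 1)])
mnl = softmax([dot(x_i, beta_j) for beta_j in betas])
print(cl, mnl)
```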

III. Link with Utility Maximization

McFadden motivates this model by extending the latent index model to multiple choices. Suppose that the utility for individual i associated with choice j is

$$U_{ij} = x_{ij}'\beta + \varepsilon_{ij}. \tag{1}$$

Furthermore, let individual i choose option j (that is yi = j) if that provides the highest level of utility, or

$$y_i = j \quad \text{if} \quad U_{ij} \ge U_{il} \ \text{ for all } l = 0, \dots, J,$$

(ties have probability zero because of the continuity of the distribution for ε).

Now suppose that the εij are independent across choices and individuals and have type I extreme value distributions. Then the choice yi follows the conditional logit model. The type I extreme value distribution has cumulative distribution function

$$F(\varepsilon) = \exp(-\exp(-\varepsilon)),$$

and probability density function

$$f(\varepsilon) = \exp(-\varepsilon) \cdot \exp(-\exp(-\varepsilon)).$$
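The claim that utility maximization with type I extreme value errors yields the conditional logit can be checked by simulation. The following is a rough sketch under assumed values of x′ij β (the list `v`); the function names, the sample size and the seed are choices made for this example only. Errors are drawn by inverting the distribution function F given above.

```python
import math
import random

def gumbel(rng):
    """Type I extreme value draw by inverting F(e) = exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:              # avoid log(0); essentially never triggered
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_shares(v, n, seed=0):
    """Share of n simulated individuals choosing each alternative when
    U_j = v_j + eps_j and the highest utility is chosen."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n):
        utils = [v_j + gumbel(rng) for v_j in v]
        counts[utils.index(max(utils))] += 1
    return [c / n for c in counts]

def clogit_probs(v):
    m = max(v)
    expd = [math.exp(v_j - m) for v_j in v]
    total = sum(expd)
    return [e / total for e in expd]

v = [0.0, 0.5, -1.0]             # hypothetical values of x_ij' beta
shares = simulate_shares(v, 100_000)
probs = clogit_probs(v)
print(shares, probs)
```

The simulated shares should agree with the closed-form probabilities up to sampling noise, which shrinks as the number of draws grows.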

$$\cdots = \exp(c) \cdot \int_{-\infty}^{\infty} \exp(-\eta) \cdot \exp(-\exp(-\eta))\, d\eta = \exp(c),$$

by change of variables, which we apply with

$$c = -\ln\bigl(1 + \exp(x_{i1}'\beta - x_{i0}'\beta) + \dots + \exp(x_{iJ}'\beta - x_{i0}'\beta)\bigr).$$
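The steps leading to this result can be sketched as follows for choice 0. This is the standard argument under the assumptions stated above (independent type I extreme value errors); the intermediate layout is a reconstruction for illustration and is not taken from the notes.

```latex
\begin{align*}
\Pr(y_i = 0)
 &= \Pr\bigl(\varepsilon_{il} \le \varepsilon_{i0} + x_{i0}'\beta - x_{il}'\beta
     \ \text{ for all } l \ge 1\bigr) \\
 &= \int_{-\infty}^{\infty} f(\varepsilon)
     \prod_{l=1}^{J} F\bigl(\varepsilon + x_{i0}'\beta - x_{il}'\beta\bigr)\, d\varepsilon \\
 &= \int_{-\infty}^{\infty} \exp(-\varepsilon)\,
     \exp\Bigl(-\exp(-\varepsilon)\Bigl[1 + \sum_{l=1}^{J}
     \exp(x_{il}'\beta - x_{i0}'\beta)\Bigr]\Bigr)\, d\varepsilon \\
 &= \int_{-\infty}^{\infty} \exp(-\varepsilon)\,
     \exp\bigl(-\exp(-(\varepsilon + c))\bigr)\, d\varepsilon
     && \text{(the bracket equals } \exp(-c)\text{)} \\
 &= \exp(c) \int_{-\infty}^{\infty} \exp(-\eta)\,\exp(-\exp(-\eta))\, d\eta
     && (\eta = \varepsilon + c) \\
 &= \exp(c)
  = \frac{\exp(x_{i0}'\beta)}{\sum_{l=0}^{J} \exp(x_{il}'\beta)} .
\end{align*}
```

The last integral equals one because its integrand is the type I extreme value density, and the same argument applies to any other choice j after relabeling.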

IV. Independence of Irrelevant Alternatives

The main problem with the conditional logit is the property of independence of irrelevant alternatives (IIA). Consider the conditional probability of choosing j given that you choose either j or l, Pr(y = j|y ∈ {j, l}):

$$\Pr(y_i = j \mid y_i \in \{j, l\}) = \frac{\exp(x_{ij}'\beta)}{\exp(x_{ij}'\beta) + \exp(x_{il}'\beta)}.$$

This probability does not depend on the characteristics of alternatives other than j and l. This is sometimes unattractive. McFadden’s famous blue bus/red bus example illustrates this. Suppose there are three choices: commuting by car, by red bus or by blue bus. A sensible model would be to think that people have a preference over cars versus buses, but are indifferent between red versus blue buses. That would imply that the conditional probability of commuting by car given that one commutes by car or red bus would probably differ from the same conditional probability if there is no blue bus. Presumably taking away the blue bus choice would lead all the current blue bus users to shift to the red bus, and not to cars.
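The blue bus/red bus example can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that car and bus have the same systematic utility; `clogit_probs` is a helper defined here, not something from the notes.

```python
import math

def clogit_probs(v):
    """Conditional logit probabilities from systematic utilities v_j."""
    m = max(v)
    expd = [math.exp(v_j - m) for v_j in v]
    total = sum(expd)
    return [e / total for e in expd]

v_car, v_bus = 0.0, 0.0          # hypothetical: car and bus equally attractive

without_blue = clogit_probs([v_car, v_bus])           # car, red bus
with_blue = clogit_probs([v_car, v_bus, v_bus])       # car, red bus, blue bus

# IIA: the odds of car versus red bus do not change when the blue bus
# is added to the choice set.
odds_without = without_blue[0] / without_blue[1]
odds_with = with_blue[0] / with_blue[1]
print(without_blue, with_blue, odds_without, odds_with)
```

Under the conditional logit the car share falls from one half to one third when the blue bus is introduced, whereas the intuitive answer is that it stays at one half and the two buses split the remaining riders.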

The solution is to allow in some fashion for correlation between the errors in the latent utility representation (1). With a choice set that contains multiple versions of essentially the same option, we should allow the latent utilities for these choices to be identical, and so the error terms would have to be perfectly correlated. This can be done in a number of ways. We analyze one of them, the nested logit, in the following discussion.

V. Nested Logit

One way to induce correlation between the choices is through nesting them. Suppose the set of choices {0, 1, ..., J} can be partitioned into S sets B1, ..., BS, so that

$$\{0, 1, \dots, J\} = \bigcup_{s=1}^{S} B_s.$$

Let Zs be set specific variables. (It may be that the set of set specific variables is just a vector of indicators, with Zs an S-vector of zeros with a one for the sth element.) Now let the conditional probability of choice j given that yi ∈ Bs be equal to

$$\Pr(y_i = j \mid x_i, y_i \in B_s) = \frac{\exp(\sigma_s^{-1} x_{ij}'\beta)}{\sum_{l \in B_s} \exp(\sigma_s^{-1} x_{il}'\beta)}.$$

In addition suppose the probability of set Bs is

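$$\Pr(y_i \in B_s \mid x_i) = \frac{\exp(Z_s'\alpha) \Bigl(\sum_{l \in B_s} \exp(\sigma_s^{-1} x_{il}'\beta)\Bigr)^{\sigma_s}}{\sum_{t=1}^{S} \exp(Z_t'\alpha) \Bigl(\sum_{l \in B_t} \exp(\sigma_t^{-1} x_{il}'\beta)\Bigr)^{\sigma_t}}.$$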

If we fix σs = 1 for all s, then

$$\Pr(y_i = j \mid x_i) = \frac{\exp(x_{ij}'\beta + Z_s'\alpha)}{\sum_{t=1}^{S} \sum_{l \in B_t} \exp(x_{il}'\beta + Z_t'\alpha)}$$

and we are back in the conditional logit model. The extra coefficient σs implicitly allows for correlation of the errors in (1). The joint distribution function of the εij is

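$$F(\varepsilon_{i0}, \dots, \varepsilon_{iJ}) = \exp\Biggl(-\sum_{s=1}^{S} \exp(Z_s'\alpha) \Bigl(\sum_{j \in B_s} \exp\bigl(-\sigma_s^{-1} \varepsilon_{ij}\bigr)\Bigr)^{\sigma_s}\Biggr).$$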

Within the sets the ’correlation coefficient’ for the εij is equal to 1 − σ. Between the sets the εij are independent.
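A small sketch of the nested logit choice probabilities, obtained by multiplying the within-nest probability by the nest probability. The function name, the argument layout and the car/red bus/blue bus numbers are assumptions made for this example. It illustrates two points from the text: with σs = 1 for all s the probabilities are those of the conditional logit, and with a small σs for the bus nest the two buses behave almost like a single option.

```python
import math

def nested_logit_probs(v, nests, z_alpha, sigma):
    """Choice probabilities Pr(y = j) for a nested logit.

    v       : list of x_ij' beta, one entry per alternative j
    nests   : list of lists; nests[s] holds the alternatives in B_s
    z_alpha : list of Z_s' alpha, one entry per nest
    sigma   : list of sigma_s, one entry per nest
    """
    # Sum over l in B_s of exp(x_il' beta / sigma_s)
    sums = [sum(math.exp(v[l] / sigma[s]) for l in B)
            for s, B in enumerate(nests)]
    weights = [math.exp(z_alpha[s]) * sums[s] ** sigma[s]
               for s in range(len(nests))]
    total = sum(weights)
    probs = [0.0] * len(v)
    for s, B in enumerate(nests):
        for j in B:
            within = math.exp(v[j] / sigma[s]) / sums[s]   # Pr(y = j | y in B_s)
            probs[j] = within * weights[s] / total         # times Pr(y in B_s)
    return probs

# Hypothetical example: alternative 0 = car, 1 = red bus, 2 = blue bus.
v = [0.0, 0.0, 0.0]
nests = [[0], [1, 2]]
z_alpha = [0.0, 0.0]

p_cl = nested_logit_probs(v, nests, z_alpha, [1.0, 1.0])       # sigma_s = 1
p_nested = nested_logit_probs(v, nests, z_alpha, [1.0, 0.01])  # near-identical buses
print(p_cl, p_nested)
```

In the second call the car share is close to one half and the two buses share the rest, which is the behavior the blue bus/red bus example calls for.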

How do you estimate these models? One approach is to construct the log likelihood and maximize it directly. That is complicated, especially since the log likelihood function is not concave, but it is not impossible. An easier alternative is to use the nesting structure directly. Within a nest we have a conditional logit model with coefficients β/σs. Hence we can estimate β/σs directly, exploiting the concavity of the conditional logit log likelihood. Denote these estimates of β/σs by $\widehat{\beta/\sigma_s}$. Then the probability of a particular set Bs can be used to estimate σs and α through

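$$\Pr(y_i \in B_s \mid x_i) = \frac{\exp(Z_s'\alpha) \Bigl(\sum_{l \in B_s} \exp\bigl(x_{il}'\,\widehat{\beta/\sigma_s}\bigr)\Bigr)^{\sigma_s}}{\sum_{t=1}^{S} \exp(Z_t'\alpha) \Bigl(\sum_{l \in B_t} \exp\bigl(x_{il}'\,\widehat{\beta/\sigma_t}\bigr)\Bigr)^{\sigma_t}} = \frac{\exp(Z_s'\alpha + \sigma_s \hat{W}_s)}{\sum_{t=1}^{S} \exp(Z_t'\alpha + \sigma_t \hat{W}_t)},$$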

where

$$\hat{W}_s = \ln\Bigl(\sum_{l \in B_s} \exp\bigl(x_{il}'\,\widehat{\beta/\sigma_s}\bigr)\Bigr),$$

known as the “inclusive values”. Hence we are back to a conditional logit model, now over the sets Bs, that is easy to estimate. These two-step estimators are not efficient. The variance/covariance matrix is provided in McFadden (1981).
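The second step can be illustrated with a short sketch that computes the inclusive values and plugs them into the nest-level logit. Everything numeric here is hypothetical: `gamma` stands in for first-step estimates of β/σs (it is not estimated from data), and the function names are assumptions of this example. The estimation itself, that is, maximizing the two likelihoods, is not shown.

```python
import math

def inclusive_value(x_nest, gamma_s):
    """W_s = ln sum_{l in B_s} exp(x_il' gamma_s), where gamma_s plays the
    role of the first-step estimate of beta / sigma_s."""
    return math.log(sum(
        math.exp(sum(a * b for a, b in zip(x_l, gamma_s))) for x_l in x_nest))

def nest_probs(z_alpha, sigma, W):
    """Nest-level logit: Pr(y in B_s) proportional to
    exp(Z_s' alpha + sigma_s * W_s)."""
    scores = [za + s * w for za, s, w in zip(z_alpha, sigma, W)]
    m = max(scores)
    expd = [math.exp(sc - m) for sc in scores]
    total = sum(expd)
    return [e / total for e in expd]

# Hypothetical inputs: one covariate, nest 0 with one alternative and
# nest 1 with two alternatives.
x = [[[1.0]], [[0.5], [0.7]]]
gamma = [[-1.0], [-2.0]]         # stand-ins for beta/sigma_s with beta = -1
sigma = [1.0, 0.5]
z_alpha = [0.0, 0.3]

W = [inclusive_value(x[s], gamma[s]) for s in range(2)]
p = nest_probs(z_alpha, sigma, W)
print(W, p)
```

Because the inclusive values enter like ordinary covariates with coefficients σs, the second step has the same form as the conditional logit of Section II.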
