Multinomial Logit Regression Analysis on 1991 General Social Survey Data, Study notes of Sociology

A sociology exercise on multinomial logit regression analysis using the 1991 general social survey data. The exercise involves testing hypotheses on the relationship between various independent variables, including years of schooling, age, sex, rural upbringing, and region dummies, and a categorical dependent variable representing occupations. The analysis includes fitting the null model, testing individual coefficients, and testing the effect of subsets of coefficients.

Typology: Study notes

2011/2012

Uploaded on 11/20/2012

shubnam
Sociology
multinomial logit
testing hypotheses
The data for this exercise again comes from the 1991 General Social Survey. The
categorical dependent variable occ is coded as follows:
occ=0 if a worker's occupation is laborer, operative or craft;
occ=1 if occupation is clerical, sales, or service;
occ=2 if occupation is managerial, technical, or professional.
The independent variables are: educ is years of schooling; age is age in years; sexx
is coded 1 male, 0 female; rural is coded 1 if grew up in rural area, 0 otherwise; mid
and wst are dummy variables for region, with other parts of the country omitted.
Let’s fit what we’ll treat for most of this exercise as the null model.
3. mlogit occ educ age sexx rural mid wst,base(0)
Multinomial regression Number of obs = 633
LR chi2(12) = 353.13
Prob > chi2 = 0.0000
Log likelihood = -511.92941 Pseudo R2 = 0.2564
------------------------------------------------------------------------------
occ | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
educ | .2490034 .056606 4.399 0.000 .1380577 .3599492
age | .0156041 .0099216 1.573 0.116 -.0038418 .0350501
sexx | -2.028054 .2392113 -8.478 0.000 -2.4969 -1.559209
rural | -.7635868 .2619814 -2.915 0.004 -1.277061 -.2501126
mid | .4081406 .2761675 1.478 0.139 -.1331378 .9494189
wst | .4151271 .3078639 1.348 0.178 -.188275 1.018529
_cons | -2.253103 .853224 -2.641 0.008 -3.925391 -.5808147
---------+--------------------------------------------------------------------
2 |
educ | .7840261 .0684775 11.449 0.000 .6498126 .9182395
age | .01764 .011552 1.527 0.127 -.0050015 .0402816
sexx | -1.680553 .2778157 -6.049 0.000 -2.225062 -1.136044
rural | -.128399 .2965349 -0.433 0.665 -.7095968 .4527988
mid | .144635 .3137103 0.461 0.645 -.4702258 .7594958
wst | .3873871 .3445527 1.124 0.261 -.2879237 1.062698
_cons | -10.27188 1.063177 -9.661 0.000 -12.35567 -8.188092
------------------------------------------------------------------------------
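To see how the two coefficient vectors in the output above turn into outcome probabilities, here is a minimal sketch in Python (used instead of Stata so that it runs without the GSS data). It applies the standard multinomial logit formula with occ=0 as the base category, whose linear predictor is fixed at zero. The example person (educ=12, age=40, sexx=1, rural=0, mid=0, wst=0) is hypothetical and chosen only for illustration, as are the names `coef` and `predicted_probs`.

```python
import math

# Coefficients copied from the mlogit output above (base category occ=0).
# Order within each list: educ, age, sexx, rural, mid, wst, _cons
coef = {
    1: [0.2490034, 0.0156041, -2.028054, -0.7635868, 0.4081406, 0.4151271, -2.253103],
    2: [0.7840261, 0.01764, -1.680553, -0.128399, 0.144635, 0.3873871, -10.27188],
}

def predicted_probs(educ, age, sexx, rural, mid, wst):
    """Return {outcome: probability} for one covariate profile."""
    x = [educ, age, sexx, rural, mid, wst, 1.0]  # trailing 1.0 multiplies _cons
    eta = {0: 0.0}  # the base category has all coefficients equal to zero
    for outcome, b in coef.items():
        eta[outcome] = sum(bi * xi for bi, xi in zip(b, x))
    denom = sum(math.exp(v) for v in eta.values())
    return {outcome: math.exp(v) / denom for outcome, v in eta.items()}

# Hypothetical profile, for illustration only.
probs = predicted_probs(educ=12, age=40, sexx=1, rural=0, mid=0, wst=0)
for outcome in sorted(probs):
    print(f"P(occ={outcome}) = {probs[outcome]:.3f}")
```

Because each equation is a contrast with occ=0, a coefficient describes the log odds of that outcome relative to the base category, not the probability of the outcome itself.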
Tests of individual coefficients
You can use the z-scores to test for individual coefficients in separate equations.
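As a check on how the z column is built, the sketch below (Python, standard library only) divides a coefficient by its standard error and converts the result to a two-sided p-value under the standard normal. The two rows used are educ and age from equation 1 of the output above; the helper name `wald_z` is an assumption made for this illustration.

```python
import math

def wald_z(coef, se):
    """Return (z, two-sided p-value) for H0: coefficient = 0."""
    z = coef / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Values copied from equation 1 of the mlogit output above.
z_educ, p_educ = wald_z(0.2490034, 0.056606)
z_age, p_age = wald_z(0.0156041, 0.0099216)

print(f"educ: z = {z_educ:.3f}, p = {p_educ:.3f}")
print(f"age:  z = {z_age:.3f}, p = {p_age:.3f}")
```

Each such test concerns one coefficient in one equation only, which is why a joint test across equations needs a different statistic.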
To do a test of ALL the coefficients of a given variable, say educ, in all the
equations, you need to impose the constraint of the null hypothesis,
and then estimate the restricted model:

4. mlogit occ age sexx rural mid wst,base(0)

Multinomial regression Number of obs = 633
LR chi2(10) = 112.
Prob > chi2 = 0.
Log likelihood = -632.35198 Pseudo R2 = 0.
------------------------------------------------------------------------------
occ | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
age | .0139588 .0095435 1.463 0.144 -.0047461.
sexx | -1.941273 .2293351 -8.465 0.000 -2.390762 -1.
rural | -.9540048 .2512577 -3.797 0.000 -1.446461 -.
mid | .5754085 .2641193 2.179 0.029 .0577443 1.
wst | .476824 .2938117 1.623 0.105 -.0990362 1.
_cons | .8674283 .418339 2.074 0.038 .0474989 1.
---------+--------------------------------------------------------------------
2 |
age | .0103217 .0094321 1.094 0.274 -.0081648.
sexx | -1.213865 .2267656 -5.353 0.000 -1.658317 -.
rural | -.7604203 .2414063 -3.150 0.002 -1.233568 -.
mid | .4151938 .2612787 1.589 0.112 -.096903.
wst | .4373206 .2893423 1.511 0.131 -.1297799 1.
_cons | .6000712 .4171453 1.439 0.150 -.2175186 1.
------------------------------------------------------------------------------


(Outcome occ==0 is the comparison group)

The likelihood ratio test statistic is then 2(632.35198-511.92941) = 240.85.

Under the null hypothesis this is distributed as a chi-square with 2 degrees of freedom, one for each educ coefficient set to zero. Since the mean of a chi-square with df=2 is 2, 240.85 is far into any reasonable critical region.
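The statistic can be recomputed from the two log likelihoods printed in the outputs of commands 3 and 4. The sketch below does this in Python with the standard library only. It relies on the fact that a chi-square with 2 degrees of freedom has the closed-form upper tail exp(-x/2), so no statistics package is needed; the variable names are illustrative.

```python
import math

# Log likelihoods copied from the two mlogit outputs above.
ll_full = -511.92941        # command 3: model including educ
ll_restricted = -632.35198  # command 4: educ dropped from both equations

# Likelihood ratio statistic: twice the gain in log likelihood.
lr_stat = 2.0 * (ll_full - ll_restricted)

# Two restrictions are imposed (educ = 0 in each equation), so df = 2.
# For df = 2 the chi-square upper-tail probability is exactly exp(-x/2).
p_value = math.exp(-lr_stat / 2.0)

# 5% critical value for df = 2, from inverting the same formula.
crit_05 = -2.0 * math.log(0.05)

print(f"LR statistic = {lr_stat:.2f}")
print(f"5% critical value (df=2) = {crit_05:.3f}")
print(f"p-value = {p_value:.3g}")
```

The statistic exceeds the 5% critical value by a wide margin, which is the sense in which it lies far inside the critical region.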

Another way to do this same test (i.e., that all the educ coefficients are zero) is with a Wald statistic produced by Stata’s test command. After fitting the full alternative model with command 3 above, issue the following command:

5. test educ

( 1) [1]educ = 0.
( 2) [2]educ = 0.

chi2( 2) = 146.
Prob > chi2 = 0.

Multinomial regression Number of obs = 633
LR chi2(6) = 223.
Prob > chi2 = 0.
Log likelihood = -576.59741 Pseudo R2 = 0.
------------------------------------------------------------------------------
occ | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
educ | (dropped)
age | (dropped)
sexx | (dropped)
rural | (dropped)
mid | (dropped)
wst | (dropped)
_cons | .3659343 .0992281 3.688 0.000 .1714508.
---------+--------------------------------------------------------------------
2 |
educ | .6130706 .0520059 11.788 0.000 .5111409.
age | .0076156 .0090492 0.842 0.400 -.0101205.
sexx | -.2976082 .210549 -1.413 0.158 -.7102768.
rural | .331173 .2502102 1.324 0.186 -.1592299.
mid | -.1160987 .2463791 -0.471 0.637 -.5989928.
wst | .1039835 .2646088 0.393 0.694 -.4146401.
_cons | -8.609597 .8531524 -10.092 0.000 -10.28174 -6.
------------------------------------------------------------------------------


(Outcome occ==0 is the comparison group)

The LR statistic is 2(576.59741-511.92941) = 129.336.

Here’s another way to do this test for all possible pairs of equations. Fit the full model, and then issue this command.

. mlogtest , lrcomb

**** LR tests for combining outcome categories

Ho: All coefficients except intercepts associated with given pair of outcomes are 0 (i.e., categories can be collapsed).

Categories tested | chi2 df P>chi2
------------------+------------------------
1- 2 | 153.790 6 0.
1- 0 | 129.336 6 0.
2- 0 | 259.746 6 0.
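To connect the mlogtest table to the hand calculation, the sketch below (Python, standard library only) recomputes the 1-0 statistic from the two log likelihoods and evaluates the upper-tail probability of each chi2 value at 6 degrees of freedom. It uses the closed-form chi-square tail that holds for even degrees of freedom; the function name `chi2_sf_even_df` is an assumption made for this illustration, not a Stata or SPost routine.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square with an even number of df.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

# 1-0 test by hand: full model vs. model with equation 1 slopes set to zero.
ll_full = -511.92941
ll_constrained = -576.59741
lr_1_0 = 2.0 * (ll_full - ll_constrained)
print(f"LR statistic for 1-0 = {lr_1_0:.3f}")

# chi2 values copied from the mlogtest output above; each test has 6 df
# because six slope coefficients are constrained.
tests = {"1-2": 153.790, "1-0": 129.336, "2-0": 259.746}
p_values = {pair: chi2_sf_even_df(stat, 6) for pair, stat in tests.items()}
for pair, p in p_values.items():
    print(f"{pair}: p = {p:.3g}")
```

All three p-values are far below any conventional level, so none of the pairs of occupational categories can be collapsed.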