

This mock exam covers regression analysis in sociology, using the probability that a student drops out of high school as one of its examples. Topics include the unbiasedness and efficiency of least-squares estimators of slope parameters, the suitability of linear models for binary dependent variables, and the interpretation of coefficients in logistic regression.
Typology: Exercises


Sociology 362 Mock Exam 4
$Y_i = \alpha + \beta_x X_i + \beta_z Z_i + u_i \qquad (1)$
where α is the intercept, βx and βz are slope parameters, u is a disturbance with E(u) = 0, X is the years of schooling of the student's father, and Z is a dummy variable coded 1 if the student worked while in school and 0 otherwise.
a. Are the least-squares estimators of the slope parameters βx and βz unbiased? Explain.
b. Are the least-squares estimators of βx and βz efficient? Explain.
c. What makes the linear model an unsuitable response function for describing the relation between a binary 0/1 dependent variable and one or more quantitative regressors?
$\hat{\pi}_i = .15 - .012 X_i + .16 Z_i \qquad (2)$
a. Interpret the coefficients of X and Z.
b. What is the expected probability that a student who works and whose father has 16 years of schooling will drop out of high school?
c. What is the expected variance of Yi (=1 if drop out; 0 otherwise) among students who work and whose father has 10 years of schooling?
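The arithmetic behind parts b and c can be sketched in a few lines. This is a minimal Python sketch, assuming equation (2) is read as fitted probability = .15 − .012X + .16Z and that Y is a 0/1 (Bernoulli) variable, so its variance at a given fitted probability p is p(1 − p). The function names are illustrative and not part of the exam.

```python
def predicted_probability(x_years, z_works):
    """Fitted dropout probability from equation (2).

    x_years: father's years of schooling (X)
    z_works: 1 if the student works while in school, 0 otherwise (Z)
    """
    return 0.15 - 0.012 * x_years + 0.16 * z_works


def bernoulli_variance(p):
    """Variance of a 0/1 variable whose probability of a 1 is p."""
    return p * (1.0 - p)


# Part b: working student, father has 16 years of schooling
p_b = predicted_probability(16, 1)

# Part c: working student, father has 10 years of schooling
p_c = predicted_probability(10, 1)
var_c = bernoulli_variance(p_c)

print(round(p_b, 3), round(p_c, 3), round(var_c, 4))  # 0.118 0.19 0.1539
```

Because the variance depends on the fitted probability, it changes with X and Z, which is one way to see that the disturbances of the linear probability model cannot have constant variance.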
Aj    nj    rj
10    80     4
14   150     8
28   140    12
46    30     5
where nj is the size of the jth patient group and rj is the number diagnosed as having diabetes.
a. Suppose we wished to fit by least-squares the linear probability model to these data. Give the value of the dependent variable for the oldest age group.
b. Suppose we wished to fit by least-squares the logit model to these data. For the 28-year-old group, give the value of the dependent variable.
a. According to the model, what is the difference in the logit for diabetes of someone age 30 compared to someone age 38?
b. For someone in the 46-year-old age group, compare the observed odds of diabetes (see Table for problem 4) to the odds of diabetes predicted by the model.
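One way to set up the two dependent variables is sketched below. This is a minimal Python sketch under two assumptions: that Aj is the age of the jth group, and that fitting the logit model by least squares means regressing the observed (empirical) logit, log(rj / (nj − rj)), on age, while the linear probability model uses the observed proportion rj / nj. The variable and function names are illustrative.

```python
import math

# Grouped data from the table: (age A_j, group size n_j, number diagnosed r_j)
groups = [(10, 80, 4), (14, 150, 8), (28, 140, 12), (46, 30, 5)]


def observed_proportion(n, r):
    """Dependent variable for the linear probability model: r / n."""
    return r / n


def empirical_logit(n, r):
    """Dependent variable for the grouped logit model: log of the observed odds."""
    return math.log(r / (n - r))


lpm_y = {age: observed_proportion(n, r) for age, n, r in groups}
logit_y = {age: empirical_logit(n, r) for age, n, r in groups}

print(round(lpm_y[46], 4))    # oldest group: 0.1667
print(round(logit_y[28], 3))  # 28-year-old group: -2.367
```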
c. What is the estimated effect of a five-year increase in age on the odds of diabetes?
d. For someone in the 28-year-old age group, compare the observed proportion with diabetes (see data above) to the probability of diabetes as predicted by the model.
e. What does the model give as the point estimate of the odds ratio, and what does it mean?
f. Evaluate the overall “average” effect of a one-year increase in age on the probability of diabetes.
g. Construct the 95% confidence interval for the effect of a one-year increase in age on the log-odds of diabetes.
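The fitted model for this problem is not reproduced here, so the quantities asked for can only be sketched in terms of a generic age coefficient. This is a minimal Python sketch, assuming a logit model of the form logit = a + b·Age with estimated slope `b` and standard error `se`. The numeric values of `b_age` and `se_age` are hypothetical placeholders, not the exam's estimates, and reading part f as b·p(1 − p) evaluated at the overall sample proportion is one common interpretation rather than the only one.

```python
import math


def logit_difference(b, age_a, age_b):
    """Difference in log-odds for someone of age_a compared to someone of age_b."""
    return b * (age_a - age_b)


def odds_multiplier(b, years):
    """Factor by which the odds are multiplied when age rises by `years`."""
    return math.exp(b * years)


def wald_ci(b, se, z=1.96):
    """Approximate 95% confidence interval for the slope on the log-odds scale."""
    return (b - z * se, b + z * se)


def average_probability_effect(b, p_bar):
    """Slope of the logistic curve evaluated at the probability p_bar."""
    return b * p_bar * (1.0 - p_bar)


# HYPOTHETICAL placeholder estimates; substitute the values from the fitted model
b_age, se_age = 0.05, 0.01

# Overall observed proportion with diabetes, from the table for problem 4
p_bar = (4 + 8 + 12 + 5) / (80 + 150 + 140 + 30)

print(logit_difference(b_age, 30, 38))     # part a: age 30 compared to age 38
print(odds_multiplier(b_age, 5))           # part c: five-year increase
print(odds_multiplier(b_age, 1))           # part e: odds ratio per year of age
print(average_probability_effect(b_age, p_bar))  # part f
print(wald_ci(b_age, se_age))              # part g
```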
$\pi_i = f(S, A, X, W) \qquad (3)$
where f is the logistic function, S is occupational status, A is age, X is a dummy coded 1 for females, W is weight. The results of fitting various models, all of which include an intercept, are:
M1: π = f(S, A, X, W) D = 60
M2: π = f(S, X) D = 210
M3: π = f(A, W) D = 80
M4: π = α D = 280
M5: π = f(S, X, W) D = 150
M6: π = f(A, X, W) D = 62
where D = -2log(L). Letting the level of significance be .05:
a. Test the null hypothesis that the coefficients of status, weight, age, and sex are simultaneously zero against the alternative that not all are zero.
b. Test the hypothesis that there are no status differences in heart disease once sex, weight and age are controlled.
c. Test the hypothesis that the coefficients of sex and status are both zero once age and weight are controlled (alternative hypothesis is that not both are zero).
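Each of these hypotheses compares a restricted model with the full model M1, so the test statistic is the difference in D between the two, referred to a chi-square distribution with degrees of freedom equal to the number of coefficients set to zero. The following is a minimal Python sketch of that likelihood-ratio comparison. The pairing of restricted models with parts a to c is my reading of the questions, and the critical values are the standard 5% chi-square table values.

```python
# Deviance D = -2 log(L) and number of slope coefficients for each fitted model
models = {
    "M1": (60, 4),   # S, A, X, W
    "M2": (210, 2),  # S, X
    "M3": (80, 2),   # A, W
    "M4": (280, 0),  # intercept only
    "M5": (150, 3),  # S, X, W
    "M6": (62, 3),   # A, X, W
}

# Upper 5% critical values of the chi-square distribution, keyed by degrees of freedom
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def lr_test(restricted, full="M1"):
    """Return (statistic, df, reject_at_05) for restricted vs. full model."""
    d_restricted, k_restricted = models[restricted]
    d_full, k_full = models[full]
    stat = d_restricted - d_full
    df = k_full - k_restricted
    return stat, df, stat > CHI2_CRIT_05[df]


# a: all four coefficients zero (M4); b: no status effect (M6); c: no sex or status effect (M3)
for part, restricted in [("a", "M4"), ("b", "M6"), ("c", "M3")]:
    stat, df, reject = lr_test(restricted)
    print(part, stat, df, reject)
# a 220 4 True
# b 2 1 False
# c 20 2 True
```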