Regression Analysis of Urban Mortality: Identifying Key Predictors, Assignments of Statistics

An analysis of urban mortality data using nonlinear and multiple regression methods. The study identifies the correlation between housing conditions (shs), population density (pop/house), and percentage of white-collar workers (%whitecollar) with mortality rates. The document also includes statistical tests to determine the significance of each predictor variable and the overall model fit.

Typology: Assignments

Pre 2010

Uploaded on 03/28/2010

koofers-user-rm1

Homework Solution (#6)
Additional: Analysis of Urban Mortality data
Chapter 12. Nonlinear and Multiple Regression
1.
a). ρ(Mortality, Shs) = -√(46358/228398) = -.4505
ρ(Shs, Resid. Mort//SH) = 0
b). To build the best possible multiple regression prediction equation from Shs and
one more predictor variable, we choose %WhiteCollar, since it has the largest
absolute correlation with the residual.
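
The correlation in part a) can be checked from the two sums of squares. The sketch below assumes that 46358 is the model sum of squares when Mortality is regressed on Shs alone and that 228398 is the corrected total sum of squares; the variable names are illustrative.

```python
import math

# Assumed inputs, read off the solution's arithmetic:
ss_model_shs = 46358.0   # model sum of squares, Mortality regressed on Shs alone
ss_total = 228398.0      # corrected total sum of squares

# In simple regression, R^2 equals the squared correlation.
r_squared = ss_model_shs / ss_total

# The Shs slope is negative, so the correlation is the negative square root.
rho = -math.sqrt(r_squared)

print(f"R^2 = {r_squared:.4f}, rho = {rho:.4f}")
```

The residual from that fit is uncorrelated with Shs by construction, which is why the second correlation in part a) is 0.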
2.
ρ(Shs, %WhiteCollar) = .1659
-- This makes sense: as sound housing in an area goes up, so should house prices
and the proportion of white-collar workers.
ρ(Shs, p<3) = -.4386
-- This makes sense: as sound housing goes up, the proportion of lower-income
people goes down.
ρ(Shs, pop/house) = -.4228
-- This makes sense: as sound housing goes up, family income goes up and fewer
people have to live in each room.
3.
a). Rsquare = 55118/228398 = .241
b). Root Mean Square Error = √3040 = 55.14
c). Mean of Response = 940.3 (from JMP summary on page 8)
d). Shs – t Ratio = -4.445/1.576 = -2.82
e). Shs – Prob>|t| = P(|t_57| > 2.82) = .006
f). pop/house – Std Error = 74.12/1.697 = 43.68
g). pop/house – t Ratio = √2.88 = 1.697
h). pop/house – Prob>|t| = P(|t_57| > 1.697) = .0951
i). Shs – F Ratio = 24194/3040 = 7.96
j). Shs – Prob>F = P(F_{1,57} > 7.96) = .006
k). pop/house – Sum of Squares = 55118 – 46358 = 8760
l). pop/house – F Ratio = 8760/3040 = 2.88
m). pop/house – Prob>F = P(F_{1,57} > 2.88) = .0951
n). Model – DF = 2
o). Model – Sum of Squares = 228398 – 173280 = 55118
p). Model – Mean Square = 55118/2 = 27559
q). Model – F Ratio = 27559/3040 = 9.07
r). Error – DF = 57
s). Error – Mean Square = 173280/57 = 3040
t). C total – DF = 59
u). C total – Sum of Squares = 228398 (from Summary 1)
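
The table entries in parts a) through u) all follow from a handful of sums of squares and the two coefficient estimates. The sketch below recomputes them in Python; it assumes the quoted values (total SS 228398, error SS 173280, Shs-only model SS 46358, Shs effect SS 24194), and the variable names are illustrative rather than taken from the JMP output.

```python
import math

# Quoted inputs (assumed to be read from the JMP report).
ss_total, ss_error = 228398.0, 173280.0
df_model, df_error = 2, 57
ss_model_shs_only = 46358.0      # model SS with Shs as the only predictor
ss_shs_effect = 24194.0          # Shs effect SS in the two-predictor model

# Analysis of variance.
ss_model = ss_total - ss_error            # part o)
ms_model = ss_model / df_model            # part p)
ms_error = ss_error / df_error            # part s)
f_model = ms_model / ms_error             # part q)
r_square = ss_model / ss_total            # part a)
rmse = math.sqrt(ms_error)                # part b)

# Effect tests: each F ratio is an effect SS over the error mean square.
ss_pop = ss_model - ss_model_shs_only     # part k)
f_shs = ss_shs_effect / ms_error          # part i)
f_pop = ss_pop / ms_error                 # part l)

# With one numerator degree of freedom, |t| = sqrt(F).
t_shs = -4.445 / 1.576                    # part d)
t_pop = math.sqrt(f_pop)                  # part g)
se_pop = 74.12 / t_pop                    # part f)

print(f"R^2={r_square:.3f} RMSE={rmse:.2f} F_model={f_model:.2f}")
print(f"F_shs={f_shs:.2f} F_pop={f_pop:.2f} t_shs={t_shs:.2f} t_pop={t_pop:.3f}")
```

As a consistency check, `t_shs ** 2` comes out close to `f_shs`, as expected when the same effect is tested with a t ratio and with a one-degree-of-freedom F ratio.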

Model: y = β_0 + β_1 Shs + β_2 Pop/House + ε

e). H_0: β_1 = 0 vs. H_a: β_1 ≠ 0; p-value = .006 < .05, so reject H_0.
h). H_0: β_2 = 0 vs. H_a: β_2 ≠ 0; p-value = .0951 > .05, so don't reject H_0.
j). H_0: β_1 = 0 vs. H_a: β_1 ≠ 0; p-value = .006 < .05, so reject H_0.
m). H_0: β_2 = 0 vs. H_a: β_2 ≠ 0; p-value = .0951 > .05, so don't reject H_0.
q). H_0: β_1 = β_2 = 0 vs. H_a: β_1 ≠ 0 or β_2 ≠ 0; p-value = .0004 < .05, so reject H_0.

i). Mortality = 1058 – 4.445*78 + 74.12*3.1 = 941.1
ii). Residual = 1000 – 941.1 = 58.9
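
Parts i) and ii) are a direct evaluation of the fitted equation. A minimal sketch, assuming the coefficients 1058, -4.445 and 74.12 and a community with Shs = 78, pop/house = 3.1 and observed mortality 1000, as implied by the arithmetic above; the function name is hypothetical.

```python
# Fitted coefficients as used in the solution (intercept, Shs, pop/house).
B0, B1, B2 = 1058.0, -4.445, 74.12

def predict_mortality(shs: float, pop_per_house: float) -> float:
    """Fitted mortality from the two-predictor model."""
    return B0 + B1 * shs + B2 * pop_per_house

fitted = predict_mortality(shs=78, pop_per_house=3.1)
residual = 1000 - fitted   # observed minus fitted

print(f"fitted = {fitted:.1f}, residual = {residual:.1f}")
```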

iii). Suppose ε ~ N(0, σ^2). Then σ̂ = √3040 = 55.14, so P(ε_i > 58.9) = P(Z > 58.9/55.14) = P(Z > 1.07) ≈ .14.
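
The tail probability in part iii) can be evaluated with the standard library alone. This sketch assumes the error is normal with mean 0 and standard deviation equal to the root mean square error, 55.14; the helper name is illustrative.

```python
import math

def normal_upper_tail(x: float, sigma: float) -> float:
    """P(eps > x) for eps ~ N(0, sigma^2), via the complementary error function."""
    z = x / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

sigma_hat = 55.14          # root mean square error from the fit
p_exceed = normal_upper_tail(58.9, sigma_hat)

print(f"z = {58.9 / sigma_hat:.3f}, P(eps > 58.9) = {p_exceed:.3f}")
```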

iv). 90% prediction interval = (941.1 – t_{.05,57} * 56.34, 941.1 + t_{.05,57} * 56.34) = (846.9, 1035.3), since SD(ŷ) = 56.34.
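
The interval in part iv) is the fitted value plus or minus a t critical value times 56.34. The sketch below hard-codes t_{.05,57} ≈ 1.672 as an assumption taken from a t table, since the source does not print the critical value it used.

```python
# Assumption: upper 5% point of the t distribution with 57 df, from a t table.
T_CRIT_05_57 = 1.672

fitted = 941.1        # predicted mortality from part i)
sd_pred = 56.34       # standard deviation quoted in the solution

half_width = T_CRIT_05_57 * sd_pred
lower, upper = fitted - half_width, fitted + half_width

print(f"90% prediction interval: ({lower:.1f}, {upper:.1f})")
```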

99% confidence interval for β_1 = -4.445 ± t_{.005,57} * 1.576 = (-8.67, -.224)
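
A quick way to sanity-check a reported confidence interval is to confirm that it is centered on the estimate and to back out the critical value it implies. The sketch below does this for the interval above; the implied critical value is derived from the reported endpoints and is not stated in the source.

```python
beta1_hat, se_beta1 = -4.445, 1.576
lower, upper = -8.67, -0.224           # reported 99% interval

midpoint = (lower + upper) / 2         # should match the estimate
half_width = (upper - lower) / 2
implied_t = half_width / se_beta1      # critical value implied by the endpoints

print(f"midpoint = {midpoint:.3f}, implied critical value = {implied_t:.2f}")
```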

β_2 = 74.12, so with Shs held constant a decrease of 0.5 in pop/house lowers the predicted mortality by 0.5 * 74.12 ≈ 37. This supports the claim: “If the value of SHs in a community is held constant then decreasing the pop/house by 0.5 people will reduce the mortality rate by roughly 35 per 100,000. Consequently, reducing the pop/house while holding Shs constant increases life expectancy.”

From the residual plot, point B has a small residual, while points A and C have large residuals. From the leverage plots, points A and C should be labeled as high-leverage points. A high-leverage point is one whose exclusion would change the estimated coefficient substantially.