Multiple Regression: Model and Estimation - Prof. Emiliano Valdez, Study notes of Mathematics

University of Connecticut (UConn) - Avery Point Mathematics

Prof. Emiliano Valdez

This document, used in a University of Connecticut - Storrs Math 3621 Applied Actuarial Statistics course in the Fall 2009 semester, provides an in-depth analysis of multiple regression, including the model, estimation, least squares, the hat matrix, the Gauss-Markov theorem, and goodness of fit measures. It also includes an example on catastrophic bonds and an additional case study on the demand for term life insurance.

Typology: Study notes

Pre 2010

Uploaded on 02/25/2010


Multiple Regression: Model and Estimation

Math 3621 Applied Actuarial Statistics

Fall 2009 semester

EA Valdez

University of Connecticut - Storrs

Lecture Weeks 6-7

Outline

Introduction: The regression model; Least squares estimates; The hat (or projection) matrix; Properties; Gauss-Markov Theorem; Some goodness of fit measures

An example - catastrophic bonds: Morton Lane's study; Initial data analysis; Preliminary visual analysis; R source codes for fitting the linear models; Interpreting the regression coefficients

Added variable plots: What are they? How to do added variable plots?

Additional case study: Demand for term life insurance


The observable data

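Assume our observed data set consists of

$(X_{i0}, X_{i1}, \ldots, X_{ik}, Y_i)$ for $i = 1, 2, \ldots, n$, where

$n$ is the total number of observations;

$X_{i0}$ is associated with the "intercept" term and is usually 1; and

$k$ is the number of explanatory variables.

Define the vector of responses, $\mathbf{Y}$, and the matrix of explanatory variables, $\mathbf{X}$, as

$$
\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
\quad \text{and} \quad
\mathbf{X} = \begin{pmatrix}
X_{10} & X_{11} & X_{12} & \cdots & X_{1k} \\
X_{20} & X_{21} & X_{22} & \cdots & X_{2k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
X_{n0} & X_{n1} & X_{n2} & \cdots & X_{nk}
\end{pmatrix}
$$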
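As a rough illustration of the matrix of explanatory variables, the sketch below builds one in R with `model.matrix`, which adds the column of ones for the intercept. The data frame `dat` and its values are hypothetical and are not taken from the lecture.

```r
# Hypothetical explanatory variables: n = 5 observations, k = 2 variables
dat <- data.frame(x1 = c(0.2, 1.5, 3.1, 4.0, 5.7),
                  x2 = c(10, 12, 9, 15, 11))

# model.matrix() prepends the intercept column X_{i0} = 1 automatically
X <- model.matrix(~ x1 + x2, data = dat)

print(X)
print(dim(X))   # n rows and k + 1 columns
```

The first column corresponds to $X_{i0}$, so the matrix has $k + 1$ columns in total.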


Specific individual observation

For a specific observation $i$, define the row vector of observed explanatory variables by

$$
\mathbf{X}_i' = [X_{i0}, X_{i1}, \ldots, X_{ik}].
$$

Thus, we see that the regression model for this specific observation can be written as

$$
Y_i = \mathbf{X}_i' \boldsymbol{\beta} + \varepsilon_i.
$$


Least squares estimates

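The least squares estimate of $\boldsymbol{\beta}$, denoted $\mathbf{b}$, minimizes the sum of squares

$$
SS(\boldsymbol{\beta}) = \boldsymbol{\varepsilon}'\boldsymbol{\varepsilon} = (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta}).
$$

Note that there are $(k + 1)$ parameters to estimate, including the intercept.

Differentiating with respect to $\boldsymbol{\beta}$ and setting the result to zero, we have the normal equations:

$$
\mathbf{X}'\mathbf{X}\mathbf{b} = \mathbf{X}'\mathbf{Y},
$$

where $\mathbf{b}$ is the least squares vector.

Provided $\mathbf{X}'\mathbf{X}$ is invertible, we have

$$
\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}.
$$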
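The normal equations can be checked numerically. The following is a minimal sketch on simulated data (the variables `x1`, `x2`, `y` and the true coefficients are assumptions made for illustration, not lecture data); it solves $\mathbf{X}'\mathbf{X}\mathbf{b} = \mathbf{X}'\mathbf{Y}$ directly and compares the result with `lm`.

```r
# Simulated data (hypothetical; not from the lecture)
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)

# Design matrix with a leading column of ones for the intercept
X <- cbind(1, x1, x2)

# Solve X'X b = X'Y; crossprod(X) is X'X and crossprod(X, y) is X'Y
b <- solve(crossprod(X), crossprod(X, y))

# Compare with R's built-in least squares fit
fit <- lm(y ~ x1 + x2)
print(cbind(normal_eq = drop(b), lm = coef(fit)))
```

Solving the linear system with `solve(A, B)` avoids forming the inverse explicitly. `lm` itself uses a QR decomposition, which is numerically more stable, so the two columns should agree only up to floating-point error.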


Properties of the parameter estimates

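Unbiased estimates:

$$
E(\mathbf{b}) = \boldsymbol{\beta}.
$$

Variance-covariance matrix:

$$
\mathrm{Var}(\mathbf{b}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}.
$$

Estimate for $\sigma^2$:

$$
s^2 = \text{Error MS} = \frac{\text{Error SS}}{n - (k + 1)}.
$$

Standard error for a particular component of $\mathbf{b}$:

$$
se(b_{i-1}) = s \sqrt{\left[(\mathbf{X}'\mathbf{X})^{-1}\right]_{ii}}, \quad \text{for } i = 1, 2, \ldots, k + 1.
$$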
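A small sketch of these formulas in R, again on simulated data (all variable names and the data-generating values are assumptions for illustration): it computes $s^2$, the estimated variance-covariance matrix, and the standard errors by hand, then compares them with what `summary(lm(...))` reports.

```r
# Simulated data (hypothetical; not from the lecture)
set.seed(2)
n  <- 40
k  <- 2
x1 <- runif(n)
x2 <- runif(n)
y  <- 0.5 + 1.5 * x1 + 0.8 * x2 + rnorm(n, sd = 0.2)
X  <- cbind(1, x1, x2)

# Least squares vector b = (X'X)^{-1} X'Y
XtX_inv <- solve(crossprod(X))
b <- XtX_inv %*% crossprod(X, y)

# s^2 = Error SS / (n - (k + 1))
error_ss <- sum((y - X %*% b)^2)
s2 <- error_ss / (n - (k + 1))

# Estimated Var(b) = s^2 (X'X)^{-1}; standard errors are the root diagonals
var_b <- s2 * XtX_inv
se_b  <- sqrt(diag(var_b))

# Compare with the values reported by summary.lm
fit   <- lm(y ~ x1 + x2)
se_lm <- summary(fit)$coefficients[, "Std. Error"]
print(cbind(by_hand = se_b, lm = se_lm))
```

The explicit inverse is used here only because the formula calls for its diagonal; `vcov(fit)` returns the same matrix directly.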


Gauss-Markov Theorem

There are several reasons why the least squares estimates $\mathbf{b}$ are good estimates of $\boldsymbol{\beta}$:

Geometrically, they make sense because they result from an orthogonal projection of $\mathbf{Y}$ onto the linear space spanned by the columns of $\mathbf{X}$.

They are equivalent to the maximum likelihood estimates in the case where the errors are i.i.d. normally distributed.

According to the Gauss-Markov theorem, the least squares estimates are Best Linear Unbiased Estimates (BLUE).

Details of the proof of the Gauss-Markov theorem will be provided in lectures.
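The geometric point can be made concrete with the hat (or projection) matrix listed in the outline, $\mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$. The sketch below, on simulated data with hypothetical variable names, checks three standard facts numerically: $\mathbf{H}$ is idempotent, the residuals are orthogonal to every column of $\mathbf{X}$, and the trace of $\mathbf{H}$ equals $k + 1$.

```r
# Simulated data (hypothetical; not from the lecture)
set.seed(3)
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 - x1 + 0.5 * x2 + rnorm(n)
X  <- cbind(1, x1, x2)

# Hat matrix: projects Y onto the column space of X
H <- X %*% solve(crossprod(X)) %*% t(X)

y_hat <- drop(H %*% y)   # fitted values
e     <- y - y_hat       # residuals

idempotent_gap <- max(abs(H %*% H - H))       # H H = H for a projection
orthogonal_gap <- max(abs(crossprod(X, e)))   # X'e = 0
trace_H        <- sum(diag(H))                # equals k + 1 = 3 here

print(c(idempotent_gap = idempotent_gap,
        orthogonal_gap = orthogonal_gap,
        trace_H = trace_H))
```

Both gaps should be zero up to floating-point error. Forming the full $n \times n$ matrix is fine for a small illustration but would be avoided for large $n$.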


Some goodness of fit measures

The proportion of variability explained by the regression model (still, just like the simple linear regression model) is

$$
R^2 = \frac{\text{Regression SS}}{\text{Total SS}} = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}.
$$

This is also called the coefficient of determination.

When an explanatory variable is added to the regression model, unfortunately, this $R^2$ never decreases.

The adjusted $R^2$, defined by

$$
R_a^2 = 1 - \frac{\text{Error SS}/(n - (k + 1))}{\text{Total SS}/(n - 1)} = 1 - \frac{s^2}{s_Y^2},
$$

provides for the proportion of the variation explained by the regression, but adjusted for the number of predictor variables (or degrees of freedom).
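To see both measures at work, the sketch below computes $R^2$ and $R_a^2$ by hand for two nested models on simulated data, where the larger model adds a predictor (`noise`) that is unrelated to the response by construction. All names and data here are assumptions for illustration. With an intercept in the model, Regression SS = Total SS - Error SS, so $R^2$ is computed as $1 - \text{Error SS}/\text{Total SS}$.

```r
# Simulated data (hypothetical; not from the lecture)
set.seed(4)
n     <- 60
x1    <- rnorm(n)
noise <- rnorm(n)                    # unrelated to y by construction
y     <- 1 + 0.7 * x1 + rnorm(n)

# R^2 and adjusted R^2 computed from the sums of squares
fit_stats <- function(fit) {
  y_obs    <- model.response(model.frame(fit))
  n_obs    <- length(y_obs)
  p        <- length(coef(fit))      # k + 1 parameters
  error_ss <- sum(residuals(fit)^2)
  total_ss <- sum((y_obs - mean(y_obs))^2)
  c(r2     = 1 - error_ss / total_ss,
    adj_r2 = 1 - (error_ss / (n_obs - p)) / (total_ss / (n_obs - 1)))
}

small <- lm(y ~ x1)
big   <- lm(y ~ x1 + noise)

stats_small <- fit_stats(small)
stats_big   <- fit_stats(big)
print(rbind(small = stats_small, big = stats_big))
```

The `r2` entry for the larger model cannot be below that of the smaller one, whereas `adj_r2` is free to move in either direction because of the degrees-of-freedom penalty.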


Morton Lane’s study of catastrophic bonds

Published in ASTIN Bulletin (Vol. 30, Year 2000).

Lane fitted regression models to help explain the pricing of risk transfer in the catastrophic bond market.

CAT bonds refer to securities that provide for coupon payments and principal based on the aggregate losses of a portfolio of insurance contracts.

CAT bonds are meant to give insurance companies a way to manage catastrophic insurance risks and, at the same time, to give investors the opportunity to profit from the transfer of insurance risks.

Lane considered 16 catastrophic bond issues made in 1999.


Preliminary investigation of the data

    # read the data file
    > cat.bond <- read.csv("C:/Documents and Settings/.../Math238-Fall2007/Data/CATBond-data.csv")
    > attach(cat.bond)

    > cat.bond
            Transaction    EER    PFL CEL
    1         Mosaic 2A 0.0364 0.0115  0.
    2         Mosaic 2B 0.0552 0.0525  0.
    3        Halyard Re 0.0393 0.0084  0.
    4       Domestic Re 0.0324 0.0058  0.
    5     Concentric Re 0.0272 0.0064  0.
    6           Juno Re 0.0381 0.0060  0.
    7    Residential Re 0.0327 0.0076  0.
    8  Kelvin 1st Event 0.0652 0.1210  0.
    9  Kelvin 2nd Event 0.0452 0.0156  0.
    10     Gold Eagle A 0.0282 0.0017  1.
    11     Gold Eagle B 0.0485 0.0078  0.
    12        Namazu Re 0.0381 0.0100  0.
    13       Atlas Re A 0.0263 0.0019  0.
    14       Atlas Re B 0.0352 0.0029  0.
    15       Atlas Re C 0.1095 0.0547  0.
    16      Seismic Ltd 0.0383 0.0113  0.

    > summary(cat.bond)
            Transaction       EER               PFL               CEL
     Atlas Re A   : 1   Min.   :0.02630   Min.   :0.00170   Min.   :0.
     Atlas Re B   : 1   1st Qu.:0.03263   1st Qu.:0.00595   1st Qu.:0.
     Atlas Re C   : 1   Median :0.03810   Median :0.00810   Median :0.
     Concentric Re: 1   Mean   :0.04349   Mean   :0.02032   Mean   :0.
     Domestic Re  : 1   3rd Qu.:0.04603   3rd Qu.:0.01253   3rd Qu.:0.
     Gold Eagle A : 1   Max.   :0.10950   Max.   :0.12100   Max.   :1.
     (Other)      :

    # you can also do mean, sd, quantiles, etc.; not done here.
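The additional summaries mentioned above (mean, sd, quantiles) could be computed along the following lines. This sketch re-enters only the EER and PFL columns from the listing, since the CEL values are not fully legible in it; the object name `cat_bond` and the choice of quantiles are assumptions for illustration.

```r
# EER and PFL as printed in the data listing (CEL omitted: values incomplete)
cat_bond <- data.frame(
  EER = c(0.0364, 0.0552, 0.0393, 0.0324, 0.0272, 0.0381, 0.0327, 0.0652,
          0.0452, 0.0282, 0.0485, 0.0381, 0.0263, 0.0352, 0.1095, 0.0383),
  PFL = c(0.0115, 0.0525, 0.0084, 0.0058, 0.0064, 0.0060, 0.0076, 0.1210,
          0.0156, 0.0017, 0.0078, 0.0100, 0.0019, 0.0029, 0.0547, 0.0113)
)

# Mean, standard deviation, and quartiles for each column
stats <- sapply(cat_bond, function(v) {
  c(mean = mean(v), sd = sd(v), quantile(v, c(0.25, 0.5, 0.75)))
})

print(round(stats, 5))
```

The means and medians obtained this way should match the `summary(cat.bond)` output for EER and PFL up to rounding.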


Histogram of variables

    # histograms and sorted plots
    > par(mfrow=c(3,2))
    > hist(EER, breaks=10)
    > plot(sort(EER), pch=3)
    > hist(PFL, breaks=10)
    > plot(sort(PFL), pch=3)
    > hist(CEL, breaks=10)
    > plot(sort(CEL), pch=3)

[Figure: 3 x 2 panel of plots. Left column: "Histogram of EER", "Histogram of PFL", and "Histogram of CEL", with Frequency on the vertical axis. Right column: sort(EER), sort(PFL), and sort(CEL), each plotted against Index.]


Scatter plot matrix

    # scatter plot matrix
    > pairs(data.frame(EER,PFL,CEL), cex=1.5, pch=19)

[Figure: scatter plot matrix of EER, PFL, and CEL.]


Scatter plot matrix

    # scatter plot matrix on the log scale
    > pairs(data.frame(log(EER),log(PFL),log(CEL)), cex=1.5, pch=19,
            labels=c("log(EER)","log(PFL)","log(CEL)"))

[Figure: scatter plot matrix of log(EER), log(PFL), and log(CEL).]


R source codes for fitting the linear models

    # fitting the linear model with EER as response and PFL and CEL as predictors
    > lm1 <- lm(EER ~ PFL + CEL)
    > summary(lm1)

    Call:
    lm(formula = EER ~ PFL + CEL)

    Residuals:
           Min         1Q     Median         3Q        Max
    -0.0217089 -0.0061226 -0.0016851  0.0005938         0.

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.032502   0.017062   1.905       0. .
    PFL         0.439915   0.153191   2.872   0.0131 *
    CEL         0.003201   0.023279   0.138       0.
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.01646 on 13 degrees of freedom
    Multiple R-Squared: 0.4361, Adjusted R-squared: 0.
    F-statistic: 5.027 on 2 and 13 DF, p-value: 0.

    # ANOVA table
    > anova(lm1)
    Analysis of Variance Table

    Response: EER
              Df    Sum Sq   Mean Sq F value   Pr(>F)
    PFL        1 0.0027204 0.0027204 10.0357 0.007412 **
    CEL        1 0.0000051 0.0000051  0.0189       0.
    Residuals 13 0.0035239        0.
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
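The summary and ANOVA outputs are tied together by the sums of squares. As a quick arithmetic check, the sketch below takes the three "Sum Sq" entries and the degrees of freedom from the ANOVA table for `lm1` and recomputes $R^2$, the residual standard error, and the overall F-statistic. The variable names are chosen here for illustration.

```r
# Sums of squares and degrees of freedom copied from anova(lm1)
ss_pfl   <- 0.0027204
ss_cel   <- 0.0000051
ss_resid <- 0.0035239
df_reg   <- 2     # k predictors
df_resid <- 13    # n - (k + 1) with n = 16

reg_ss   <- ss_pfl + ss_cel
total_ss <- reg_ss + ss_resid

r2     <- reg_ss / total_ss                          # Regression SS / Total SS
s      <- sqrt(ss_resid / df_resid)                  # residual standard error
f_stat <- (reg_ss / df_reg) / (ss_resid / df_resid)  # overall F-statistic

print(round(c(R2 = r2, s = s, F = f_stat), 4))
```

Up to rounding of the table entries, these reproduce the values shown in `summary(lm1)`: Multiple R-squared 0.4361, residual standard error 0.01646, and F-statistic 5.027.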


R source codes for fitting the linear models

    # fitting the linear model with log(EER) as response and log(PFL) and log(CEL) as predictors
    > lm2 <- lm(log(EER) ~ log(PFL) + log(CEL))
    > summary(lm2)

    Call:
    lm(formula = log(EER) ~ log(PFL) + log(CEL))

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.28900 -0.12959 -0.04742  0.08484       0.

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -1.80250    0.29559  -6.098 3.79e-05 ***
    log(PFL)     0.28668    0.05283   5.427 0.000116 ***
    log(CEL)     0.15409    0.15057   1.023       0.
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.206 on 13 degrees of freedom
    Multiple R-Squared: 0.72, Adjusted R-squared: 0.
    F-statistic: 16.71 on 2 and 13 DF, p-value: 0.

    # ANOVA table
    > anova(lm2)
    Analysis of Variance Table

    Response: log(EER)
              Df  Sum Sq Mean Sq F value   Pr(>F)
    log(PFL)   1 1.37427 1.37427 32.3816 7.41e-05 ***
    log(CEL)   1 0.04445 0.04445  1.0474       0.
    Residuals 13 0.55172      0.
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
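Because both the response and the predictors are on the log scale, the slope estimates in `lm2` can be read as elasticities: holding CEL fixed, multiplying PFL by a factor $c$ multiplies the fitted EER by $c^{0.28668}$. The sketch below illustrates this with the reported coefficients. The helper `predict_eer` and the input values 0.01, 0.02, and 0.5 are hypothetical, and back-transforming with `exp` is a simplification that ignores any retransformation bias.

```r
# Coefficient estimates copied from summary(lm2)
b0    <- -1.80250
b_pfl <-  0.28668
b_cel <-  0.15409

# Back-transformed fitted value (hypothetical helper for illustration)
predict_eer <- function(pfl, cel) {
  exp(b0 + b_pfl * log(pfl) + b_cel * log(cel))
}

# Effect of doubling PFL while holding CEL fixed (illustrative inputs)
ratio <- predict_eer(pfl = 0.02, cel = 0.5) / predict_eer(pfl = 0.01, cel = 0.5)

print(c(ratio = ratio, two_to_b_pfl = 2^b_pfl))
```

The ratio equals $2^{0.28668}$, roughly 1.22, regardless of the CEL value chosen: under this fitted model, doubling PFL is associated with about a 22% higher EER.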
