Solutions to Exercise 1 in Applied Actuarial Statistics, Fall 2008 - Prof. Emiliano Valdez, Assignments of Mathematics

University of Connecticut (UConn) - Avery Point Mathematics

Prof. Emiliano Valdez

Solutions to Exercise 1 in the Applied Actuarial Statistics course offered at the University of Connecticut in Fall 2008. The solutions cover expanding a sum of squares, finding percentiles, and investigating the normality of data. The document also includes R code for generating histograms and Q-Q plots.

Typology: Assignments

Pre 2010

Uploaded on 02/25/2010


MATH 3621, APPLIED ACTUARIAL STATISTICS, FALL 2008: SOLUTIONS TO EXERCISE 1

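Problem 1:

(a) By expanding the square, we get

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} \left(Y_i^2 - 2 Y_i \bar{Y} + \bar{Y}^2\right) = \sum_{i=1}^{n} Y_i^2 - 2 \sum_{i=1}^{n} Y_i \bar{Y} + \sum_{i=1}^{n} \bar{Y}^2.$$

(b) Because $\bar{Y}$ is a constant with respect to the summation and $n\bar{Y} = \sum_{i=1}^{n} Y_i$, we have

$$\sum_{i=1}^{n} Y_i \bar{Y} = \sum_{i=1}^{n} \bar{Y} Y_i = \bar{Y} \sum_{i=1}^{n} Y_i = \bar{Y}(n\bar{Y}) = n\bar{Y}^2.$$

Similarly, we have

$$\sum_{i=1}^{n} \bar{Y}^2 = n\bar{Y}^2.$$

(c) Putting together the results of parts (a) and (b), we have

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - 2n\bar{Y}^2 + n\bar{Y}^2 = \left(\sum_{i=1}^{n} Y_i^2\right) - n\bar{Y}^2,$$

from which the desired result follows.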
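As a quick numerical sanity check of the identity in part (c), the following R sketch evaluates both sides on a small made-up sample. The vector `y` is hypothetical and is not part of the exercise data.

```r
# Hypothetical sample (not from the exercise); any numeric vector works.
y <- c(3.2, 5.1, 4.8, 6.0, 2.9)
n <- length(y)
ybar <- mean(y)

lhs <- sum((y - ybar)^2)        # sum of squared deviations from the mean
rhs <- sum(y^2) - n * ybar^2    # shortcut form derived in part (c)

# The two sides should agree up to floating-point rounding.
print(all.equal(lhs, rhs))
```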

Problem 2:

(a) With $\alpha = 75$ and $n = 30$ observations, the 75th percentile is approximately the value of the $\left((n-1)\alpha/100 + 1\right)$-th observation, or equivalently the $(30-1) \times 75/100 + 1 = 22.75$-th observation. This requires interpolating between the 22nd and 23rd observations. We have

$$Y_{(22.75)} = 0.25\,Y_{(22)} + 0.75\,Y_{(23)} = 0.25(319.9) + 0.75(324.5) = 323.35.$$

This value says that approximately 75% of the observations are below 323.35.
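The position rule $(n-1)\alpha/100 + 1$ is the same one used by R's default `quantile` method (`type = 7`). The sketch below first reproduces the interpolation from the two order statistics quoted in the solution, then checks the rule against `quantile` on a simulated sample. The simulated vector `x` is an assumption for illustration only, since the original 30 observations are not reproduced here.

```r
# Order statistics quoted in the solution; the full sample is not available here.
y22 <- 319.9
y23 <- 324.5

n <- 30
alpha <- 75
pos <- (n - 1) * alpha / 100 + 1   # position of the percentile among the sorted values
frac <- pos - floor(pos)           # fractional part used as the interpolation weight

# Linear interpolation between the 22nd and 23rd order statistics.
p75 <- (1 - frac) * y22 + frac * y23
print(c(position = pos, percentile = p75))

# Check the same rule against quantile(type = 7) on a hypothetical sample.
set.seed(1)
x <- sort(rnorm(30))
by_hand <- 0.25 * x[22] + 0.75 * x[23]
print(all.equal(by_hand, unname(quantile(x, 0.75, type = 7))))
```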

(b) Since

$$\frac{359.9 - \bar{Y}}{s_Y} = \frac{359.9 - 278.6}{53.8656} = 1.51,$$

the value 359.9 is 1.51 standard deviations above the mean.

(c) The required probability is given by

$$P(Y > 359.9) = P\left(\frac{Y - 278.6}{53.8656} > \frac{359.9 - 278.6}{53.8656}\right) = P(Z > 1.51) = 1 - 0.9345 = 0.0655.$$
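The table lookup in part (c) can be reproduced with `pnorm`. This is a minimal sketch using the summary values quoted in the solution. The variable names are illustrative, and the second probability skips the rounding of the z-score to 1.51, so it differs slightly in the fourth decimal place.

```r
ybar <- 278.6      # sample mean from the solution
s_y  <- 53.8656    # sample standard deviation from the solution

z <- (359.9 - ybar) / s_y           # standardized value, about 1.51

# Probability using the rounded z-score, as in a table lookup.
p_table <- 1 - pnorm(1.51)

# Same probability without rounding the z-score first.
p_exact <- pnorm(359.9, mean = ybar, sd = s_y, lower.tail = FALSE)

print(round(c(z = z, p_table = p_table, p_exact = p_exact), 4))
```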


(d) First, note that $\sum_{i=1}^{30} Y_i = n\bar{Y} = 30(278.6) = 8358$. Furthermore, we have

$$\sum_{i=1}^{30} Y_i^2 - n\bar{Y}^2 = (n-1)s_Y^2 = (30-1)(53.8656)^2 = 84143.58.$$

Thus, we have

$$\sum_{i=1}^{30} Y_i^2 = n\bar{Y}^2 + (n-1)s_Y^2 = (30)(278.6)^2 + 84143.58 = 2412682.$$

(e) Omitting the two largest observations, we have

$$\sum_{i=1}^{28} Y_i = \sum_{i=1}^{30} Y_i - 2 \times 359.9 = 7638.2 \quad\text{and}\quad \sum_{i=1}^{28} Y_i^2 = \sum_{i=1}^{30} Y_i^2 - 2 \times 359.9^2 = 2153626.$$

(f) Thus, $\bar{Y}_{new} = 7638.2/28 = 272.7929$ and

$$s_{Y,new} = \sqrt{\frac{2153626 - 28(272.7929)^2}{28 - 1}} = 50.91022.$$

The percentage change in the mean is (278.6 - 272.7929)/278.6 = 2.1% and in the standard deviation is (53.8656 - 50.91022)/53.8656 = 5.5%. This says that removing the two largest observations decreased the mean by 2.1% and the standard deviation by 5.5%.
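The arithmetic in parts (d) to (f) can be replayed from the summary statistics alone. The sketch below assumes, as the subtraction in part (e) implies, that both omitted observations equal 359.9. The variable names are illustrative.

```r
n     <- 30
ybar  <- 278.6
s_y   <- 53.8656
y_max <- 359.9    # value of each of the two omitted observations

# Part (d): recover the raw sums from the mean and standard deviation.
sum_y  <- n * ybar
sum_y2 <- n * ybar^2 + (n - 1) * s_y^2

# Part (e): remove the two largest observations from both sums.
n_new      <- n - 2
sum_y_new  <- sum_y - 2 * y_max
sum_y2_new <- sum_y2 - 2 * y_max^2

# Part (f): recompute the mean and standard deviation, then the relative changes.
ybar_new <- sum_y_new / n_new
s_new    <- sqrt((sum_y2_new - n_new * ybar_new^2) / (n_new - 1))
pct_mean <- 100 * (ybar - ybar_new) / ybar
pct_sd   <- 100 * (s_y - s_new) / s_y

print(c(mean = ybar_new, sd = s_new, pct_mean = pct_mean, pct_sd = pct_sd))
```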

Problem 3:

(a) For the players’ salaries, the summary statistics are provided below:

> NFLSAL.1990 <- read.csv("C:/.../Math238-Fall2007/Exercises- /R-data-analysis/NFLSAL-1990.csv")
> attach(NFLSAL.1990)
> names(NFLSAL.1990)
[1] "nflsal90"

> summary(nflsal90)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  75000  165500  280000  353800  447500 1500000

> sd(nflsal90)
[1] 265297.

> quantile(nflsal90,c(0.25,0.75))
   25%    75%
165500 447500

(b) In investigating whether the players’ salaries follow a Normal distribution, we can do three things: (1) draw a histogram with superimposed Normal density, (2) draw a Normal QQ-plot, and (3) approximate the percentages within 1, 2 and 3 std deviations away from the mean. The histogram and QQ-plot are shown below (after the corresponding R commands).

# draw the histogram
> hist(nflsal90,br=25,freq=FALSE,xlab="",ylab="",
+ main="Histogram of 1990 NFL Players' Salaries",cex.main=1.5)
> mean.sal <- mean(nflsal90)
> sd.sal <- sd(nflsal90)

# draw superimposed Normal density
> curve(dnorm(x,mean=mean.sal,sd=sd.sal),from=0,to=1500000,add=TRUE,col="blue")

Note that none of these provides strong support for the claim that players’ salaries follow a Normal distribution.

(c) Now consider the natural logarithm of the players’ salaries. The summary statistics are below:

> log.nflsal90 <- log(nflsal90)

> summary(log.nflsal90)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  11.23   12.02   12.54   12.54   13.01     14.

> sd(log.nflsal90)
[1] 0.

> quantile(log.nflsal90,c(0.25,0.75))
     25%      75%
12.01671      13.

(d) In investigating whether a Normal distribution is suitable for the logged salaries, we follow the same procedure as in part (b).

# draw the histogram
> hist(log.nflsal90,br=25,freq=FALSE,xlab="",ylab="",
+ main="Histogram of the log of 1990 NFL Players' Salaries",cex.main=1.3)
> mean.logsal <- mean(log.nflsal90)
> sd.logsal <- sd(log.nflsal90)

# draw superimposed Normal density
> curve(dnorm(x,mean=mean.logsal,sd=sd.logsal),from=11,to=14.5,add=TRUE,col="blue")

# draw the Normal qq plot
> qqnorm(log.nflsal90,cex.lab=1.4,cex.axis=1.5,cex.main=1.5,cex=0.75)
> qqline(log.nflsal90,col="blue")

[Figure: "Histogram of the log of 1990 NFL Players' Salaries" with superimposed Normal density, and "Normal Q-Q Plot" (x-axis: Theoretical Quantiles, y-axis: Sample Quantiles)]

# computing percentages within one std dev
> count.log1sd <- sum(log.nflsal90<(mean.logsal+sd.logsal))
+ -sum(log.nflsal90<(mean.logsal-sd.logsal))
> pct.log1sd <- count.log1sd/length(log.nflsal90)
> print("within 1 sd")
[1] "within 1 sd"
> print(c(count.log1sd, pct.log1sd))
[1] 130 0.

# computing percentages within two std dev
> count.log2sd <- sum(log.nflsal90<(mean.logsal+2*sd.logsal))
+ -sum(log.nflsal90<(mean.logsal-2*sd.logsal))
> pct.log2sd <- count.log2sd/length(log.nflsal90)
> print("within 2 sd")
[1] "within 2 sd"
> print(c(count.log2sd, pct.log2sd))
[1] 191 0.

# computing percentages within three std dev
> count.log3sd <- sum(log.nflsal90<(mean.logsal+3*sd.logsal))
+ -sum(log.nflsal90<(mean.logsal-3*sd.logsal))
> pct.log3sd <- count.log3sd/length(log.nflsal90)
> print("within 3 sd")
[1] "within 3 sd"
> print(c(count.log3sd, pct.log3sd))
[1] 198 1

These results provide some evidence that the log of the players’ salaries follows a Normal distribution.
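The three "within k standard deviations" counts repeat the same pattern, so they can be folded into one small function and set beside the Normal benchmarks of roughly 68%, 95% and 99.7%. This is a sketch only: the helper name `pct_within` is hypothetical, and the data are simulated because the NFL salary file is not reproduced here. It uses strict inequalities on both sides, which gives the same counts as the original commands whenever no observation falls exactly on a boundary.

```r
# Hypothetical helper (not part of the original solution):
# proportion of observations strictly within k standard deviations of the mean.
pct_within <- function(x, k) {
  m <- mean(x)
  s <- sd(x)
  mean(x > m - k * s & x < m + k * s)
}

# Simulated stand-in for the logged salaries (an assumption for illustration).
set.seed(2008)
x <- rnorm(1000)

observed <- sapply(1:3, function(k) pct_within(x, k))
expected <- sapply(1:3, function(k) pnorm(k) - pnorm(-k))  # Normal benchmarks

print(round(rbind(observed, expected), 4))
```

With the real data, `pct_within(log.nflsal90, 1)` would take the place of the manual count for one standard deviation.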

(c) A 95% prediction interval for an additional observation is given by

$$\bar{Y} \pm 1.96\, s_Y \sqrt{1 + (1/1470)} = 3523 \pm 1.96 \times 4765.448 \times \sqrt{1 + (1/1470)}.$$

But because claims cannot be negative, the 95% prediction interval should be truncated to (0, 12866.45).
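The interval above can be evaluated directly. The sketch uses the mean, standard deviation and sample size quoted in the solution and the same 1.96 multiplier. The variable names are illustrative.

```r
ybar <- 3523        # sample mean of the claims
s_y  <- 4765.448    # sample standard deviation of the claims
n    <- 1470        # number of observations

# Half-width of the 95% prediction interval for one additional observation.
half_width <- 1.96 * s_y * sqrt(1 + 1 / n)

lower <- ybar - half_width
upper <- ybar + half_width

# Claims cannot be negative, so the lower limit is truncated at zero.
interval <- c(max(0, lower), upper)
print(round(interval, 2))
```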

Problem 5:

(a) The summary statistics are printed below:

> hur.loss <- read.csv("C:/.../HurricaneLosses.csv")
> hur.loss
   Year   Loss
1  1977   2000
2  1971   1380
3  1971   2000
4  1964   2000
5  1968   2580
6  1971   4730
7  1956   3700
8  1961   4250
9  1966   5400
10 1955   4500
11 1958   5000
12 1974  14720
13 1959   7900
14 1971  13500
15 1976  22697
16 1964  12000
17 1949   8300
18 1959  13000
19 1950  10450
20 1954  12500
21 1973  32300
22 1980  57911
23 1964  23000
24 1955  25200
25 1967  34800
26 1957  32200
27 1979 122070
28 1975 119189
29 1972  97853
30 1964  67200
31 1960  91000
32 1961 100000
33 1969 165300
34 1954 122050
35 1954 129700
36 1970 309950
37 1979 752510
38 1965 500000
> attach(hur.loss)
> names(hur.loss)
[1] "Year" "Loss"
> summary(Loss)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1380    5100   18710   77230   96140  752500
> sd(Loss)
[1] 148485.
> quantile(Loss,.95)
95%

(b) The histogram below, with a Normal density superimposed, together with the Normal Q-Q plot, clearly indicates that the Loss variable is too highly skewed for a Normal distribution to fit well.

> hist(Loss,br=25,cex=1.4,freq=FALSE)
> curve(dnorm(x,mean=mean(Loss),sd=sd(Loss)),from=0,to=800000,add=TRUE,col="blue")

> qqnorm(Loss,cex.lab=1.4,cex.axis=1.5,cex.main=1.5,cex=0.75)
> qqline(Loss,col="blue")

[Figure: "Histogram of Loss" (x-axis: Loss, y-axis: Density) with superimposed Normal density, and "Normal Q-Q Plot" (x-axis: Theoretical Quantiles, y-axis: Sample Quantiles)]

(a) The summary statistics of the logarithm of the Loss are printed below:

possible inflationary values. The second may be fixed by looking at the time trend of the data, which at this point, we do not have the tools to analyze.
