
STAT 484 Actuarial Science: Models

INTRODUCTION

Who is an Actuary?

An actuary is a person who analyzes financial risks for different types of insurance and pension programs. The word "actuary" comes from the Latin word for account keeper, deriving from actus, "public business."

If you are seriously considering becoming an actuary, then visit

www.beanactuary.org

In the United States, actuaries have two professional organizations: the Society of Actuaries (SOA) and the Casualty Actuarial Society (CAS).

What is the Society of Actuaries (SOA)?

SOA members work in life and health insurance and retirement programs. Check their website at www.soa.org

What is the Casualty Actuarial Society (CAS)?

CAS members work in property (automobile, homeowner) and casualty insurance (workers' compensation). Check their website at www.casact.org

Historical Background

The first actuary to practice in North America was Jacob Shoemaker of Philadelphia, a key organizer in 1809 of the Pennsylvania Company for Insurances on Lives and Granting Annuities. The Actuarial Society of America came into being in New York in 1889. In 1909, actuaries of life companies in the midwestern and southern United States organized the American Institute of Actuaries with headquarters in Chicago. In 1914 the actuaries of property and liability companies formed the Casualty Actuarial and Statistical Society of America, which in 1921 was renamed the Casualty Actuarial Society. In 1949, the Society of Actuaries was created as the result of the merger between the Actuarial Society of America and the American Institute of Actuaries.

1889, Actuarial Society of America

1909, American Institute of Actuaries

1914, Casualty Actuarial and Statistical Society of America

1921, renamed Casualty Actuarial Society

1949, Society of Actuaries

The information below is taken from The Society of Actuaries Basic Education Catalog, Spring 2006.

Anyone pursuing an actuarial career may apply to the SOA and remains a member of the organization as long as the dues are paid. Anyone who has passed a certain number of actuarial exams and met some additional requirements becomes an Associate of the Society of Actuaries (ASA); after passing more exams and meeting further requirements, one becomes a Fellow of the Society of Actuaries (FSA).

What are Actuarial Exams?

Historical Note: In 1896, after some hesitation, an examination system was adopted. The first Fellow by examination qualified in 1900.

SOA currently offers nine actuarial exams.

Exam 1: Probability (same as SOA Exam P)
Exam 2: Financial Mathematics (same as SOA Exam FM)
Exam 3: Actuarial Models: Segment 3F, Financial Economics (same as SOA Exam MFE), and Segment 3L, Life Contingencies and Statistics
Exam 4: Construction and Evaluation of Actuarial Models (same as SOA Exam C)
Exam 5: Introduction to Property and Casualty Insurance and Ratemaking
Exam 6: Reserving, Insurance Accounting Principles, Reinsurance, and Enterprise Risk Management
Exam 7: Law, Regulation, Government and Industry Insurance Programs, and Financial Reporting and Taxation
Exam 8: Investments and Financial Analysis
Exam 9: Advanced Ratemaking, Rate of Return, and Individual Risk Rating Plans

In this course we review the material for Exam 1/P and learn roughly 25% of what is needed to pass Exam 3, Segment L. A disclaimer is in order here: taking this course does not guarantee that you pass these exams. The course prepares you to start preparing for the exams.

Exam 1/P Probability

The examination for this material consists of three hours of multiple-choice questions offered through computer-based testing. The purpose of this exam is to test knowledge of the fundamental probability tools for quantitatively assessing risk. The application of these tools to problems encountered in actuarial science is emphasized. A thorough command of probability topics and the supporting calculus is assumed. Additionally, a very basic knowledge of insurance and risk management is assumed. A table of values for the normal distribution is included with the examination.

The exam covers the following probability topics in a risk management context:

  1. General Probability
    • Set functions including set notation and basic elements of probability
    • Mutually exclusive events
    • Additive and multiplicative laws
    • Independence of events
    • Conditional probability
    • The Bayes Theorem
  2. Univariate probability distributions (including binomial, negative binomial, geometric, hypergeometric, Poisson, uniform, exponential, chi-square, beta, Pareto, lognormal, gamma, Weibull, and normal)
    • Probability functions and probability density functions
    • Cumulative distribution functions
    • Mean, median, mode, percentiles, and moments
    • Variance and measures of dispersion
    • Moment generating functions
    • Transformations
    • Independence of random variables
  3. Multivariate probability distributions (including the bivariate normal)
    • Joint probability functions and joint probability density functions
    • Joint cumulative distribution functions
    • Central Limit Theorem
    • Conditional and marginal probability distributions
    • Moments for joint, conditional, and marginal probability distributions
    • Joint moment generating functions
    • Covariance and correlation coefficient
    • Transformations and order statistics
    • Probabilities and moments for linear combinations of independent random variables

Risk and Insurance

Reference: Risk and Insurance, Study Notes, SOA web site, code P-21-05.

Definitions. People need economic security (food, clothing, shelter, medical care, etc.). The possibility of losing this economic security is called economic risk, or simply risk. This risk causes many people to buy insurance.

Insurance is an agreement where, for a stipulated payment called the premium, one party, called the insurer, agrees to pay the other, called the policyholder (or insured), or his designated beneficiary, a defined amount called the claim payment or benefit upon the occurrence of a specific loss. This defined claim payment can be a fixed amount or can reimburse all or a part of the loss that occurred. The insurance contract is called the policy. Only a small percentage of policyholders suffer a loss. Their losses are paid out of the premiums collected from the pool of policyholders. Thus, the entire pool compensates the unfortunate few. Each policyholder exchanges an unknown loss for the payment of a known premium.

The insurer considers the losses expected for the insurance pool and the potential variation in order to charge premiums that, in total, will be sufficient to cover all of the projected claim payments for the insurance pool. The insurer may restrict the particular kinds of losses covered. A peril is a potential cause of a loss. Perils may include fires, hurricanes, theft, or heart attacks. The insurance policy may define specific perils that are covered, or it may cover all perils with certain exclusions such as, for example, property loss as a result of a war or loss of life due to suicide.

Losses depend on two random variables. The first is the number of losses that will occur in a specified period. This random variable is called the frequency of loss and its probability distribution is called the frequency distribution. The second random variable is the amount of the loss, given that a loss has occurred. This random variable is called the severity and its distribution is called the severity distribution. By combining the frequency and the severity distributions, one can determine the overall loss distribution.

Example. Suppose a car owner will have no accidents in a year with probability 0.8 and will have one accident with probability 0.2. This is the frequency distribution. Suppose also that with probability 0.5 the car will need repairs costing $500, with probability 0.4 the repairs will cost $5,000, and with probability 0.1 the car will need to be replaced at the cost of $25,000. This is the severity distribution. Combining the two distributions, we have that the distribution of X, the total loss due to an accident, is

f(x) = 0.8                   if x = 0,
       (0.2)(0.5) = 0.1      if x = 500,
       (0.2)(0.4) = 0.08     if x = 5,000,
       (0.2)(0.1) = 0.02     if x = 25,000.
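As a quick numerical check of this example, here is a minimal Python sketch (added for illustration, not part of the original notes) that combines the frequency and severity distributions above and computes the expected loss E(X).

```python
# Combine a frequency distribution (number of accidents) with a severity
# distribution (cost given an accident) to get the total-loss distribution.
freq = {0: 0.8, 1: 0.2}                           # P(no accident), P(one accident)
severity = {500: 0.5, 5_000: 0.4, 25_000: 0.1}    # cost given an accident

loss = {0: freq[0]}                               # loss is 0 when there is no accident
for cost, p in severity.items():
    loss[cost] = freq[1] * p                      # P(loss = cost) = P(accident) * P(cost | accident)

expected_loss = sum(x * p for x, p in loss.items())
print(loss)            # {0: 0.8, 500: 0.1, 5000: 0.08, 25000: 0.02}
print(expected_loss)   # 950.0
```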

Definitions (continued...) The expected amount of claim payments is called the net premium or benefit premium. The gross premium is the total of the net premium and the amount to cover the insurer's expenses for selling and servicing the policy, including some profit. Policyholders are willing to pay a gross premium for an insurance policy, even though it exceeds the expected amount of their losses, in order to substitute the fixed premium payment for a potentially enormous payment if they are not insured.

What kind of risk is insurable?

An insurable risk must satisfy the following criteria:

  1. The potential loss must be significant so that substituting the premium payment for an unknown economic outcome (given no insurance) is desirable.
  2. The loss and its economic value must be well-defined and out of the policyholder's control. For example, the policyholder should not be allowed to cause a loss or to lie about its value.
  3. Covered losses should be reasonably independent. For example, an insurer should not insure all the stores in one area against fire.

Examples of Insurance.

  1. the auto liability insurance – will pay benefits to the other party if a policyholder causes a car accident.
  2. the auto collision insurance – will pay benefits to a policyholder in case of a car accident.
  3. the auto insurance against damages other than accident – will pay benefits to a policyholder in case the car is damaged from hailstones, tornado, vandalism, flood, earthquake, etc.
  4. the homeowners insurance – will pay benefits to a policyholder towards repairing or replacing the house in case of damage from a covered peril, such as flood, earthquake, landslide, tornado, etc. The contents of the house may also be covered in case of damage or theft.
  5. the health insurance (medical, dental, vision, etc. insurances) – will cover some or all health expenses of a policyholder.
  6. the life insurance – will pay benefits to a designated beneficiary in case of a policyholder’s death.
  7. the disability income insurance – will replace all or a portion of a policyholder's income in case of disability.
  8. the life annuity – will make regular payments to a policyholder after the retirement until death.

Limits on Policy Benefits.

Definition. If an insurance does not reimburse the entire loss, the policyholder must cover part of the loss. This type of limit on policy benefits is called coinsurance. We study two types of coinsurance.

  1. deductible – insurance will cover losses in excess of a stated amount. For example, a $500 deductible on a car insurance means that all repairs that cost $500 or less must be covered by the policyholder. If the cost exceeds $500, the policyholder must pay $500, and the insurance will pay the rest.
  2. benefit limit – insurance will not cover losses beyond a stated upper bound.

Definition. Inflation is a persistent increase in the amount of money and credit relative to the supply of goods and services, which results in an increase in prices and a decline in purchasing power.

What is the role of an actuary?

  1. determine the net and gross premiums of a policy.
  2. determine the amount of assets the insurer should have on hand to assure that benefits can be paid as they arise.
  3. project potential profit or loss of a new kind of policy.
  4. assess potential difficulties of a new policy before they become significant.

Textbook: Mathematical Statistics with Applications by Wackerly, D.D., Mendenhall, W., and Sheaffer, R.L., Duxbury, 2008, 7th edition.

2.3 – 2.10 (excluding 2.6). General Probability, Set Notation.

Definition. The sample space S is the set of all possible outcomes of a random experiment.
Definition. An event is a subset of S.
Notation. Events are denoted by A, B, C, F, G, A1, etc. To list the elements in a set A write A = {a1, a2, ..., an}.
Definition. The union A ∪ B of two events A and B contains all outcomes that are either in A or in B or in both.
Definition. The intersection A ∩ B of two events contains all outcomes that are in both A and B.
Definition. Two events A and B are mutually exclusive or disjoint if their intersection is an empty set: A ∩ B = ∅.
Definition. The complement A^c (or Ā or A′ or ∼A) of an event A contains all outcomes that are in S but not in A.
Draw a Venn diagram. Show union, intersection, mutually exclusive events, complement.
Definition. The probability of an event A, denoted P(A), is a number with the following two properties: (1) P(A) ≥ 0 for all A, and (2) P(A1 ∪ A2 ∪ ···) = Σ_{i=1}^∞ P(Ai) for any disjoint events A1, A2, ....

Useful Formulas. Draw Venn diagrams.

  1. The additive law for two events: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
  2. The additive law for three events: P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).
  3. The complement rule: P(Ac) = 1 − P(A).
  4. If A1, ..., An are disjoint, then P(A1 ∪ ··· ∪ An) = Σ_{i=1}^n P(Ai).

  5. DeMorgan's laws: P((A ∩ B)^c) = P(A^c ∪ B^c), P((A ∪ B)^c) = P(A^c ∩ B^c).

Definition. Two events A and B are independent iff P(A ∩ B) = P(A) P(B).
Definition. If events A1, A2, ... are independent, then P(∩ Ai) = ∏ P(Ai).
Definition. The conditional probability of A given B is

P(A | B) = P(A ∩ B) / P(B).

Useful Formula. P(A ∩ B) = P(A | B) P(B). This is the multiplicative law.
Definition. A set {A1, A2, ..., An} is a partition of the sample space S if (1) the Ai are mutually exclusive, and (2) ∪ Ai = S. Draw a Venn diagram.
The Bayes Theorem. Let {A1, A2, ..., An} be a partition of S. Given that an event B has occurred, the updated probability of Ai is

P(Ai | B) = P(B | Ai) P(Ai) / Σ_{j=1}^n P(B | Aj) P(Aj).
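A minimal Python sketch (illustrative, not from the notes) of the Bayes Theorem for a finite partition; the prior and likelihood numbers below are made up for the example.

```python
# Bayes Theorem for a partition {A_1, ..., A_n} of the sample space.
# prior[i] = P(A_i), likelihood[i] = P(B | A_i); both lists are hypothetical numbers.
prior = [0.5, 0.3, 0.2]
likelihood = [0.1, 0.4, 0.8]

evidence = sum(p * l for p, l in zip(prior, likelihood))          # P(B), law of total probability
posterior = [p * l / evidence for p, l in zip(prior, likelihood)] # P(A_i | B)

print(evidence)    # 0.33
print(posterior)   # [0.1515..., 0.3636..., 0.4848...]
```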

Definition. A combination of objects is an unordered arrangement of the objects.
Definition. The number of combinations of k objects chosen from n distinct objects is given by the binomial coefficient

(n choose k) = n! / (k! (n − k)!).

Definition. The number of ways to separate n distinct objects into k groups of sizes n1, ..., nk, where n1 + ··· + nk = n, is given by the formula

(n choose n1)(n − n1 choose n2) ··· (nk choose nk) = n! / (n1! ··· nk!).

3.1 – 3.9 Discrete Probability Distributions.

Definition. A random variable is a function that assigns a real number to every outcome in the sample space.
Definition. A discrete random variable assumes a finite or a countably infinite number of values.
Definition. The probability distribution of a discrete random variable X is the list of its values with the respective probabilities P(X = x) = p(x). The function p(x) is called the probability function. It has the properties: (i) p(x) ≥ 0 for all x, and (ii) Σ_x p(x) = 1.
Definition. The mean (or expected value or expectation or average) of a discrete random variable X is E(X) = Σ_x x p(x).

Useful Formulas.

  1. E(aX + b) = a E(X) + b.
  2. E(g(X)) = Σ_x g(x) p(x).

  3. E(f(X1) + g(X2)) = E(f(X1)) + E(g(X2)).

Definition. The 100αth percentile of the distribution of a random variable is the value x such that P(X ≤ x) = α.
Definition. The first quartile Q1 of the distribution of X satisfies P(X < Q1) = .25.
Definition. The median M of the distribution of X satisfies P(X < M) = 0.5 = P(X > M).
Definition. The third quartile Q3 of the distribution of X satisfies P(X < Q3) = .75.
Definition. The mode of a distribution is a local maximum.
Definition. The kth moment of a random variable X is E(X^k).
Definition. The variance of a random variable X is Var(X) = E(X − E(X))^2.

Useful Formulas.

  1. Var(X) = E(X^2) − (E(X))^2.
  2. Var(aX + b) = a^2 Var(X).
  3. If X1 and X2 are independent, then E(f(X1) g(X2)) = E(f(X1)) E(g(X2)).
  4. If X1 and X2 are independent, then Var(f(X1) + g(X2)) = Var(f(X1)) + Var(g(X2)).

Definition. The standard deviation of a random variable X is σ = √Var(X).
Definition. The interquartile range of a distribution is Q3 − Q1, the difference between the third and the first quartiles.
Definition. The moment generating function (m.g.f.) of a random variable X is m(t) = mX(t) = E(e^{tX}).

Useful Formulas.

  1. E(X) = m′(0), E(X^2) = m″(0). In general, E(X^n) = m^{(n)}(0).
  2. If X1, ..., Xn are independent and X = Σ_{i=1}^n Xi, then mX(t) = ∏_{i=1}^n mXi(t).
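As an illustration of these definitions (a sketch added here, not in the original notes), the following Python snippet computes the mean, variance, and standard deviation of a discrete random variable directly from its probability function; the probabilities are hypothetical.

```python
import math

# Probability function of a small discrete random variable (made-up values).
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())               # E(X) = sum of x p(x)
second_moment = sum(x**2 * p for x, p in pmf.items())   # E(X^2)
variance = second_moment - mean**2                      # Var(X) = E(X^2) - (E(X))^2
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)   # 1.1 0.49 0.7
```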

Certain Discrete Distributions.

Binomial: X ∼ Bi(n, p), P(X = x) = (n choose x) p^x (1 − p)^{n−x}, x = 0, ..., n;
E(X) = np, Var(X) = np(1 − p), m(t) = (p e^t + 1 − p)^n.

Geometric: X ∼ Geom(p), P(X = x) = p (1 − p)^{x−1}, x = 1, 2, ...;
E(X) = 1/p, Var(X) = (1 − p)/p^2, m(t) = p e^t / (1 − (1 − p) e^t).

Negative Binomial: X ∼ NB(p, r), P(X = x) = (x−1 choose r−1) p^r (1 − p)^{x−r}, x = r, r + 1, ...;
E(X) = r/p, Var(X) = r(1 − p)/p^2, m(t) = (p e^t / (1 − (1 − p) e^t))^r.

Hypergeometric: X ∼ HG(N, n, k), P(X = x) = (k choose x)(N−k choose n−x) / (N choose n), x = 0, ..., n;
E(X) = n k/N, Var(X) = n (k/N)(1 − k/N)(N − n)/(N − 1), m(t) has no closed form.

Poisson: X ∼ Poi(λ), P(X = x) = λ^x e^{−λ} / x!, x = 0, 1, ...;
E(X) = λ, Var(X) = λ, m(t) = e^{λ(e^t − 1)}.

4.2 – 4.6, 4.9 Continuous Probability Distributions.

Definition. A continuous random variable assumes values in an interval.
Definition. The probability density function (p.d.f.) of a continuous random variable Y is a function f(y) with the properties: (i) f(y) ≥ 0 for all y, (ii) ∫_{−∞}^{∞} f(y) dy = 1, and (iii) P(a ≤ Y ≤ b) = ∫_a^b f(y) dy.
Definition. The cumulative distribution function (c.d.f.) of a continuous random variable Y is F(y) = P(Y ≤ y) = ∫_{−∞}^{y} f(u) du.

Useful Formulas.

  1. f(y) = F′(y).
  2. P(a ≤ Y ≤ b) = F(b) − F(a).

Definition. The mean (or expected value or expectation or average) of a continuous random variable Y is E(Y) = ∫_{−∞}^{∞} y f(y) dy.

Useful Formulas.

  1. E(aY + b) = a E(Y) + b.
  2. E(g(Y)) = ∫_{−∞}^{∞} g(y) f(y) dy.
  3. E(Y1 + Y2) = E(Y1) + E(Y2).
  4. Var(Y) = E(Y^2) − (E(Y))^2.
  5. Var(aY + b) = a^2 Var(Y).
  6. If Y1 and Y2 are independent, then Var(Y1 + Y2) = Var(Y1) + Var(Y2).

Certain Continuous Distributions.

Uniform: U ∼ Unif(a, b), f(u) = 1/(b − a) if a ≤ u ≤ b, and 0 otherwise;
F(u) = 0 if u < a, (u − a)/(b − a) if a ≤ u ≤ b, and 1 if u ≥ b;
E(U) = (a + b)/2, Var(U) = (b − a)^2/12, m(t) = (e^{bt} − e^{at}) / ((b − a) t).

Exponential: X ∼ Exp(β), f(x) = (1/β) e^{−x/β}, x > 0, β > 0;
F(x) = 0 if x ≤ 0, and 1 − e^{−x/β} if x > 0;
E(X) = β, Var(X) = β^2, m(t) = 1/(1 − β t).

Useful Facts:

  1. If occurrences have a Poisson(λ) distribution, then the interarrival times are Exp(1/λ).
  2. Memoryless property of exponential distribution: P(X > t + s| X > t) = P(X > s).
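A small Monte Carlo sketch (added for illustration; the parameter values are arbitrary) checking the memoryless property P(X > t + s | X > t) = P(X > s) for an exponential random variable.

```python
import random

beta, t, s = 2.0, 1.0, 0.5      # arbitrary scale and time points
n = 200_000
samples = [random.expovariate(1 / beta) for _ in range(n)]   # Exp(beta): mean beta

survived_t = [x for x in samples if x > t]
lhs = sum(x > t + s for x in survived_t) / len(survived_t)   # estimate of P(X > t+s | X > t)
rhs = sum(x > s for x in samples) / n                        # estimate of P(X > s)

print(lhs, rhs)   # both close to exp(-s/beta) ~ 0.7788
```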

Gamma: X ∼ Gamma(α, β), f(x) = x^{α−1} e^{−x/β} / (Γ(α) β^α), x > 0, α > 0, β > 0,
where Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx; E(X) = α β, Var(X) = α β^2, m(t) = (1/(1 − β t))^α.

Useful Facts:

  1. Γ(n) = (n − 1)! where n is an integer, n ≥ 1.
  2. A Gamma random variable with integer α is the sum of α i.i.d. Exp(β) random variables.

Normal: X ∼ N(μ, σ^2), f(x) = (1/(√(2π) σ)) e^{−(x−μ)^2/(2σ^2)}, −∞ < x < ∞;
E(X) = μ, Var(X) = σ^2, m(t) = exp(μ t + σ^2 t^2 / 2).

Lognormal: If log X ∼ N(μ, σ^2), then X has a lognormal distribution. The density of the lognormal distribution is

f(x) = (1/(√(2πσ^2) x)) e^{−(log x − μ)^2/(2σ^2)}, x > 0;

E(X^n) = e^{nμ + n^2 σ^2/2}; the m.g.f. does not exist.

Chi-squared: X ∼ χ^2(p), f(x) = (1/(Γ(p/2) 2^{p/2})) x^{p/2−1} e^{−x/2}, x > 0;
E(X) = p, Var(X) = 2p, m(t) = (1/(1 − 2t))^{p/2}, t < 1/2.

Useful Facts:

  1. If Zi ∼ N(0, 1) i.i.d., i = 1, ..., n, then X = Σ_{i=1}^n Zi^2 ∼ χ^2(n), the chi-squared distribution with n degrees of freedom.
  2. χ^2(n) is Gamma(n/2, 2) in the parametrization above.

Beta: X ∼ Beta(α, β), f(x) = (Γ(α + β)/(Γ(α) Γ(β))) x^{α−1} (1 − x)^{β−1}, 0 < x < 1;
E(X) = α/(α + β), Var(X) = αβ/((α + β)^2 (α + β + 1)); the m.g.f. has no simple closed form.

Pareto: X ∼ Pareto(α, β), f(x) = β α^β / x^{β+1}, x > α, α > 0, β > 0;
E(X) = βα/(β − 1) for β > 1, Var(X) = βα^2/((β − 1)^2 (β − 2)) for β > 2; the m.g.f. does not exist.

Weibull: X ∼ Weibull(γ, β), f(x) = (γ/β) x^{γ−1} e^{−x^γ/β}, x > 0, γ > 0, β > 0;
E(X^n) = β^{n/γ} Γ(1 + n/γ); the m.g.f. has no simple closed form.
Useful Fact: If X ∼ Exp(β), then X^{1/γ} ∼ Weibull(γ, β).

6.4 Transformations.

Definition. Let X be a continuous random variable with p.d.f. fX and c.d.f. FX, and let g be a function. Define a transformation Y = g(X). Then the c.d.f. of Y is FY(y) = P(Y ≤ y) = P(g(X) ≤ y) = ∫_{x: g(x) ≤ y} fX(x) dx, and the p.d.f. of Y is fY(y) = F′Y(y).
In the special case when g is strictly increasing, FY(y) = P(X ≤ g^{−1}(y)) = FX(g^{−1}(y)), and fY(y) = fX(g^{−1}(y)) d g^{−1}(y)/dy. When g is strictly decreasing, FY(y) = P(X ≥ g^{−1}(y)) = 1 − FX(g^{−1}(y)), and fY(y) = −fX(g^{−1}(y)) d g^{−1}(y)/dy.
In general, if g is strictly increasing or decreasing, fY(y) = fX(g^{−1}(y)) |d g^{−1}(y)/dy|.
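A quick Monte Carlo sketch (illustrative only) of this transformation idea: for X ∼ Exp(β) and the strictly increasing map Y = g(X) = X^{1/γ}, the simulated mean of Y should agree with the Weibull moment formula given above.

```python
import math
import random

beta, gamma_ = 2.0, 1.5          # arbitrary parameters
n = 200_000
xs = [random.expovariate(1 / beta) for _ in range(n)]    # X ~ Exp(beta), mean beta
ys = [x ** (1 / gamma_) for x in xs]                      # Y = g(X) = X^(1/gamma), a Weibull(gamma, beta) variable

simulated_mean = sum(ys) / n
formula_mean = beta ** (1 / gamma_) * math.gamma(1 + 1 / gamma_)   # E(Y) = beta^(1/gamma) * Gamma(1 + 1/gamma)

print(simulated_mean, formula_mean)   # the two values should be close
```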

Order Statistics.

Definition. Let X1, ..., Xn be i.i.d. random variables with c.d.f. F and p.d.f. f. Suppose n realizations of these variables are observed. The observations in increasing order X(1), ..., X(n) are called order statistics. The p.d.f. of the ith order statistic is

fX(i)(x) = (n! / ((i − 1)!(n − i)!)) [F(x)]^{i−1} f(x) [1 − F(x)]^{n−i}.

Interesting special cases are

  1. FX(n)(x) = P(Xmax ≤ x) = P(X1 ≤ x, ..., Xn ≤ x) = [F(x)]^n, so fX(n)(x) = n [F(x)]^{n−1} f(x).
  2. 1 − FX(1)(x) = P(Xmin ≥ x) = P(X1 ≥ x, ..., Xn ≥ x) = [1 − F(x)]^n, so fX(1)(x) = n [1 − F(x)]^{n−1} f(x).
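A short simulation sketch (not from the notes) checking the distribution of the maximum: for n i.i.d. Unif(0, 1) variables, P(X(n) ≤ x) should equal x^n.

```python
import random

n, x, trials = 5, 0.8, 100_000
count = 0
for _ in range(trials):
    sample = [random.random() for _ in range(n)]   # n i.i.d. Uniform(0, 1) draws
    if max(sample) <= x:                           # event {X_(n) <= x}
        count += 1

print(count / trials, x ** n)   # simulated P(max <= x) vs the formula [F(x)]^n = x^n
```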

5.2 – 5.8 Multivariate Probability Distributions.

Definition. Let X1 and X2 be two discrete random variables. The joint probability function of X1 and X2 is p(x1, x2) = P(X1 = x1, X2 = x2).
Definition. Let Y1 and Y2 be two continuous random variables. The joint c.d.f. is F(y1, y2) = P(Y1 ≤ y1, Y2 ≤ y2). The joint density is

f(y1, y2) = ∂^2 F(y1, y2) / (∂y1 ∂y2),

or, equivalently, F(y1, y2) = ∫_{−∞}^{y1} ∫_{−∞}^{y2} f(u, v) dv du.
Definition. The marginal probability distribution of X1 is p1(x1) = Σ_{x2} p(x1, x2); that of X2 is p2(x2) = Σ_{x1} p(x1, x2).
Definition. The marginal density of Y1 is f1(y1) = ∫_{−∞}^{∞} f(y1, y2) dy2; that of Y2 is f2(y2) = ∫_{−∞}^{∞} f(y1, y2) dy1.
Definition. The conditional probability function of X1 given that X2 = x2 is

p(x1 | x2) = p(x1, x2) / p2(x2).

Definition. The conditional density of Y1 given that Y2 = y2 is

f(y1 | y2) = f(y1, y2) / f2(y2).

Useful Formula.

P(Y1 ≤ y1 | Y2 = y2) = ∫_{−∞}^{y1} f(u, y2) du / ∫_{−∞}^{∞} f(u, y2) du.

Definition. The joint m.g.f. of X and Y is m(t, s) = E(e^{tX + sY}).

Definition. The covariance between two random variables X and Y is

Cov(X, Y ) = E [(X − E(X)) (Y − E(Y ))] = E(XY ) − E(X)E(Y ).

Properties.

  1. Cov(X, Y) = Cov(Y, X).
  2. Cov(aX + b, cY + d) = ac Cov(X, Y).
  3. Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z).
  4. If X and Y are independent, then Cov(X, Y) = 0, that is, X and Y are uncorrelated. The converse is not true.
  5. Var(aX + bY + c) = a^2 Var(X) + 2ab Cov(X, Y) + b^2 Var(Y).

Definition. The correlation coefficient between X and Y is

ρXY = Cov(X, Y) / (√Var(X) √Var(Y)).

Properties.

  1. − 1 ≤ ρ ≤ 1.
  2. ρ measures the direction and strength of the linear relationship between X and Y. If X and Y are independent, ρ = 0. If Y = aX + b with a ≠ 0, then ρ = sign(a) = ±1.
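A minimal sketch (hypothetical joint probabilities) computing Cov(X, Y) and ρ directly from a small joint probability function, using Cov(X, Y) = E(XY) − E(X) E(Y).

```python
import math

# Joint probability function p(x, y) for a small discrete pair (made-up numbers).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

ex  = sum(x * p for (x, y), p in joint.items())          # E(X)
ey  = sum(y * p for (x, y), p in joint.items())          # E(Y)
exy = sum(x * y * p for (x, y), p in joint.items())      # E(XY)
ex2 = sum(x * x * p for (x, y), p in joint.items())
ey2 = sum(y * y * p for (x, y), p in joint.items())

cov = exy - ex * ey
rho = cov / (math.sqrt(ex2 - ex**2) * math.sqrt(ey2 - ey**2))
print(cov, rho)   # 0.15 and 0.6 for these probabilities
```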

7.3 The Central Limit Theorem (CLT).

Definition. The sample mean of random variables W1, ..., Wn is the arithmetic average

W̄ = (W1 + ··· + Wn) / n.

Theorem (The Central Limit Theorem). Let W1, ..., Wn be i.i.d. random variables with mean μ and standard deviation σ. Then for large n, W̄ is approximately normally distributed with mean μ and standard deviation σ/√n. Equivalently, W1 + ··· + Wn is approximately normally distributed with mean nμ and standard deviation √n σ.
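A small simulation sketch (added for illustration) of the CLT: sums of i.i.d. Unif(0, 1) variables, standardized by nμ and √n σ, should behave roughly like N(0, 1), e.g. land in (−1.96, 1.96) about 95% of the time.

```python
import math
import random

n, trials = 30, 50_000
mu, sigma = 0.5, math.sqrt(1 / 12)        # mean and standard deviation of Unif(0, 1)

inside = 0
for _ in range(trials):
    s = sum(random.random() for _ in range(n))            # W_1 + ... + W_n
    z = (s - n * mu) / (math.sqrt(n) * sigma)             # standardize as in the CLT
    if -1.96 < z < 1.96:
        inside += 1

print(inside / trials)   # should be close to 0.95
```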

Textbook: Actuarial Mathematics by Bowers, N.L., Gerber, H.U., Hickman, J.C., Jones, D.A., and Nesbitt, C.J., Society of Actuaries, 1997, 2nd edition.

1.2 Utility Theory. Let's play the following game. We flip a fair coin three times. If we see exactly one head, I pay you $10, otherwise you pay me $7. Do you want to play the game? Solution: The sample space of the random experiment is

S = {HHH, HHT, HT H, T HH, HT T, T HT, T T H, T T T }.

Denote by W your gain in the game. Then

P(W = 10) = P({HT T, T HT, T T H}) = 3/ 8 ,

and P(W = −7) = 1 − 3 /8 = 5/8. Your expected gain in the game is E(W ) = (10)(3/8) + (−7)(5/8) = − 5 /8 = −.625, that is, you expect to lose 62.5 cents. So, most likely you don’t want to play the game. Which game would you be willing to play? Only if E(W ) ≥ 0. But if E(W ) > 0, I wouldn’t want to play with you. So, the only way we can agree

to play the game is if E(W ) = 0. That is, we both expect to win nothing (a fair game). For example, if I pay you $10 with probability 3/8 and you pay me $6 with probability 5/8.

Definition. When making a decision in a situation that involves randomness, one approach is to replace the distribution of possible outcomes by the expected value of the outcomes. This approach is called the expected value principle. Definition. In economics, the expected value of random prospects with monetary payments is called the fair value (or actuarial value) of the prospect.

We might expect a similar principle to be applicable in the insurance business. It turns out that it is not always so. Consider the following situation. Suppose you own w = $100 which you might lose with probability p = 0.01. If you lose your wealth, an insurer offers to reimburse you in full (called a complete coverage or a complete insurance). How much would you be willing to pay for the prospect?
Solution: Notice that if not insured you expect to lose wp = (100)(.01) = $1. Suppose you are willing to pay $x for the insurance. The insurer's gain then equals x with probability 1 − p = .99 and x − 100 = x − w with probability p = .01. Thus, the insurer's expected gain is x(1 − p) + (x − w)p = x − wp. If it were to be a fair game, x − wp = 0, or x = wp, that is, you should be willing to pay the amount of your expected loss wp = $1. Would you be willing to pay more than $1? Most likely not, if your wealth is just $100. But suppose your wealth is one million dollars. Then you should be willing to pay the insurer the amount of the expected loss of (1,000,000)(.01) = $10,000. Would you be willing to pay more than that? Most likely yes, because if not insured, there is a chance of a catastrophic loss. But how much more are you willing to pay? It depends on the person.

Definition. The value (or utility) that a particular decision-maker attaches to wealth of amount w, can be specified in the form of a function u(w), called a utility function.

How to determine values of an individual’s utility function?

Example. Suppose a decision-maker has wealth w = $20,000, so his utility function is defined on the interval [0, 20,000]. We choose arbitrarily the endpoints of his utility function, for example, u(0) = −1 and u(20,000) = 0. To determine the values of the utility function at intermediate points, proceed as follows. Ask the decision-maker the following question: Suppose you can lose your $20,000 with probability 0.5. How much would you be willing to pay an insurer for a complete insurance against the loss? That is, define the maximum amount G such that u(20,000 − G) = (0.5) u(20,000) + (0.5) u(0) = (0.5)(0) + (0.5)(−1) = −0.5. Here the left-hand side represents the utility of the insured certain amount $20,000 − G, while the right-hand side is the expected utility of the uninsured wealth. Suppose the decision-maker defines G = $12,000. Therefore, u(8,000) = −0.5. Notice that the decision-maker is willing to pay for the insurance more than the expected loss (0.5)(20,000) + (0.5)(0) = 10,000.
To determine the other values of the utility function, ask the question: What is the maximum amount you would pay for a complete insurance against a situation that could leave you with wealth w2 with probability p, or at reduced wealth w1 with probability 1 − p? That is, we ask the decision-maker to specify G such that u(w2 − G) = p u(w2) + (1 − p) u(w1). For example, G = 7,500 in the situation u(20,000 − G) = (0.5) u(20,000) + (0.5) u(8,000) = (0.5)(0) + (0.5)(−0.5) = −.25. This defines u(12,500) = −0.25. Notice that G again exceeds the expected loss of (0.5)(0) + (0.5)(12,000) = 6,000. Picture.

The main theorem of the utility theory states that a decision-maker prefers the distribution of X to the distribution of Y , if E(u(X)) > E(u(Y )) and is indifferent between the two distributions, if E(u(X)) = E(u(Y )).

1.3. Insurance and Utility.

We apply the utility theory to the decision problems faced by a property owner. Suppose the random loss X to his property has a known distribution. The owner will be indifferent between paying a fixed amount G to an insurer, or assuming the risk himself. That is, u(w − G) = E(u(w − X)). Remember that to make a profit, an insurer must charge a premium that exceeds the expected loss, that is, it should be true that G > E(X) = μ. Which utility functions satisfy this property?

Proposition. If u′(w) > 0 and u′′(w) < 0, then G > μ.
Proof: We make use of Jensen's inequality, which states that if u′′(w) < 0, then E(u(X)) ≤ u(E(X)), with equality iff X = μ. By this inequality, u(w − G) = E(u(w − X)) ≤ u(E(w − X)) = u(w − μ). Now, since u′(w) > 0, u is an increasing function, and therefore w − G ≤ w − μ, or μ ≤ G, with μ < G unless X is a constant. □
Definition. A decision-maker with utility function u(w) is risk averse if u′′(w) < 0, and a risk lover if u′′(w) > 0.
Remark. According to the proposition, for risk-averse people G ≥ μ, so they are able to get complete insurance. It can be shown that for risk lovers G ≤ μ, so they won't be insured.

Three functions are commonly used to model utility functions.

Model 1. An exponential utility function is of the form u(w) = −e^{−α w}, where w > 0 and α > 0 is a constant. It has the following properties: (1) u′(w) = α e^{−α w} > 0, (2) u′′(w) = −α^2 e^{−α w} < 0, (3) G doesn't depend on w. To see that, write −e^{−α(w−G)} = E[−e^{−α(w−X)}] = −e^{−α w} MX(α), where MX is the m.g.f. of X. From here, G = ln MX(α)/α.
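A tiny sketch (parameter values are arbitrary) of property (3): for exponential utility the premium is G = ln MX(α)/α, computed here for a normal loss X ∼ N(μ, σ²), whose m.g.f. is exp(μt + σ²t²/2).

```python
import math

alpha = 0.005                 # risk-aversion parameter of u(w) = -exp(-alpha * w)
mu, var = 25.0, 100.0         # hypothetical mean and variance of the loss X ~ N(mu, var)

def mgf_normal(t, mu, var):
    """m.g.f. of a normal random variable: exp(mu*t + var*t^2/2)."""
    return math.exp(mu * t + var * t * t / 2)

G = math.log(mgf_normal(alpha, mu, var)) / alpha    # G = ln M_X(alpha) / alpha
print(G)   # slightly above the expected loss mu, and independent of the wealth w
```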

Example 1.3.1. A decision-maker has an exponential utility function u(w) = −e^{−5w}. Suppose he faces two economic prospects with outcomes X ∼ N(5, 2) and Y ∼ N(6, 2.5), respectively. Which prospect should be preferred? Solution: E(u(X)) = −MX(−5) = −e^{(5)(−5) + (2)(−5)^2/2} = −1, and E(u(Y)) = −e^{(6)(−5) + (2.5)(−5)^2/2} = −e^{1.25} = −3.49 < −1 = E(u(X)), so the distribution of X should be preferred to the distribution of Y.

Model 2. A fractional power utility function is of the form u(w) = w^γ, where w > 0 and 0 < γ < 1 is a constant. Check that u′(w) > 0 and u′′(w) < 0.

Example 1.3.2. Suppose u(w) = √w, w = 10, and X ∼ Unif(0, 10). Find G.
Solution: √(10 − G) = E(√(10 − X)) = ∫_0^{10} √(10 − x) (1/10) dx = (2/3)√10, so 10 − G = 40/9 and G = 10 − 40/9 = 5.56.

Model 3. A quadratic utility function is of the form u(w) = w − α w^2 , where w < 1 /(2α), and α > 0 is a constant. Check that u′(w) > 0 and u′′(w) < 0.

Example 1.3.3. Suppose u(w) = w − 0.01 w^2, w < 50. Also suppose that with probability p = 0.5 the decision maker will retain wealth of amount w = 20 and with probability 1 − p = 0.5 will suffer a financial loss of amount c = 10. Find G. Solution: Since u(w − G) = p u(w) + (1 − p) u(w − c), we have 20 − G − 0.01(20 − G)^2 = (0.5)(20 − 0.01(20)^2) + (0.5)(10 − 0.01(10)^2). Check that G solves the quadratic equation 0.01 G^2 + 0.6 G − 3.5 = 0. Thus, G = 5.36.

Example 1.3.4. A property will not be damaged with probability 0.75. A positive loss has the Exp(100) distribution. The owner of the property has the utility function u(w) = −e^{−0.005 w}. Find E(X) and G.

Solution: Denote the loss by X. Then X has a mixed distribution. X has a mass 0.75 at zero and is Exp(100) with probability 0.25. The expected loss E(X) = (0.75)(0) + (0.25)(100) = 25. To find G, write

u(w − G) = E(u(w − X)) = (0.75) u(w) + (0.25) ∫_0^∞ u(w − x)(0.01) e^{−0.01x} dx,

−e^{−0.005(w−G)} = −(0.75) e^{−0.005w} − (0.25) ∫_0^∞ e^{−0.005(w−x)} (0.01) e^{−0.01x} dx,

e^{0.005G} = 0.75 + (0.25)(2) = 1.25, so G = 200 ln(1.25) = 44.63.

The property owner is willing to pay up to G − E(X) = 44.63 − 25 = 19.63 in excess of the expected loss to purchase the complete insurance.

1.5. Optimal Insurance.

Theorem 1.5.1. Suppose a decision maker (1) has wealth w, (2) has a utility function u(w) such that u′(w) > 0 and u′′(w) < 0 (is risk averse), (3) faces a random loss X, and (4) is willing to spend the amount P purchasing insurance. Suppose also that the insurance market offers, for the payment P, all feasible insurance contracts of the form I(x), where 0 ≤ I(x) ≤ x (avoiding an incentive to incur the loss), with expected payoff E(I(X)) = β. Then to maximize the expected utility, the decision maker should choose the insurance policy

I_{d*}(x) = 0 if x < d*, and x − d* if x ≥ d*,

where d* is the unique solution of E(I_d(X)) = ∫_d^∞ (x − d) f(x) dx = β.

Definition. A feasible insurance contract of the form

I_d(x) = 0 if x < d, and x − d if x ≥ d

pays losses above the deductible amount d. This type of contract is called stop-loss or excess-of-loss insurance.

Example. Assume w = 100 and X ∼ Unif(0, 100). Then by the theorem d* solves

β = ∫_d^{100} (x − d)(1/100) dx = d^2/200 − d + 50.

That is, d* solves the quadratic equation d^2 − 200d + 10,000 − 200β = 0, or d* = 100 − √(200β). For example,

Expected payoff β    Deductible d*
0                    100
10                   55.3
20                   36.8
30                   22.5
40                   10.6
45                   5.1
50                   0

Note that there is no need to specify u(w) and P.
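The deductible column above can be reproduced with a few lines of Python (a sketch covering the Unif(0, 100) case only), using d* = 100 − √(200 β).

```python
import math

def deductible(beta):
    """Stop-loss deductible for X ~ Unif(0, 100) with expected payoff beta."""
    return 100 - math.sqrt(200 * beta)       # solves (100 - d)^2 / 200 = beta

for beta in (0, 10, 20, 30, 40, 45, 50):
    print(beta, round(deductible(beta), 1))  # reproduces the table above
```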

2.2. Models for Individual Claim Random Variables.

We will consider three individual risk models for short time periods. These models do not take into account the inflation of money. Model 1. In a one-year term life insurance the insurer agrees to pay an amount b if the insured dies within a year of policy issue and to pay nothing if the insured survives the year. The probability of a claim during the year is denoted by q. The claim random variable X has distribution

P(X = x) = 1 − q if x = 0, q if x = b, and 0 otherwise.

Notice that X = bI where I ∼ Bernoulli(q) indicates whether a death has occurred and, therefore, is called an indicator. Thus, the expected claim E(X) = bq and Var(X) = b^2 q(1 − q).

Model 2. Consider the above model X = IB where the claim amount B varies. Suppose if death is accidental, the benefit amount is B = $50, 000, otherwise, B = $25, 000. Suppose also that the probability of an accidental death within the year is .0005, and the probability of a nonaccidental death is .002. That is,

P(I = 1, B = 50, 000) =. 0005 , P(I = 1, B = 25, 000) =. 002.

Hence, P(I = 1) = .0005 + .002 = .0025, and P(I = 0) = .9975. Therefore, the distribution of X is P(X = 0) = .9975, P(X = 25,000) = .002, P(X = 50,000) = .0005. The expectation E(X) = $75. Also, the conditional distribution of B, given I = 1, is

P(B = 25,000 | I = 1) = P(B = 25,000, I = 1) / P(I = 1) = .002/.0025 = .8,

and

P(B = 50,000 | I = 1) = P(B = 50,000, I = 1) / P(I = 1) = .0005/.0025 = .2.

This means that 20% of all payoffs are for accidental deaths, and 80% are for nonaccidental deaths. The expected payoff, given that a claim occurs, is (25,000)(.8) + (50,000)(.2) = $30,000.

Model 3. Consider an automobile collision coverage above a $250 deductible up to a maximum claim of $2,000. Assume that for an individual the probability of one claim in a period is .15, and the probability of more than one claim is zero, that is, P(I = 1) = .15 and P(I = 0) = .85. Assume also that

P(B ≤ x | I = 1) = 0 if x ≤ 0, (.9)[1 − (1 − x/2,000)^2] if 0 < x < 2,000, and 1 if x ≥ 2,000.

Notice that B has a mixed distribution with a mass at 2,000. The distribution of the claim random variable X is

FX(x) = P(X ≤ x) = P(BI ≤ x | I = 0) P(I = 0) + P(BI ≤ x | I = 1) P(I = 1)
      = 0 if x < 0, (1)(.85) + (.15)(.9)[1 − (1 − x/2,000)^2] if 0 ≤ x < 2,000, and 1 if x ≥ 2,000.

The density of X is fX(x) = F′X(x) = .000135 (1 − x/2,000) for 0 < x < 2,000, with P(X = 0) = .85 and P(X = 2,000) = .015. The kth moment of X is

E(X^k) = (2,000)^k (.015) + ∫_0^{2,000} x^k fX(x) dx.

Check that E(X) = 120 and Var(X) = 135,600.
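A quick numerical check (illustrative sketch) of E(X) = 120 and Var(X) = 135,600 for Model 3, integrating the continuous part of the distribution with a simple midpoint Riemann sum.

```python
def moment(k, steps=200_000):
    """k-th moment of X in Model 3: point masses at 0 and 2,000 plus a density part."""
    total = 2_000 ** k * 0.015                      # mass 0.015 at x = 2,000 (the mass at 0 contributes nothing)
    dx = 2_000 / steps
    for i in range(steps):
        x = (i + 0.5) * dx                          # midpoint rule on (0, 2000)
        total += x ** k * 0.000135 * (1 - x / 2_000) * dx
    return total

ex, ex2 = moment(1), moment(2)
print(ex, ex2 - ex ** 2)   # approximately 120 and 135,600
```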

2.3. Sums of Independent Random Variables.

Proposition. (1) Suppose X and Y are two independent discrete random variables taking nonnegative integer values, and let S = X + Y. Then the distribution of S is FS(s) = P(S ≤ s) = P(X + Y ≤ s) = Σ_{y=0}^{s} P(X + Y ≤ s | Y = y) P(Y = y) = Σ_{y=0}^{s} P(X ≤ s − y) P(Y = y). Also, P(S = s) = Σ_{y=0}^{s} P(X = s − y) P(Y = y).
(2) Suppose X and Y are two independent nonnegative continuous random variables. Then FS(s) = ∫_0^s FX(s − y) fY(y) dy and fS(s) = ∫_0^s fX(s − y) fY(y) dy.

Definition. The convolution of FX(x) and FY(y) is

(FX ∗ FY)(s) = ∫_0^s FX(s − y) fY(y) dy.

Example 2.3.2. Let X ∼ Unif(0, 2) be independent of Y ∼ Unif(0, 3). Find the c.d.f. of S = X + Y.
Solution:

FS(s) = 0 if s < 0;
FS(s) = ∫_0^s ((s − y)/2)(1/3) dy = s^2/12 if 0 ≤ s < 2;
FS(s) = ∫_0^{s−2} 1·(1/3) dy + ∫_{s−2}^{s} ((s − y)/2)(1/3) dy = (s − 1)/3 if 2 ≤ s < 3;
FS(s) = ∫_0^{s−2} 1·(1/3) dy + ∫_{s−2}^{3} ((s − y)/2)(1/3) dy = 1 − (5 − s)^2/12 if 3 ≤ s < 5;
FS(s) = 1 if s ≥ 5.
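A small Monte Carlo sketch (illustrative only) checking the piecewise c.d.f. above at a few points by simulating S = X + Y with X ∼ Unif(0, 2) and Y ∼ Unif(0, 3).

```python
import random

def F_S(s):
    """Piecewise c.d.f. of S = X + Y derived in Example 2.3.2."""
    if s < 0:  return 0.0
    if s < 2:  return s * s / 12
    if s < 3:  return (s - 1) / 3
    if s < 5:  return 1 - (5 - s) ** 2 / 12
    return 1.0

n = 200_000
samples = [random.uniform(0, 2) + random.uniform(0, 3) for _ in range(n)]
for s in (1.0, 2.5, 4.0):
    empirical = sum(x <= s for x in samples) / n
    print(s, empirical, F_S(s))   # empirical and analytic values should agree closely
```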

Sometimes one can use the method of moment generating functions to find the distribution of S.
Proposition. Suppose Xi, i = 1, ..., n are independent random variables with m.g.f.'s MXi(t). Then the m.g.f. of S = X1 + ··· + Xn is MS(t) = ∏_{i=1}^n MXi(t).
Proof: MS(t) = E(e^{tS}) = E(e^{t(X1 + ··· + Xn)}) = {independence} = ∏_{i=1}^n E(e^{tXi}) = ∏_{i=1}^n MXi(t). □

Example. (1) Let Xi ∼ Bernoulli(p) be i.i.d., i = 1, ..., n. Then MX(t) = p e^t + 1 − p and MS(t) = (p e^t + 1 − p)^n, and therefore S ∼ Bi(n, p).
(2) Show that the sum of r independent Geom(p) random variables is NB(r, p).
(3) Show that the sum of α independent Exp(β) random variables is Gamma(α, β).
(4) Show that the sum of n independent Poi(λ) random variables is Poi(nλ).
(5) Show that the sum of n independent N(μ, σ^2) random variables is N(nμ, nσ^2).

2.5. Applications of the Central Limit Theorem to Insurance.

Example 2.5.1. The table below gives the number of insured nk, the benefit amount bk, and the probability of claim qk where k = 1,... , 4.

k    nk     bk    qk     bk qk    bk^2 qk (1 − qk)
1    500    1     .02    .02      .0196
2    500    2     .02    .04      .0784
3    300    1     .10    .10      .09
4    500    2     .10    .20      .36

The life insurance company wants to collect an amount equal to the 95th percentile of the distribution of total claims. The share of the jth insured is (1 + θ) E(Xj). The amount θ E(Xj) is called the security loading and θ is called the relative security loading. In this way the company protects itself from a loss of funds due to excess claims. Find θ.
Solution: We want to find θ such that P(S ≤ (1 + θ) E(S)) = .95. We use the CLT to obtain

P(Z ≤ θ E(S)/√Var(S)) = .95,   so   θ E(S)/√Var(S) = 1.645.

E(S) = Σ_k nk bk qk = 160 and Var(S) = Σ_k nk bk^2 qk (1 − qk) = 256. Therefore, θ = (1.645)(16)/160 = .1645.
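A small sketch (using the table above) computing E(S), Var(S), and the relative security loading θ; 1.645 is the 95th percentile of the standard normal distribution.

```python
import math

# (n_k, b_k, q_k) for the four classes of insureds in Example 2.5.1
portfolio = [(500, 1, .02), (500, 2, .02), (300, 1, .10), (500, 2, .10)]

mean_S = sum(n * b * q for n, b, q in portfolio)                 # E(S)
var_S = sum(n * b * b * q * (1 - q) for n, b, q in portfolio)    # Var(S)
theta = 1.645 * math.sqrt(var_S) / mean_S                        # theta * E(S) / sqrt(Var(S)) = 1.645

print(mean_S, var_S, round(theta, 4))   # 160 256 0.1645
```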

Example 2.5.3. For a portfolio of life insurance policies, the claim probabilities are q1 = ··· = q5 = .02 and the benefits are

Benefit amount    Number insured
10,000            8,000
20,000            3,500
30,000            2,500
50,000            1,500
100,000           500

The retention limit is the amount below which this company (the ceding company) will retain the insurance and above which it will purchase reinsurance coverage from another (the reinsuring) company. Suppose the insurance company sets the retention limit at 20,000. Suppose also that reinsurance is available at a cost of .025 per unit of coverage. It is assumed that the model is a closed model, that is, the number of insured units is known and doesn't change during the covered period. Otherwise, the model allows migration in and out of the insurance system and is called an open model. Find the probability that the company's retained claims plus the cost of reinsurance exceed 8,250,000.
Solution: Let's work in units of $10,000. Let S be the amount of retained claims paid. The portfolio of retained business is

k    bk    nk
1    1     8,000
2    2     8,000

E(S) = 480, Var(S) = 784. The total coverage in the plan is (8,000)(1) + (3,500)(2) + ··· + (500)(10) = 35,000, and the retained coverage is (8,000)(1) + (8,000)(2) = 24,000. The difference 35,000 − 24,000 = 11,000 is the reinsured amount. The reinsurance cost is (11,000)(.025) = 275. Thus, the retained claims plus the reinsurance cost is S + 275. We need to compute

P(S + 275 > 825) = P(S > 550) = P(Z > (550 − 480)/√784) = P(Z > 2.5) = .0062.

12.1. Collective Risk Models for a Single Period. Introduction.

Definition. The collective risk model is the model of the aggregate claim amount generated by a portfolio of policies. Denote by N the number of claims generated by a portfolio of policies in a given time period. Let Xi be the amount of the ith claim (the severity of the ith claim). Then S = X1 + ··· + XN is the aggregate claim amount. The variables N, X1, ..., XN are random variables such that (1) the Xi are identically distributed and (2) N, X1, ..., XN are independent.

12.2. The Distribution of Aggregate Claims.

Notation. Denote by pk = E(X^k) the kth moment of the i.i.d. Xi's. Let MX(t) = E[e^{tX}] be the m.g.f. of Xi. Also, let MN(t) and MS(t) denote the m.g.f.'s of N and S, respectively.
Proposition.
(1) E(S) = E[E(S | N)] = E[p1 N] = p1 E(N).
(2) Var(S) = Var[E(S | N)] + E[Var(S | N)] = Var[p1 N] + E[(p2 − p1^2) N] = p1^2 Var(N) + (p2 − p1^2) E(N).
(3) MS(t) = E[E(e^{tS} | N)] = E[MX(t)^N] = E[e^{N ln MX(t)}] = MN(ln MX(t)).
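A Monte Carlo sketch (arbitrary parameters) of the compound-sum moment formulas: simulate S = X1 + ··· + XN with a Poisson N and exponential Xi, and compare with E(S) = p1 E(N) and Var(S) = p1² Var(N) + (p2 − p1²) E(N).

```python
import random

lam, beta = 3.0, 2.0            # N ~ Poisson(lam), X_i ~ Exp(beta) with mean beta; arbitrary choices
p1, p2 = beta, 2 * beta ** 2    # first two moments of X_i

def poisson(lam):
    """Sample a Poisson(lam) count by counting exponential interarrivals in [0, 1]."""
    t, k = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            return k
        k += 1

trials = 100_000
sums = []
for _ in range(trials):
    n = poisson(lam)
    sums.append(sum(random.expovariate(1 / beta) for _ in range(n)))

mean_s = sum(sums) / trials
var_s = sum((s - mean_s) ** 2 for s in sums) / trials
print(mean_s, p1 * lam)                              # E(S) = p1 * E(N)
print(var_s, p1**2 * lam + (p2 - p1**2) * lam)       # Var(S); equals p2 * lam for Poisson N
```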

Examples 12.2.1 and 12.2.3. Let N ∼ Geom(p) (with N taking values 0, 1, 2, ...) and Xi ∼ Exp(1). Find the distribution of S.
Solution:

MN(t) = p / (1 − (1 − p) e^t)   and   MX(t) = 1/(1 − t).

Thus,

MS(t) = p / (1 − (1 − p)/(1 − t)) = p + (1 − p) · p/(p − t),

which is a weighted average of two m.g.f.'s (that of the constant 0 and that of an exponential with mean 1/p) with weights p and 1 − p, respectively. Therefore, S has a mixed distribution with mass p at zero and an exponential distribution with mean 1/p for x > 0. That is,

FS(x) = 0 if x < 0, and (p)(1) + (1 − p)(1 − e^{−px}) = 1 − (1 − p) e^{−px} if x ≥ 0.

Picture.

12.3.1. The Distribution of N.

Three distributions are commonly used.

  1. N ∼ Poi(λ). The distribution of S is called a compound Poisson distribution.

E(S) = p1 λ, Var(S) = p1^2 λ + (p2 − p1^2) λ = p2 λ, and MS(t) = e^{λ[MX(t) − 1]}.

Recall that if N ∼ P oi(λ), then E(N ) = Var(N ) = λ. If the variance of N is larger than the mean, the Poisson distribution is not appropriate. Then a negative binomial distribution is used.

  2. N ∼ NB(r, p). The distribution of S is called a compound negative binomial distribution.

E(S) = p1 r(1 − p)/p,   Var(S) = p1^2 r(1 − p)^2/p^2 + p2 r(1 − p)/p,   and

MS(t) = [p / (1 − (1 − p) MX(t))]^r.

Notice that indeed if N ∼ NB, then Var(N) > E(N).

  3. N has a conditional (or mixed) Poisson distribution: N | Λ ∼ Poi(Λ), where Λ is a random variable with p.d.f. u(λ). We have

P(N = n) = ∫_0^∞ P(N = n | Λ = λ) u(λ) dλ = ∫_0^∞ (λ^n/n!) e^{−λ} u(λ) dλ,

E(N) = E[E(N | Λ)] = E(Λ),   Var(N) = E(Λ) + Var(Λ),   MN(t) = MΛ(e^t − 1).

Notice that E(N) < Var(N), as in the case of the negative binomial distribution. In fact, the negative binomial distribution can be derived in this fashion.

Example 12.3.1. Let Λ ∼ Gamma(α, β). Show that the marginal distribution of N is

N ∼ NB(r = α, p = β/(1 + β)).

Solution: Here MΛ(t) = (β/(β − t))^α (a gamma with rate β). Thus,

MN(t) = MΛ(e^t − 1) = (β/(β − (e^t − 1)))^α = [ (β/(1 + β)) / (1 − [1 − β/(1 + β)] e^t) ]^α. □
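A short simulation sketch (illustrative parameters) of this mixing construction: draw Λ from a gamma distribution with rate β, then N | Λ ∼ Poisson(Λ), and compare E(N) and Var(N) with E(Λ) and E(Λ) + Var(Λ).

```python
import random

alpha, beta = 3.0, 2.0       # Lambda ~ Gamma(shape alpha, rate beta); arbitrary values
trials = 100_000

counts = []
for _ in range(trials):
    lam = random.gammavariate(alpha, 1 / beta)   # gammavariate takes a scale parameter, so scale = 1/rate
    # Poisson(lam) draw via exponential interarrival times on [0, 1]
    t, n = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            break
        n += 1
    counts.append(n)

mean_n = sum(counts) / trials
var_n = sum((c - mean_n) ** 2 for c in counts) / trials
print(mean_n, alpha / beta)                        # E(N) = E(Lambda)
print(var_n, alpha / beta + alpha / beta ** 2)     # Var(N) = E(Lambda) + Var(Lambda)
```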

12.5. Approximations to the Distribution of Aggregate Claims.

The following versions of the central limit theorem hold for compound Poisson and compound negative binomial distributions.

Theorem 12.5.1. (1) If S has a compound Poisson distribution, then for large λ,

Z = (S − λ p1) / √(λ p2)

has approximately the N(0, 1) distribution.

(2) If S has a compound negative binomial distribution, then for large r,

Z = (S − r p1 (1 − p)/p) / √(r p2 (1 − p)/p + r p1^2 (1 − p)^2/p^2)

has approximately the N(0, 1) distribution.

The normal distribution doesn't give a good approximation, since it is a symmetric distribution, but the distribution of aggregate claims is skewed to the right (has a long right tail). A translated gamma distribution gives a better approximation.
Definition. Let G(x; α, β) denote the c.d.f. of a Gamma(α, β) distribution (here parametrized so that its mean is α/β and variance α/β^2). Then H(x; α, β, x0) = G(x − x0; α, β) is the c.d.f. of a translated gamma distribution. Picture.
The parameters α, β, x0 are found by equating the first moments

E(S) = x0 + α/β,

and the 2nd and 3rd central moments

E(S − E(S))^2 = Var(S) = α/β^2,   E(S − E(S))^3 = 2α/β^3.

From here,

α = 4 [Var(S)]^3 / [E(S − E(S))^3]^2,
β = 2 Var(S) / E(S − E(S))^3,
x0 = E(S) − 2 [Var(S)]^2 / E(S − E(S))^3.

For a compound Poisson distribution,

E(S) = p1 λ, Var(S) = p2 λ, and E(S − E(S))^3 = λ p3.

Therefore,

α = 4λ p2^3 / p3^2,   β = 2 p2 / p3,   and   x0 = λ p1 − 2λ p2^2 / p3.

For a compound negative binomial,

E(S) = p1 r(1 − p)/p,   Var(S) = p1^2 r(1 − p)^2/p^2 + p2 r(1 − p)/p,

and

E(S − E(S))^3 = r p3 (1 − p)/p + 3r p1 p2 (1 − p)^2/p^2 + 2r p1^3 (1 − p)^3/p^3.

Example 12.5.1. Assume N ∼ Poi(λ = 16) and Xi = 1. Approximate P(S < 25) (a) using the normal approximation, (b) using the translated gamma approximation.
Solution: (a)

P(S < 25) = P(Z < (25 − 16 + .5)/√16) = P(Z < 2.38) = .9912.

(b) λ = 16, α = 64, β = 2, x0 = −16,

P(S < 25) = G(25 + 16 + .5; 64, 2) = G(41.5; 64, 2) = .9866.
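A sketch (using scipy, with the same continuity correction as above) of both approximations for Example 12.5.1; gamma.cdf is called with shape α and scale 1/β, since β here is a rate parameter.

```python
from scipy.stats import norm, gamma

lam = 16.0
p1 = p2 = p3 = 1.0            # moments of X_i when X_i = 1

# (a) normal approximation with continuity correction
z = (25 - lam * p1 + 0.5) / (lam * p2) ** 0.5
print(norm.cdf(z))            # about 0.991

# (b) translated gamma approximation
alpha = 4 * lam * p2**3 / p3**2          # 64
beta = 2 * p2 / p3                       # 2 (a rate)
x0 = lam * p1 - 2 * lam * p2**2 / p3     # -16
print(gamma.cdf(25 + 0.5 - x0, alpha, scale=1 / beta))   # about 0.987
```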

Example. Assume N ∼ NB(r = 10, p = .01) and Xi ∼ Unif(0, 1). Approximate P(S > 600).

13.1. Collective Risk Models over an Extended Period. Introduction.

Definition. A stochastic process {X(t), t ≥ 0} is a collection of random variables indexed by t, called time.
Definition. Denote by S(t) the aggregate claims paid in the interval [0, t]. Let c(t) be the premiums collected in the interval [0, t], and u be the company's surplus at time 0. Then the surplus process at time t is U(t) = u + c(t) − S(t), t ≥ 0. We will assume c(t) = ct, where c > 0 is a constant premium rate. Let Ti denote the time when the ith claim occurred. Picture of a typical path of U(t): linear growth, jumps at Ti.
Definitions. Ruin occurs when the surplus U(t) becomes negative. Denote the time of ruin T = min{t : t ≥ 0 and U(t) < 0}. Assume T = ∞ if U(t) ≥ 0 for all t. Let ψ(u) = P(T < ∞) = P(U(t) < 0 for some t) denote the probability of ruin, considered as a function of the initial surplus.

13.3. A Continuous Time Model. A Compound Poisson Process.

Definition. Consider a portfolio of insurance. Let {N (t), t ≥ 0 } denote the claim number process and {S(t), t ≥ 0 } denote the aggregate claim process. Let Xi be the amount of the ith claim. Then S(t) = X 1 + X 2 + · · · + XN (t). Picture N (t) and S(t).

Definition. A stochastic process {N (t), t ≥ 0 } is called a counting process

if it represents the total number of events that have occurred up to time t. A counting process is such that (1) N(t) ≥ 0, (2) N(t) is integer valued, (3) if s < t, then N(s) ≤ N(t), and (4) for s < t, N(t) − N(s) is the number of events that have occurred in the interval (s, t).
Definition. A counting process is said to have independent increments if the numbers of events occurring in disjoint time intervals are independent.
Definition. A counting process {N(t), t ≥ 0} is called a Poisson process if (1) N(0) = 0, (2) it has independent increments, and (3) the number of events in an interval of length t is a Poisson random variable with mean λt, that is,

P(N(t + s) − N(s) = n) = ((λt)^n / n!) e^{−λt}.

Proposition. Let Vk = Tk − Tk−1, k > 1, be the time between two consecutive claims (called the waiting or interarrival or interevent time). If {N(t), t ≥ 0} is a Poisson process, then the Vk are i.i.d. Exp(β = 1/λ) and Tk ∼ Gamma(α = k, β = 1/λ).

Definition. If the claim amounts Xi are i.i.d. random variables independent of the claim number process {N(t), t ≥ 0}, which is assumed to be a Poisson process, then {S(t), t ≥ 0} is a compound Poisson process. The mean and variance of S(t) are E(S(t)) = λt p1 and Var(S(t)) = λt p2.

13.4. Ruin Probability and the Claim Amount Distribution.

We make the following assumptions: (1) S(t) is a compound Poisson process. (2) c > λ p 1 , that is, the premium collection rate exceeds the expected claim payments per unit time. (3) Define the relative security loading θ by c = (1 + θ) λ p 1 where θ > 0. If θ ≤ 0, then the ruin is certain and ψ(u) = 1 for any initial surplus u.

Definition. The adjustment coefficient R is the positive solution of the equation λ (MX (r) − 1) = c r,

or, equivalently, after substitution c = (1 + θ) λ p 1 , the positive solution of 1 + (1 + θ) p 1 r = MX (r). Picture.

Example 13.4.1. Find the adjustment coefficient if X is exponential with mean 1/β.
Solution:

MX(r) = β/(β − r),   p1 = 1/β,   so   1 + (1 + θ) r/β = β/(β − r).

Thus r solves the quadratic equation (1 + θ) r^2 − θβ r = 0. One solution is r = 0 and the other is

R = θβ/(1 + θ).
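A closing sketch (illustrative only) that finds the adjustment coefficient numerically by solving 1 + (1 + θ) p1 r = MX(r) with a root finder, and checks it against R = θβ/(1 + θ) for the exponential case.

```python
from scipy.optimize import brentq

theta, beta = 0.25, 2.0            # relative security loading and exponential rate; arbitrary values
p1 = 1 / beta                      # mean claim size

def h(r):
    """h(r) = M_X(r) - 1 - (1 + theta) * p1 * r; the adjustment coefficient is its positive root."""
    return beta / (beta - r) - 1 - (1 + theta) * p1 * r

# Search strictly between 0 and beta, since M_X(r) blows up at r = beta.
R = brentq(h, 1e-9, beta - 1e-6)
print(R, theta * beta / (1 + theta))   # both approximately 0.4
```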