Lung Cancer - Introductory Statistics - Lab Solutions, Study notes of Mathematical Statistics

Key points of these Introductory Statistics lab solutions: Lung Cancer, Probability of Death, Available Information, Total Probability, Conditioning, Experiment Consisting, Random Selection, Particular Sale, Expected Repair Cost, Repair Cost

Typology: Study notes

2012/2013

Uploaded on 01/11/2013

bigna

Stat 366 Lab 1 Solutions (September 14, 2006)
TA: Yury Petrachenko, CAB 484, [email protected], http://www.ualberta.ca/~yuryp/
Review Questions, Chapters 2, 3, 4, 7, 8

2.105 A study of the residents of a region showed that 20% were smokers. The probability of death
due to lung cancer, given that a person smoked, was ten times the probability of death due to
lung cancer, given that the person did not smoke. If the probability of death due to lung cancer
in the region is .006, what is the probability of death due to lung cancer given that the person
is a smoker?

Solution. Consider an experiment consisting of a random selection of a late resident of the
region. Some of those who died were smokers; let $S$ denote the event that the selected person
smoked. Some of the deaths were due to lung cancer; let $D$ denote the event that the person
died of lung cancer.

The problem states that $P(S) = 0.2$ and $P(D) = 0.006$. We are also given that
$P(D \mid S) = 10 \cdot P(D \mid \bar{S})$, or equivalently $P(D \mid \bar{S}) = 0.1 \cdot P(D \mid S)$,
where $\bar{S}$ is the event that the person did not smoke.

Apply the law of total probability to $P(D)$, conditioning on $S$ and $\bar{S}$:
$$P(D) = P(D \mid S)P(S) + P(D \mid \bar{S})P(\bar{S}).$$
Substitute all available information into this equation:
$$0.006 = P(D \mid S) \cdot 0.2 + 0.1 \cdot P(D \mid S) \cdot (1 - 0.2) = 0.28 \cdot P(D \mid S).$$
Hence $P(D \mid S) = 0.006 / 0.28 \approx 0.0214$. □
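
The total-probability equation for problem 2.105 can be checked numerically. The sketch below is only an illustration: the variable names are made up for this example, and the numbers are the ones given in the problem statement. It solves 0.006 = P(D|S)(0.2 + 0.8/10) for P(D|S), which evaluates to about 0.0214, and then recombines the two conditional probabilities to confirm that they reproduce P(D) = 0.006.

```python
# Numerical check of problem 2.105 (law of total probability).
# Variable names are illustrative; the numbers come from the problem statement.

p_smoker = 0.2     # P(S)
p_death = 0.006    # P(D)
ratio = 10.0       # P(D | S) = ratio * P(D | not S)

# P(D) = P(D|S) P(S) + P(D|not S) P(not S)
#      = P(D|S) * (P(S) + (1 - P(S)) / ratio)
coefficient = p_smoker + (1.0 - p_smoker) / ratio   # 0.2 + 0.8 / 10 = 0.28
p_death_given_smoker = p_death / coefficient
p_death_given_nonsmoker = p_death_given_smoker / ratio

# Recombine both conditional probabilities as a consistency check.
p_death_check = (p_death_given_smoker * p_smoker
                 + p_death_given_nonsmoker * (1.0 - p_smoker))

print(round(p_death_given_smoker, 4))     # 0.0214
print(round(p_death_given_nonsmoker, 5))  # 0.00214
print(round(p_death_check, 6))            # 0.006
```

Changing `ratio` or `p_smoker` shows how sensitive the answer is to the stated assumptions.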

3.42 A particular sale involves four items randomly selected from a large lot that is known to
contain 10% defectives. Let $Y$ denote the number of defectives among the four sold. The
purchaser of the items will return the defectives for repair, and the repair cost is given by
$C = 3Y^2 + Y + 2$. Find the expected repair cost.

Solution. This is a binomial setting. There are $n = 4$ trials of selecting an item from
a large lot. Each trial has two outcomes: the item is either defective or not. Assuming
the lot is large enough, the trials are independent. Since $Y$ is the number of defectives, it
makes sense to call finding a defective a success. Then $p = 0.1$ and $Y \sim \text{Binomial}(n, p)$.

To find the expected repair cost $E[C]$, use the linearity of expected values:
$$E[C] = E[3Y^2 + Y + 2] = 3E[Y^2] + E[Y] + E[2].$$

The expectation of 2 is 2. The expectation of $Y$ is $np$ (since we know the distribution of this
random variable), so $E[Y] = 4 \cdot 0.1 = 0.4$. To find $E[Y^2]$, recall the formula
$$V[Y] = \sigma^2 = E[Y^2] - \mu^2 = E[Y^2] - (E[Y])^2.$$
With $q = 1 - p = 0.9$, we have
$$E[Y^2] = V[Y] + (E[Y])^2 = npq + (0.4)^2 = 0.36 + 0.16 = 0.52.$$
Finally, plug everything in: $E[C] = 3 \cdot 0.52 + 0.4 + 2 = 3.96$. □
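
As a cross-check on problem 3.42, the expected cost can also be computed straight from the definition of expectation, summing C(y) P(Y = y) over y = 0, ..., 4. The sketch below does this and compares it with the moment-based route used in the solution. The helper names `binomial_pmf` and `repair_cost` are made up for this illustration.

```python
from math import comb

n, p = 4, 0.1   # four items sold, 10% defectives (from the problem statement)


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ Binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def repair_cost(y: int) -> int:
    """Repair cost C = 3Y^2 + Y + 2 from the problem statement."""
    return 3 * y**2 + y + 2


# Route 1: definition of expectation, E[C] = sum of C(y) * P(Y = y).
expected_cost_direct = sum(repair_cost(y) * binomial_pmf(y, n, p)
                           for y in range(n + 1))

# Route 2: linearity plus E[Y^2] = V[Y] + (E[Y])^2, as in the solution.
mean_y = n * p
second_moment_y = n * p * (1 - p) + mean_y**2
expected_cost_moments = 3 * second_moment_y + mean_y + 2

print(round(expected_cost_direct, 4))   # 3.96
print(round(expected_cost_moments, 4))  # 3.96
```

The two routes agree because the cost is a polynomial in Y, so its expectation depends only on the first two moments of Y.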

3.107 A salesperson has found that the probability of a sale on a single contact is approximately
.03. If the salesperson contacts 100 prospects, what is the approximate probability of making at
least one sale?

Solution. If we ignore the word "approximate", we can approach this problem with a binomial
distribution. There are $n = 100$ prospects, presumably independent, and the salesperson
makes a sale with probability $p = 0.03$ in each of the 100 cases. The question can be
reformulated in terms of probabilities as follows: find $P(Y \ge 1)$ if $Y$ is binomially
distributed with parameters $n$ and $p$. To answer this question, use the fact that
$$\sum_{y=0}^{100} P(Y = y) = 1,$$
since $Y$ takes values in $0, 1, 2, \ldots, 100$. So,
$$P(Y \ge 1) = \sum_{y=1}^{100} P(Y = y) = 1 - P(Y = 0) = 1 - \binom{100}{0}(0.03)^0(0.97)^{100} \approx 0.9524.$$

To find a truly approximate value, we can use the Poisson distribution. Since $p$ is close to
zero, the binomial variance $npq$ and expected value $np$ do not differ much, which is what the
Poisson model requires. So let $\lambda = np = 3$ and assume that $Y$ follows the Poisson
distribution:
$$P(Y = y) = \frac{\lambda^y}{y!} e^{-\lambda}, \quad y = 0, 1, 2, \ldots$$
With this assumption,
$$P(Y \ge 1) = 1 - P(Y = 0) = 1 - \frac{\lambda^0}{0!} e^{-\lambda} = 1 - e^{-3} \approx 0.9502.$$

The second method is faster and uses a model with only one parameter. □
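
The two answers to problem 3.107 are easy to reproduce side by side. The sketch below evaluates P(Y ≥ 1) under the binomial model and under the Poisson approximation with λ = np. The function and variable names are illustrative, not taken from the lab handout.

```python
from math import comb, exp, factorial

n, p = 100, 0.03   # prospects contacted and per-contact sale probability
lam = n * p        # Poisson rate matched to the binomial mean


def binomial_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ Binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y) for Y ~ Poisson(lam)."""
    return lam**y / factorial(y) * exp(-lam)


# "At least one sale" is the complement of "no sales".
p_at_least_one_binomial = 1 - binomial_pmf(0, n, p)
p_at_least_one_poisson = 1 - poisson_pmf(0, lam)

print(round(p_at_least_one_binomial, 4))  # 0.9524
print(round(p_at_least_one_poisson, 4))   # 0.9502
```

The gap between the two values is about 0.002, consistent with the usual guidance that the Poisson approximation works well when n is large and p is small.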

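We want this function minimized. Denote $f(a) = a^2\sigma_1^2 + (1 - a)^2\sigma_2^2$; then
$$f'(a) = 2a\sigma_1^2 - 2(1 - a)\sigma_2^2.$$
Setting $f'(a) = 0$ and solving for $a$:
$$a = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}.$$
Since $f''(a) = 2\sigma_1^2 + 2\sigma_2^2 > 0$, this value of $a$ minimizes the variance of the
third estimator. □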
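
The minimizer can be sanity-checked numerically. The sketch below uses only the function f(a) = a²σ₁² + (1 − a)²σ₂² as written above. The variances 4 and 1 are hypothetical values chosen for illustration, and the function names are made up for this example. It compares the closed-form weight with a brute-force search over a grid on [0, 1].

```python
def combined_variance(a: float, var1: float, var2: float) -> float:
    """f(a) = a^2 * var1 + (1 - a)^2 * var2."""
    return a**2 * var1 + (1 - a) ** 2 * var2


def optimal_weight(var1: float, var2: float) -> float:
    """Closed-form minimizer a = var2 / (var1 + var2)."""
    return var2 / (var1 + var2)


# Hypothetical variances, chosen only for illustration.
var1, var2 = 4.0, 1.0

a_star = optimal_weight(var1, var2)

# Brute-force check: evaluate f on a grid over [0, 1] and take the smallest.
grid = [i / 1000 for i in range(1001)]
a_grid = min(grid, key=lambda a: combined_variance(a, var1, var2))

print(a_star)                                           # 0.2
print(a_grid)                                           # 0.2
print(round(combined_variance(a_star, var1, var2), 6))  # 0.8
```

The minimized value 0.8 is smaller than either variance on its own, and the larger weight goes to the component with the smaller variance.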

8.11 Let $Y_1, Y_2, \ldots, Y_n$ denote a random sample of size $n$ from a population whose
density is given by
$$f(y) = \begin{cases} 3\beta^3 y^{-4}, & \beta \le y, \\ 0, & \text{elsewhere,} \end{cases}$$
where $\beta$ is unknown. Consider the estimator $\hat{\beta} = \min(Y_1, Y_2, \ldots, Y_n)$.

(a) Derive the bias of the estimator $\hat{\beta}$.

(b) Derive $\mathrm{MSE}(\hat{\beta})$.

Solution. See Prof. Prasad’s notes.
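
Independently of those notes, a simulation can serve as a sanity check on a derived bias and MSE. The sketch below is not the derivation the notes refer to. It assumes hypothetical values n = 5 and β = 2, draws samples by inverting the CDF F(y) = 1 − (β/y)³, and compares the simulated bias and MSE of the sample minimum with the values β/(3n − 1) and 2β²/((3n − 1)(3n − 2)). These closed forms follow from P(min > y) = (β/y)^{3n} for y ≥ β, and they are offered as reference values under that calculation. All function names are made up for this example.

```python
import random


def sample_min(n: int, beta: float, rng: random.Random) -> float:
    """Minimum of n draws from the density 3 * beta^3 * y^(-4), y >= beta.

    Inverse-CDF sampling: F(y) = 1 - (beta / y)^3, so y = beta * (1 - u)^(-1/3)
    for u uniform on [0, 1).
    """
    return min(beta * (1.0 - rng.random()) ** (-1.0 / 3.0) for _ in range(n))


def simulate_bias_and_mse(n: int, beta: float, reps: int, seed: int = 0):
    """Monte Carlo estimates of the bias and MSE of the sample minimum."""
    rng = random.Random(seed)
    estimates = [sample_min(n, beta, rng) for _ in range(reps)]
    bias = sum(estimates) / reps - beta
    mse = sum((b - beta) ** 2 for b in estimates) / reps
    return bias, mse


# Hypothetical sample size and parameter value, chosen only for illustration.
n, beta = 5, 2.0

sim_bias, sim_mse = simulate_bias_and_mse(n, beta, reps=100_000)

# Reference values from P(min > y) = (beta / y)^(3n) for y >= beta.
closed_bias = beta / (3 * n - 1)
closed_mse = 2 * beta**2 / ((3 * n - 1) * (3 * n - 2))

print(sim_bias, closed_bias)
print(sim_mse, closed_mse)
```

The simulated and reference values should agree to within Monte Carlo error. The bias is positive because the minimum can never fall below β.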