
Course: Mathematical Statistics
Term: Fall 2018
Instructor: Gordan Žitković
Lecture 3
Cumulative distribution functions and derived quantities
Last Updated: September 25, 2019

When we talk about the distribution of a discrete random variable, we write down its pmf (or a distribution table), and when the variable is continuous, we give its pdf. There are other ways of expressing the same information; depending on the context, these other ways can be much more useful or effective.

3.1 Cumulative distribution functions (cdf)

Definition 3.1.1. For a random variable $Y$, discrete or continuous, we define its cumulative distribution function (cdf) $F_Y : \mathbb{R} \to [0, 1]$ by

$$F_Y(y) = P[Y \le y], \quad y \in \mathbb{R}.$$

The first, obvious, advantage of the cdf is that it can be used for both discrete and continuous random variables. Since it is defined as a probability of an event, $F_Y(y)$ can be computed (at least in principle) from the distribution table in the discrete case:

$$F_Y(y) = \sum_{u \in S_Y,\, u \le y} p_Y(u),$$

or from the pdf (in the continuous case):

$$F_Y(y) = \int_{-\infty}^{y} f_Y(u)\, du. \tag{3.1.1}$$

As we shall see in the examples, going the other way in the discrete case is possible, but the formula is a bit clumsy. The continuous case is nicer because one could use the fundamental theorem of calculus to conclude that

$$f_Y(y) = \frac{d}{dy} F_Y(y) \quad \text{for } y \in \mathbb{R},$$

at least for those y where fY is a continuous function.
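Both directions can be checked numerically. The following is a minimal sketch rather than anything from the notes themselves: the density $3u^2$ on $[0, 1]$ (whose cdf is $y^3$ there) is chosen purely for illustration, and the function names, the midpoint rule and the step sizes are assumptions made for the example.

```python
def pdf(u):
    """Illustrative density (an assumption for this sketch): 3u^2 on [0, 1]."""
    return 3.0 * u * u if 0.0 <= u <= 1.0 else 0.0


def cdf_from_pdf(y, lower=-1.0, steps=20_000):
    """Approximate F(y), the integral of pdf over (-inf, y], by the midpoint rule.

    The density vanishes below 0, so starting the integration at `lower`
    instead of -infinity loses no mass.
    """
    if y <= lower:
        return 0.0
    h = (y - lower) / steps
    return h * sum(pdf(lower + (i + 0.5) * h) for i in range(steps))


def pdf_from_cdf(y, eps=1e-4):
    """Recover the density at y as a central-difference derivative of the cdf."""
    return (cdf_from_pdf(y + eps) - cdf_from_pdf(y - eps)) / (2.0 * eps)


print(cdf_from_pdf(0.5))  # close to 0.5**3 = 0.125
print(pdf_from_cdf(0.5))  # close to 3 * 0.5**2 = 0.75
```

The derivative is taken at a point where the density is continuous; at $y = 1$, where this particular density jumps, the difference quotient would not settle on a single value.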

We know that the pdf $f_Y$ of any random variable $Y$ must be nonnegative and integrate to 1. In a similar way, any cdf will have the following properties:

  1. $0 \le F_Y(u) \le 1$,

  2. $F_Y$ is nondecreasing, and

  3. $\lim_{u \to \infty} F_Y(u) = 1$ and $\lim_{u \to -\infty} F_Y(u) = 0$.
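These three properties are easy to test on a grid. The sketch below is only a sanity check under stated assumptions: the helper name `looks_like_cdf`, the grid bounds and the tolerances are made up for the example, the logistic function is used as a known cdf (that of the logistic distribution), and `clipped_square` is a deliberately bad candidate.

```python
import math


def looks_like_cdf(F, lo=-50.0, hi=50.0, n=2001, tol=1e-9):
    """Check the three cdf properties for F on a finite grid.

    Passing is only a sanity check: a finite grid cannot prove monotonicity
    or the limits at +/- infinity, but a failure rules F out as a cdf.
    """
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    values = [F(y) for y in grid]
    # Property 1: values stay in [0, 1].
    in_range = all(-tol <= v <= 1.0 + tol for v in values)
    # Property 2: values never decrease along the grid.
    nondecreasing = all(b >= a - tol for a, b in zip(values, values[1:]))
    # Property 3 (approximately): near 0 at the left end, near 1 at the right end.
    tails = values[0] < 1e-6 and values[-1] > 1.0 - 1e-6
    return in_range and nondecreasing and tails


def logistic(y):
    """A genuine cdf: the logistic distribution."""
    return 1.0 / (1.0 + math.exp(-y))


def clipped_square(y):
    """Not a cdf: it decreases on the negative half-line."""
    return min(1.0, y * y)


print(looks_like_cdf(logistic))        # True
print(looks_like_cdf(clipped_square))  # False
```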

Example 3.1.2.

  1. Bernoulli. Let $Y$ be a Bernoulli random variable $B(p)$. To find an expression for $F_Y$, we first note that

$$F_Y(y) = 0 \quad \text{for } y < 0.$$

This follows directly from the definition: $Y$ takes only the values 0 and 1, so $P[Y \le y] = 0$ as soon as $y < 0$. Similarly,

$$F_Y(y) = 1 \quad \text{for } y \ge 1.$$

What happens in the middle? For any $y \in [0, 1)$, the only way for $Y \le y$ to be true is if $Y = 0$. Therefore, writing $q = 1 - p$ for the probability of the outcome 0,

$$F_Y(y) = P[Y \le y] = P[Y = 0] = q \quad \text{for } y \in [0, 1).$$

A picture makes it even easier to grasp:

Figure 1. The cumulative distribution function (CDF) for the Bernoulli B(p) distribution.
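Collecting the three cases, the Bernoulli cdf can be written in one piece (here $q = 1 - p = P[Y = 0]$):

$$F_Y(y) = \begin{cases} 0, & y < 0, \\ q, & 0 \le y < 1, \\ 1, & y \ge 1. \end{cases}$$

The jump of size $q$ at $y = 0$ and the jump of size $p = 1 - q$ at $y = 1$ are exactly the two entries of the distribution table.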

  2. Discrete with finite support. Let $Y$ be a discrete random variable with a finite support $S_Y = \{y_1, \dots, y_n\}$ and let its distribution table be given by

    y_1   y_2   ...   y_n
    p_1   p_2   ...   p_n
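For a distribution table of this kind, the sum formula for the cdf amounts to a running total of the probabilities, and the pmf can be recovered from the jumps of the cdf. Here is a minimal sketch with a made-up table (the support values 1, 2, 5 and the probabilities 1/5, 1/2, 3/10 are assumptions for illustration, as are the function names); the support is assumed to be sorted in increasing order.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

# Hypothetical distribution table: support points and their probabilities.
support = [1, 2, 5]
probs = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]

# Running sums p_1, p_1 + p_2, ...: the values the cdf takes at the support points.
cumulative = list(accumulate(probs))


def cdf(y):
    """F(y) = sum of p(u) over support points u <= y."""
    k = bisect_right(support, y)  # number of support points <= y
    return cumulative[k - 1] if k > 0 else Fraction(0)


def pmf_from_cdf(y):
    """Recover p(y) as the jump of the cdf at y (zero off the support)."""
    k = bisect_right(support, y)
    if k == 0 or support[k - 1] != y:
        return Fraction(0)
    previous = cdf(support[k - 2]) if k > 1 else Fraction(0)
    return cdf(y) - previous


print(cdf(3))           # 7/10
print(pmf_from_cdf(2))  # 1/2
```

Exact fractions are used so that the cumulative sums end at exactly 1, with no floating-point rounding.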

Figure 3. The cumulative distribution of a uniform U(l, r) distribution.

  4. Normal distribution. The cdf of the normal distribution $N(\mu, \sigma)$,

$$F_Y(y) = \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(u-\mu)^2}{2\sigma^2}}\, du,$$

does not have an explicit expression in terms of elementary functions (not even for $\mu = 0$ and $\sigma = 1$). That is why you had to use tables (or software) to compute various probabilities associated with the normal distribution in your probability class. Using mathematical software, one can evaluate this integral numerically, and the resulting picture is given below:

Figure 4. The cumulative distribution of a normal N(μ, σ) distribution.
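One way to carry out such a numerical evaluation is sketched below. It is an illustration, not the method used to produce the figure: the integral is approximated by composite Simpson's rule started at $\mu - 10\sigma$ (the mass below that point is negligible), and the result is compared with the standard identity $F_Y(y) = \tfrac{1}{2}\bigl(1 + \operatorname{erf}\bigl(\tfrac{y - \mu}{\sigma\sqrt{2}}\bigr)\bigr)$, using `math.erf` from the Python standard library. The function names and the step count are assumptions.

```python
import math


def normal_pdf(u, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma), with sigma the standard deviation."""
    return math.exp(-((u - mu) ** 2) / (2.0 * sigma**2)) / math.sqrt(2.0 * math.pi * sigma**2)


def normal_cdf_numeric(y, mu=0.0, sigma=1.0, steps=2000):
    """Composite Simpson's rule on [mu - 10*sigma, y]; `steps` must be even."""
    a = mu - 10.0 * sigma
    if y <= a:
        return 0.0
    h = (y - a) / steps
    total = normal_pdf(a, mu, sigma) + normal_pdf(y, mu, sigma)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0  # Simpson weights 4, 2, 4, ..., 4
        total += weight * normal_pdf(a + i * h, mu, sigma)
    return total * h / 3.0


def normal_cdf_erf(y, mu=0.0, sigma=1.0):
    """The same cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


print(normal_cdf_numeric(1.0), normal_cdf_erf(1.0))  # both approximately 0.8413
```

The error function is itself defined by an integral, so this does not contradict the statement that no elementary closed form exists.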

  5. Exponential distribution. The integration in the computation of the cdf $F_Y$ of an exponentially distributed random variable $Y \sim E(\tau)$ can be performed quite easily and completely explicitly. First of all, for $y < 0$, we clearly have $F_Y(y) = 0$. For $y > 0$, we compute

$$F_Y(y) = \int_{-\infty}^{y} \frac{1}{\tau} e^{-u/\tau}\, \mathbf{1}_{[0,\infty)}(u)\, du = \int_{0}^{y} \frac{1}{\tau} e^{-u/\tau}\, du = 1 - e^{-y/\tau}, \quad y > 0.$$

Figure 5. CDF of the exponential distribution E(τ).
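The closed form $1 - e^{-y/\tau}$ can be compared with simulated data: the fraction of samples that are at most $y$ should be close to $F_Y(y)$. The sketch below assumes $\tau = 2$, a fixed seed and a sample size of 100,000, all chosen for illustration; it relies on `random.expovariate`, whose argument is the rate $1/\tau$ rather than the mean $\tau$.

```python
import math
import random


def exponential_cdf(y, tau):
    """Closed-form cdf of E(tau): 1 - exp(-y / tau) for y > 0, and 0 otherwise."""
    return 1.0 - math.exp(-y / tau) if y > 0 else 0.0


def empirical_cdf(samples, y):
    """Fraction of the samples that are <= y."""
    return sum(1 for s in samples if s <= y) / len(samples)


tau = 2.0                # illustrative mean lifetime (an assumption)
rng = random.Random(0)   # fixed seed so that the run is reproducible
samples = [rng.expovariate(1.0 / tau) for _ in range(100_000)]

for y in (0.5, 2.0, 5.0):
    # The two printed columns should agree to about two decimal places.
    print(y, round(empirical_cdf(samples, y), 3), round(exponential_cdf(y, tau), 3))
```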

3.2 Quantiles

The notion of a quantile is familiar to almost everyone, even if you have not learned it formally in a class. You do know what “top 1%” means, right? The formal definition is easy once we have the notion of a cdf at our disposal:

Definition 3.2.1. For $\alpha \in (0, 1)$, we define the $\alpha$-quantile of the distribution of the random variable $Y$ as the number $q_Y(\alpha) \in \mathbb{R}$ with the property that $F_Y(q_Y(\alpha)) = \alpha$, i.e., $P[Y \le q_Y(\alpha)] = \alpha$.

Caveat: As defined above, the quantile $q_Y(\alpha)$ need not exist for every $\alpha$. This can be remedied by adopting a more careful definition, but since we will not have to deal with this problem in these notes (whenever we need quantiles, they will happily exist), we simply ignore it. If you want to think about this a bit more, try to figure out which quantiles of the Bernoulli distribution actually exist, i.e., for which $\alpha$ we can find a number $q$ such that $P[Y \le q] = \alpha$ when $Y$ is Bernoulli. Is such a $q$ uniquely determined?
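When the cdf is continuous and strictly increasing, the equation $F_Y(q) = \alpha$ from Definition 3.2.1 has exactly one solution, and it can be found numerically even without a formula for the inverse. The sketch below uses bisection on the standard normal cdf; the function names, the bracket $[-10, 10]$ and the iteration count are assumptions made for the example.

```python
import math


def standard_normal_cdf(y):
    """Cdf of N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def quantile(cdf, alpha, lo=-10.0, hi=10.0, iterations=100):
    """Solve cdf(q) = alpha by bisection.

    Assumes cdf is continuous and strictly increasing on [lo, hi] and that
    cdf(lo) < alpha < cdf(hi); otherwise the quantile may fail to exist
    or to be unique.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not cdf(lo) < alpha < cdf(hi):
        raise ValueError("the bracket [lo, hi] does not contain the quantile")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid  # the quantile lies to the right of mid
        else:
            hi = mid  # the quantile lies at or to the left of mid
    return 0.5 * (lo + hi)


print(quantile(standard_normal_cdf, 0.5))    # approximately 0 (the median)
print(quantile(standard_normal_cdf, 0.975))  # approximately 1.96
```

Bisection needs only evaluations of the cdf, which is why it also works when the cdf has no elementary inverse.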

tures some real data about humans where the hazard rate is far from constant.

Figure 6. The survival (left) and the hazard (right) functions of the empirical distribution of the ages of death of all female individuals born in the US in

3.4 Problems

Problem 3.4.1. Two (unbiased, independent) coins are tossed, and the total number of heads is denoted by $Y$. Write an expression for the cdf of $Y$ and sketch its graph.

Problem 3.4.2. Which of the following pairs of functions could be the pdf and the cdf (respectively) of some probability distribution?

(a) $f(x) = x^2$, $F(x) = \frac{1}{3} x^3$.

(b) $f(x) = \cos(x)$, $F(x) = \sin(x)$.

(c) $f(x) = 2 e^{-2x}\, \mathbf{1}_{\{x > 0\}}$, $F(x) = (1 - e^{-2x})\, \mathbf{1}_{\{x > 0\}}$.

(d) $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, $F(x) = 1 - e^{-x^2}$.

(e) $f(x) = \mathbf{1}_{\{x > 0\}}$, $F(x) = x\, \mathbf{1}_{\{x > 0\}}$.

Problem 3.4.3. Let $Y$ be a random variable with cdf $F_Y$, and let $q_Y : (0, 1) \to \mathbb{R}$ be its quantile function (we assume it exists for each $\alpha \in (0, 1)$). What is the relationship between the graphs of $F_Y$ and $q_Y$, i.e., how do you get one from the other?

Problem 3.4.4. Let $Y$ be a continuous random variable with the density $f_Y$ given by

$$f_Y(y) = c\, y^2 (1 - y)\, \mathbf{1}_{[0,1]}(y),$$

for an appropriate constant $c$.

  1. Sketch the graph of $f_Y$ and find the value of the constant $c$.

  2. Compute the cumulative distribution function (cdf) $F_Y$ and the survival function $S_Y$ of $Y$.

  3. What is the domain of the hazard function? Compute the hazard function $h_Y$ itself.

  4. Find the mode of $Y$.

  5. Compute the $\frac{5}{16}$-th quantile of $Y$. (Note: Guess and verify.)

Problem 3.4.5. Let $Y$ be a random variable with the pdf

$$f_Y(y) = 2y\, \mathbf{1}_{\{0 \le y \le 1\}}.$$

Compute the hazard function $h_Y$ of $Y$.

Problem 3.4.6. Let $Y$ be a uniform random variable on the interval $[0, 100]$. The hazard function $h_Y$ of the distribution of $Y$ is given by

(a) $\frac{1}{y}\, \mathbf{1}_{\{y > 0\}}$ for $y \in (-\infty, 100)$

(b) $\frac{1}{100 - y}\, \mathbf{1}_{\{y > 0\}}$ for $y \in (-\infty, 100)$

(c) $\mathbf{1}_{\{y < 0\}} + \frac{100 - y}{100}\, \mathbf{1}_{\{0 \le y \le 100\}}$ for $y \in (-\infty, 100]$

(d) $(100 - y)\, \mathbf{1}_{\{y \in [0, 100)\}}$ for $y \in [0, \infty)$

(e) none of the above

Problem 3.4.7. The expected lifetime of a bulb is $h$ (in hours). Assuming that the bulb lifetimes are exponentially distributed, compute

  1. the probability that the bulb is still functional at time $h$,

  2. the half-life of the bulb, i.e., a number $t^*$ such that the probability that the bulb is still functional after $t^*$ hours is exactly 1/2.

Problem 3.4.8. Compute the $\alpha$-quantile $q_Y(\alpha)$ for $\alpha = 0.75$, where $Y$ has the uniform distribution $U(4, 8)$ on $[4, 8]$.