













Slides (Google Drive) Alex Tsun Video (YouTube)
Up to this point, we have only been talking about discrete random variables - ones that only take values in a countable (finite or countably infinite) set like the integers or a subset. What if we wanted to model quantities that were continuous - that could take on uncountably infinitely many values? If you haven’t studied or seen cardinality (or types of infinities) before, you can think of this as being intervals of the real line, which take decimal values. Our tools from the previous chapter were not suitable to modelling these situations, and so we need a new type of random variable.
Definition 4.1.1: Continuous Random Variables
A continuous random variable is a random variable that takes values from an uncountably infinite set, such as the set of real numbers or an interval. For example, height (5.6312435 feet, 6.1123 feet, etc.), weight (121.33567 lbs, 153.4642 lbs, etc.), and time (2.5644 seconds, 9321.23403 seconds, etc.) are continuous random variables that take on values in a continuum.
Why do we need continuous random variables?
Suppose we want a random number in the interval [0, 10], with each possibility being “equally likely”.
Suppose we want to study the set of possible heights (in feet) a person can have, supposing that the range of possible heights is the interval [1, 8].
Notice that since these values can have infinite precision, the probability that a continuous random variable takes any one specific value is 0, in contrast to discrete random variables.
2 Probability & Statistics with Applications to Computing 4.
Every continuous random variable has a probability density function (PDF), instead of a probability mass function (PMF), that defines the relative likelihood that a random variable X has a particular value. Why do we need this new construct? We already said that P (X = a) = 0 for any value of a, and so a “PMF” for a continuous random variable would equal 0 for any input and be useless. It wouldn’t satisfy the constraint that the sum of the probabilities is 1 (assuming we could even sum over uncountably many values; we can’t). Instead, we have the idea of a probability density function where the x-axis has values in the random variable’s range (usually an interval), and the y-axis has the probability density (not mass), which is explained below.
A PDF may look something like this:
The probability density function fX has some characteristic properties (denoted with fX to distinguish from PMFs pX ). Notice again I will use different dummy variables inside the function like fX (z) or fX (t) to ensure you get the idea that the density is fX (subscript indicates for rv X) and the dummy variable can be anything.
$\int_{-\infty}^{\infty} f_X(t)\,dt = 1$; i.e., the area under the entire curve is equal to 1, just like the sum of all the probabilities of a discrete random variable equals 1.
$P(a \le X \le b) = \int_a^b f_X(w)\,dw$; i.e., the probability that X lies in the interval a to b is the area under the curve from a to b. This is key: integrating fX gives us probabilities.
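Both integral properties can be checked numerically. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (not from the text) and a simple midpoint rule; the function names `integrate` and `pdf` are illustrative choices.

```python
def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def pdf(x):
    """Hypothetical density (an assumption for illustration): 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# The density is zero outside [0, 1], so integrating over [0, 1] covers the whole real line.
total_area = integrate(pdf, 0.0, 1.0)

# P(0.25 <= X <= 0.5) is the area under the curve between 0.25 and 0.5.
prob = integrate(pdf, 0.25, 0.5)

print(f"total area = {total_area:.6f}")
print(f"P(0.25 <= X <= 0.5) = {prob:.6f}")
```

The total area plays the role that "probabilities sum to 1" plays for a PMF, and the second integral is an actual probability.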
we see their ratios. Since the density is twice as high at u as it is at v, it means we are twice as likely to get a point “near” u as we are to get one “near” v.
Definition 4.1.2: Probability Density Function (PDF)
Let X be a continuous random variable (one whose range is typically an interval or union of intervals). The probability density function (PDF) of X is the function fX : R → R, such that the following properties hold:
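$$\int_{-\infty}^{\infty} f_X(t)\,dt = 1$$
$$P(a \le X \le b) = \int_a^b f_X(w)\,dw$$
$$P(X \approx q) = P\left(q - \frac{\varepsilon}{2} \le X \le q + \frac{\varepsilon}{2}\right) \approx \varepsilon f_X(q)$$
$$\frac{P(X \approx u)}{P(X \approx v)} \approx \frac{\varepsilon f_X(u)}{\varepsilon f_X(v)} = \frac{f_X(u)}{f_X(v)}$$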
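The approximation P(X ≈ q) ≈ ε fX(q) and the ratio property can be illustrated with a small sketch. It assumes a hypothetical density f(x) = 3x^2 on [0, 1] (not from the text), whose CDF x^3 is known exactly, so the probability of a small window can be compared with ε times the density.

```python
def cdf(x):
    """CDF of the hypothetical density f(x) = 3x^2 on [0, 1] (an illustrative assumption)."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x ** 3


def pdf(x):
    """The hypothetical density itself: 3x^2 on [0, 1], 0 elsewhere."""
    return 3.0 * x ** 2 if 0.0 <= x <= 1.0 else 0.0


def prob_near(q, eps):
    """Exact P(q - eps/2 <= X <= q + eps/2), computed from the CDF."""
    return cdf(q + eps / 2) - cdf(q - eps / 2)


eps = 1e-3
u, v = 0.8, 0.4
for q in (u, v):
    print(f"q={q}: exact={prob_near(q, eps):.8f}, eps*f(q)={eps * pdf(q):.8f}")

# The window width eps cancels in the ratio, leaving the ratio of densities.
ratio_of_probs = prob_near(u, eps) / prob_near(v, eps)
ratio_of_densities = pdf(u) / pdf(v)
print(f"ratio of probabilities = {ratio_of_probs:.6f}")
print(f"ratio of densities     = {ratio_of_densities:.6f}")
```

Shrinking `eps` makes the two ratios agree more closely, which is the sense in which density measures relative likelihood "near" a point.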
Here is the density function of a “uniform” random variable on the interval [0, 1]:
We know this is valid, because the area under the curve is the area of a square with side lengths 1, which is 1 · 1 = 1.
We define the cumulative distribution function (CDF) of X to be FX (w) = P (X ≤ w); that is, all the area to the left of w under the density function. Note that we also have CDFs for discrete random variables, and they are defined in exactly the same way (the probability of being less than or equal to a certain value)! They just don't usually have a nice closed form like they do for continuous RVs. For continuous random variables, the CDF at w is the cumulative area to the left of w, which can be found by an integral (the dummy variable of integration should be different from the input variable w):
$$F_X(w) = P(X \le w) = \int_{-\infty}^{w} f_X(y)\,dy$$
Let’s try to compute the CDF of this uniform random variable on [0, 1]. There are three cases to consider here.
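The three cases can be sketched in code. The values follow from integrating the uniform density, which is 1 on [0, 1] and 0 elsewhere; the function name `uniform_cdf` is an illustrative choice.

```python
def uniform_cdf(w):
    """CDF F_X(w) = P(X <= w) of a uniform random variable on [0, 1]."""
    if w < 0.0:
        # No area lies to the left of w.
        return 0.0
    if w > 1.0:
        # The whole unit square lies to the left of w.
        return 1.0
    # Rectangle with base w and height 1.
    return w


for w in (-0.5, 0.0, 0.3, 1.0, 2.0):
    print(f"F_X({w}) = {uniform_cdf(w)}")
```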
Definition 4.1.3: Cumulative Distribution Function (CDF)
Let X be a continuous random variable (one whose range is typically an interval or union of intervals). The cumulative distribution function (CDF) of X is the function FX : R → R such that:
$$F_X(t) = P(X \le t) = \int_{-\infty}^{t} f_X(w)\,dw \quad \text{for all } t \in \mathbb{R}$$
Example(s)
Suppose the number of hours that a package gets delivered past noon is modelled by the following PDF:
$$f_X(x) = \begin{cases} x/10 & 0 \le x \le 2 \\ c & 2 < x \le 6 \\ 0 & \text{otherwise} \end{cases}$$
Here is a graph of the PDF as described above:
Solution
$$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$$
Since the density function is split into three parts, we can split our integral into three. However, anywhere the density is zero, we will get an integral of zero, so we'll only set up the two integrals that are nontrivial:
$$\int_0^2 \frac{x}{10}\,dx + \int_2^6 c\,dx = 1$$
Solving this equation for c would definitely work. But let's try to use geometry instead, as we do know how to compute the area of a triangle and rectangle. The left integral is the area of the triangle with base from 0 to 2 and height fX (2) = 2/10 = 1/5, so that area is (2 · 1/5)/2 = 1/5 (the area of a triangle is b · h/2). The area of the rectangle with base from 2 to 6 and height c is 4c. We need the total area 1/5 + 4c = 1, so c = 1/5.
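The algebraic route gives the same constant. A minimal sketch using exact rational arithmetic, where the variable names are illustrative:

```python
from fractions import Fraction

# Area under x/10 on [0, 2]: the antiderivative x^2/20 evaluated from 0 to 2.
triangle_area = Fraction(2 ** 2, 20) - Fraction(0, 20)

# Area under the constant c on (2, 6] is 4c, so solve triangle_area + 4c = 1 for c.
rectangle_base = 6 - 2
c = (1 - triangle_area) / rectangle_base

print(f"triangle area = {triangle_area}")
print(f"c = {c}")
print(f"total area = {triangle_area + rectangle_base * c}")
```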
(a) The outer cases are usually the easiest ones: if x < 0, then FX (x) = P (X ≤ x) = 0 since X cannot be less than zero.
(b) If x > 6, then FX (x) = P (X ≤ x) = 1 since X is guaranteed to be at most 6.
(c) For 0 ≤ x ≤ 2, we need the cumulative area to the left of x, which happens to be a triangle with base x and height x/10, so the area is x^2/20. Alternatively, evaluate the integral
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt = \int_0^x \frac{t}{10}\,dt = \frac{x^2}{20}$$
(d) For 2 < x ≤ 6, we have the entire triangle of area 2 · 1/5 · 0.5 = 1/5, but also a rectangle of base x − 2 and height 1/5, for a total area of 1/5 + (1/5)(x − 2) = x/5 − 1/5. Alternatively, the integral would be
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt = \int_0^2 \frac{t}{10}\,dt + \int_2^x \frac{1}{5}\,dt = \frac{x}{5} - \frac{1}{5}$$
Again, I skipped all the integral evaluation steps as they are purely computational, but feel free to verify!
Finally, putting this together gives
$$F_X(x) = \begin{cases} 0 & x < 0 \\ x^2/20 & 0 \le x \le 2 \\ x/5 - 1/5 & 2 < x \le 6 \\ 1 & x > 6 \end{cases}$$
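The four cases can be cross-checked against a direct numerical integration of the density. The sketch below takes c = 1/5 and uses a simple midpoint rule; the helper names are illustrative.

```python
def pdf(x):
    """Delivery-time density from the example, with c = 1/5."""
    if 0.0 <= x <= 2.0:
        return x / 10.0
    if 2.0 < x <= 6.0:
        return 0.2
    return 0.0


def cdf(x):
    """Closed-form CDF assembled from the four cases."""
    if x < 0.0:
        return 0.0
    if x <= 2.0:
        return x ** 2 / 20.0
    if x <= 6.0:
        return x / 5.0 - 0.2
    return 1.0


def integrate(f, a, b, n=60_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# The density is zero below 0, so the area to the left of x starts at 0.
for x in (1.0, 2.0, 3.5, 6.0):
    numeric = integrate(pdf, 0.0, x)
    print(f"x={x}: closed form={cdf(x):.6f}, numeric={numeric:.6f}")
```

The two pieces also agree at x = 2 (both give 1/5), so the CDF is continuous there.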
Solution: We need the total area under the curve to be 1, so
$$\int_{-\infty}^{\infty} f_X(y)\,dy = \int_0^9 c y^2\,dy = c\left[\frac{y^3}{3}\right]_0^9 = c \cdot \frac{729}{3} = 243c$$
Hence, c = 1/243. The expected value is the weighted average of each point weighted by its density, so
$$E[X] = \int_{-\infty}^{\infty} z f_X(z)\,dz = \int_0^9 z \cdot \frac{1}{243} z^2\,dz = \frac{1}{243}\int_0^9 z^3\,dz$$
Similarly, by LOTUS,
$$E[X^2] = \int_{-\infty}^{\infty} z^2 f_X(z)\,dz = \int_0^9 z^2 \cdot \frac{1}{243} z^2\,dz = \frac{1}{243}\int_0^9 z^4\,dz$$
Finally, we can set Var (X) = E[X^2] − (E[X])^2.
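The integrals above can be evaluated exactly. The sketch below assumes the density in this example is c z^2 on [0, 9] with c = 1/243 (since 243c = 1), as the normalization step indicates; the helper `moment` is an illustrative name.

```python
from fractions import Fraction

c = Fraction(1, 243)  # normalizing constant, from 243c = 1


def moment(k):
    """E[X^k] = c * (integral of z^(k+2) over [0, 9]) = c * 9^(k+3) / (k+3)."""
    return c * Fraction(9 ** (k + 3), k + 3)


total = moment(0)            # normalization check
mean = moment(1)             # E[X]
second_moment = moment(2)    # E[X^2], by LOTUS
variance = second_moment - mean ** 2

print(f"total area = {total}")
print(f"E[X] = {mean}")
print(f"E[X^2] = {second_moment}")
print(f"Var(X) = {variance}")
```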
For the CDF, we know that
$$F_X(t) = P(X \le t) = \int_{-\infty}^{t} f_X(y)\,dy$$
We actually have three cases, similar to the example earlier. If t < 0, FX (t) = 0 since there’s no way to get a negative number (the range is ΩX = [0, 9]). If t > 9, FX (t) = 1 since we are guaranteed to get a number less than t. And for 0 ≤ t ≤ 9, we just do a normal integral to get that
$$F_X(t) = P(X \le t) = \int_{-\infty}^{t} f_X(s)\,ds = \int_{-\infty}^{0} f_X(s)\,ds + \int_0^t f_X(s)\,ds = 0 + \int_0^t c s^2\,ds = \frac{c}{3} t^3$$
Putting this together gives:
$$F_X(t) = \begin{cases} 0 & t < 0 \\ \frac{c}{3} t^3 & 0 \le t \le 9 \\ 1 & t > 9 \end{cases}$$
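With c = 1/243 (from 243c = 1), the middle case becomes t^3/729. A short sketch, using exact rational arithmetic and an illustrative interval [3, 6] that is not from the text, shows how the CDF yields interval probabilities:

```python
from fractions import Fraction

C = Fraction(1, 243)  # normalizing constant, from 243c = 1


def cdf(t):
    """CDF of the density c*s^2 on [0, 9], split into the three cases."""
    if t < 0:
        return Fraction(0)
    if t > 9:
        return Fraction(1)
    return C / 3 * Fraction(t) ** 3


# The middle case reaches exactly 1 at the right endpoint of the range.
print(f"F_X(9) = {cdf(9)}")

# P(3 <= X <= 6) = F_X(6) - F_X(3), since single points carry no probability.
print(f"P(3 <= X <= 6) = {cdf(6) - cdf(3)}")
```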
$$f_X(x) = \begin{cases} c/x^2 & 1 \le x < \infty \\ 0 & \text{otherwise} \end{cases}$$
Write an expression for the value of c that makes fX a valid pdf, and set up expressions (integrals) for the mean and variance of X. Also, find the cdf of X, FX.
Solution: We need the total area under the curve to be 1, so
$$\int_{-\infty}^{\infty} f_X(y)\,dy = \int_1^{\infty} \frac{c}{y^2}\,dy = -c\left[\frac{1}{y}\right]_1^{\infty} = -c(0 - 1) = c$$
Hence, c = 1. The expected value is the weighted average of each point weighted by its density, so
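$$E[X] = \int_{-\infty}^{\infty} z f_X(z)\,dz = \int_1^{\infty} z \cdot \frac{1}{z^2}\,dz = \int_1^{\infty} \frac{1}{z}\,dz = \left[\ln(z)\right]_1^{\infty} = \infty$$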
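The divergence can be seen by truncating the integrals at a finite upper limit b and letting b grow. In the sketch below the cutoffs are arbitrary illustrative choices, and the closed forms ln(b) and b − 1 come from integrating 1/z and 1 over [1, b].

```python
import math


def truncated_mean(b):
    """Integral of z * (1/z^2) over [1, b], which equals ln(b)."""
    return math.log(b)


def truncated_second_moment(b):
    """Integral of z^2 * (1/z^2) over [1, b], which equals b - 1."""
    return b - 1.0


for b in (1e1, 1e3, 1e6, 1e12):
    print(
        f"b={b:.0e}: mean part={truncated_mean(b):.3f}, "
        f"second-moment part={truncated_second_moment(b):.3e}"
    )
```

Neither quantity settles toward a finite limit: the mean part grows slowly (logarithmically) and the second-moment part grows linearly in b.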
Actually, the mean and variance are undefined (since they are infinite)! If the integral for E[X] did not converge, then the integral for E[X^2] had no chance either (try it)! For the cdf, we know that
$$F_X(t) = P(X \le t) = \int_{-\infty}^{t} f_X(y)\,dy$$
We actually have two cases. If t < 1, FX (t) = 0 since there's no way to get a number less than 1 (the range is ΩX = [1, ∞)). For t ≥ 1, we just do a normal integral to get that
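$$F_X(t) = P(X \le t) = \int_{-\infty}^{t} f_X(s)\,ds = \int_{-\infty}^{1} f_X(s)\,ds + \int_1^t f_X(s)\,ds = \int_1^t \frac{1}{s^2}\,ds = -\left[\frac{1}{s}\right]_1^t = -\left(\frac{1}{t} - 1\right) = 1 - \frac{1}{t}$$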
Putting this together gives:
$$F_X(t) = \begin{cases} 0 & t < 1 \\ 1 - \frac{1}{t} & t \ge 1 \end{cases}$$
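A short sketch of this CDF in code; the evaluation points are illustrative choices, not from the text.

```python
def cdf(t):
    """CDF of the density 1/x^2 on [1, infinity): 0 below 1, and 1 - 1/t from 1 onward."""
    return 0.0 if t < 1.0 else 1.0 - 1.0 / t


# Half of the probability mass lies in [1, 2], even though the range is unbounded.
print(f"F_X(2) = {cdf(2.0)}")

# The tail P(X > t) = 1/t shrinks slowly, which is why the mean diverges.
for t in (10.0, 100.0, 1000.0):
    print(f"P(X > {t}) = {1.0 - cdf(t):.4f}")
```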