Lecture Notes on Random Variable - Basic Statistical Method | ISYE 2028, Study notes of Data Analysis & Statistical Methods

Georgia Institute of Technology - Main Campus Data Analysis & Statistical Methods

Prof. Kobi Abayomi

Material Type: Notes; Professor: Abayomi; Class: Basic Statistical Meth; Subject: Industrial & Systems Engr; University: Georgia Institute of Technology-Main Campus; Term: Spring 2009;

Typology: Study notes

Pre 2010

Uploaded on 08/04/2009


ISYE 2028 A and B

Lecture 4

Dr. Kobi Abayomi

January 20, 2009

1 Introduction - Continuous Random Variables

We call a random variable continuous if it has an uncountable number of values; if it can take all values in an interval of values.

Examples of continuous random variables: Survival time of drinkers of Smoke-Colar; Time to recidivism for parolee of Savings and Loan Scandal; Amount of weight lost. Etc. Etc.

That the definition of continuous closely matches the version we use in single variable calculus is natural and should make us feel good.

We can extend what we’ve said already about discrete random variables, using $\sum$’s, to say analogous things about continuous random variables, using $\int$’s. Remember that the integral, $\int$, is just the limit of $\sum$, as the discrete index goes to be an infinitesimal.^1

2 Probability Distribution of a Random Variable

Let’s extend the definition of the probability distribution to the continuous case by first restating that the distribution is the complete specification of values of the random variable with assigned probabilities. In the discrete case we could use this heuristic to write down a function or a table. In the continuous case, the distribution of the random variable is explicitly functional.

^1 In Leibniz’s view of the calculus. Now would be a good time to break out your Calc I textbook, if you need to.


Here is an explicit definition of a continuous probability distribution or probability density function (pdf):

For $X$ a continuous random variable, the pdf of $X$ is the function $f(x)$ such that:

$$P(a \le X \le b) = \int_a^b f(x)\,dx \tag{1}$$

We call $f(x)$ the density curve for $X$. We can also restate some of our probability rules using this new definition.

For $X$ a continuous random variable, on the real line, with density function $f(x)$:

$\int_{-\infty}^{x} f(u)\,du = F(x)$. $F(x)$ is called the distribution function for $X$. $F(x) = P(X \le x)$, or the probability that the random variable $X$ is less than or equal to $x$.

$\int_{-\infty}^{+\infty} f(x)\,dx = F(+\infty) - F(-\infty) = 1 - 0 = 1$. Pay attention to nuance here: The distribution function of $X$ is 1 at infinity. Every value of $X$ is less than or equal to infinity. There is an analogous argument for $F(-\infty) = 0$. And I point out that the area under the density curve must equal 1.

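For all $x \in \,]-\infty, +\infty[$, $f(x) \ge 0$. The density itself may exceed 1 on a short interval (for example, $f(x) = 2$ on $[0, 1/2]$); it is the distribution function that satisfies $0 \le F(x) \le 1$.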

2.1 Example

Say we have an interval $A = \{x : 0 \le x \le 2\}$ where we observe a real valued random variable, $X$. Say we believe the distribution function is of some form $F(x) = cx^2$, with $c$ a constant. We can immediately determine $c$: since $F(2) = 1 = 4c \rightarrow c = 1/4$.

As well, $F(x) = \frac{x^2}{4} = \int_0^x f(u)\,du \rightarrow f(x) = \frac{x}{2}$.

Then the probabilities for any interval, for example

$$P(\tfrac{1}{4} \le X \le \tfrac{1}{2}) = \int_{1/4}^{1/2} \frac{u}{2}\,du = F(1/2) - F(1/4) = 3/64.$$
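The arithmetic in this example can be checked with a short script. The sketch below is an illustration rather than part of the original notes: the helper names `cdf`, `pdf` and `midpoint_integral` are made up here, and the density is assumed to be zero outside $[0, 2]$. It computes the interval probability once exactly from $F$ and once as an area under $f$.

```python
from fractions import Fraction


def cdf(x):
    """Distribution function F(x) = x^2 / 4 on [0, 2] (0 below, 1 above)."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 2:
        return Fraction(1)
    return x * x / 4


def pdf(x):
    """Density f(x) = x / 2 on [0, 2], the derivative of F; zero elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0


def midpoint_integral(func, lo, hi, steps=1000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


# Exact probability from the distribution function: F(1/2) - F(1/4).
p_exact = cdf(Fraction(1, 2)) - cdf(Fraction(1, 4))

# The same probability as the area under the density between 1/4 and 1/2.
p_numeric = midpoint_integral(pdf, 0.25, 0.5)

print(p_exact)      # 3/64
print(p_numeric)    # close to 3/64 = 0.046875
```

Using `Fraction` for $F$ keeps the first result exact, so it can be compared directly with the hand calculation.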

2.2 Features

The property of the complement yields:

$$\bar{F}(x) \equiv P(X > x) = 1 - P(X \le x) = 1 - F(x) \tag{2}$$

$$E(X) = \int_0^2 \frac{x^2}{2}\,dx = 8/6$$

In general:

$$E(h(X)) = \int_{\mathbb{R}} h(x) f(x)\,dx \tag{7}$$

for any function, $h$.

Example:

Say $X \sim f(x) = \frac{x}{2}$ for $0 \le x \le 2$.

Then

$$E(2X) = \int_0^2 2x \cdot \frac{x}{2}\,dx = 2 \cdot 8/6$$

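Additionally, this equation, known as the layered representation, holds for non-negative random variables.

$$E(X) = \int_{\mathbb{R}^+} (1 - F(x))\,dx \tag{8}$$

Example:

Say $X \sim f(x) = \frac{x}{2}$ for $0 \le x \le 2$, so that $F(x) = \frac{x^2}{4}$.

Then

$$E(X) = \int_0^2 (1 - F(x))\,dx = \int_0^2 \left(1 - \frac{x^2}{4}\right) dx = 2 - \frac{8}{12} = 8/6$$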
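The expectation formulas above can be compared numerically for the running example $f(x) = x/2$ on $[0, 2]$. The following is a minimal sketch, not taken from the notes: `midpoint_integral` and `expectation` are hypothetical helper names, and a plain midpoint rule stands in for the integrals.

```python
def midpoint_integral(func, lo, hi, steps=100_000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


def f(x):
    """Density f(x) = x / 2 on [0, 2]."""
    return x / 2


def F(x):
    """Distribution function F(x) = x^2 / 4 on [0, 2]."""
    return x * x / 4


def expectation(h):
    """E(h(X)) as the integral of h(x) f(x) over [0, 2], as in equation (7)."""
    return midpoint_integral(lambda x: h(x) * f(x), 0.0, 2.0)


mean_direct = expectation(lambda x: x)        # integral of x * f(x)
mean_doubled = expectation(lambda x: 2 * x)   # E(2X)
mean_layered = midpoint_integral(lambda x: 1 - F(x), 0.0, 2.0)  # equation (8)

# The three values should be close to 4/3, 8/3 and 4/3 respectively.
print(mean_direct, mean_doubled, mean_layered)
```

The direct and layered computations agree (8/6 = 4/3), and doubling the variable doubles the expectation, which is the linearity the notes rely on later.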

3.2 Population Variance

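$$\sigma^2 = E[(X - \mu)^2] = \int (x - \mu)^2 f(x)\,dx = \mathrm{Var}(X) \tag{9}$$

Again, this can be reduced to:

$$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \mu^2 \tag{10}$$

Example:

Say $X \sim f(x) = \frac{x}{2}$ for $0 \le x \le 2$.

Then

$$\mathrm{Var}(X) = \int_0^2 (x - 8/6)^2 \cdot \frac{x}{2}\,dx = E(X^2) - \mu^2 = 2 - \frac{16}{9} = 2/9$$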
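Since the integrands here are polynomials, the variance of the running example $f(x) = x/2$ on $[0, 2]$ can be computed exactly. The sketch below is an illustration only: `integrate_poly` and `moment` are hypothetical helpers, and the script evaluates both the definition (9) and the shortcut (10). Both give $2/9$.

```python
from fractions import Fraction


def integrate_poly(coeffs, lo, hi):
    """Exact integral over [lo, hi] of the polynomial sum(coeffs[k] * x**k)."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(
        Fraction(c) * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
        for k, c in enumerate(coeffs)
    )


def moment(k):
    """E(X^k) for f(x) = x / 2 on [0, 2]: the integrand is x^(k+1) / 2."""
    coeffs = [0] * (k + 1) + [Fraction(1, 2)]
    return integrate_poly(coeffs, 0, 2)


mu = moment(1)                       # population mean
var_shortcut = moment(2) - mu ** 2   # equation (10)

# Equation (9): (x - mu)^2 * x / 2 = (mu^2 x - 2 mu x^2 + x^3) / 2.
var_definition = integrate_poly([0, mu ** 2 / 2, -mu, Fraction(1, 2)], 0, 2)

print(mu, var_shortcut, var_definition)   # 4/3 2/9 2/9
```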

4 General Joint Distributions

Two given random variables $X$ and $Y$ have a general, joint distribution that is an extension of the single variable definition. In the discrete case

$$P((X, Y) = (x, y)) = p(x, y) \tag{11}$$

the joint probability mass function. In the continuous case

$$P((X, Y) = (x \pm \epsilon, y \pm \epsilon)) = f(x, y) \tag{12}$$

the joint probability density function, read heuristically as the probability per unit area in a small neighborhood of $(x, y)$.

We generate the marginal distributions for $X$ and $Y$ alone just as we did for contingency tables, by summing (or integrating) over all values of the other variable.

$$p_X(x) = \sum_y p(x, y); \quad p_Y(y) = \sum_x p(x, y) \tag{13}$$

$$f_X(x) = \int_{\mathbb{R}} f(x, v)\,dv; \quad f_Y(y) = \int_{\mathbb{R}} f(u, y)\,du \tag{14}$$

Two random variables are independent if

$$p_{X,Y}(x, y) = p_X(x)\,p_Y(y) \tag{15}$$

Again same as the discrete case.
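For a discrete joint distribution, the marginals and the independence condition are easy to compute from a table. The sketch below uses two small, hypothetical joint pmfs (they do not come from the notes) and made-up helper names `marginals` and `is_independent`.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def marginals(joint):
    """Return (p_X, p_Y) by summing the joint pmf over the other variable."""
    p_x, p_y = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), prob in joint.items():
        p_x[x] += prob
        p_y[y] += prob
    return dict(p_x), dict(p_y)


def is_independent(joint):
    """Check p(x, y) == p_X(x) * p_Y(y) for every pair of marginal values."""
    p_x, p_y = marginals(joint)
    return all(
        joint.get((x, y), Fraction(0)) == p_x[x] * p_y[y]
        for x, y in product(p_x, p_y)
    )


# Hypothetical table whose joint pmf factors into its marginals.
independent_table = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

# Hypothetical table where Y always equals X, so the pmf cannot factor.
dependent_table = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(is_independent(independent_table))   # True
print(is_independent(dependent_table))     # False
```

Pairs missing from a table are treated as having probability zero, which is what makes the second table fail the product test.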

6 Expectation and Covariance

The main idea is that expectation is a linear operator and that the expectation of a function is the expectation taken over the values of the function.

In a natural extension to two dimensions:

$$E(g(X, Y)) = \begin{cases} \sum_y \sum_x g(x, y)\,p(x, y), & x, y \text{ discrete} \\ \int\!\!\int g(x, y)\,f(x, y)\,dx\,dy, & x, y \text{ cont.} \end{cases}$$

6.1 Example

We get the moments we use in calculation of mean and variance, etc. by choosing the function we take an expectation of.

$g_1(x) = x \longrightarrow E(g_1(X)) = \mu_X$

$g_2(x, y) = xy \longrightarrow E(g_2(X, Y)) = E(XY)$

$g_3(y) = (y - \mu_Y)^2 \longrightarrow E(g_3(Y)) = E((Y - \mu_Y)^2) = \mathrm{Var}(Y)$

7 Covariance

Let $g(X, Y) = [X - \mu_X][Y - \mu_Y]$. Then:

$$\begin{aligned} E(g(X, Y)) &= E([X - \mu_X][Y - \mu_Y]) \\ &= E(XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y) \\ &= E(XY) - \mu_Y E(X) - \mu_X E(Y) + \mu_X \mu_Y \\ &= E(XY) - \mu_X \mu_Y \end{aligned}$$

This expectation has a special name, the covariance of $X, Y$. So the covariance of $X, Y$ is

$$\mathrm{Cov}(X, Y) = E([X - \mu_X][Y - \mu_Y]) = E(XY) - \mu_X \mu_Y \tag{22}$$

7.1 Properties of Covariance

7.1.1 Covariance can be negative

$\mathrm{Cov}(X, Y) \in \mathbb{R}$

Note that, by contrast, $\mathrm{Var}(X) = \mathrm{Cov}(X, X) \ge 0$.

7.1.2 Independence implies zero Covariance

If $X \perp Y$ then $E(XY) = \mu_X \mu_Y$, so

$$\mathrm{Cov}(X, Y) = E(XY) - \mu_X \mu_Y = \mu_X \mu_Y - \mu_X \mu_Y = 0$$

but!

7.1.3 Zero Covariance does not imply independence

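The fact here is $\mathrm{Cov}(X, Y) = 0 \nRightarrow X \perp Y$.

For an example, take $X, Y$ with this distribution:

$$P(X = 0) = P(X = 1) = P(X = -1) = \frac{1}{3}$$

$$Y = \begin{cases} 0, & X \ne 0 \\ 1, & X = 0 \end{cases}$$

Thus $E(X) = 0$ and $E(XY) = 0$, so $\mathrm{Cov}(X, Y) = 0$, but $Y$ is obviously a function of $X$.

In general, for many $Y = g(X)$, where $g$ is symmetric (about zero, for instance) and $X$ is distributed symmetrically about the same point, $\mathrm{Cov}(X, Y) = 0$ but $X$ is, of course, not independent of $Y = g(X)$.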
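The counterexample above is small enough to verify exactly. In the sketch below (an illustration, with a hypothetical helper `expect`), $X$ takes the values $-1, 0, 1$ with equal probability and $Y = 1$ exactly when $X = 0$; the covariance comes out to zero even though the joint probability does not factor.

```python
from fractions import Fraction

third = Fraction(1, 3)

# Joint pmf of (X, Y): X is uniform on {-1, 0, 1} and Y = 1 exactly when X = 0.
joint = {(-1, 0): third, (0, 1): third, (1, 0): third}


def expect(g):
    """E(g(X, Y)) as a sum over the joint pmf."""
    return sum(g(x, y) * prob for (x, y), prob in joint.items())


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
covariance = expect(lambda x, y: x * y) - mean_x * mean_y

# Independence would require P(X = 0, Y = 1) = P(X = 0) * P(Y = 1).
p_joint = joint[(0, 1)]
p_product = third * mean_y   # P(Y = 1) equals E(Y) because Y is 0 or 1

print(covariance)            # 0
print(p_joint, p_product)    # 1/3 1/9
```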

7.1.4 Covariance is symmetric

Cov(X, Y ) = Cov(Y, X)

8 Correlation Coefficient

The number $\rho$, which we introduced as a parameter to the multivariate normal distribution, is called the correlation coefficient

$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\sigma_X^2 \sigma_Y^2}} = \frac{E([X - \mu_X][Y - \mu_Y])}{\sqrt{E([X - \mu_X]^2)\,E([Y - \mu_Y]^2)}}$$

Fact (a version of the Cauchy-Schwarz inequality):

$$|E([X - \mu_X][Y - \mu_Y])| \le \sqrt{E([X - \mu_X]^2)\,E([Y - \mu_Y]^2)}$$

so $\rho \in [-1, 1]$.

Notice:

$$E(XY) = \mu_X \mu_Y + \rho \sigma_X \sigma_Y$$

since $\mathrm{Cov}(X, Y) = \rho \sigma_X \sigma_Y$.

8.1 Properties of ρ

Let $X, Y \sim f_{X,Y}$ with the conditional distribution of $Y | X = x$: $f_{Y|X} = \frac{f_{X,Y}}{f_X}$.

Then

$$E(Y | X = x) = \int y\, f_{Y|X}\,dy = \frac{\int y\, f_{X,Y}\,dy}{f_X}$$

Remember that the expected value of $Y$ given $X = x$ depends upon the observed value of $X$, $x$; viewed as a function of $X$ it is itself a random variable. Say this expected value is a linear function: set $E(Y | X = x) = a + bx$. Call this equation (++).

Let’s derive a general result for the conditional expectation when it is constrained to be a linear function, i.e. let’s solve for constants $a$ and $b$.

If we multiply both sides of (++) by $f_X(x)$ and integrate with respect to $x$, we get:

$$\mu_Y = a + b\mu_X$$

Now multiply both sides of (++) by $x f_X(x)$ and integrate with respect to $x$; this yields:

$$E(XY) = a\mu_X + bE(X^2)$$

Realizing that $E(X^2) = \sigma_X^2 + \mu_X^2$, with the two above equations (for the two unknowns, $a$ and $b$), the result is:

$$E(Y | X = x) = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x - \mu_X) \tag{24}$$

N.B.: This is the same as the conditional expectation for the bivariate normal distribution. This suggests a role for the normal distribution in linear conditional expectation. Notice that in equation (24) the expectation is simply μY if ρ = 0. For the bivariate normal distribution, this is equivalent to X ⊥ Y.
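The step from the two moment equations to the slope $\rho\,\sigma_Y / \sigma_X$ can be traced with numbers. The sketch below is only an illustration: the function name `linear_conditional_mean` and all moment values are hypothetical, chosen so the arithmetic is easy to follow.

```python
def linear_conditional_mean(mu_x, mu_y, e_xx, e_xy):
    """Solve mu_Y = a + b mu_X and E(XY) = a mu_X + b E(X^2) for (a, b)."""
    var_x = e_xx - mu_x ** 2           # E(X^2) = sigma_X^2 + mu_X^2
    b = (e_xy - mu_x * mu_y) / var_x   # Cov(X, Y) / sigma_X^2
    a = mu_y - b * mu_x
    return a, b


# Hypothetical moments, chosen only for illustration.
mu_x, mu_y = 1.0, 2.0
sigma_x, sigma_y, rho = 2.0, 3.0, 0.5
e_xx = sigma_x ** 2 + mu_x ** 2
e_xy = mu_x * mu_y + rho * sigma_x * sigma_y   # E(XY) = mu_X mu_Y + rho sigma_X sigma_Y

a, b = linear_conditional_mean(mu_x, mu_y, e_xx, e_xy)
slope_from_rho = rho * sigma_y / sigma_x

x = 3.0
line_value = a + b * x
formula_value = mu_y + slope_from_rho * (x - mu_x)

print(a, b, slope_from_rho)        # 1.25 0.75 0.75
print(line_value, formula_value)   # 3.5 3.5
```

Subtracting $\mu_X$ times the first equation from the second leaves $\mathrm{Cov}(X, Y) = b\,\sigma_X^2$, which is why the slope depends only on the covariance and the variance of $X$.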

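Moreover, if $Y = aX + b$ with $a \ne 0$, then $\mathrm{Cov}(X, Y) = a\sigma_X^2$ and

$$\rho = \frac{a\sigma_X^2}{\sqrt{\sigma_X^2 \cdot a^2 \sigma_X^2}} = \frac{a\sigma_X^2}{|a|\sigma_X^2} = 1 \cdot \mathrm{sgn}(a)$$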
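A quick numerical check of this property: for an exactly linear relationship the correlation is $+1$ or $-1$ according to the sign of the slope. The values of $X$ below are hypothetical and equally weighted, and `correlation` is a helper written for this sketch.

```python
import math


def correlation(pairs):
    """Correlation coefficient of equally likely (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return cov / math.sqrt(var_x * var_y)


xs = [0.0, 1.0, 2.0, 5.0]   # hypothetical values of X, each with probability 1/4

rho_up = correlation([(x, 3 * x + 1) for x in xs])      # slope a = 3
rho_down = correlation([(x, -2 * x + 7) for x in xs])   # slope a = -2

print(rho_up, rho_down)   # approximately 1.0 and -1.0
```

The intercept has no effect on either value, since shifting $Y$ by a constant changes neither the covariance nor the variances.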

8.2 Variance, Again!

This is important for a general equation for the variance of linear transforms, $aX + bY$:

$$\begin{aligned} \mathrm{Var}(aX + bY) &= E[(aX + bY)^2] - [E(aX + bY)]^2 \\ &= E(a^2 X^2 + 2abXY + b^2 Y^2) - [a\mu_X + b\mu_Y]^2 \\ &= E(a^2 X^2) + E(b^2 Y^2) + 2abE(XY) - a^2 \mu_X^2 - 2ab\mu_X \mu_Y - b^2 \mu_Y^2 \\ &= a^2 E(X^2) - a^2 \mu_X^2 + b^2 E(Y^2) - b^2 \mu_Y^2 + 2ab(E(XY) - \mu_X \mu_Y) \end{aligned}$$

$$\mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y) \tag{25}$$
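The identity for the variance of $aX + bY$ can be confirmed exactly on any small joint distribution. The sketch below uses a hypothetical joint pmf and hypothetical constants $a = 2$, $b = -3$; `expect` and `variance` are helper names introduced for the illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf for (X, Y); the probabilities sum to 1.
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 3),
    (2, 0): Fraction(1, 6),
}


def expect(g):
    """E(g(X, Y)) as a sum over the joint pmf."""
    return sum(g(x, y) * prob for (x, y), prob in joint.items())


def variance(g):
    """Var(g(X, Y)) = E(g^2) - [E(g)]^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2


a, b = 2, -3
cov_xy = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Left side: variance of the combination computed directly.
lhs = variance(lambda x, y: a * x + b * y)

# Right side: a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
rhs = (a ** 2 * variance(lambda x, y: x)
       + b ** 2 * variance(lambda x, y: y)
       + 2 * a * b * cov_xy)

print(lhs == rhs)   # True
```

Because the arithmetic uses `Fraction`, the two sides match exactly rather than up to rounding.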

That is, for any interval $[t_o, t_f]$ in $[a, b]$ the probability $P(t_o \le X \le t_f) = c(t_f - t_o)$, where $c$ is our (usual) constant of proportionality. If we take the entire interval we can solve for $c$: $1 = P(a \le X \le b) = c(b - a)$ implies $c = \frac{1}{b - a}$, which yields the uniform pdf.

The cumulative distribution function is generated in the usual way

$$F_X(x) = \int_a^x \frac{1}{b - a}\,dt = \frac{x - a}{b - a}$$

and in notation we call $X \sim U[a, b]$. In words, "X is uniformly distributed between a and b". The pdf is:

$$f_X(x) = \frac{1}{b - a}, \quad a \le x \le b$$

and zero otherwise.

I leave it to you to convince yourself of $E(X)$ and $\mathrm{Var}(X)$ for the uniform distribution. That’s just figuring out these integrals...

$$E(X) = \int_a^b \frac{x}{b - a}\,dx = \frac{b + a}{2}$$

$$\mathrm{Var}(X) = \int_a^b \frac{(x - E(X))^2}{b - a}\,dx = \frac{(b - a)^2}{12}$$

It is not too hard.
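Before working the integrals by hand, it can help to see the closed forms agree with a numerical integration. The sketch below assumes hypothetical endpoints $a = 2$, $b = 5$ and uses a simple midpoint rule (the helper `midpoint_integral` is not from the notes); it is a sanity check, not a derivation.

```python
def midpoint_integral(func, lo, hi, steps=100_000):
    """Approximate the integral of func over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


a, b = 2.0, 5.0            # hypothetical endpoints of the interval
density = 1.0 / (b - a)    # uniform pdf on [a, b]

total = midpoint_integral(lambda x: density, a, b)
mean = midpoint_integral(lambda x: x * density, a, b)
variance = midpoint_integral(lambda x: (x - mean) ** 2 * density, a, b)

# Closed forms for comparison.
mean_formula = (b + a) / 2
variance_formula = (b - a) ** 2 / 12

print(total)                        # close to 1
print(mean, mean_formula)           # both close to 3.5
print(variance, variance_formula)   # both close to 0.75
```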

11 Exercises

Convince yourself of the expectation and variance for a uniformly distributed random variable on interval $[a, b]$.
Verify $\mathrm{Cov}(aX + b, Y) = a\,\mathrm{Cov}(X, Y)$.
Verify $\mathrm{Cov}(\sum_i X_i, \sum_j Y_j) = \sum_i \sum_j \mathrm{Cov}(X_i, Y_j)$.
Let $X$ be a random variable having finite expectation $\mu$ and variance $\sigma^2$. Let $g(\cdot)$ be a twice differentiable function. Show that $E[g(X)] \approx g(\mu) + \frac{g''(\mu)}{2}\sigma^2$. Hint: Expand $g(\cdot)$ in a Taylor series about $\mu$. Now use this to suggest an approach for $\mathrm{Var}(g(X))$.
