ECE 313 Lecture 14: Conditional Probability & Chain Rule in Engineering Applications

A set of lecture slides from the University of Illinois at Urbana-Champaign's Department of Electrical and Computer Engineering for ECE 313: Probability with Engineering Applications, Fall 2000. The slides cover conditional probability, the chain rule, the theorem of total probability, and the conditional pmf, with examples and formulas.

Uploaded on 03/10/2009

koofers-user-hp0
ECE 313 — Probability with Engineering Applications Fall 2000
Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign 14.1
ECE 313 - Lecture 14 © 2000 Dilip V. Sarwate, University of Illinois at Urbana-Champaign, All Rights Reserved Slide 1 of 38
Conditional probability
• Given that event A of probability P(A) > 0 occurred, the conditional probability of B given A is denoted by P(B|A) and defined as P(B|A) = P(AB)/P(A)
• P(B|A) can be larger than, smaller than, or the same as P(B)
• Conditional probabilities satisfy the axioms of probability theory

The chain rule or product rule
• P(B|A) = P(AB)/P(A)
• P(AB) = P(B|A)P(A)
  P(ABCD…) = P(A)P(B|A)P(C|AB)P(D|ABC)…
• The chain rule also applies to conditional probabilities given an event H (say)
  P(ABCD…|H) = P(A|H)P(B|AH)P(C|ABH)P(D|ABCH)…

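As a rough illustration of the chain rule, the sketch below uses a hypothetical card-dealing example that is not part of the slides: three cards are dealt without replacement from a 52-card deck, and A, B, C are the events that the first, second, and third card is an ace. The product of conditional probabilities is compared with a direct count.

```python
from fractions import Fraction
from math import comb

# Hypothetical example (not from the slides): deal three cards without
# replacement; A, B, C = first, second, third card is an ace.
p_A = Fraction(4, 52)            # P(A): 4 aces among 52 cards
p_B_given_A = Fraction(3, 51)    # P(B|A): 3 aces left among 51 cards
p_C_given_AB = Fraction(2, 50)   # P(C|AB): 2 aces left among 50 cards

# Chain rule: P(ABC) = P(A) P(B|A) P(C|AB)
p_ABC_chain = p_A * p_B_given_A * p_C_given_AB

# Direct count: choose 3 of the 4 aces out of all 3-card hands
p_ABC_count = Fraction(comb(4, 3), comb(52, 3))

print(p_ABC_chain, p_ABC_count)  # 1/5525 1/5525
```

Exact fractions are used so that the two answers can be compared without rounding error.
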
The theorem of total probability I
• The theorem of total probability allows us to compute unconditional probabilities from conditional probabilities
  P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
  P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
• The theorem also applies to conditional probabilities
  P(A|C) = P(A|BC)P(B|C) + P(A|B^c C)P(B^c|C)

The theorem of total probability II
Given a countable partition A_1, A_2, …, A_n, … of the sample space,
  P(B) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + … + P(B|A_n)P(A_n) + …
  P(B|C) = P(B|A_1 C)P(A_1|C) + P(B|A_2 C)P(A_2|C) + … + P(B|A_n C)P(A_n|C) + …

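The partition form of the theorem can be checked by brute-force enumeration. The sketch below assumes a hypothetical fair six-sided die (not an example from the slides), with B = "the roll is prime" and the partition A_1 = {1, 2}, A_2 = {3, 4}, A_3 = {5, 6}; the helper names prob and cond_prob are made up for this illustration.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
omega = set(range(1, 7))

def prob(event):
    """P(event) for equally likely outcomes in omega."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(b, a):
    """P(b|a) = P(ab)/P(a); requires P(a) > 0."""
    return prob(b & a) / prob(a)

B = {2, 3, 5}                          # the roll is prime
partition = [{1, 2}, {3, 4}, {5, 6}]   # A_1, A_2, A_3

# Theorem of total probability: P(B) = sum over i of P(B|A_i) P(A_i)
total = sum(cond_prob(B, A_i) * prob(A_i) for A_i in partition)

print(total, prob(B))  # 1/2 1/2
```
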
Checking the answers
• min_i P(B|A_i) ≤ P(B) ≤ max_j P(B|A_j)
• In particular,
  P(B) ≥ min{P(B|A), P(B|A^c)}
  P(B) ≤ max{P(B|A), P(B|A^c)}
• Equality holds if and only if P(A) = 0 or P(A) = 1 or P(B|A) = P(B|A^c)
• If P(A) = 1/2, P(B) = [P(B|A) + P(B|A^c)]/2

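This sanity check is easy to automate. The sketch below uses the numbers from the two-box example that appears later in this lecture (P(G|A) = 2/5, P(G|A^c) = 3/5, P(A) = 2/5); the helper name within_bounds is an assumption made for the illustration.

```python
from fractions import Fraction

def within_bounds(p_b, conditionals):
    """True if min P(B|A_i) <= P(B) <= max P(B|A_j)."""
    return min(conditionals) <= p_b <= max(conditionals)

# Numbers taken from the two-box example later in the lecture.
p_G_given_A = Fraction(2, 5)
p_G_given_Ac = Fraction(3, 5)
p_A = Fraction(2, 5)

# Theorem of total probability
p_G = p_G_given_A * p_A + p_G_given_Ac * (1 - p_A)

ok = within_bounds(p_G, [p_G_given_A, p_G_given_Ac])
print(p_G, ok)  # 13/25 True
```

A computed P(B) that falls outside the range of the conditional probabilities signals an arithmetic error somewhere in the calculation.
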
Why is all this stuff important?
• The chain rule or product rule allows us to compute a joint probability (probability of an intersection) as the product of various conditional probabilities
• The theorem of total probability allows us to find an unconditional probability from conditional probabilities
• These results are very important and very useful tools in probabilistic analyses

Conditional pmf of X

• The pmf of a discrete random variable X tells us the probabilities with which we observe X taking on various values
• When partial knowledge is available about the outcome, we should update the pmf probabilities from their original values to the corresponding conditional probabilities
• This updated or modified pmf is called the conditional pmf of X

Definition of conditional pmf of X

• The pmf of a discrete random variable X taking on values u_1, u_2, … is given by p_X(u_i) = P{X = u_i}, i = 1, 2, …
• Given that event A (with P(A) > 0) has occurred, the conditional pmf of X is
  p_{X|A}(u_i|A) = P{X = u_i|A}, i = 1, 2, …
  where the right side now has conditional probabilities given that A occurred
• P{X = u_i|A} = P[{X = u_i} ∩ A]/P(A)

Some thoughts on conditional pmfs

• In defining the pmf of X, it was assumed that values u_1, u_2, … occur with nonzero probabilities, i.e. p_X(u_i) > 0 for each u_i
• Given that the event A occurred, it might be that certain values u_i of X can never occur under these conditions
• {X = u_i} ∩ A = ∅
• In the conditional pmf of X, p_{X|A}(u_i|A) = 0 for such values u_i

Example: conditional pmf

• X has value 1, 2, 3, and 4 respectively with probabilities 1/2, 1/4, 1/8, and 1/8
• Let A = {X is an even number}; P(A) = 1/4 + 1/8 = 3/8
• The conditional pmf of X is given by
  ▪ p_{X|A}(1|A) = 0
  ▪ p_{X|A}(2|A) = (1/4)/(3/8) = 2/3
  ▪ p_{X|A}(3|A) = 0
  ▪ p_{X|A}(4|A) = (1/8)/(3/8) = 1/3
• Note that the conditional probabilities 2/3 and 1/3 are in the same ratio as the unconditional probabilities 1/4 and 1/8
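
The example above can be reproduced with a few lines of code. This is a minimal sketch: the function name conditional_pmf and the dictionary representation of a pmf are choices made for the illustration, and the fourth probability is taken to be 1/8 so that the pmf sums to 1.

```python
from fractions import Fraction

def conditional_pmf(pmf, in_event):
    """Return the conditional pmf of X given the event {in_event(X) is True}.

    pmf maps each value u_i to P{X = u_i}; the event must have P(A) > 0.
    """
    p_event = sum(p for u, p in pmf.items() if in_event(u))
    if p_event == 0:
        raise ValueError("conditioning event has probability zero")
    # Values outside the event get conditional probability 0.
    return {u: (p / p_event if in_event(u) else Fraction(0))
            for u, p in pmf.items()}

# pmf of X from the example (fourth value assumed to be 1/8 so the pmf sums to 1)
pmf_X = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

cond = conditional_pmf(pmf_X, lambda u: u % 2 == 0)  # A = {X is even}
for u, p in cond.items():
    print(u, p)
```

The values in the event are simply rescaled by 1/P(A), which is why their ratios are unchanged.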

Graphical representation

[Figure: the pmf of X, and the conditional pmf of X given that X is an even number]

Conditional mean and variance

• Just as conditional probabilities are a probability measure, conditional pmfs are valid pmfs
• The conditional expectation of X is the expectation of X computed using the conditional pmf of X instead of the pmf
• E[X|A] = ∑_i u_i · p_{X|A}(u_i|A)
• Similarly, the conditional variance uses the conditional pmf instead of the pmf
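
As a small worked case, the sketch below computes the conditional mean and variance for the earlier example in which X is 1, 2, 3, 4 and A = {X is even}, so that the conditional pmf puts probability 2/3 on 2 and 1/3 on 4. The function names are assumptions made for the illustration.

```python
from fractions import Fraction

def mean(pmf):
    """E[X] for a pmf given as {value: probability}."""
    return sum(u * p for u, p in pmf.items())

def variance(pmf):
    """var(X) = E[X^2] - (E[X])^2 for a pmf given as {value: probability}."""
    m = mean(pmf)
    return sum(u * u * p for u, p in pmf.items()) - m * m

# Conditional pmf of X given A = {X is even}, from the earlier example
cond_pmf = {1: Fraction(0), 2: Fraction(2, 3), 3: Fraction(0), 4: Fraction(1, 3)}

cond_mean = mean(cond_pmf)      # E[X|A]
cond_var = variance(cond_pmf)   # var(X|A)
print(cond_mean, cond_var)      # 8/3 8/9
```

The same two functions apply to the unconditional pmf; only the pmf passed in changes.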

Example

• Given A, X is binomial with parameters (n, p)
• Given A^c, X is binomial with parameters (n, q)
• P(A) = 1/2
• q ≠ 1 – p
• P{X = k} = P{X = k|A}P(A) + P{X = k|A^c}P(A^c) = average of two binomial probabilities
• The unconditional pmf of X is a mess
• E[X] = ? E[X|A] = np and E[X|A^c] = nq
• E[X] = (np + nq)/2

Example (continued)

• Given A, X is binomial with parameters (n, p)
• Given A^c, X is binomial with parameters (n, q)
• P(A) = 1/2
• q ≠ 1 – p
• var(X|A) = np(1–p)
• var(X|A^c) = nq(1–q)
• The conditional mean is np or nq with equal probability
• var(conditional mean) = [n(p–q)/2]^2
• var(X) = [np(1–p) + nq(1–q)]/2 + [n(p–q)/2]^2
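
The mean and variance formulas for this mixture can be checked numerically. The sketch below builds the unconditional pmf as the average of two binomial pmfs and compares its mean and variance with the closed-form expressions; the values n = 6, p = 1/4, q = 1/2 are arbitrary illustrative choices, not taken from the slides.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """List of P{X = k}, k = 0..n, for a binomial(n, p) random variable."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Illustrative parameter values (assumed for this sketch)
n, p, q = 6, Fraction(1, 4), Fraction(1, 2)

# P{X = k} = P{X = k|A}P(A) + P{X = k|A^c}P(A^c) with P(A) = 1/2
mix = [(a + b) / 2 for a, b in zip(binomial_pmf(n, p), binomial_pmf(n, q))]

mean_direct = sum(k * pk for k, pk in enumerate(mix))
var_direct = sum(k * k * pk for k, pk in enumerate(mix)) - mean_direct**2

# Closed forms from the slides
mean_formula = (n * p + n * q) / 2
var_formula = (n * p * (1 - p) + n * q * (1 - q)) / 2 + (n * (p - q) / 2) ** 2

print(mean_direct, mean_formula)  # 9/4 9/4
print(var_direct, var_formula)    # 15/8 15/8
```

The variance is the average of the conditional variances plus the variance of the conditional mean, so it exceeds the average of the two binomial variances whenever p ≠ q.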

Comment

• The example is a very simple illustration of the use of conditional expectations
• In many important instances, conditional expectations are much easier to calculate than the unconditional expectations
• The theorem of total probability is the tool that we use to obtain unconditional expectations from conditional expectations

Bayes’ Formula … at last!

• Given that event A of probability P(A) > 0 occurred, the conditional probability of B given A is denoted by P(B|A) and defined as P(B|A) = P(AB)/P(A)
• What is P(A|B)?
  P(A|B) = P(AB)/P(B) = P(B|A)P(A)/P(B)
• Simplest version of Bayes’ formula

Many names; some wrong ’uns too

• Bayes’ formula P(A|B) = P(B|A)P(A)/P(B) is also called Bayes’ theorem, or Bayes’ lemma, or often (mistakenly) Bayes’ rule
• Bayes’ rule (discussed later) refers to a methodology for decision-making that is an extremely controversial topic among statisticians
• No controversy about Bayes’ formula

Is that all there is to it?

• Bayes’ formula P(A|B) = P(B|A)P(A)/P(B) is a very simple consequence of the definition of conditional probability
• In applications, P(B) is usually replaced by its equivalent expression as given by the theorem of total probability
• P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A^c)P(A^c)]

Look closely at the formula …

• P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A^c)P(A^c)]
• The numerator is always one of the terms in the denominator
• When is P(B|A)P(A) ≠ P(B|A)P(A)?
• When it is calculated twice by an ECE 313 student in a hurry on homework or exam!
• Save and re-use the appropriate term!

General version of Bayes’ formula

• When P(B) is obtained from the P(B|A_k)’s via the more general version of the theorem of total probability, the more general total probability appears in the denominator
• The numerator is still one of the terms in the denominator
• P(A_k|B) = P(B|A_k)P(A_k) / ∑_i P(B|A_i)P(A_i)
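
One way to follow the "save and re-use" advice in code is to compute each product P(B|A_i)P(A_i) once and use the same list for both the numerators and the denominator. The sketch below does this; the function name posteriors and the three-event partition with its numbers are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """General Bayes' formula for a finite partition A_1..A_n.

    priors[i] = P(A_i), likelihoods[i] = P(B|A_i).
    Returns the list of P(A_i|B); requires P(B) > 0.
    """
    # Each term P(B|A_i)P(A_i) is computed once and re-used below.
    joint = [l * p for l, p in zip(likelihoods, priors)]
    p_b = sum(joint)  # theorem of total probability
    if p_b == 0:
        raise ValueError("P(B) is zero; conditional probabilities undefined")
    return [j / p_b for j in joint]

# Hypothetical partition (made-up numbers, not from the slides)
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]

post = posteriors(priors, likelihoods)
print(post)
```

Because the numerators are exactly the terms summed in the denominator, the returned probabilities always add up to 1.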

Example of use of Bayes’ formula

• Example: Box I has 3 green and 2 red balls, while Box II has 2 green and 2 red balls. A ball is drawn at random from Box I and transferred to Box II. Then, a ball is drawn at random from Box II.
• G = event ball drawn from Box II is green
• A = event ball transferred is red
• P(G|A) = 2/5
• P(G|A^c) = 3/5
• P(A) = 2/5

Example (continued)

• G = event ball drawn from Box II is green
• A = event ball transferred is red
• P(G|A) = 2/5
• P(G|A^c) = 3/5
• P(A) = 2/5
• P(G) = P(G|A)P(A) + P(G|A^c)P(A^c) = 4/25 + 9/25 = 13/25
• P(A|G) = P(G|A)P(A)/P(G) = 4/13
• P(A^c|G) = P(G|A^c)P(A^c)/P(G) = 9/13
• Check: P(A^c|G) = 1 – P(A|G)
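
The two-box answer can also be confirmed by listing all equally likely (transferred ball, drawn ball) pairs. This is a sketch under the assumption that each of the 5 balls in Box I is equally likely to be transferred and each of the 5 balls then in Box II is equally likely to be drawn, giving 25 equally likely outcomes; the variable names are made up for the illustration.

```python
from fractions import Fraction

box1 = ["G", "G", "G", "R", "R"]   # Box I: 3 green, 2 red
box2 = ["G", "G", "R", "R"]        # Box II: 2 green, 2 red

# Enumerate every (transferred, drawn) pair; all 25 pairs are equally likely.
outcomes = [(moved, drawn) for moved in box1 for drawn in box2 + [moved]]
n = len(outcomes)

# G = ball drawn from Box II is green, A = ball transferred is red
p_G = Fraction(sum(1 for moved, drawn in outcomes if drawn == "G"), n)
p_AG = Fraction(sum(1 for moved, drawn in outcomes
                    if moved == "R" and drawn == "G"), n)

p_A_given_G = p_AG / p_G
p_Ac_given_G = 1 - p_A_given_G
print(p_G, p_A_given_G, p_Ac_given_G)  # 13/25 4/13 9/13
```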

Does this make any sense?

• P(A|G) = 4/13, P(A^c|G) = 9/13 > P(A|G)
• If a green ball is drawn from Box II, it is reasonable to assume (more likely) that a green ball was transferred over!
[Figure: Box I and Box II]

Another example

• Whenever there is a fire, a fire alarm rings with probability 1 – 10^–8. If there is no fire, the fire alarm does occasionally ring. The probability of such false alarms is 10 –
• Both these probabilities can be measured experimentally by the manufacturer (and touted in its advertising!)
• What the user is more concerned about is: when the fire alarm is ringing, is there actually a fire, or is it just a false alarm?

What comes next?

• In the next two lectures, we shall discuss the general notion of decision making in the face of uncertainty
  ▪ How to make decisions?
  ▪ What are the probabilities that our decisions are incorrect?
  ▪ What is the decision rule that minimizes the error probability?
  ▪ What decision rule minimizes the costs?

Summary

• We discussed the notion of the conditional pmf of a random variable and related it to the pmf via the theorem of total probability
• We discussed conditional means and variances and their relation to the mean and variance
• We discussed Bayes’ formula and some simple applications