Understanding Conditional Probability: Definition, Rules, and Applications - Prof. Dilip S, Study notes of Statistics

University of Illinois - Urbana-Champaign Statistics

Prof. Dilip Sarwate

An in-depth exploration of conditional probability: its definition, consistency with various models, axioms, rules, and applications. It covers the chain rule (or product rule), the theorem of total probability, and examples such as the birthday surprise problem. It also discusses the importance of conditional probabilities in probabilistic analyses.

Typology: Study notes

Pre 2010

Uploaded on 02/24/2010


ECE 313 — Probability with Engineering Applications Fall 2000

Department of Electrical and Computer Engineering

University of Illinois at Urbana-Champaign 13.1

ECE 313 - Lecture 13 © 2000 Dilip V. Sarwate, University of Illinois at Urbana-Champaign, All Rights Reserved Slide 1 of 39

Introduction

• The conditional probability of an event B given that event A occurred is our revised estimate of the chances that B occurred in light of partial knowledge of the outcome of the experiment, viz. knowing that A occurred
• To avoid trivialities, we assume that A, sometimes called the conditioning event, has nonzero probability

Definition of conditional probability

• The conditional probability of B given A is denoted by P(B|A)
• Read this as “the probability of B given A” or “the probability of B conditioned on A”
• Definition: If P(A) > 0, P(B|A) is defined as P(B|A) = P(AB)/P(A)
• P(B|A) can be larger than, smaller than, or the same as P(B)
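As a small illustration of the definition, the sketch below computes P(B|A) by counting on an assumed sample space of two fair dice. The events A, B, C, D and the helpers `prob` and `cond_prob` are hypothetical names chosen for this example and do not come from the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: the 36 equally likely outcomes of two fair dice.
omega = set(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) for a subset of omega with equally likely outcomes."""
    return Fraction(len(event), len(omega))


def cond_prob(b, a):
    """P(B|A) = P(AB)/P(A); only defined when P(A) > 0."""
    if not a:
        raise ValueError("conditioning event has zero probability")
    return prob(a & b) / prob(a)


A = {w for w in omega if w[0] == 6}          # first die shows 6
B = {w for w in omega if w[0] + w[1] >= 10}  # sum is at least 10
C = {w for w in omega if w[0] + w[1] <= 4}   # sum is at most 4
D = {w for w in omega if w[1] % 2 == 0}      # second die is even

print(prob(B), cond_prob(B, A))  # conditioning raises the probability
print(prob(C), cond_prob(C, A))  # conditioning lowers the probability
print(prob(D), cond_prob(D, A))  # conditioning leaves it unchanged
```

In this example P(B|A) = 1/2 is larger than P(B) = 1/6, P(C|A) = 0 is smaller than P(C) = 1/6, and P(D|A) = P(D) = 1/2, which covers the three cases named in the last bullet.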

Consistent with various models

• The definition of conditional probability is consistent with
    ◦ the classical approach to probability
    ◦ the relative frequency approach
• Conditional probabilities can also be discussed for events defined in terms of random variables
• P{X = k | X > n}? or P{X ≤ k | a < X < b}?

Geometric RVs are memoryless

• Let X denote a geometric random variable with parameter p
• For k > 0, P{X = k+r | X > r} = P{X = k}
• Given that the event {X > r} has occurred, that is, the first r trials ended in a “failure”, the probability that we need to wait for an additional k trials to observe the first success is the same as P{X = k}
• It’s as if the first r trials are forgotten!
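The memoryless property can be checked numerically. The sketch below assumes the convention that X counts trials up to and including the first success, so that P{X = k} = (1 – p)^(k–1)·p and P{X > r} = (1 – p)^r. The function names and the value p = 1/3 are illustrative assumptions.

```python
from fractions import Fraction


def geom_pmf(k, p):
    """P{X = k}: first success occurs on trial k (k >= 1)."""
    return (1 - p) ** (k - 1) * p


def geom_tail(r, p):
    """P{X > r}: the first r trials are all failures."""
    return (1 - p) ** r


def cond_pmf(k, r, p):
    """P{X = k + r | X > r}; for k > 0 the event {X = k + r} lies inside {X > r}."""
    return geom_pmf(k + r, p) / geom_tail(r, p)


p = Fraction(1, 3)  # assumed success probability, exact arithmetic
memoryless_holds = all(
    cond_pmf(k, r, p) == geom_pmf(k, p)
    for k in range(1, 8)
    for r in range(0, 8)
)
print(memoryless_holds)
```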

Binomial random variables

• Let X denote a binomial random variable with parameters (n, p)
• Given the event {X = k} has occurred, the conditional probability that the j-th trial resulted in a success is k/n, independent of the value of p
• The conditional probability of successes on the i-th and j-th trials is k(k–1)/[n(n–1)]
• and so on
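One way to convince yourself of the k/n and k(k–1)/[n(n–1)] results is brute-force enumeration over all success/failure sequences. This is a minimal sketch under the assumption of n independent trials with success probability p; `cond_success_prob` is a hypothetical helper and trial indices are 0-based.

```python
from fractions import Fraction
from itertools import product


def cond_success_prob(n, k, p, trials):
    """P(every trial in `trials` is a success | exactly k successes in n trials)."""
    numerator = Fraction(0)
    denominator = Fraction(0)
    for seq in product((0, 1), repeat=n):
        if sum(seq) != k:
            continue  # outcome lies outside the conditioning event {X = k}
        weight = p ** k * (1 - p) ** (n - k)  # same weight for every such sequence
        denominator += weight
        if all(seq[t] == 1 for t in trials):
            numerator += weight
    return numerator / denominator


n, k = 5, 2
for p in (Fraction(1, 3), Fraction(3, 4)):
    single = cond_success_prob(n, k, p, trials=[1])
    pair = cond_success_prob(n, k, p, trials=[0, 3])
    print(p, single, pair)
```

Because every sequence with exactly k successes carries the same weight, p cancels in the ratio, which is why the answer does not depend on p.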

Axioms are satisfied

• Conditional probabilities are a probability measure, that is, they satisfy the axioms of probability theory
• All the consequences of the axioms (rules of probability) also apply to conditional probabilities
• Caveat: Everything must be conditioned on the same event. No mixing and matching allowed



Rules? What rules?

• P(Ω|A) = 1
• P(∅|A) = 0
• P(B^c|A) = 1 – P(B|A)
• If B ⊂ C, then P(B|A) ≤ P(C|A)
• If BC = ∅, then P((B ∪ C)|A) = P(B|A) + P(C|A)
• More generally, P((B ∪ C)|A) = P(B|A) + P(C|A) – P(BC|A)
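These rules can be spot-checked on a small finite sample space. The sketch below uses an assumed two-dice model with every probability conditioned on the same event A; the particular events and the helper `cond` are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair dice, all 36 outcomes equally likely.
omega = frozenset(product(range(1, 7), repeat=2))


def cond(b, a):
    """P(B|A) by counting; requires a non-empty conditioning event A."""
    return Fraction(len(a & b), len(a))


A = frozenset(w for w in omega if w[0] + w[1] >= 7)  # conditioning event
B = frozenset(w for w in omega if w[0] == 6)
C = frozenset(w for w in omega if w[0] >= 5)         # B is a subset of C
E = frozenset(w for w in omega if w[1] == 1)         # overlaps B at (6, 1)

rules_hold = (
    cond(omega, A) == 1
    and cond(frozenset(), A) == 0
    and cond(omega - B, A) == 1 - cond(B, A)
    and cond(B, A) <= cond(C, A)
    and cond(B | E, A) == cond(B, A) + cond(E, A) - cond(B & E, A)
)
print(rules_hold)
```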

Left side versus right side

• An expression such as P((B ∪ C)|(A ∪ D)) is commonly written as P(B ∪ C|A ∪ D)
• Everything to the right of the vertical bar is the conditioning event; it is a single set
• Everything to the left of the vertical bar is the conditioned event; it is a single set
• Even if A, B, C, and D are disjoint, P(B ∪ C|A ∪ D) ≠ P(B) + P(C|A) + P(D)

Is that all there is to it?

• OK, so you can update your probabilities to conditional probabilities if you know that event A occurred
    ◦ Is that all there is to it?
    ◦ Is the notion of conditional probability just a one-trick pony?
    ◦ Surely life holds more than that?
• Actually, conditional probabilities are fundamental tools in probabilistic analyses

The chain rule or product rule

• P(B|A) = P(AB)/P(A)
• P(AB) = P(B|A)P(A)
• Note that P(AB) can also be expressed as P(A|B)P(B)
• The conditional probability P(B|A) can be used to compute the joint probability P(AB)
• The joint probability is the conditional probability P(B|A) times P(A), the probability of the conditioning event

Generalization of the chain rule

• More generally, P(ABCD…) = P(A)P(B|A)P(C|AB)P(D|ABC)…
• Product of first two terms is P(AB)
• P(C|AB)P(AB) = P(ABC), so that the product of the first three terms is P(ABC), and so on …
• For ABCD… to occur, A must occur, and if A has occurred, so must B (with probability P(B|A)); if both A and B, then C must …

Applications of the chain rule

• Example: A random sample of size k is drawn without replacement from the set {1, 2, … , n}. What is the probability that the sample is exactly {1, 2, 3, … , k–1, n}?
• Simple answer: There are C(n, k) equally likely subsets that could have been drawn, and so the desired probability is just 1/C(n, k), where C(n, k) = n!/[k!(n–k)!] is the binomial coefficient
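The same answer can be reached with the chain rule by conditioning each draw on the earlier ones. The sketch below is one possible way to set this up, not necessarily the route taken in the lecture: it multiplies the conditional probabilities that each successive draw lands in the target subset and compares the product with 1/C(n, k). The function name is a hypothetical choice.

```python
from fractions import Fraction
from math import comb


def prob_specific_sample(n, k):
    """Chain-rule probability that k draws without replacement from
    {1, ..., n} produce one particular k-element subset (in any order)."""
    prob = Fraction(1)
    for i in range(k):
        # Given that the first i draws all hit the target subset,
        # k - i wanted items remain among the n - i items still available.
        prob *= Fraction(k - i, n - i)
    return prob


for n, k in [(5, 2), (10, 3), (52, 5)]:
    print(n, k, prob_specific_sample(n, k), Fraction(1, comb(n, k)))
```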

Further generalization of the chain rule

• P(ABCD…) = P(A)P(B|A)P(C|AB)P(D|ABC)…
• Every probability result also applies to conditional probabilities
• The chain rule applies to computation of conditional probabilities by conditioning everything on the given event H (say)
• P(ABCD…|H) = P(A|H)P(B|AH)P(C|ABH)P(D|ABCH)…

It’s not just for breakfast any more!

• P(AB) + P(AB^c) = P(A)
• P(AB) = P(A|B)P(B)
• P(AB^c) = P(A|B^c)P(B^c)
• Hence, P(A) = P(A|B)P(B) + P(A|B^c)P(B^c) and P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
• These formulas are totally unlike the ones seen previously

It’s not ‘the same thing, only different…’

P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
• These formulas are totally unlike the ones seen previously
• On the right side, we have probabilities conditioned on different events
• Previously, we were conditioning on the same event throughout

…it’s something much much more!

P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
• B and B^c cannot occur simultaneously on the same trial
• To find P(A), first imagine that B occurred
• From P(A|B), we can determine P(AB)
• Next imagine that B^c occurred
• From P(A|B^c), we can determine P(AB^c)
• The sum of these two numbers is P(A)!

Oatmeal or haute cuisine?

P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
• We knew how to obtain conditional probabilities from “regular” probabilities
• P(A|B) = P(AB)/P(B)
• New result allows us to find unconditional probabilities from conditional probabilities
• It is a fundamentally important result
• It is also very simple (uses horse sense)

The theorem of total probability

P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
• This fundamental result is called the theorem of total probability
• The probability of the event A is the weighted average of the probabilities of A conditioned on B and on B^c
• In the Ross textbook, this result is Eq. (3.1) on page 72

Applications

• Example: Box I has 3 green and 2 red balls, while Box II has 2 green and 2 red balls. A ball is drawn at random from Box I and transferred to Box II. Then, a ball is drawn at random from Box II. What is the probability that the ball drawn from Box II is green?
• Note that the color of the ball transferred from Box I to Box II is not known

Example (continued)

• The color of the ball transferred is not known, but it’s either green or red for sure!
[Figure: Box I and Box II]

Example (continued)

• Box I has 3g, 2r; Box II has 2g, 2r
• After the transfer, Box II has 5 balls in it
• G = event ball drawn from Box II is green
• A = event ball transferred is red
• P(G|A) = 2/5
• P(G|A^c) = 3/5
• P(A) = 2/5
• P(G) = P(G|A)P(A) + P(G|A^c)P(A^c) = (2/5)(2/5) + (3/5)(3/5) = 13/25
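The two-box calculation can be written out as a short sketch of the theorem of total probability. The ball counts come from the example; the dictionary layout and the helper `total_probability` are hypothetical names introduced only for illustration.

```python
from fractions import Fraction


def total_probability(p_g_given_a, p_g_given_not_a, p_a):
    """P(G) = P(G|A)P(A) + P(G|A^c)P(A^c), with P(A^c) = 1 - P(A)."""
    return p_g_given_a * p_a + p_g_given_not_a * (1 - p_a)


box1 = {"green": 3, "red": 2}
box2 = {"green": 2, "red": 2}
box2_total_after = sum(box2.values()) + 1  # Box II holds 5 balls after the transfer

# A = the transferred ball is red.
p_a = Fraction(box1["red"], sum(box1.values()))
# If a red ball was transferred, Box II still has 2 green balls out of 5.
p_g_given_a = Fraction(box2["green"], box2_total_after)
# If a green ball was transferred, Box II has 3 green balls out of 5.
p_g_given_not_a = Fraction(box2["green"] + 1, box2_total_after)

p_g = total_probability(p_g_given_a, p_g_given_not_a, p_a)
print(p_g)
```

The result lies between the two conditional probabilities 2/5 and 3/5, as a weighted average must.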

A built-in test for checking answers

• The probability of event A is the weighted average of P(A|B) and P(A|B^c)
• P(A) = P(A|B)P(B) + P(A|B^c)P(B^c) = P(A|B)P(B) + P(A|B^c)[1 – P(B)]
• The linear function y = a·x + b·(1 – x) has value b at x = 0 and a at x = 1
• For 0 < x < 1, y is between a and b
• P(A) is between P(A|B) and P(A|B^c)

Example (checking our work)

• P(G|A) = 2/5
• P(G|A^c) = 3/5
• P(G) = P(G|A)P(A) + P(G|A^c)P(A^c)
• P(G|A) = 2/5 ≤ P(G) = 13/25 ≤ P(G|A^c) = 3/5
• If the check is satisfied, it does not imply that your work is right; there may be other mistakes, e.g. you computed P(G) = 12/25
• But, if the check is not satisfied, …

Generalizations of the theorem I

• P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
• Since conditional probabilities form a probability measure, a similar result also holds for conditional probabilities
• P(A|C) = P(A|BC)P(B|C) + P(A|B^cC)P(B^c|C)
• All probabilities in the first equation are now conditioned on C (in addition to any previously existing conditioning)
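The conditional form of the theorem can be spot-checked by counting. The sketch below assumes a two-dice sample space with equally likely outcomes; the events A, B, C and the helper `cond` are hypothetical and chosen so that both BC and B^cC are non-empty.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair dice, all 36 outcomes equally likely.
omega = frozenset(product(range(1, 7), repeat=2))


def cond(x, given):
    """P(X|given) by counting; requires a non-empty conditioning event."""
    return Fraction(len(x & given), len(given))


A = frozenset(w for w in omega if w[0] + w[1] >= 8)
B = frozenset(w for w in omega if w[0] % 2 == 0)
C = frozenset(w for w in omega if w[1] >= 3)
B_c = omega - B  # complement of B

lhs = cond(A, C)
rhs = cond(A, B & C) * cond(B, C) + cond(A, B_c & C) * cond(B_c, C)
print(lhs, rhs)
```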

Another Example

• You and a friend (also taking ECE 313) are at a party with N–1 other people when suddenly a conga line forms. Assume that all (N+1)! orderings are equally likely
• What is the probability that your friend is ahead of you in the conga line?
• Answer: 1/2 (by symmetry)
• If there was a different (correct) answer, you would be ahead with same prob ≠ 1/2

Do it by the theorem…

• Both you and your friend are equally likely to be anywhere in the conga line
• P(you are in j-th position) = 1/(N + 1)
• P(friend ahead|you in j-th) = (j – 1)/N
• Why j–1? Why N and not N+1?
• P(friend ahead) = sum of [(j–1)/N]·[1/(N+1)] = [0 + 1 + … + N]/[N·(N + 1)] = 1/2
• 1 + 2 + … + N = N·(N + 1)/2 !!!!
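The conga-line sum can be checked two ways: by evaluating the total-probability sum over your position j, and by brute-force enumeration of all orderings for small N. Both functions below are hypothetical helpers, and the labels 0 for "you" and 1 for "your friend" are assumptions of this sketch; N must be at least 1.

```python
from fractions import Fraction
from itertools import permutations


def friend_ahead_by_theorem(n):
    """Sum over j = 1..N+1 of P(friend ahead | you in j-th) * P(you in j-th)."""
    return sum(
        Fraction(j - 1, n) * Fraction(1, n + 1)
        for j in range(1, n + 2)
    )


def friend_ahead_by_enumeration(n):
    """Count orderings of N+1 people in which the friend (1) precedes you (0)."""
    orderings = list(permutations(range(n + 1)))
    favourable = sum(1 for line in orderings if line.index(1) < line.index(0))
    return Fraction(favourable, len(orderings))


for n in range(1, 6):
    print(n, friend_ahead_by_theorem(n), friend_ahead_by_enumeration(n))
```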

Summary

• The chain rule or product rule allows us to compute a joint probability (i.e. probability of an intersection) as the product of various conditional probabilities
• The theorem of total probability allows us to find an unconditional probability from conditional probabilities
• We discussed some examples of the applications of these rules