Understanding Probability: Mutually Exclusive, Conditional, Independence, Slides of Russian

This document covers the fundamental rules of probability, including the certainty rule, the additivity rule for mutually exclusive events, and the definition of conditional probability. It also discusses statistical independence and the use of Venn diagrams to illustrate these concepts.

Typology: Slides

2021/2022

Uploaded on 09/27/2022

thimothy

6 The Basic Rules of Probability

This chapter summarizes the rules you have been using for adding and

multiplying probabilities, and for using conditional probability. It also gives a

pictorial way to understand the rules.

The rules that follow are informal versions of standard axioms for elementary probability theory.

ASSUMPTIONS

The rules stated here take some things for granted:

  • The rules are for finite groups of propositions (or events).
  • If A and B are propositions (or events), then so are AvB, A&B, and -A.
  • Elementary deductive logic (or elementary set theory) is taken for granted.
  • If A and B are logically equivalent, then Pr(A) = Pr(B). [Or, in set theory, if A and B are events which are provably the same sets of events, Pr(A) = Pr(B).]

NORMALITY

The probability of any proposition or event A lies between 0 and 1.

(1) 0 ≤ Pr(A) ≤ 1

Why the name "normality"? A measure is said to be normalized if it is put on a scale between 0 and 1.

CERTAINTY

An event that is sure to happen has probability 1. A proposition that is certainly true has probability 1.

(2) Pr(certain proposition) = 1

Pr(sure event) = 1

Often the Greek letter Ω is used to represent certainty: Pr(Ω) = 1.

ADDITIVITY

If two events or propositions A and B are mutually exclusive (disjoint, incompatible), the probability that one or the other happens (or is true) is the sum of their probabilities.

(3) If A and B are mutually exclusive, then

Pr(AvB) = Pr(A) + Pr(B).

OVERLAP

When A and B are not mutually exclusive, we have to subtract the probability of their overlap. In a moment we will deduce this from rules (1)-(3).

(4) Pr(AvB) = Pr(A) + Pr(B) - Pr(A&B)

CONDITIONAL PROBABILITY

The only basic rules are (1)-(3). Now comes a definition.

(5) If Pr(B) > 0, then Pr(A/B) = Pr(A&B) / Pr(B)

MULTIPLICATION

The definition of conditional probability implies that:

(6) If Pr(B) > 0, Pr(A&B) = Pr(A/B)Pr(B).

TOTAL PROBABILITY

Another consequence of the definition of conditional probability:

(7) If 0 < Pr(B) < 1, Pr(A) = Pr(B)Pr(A/B) + Pr(-B)Pr(A/-B).

In practice this is a very useful rule. What is the probability that you will get a grade of A in this course? Maybe there are just two possibilities: you study hard, or you do not study hard. Then:

Pr(A) = Pr(study hard)Pr(A/study hard) + Pr(don't study)Pr(A/don't study).

60 An Introduction to Probability and Inductive Logic

Try putting in some numbers that describe yourself.
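
As one way of putting in numbers, here is a minimal Python sketch of rule (7). The three input probabilities are made-up values chosen only for illustration, and `total_probability` is a hypothetical helper name, not something from the text.

```python
from fractions import Fraction


def total_probability(pr_b, pr_a_given_b, pr_a_given_not_b):
    """Rule (7): Pr(A) = Pr(B)Pr(A/B) + Pr(-B)Pr(A/-B), for 0 < Pr(B) < 1."""
    if not 0 < pr_b < 1:
        raise ValueError("rule (7) requires 0 < Pr(B) < 1")
    pr_not_b = 1 - pr_b
    return pr_b * pr_a_given_b + pr_not_b * pr_a_given_not_b


# Assumed, illustrative numbers (not taken from the text).
pr_study = Fraction(3, 5)              # Pr(study hard)
pr_a_given_study = Fraction(1, 2)      # Pr(A/study hard)
pr_a_given_no_study = Fraction(1, 10)  # Pr(A/don't study)

pr_a = total_probability(pr_study, pr_a_given_study, pr_a_given_no_study)
print(pr_a)  # 17/50
```

Exact fractions are used so that the result is not blurred by floating-point rounding.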

LOGICAL CONSEQUENCE

When B logically entails A, then

Pr(B) ≤ Pr(A).

This is because, when B entails A, B is logically equivalent to A&B. Since

Pr(A) = Pr(A&B) + Pr(A&-B) = Pr(B) + Pr(A&-B),

Pr(A) will be bigger than Pr(B) except when Pr(A&-B) = 0.
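
The argument can be checked on a small case. The sketch below assumes one roll of a fair die as the sample space (an example chosen here, not one from the text) and treats events as sets of outcomes, so that "B entails A" becomes "B is a subset of A".

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}


def pr(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))


A = {2, 4, 6}  # "the roll is even"
B = {6}        # "the roll is a six"; B entails A because B is a subset of A

A_and_B = A & B
A_and_not_B = A - B

assert A_and_B == B                            # B is equivalent to A&B
assert pr(A) == pr(A_and_B) + pr(A_and_not_B)  # Pr(A) = Pr(A&B) + Pr(A&-B)
assert pr(B) <= pr(A)                          # the logical consequence rule

print(pr(B), pr(A))  # 1/6 1/2
```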

STATISTICAL INDEPENDENCE

Thus far we have been very informal when talking about independence. Now we state a definition of one concept, often called statistical independence.

(8) If 0 < Pr(A) and 0 < Pr(B), then

A and B are statistically independent if and only if

Pr(A/B) = Pr(A).
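
Definition (8) can be tested mechanically on a finite sample space. The following sketch assumes two fair dice (an illustrative choice, not an example from the text); the helper names `pr`, `pr_given`, and `independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered pairs from two fair dice, all equally likely.
outcomes = set(product(range(1, 7), repeat=2))


def pr(event):
    """Pr(event) for an event given as a subset of the sample space."""
    return Fraction(len(event), len(outcomes))


def pr_given(a, b):
    """Pr(A/B) = Pr(A&B) / Pr(B), defined only when Pr(B) > 0."""
    if pr(b) == 0:
        raise ValueError("Pr(B) must be greater than 0")
    return pr(a & b) / pr(b)


def independent(a, b):
    """Definition (8): with Pr(A) > 0 and Pr(B) > 0, test Pr(A/B) = Pr(A)."""
    if pr(a) == 0 or pr(b) == 0:
        raise ValueError("definition (8) requires Pr(A) > 0 and Pr(B) > 0")
    return pr_given(a, b) == pr(a)


A = {o for o in outcomes if o[0] % 2 == 0}  # first die is even
B = {o for o in outcomes if sum(o) == 7}    # the two dice sum to 7
C = {o for o in outcomes if sum(o) == 8}    # the two dice sum to 8

print(independent(A, B))  # True
print(independent(A, C))  # False
```

Here Pr(A/B) = 1/2 = Pr(A), while Pr(A/C) = 3/5, so learning that the sum is 8 does change the probability that the first die is even.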

PROOF OF THE RULE FOR OVERLAP

(4) Pr(AvB) = Pr(A) + Pr(B) - Pr(A&B).

This rule follows from rules (1)-(3), and the logical assumption on page 58, that logically equivalent propositions have the same probability.

AvB is logically equivalent to: (A&B) v (A&-B) v (-A&B) (*)

Why? Those familiar with "truth tables" can check it out. But you can see it directly. A is logically equivalent to (A&B) v (A&-B). B is logically equivalent to (A&B) v (-A&B).

Now the three components (A&B), (A&-B), and (-A&B) are mutually exclusive. (Why?) Hence we can add their probabilities, using (*).

Pr(AvB) = Pr(A&B) + Pr(A&-B) + Pr(-A&B) (**)

A is logically equivalent to [(A&B)v(A&-B)], and B is logically equivalent to [(A&B)v(-A&B)].

So,

Pr(A) = Pr(A&B) + Pr(A&-B).

Pr(B) = Pr(A&B) + Pr(-A&B).

Since it makes no difference to add and then subtract something in (**):

Pr(AvB) = Pr(A&B) + Pr(A&-B) + Pr(-A&B) + Pr(A&B) - Pr(A&B)

Hence,

Pr(AvB) = Pr(A) + Pr(B) - Pr(A&B).

CONDITIONALIZING THE RULES

It is easy to check that the basic rules (1)-(3), and (5), the definition of conditional probability, all hold in conditional form. That is, the rules hold if we replace Pr(A), Pr(B), Pr(A/B), and so on, by Pr(A/E), Pr(B/E), Pr(A/B&E), and so on.

Normality

(1C) 0 ≤ Pr(A/E) ≤ 1

Certainty

We need to check that for E, such that Pr(E) > 0,

(2C) Pr([sure event]/E) = 1.

Now E is logically equivalent to the occurrence of E with something that is sure to happen. Hence,

Pr([sure event] & E) = Pr(E).

Pr([sure event]/E) = [Pr(E)] / [Pr(E)] = 1.

Additivity

Let Pr(E) > 0. If A and B are mutually exclusive, then

Pr[(AvB)/E] = Pr[(AvB)&E]/Pr(E) = Pr(A&E)/Pr(E) + Pr(B&E)/Pr(E).

(3C) Pr[(AvB)/E] = Pr(A/E) + Pr(B/E).

Conditional probability

This is the only case you should examine carefully. The conditionalized form of (5) is:

(5C) If Pr(E) > 0 and Pr(B/E) > 0, then Pr[A/(B&E)] = Pr[(A&B)/E] / Pr(B/E).

We prove this starting from (5),

Pr[A/(B&E)] = Pr(A&B&E) / Pr(B&E).

The numerator (on top of the fraction) is Pr(A&B&E) = Pr[(A&B)/E] × Pr(E). The denominator (bottom of the fraction) is Pr(B&E) = Pr(B/E) × Pr(E). Dividing the numerator by the denominator, we get (5C).
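
Both the overlap rule (4) and the conditionalized rule for conditional probability can be spot-checked by enumeration. The sketch below is only a check on one assumed sample space (two fair dice) with three arbitrarily chosen events; it illustrates the rules and does not replace the proofs.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered pairs from two fair dice, all equally likely.
outcomes = set(product(range(1, 7), repeat=2))


def pr(event, given=None):
    """Pr(event), or Pr(event/given) when a conditioning event is supplied."""
    if given is None:
        return Fraction(len(event), len(outcomes))
    if not given:
        raise ValueError("the conditioning event must have positive probability")
    return Fraction(len(event & given), len(given))


# Three events chosen only for illustration.
A = {o for o in outcomes if o[0] % 2 == 0}  # first die is even
B = {o for o in outcomes if sum(o) >= 8}    # the sum is at least 8
E = {o for o in outcomes if o[1] > 3}       # second die shows more than 3

# Rule (4): Pr(AvB) = Pr(A) + Pr(B) - Pr(A&B)
overlap_holds = pr(A | B) == pr(A) + pr(B) - pr(A & B)

# Conditionalized rule: Pr[A/(B&E)] = Pr[(A&B)/E] / Pr(B/E)
lhs = pr(A, given=B & E)
rhs = pr(A & B, given=E) / pr(B, given=E)
conditional_holds = lhs == rhs

print(overlap_holds, conditional_holds)  # True True
```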

... nonmusical people, resulting in a group of 20 people. Imagine we were interested in these two events:

Event A = a singer is selected at random from the whole group.

Event B = a whistler is selected at random from the whole group.

Here is a Venn diagram of the situation, where the entire box represents the room full of twenty people.

[FIGURE 6.2: Venn diagram. Singers only (4); Whistlers only (3); Non-musicians (12); Total (20).]

Notice the major change from the previous diagram: Figure 6.2 now has its circles enclosed in a rectangle. By convention, the area of the rectangle is set to 1. The areas of each of the circles correspond to the probability of occurrence of an event of the type that it represents: the area of circle A is 5/20, or 0.25, since there are 5 singers among 20 people. Likewise, the area of circle B is 4/20, or 0.2. The area of the region of overlap between A & B is 1/20, or 0.05. These drawings can be used to illustrate the basic rules of probability.

(1) Normality: 0 ≤ Pr(A) ≤ 1. This corresponds to the rectangle having an area of 1 unit: since all circles must lie within the rectangle, no circle, and hence no event, can have a probability of greater than 1.

(2) Certainty: Pr(sure event) = 1. Pr(certain proposition) = 1. With Venn diagrams, an event that is sure to happen, or a proposition that is certain, corresponds to a "circle" that fills the entire rectangle, which by convention has unit area 1.

(3) Additivity: If A and B are mutually exclusive, then:

Pr(AvB) = Pr(A) + Pr(B).

If two groups are mutually exclusive they do not overlap, and the area covering members of either group is just the sum of the areas of each.

(4) Overlap: To calculate the probability of AvB, determine how much of the rectangle is covered by circles A and B. This will be all the area in A, plus the area that appears only in B. The area only in B is the area in B, less the area of overlap with A.

Pr(AvB) = Pr(A) + Pr(B) - Pr(A&B)

(5) Conditional: Given that event B has happened, what is the probability that event A will also happen? Look at Figure 6.2. If B has happened, you know that the person selected is a whistler. So we want the proportion of the area of B that includes A. That is, the area of A&B divided by the area of B.

Pr(A/B) = Pr(A&B) ÷ Pr(B), so long as Pr(B) > 0.

So, in our numerical example, Pr(A/B) = 1/4. Conversely, Pr(B/A) = Pr(A&B)/Pr(A) = 1/5 = 0.2.

ODD QUESTION 2

Recall the Odd Question about Pia:

2. Pia is thirty-one years old, single, outspoken, and smart. She was a philosophy major. When a student, she was an ardent supporter of Native American rights, and she picketed a department store that had no facilities for nursing mothers. Rank the following statements in order of probability from 1 (most probable) to 6 (least probable). (Ties are allowed.)

___(a) Pia is an active feminist.
___(b) Pia is a bank teller.
___(c) Pia works in a small bookstore.
___(d) Pia is a bank teller and an active feminist.
___(e) Pia is a bank teller and an active feminist who takes yoga classes.
___(f) Pia works in a small bookstore and is an active feminist who takes yoga classes.

This is a famous example, first studied empirically by the psychologists Amos Tversky and Daniel Kahneman. They found that very many people think that, given the whole story:

The most probable description is (f) Pia works in a small bookstore and is an active feminist who takes yoga classes.

In fact, they rank the possibilities something like this, from most probable to least probable:

(f), (e), (d), (a), (c), (b).

But just look at the logical consequence rule on page 60. Since, for example, (f) logically entails (a) and (c), (a) and (c) must each be at least as probable as (f).

In general: Pr(A&B) ≤ Pr(B).

It follows that the probability rankings given by many people, with (f) most probable, are completely wrong. There are many ways of ranking (a)-(f), but any ranking should obey these inequalities:

Pr(a) ≥ Pr(d) ≥ Pr(e). Pr(b) ≥ Pr(d) ≥ Pr(e). Pr(a) ≥ Pr(f). Pr(c) ≥ Pr(f).
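
The rule that a conjunction is never more probable than either of its parts can be checked in two ways. The sketch below first verifies it on the singers-and-whistlers counts from the Figure 6.2 example (the count of 1 person who both sings and whistles is read off the stated overlap of 1/20), and then tests a ranking of (a)-(f) against the required inequalities. The helper `violations` and the sample "coherent" ranking are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

# Counts from the Figure 6.2 example.
singers_only, whistlers_only, both, neither = 4, 3, 1, 12
total = singers_only + whistlers_only + both + neither  # 20 people

pr_a = Fraction(singers_only + both, total)    # Pr(A): a singer is selected
pr_b = Fraction(whistlers_only + both, total)  # Pr(B): a whistler is selected
pr_a_and_b = Fraction(both, total)             # Pr(A&B)

pr_a_or_b = pr_a + pr_b - pr_a_and_b           # overlap rule (4)
pr_a_given_b = pr_a_and_b / pr_b               # definition (5)
pr_b_given_a = pr_a_and_b / pr_a

print(pr_a_or_b, pr_a_given_b, pr_b_given_a)   # 2/5 1/4 1/5
assert pr_a_and_b <= pr_a and pr_a_and_b <= pr_b

# Each pair (x, y) encodes the requirement Pr(x) >= Pr(y).
constraints = [("a", "d"), ("d", "e"), ("b", "d"), ("a", "f"), ("c", "f")]


def violations(ranking):
    """Constraints broken by a strict ranking listed from most to least probable."""
    position = {statement: i for i, statement in enumerate(ranking)}
    return [(x, y) for x, y in constraints if position[x] > position[y]]


popular = ["f", "e", "d", "a", "c", "b"]   # the ranking many people give
coherent = ["a", "b", "c", "d", "e", "f"]  # one of many admissible rankings

print(violations(popular))   # every one of the five constraints is broken
print(violations(coherent))  # []
```

The checker treats a ranking as strict, so it does not model ties, which the question allows.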

ARE PEOPLE STUPID?

Some readers of Tversky and Kahneman conclude that we human beings are irrational, because so many of us come up with the wrong probability orderings. But perhaps people are merely careless!

Perhaps most of us do not attend closely to the exact wording of the question, "Which of statements (a)-(f) are more probable, that is, have the highest probability." Instead we think, "Which is the most useful, instructive, and likely to be true thing to say about Pia?"

When we are asked a question, most of us want to be informative, useful, or interesting. We don't necessarily want simply to say what is most probable, in the strict sense of having the highest probability.

For example, suppose I ask you whether you think the rate of inflation next year will be (a) less than 3%, (b) between 3% and 4%, or (c) greater than 4%.

You could reply, (a)-or-(b)-or-(c). You would certainly be right! That would be the answer with the highest probability. But it would be totally uninformative.

You could reply, (b)-or-(c). That is more probable than simply (b), or simply (c), assuming that both are possible (thanks to additivity). But that is a less interesting and less useful answer than (c), or (b), by itself.

Perhaps what many people do, when they look at Odd Question 2, is to form a character analysis of Pia, and then make an interesting guess about what she is doing nowadays.

If that is what is happening, then people who said it was most probable that Pia works in a small bookstore and is an active feminist who takes yoga classes, are not irrational. They are just answering the wrong question, but maybe answering a more useful question than the one that was asked.

AXIOMS: HUYGENS

Probability can be axiomatized in many ways. The first axioms, or basic rules, were published in 1657 by the Dutch physicist Christiaan Huygens (1629-1695), famous for his wave theory of light. Strictly speaking, Huygens did not use the idea of probability at all. Instead, he used the idea of the fair price of something like a lottery ticket, or what we today would call the expected value of an event or proposition. We can still do that today. In fact, almost all approaches take probability as the idea to be axiomatized. But a few authors still take expected value as the primitive idea, in terms of which they define probability.

AXIOMS: KOLMOGOROV

The definitive axioms for probability theory were published in 1933 by the immensely influential Russian mathematician A. N. Kolmogorov (1903-1987). This theory is much more developed than our basic rules, for it applies to infinite sets and employs the full differential and integral calculus, as part of what is called measure theory.

EXERCISES

1 Venn Diagrams.

Let L: A person contracts a lung disease.

Let S: That person smokes.

Write each of the following probabilities using the Pr notation, and then explain it using a Venn diagram.

(a) The probability that a person either smokes or contracts lung disease (or both).
(b) The probability that a person contracts lung disease, given that he or she smokes.
(c) The probability that a person smokes, given that she or he contracts lung disease.

2 Total probability. Prove from the basic rules that Pr(A) + Pr(-A) = 1.

3 Multiplying. Prove from the definition of statistical independence that if 0 <

Pr(A), and 0 < Pr(B), and A and B are statistically independent, Pr(A&B) = Pr(A)Pr(B).

4 Conventions. In Chapter 4, page 40, we said that the rules for normality and

certainty are just conventions. Can you think of any other plausible conventions for representing probability by numbers?

5 Terrorists. This is a story about a philosopher, the late Max Black.

One of Black's students was to go overseas to do some research on Kant. She was afraid that a terrorist would put a bomb on the plane. Black could not convince her that the risk was negligible. So he argued as follows:

BLACK: Well, at least you agree that it is almost impossible that two people should take bombs on your plane?

STUDENT: Sure.

BLACK: Then you should take a bomb on the plane. The risk that there would be another bomb on your plane is negligible.

What's the joke?