






















A Short Course on Graphical Models, Using Probability Theory to reason under uncertainty
Typology: Lecture notes























A Short Course on Graphical Models
Mark Paskin
Reasoning under uncertainty
In many settings, we must try to understand what is going on in a system when we have imperfect or incomplete information.
Two sources of uncertainty:
1. laziness (modeling every detail of a complex system is costly)
2. ignorance (we may not completely understand the system)
Example: deploy a network of smoke sensors to detect fires in a building. Our model will reflect both laziness and ignorance:
We are too lazy to model what, besides fire, can trigger the sensors;
We are too ignorant to model how fire creates smoke, what density of smoke is required to trigger the sensors, etc.
The only prerequisite: Set Theory
[Figure: Venn diagrams of two sets $A$ and $B$, showing the intersection $A \cap B$ and the union $A \cup B$.]
For simplicity, we will work (mostly) with finite sets. The extension to countably infinite sets is not difficult. The extension to uncountably infinite sets requires Measure Theory.
Probability spaces
A probability space represents our uncertainty regarding an experiment. It has two parts:
1. the sample space $\Omega$, which is a set of outcomes; and
2. the probability measure $P$, which is a real function of the subsets of $\Omega$:
$$P : 2^{\Omega} \to \Re, \qquad A \mapsto P(A)$$
A set of outcomes $A \subseteq \Omega$ is called an event. $P(A)$ represents how likely it is that the experiment’s actual outcome will be a member of $A$.
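A finite probability space can be written down directly in code. The sketch below is only an illustration, assuming a fair six-sided die as the experiment; the names `omega`, `weights`, and `P` are hypothetical and do not come from the course.

```python
from fractions import Fraction

# Sample space: the outcomes of one roll of a six-sided die (assumed example).
omega = frozenset({1, 2, 3, 4, 5, 6})

# Assumption: the die is fair, so every outcome carries mass 1/6.
weights = {w: Fraction(1, 6) for w in omega}

def P(event):
    """Probability measure: maps an event (a subset of omega) to a real number."""
    event = frozenset(event)
    if not event <= omega:
        raise ValueError("an event must be a subset of the sample space")
    return sum((weights[w] for w in event), Fraction(0))

even = {2, 4, 6}   # the event "the roll is even"
print(P(even))     # 1/2
```

Exact fractions are used instead of floats so that equalities between probabilities can be checked without rounding error.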
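A probability measure $P$ must satisfy three axioms:
1. $P(A) \geq 0$ for all events $A$
2. $P(\Omega) = 1$
3. $P(A \cup B) = P(A) + P(B)$ for disjoint events $A$ and $B$
[Figure: disjoint events $A$ and $B$ on a scale from 0 to 1, with $P(A) + P(B) = P(A \cup B)$.]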
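For a small finite sample space, the three axioms (non-negativity, total mass one, and additivity over disjoint events) can be checked by brute force over all events. This is a minimal sketch under assumptions: the biased coin, its weights, and the helper names `powerset`, `satisfies_axioms`, and `not_a_measure` are all hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def powerset(omega):
    """All subsets (events) of a finite sample space."""
    items = list(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def satisfies_axioms(omega, P):
    """Check the three axioms for every event of a finite space."""
    events = powerset(omega)
    nonnegative = all(P(a) >= 0 for a in events)
    normalized = P(frozenset(omega)) == 1
    additive = all(P(a | b) == P(a) + P(b)
                   for a in events for b in events if not (a & b))
    return nonnegative and normalized and additive

# Assumed example: a biased coin.
weights = {"heads": Fraction(2, 3), "tails": Fraction(1, 3)}

def P(event):
    return sum((weights[w] for w in event), Fraction(0))

def not_a_measure(event):
    # Assigns total mass 2/3 to the whole space, violating the second axiom.
    return Fraction(len(event), 3)

print(satisfies_axioms(weights.keys(), P))              # True
print(satisfies_axioms(weights.keys(), not_a_measure))  # False
```

The check enumerates all $2^{|\Omega|}$ events, so it is only practical for very small sample spaces.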
Some simple consequences of the axioms
If $A \subseteq B$, then $P(A) \leq P(B)$.
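One consequence that follows directly from the axioms is monotonicity. The short derivation below is a sketch added for illustration; it assumes $A \subseteq B$ and uses the set difference $B \setminus A$, a notation that may differ from the one used elsewhere in the course.

```latex
\begin{align*}
B &= A \cup (B \setminus A), \qquad A \cap (B \setminus A) = \emptyset
  && \text{(since } A \subseteq B\text{)} \\
P(B) &= P(A) + P(B \setminus A)
  && \text{(additivity for disjoint events)} \\
     &\geq P(A)
  && \text{(non-negativity: } P(B \setminus A) \geq 0\text{)}
\end{align*}
```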
Conditional probability
Conditional probability allows us to reason with partial information.
When $P(B) > 0$, the conditional probability of $A$ given $B$ is defined as
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
This is the probability that $A$ occurs, given we have observed $B$, i.e., that we know the experiment’s actual outcome will be in $B$. It is the fraction of probability mass in $B$ that also belongs to $A$.
$P(A)$ is called the a priori (or prior) probability of $A$, and $P(A \mid B)$ is called the a posteriori probability of $A$ given $B$.
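The definition of conditional probability translates into a one-line computation on a finite space. The following sketch assumes a fair six-sided die; the function name `conditional` and the events `A` and `B` are hypothetical choices for the illustration.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))

def P(event):
    """Uniform probability measure on omega."""
    return Fraction(len(frozenset(event) & omega), len(omega))

def conditional(a, b):
    """P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    a, b = frozenset(a), frozenset(b)
    if P(b) == 0:
        raise ValueError("P(A | B) is undefined when P(B) = 0")
    return P(a & b) / P(b)

A = {2}          # "the roll is a two"
B = {2, 4, 6}    # "the roll is even"

prior = P(A)                   # a priori probability of A
posterior = conditional(A, B)  # a posteriori probability of A given B
print(prior, posterior)        # 1/6 1/3
```

Observing that the roll is even removes half of the probability mass, so the mass on the outcome 2 doubles as a fraction of what remains.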
Example of conditional probability
If $P$ is defined by
[Table: joint probabilities for fire / no fire against smoke / no smoke.]
then
$$P(\text{fire} \mid \text{smoke}) = \frac{P(\text{fire} \cap \text{smoke})}{P(\text{smoke})} = \frac{P(\text{fire} \cap \text{smoke})}{P(\text{fire} \cap \text{smoke}) + P(\text{no fire} \cap \text{smoke})}$$
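The fire and smoke computation can be sketched numerically. The probabilities in the table below are invented for illustration and are not the values from the original slide; the names `joint`, `p_smoke`, and `p_fire_given_smoke` are likewise hypothetical.

```python
from fractions import Fraction

# ASSUMED joint probabilities (illustrative only, not taken from the course).
joint = {
    ("fire", "smoke"):       Fraction(9, 100),
    ("fire", "no smoke"):    Fraction(1, 100),
    ("no fire", "smoke"):    Fraction(6, 100),
    ("no fire", "no smoke"): Fraction(84, 100),
}
assert sum(joint.values()) == 1  # a valid measure has total mass one

# P(smoke): add up every outcome in which smoke occurs.
p_smoke = joint[("fire", "smoke")] + joint[("no fire", "smoke")]

# P(fire | smoke) = P(fire ∩ smoke) / P(smoke)
p_fire_given_smoke = joint[("fire", "smoke")] / p_smoke

print(p_smoke)             # 3/20
print(p_fire_given_smoke)  # 3/5
```

With these assumed numbers, fire is unlikely a priori (probability 1/10) but becomes the more likely explanation once smoke is observed.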
The chain rule
Apply the product rule repeatedly:
$$P\left(\bigcap_{i=1}^{k} A_i\right) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2) \cdots P\left(A_k \,\middle|\, \bigcap_{i=1}^{k-1} A_i\right)$$
The chain rule will become important later when we discuss conditional independence in Bayesian networks.
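The chain rule can be verified numerically on a small example. This sketch assumes a fair six-sided die and three arbitrarily chosen events; `chain_rule_product` is a hypothetical helper, and it assumes every conditioning event has positive probability.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))

def P(event):
    return Fraction(len(event & omega), len(omega))

def chain_rule_product(events):
    """P(A1) * P(A2 | A1) * ... * P(Ak | A1 ∩ ... ∩ A(k-1)).

    Assumes each conditioning intersection has positive probability.
    """
    product = Fraction(1)
    seen = omega  # intersection of the events processed so far
    for a in events:
        product *= P(a & seen) / P(seen)  # P(a | seen)
        seen = seen & a
    return product

events = [frozenset({1, 2, 3, 4}),
          frozenset({2, 3, 4, 5}),
          frozenset({3, 4, 5, 6})]

joint = frozenset.intersection(*events)  # {3, 4}
print(P(joint))                          # 1/3
print(chain_rule_product(events))        # 1/3
```

Here the three factors are 2/3, 3/4, and 2/3, whose product equals the probability of the intersection.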
Bayes’ rule
Use the product rule both ways with $P(A \cap B)$ and divide by $P(B)$:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
Bayes’ rule translates causal knowledge into diagnostic knowledge. For example, if $A$ is the event that a patient has a disease, and $B$ is the event that she displays a symptom, then $P(B \mid A)$ describes a causal relationship, and $P(A \mid B)$ describes a diagnostic one (that is usually hard to assess). If $P(B \mid A)$, $P(A)$, and $P(B)$ can be assessed easily, then we get $P(A \mid B)$ for free.
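The disease and symptom reading of Bayes' rule can be sketched with concrete numbers. All three input probabilities below are invented for illustration, and the function name `bayes` is hypothetical.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) P(A) / P(B), defined only when P(B) > 0."""
    if p_b == 0:
        raise ValueError("P(A | B) is undefined when P(B) = 0")
    return p_b_given_a * p_a / p_b

# ASSUMED values (illustrative only):
p_disease = Fraction(1, 100)               # P(A): prior probability of the disease
p_symptom_given_disease = Fraction(9, 10)  # P(B | A): causal knowledge
p_symptom = Fraction(1, 10)                # P(B): how common the symptom is

# Diagnostic knowledge obtained from the three quantities above.
p_disease_given_symptom = bayes(p_symptom_given_disease, p_disease, p_symptom)
print(p_disease_given_symptom)  # 9/100
```

Under these assumptions the symptom raises the probability of disease from 1/100 to 9/100, which is still small because the symptom is far more common than the disease.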
Examples of random variables
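Let’s say our experiment is to draw a card from a deck. Each random variable is a function of the outcome $\omega$ (the card drawn):
random variable: true if $\omega$ is a …, false otherwise. Example event: the variable takes the value true.
random variable: $n$ if $\omega$ is the number $n$, … otherwise.
random variable: … if $\omega$ is a face card, … otherwise.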
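Random variables on a deck of cards are ordinary functions of the outcome, and an event is the set of outcomes where a variable takes certain values. The sketch below makes several assumptions of its own: a standard 52-card deck drawn uniformly, hearts as the suit of interest, 0 as the fallback value of the number variable, and the hypothetical names `is_heart`, `number`, `is_face`, and `P_of`.

```python
from fractions import Fraction
from itertools import product

# Sample space: a standard 52-card deck (assumed), each card equally likely.
RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
omega = list(product(RANKS, SUITS))

# Random variables: functions from outcomes to values (names are hypothetical).
def is_heart(card):
    return card[1] == "hearts"

def number(card):
    rank = card[0]
    return int(rank) if rank.isdigit() else 0  # assumption: 0 for non-number cards

def is_face(card):
    return 1 if card[0] in {"J", "Q", "K"} else 0

def P_of(predicate):
    """Probability of the event {w : predicate(w)} under a uniform draw."""
    return Fraction(sum(1 for w in omega if predicate(w)), len(omega))

print(P_of(is_heart))                      # 1/4
print(P_of(lambda w: 2 < number(w) < 6))   # 3/13
print(P_of(lambda w: is_face(w) == 1))     # 3/13
```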
Densities
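Let $X : \Omega \to \Xi$ be a finite random variable. The function $p_X : \Xi \to \Re$ is the density of $X$ if for all $x \in \Xi$:
$$p_X(x) = P(\{\omega : X(\omega) = x\})$$
When $\Xi$ is infinite, $p_X : \Xi \to \Re$ is the density of $X$ if for all $\xi \subseteq \Xi$:
$$P(\{\omega : X(\omega) \in \xi\}) = \int_{\xi} p_X(x)\, dx$$
Note that $\int_{\Xi} p_X(x)\, dx = 1$ for a valid density.
[Figure: $X$ maps an outcome $\omega \in \Omega$ to a value $x = X(\omega) \in \Xi$, and $p_X$ maps $x$ to $p_X(x) \in \Re$.]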
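For a finite random variable, the density can be computed by pushing the probability of each outcome forward through the variable. This is a minimal sketch assuming two fair dice and taking the random variable to be their sum; the names `P_outcome`, `X`, and `density` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed experiment: roll two fair six-sided dice.
omega = list(product(range(1, 7), repeat=2))

def P_outcome(w):
    return Fraction(1, len(omega))  # uniform measure on the 36 outcomes

def X(w):
    """Random variable: the sum of the two dice."""
    return w[0] + w[1]

def density(rv):
    """p(x) = P({w : rv(w) = x}) for every value x that rv can take."""
    p = defaultdict(Fraction)  # missing values start at Fraction(0)
    for w in omega:
        p[rv(w)] += P_outcome(w)
    return dict(p)

p_X = density(X)
print(p_X[7])             # 1/6
print(sum(p_X.values()))  # 1
```

In the finite case the normalization condition is a sum rather than an integral: the density values add up to one.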
Random variables and densities are a layer of abstraction
We usually work with a set of random variables and a joint density; the probability space is implicit.
[Figure: random variables $X$ and $Y$ map an outcome $\omega \in \Omega$ to values $x$ and $y$; a plot shows the joint density $p_{XY}(x, y)$.]
Marginal densities
Given the joint density $p_{XY}(x, y)$ for $X : \Omega \to \Xi$ and $Y : \Omega \to \Upsilon$, we can compute the marginal density of $X$ by
$$p_X(x) = \sum_{y \in \Upsilon} p_{XY}(x, y)$$
when $\Upsilon$ is finite, or by
$$p_X(x) = \int_{\Upsilon} p_{XY}(x, y)\, dy$$
when $\Upsilon$ is infinite.
This process of summing over the unwanted variables is called marginalization.
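In the finite case, marginalization is a sum over the values of the variable being removed. The sketch below assumes a small joint density stored as a dictionary keyed by value pairs; the numbers and the helper name `marginal` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# ASSUMED joint density over pairs (x, y); illustrative values only.
p_XY = {
    (0, "a"): Fraction(1, 10),
    (0, "b"): Fraction(3, 10),
    (1, "a"): Fraction(2, 10),
    (1, "b"): Fraction(4, 10),
}

def marginal(joint, axis):
    """Sum the joint density over every variable except the one at `axis`."""
    out = defaultdict(Fraction)
    for values, p in joint.items():
        out[values[axis]] += p
    return dict(out)

p_X = marginal(p_XY, 0)  # sums over y
p_Y = marginal(p_XY, 1)  # sums over x
print(p_X[0], p_X[1])      # 2/5 3/5
print(p_Y["a"], p_Y["b"])  # 3/10 7/10
```

Both marginals are themselves valid densities, since each one sums to the total mass of the joint.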