##### Document information

**Introduction to Graphical Models**

**Part 2 of 2**

**Lecture 31 of 41**

Docsity.com

**Graphical Models Overview [1]: Bayesian Networks**

*P*(*20s*, *Female*, *Low*, *Non-Smoker*, *No-Cancer*, *Negative*, *Negative*)
= *P*(*T*) · *P*(*F*) · *P*(*L* | *T*) · *P*(*N* | *T*, *F*) · *P*(*N* | *L*, *N*) · *P*(*N* | *N*) · *P*(*N* | *N*)

• **Conditional Independence**

– **X is conditionally independent (CI) of Y given Z (sometimes written X ⊥ Y | Z) iff P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z**

– **Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning), i.e., T ⊥ R | L**

• **Bayesian (Belief) Network**

– **Acyclic directed graph model B = (V, E, Θ) representing CI assertions over a set of random variables**

– **Vertices (nodes) V: denote events (each a random variable)**

– **Edges (arcs, links) E: denote conditional dependencies**

• **Markov Condition for BBNs (Chain Rule):**

P(X1, X2, …, Xn) = ∏i=1..n P(Xi | parents(Xi))

• **Example BBN**

[Figure: example BBN with X1 = Age, X2 = Gender, X3 = Exposure-To-Toxins, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor; parents, descendants, and non-descendants of the query node distinguished]
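The chain rule above can be made concrete with a short script. The three-node chain and all CPT values below are illustrative stand-ins, not the lecture's Age/Gender/…/Lung-Tumor tables:

```python
from math import prod

# A minimal sketch of the BBN chain rule on a hypothetical chain A -> S -> C.
parents = {"A": [], "S": ["A"], "C": ["S"]}
cpt = {  # P(var = 1 | parent assignment); illustrative numbers
    "A": {(): 0.3},
    "S": {(0,): 0.2, (1,): 0.6},
    "C": {(0,): 0.1, (1,): 0.4},
}

def p_node(var, value, assignment):
    key = tuple(assignment[p] for p in parents[var])
    p1 = cpt[var][key]
    return p1 if value == 1 else 1.0 - p1

def joint(assignment):
    # P(X1, ..., Xn) = prod_i P(Xi | parents(Xi))
    return prod(p_node(v, assignment[v], assignment) for v in parents)

asg = {"A": 1, "S": 1, "C": 0}
print(joint(asg))  # 0.3 * 0.6 * (1 - 0.4) = 0.108, up to float rounding

# Sanity check: the joint sums to 1 over all eight assignments
total = sum(joint({"A": a, "S": s, "C": c})
            for a in (0, 1) for s in (0, 1) for c in (0, 1))
```

Each factor in the product conditions only on a node's parents, which is exactly what the Markov condition licenses.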
*


**Fusion, Propagation, and Structuring**

• **Fusion**

– *Methods for combining multiple beliefs*

– **Theory more precise than for fuzzy or ANN inference**

– **Data and sensor fusion**

– **Resolving conflict (vote-taking, winner-take-all, mixture estimation)**

– **Paraconsistent reasoning**

• **Propagation**

– *Modeling the process of evidential reasoning by updating beliefs*

– **Source of parallelism**

– **Natural object-oriented (message-passing) model**

– **Communication: asynchronous – dynamic workpool management problem**

– **Concurrency: known Petri net dualities**

• **Structuring**

– **Learning graphical dependencies from scores, constraints**

– **Two parameter estimation problems: structure learning, belief revision**


**Bayesian Learning**

• **Framework: Interpretations of Probability [Cheeseman, 1985]**

– **Bayesian subjectivist view**

• **A measure of an agent’s belief in a proposition**

• **Proposition denoted by a random variable (sample space: range)**

• **e.g., Pr(Outlook = Sunny) = 0.8**

– **Frequentist view: probability is the frequency of observations of an event**

– **Logicist view: probability is inferential evidence in favor of a proposition**

• **Typical Applications**

– **HCI: learning natural language; intelligent displays; decision support**

– **Approaches: prediction; sensor and data fusion (e.g., bioinformatics)**

• **Prediction: Examples**

– **Measure relevant parameters: temperature, barometric pressure, wind speed**

– **Make a statement of the form Pr(Tomorrow’s-Weather = Rain) = 0.5**

– **College admissions: Pr(Acceptance) = p**

• **Plain beliefs: unconditional acceptance (p = 1) or categorical rejection (p = 0)**

• **Conditional beliefs: depends on reviewer (use probabilistic model)**
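The frequentist interpretation can be shown in two lines: the probability of an event is estimated as its observed frequency. The ten observations below are made up so that the estimate matches the slide's example value of 0.8:

```python
# Frequentist estimate of Pr(Outlook = Sunny) from (made-up) observations.
observations = ["Sunny", "Sunny", "Rain", "Sunny", "Sunny",
                "Overcast", "Sunny", "Sunny", "Sunny", "Sunny"]
p_sunny = observations.count("Sunny") / len(observations)
print(p_sunny)  # 0.8
```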


**Choosing Hypotheses**

• **Bayes’s Theorem**

P(h | D) = P(D | h) P(h) / P(D)

• **MAP Hypothesis**

– **Generally want the most probable hypothesis given the training data**

– **Define: argmax_{x ∈ Ω} f(x) ≡ the value of x in the sample space Ω with the highest f(x)**

– **Maximum a posteriori hypothesis, hMAP:**

hMAP ≡ argmax_{h ∈ H} P(h | D) = argmax_{h ∈ H} P(D | h) P(h) / P(D) = argmax_{h ∈ H} P(D | h) P(h)

• **ML Hypothesis**

– **Assume that P(hi) = P(hj) for all pairs i, j (uniform priors, i.e., P(H) ~ Uniform)**

– **Can further simplify and choose the maximum likelihood hypothesis, hML:**

hML ≡ argmax_{hi ∈ H} P(D | hi)
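The MAP/ML distinction above fits in a few lines of code. The three hypotheses and all probability values are illustrative; the point is that with a strongly non-uniform prior the two criteria can pick different hypotheses:

```python
# Sketch of MAP vs. ML hypothesis choice with illustrative numbers.
priors = {"h1": 0.85, "h2": 0.10, "h3": 0.05}       # P(h)
likelihoods = {"h1": 0.10, "h2": 0.40, "h3": 0.90}  # P(D|h)

# h_MAP = argmax_h P(D|h) P(h); P(D) is the same for every h, so it drops out
h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])
# h_ML = argmax_h P(D|h); equivalent to MAP under uniform priors
h_ml = max(likelihoods, key=lambda h: likelihoods[h])

print(h_map, h_ml)  # h1 h3
```

Here the strong prior on h1 outweighs h3's higher likelihood for the MAP choice, while ML ignores the prior entirely.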


**Propagation Algorithm in Singly-Connected Bayesian Networks – Pearl (1983)**

[Figure: singly-connected network of nodes C1 … C6 exchanging messages]

• **Upward (child-to-parent) λ messages**: λ(Ci) modified during message-passing phase

• **Downward π messages**: P’(Ci) is computed during message-passing phase

**Multiply-connected case: exact, approximate inference are #P-complete**

**(counting problem is #P-complete iff decision problem is NP-complete)**
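The upward (λ) half of Pearl's scheme is easy to sketch on a chain. The network A → B → C, the CPT values, and the names `lam_C`/`lam_B` below are all illustrative assumptions; only the diagnostic pass with evidence at the leaf is shown:

```python
# Sketch of Pearl-style upward (lambda) message passing on the chain
# A -> B -> C with evidence C = 1. All CPT values are illustrative.
pA = {0: 0.7, 1: 0.3}
pB = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}    # P(B|A)
pC = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # P(C|B)

# lambda message from C to B: lambda_C(b) = P(C=1 | b)
lam_C = {b: pC[b][1] for b in (0, 1)}
# lambda message from B to A: lambda_B(a) = sum_b P(b|a) * lambda_C(b)
lam_B = {a: sum(pB[a][b] * lam_C[b] for b in (0, 1)) for a in (0, 1)}
# Belief at the root: P(a | C=1) proportional to P(a) * lambda_B(a)
unnorm = {a: pA[a] * lam_B[a] for a in (0, 1)}
z = sum(unnorm.values())
belief_A = {a: unnorm[a] / z for a in (0, 1)}

# Cross-check by direct enumeration of the joint
num = {a: sum(pA[a] * pB[a][b] * pC[b][1] for b in (0, 1)) for a in (0, 1)}
direct = {a: num[a] / sum(num.values()) for a in (0, 1)}
print(belief_A, direct)  # the two agree
```

On a singly-connected network each message is computed once per edge, which is the source of the parallelism mentioned earlier.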


**Inference by Clustering [1]: Graph Operations (Moralization, Triangulation, Maximal Cliques)**

[Figure: a Bayesian network (acyclic digraph) over nodes A–H; the same graph after moralization; a triangulation with node ordering A1, B2, E3, C4, G5, F6, H7, D8; and the resulting maximal cliques Clq1 = {A1, B2}, Clq2 = {B2, E3, C4}, Clq3 = {E3, C4, G5}, Clq4 = {G5, E3, F6}, Clq5 = {G5, H7, C4}, Clq6 = {D8, C4}]


**Inference by Clustering [2]: Junction Tree – Lauritzen & Spiegelhalter (1988)**

**Input**: list of cliques of **triangulated, moralized graph** *Gu*

**Output**: tree of cliques; separator nodes **Si**, residual nodes **Ri**, and potential probability **Φ(Clqi)** for all cliques

**Algorithm**:

1. **Si** = Clq**i** ∩ (Clq**1** ∪ Clq**2** ∪ … ∪ Clq**i-1**)

2. **Ri** = Clq**i** − **Si**

3. If *i* > 1, then identify a *j* < *i* such that Clq**j** is a parent of Clq**i**

4. Assign each node *v* to a unique clique Clq**i** such that {*v*} ∪ *c*(*v*) ⊆ Clq**i**

5. Compute **Φ(Clqi)** = ∏ P(*v* | *c*(*v*)) over the nodes *v* assigned to Clq**i** {1 if no *v* is assigned to Clq**i**}

6. Store Clq**i**, **Ri**, **Si**, and **Φ(Clqi)** at each vertex in the tree of cliques
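Steps 1–2 of the algorithm can be run directly on the lecture's clique ordering. The code itself is an illustrative sketch; the clique sets are the ones from the example:

```python
# Compute separator sets S_i and residual sets R_i from an ordered clique
# list (steps 1-2 of the junction-tree construction).
cliques = [
    {"A", "B"},       # Clq1
    {"B", "E", "C"},  # Clq2
    {"E", "C", "G"},  # Clq3
    {"E", "G", "F"},  # Clq4
    {"C", "G", "H"},  # Clq5
    {"C", "D"},       # Clq6
]

seen = set()
separators, residuals = [], []
for clq in cliques:
    s = clq & seen               # S_i = Clq_i intersect (Clq_1 u ... u Clq_{i-1})
    separators.append(s)
    residuals.append(clq - s)    # R_i = Clq_i minus S_i
    seen |= clq

# separators: {}, {B}, {E,C}, {E,G}, {C,G}, {C}
# residuals:  {A,B}, {E,C}, {G}, {F}, {H}, {D}
print(separators)
print(residuals)
```

The output reproduces the Si and Ri sets listed on the next slide.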


**Inference by Clustering [3]: Clique-Tree Operations**

[Figure: clique tree with edges Clq1(AB)–Clq2(BEC) over separator B, Clq2(BEC)–Clq3(ECG) over EC, Clq3(ECG)–Clq4(EGF) over EG, Clq3(ECG)–Clq5(CGH) over CG, and Clq5(CGH)–Clq6(CD) over C]

• *Ri*: residual nodes

• *Si*: separator nodes

• **Φ(***Clqi***)**: potential probability of clique *i*

| Clique | Nodes | Ri | Si | Φ(Clqi) |
| --- | --- | --- | --- | --- |
| Clq1 | {A, B} | {A, B} | {} | P(B\|A) P(A) |
| Clq2 | {B, E, C} | {C, E} | {B} | P(C\|B,E) |
| Clq3 | {E, C, G} | {G} | {E, C} | 1 |
| Clq4 | {E, G, F} | {F} | {E, G} | P(E\|F) P(G\|F) P(F) |
| Clq5 | {C, G, H} | {H} | {C, G} | P(H\|C,G) |
| Clq6 | {C, D} | {D} | {C} | P(D\|C) |


**Inference by Loop Cutset Conditioning**

• Split vertex in undirected cycle; condition upon each of its state values

• **Number of network instantiations**: product of arity of nodes in minimal loop cutset

• **Posterior**: marginal conditioned upon cutset variable values

[Figure: the cancer network with cutset node X1 (Age) split into instances X1,1: Age = [0, 10), X1,2: Age = [10, 20), …, X1,10: Age = [100, ∞); remaining nodes X2 = Gender, X3 = Exposure-To-Toxins, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor]

• **Deciding Optimal Cutset**: NP*-hard*

• **Current Open Problems**

– **Bounded cutset conditioning: ordering heuristics**

– **Finding randomized algorithms for loop cutset optimization**
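The conditioning identity behind the method can be checked numerically. The diamond network A → {B, C} → D (loopy once undirected) and all CPT values below are illustrative assumptions, not the lecture's network:

```python
import itertools

# Sketch of cutset conditioning on a diamond network A -> {B,C} -> D.
pA = {0: 0.6, 1: 0.4}
pB = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}              # P(B|A)
pC = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}              # P(C|A)
pD = {(0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.4, 1: 0.6},
      (1, 0): {0: 0.3, 1: 0.7}, (1, 1): {0: 0.05, 1: 0.95}}  # P(D|B,C)

def joint(a, b, c, d):
    return pA[a] * pB[a][b] * pC[a][c] * pD[(b, c)][d]

# Conditioning on the cutset {A} yields |dom(A)| = 2 network instantiations,
# each of which is singly connected.
den = sum(joint(a, b, c, 1) for a, b, c in itertools.product((0, 1), repeat=3))

# Direct query: P(B=1 | D=1) by brute-force marginalization.
direct = sum(joint(a, 1, c, 1) for a in (0, 1) for c in (0, 1)) / den

# Cutset conditioning: P(B=1 | D=1) = sum_a P(A=a | D=1) * P(B=1 | A=a, D=1)
cutset = 0.0
for a in (0, 1):
    den_a = sum(joint(a, b, c, 1) for b, c in itertools.product((0, 1), repeat=2))
    p_a = den_a / den                                       # P(A=a | D=1)
    p_b1 = sum(joint(a, 1, c, 1) for c in (0, 1)) / den_a   # P(B=1 | A=a, D=1)
    cutset += p_a * p_b1

print(direct, cutset)  # the two computations agree
```

The posterior is recovered as the cutset-weighted average of the per-instantiation marginals, exactly as the slide states.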


**Inference by Variable Elimination [1]: Intuition**

http://aima.cs.berkeley.edu/


**Inference by Variable Elimination [2]: Factoring Operations**

http://aima.cs.berkeley.edu/


**Inference by Variable Elimination [3]: Example**

[Figure: network with A = Season, B = Sprinkler, C = Rain, D = Manual Watering, F = Wet, G = Slippery]

Query: P(A | G = 1)

Elimination ordering: d = < A, C, B, F, D, G > (variables are eliminated from the back of d: G, D, F, B, C, A)

Initial factors: P(A), P(B|A), P(C|A), P(D|B,A), P(F|B,C), P(G|F)

Eliminating G first, with evidence G = 1, replaces P(G|F) by the factor λG(F) = ΣG=1 P(G|F)
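The elimination run above can be sketched end to end. The factor helpers (`multiply`, `sum_out`, `eliminate`) and all CPT numbers below are illustrative assumptions; the variable names, evidence, and ordering follow the slide:

```python
import itertools
from functools import reduce

# Sum-product variable elimination for P(A | G=1) on the network above.
# A factor is (scope, table) with table keyed by 0/1 assignments to scope.
fA = (("A",), {(0,): 0.6, (1,): 0.4})                                    # P(A)
fB = (("B", "A"), {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.7})  # P(B|A)
fC = (("C", "A"), {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6})  # P(C|A)
p_d1 = [0.8, 0.5, 0.4, 0.1]   # illustrative P(D=1|B,A) indexed by 2*b+a
fD = (("D", "B", "A"), {(d, b, a): p_d1[2 * b + a] if d else 1 - p_d1[2 * b + a]
                        for d in (0, 1) for b in (0, 1) for a in (0, 1)})
p_f1 = [0.95, 0.2, 0.3, 0.05]  # illustrative P(F=1|B,C) indexed by 2*b+c
fF = (("F", "B", "C"), {(f, b, c): p_f1[2 * b + c] if f else 1 - p_f1[2 * b + c]
                        for f in (0, 1) for b in (0, 1) for c in (0, 1)})
lamG = (("F",), {(0,): 0.1, (1,): 0.95})  # evidence factor lambda_G(F) = P(G=1|F)

def multiply(f1, f2):
    s1, t1 = f1
    s2, t2 = f2
    scope = s1 + tuple(v for v in s2 if v not in s1)
    table = {}
    for vals in itertools.product((0, 1), repeat=len(scope)):
        asg = dict(zip(scope, vals))
        table[vals] = (t1[tuple(asg[v] for v in s1)] *
                       t2[tuple(asg[v] for v in s2)])
    return (scope, table)

def sum_out(factor, var):
    scope, t = factor
    i = scope.index(var)
    out = {}
    for vals, p in t.items():
        key = vals[:i] + vals[i + 1:]
        out[key] = out.get(key, 0.0) + p
    return (scope[:i] + scope[i + 1:], out)

def eliminate(factors, var):
    used = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    return rest + [sum_out(reduce(multiply, used), var)]

factors = [fA, fB, fC, fD, fF, lamG]
for var in ("D", "F", "B", "C"):   # G already handled by the evidence factor
    factors = eliminate(factors, var)
scope, table = reduce(multiply, factors)  # remaining scope: ("A",)
z = sum(table.values())
posterior = {a: table[(a,)] / z for a in (0, 1)}

# Cross-check by brute-force enumeration of the full joint.
num = {0: 0.0, 1: 0.0}
for a, b, c, d, f in itertools.product((0, 1), repeat=5):
    num[a] += (fA[1][(a,)] * fB[1][(b, a)] * fC[1][(c, a)] *
               fD[1][(d, b, a)] * fF[1][(f, b, c)] * lamG[1][(f,)])
brute = {a: num[a] / sum(num.values()) for a in (0, 1)}
print(posterior, brute)  # the two distributions agree
```

Each `eliminate` call multiplies only the factors mentioning the variable and sums it out, so no table ever spans the full joint; that is the efficiency gain over enumeration.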
