Machine Learning 10-601, Lecture notes of Artificial Intelligence

Carnegie Mellon University (CMU), Artificial Intelligence

A set of notes from a lecture on Machine Learning given by Tom M. Mitchell at Carnegie Mellon University. The lecture covers graphical models and Bayes nets (inference, learning, and EM), with readings from Bishop chapter 8 and Mitchell chapter 6, and gives the logistics for the midterm. It discusses belief propagation, Monte Carlo methods, and variational methods for tractable approximate solutions, and includes an example of generating a sample from a joint distribution and estimating marginals. The lecture also covers the EM algorithm for learning from partly observed data.

Typology: Lecture notes

2021/2022

Uploaded on 05/11/2023


Machine Learning 10-601

Tom M. Mitchell

Machine Learning Department

Carnegie Mellon University

February 25, 2015

Today:

• Graphical models

• Bayes Nets:

• Inference

• Learning

• EM

Readings:

• Bishop chapter 8

• Mitchell chapter 6


Midterm

In class on Monday, March 2
Closed book
You may bring an 8.5x11 “cheat sheet” of notes
Covers all material through today
Be sure to come on time. We’ll start precisely at 12 noon

What You Should Know

Bayes nets are a convenient representation for encoding dependencies / conditional independence
BN = Graph plus parameters of CPDs
- Defines joint distribution over variables
- Can calculate everything else from that
- Though inference may be intractable
Reading conditional independence relations from the graph
- Each node is conditionally independent of its non-descendants, given only its parents
- X and Y are conditionally independent given Z if Z D-separates every path connecting X to Y
- Marginal independence: special case where Z={}

Inference in Bayes Nets

In general, intractable (NP-complete)
For certain cases, tractable
- Assigning probability to fully observed set of variables
- Or if just one variable unobserved
- Or for singly connected graphs (i.e., no undirected loops)
  - Belief propagation
Sometimes use Monte Carlo methods
- Generate many samples according to the Bayes Net distribution, then count up the results
Variational methods for tractable approximate solutions

Prob. of joint assignment: easy

Suppose we are interested in the joint assignment <F=f,A=a,S=s,H=h,N=n>
What is P(f,a,s,h,n)?
Let’s use p(a,b) as shorthand for p(A=a, B=b)
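
As a sketch of why this case is easy: the sampling order given later in these notes (F, A, then S|A,F, then H|S and N|S) implies the factorization p(f,a,s,h,n) = p(f) p(a) p(s|f,a) p(h|s) p(n|s), so the joint probability is one CPT lookup per node. The CPT numbers and the names `bern` and `joint` below are hypothetical, chosen only for illustration.

```python
# Hypothetical CPT values, invented for illustration (not from the lecture).
P_F1 = 0.1                                   # P(F=1)
P_A1 = 0.2                                   # P(A=1)
P_S1 = {(0, 0): 0.05, (0, 1): 0.6,
        (1, 0): 0.7, (1, 1): 0.9}            # P(S=1 | F=f, A=a), keyed by (f, a)
P_H1 = {0: 0.1, 1: 0.8}                      # P(H=1 | S=s), keyed by s
P_N1 = {0: 0.05, 1: 0.7}                     # P(N=1 | S=s), keyed by s


def bern(p_one, value):
    """Probability that a binary variable with P(value=1) = p_one equals `value`."""
    return p_one if value == 1 else 1.0 - p_one


def joint(f, a, s, h, n):
    """p(f,a,s,h,n) = p(f) p(a) p(s|f,a) p(h|s) p(n|s): one CPT lookup per node."""
    return (bern(P_F1, f) * bern(P_A1, a) * bern(P_S1[(f, a)], s)
            * bern(P_H1[s], h) * bern(P_N1[s], n))


# p(F=1) p(A=0) p(S=1|F=1,A=0) p(H=1|S=1) p(N=0|S=1)
print(joint(1, 0, 1, 1, 0))
```

The cost is five multiplications regardless of how many other assignments exist, which is what makes a fully observed assignment tractable.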

Prob. of marginals: not so easy

How do we calculate P(N=n)?
Let’s use p(a,b) as shorthand for p(A=a, B=b)
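
One way to see why marginals are harder: P(N=n) requires summing the joint over every assignment of the other variables. The sketch below does this by brute-force enumeration, assuming the factorization p(f) p(a) p(s|f,a) p(h|s) p(n|s) and hypothetical CPT values; `marginal_n` is an illustrative name, not something from the lecture.

```python
from itertools import product

# Hypothetical CPT values, invented for illustration (not from the lecture).
P_F1 = 0.1                                   # P(F=1)
P_A1 = 0.2                                   # P(A=1)
P_S1 = {(0, 0): 0.05, (0, 1): 0.6,
        (1, 0): 0.7, (1, 1): 0.9}            # P(S=1 | F=f, A=a)
P_H1 = {0: 0.1, 1: 0.8}                      # P(H=1 | S=s)
P_N1 = {0: 0.05, 1: 0.7}                     # P(N=1 | S=s)


def bern(p_one, value):
    """Probability that a binary variable with P(value=1) = p_one equals `value`."""
    return p_one if value == 1 else 1.0 - p_one


def joint(f, a, s, h, n):
    """Joint probability as a product of one CPT entry per node."""
    return (bern(P_F1, f) * bern(P_A1, a) * bern(P_S1[(f, a)], s)
            * bern(P_H1[s], h) * bern(P_N1[s], n))


def marginal_n(n):
    """P(N=n) = sum over all f, a, s, h of p(f,a,s,h,n): 2**4 terms here."""
    return sum(joint(f, a, s, h, n) for f, a, s, h in product((0, 1), repeat=4))


print(marginal_n(1))
```

With five binary variables the sum has only 16 terms, but the number of terms doubles with each additional unobserved variable, which is why naive marginalization does not scale.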

Generating a sample from joint distribution: easy

How can we generate random samples drawn according to P(F,A,S,H,N)?
Hint: random sample of F according to P(F=1) = θ_{F=1}:
- draw a value of r uniformly from [0,1]
- if r<θ_{F=1} then output F=1, else F=0
Solution:
- draw a random value f for F, using its CPD
- then draw values for A, for S|A,F, for H|S, for N|S
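
The procedure above can be sketched as follows: each variable is drawn with the uniform-threshold trick from the hint, visiting parents before children. The CPT values and the function names `sample_bernoulli` and `sample_joint` are hypothetical. Note that Python's `random.random()` draws from [0, 1) rather than [0, 1], which does not change the resulting distribution.

```python
import random

# Hypothetical CPT values, invented for illustration (not from the lecture).
P_F1 = 0.1                                   # P(F=1)
P_A1 = 0.2                                   # P(A=1)
P_S1 = {(0, 0): 0.05, (0, 1): 0.6,
        (1, 0): 0.7, (1, 1): 0.9}            # P(S=1 | F=f, A=a)
P_H1 = {0: 0.1, 1: 0.8}                      # P(H=1 | S=s)
P_N1 = {0: 0.05, 1: 0.7}                     # P(N=1 | S=s)


def sample_bernoulli(theta, rng):
    """Draw r uniformly; output 1 if r < theta, else 0."""
    r = rng.random()
    return 1 if r < theta else 0


def sample_joint(rng):
    """Draw one (f, a, s, h, n), sampling every node after its parents."""
    f = sample_bernoulli(P_F1, rng)
    a = sample_bernoulli(P_A1, rng)
    s = sample_bernoulli(P_S1[(f, a)], rng)   # S depends on F and A
    h = sample_bernoulli(P_H1[s], rng)        # H depends on S
    n = sample_bernoulli(P_N1[s], rng)        # N depends on S
    return f, a, s, h, n


rng = random.Random(0)                        # fixed seed for repeatability
for _ in range(3):
    print(sample_joint(rng))
```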

Generating a sample from joint distribution: easy

Note we can estimate marginals like P(N=n) by generating many samples from the joint distribution, then counting the fraction of samples for which N=n
Similarly, for anything else we care about, e.g. P(F=1|H=1, N=0)
→ weak but general method for estimating any probability term…
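
A minimal sketch of this counting estimate, under hypothetical CPT values: samples are drawn from the joint, samples that contradict the evidence are discarded, and the estimate is the fraction of the remaining samples that match the query. The function `estimate` and its dictionary-based interface are assumptions made for illustration.

```python
import random

# Hypothetical CPT values, invented for illustration (not from the lecture).
P_F1, P_A1 = 0.1, 0.2                        # P(F=1), P(A=1)
P_S1 = {(0, 0): 0.05, (0, 1): 0.6,
        (1, 0): 0.7, (1, 1): 0.9}            # P(S=1 | F=f, A=a)
P_H1 = {0: 0.1, 1: 0.8}                      # P(H=1 | S=s)
P_N1 = {0: 0.05, 1: 0.7}                     # P(N=1 | S=s)


def sample_joint(rng):
    """Draw one sample, visiting each node after its parents."""
    f = int(rng.random() < P_F1)
    a = int(rng.random() < P_A1)
    s = int(rng.random() < P_S1[(f, a)])
    h = int(rng.random() < P_H1[s])
    n = int(rng.random() < P_N1[s])
    return {"F": f, "A": a, "S": s, "H": h, "N": n}


def estimate(query, evidence, num_samples, rng):
    """Estimate P(query | evidence) as a ratio of sample counts."""
    matches = hits = 0
    for _ in range(num_samples):
        x = sample_joint(rng)
        if all(x[v] == val for v, val in evidence.items()):
            matches += 1                      # sample agrees with the evidence
            if all(x[v] == val for v, val in query.items()):
                hits += 1                     # ...and also with the query
    return hits / matches if matches else float("nan")


rng = random.Random(0)
print(estimate({"N": 1}, {}, 100_000, rng))                  # P(N=1)
print(estimate({"F": 1}, {"H": 1, "N": 0}, 100_000, rng))    # P(F=1 | H=1, N=0)
```

The method is "weak" in the sense that rare evidence leaves few usable samples, so the conditional estimate is noisier than the marginal one for the same number of draws.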

Learning of Bayes Nets

Four categories of learning problems
- Graph structure may be known/unknown
- Variable values may be fully observed / partly unobserved
Easy case: learn parameters when graph structure is known, and data is fully observed
Interesting case: graph known, data partly known
Gruesome case: graph structure unknown, data partly unobserved

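Learning CPTs from Fully Observed Data

[Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]

Notation: k indexes the k-th training example; δ(x) = 1 if x=true, = 0 if x=false
Example: Consider learning a CPT parameter such as θ_{s|ij} ≡ P(S=1|F=i,A=j)
Max Likelihood Estimate is θ_{s|ij} = Σ_k δ(f_k=i, a_k=j, s_k=1) / Σ_k δ(f_k=i, a_k=j)
Remember why?
Let’s use p(a,b) as shorthand for p(A=a, B=b)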
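
With fully observed data, the maximum likelihood estimate of a CPT entry is a ratio of counts over the training examples. A minimal sketch, assuming the parameter of interest is P(S=1|F=i,A=j); the tiny dataset and the name `mle_s_given` are made up for illustration.

```python
def mle_s_given(data, i, j):
    """MLE of P(S=1 | F=i, A=j): (# examples with F=i, A=j, S=1) / (# with F=i, A=j)."""
    numerator = sum(1 for (f, a, s, h, n) in data if f == i and a == j and s == 1)
    denominator = sum(1 for (f, a, s, h, n) in data if f == i and a == j)
    # No example with this parent configuration means the estimate is undefined.
    return numerator / denominator if denominator else float("nan")


# Made-up, fully observed training examples in the order (F, A, S, H, N).
data = [
    (1, 0, 1, 1, 1),
    (1, 0, 1, 0, 1),
    (1, 0, 0, 0, 0),
    (0, 0, 0, 0, 0),
    (0, 1, 1, 1, 0),
    (0, 0, 0, 1, 0),
]

# Two of the three examples with F=1, A=0 have S=1.
print(mle_s_given(data, 1, 0))
```

Each `sum` plays the role of a sum of δ indicators over the training examples k.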

Estimate from partly observed data

What if F, A, H, N are observed, but not S?
Let X be all observed variable values (over all examples)
Let Z be all unobserved variable values
Can’t calculate the MLE, because it requires the values of Z: θ ← argmax_θ log P(X,Z|θ)
WHAT TO DO?

EM seeks* to estimate: θ ← argmax_θ E_{Z|X,θ}[ log P(X,Z|θ) ]
* EM guaranteed to find local maximum

EM Algorithm - Informally

EM is a general procedure for learning from partly observed data
Given observed variables X, unobserved Z (X={F,A,H,N}, Z={S})
Begin with arbitrary choice for parameters θ
Iterate until convergence:
- E Step: estimate the values of unobserved Z, using θ
- M Step: use observed values plus E-step estimates to derive a better θ
Guaranteed to find local maximum. Each iteration increases E_{P(Z|X,θ)}[ log P(X,Z|θ') ]

EM Algorithm - Precisely

EM is a general procedure for learning from partly observed data
Given observed variables X, unobserved Z (X={F,A,H,N}, Z={S})
Define Q(θ'|θ) = E_{P(Z|X,θ)}[ log P(X,Z|θ') ]
Iterate until convergence:
- E Step: Use X and current θ to calculate P(Z|X,θ)
- M Step: Replace current θ by θ ← argmax_{θ'} Q(θ'|θ)
Guaranteed to find local maximum. Each iteration increases Q(θ'|θ)
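
A rough sketch of EM for this network with S unobserved, under several assumptions: all variables are binary, the data is synthetic (generated from made-up CPTs and then stripped of S), and only the CPTs that involve S are re-estimated, since P(F) and P(A) can be counted directly from the observed data. The names `posterior_s1`, `em_step`, and `log_likelihood` are illustrative, and degenerate cases (zero expected counts) are not guarded against.

```python
import math
import random


def bern(p_one, value):
    """Probability that a binary variable with P(value=1) = p_one equals `value`."""
    return p_one if value == 1 else 1.0 - p_one


def posterior_s1(x, theta):
    """Return P(S=1 | f,a,h,n, theta) and the likelihood p(h,n | f,a, theta)."""
    f, a, h, n = x
    w1 = theta["S"][(f, a)] * bern(theta["H"][1], h) * bern(theta["N"][1], n)
    w0 = (1 - theta["S"][(f, a)]) * bern(theta["H"][0], h) * bern(theta["N"][0], n)
    return w1 / (w1 + w0), w1 + w0


def em_step(data, theta):
    # E step: posterior probability that S=1 for every training example.
    q = [posterior_s1(x, theta)[0] for x in data]
    # M step: the usual count ratios, with expected counts in place of counts.
    new = {"S": dict(theta["S"]), "H": {}, "N": {}}
    for fa in theta["S"]:
        rows = [qk for x, qk in zip(data, q) if (x[0], x[1]) == fa]
        if rows:                              # keep old value if (f, a) never occurs
            new["S"][fa] = sum(rows) / len(rows)
    for s in (0, 1):
        w = [qk if s == 1 else 1 - qk for qk in q]
        new["H"][s] = sum(wk * x[2] for wk, x in zip(w, data)) / sum(w)
        new["N"][s] = sum(wk * x[3] for wk, x in zip(w, data)) / sum(w)
    return new


def log_likelihood(data, theta):
    """Observed-data log likelihood, omitting the P(F), P(A) terms (no hidden part)."""
    return sum(math.log(posterior_s1(x, theta)[1]) for x in data)


def make_data(num, rng):
    """Synthetic examples (f, a, h, n) from made-up CPTs; S is sampled, then dropped."""
    true_s = {(0, 0): 0.05, (0, 1): 0.6, (1, 0): 0.7, (1, 1): 0.9}
    data = []
    for _ in range(num):
        f, a = int(rng.random() < 0.3), int(rng.random() < 0.4)
        s = int(rng.random() < true_s[(f, a)])
        h = int(rng.random() < (0.8 if s else 0.1))
        n = int(rng.random() < (0.7 if s else 0.05))
        data.append((f, a, h, n))
    return data


rng = random.Random(0)
data = make_data(500, rng)
theta = {"S": {(f, a): rng.uniform(0.2, 0.8) for f in (0, 1) for a in (0, 1)},
         "H": {0: 0.3, 1: 0.6}, "N": {0: 0.3, 1: 0.6}}   # arbitrary starting point
history = [log_likelihood(data, theta)]
for _ in range(50):
    theta = em_step(data, theta)
    history.append(log_likelihood(data, theta))
print(history[0], history[-1])
```

The recorded `history` should never decrease from one iteration to the next, which is the practical face of the local-maximum guarantee. Because S is never observed, the learned parameters may converge with the roles of S=0 and S=1 swapped relative to the generating model.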
