









Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Presentation tips for delivering a talk on unsupervised learning, specifically focusing on expectation maximization and clustering. It also covers the structure of a final report, including its contents and evaluation criteria. The document concludes with an introduction to unsupervised probability modeling and hidden variables, leading to the concept of mixture models and the expectation-maximization algorithm.
Typology: Study notes
1 / 17
This page cannot be seen from the preview
Don't miss anything!










Practice!
Work on knowing what you’re going to say at each point.
Know your own presentation
Practice!
Work on timing
You have 15 minutes to talk + 3 minutes for questions
Will be graded on adherence to time!
Timing is hard. Becomes easier as you practice
Due: Dec 17, 5:00 PM (last day of finals week)
Should contain:
Intro: what was your problem; why should we care about it?
Background: what have other people done?
Your work: what did you do? Was it novel or re- implementation? (Algorithms, descriptions, etc.)
Results: Did it work? How do we know? (Experiments, plots & tables, etc.)
Discussion: What did you/we learn from this?
Future work: What would you do next/do over?
Length: Long enough to convey all that
Will be graded on:
Content: Have you accomplished what you set out to? Have you demonstrated your conclusions? Have you described what you did well?
Analysis: have you thought clearly about what you accomplished, drawn appropriate conclusions, formulated appropriate “future work”, etc?
Writing and clarity: Have you conveyed your ideas clearly and concisely? Are all of your conclusions supported by arguments? Are your algorithms/data/etc. described clearly?
General clustering framework:
Set target of k clusters
Choose a cluster optimality criterion
Often function of “between-cluster variation” vs. “within-cluster variation”
Find assignment of points to clusters that minimizes (maximizes) this criterion
Q: Given N data points and k clusters, how many possible clusterings are there?
Define:
Cluster i :
Cluster i mean:
Between-cluster variation:
Within-cluster variation:
i 1
i 2
i n
i
|C (^) i |
j=
i j
between
2 k
i= k
j=
i
j
2
k
i=
i
|C (^) i |
j=
i j
2
http://www.geophysik.ruhr-uni-bochum.de/index.php?id=3&sid= Clustering of seismological data
Sometimes, instead of clusters want a full probability model of data
Can sometimes use prob. model to get clusters
Recall: in supervised learning, we said:
Find a probability model, Pr[ X |C i ] for each class, C i
Now: find a prob. model for data w/o knowing class: Pr[ X ]
Simplest: fit your favorite model via ML
Harder: assume a “hidden cluster ID” variable
This form is called a “mixture model”
“mixture” of k sub-models
Equivalent to the process: Roll a weighted die (weighted by α i); choose the corresponding sub- model; generate a data point from that sub-model
Example: mixture of Gaussians:
k
i=
i
d
1 2
− 1 2 (X− ¯ Xi ) T Σ − 1 i (X− ¯ Xi )
How do you find the params, etc?
Simple answer: use maximum likelihood:
Write down joint likelihood function
Differentiate
Set equal to 0
Solve for params
Unfortunately... It doesn’t work in this case
Good exercise: try it and see why it breaks
Answer: Expectation Maximization
Assume: data generated from 1-d mixture of Gaussians:
Whole data set:
Introduce a “responsibility” variable:
If you know model params, can calculate responsibilities
k
i=
i
− (x−μ i ) 2 2 σ 2 i
j
i
j
k l=
l
i
l
Assume you know the responsibilities, zij
Can use this to find parameters for each Gaussian (think about special case where zij =0 or 1 ):
j
N i=
N i=
j
N i=
2
N i=
j
N i=
k a=
N i=
N i=