Supervised Machine Learning in Image Processing and Analysis - Lecture Slides | CSE 591, Study notes of Computer Science

Material Type: Notes; Professor: Hakenberg; Class: Introduction to Image Processing and Analysis; Subject: Computer Science and Engineering; University: Arizona State University - Tempe; Term: Fall 2008;

Uploaded on 09/02/2009

CSE 591

Supervised machine learning

Fall 2008 http://www.public.asu.edu/~jhakenbe/591/

Supervised learning

For classification tasks:

  • suppose each of the given data points falls into one of two classes
    • text categorization: email spam or ham
    • NER: name of a person or not
    • POS tagging: noun or not noun
    • weather: rain or no rain
  • and we know the class (= label) of each of them
  • for a new data point, the goal is to decide which class it belongs to
  • can be extended to multiple classes

Naïve Bayes classification

  • pick the hypothesis that is most probable ➠ maximum a posteriori (MAP) decision rule
  • text classification: assign to a new document $d'$ the class $c_{MAP} = \arg\max_c P(c \mid d')$
  • two-class scenarios (e.g., “ham or spam”): if $\log \frac{P(c_+ \mid d')}{P(c_- \mid d')} > 0$ ➱ ham ($c_+$), spam ($c_-$) otherwise
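
The two-class decision rule above can be sketched in code. This is a minimal illustration, not the method prescribed by the slides: it assumes a multinomial event model with add-one smoothing (the slides name neither), and the function names and the four toy messages are made up for the example.

```python
import math
from collections import Counter

def train_naive_bayes(docs):
    """docs: list of (tokens, label) pairs. Returns (priors, token_counts, vocab)."""
    class_counts = Counter(label for _, label in docs)
    token_counts = {label: Counter() for label in class_counts}
    for tokens, label in docs:
        token_counts[label].update(tokens)
    vocab = {t for tokens, _ in docs for t in tokens}
    priors = {c: n / len(docs) for c, n in class_counts.items()}
    return priors, token_counts, vocab

def log_posterior(tokens, c, priors, token_counts, vocab):
    """Unnormalized log P(c | d') = log P(c) + sum over tokens of log P(t | c)."""
    total = sum(token_counts[c].values())
    score = math.log(priors[c])
    for t in tokens:
        if t in vocab:  # tokens never seen in training are ignored
            # add-one smoothing (an assumption of this sketch)
            score += math.log((token_counts[c][t] + 1) / (total + len(vocab)))
    return score

def classify(tokens, model, positive="ham", negative="spam"):
    """Return `positive` if the log ratio of the two posteriors is > 0."""
    log_odds = (log_posterior(tokens, positive, *model)
                - log_posterior(tokens, negative, *model))
    return positive if log_odds > 0 else negative

# Toy training set (invented for illustration).
training = [
    ("meeting at noon".split(), "ham"),
    ("lunch meeting tomorrow".split(), "ham"),
    ("win cash now".split(), "spam"),
    ("cash prize win".split(), "spam"),
]
model = train_naive_bayes(training)
label_a = classify("win cash".split(), model)
label_b = classify("lunch meeting".split(), model)
```

Working with log probabilities avoids numerical underflow on long documents, and the normalizing term P(d') cancels in the ratio, so it never has to be computed.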

Rocchio

  • vector space representation of objects
  • for every class $c$, compute the centroid: the mean of the vectors of the documents $d \in D_c$, where $D_c$ is the set of documents with class $c$
  • classification: given a new object, find the closest centroid ➠ class
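
The centroid computation and nearest-centroid decision can be sketched as follows. This is a minimal example under stated assumptions: Euclidean distance is used as the closeness measure (the slides do not fix one; cosine similarity is a common alternative), and the function names and two-dimensional toy vectors are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def train_rocchio(labeled_vectors):
    """labeled_vectors: list of (vector, label). Returns {label: centroid}."""
    by_class = {}
    for vec, label in labeled_vectors:
        by_class.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_class.items()}

def classify_rocchio(x, centroids):
    """Assign x to the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda c: math.dist(x, centroids[c]))

# Toy data (invented): two features per object.
data = [
    ((1.0, 0.0), "ham"), ((2.0, 1.0), "ham"),
    ((0.0, 3.0), "spam"), ((1.0, 4.0), "spam"),
]
centroids = train_rocchio(data)
near_ham = classify_rocchio((2.0, 0.0), centroids)
near_spam = classify_rocchio((0.0, 4.0), centroids)
```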

Support Vector Machines

  • in Rocchio and kNN, neighborhood defines class boundaries ➠ hyperplanes
  • SVM: calculate the hyperplane(s) directly
  • in an n-dimensional space, we try to find an (n−1)-dimensional hyperplane ➠ linear classifier
  • there are many such classifiers ➠ try to achieve maximum separation ➠ maximum margin ➠ maximum margin hyperplane ➠ maximum margin classifier

Formalization of SVMs

  • data points given as $(x_i, c_i)$, $x_i \in \mathbb{R}^n$, $c_i \in \{-1, +1\}$
  • a hyperplane can be written as $w \cdot x - b = 0$ ➠ two margins: $w \cdot x - b = 1$ and $w \cdot x - b = -1$
  • $w$ is a normal vector: it is perpendicular to the hyperplane
  • the distance between the two hyperplanes is $2 / \|w\|$
  • maximum distance ➱ minimize $\|w\|$
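
The relation between the length of w and the margin can be checked numerically. The sketch below assumes the usual hard-margin constraint, that every training point lies on or outside its margin, i.e. c_i (w · x_i − b) ≥ 1; this constraint is not spelled out in the bullets above. The two candidate hyperplanes and the four points are invented for illustration.

```python
import math

def margin_width(w):
    """Distance between the hyperplanes w.x - b = 1 and w.x - b = -1."""
    return 2.0 / math.sqrt(sum(wk * wk for wk in w))

def satisfies_margin(w, b, points, labels):
    """True if c_i * (w.x_i - b) >= 1 holds for every training point."""
    return all(
        c * (sum(wk * xk for wk, xk in zip(w, x)) - b) >= 1
        for x, c in zip(points, labels)
    )

# Toy data (invented): two points per class.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [+1, +1, -1, -1]

# Two separating hyperplanes that both respect the constraint.
wide = ((0.5, 0.5), 1.0)    # (w, b) with the smaller ||w||
narrow = ((1.0, 1.0), 2.0)  # (w, b) with the larger ||w||

wide_ok = satisfies_margin(*wide, points, labels)
narrow_ok = satisfies_margin(*narrow, points, labels)
wide_margin = margin_width(wide[0])
narrow_margin = margin_width(narrow[0])
```

Both candidates separate the data, but the one with the smaller norm has the wider margin, which is why maximizing the distance amounts to minimizing ||w||.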

Support vectors & dual form

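  • maximum margin hyperplane is a function only of the training data that lie on the margin ➠ support vectors ➠ the same holds for the classification task
  • classification rule in its dual form: maximize $\sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j c_i c_j \, x_i \cdot x_j$
  • subject to $\alpha_i \ge 0$ and $\sum_i \alpha_i c_i = 0$
  • α terms constitute a dual representation for the weight vector $w$ in terms of the training set: $w = \sum_i \alpha_i c_i x_i$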
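
To make the role of the α terms concrete, the sketch below recovers w and b from a given set of α values and classifies new points using only the support vectors. It does not solve the optimization problem: the α values are hand-derived for this four-point toy set, and all names and data are assumptions of the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weight_vector(alphas, labels, points):
    """w = sum_i alpha_i * c_i * x_i."""
    dim = len(points[0])
    return [
        sum(a * c * x[k] for a, c, x in zip(alphas, labels, points))
        for k in range(dim)
    ]

def offset(alphas, labels, points):
    """b from any support vector: w.x_i - b = c_i, so b = w.x_i - c_i."""
    w = weight_vector(alphas, labels, points)
    for a, c, x in zip(alphas, labels, points):
        if a > 0:
            return dot(w, x) - c
    raise ValueError("no support vector (all alphas are zero)")

def classify_dual(x, alphas, labels, points):
    """Sign of sum_i alpha_i c_i (x_i . x) - b; only support vectors contribute."""
    b = offset(alphas, labels, points)
    score = sum(
        a * c * dot(xi, x)
        for a, c, xi in zip(alphas, labels, points) if a > 0
    ) - b
    return 1 if score > 0 else -1

# Toy data (invented). Only (2, 2) and (0, 0) lie on the margin,
# so only their alphas are non-zero.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [+1, +1, -1, -1]
alphas = [0.25, 0.0, 0.25, 0.0]

w = weight_vector(alphas, labels, points)
b = offset(alphas, labels, points)
pred_pos = classify_dual((3.0, 1.0), alphas, labels, points)
pred_neg = classify_dual((0.0, 1.0), alphas, labels, points)
```

Dropping the two points with α = 0 from the training set would leave w, b, and every prediction unchanged.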

Non-linear classification

  • SVMs are generalized linear classifiers
  • but not all problems are linearly separable (in $\mathbb{R}^n$)
  • kernel trick: replace every dot product with a (non-linear) kernel function ➠ fit hyperplane in a transformed feature space where a linear solution exists
    • transformation may be non-linear
    • transformed space is high-dimensional
  • kernels essentially provide the mapping and compute a similarity between two data points
    • $k(x, x') = (x \cdot x')^d$ (polynomial kernel)
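
A small numerical check shows what "providing the mapping" means for the polynomial kernel. The sketch assumes d = 2 and two-dimensional inputs; the explicit feature map `phi` is one standard choice for this case and is written out only for illustration, since the point of the kernel is that it never has to be computed.

```python
import math

def polynomial_kernel(x, y, d=2):
    """k(x, x') = (x . x')^d."""
    return sum(a * b for a, b in zip(x, y)) ** d

def phi(x):
    """Explicit feature map for d = 2 in two dimensions (illustration only)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, y = (1.0, 2.0), (3.0, 4.0)

# Kernel evaluated in the original 2-dimensional space.
k_value = polynomial_kernel(x, y)

# Plain dot product evaluated in the transformed 3-dimensional space.
explicit_value = sum(a * b for a, b in zip(phi(x), phi(y)))
```

Both values agree, so a linear classifier in the transformed space can be trained and applied using kernel evaluations in the original space alone.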

Class project phase 2

  • use naïve Bayes, kNN, or SVM
    • SVM for Java: libSVM, jSVM; console/C: SVMlight
  • find useful features
    • tokens in the sentence
    • entity types (and number) in the sentence
    • distance of associated entities
    • tokens / POS tags between associated entities
  • simplest model (baseline):
    • “co-occurrence in a sentence ➱ association”
  • compare baseline with your approach
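
The baseline and two of the listed features can be sketched as below. Everything concrete here is an assumption: the input format (a token list plus entity positions), the placeholder entity names, and the use of precision, recall, and F1 for the comparison, which the slides do not specify.

```python
from itertools import combinations

def cooccurrence_baseline(entities):
    """Predict an association for every pair of distinct entities in a sentence."""
    return {tuple(sorted(pair)) for pair in combinations(set(entities), 2)}

def pair_features(tokens, i, j):
    """Features for two entity mentions at token positions i and j."""
    lo, hi = sorted((i, j))
    return {
        "distance": hi - lo,                  # distance of associated entities
        "tokens_between": tokens[lo + 1:hi],  # tokens between the entities
    }

def precision_recall_f1(predicted, gold):
    """Compare a set of predicted pairs against a set of gold-standard pairs."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy sentence with placeholder entities A, B, C (invented).
tokens = "A binds B but not C".split()
predicted = cooccurrence_baseline(["A", "B", "C"])
gold = {("A", "B")}  # hypothetical annotation

features = pair_features(tokens, 0, 2)
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

The co-occurrence baseline typically reaches full recall at the sentence level but low precision, which gives a trained classifier a clear target to beat.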

References

  • Tom Mitchell: Introduction to Machine Learning
  • Wikipedia
  • Manning et al.: Introduction to Information Retrieval. Online: http://nlp.stanford.edu/IR-book/html/htmledition/irbook.html