Features, Features, Features - Lecture Slides | CSE 591, Study notes of Computer Science

Material Type: Notes; Professor: Hakenberg; Class: Introduction to Image Processing and Analysis; Subject: Computer Science and Engineering; University: Arizona State University - Tempe; Term: Fall 2008;

Uploaded on 09/02/2009

CSE 591
Features, features, features
Fall 2008
http://www.public.asu.edu/~jhakenbe/591/

Features, features, features

• What is a feature?

• Software

  • a property, gimmick (+/-)
  • or something unexpected/surprising (+/-) ➠ “It’s not a bug, it’s a feature”
  • that distinguishes it from (most) others

• Machine learning, NLP

  • a property of a class of {data, words, documents, entities, …}
  • that distinguishes it from (most) others (or: helps to distinguish it)
  • a property is common to members of the class, but not mandatory
  • “positive feature” ➠ many members of the class show it
  • “negative feature” ➠ (almost) no members of the class show it

Why & where do features help?

  • features help to distinguish instances of one class from all others & to group instances together ➱ distance versus similarity
  • a feature known to be positive with respect to one class applies to the current word ➠ increases the likelihood of the word belonging to that class
  • a negative feature ➠ decreases that likelihood
  • a word ends with “-tion” ➠ likely a noun
  • a word ends with “-ly” ➠ unlikely to be a noun
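
The effect of positive and negative features on a class decision can be sketched as a simple additive score. This is only a minimal illustration, not the method from the lecture: the weight values and the helper `noun_score` are made up for the example.

```python
# Hypothetical hand-set weights: positive values speak for the class "noun",
# negative values speak against it. The numbers are illustrative only.
SUFFIX_WEIGHTS = {
    "tion": 1.5,   # positive feature for nouns
    "ly": -1.0,    # negative feature for nouns
}


def noun_score(word):
    """Sum the weights of all known suffixes that the word ends with."""
    lowered = word.lower()
    return sum(
        weight
        for suffix, weight in SUFFIX_WEIGHTS.items()
        if lowered.endswith(suffix)
    )


for word in ["station", "quickly", "table"]:
    print(word, noun_score(word))
```

A word with no known suffix gets a score of zero, so the features neither raise nor lower its likelihood of being a noun.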

Feature engineering

  • you can define relevant (+/-) features yourself
    • write down all +/- suffixes ➠ that includes telling which one is + or -
  • you can define classes of features
    • suffixes (1-suffix, 2-suffix, …), prefixes, n-grams; capitalization, occurrence of symbols or numbers, surface features (“aaa0”, “Aaaa”)
    • part-of-speech, dictionary-lookup, …
    • for current word and/or within window
  • and have a machine learner decide which instances of the feature classes (= single features) make sense to use for which kind of decision (+/-)
    • e.g., finds “-tion” as an instance of 4-suffix and that it is useful (with a certain positive value) to distinguish nouns from other things
    • e.g., finds “-ly” as an instance of 2-suffix and that it is useful (with a certain negative value) to distinguish other things from nouns
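
The split between feature classes (defined by hand) and single features (weighted by a learner) might look like the sketch below. It rests on several assumptions: the feature names (`suf4=tion`, `shape=Aaa0`, …), the helpers `extract_features` and `learn_weights`, and the toy word list are all invented for illustration, and a smoothed log-odds count stands in for a real machine learner.

```python
import math
from collections import Counter


def extract_features(word, max_affix=4):
    """Return instances of a few hand-defined feature classes for one word."""
    feats = set()
    lowered = word.lower()
    for k in range(1, min(max_affix, len(word)) + 1):
        feats.add(f"suf{k}={lowered[-k:]}")   # instance of the k-suffix class
        feats.add(f"pre{k}={lowered[:k]}")    # instance of the k-prefix class
    if word[:1].isupper():
        feats.add("capitalized")
    if any(ch.isdigit() for ch in word):
        feats.add("has_digit")
    # Surface feature: letters become A/a, digits become 0, e.g. "Abc1" -> "Aaa0".
    shape = "".join(
        "A" if ch.isupper() else "a" if ch.isalpha() else "0" if ch.isdigit() else ch
        for ch in word
    )
    feats.add(f"shape={shape}")
    return feats


def learn_weights(labeled_words):
    """Smoothed log-odds per single feature: > 0 favors nouns, < 0 speaks against."""
    pos, neg = Counter(), Counter()
    for word, is_noun in labeled_words:
        (pos if is_noun else neg).update(extract_features(word))
    return {
        feat: math.log((pos[feat] + 1) / (neg[feat] + 1))
        for feat in set(pos) | set(neg)
    }


# Tiny made-up training set, for illustration only.
TOY_DATA = [
    ("station", True), ("nation", True), ("motion", True),
    ("quickly", False), ("slowly", False), ("badly", False),
]

weights = learn_weights(TOY_DATA)
print(weights["suf4=tion"], weights["suf2=ly"])
```

On this toy data the learner assigns a positive value to the 4-suffix “-tion” and a negative value to the 2-suffix “-ly”, without anyone having marked those suffixes as + or - by hand.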

Feature vectors

• are used to represent instances in your data

• positive and negative instances

• contain all single features that were found in the data

  • single features that you designed
  • instances of feature classes that you designed

• sparse representation ➠ a vector contains only features with values different from 0

  • (17:0.9 21:3.2 28:1.0 45:0.1 108:1.1 237:1.2 244:2.0)ᵀ
  • mk
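
The sparse `index:value` notation can be produced roughly as follows. This is a sketch under assumptions: the helpers `build_index` and `to_sparse`, the 1-based numbering, and the example instances are hypothetical and not taken from the slides.

```python
def build_index(instances):
    """Assign a 1-based index to every single feature found in the data."""
    index = {}
    for feats in instances:
        for name in sorted(feats):
            index.setdefault(name, len(index) + 1)
    return index


def to_sparse(feats, index):
    """Render only the non-zero features as 'index:value' pairs, sorted by index."""
    pairs = sorted(
        (index[name], value)
        for name, value in feats.items()
        if name in index and value != 0
    )
    return " ".join(f"{i}:{v}" for i, v in pairs)


# Hypothetical instances: feature name -> value.
instances = [
    {"suf4=tion": 1.0, "capitalized": 0.0},
    {"suf2=ly": 1.0, "has_digit": 2.0},
]

index = build_index(instances)
for feats in instances:
    print(to_sparse(feats, index))
```

Features with value 0 still receive an index, because they occur somewhere in the data, but they are left out of the printed vector.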

Features for … NER

• see Bob’s slides for gene/protein NER