Introduction to Machine Learning - Project - Artificial Intelligence | CS 510, Study Guides, Projects, Research of Computer Science

Material Type: Project; Professor: Greenstadt; Class: Introduction to Artificial Intelligence; Subject: Computer Science; University: Drexel University; Term: Fall 2008;

Typology: Study Guides, Projects, Research

Pre 2010

Uploaded on 08/19/2009

koofers-user-b1g
koofers-user-b1g 🇺🇸

5

(1)

9 documents

1 / 46

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Lecture 7 : Intro to
Machine Learning
Rachel Greenstadt
November 11, 2008
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d
pf2e

Partial preview of the text

Download Introduction to Machine Learning - Project - Artificial Intelligence | CS 510 and more Study Guides, Projects, Research Computer Science in PDF only on Docsity!

Lecture 7 : Intro to

Machine Learning

Rachel Greenstadt

November 11, 2008

Reminders

Machine Learning exercise out today

We’ll go over it

Due before class 11/

Machine Learning

Definition: the study of computer algorithms that improve automatically through experience

Formally:

Improve at task T

with respect to performance measure P

based on experience E

Example: Learning to play Backgammon

T: play backgammon

P: number of games won

E: data about previously played games

Where is ML useful?

Self-customizing software

spam filters, learning user preferences

Data mining

medical records, credit fraud

Software that can’t be written by hand

speech recognition, autonomous driving

Other examples?

Learning element

  • Design of a learning element is affected by
    • Which components of the performance element are to

be learned

  • What feedback is available to learn these components
  • What representation is used for the components
  • Type of feedback:
  • Supervised learning: correct answers for each example
  • Unsupervised learning: correct answers not given
  • Reinforcement learning: occasional rewards

Inductive Learning

Supervised

“Teacher” provides labeled examples

Opposite of unsupervised, e.g. clustering

Inductive Inference

Given: samples of an unknown function f

(x, f(x)) pairs

Goal: find a function h that approximates f

Inductive Learning

Construct/adjust h to agree with f on training set

(the supervised examples)

(h is consistent if it agrees with f on all examples)

E.g. curve-fitting

Inductive Learning

Construct/adjust h to agree with f on training set

(the supervised examples)

(h is consistent if it agrees with f on all examples)

E.g. curve-fitting

Inductive Learning

Construct/adjust h to agree with f on training set

(the supervised examples)

(h is consistent if it agrees with f on all examples)

E.g. curve-fitting

Inductive Learning

Construct/adjust h to agree with f on training set

(the supervised examples)

(h is consistent if it agrees with f on all examples)

E.g. curve-fitting

Occam’s razor: prefer the simplest hypothesis

consistent with the data

Induction Task Example

Decision Trees :

PlayTennis

Expressiveness

  • Decision trees can express any function of the input attributes.
  • E.g., for Boolean functions, truth table row → path to leaf:
  • Trivially, there is a consistent decision tree for any training set with one path to leaf for each example (unless f nondeterministic in x ) but it probably won't generalize to new examples
  • Prefer to find more compact decision trees

Hypothesis spaces

How many distinct decision trees with n Boolean attributes?

= number of Boolean functions

= number of distinct truth tables with 2

n

rows = 2

2 n

  • E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,

trees

How many purely conjunctive hypotheses (e.g., Hungry ∧ ¬ Rain )?

  • Each attribute can be in (positive), in (negative), or out ⇒ 3 n distinct conjunctive hypotheses
  • More expressive hypothesis space
    • increases chance that target function can be expressed
    • increases number of hypotheses consistent with training set ⇒ may get worse predictions