CS181 Lecture 2 — Decision Trees and Classification

Avi Pfeffer; Revised by David Parkes

Jan 23, 2010

In introducing supervised learning we consider the special problem of learning a Boolean concept from

training examples, some of which satisfy the concept and some of which do not. This is a classical machine

learning problem. After discussing some basic ideas for supervised learning, we will turn to a particular

learning algorithm, decision trees.

Optional readings for the next two lectures: Chapter 18 (through 18.4) of Russell & Norvig, Chapters 1

& 3 of Mitchell, “Machine Learning”.

1 The Task: Supervised Learning

Whenever we talk about learning, there is a hierarchy of tasks to consider. We must first talk about the

ultimate task to be performed, the thing we are trying to learn to do. Then we can talk about the task

of learning how to perform the ultimate task. Finally, we can consider the task of designing a learning

algorithm.

For the next few classes, we will focus on classification as the ultimate task to be performed. Classification

means determining what category an object falls into, based on its features (or attributes). For example, we

might try to classify a plant as nutritious or poisonous, based on biological features such as color, leaf shape

and so on. Or we might try to classify a pixel image as being a particular digit.

Classification. An object is described by a set of features X1, ..., Xm. There is a set Y = {1, ..., c} of possible classes. Given the features x = (x1, ..., xm) ∈ X of a particular object, a classifier needs to determine its true class f(x) = y ∈ Y.¹ Thus a classifier is a function h: X → Y.

The performance of a classifier h on a new instance (x, y) is measured by an error function, for example

$$\Delta(y, y') = \begin{cases} 0 & \text{if } y = y' \\ 1 & \text{otherwise,} \end{cases} \qquad (1)$$

where y′=h(x). In some domains a more complex error function can be appropriate. For example, in

a medical domain, the cost of false negatives (missing a disease diagnosis) is likely higher than that of

false positives (incorrectly diagnosing disease when there is none).
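As a minimal sketch of the error functions described above, the following Python code implements the 0/1 error of equation (1) alongside a cost-sensitive variant for a binary medical setting. The function names, the 0/1 label encoding (1 = disease, 0 = healthy), the cost values, and the toy threshold classifier are all illustrative assumptions and do not come from the notes.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[Sequence[float], int]


def zero_one_loss(y: int, y_pred: int) -> float:
    """0/1 error: 0 if the prediction matches the true class, 1 otherwise."""
    return 0.0 if y == y_pred else 1.0


def asymmetric_loss(y: int, y_pred: int,
                    fn_cost: float = 5.0, fp_cost: float = 1.0) -> float:
    """Cost-sensitive error for binary labels (1 = disease, 0 = healthy).

    The costs are hypothetical; they only encode that a false negative
    is assumed to be worse than a false positive.
    """
    if y == y_pred:
        return 0.0
    return fn_cost if y == 1 else fp_cost


def average_error(h: Callable[[Sequence[float]], int],
                  data: Sequence[Example],
                  loss: Callable[[int, int], float] = zero_one_loss) -> float:
    """Mean loss of classifier h over labeled instances (x, y)."""
    return sum(loss(y, h(x)) for x, y in data) / len(data)


def h(x: Sequence[float]) -> int:
    """Toy classifier: predict disease when the first feature exceeds 0.5."""
    return 1 if x[0] > 0.5 else 0


data = [((0.9,), 1), ((0.2,), 0), ((0.4,), 1), ((0.7,), 0)]
print(average_error(h, data))                   # 0.5
print(average_error(h, data, asymmetric_loss))  # 1.5
```

The toy classifier makes one false negative and one false positive, so the two error functions disagree on how bad it is: the 0/1 error counts both mistakes equally, while the asymmetric version weights the missed diagnosis more heavily.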

Classification is a special case of the general problem of supervised learning.

Supervised Learning. The goal of supervised learning is to learn a classifier from a set of labeled data D. Each instance (x, y) ∈ D defines feature values x = (x1, ..., xm) and a target value y ∈ Y. Together, we have D = {(x1, y1), ..., (xn, yn)}, a set of n labeled examples. A supervised learning algorithm takes this data and outputs a function h: X → Y.

Thus a supervised learning algorithm can itself be considered to define a function L from labeled training data to classifiers. For a particular training set D, L(D) is a classifier.
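The view of a learning algorithm as a function L from training data to classifiers can be illustrated with a deliberately trivial learner. The sketch below is an assumption made for illustration only: `majority_learner` ignores the features entirely and returns a classifier that always predicts the most common label in D. It is not an algorithm from these notes, and the plant data is invented.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

Features = Sequence[Hashable]
Example = Tuple[Features, int]
Classifier = Callable[[Features], int]


def majority_learner(D: List[Example]) -> Classifier:
    """A trivial learning algorithm L: maps labeled data D to a classifier h.

    The returned classifier ignores its input and always predicts the
    label that occurs most often in D.
    """
    counts = Counter(y for _, y in D)
    majority_label = counts.most_common(1)[0][0]

    def h(x: Features) -> int:
        return majority_label

    return h


# Hypothetical plant data: (color, leaf shape) -> class 1 or 2.
D = [(("green", "oval"), 1), (("red", "jagged"), 2), (("green", "jagged"), 1)]

h = majority_learner(D)       # L(D) is itself a function h: X -> Y
print(h(("yellow", "oval")))  # 1
```

The point of the sketch is the type structure rather than the learner itself: `majority_learner` plays the role of L, and its return value plays the role of h.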

¹ We will generally use boldface to denote vectors or matrices and capital letters to denote sets, with small letters to denote particular elements of these sets.


2 Is Learning Possible?

3 Inductive Bias

4 Learning Conjunctive Concepts with Decision Trees


6 Information Gain: Deciding How to Branch