CS181 Lecture 20 — Computational Learning Theory

Avi Pfeffer; Revised by David Parkes

April 17, 2011

We turn now to the fundamental question of how and when it is possible to learn, which is the subject of

computational learning theory. We will briefly survey some of the main ideas in this area. Les Valiant,

who founded the field, teaches an in-depth course on the topic (CS228). The key concepts covered are

PAC-learnability, sample complexity, and the VC dimension of a hypothesis space. [1] [2]

1 Computational Learning Theory

Learning works. There’s lots of evidence of that, both from humans and also from the success of machine

learning algorithms. Today we’ll consider the question “Why does learning work?” There are several reasons

why one might want to answer this question, including:

• Simply for the sake of understanding. To quote Russell and Norvig, “Unless we find some answers, machine learning will, at best, be puzzled by its own success.”

• To understand when and under what circumstances learning works.

• To be able to give guarantees about the performance of the hypothesis produced by an algorithm on some training set.

• To be able to determine how many training samples are needed in order to produce a good hypothesis.

Simply stated, we’d like to understand how much data is required to provide good generalization. The

basic insight provided by the probably approximately correct (PAC) model of learning theory is that we

can provide formal guarantees that a learned hypothesis will generalize well by assuming that the future

distribution on examples will be the same as the distribution used for training.

Suppose you are trying to learn a classifier for animals and you never saw a pink elephant: this is all

right as long as you don’t expect to see pink elephants in the future!
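
Stated a little more formally, the guarantee can be sketched as follows. The notation here is an assumption, since these notes have not yet introduced it: D is the distribution over examples, c the true concept, h_S the hypothesis learned from a sample S of m examples drawn from D, and ε and δ the accuracy and confidence parameters.

```latex
\Pr_{S \sim D^m}\bigl[\operatorname{error}_D(h_S) \le \epsilon\bigr] \ge 1 - \delta,
\qquad
\operatorname{error}_D(h) = \Pr_{x \sim D}\bigl[h(x) \ne c(x)\bigr].
```

Here "approximately" corresponds to ε and "probably" to δ. The error is measured under the same D that generated the training sample, which is why unseen pink elephants do no harm as long as they stay improbable under D.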

We know that for learning to be possible we need some form of inductive bias. Computational learning

theory focuses on restriction bias, in which the learner can represent only a restricted set of hypotheses.

The basic question is whether the true hypothesis can be learned efficiently, under the assumption that

it meets the restriction. Learning efficiently requires both that the number of examples required is small

and that the computation time required is small.

Computational learning theory also provides another way to think about Occam’s razor: it explains that

we should prefer simple hypotheses because they can be learned with less data. Computational learning

theory explains how to reason about the amount of training data required for a hypothesis space of a given

(representation) complexity.
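
As a rough illustration of this argument, the sketch below evaluates the standard sample-complexity bound for a consistent learner over a finite hypothesis space H, m ≥ (1/ε)(ln|H| + ln(1/δ)). The function name `sample_bound`, the choice n = 10, and the values of ε and δ are assumptions made for the example and do not come from these notes.

```python
import math


def sample_bound(ln_h_size: float, epsilon: float, delta: float) -> int:
    """Number of examples sufficient for a consistent learner over finite H.

    Evaluates m >= (1/epsilon) * (ln|H| + ln(1/delta)), where ln_h_size
    is the natural log of |H| (passed as a log so huge spaces stay finite).
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie in (0, 1)")
    return math.ceil((ln_h_size + math.log(1.0 / delta)) / epsilon)


n = 10  # number of Boolean attributes (illustrative choice)

# Conjunctions: each attribute appears positively, negated, or not at all,
# so |H| = 3^n.
ln_conjunctions = n * math.log(3)

# All Boolean functions over n attributes: |H| = 2^(2^n).
ln_all_functions = (2 ** n) * math.log(2)

m_conj = sample_bound(ln_conjunctions, epsilon=0.1, delta=0.05)
m_all = sample_bound(ln_all_functions, epsilon=0.1, delta=0.05)

print("conjunctions:", m_conj)
print("all Boolean functions:", m_all)
```

The bound grows with ln|H|, so the restricted (simpler) space of conjunctions needs far fewer examples than the unrestricted space of all Boolean functions for the same ε and δ.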

[1] Additional material on computational learning theory can be found in the MIT Press book by Kearns and Vazirani, “An Introduction to Computational Learning Theory.”

[2] These notes are based in part on Russell and Norvig, notes by Sally Goldman, class notes by Avrim Blum, and lecture notes by Raymond Mooney.

1 Computational Learning Theory

2 Basic Definitions

2.1 Probably Approximately Correct

3 Applications

3.1 Sample Complexity of Conjunctive Formulas

3.2 Sample Complexity of Finite Hypothesis Spaces

3.3 PAC-Learnability of Conjunctive Formulas

3.6 PAC-Learnability of Decision Lists

3.7 PAC-Learnability of General DNF Formulas

3.8 PAC-Learnability of Decision Trees

4 Variations to the PAC Model

5.1 VC dimension and Sample Complexity

5.2 Application: Neural Networks