CS545: Linear Models for Classification - Chuck Anderson - Prof. Charles Anderson, Study notes of Computer Science

A portion of a lecture note from Colorado State University's CS545 course on linear models for classification, focusing on linear least squares for classification, indicator variables, the masking problem, generative models for classification using QDA and LDA, and their applications in data fitting and overfitting. The document also includes examples and code snippets.

Typology: Study notes

Pre 2010

Uploaded on 11/08/2009

CS545: Linear
Models for
Classification
Chuck Anderson
Linear Least
Squares for
Classification
Indicator Variables
Masking Problem
Example
Generative Models
for Classification
QDA
Fitting the Generative
Distributions to Data
Overfitting
LDA
Example
CS545: Linear Models for Classification
Chuck Anderson
Department of Computer Science
Colorado State University
Fall, 2009

Outline

Linear Least Squares for Classification
    Indicator Variables
    Masking Problem
    Example

Generative Models for Classification
    QDA
    Fitting the Generative Distributions to Data
    Overfitting
    LDA
    Example

To classify a sample as being a member of 1 of 3 different classes, we could use integers 1, 2, and 3 as target outputs.

[Figure: Class (1, 2, 3) plotted against x, with a fitted linear model.]

A linear function of x seems to match the data fairly well.

Why is this not a good idea?

We must convert the continuous y-axis value to discrete integers 1, 2, or 3. Without adding more parameters, we are forced to use the general solution of splitting at 1.5 and 2.5.

[Figure: Class (1, 2, 3) plotted against x, with the x-axis divided into regions labeled Class 1, Class 2, and Class 3.]

Rats! Boundaries are not where we want them.
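
The effect can be reproduced with a minimal sketch. It fits a straight line to integer targets and then splits at 1.5 and 2.5. The data values below are made up for illustration and are not from the lecture, and the use of NumPy is an assumption.

```python
import numpy as np

# Hypothetical 1-D data (not from the lecture): class 1 covers a wide
# range of x, while classes 2 and 3 have one sample each.
x = np.array([0, 1, 2, 3, 4, 5, 6, 7], dtype=float)
targets = np.array([1, 1, 1, 1, 1, 1, 2, 3], dtype=float)

# Least-squares fit of y = w0 + w1 * x to the integer targets.
A = np.column_stack([np.ones_like(x), x])
(w0, w1), *_ = np.linalg.lstsq(A, targets, rcond=None)

# Convert the continuous output to a class by splitting at 1.5 and 2.5.
y = w0 + w1 * x
predicted = np.digitize(y, [1.5, 2.5]) + 1

# The x values where the fitted line crosses the two thresholds.
boundary_12 = (1.5 - w0) / w1
boundary_23 = (2.5 - w0) / w1

print("predicted classes:", predicted)
print("boundaries on x:", boundary_12, boundary_23)
```

With these made-up values the two thresholds fall near x = 4.05 and x = 8.47, so the samples at x = 5 and x = 7 are both assigned to Class 2 although their targets are 1 and 3.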

Indicator Variables

To allow flexibility, we need to decouple the modeling of the boundaries. The problem is due to using one value to represent all classes.

Instead, let's use three values, one for each class. Binary-valued variables are adequate. Class 1 = (1, 0, 0), Class 2 = (0, 1, 0) and Class 3 = (0, 0, 1). Our linear model has three outputs now. How do we interpret the output for a new sample?

Let the output be $y = (y_1, y_2, y_3)$. Convert these values to a class by picking the maximum value.

$$\text{class} = \operatorname*{argmax}_{i} \; y_i$$
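
The encoding and the argmax rule above can be sketched in a few lines. This is only an illustration: the helper names `make_indicator_vars` and `outputs_to_class` are hypothetical, the output values in `Y_new` are invented, and NumPy is assumed.

```python
import numpy as np

def make_indicator_vars(labels, classes):
    """Return an (n_samples, n_classes) 0/1 matrix, one column per class."""
    labels = np.asarray(labels).reshape(-1, 1)
    return (labels == np.asarray(classes)).astype(float)

def outputs_to_class(Y, classes):
    """For each row of Y, pick the class whose output is largest."""
    return np.asarray(classes)[np.argmax(Y, axis=1)]

classes = [1, 2, 3]

# Class labels become rows such as (1, 0, 0), (0, 1, 0), (0, 0, 1).
T = make_indicator_vars([1, 2, 3, 2], classes)

# Invented three-component model outputs for two new samples.
Y_new = np.array([[0.2, 0.7, 0.1],
                  [0.6, 0.3, 0.1]])
predicted = outputs_to_class(Y_new, classes)

print(T)
print(predicted)
```

The outputs do not need to be exactly 0 or 1 for the rule to work. Only their relative order matters.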

[Figure: Targets]

We can plot the three output components on three separate graphs. What linear function will each one learn?

[Figure: three panels, Indicator variable 1, Indicator variable 2, and Indicator variable 3, each with a vertical axis from 0 to 1 plotted against x.]

Overlay them to see which one is the maximum for each x value.

[Figure: the indicator variables overlaid on one plot against x.]
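
The overlay can also be done numerically. The sketch below fits one linear function per indicator variable by least squares, evaluates all three on a grid of x values, and records where the maximum switches from one class to another. The data are made up for illustration (Class 2 is given the most samples), and NumPy is assumed.

```python
import numpy as np

# Hypothetical 1-D data (not from the lecture).
x = np.arange(9, dtype=float)
labels = np.array([1, 1, 2, 2, 2, 2, 2, 3, 3])

# Indicator-variable targets: one 0/1 column per class.
T = (labels[:, None] == np.arange(1, 4)).astype(float)

# One least-squares linear fit per column; W has shape (2, 3).
A = np.column_stack([np.ones_like(x), x])
W, *_ = np.linalg.lstsq(A, T, rcond=None)

# Evaluate the three linear functions on a grid and take the maximum.
grid = np.linspace(0.0, 8.0, 801)
Y = np.column_stack([np.ones_like(grid), grid]) @ W
winner = Y.argmax(axis=1) + 1

# Approximate class boundaries: midpoints where the winner changes.
change = np.nonzero(np.diff(winner))[0]
boundaries = (grid[change] + grid[change + 1]) / 2

print("boundaries on x:", boundaries)
```

For this data set the winner changes twice, near x = 1.14 and x = 6.86, so each class gets its own region of x.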

See any potential problems?

[Figure: indicator variables overlaid against x, with regions labeled (1,0,0), (0,1,0), and (0,0,1).]

What if the green line is too low?

[Figure: indicator variables overlaid against x, with labels (1,0,0), (0,1,0), and (0,0,1).]

What could cause this?

Too few samples from Class 2.

[Figure: three panels, each plotting an indicator variable (vertical axis from 0 to 1) against x.]
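
A small sketch of this situation, under the assumption that Class 2 lies between the other two classes and has only one sample. The data are invented for illustration and NumPy is assumed.

```python
import numpy as np

# Hypothetical 1-D data (not from the lecture): a single Class 2 sample
# sits between four Class 1 samples and four Class 3 samples.
x = np.array([0, 1, 2, 3, 4.5, 6, 7, 8, 9])
labels = np.array([1, 1, 1, 1, 2, 3, 3, 3, 3])

# Indicator-variable targets and one least-squares line per class.
T = (labels[:, None] == np.array([1, 2, 3])).astype(float)
A = np.column_stack([np.ones_like(x), x])
W, *_ = np.linalg.lstsq(A, T, rcond=None)

# Classify a grid of x values by picking the largest output.
grid = np.linspace(-2.0, 11.0, 500)
Y = np.column_stack([np.ones_like(grid), grid]) @ W
predicted = Y.argmax(axis=1) + 1

print("classes that are ever predicted:", np.unique(predicted))
print("largest Class 2 output on the grid:", Y[:, 1].max())
```

In this made-up data set the Class 2 output stays near 1/9 for every x, below the larger of the other two outputs, so Class 2 is never chosen.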