More Methodology: Nearest-Neighbor Classifiers | CS 591, Exams of Programming Languages

University of New Mexico (UNM) - Gallup Programming Languages

Material Type: Exam; Class: ST: Prog Analy &Mechanization; Subject: Computer Science; University: University of New Mexico; Term: Unknown 1989;

Typology: Exams

Pre 2010

Uploaded on 07/22/2009

koofers-user-kxl 🇺🇸


More Methodology;

Nearest-Neighbor

Classifiers

Sec 4.7

Review: Properties of DTs

Axis-orthogonal, hyperrectangular, piecewise-constant models

Categorical labels

Non-metric

Holdout data

Usual to “hold out” a separate set of data for testing; not used to train classifier

A.k.a., test set, holdout set, evaluation set, etc.

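E.g., split the data into a training part and a test part:

$$X = [X_1, X_2, \ldots, X_N] \;\Rightarrow\; X_{\text{train}} = [X_1, X_2, \ldots, X_i], \quad X_{\text{test}} = [X_{i+1}, X_{i+2}, \ldots, X_N]$$

$\text{acc}(X_{\text{train}})$ is training set accuracy

$\text{acc}(X_{\text{test}})$ is test set (or generalization) accuracy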
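
A minimal sketch of the holdout idea, assuming the data is a list of (features, label) pairs. The names `holdout_split`, `accuracy`, and `threshold_rule` are illustrative and do not come from the slides; the threshold rule is a hand-written stand-in for a trained classifier.

```python
def holdout_split(data, n_train):
    """Keep the first n_train items for training and hold out the rest."""
    return data[:n_train], data[n_train:]


def accuracy(classifier, examples):
    """Fraction of (features, label) pairs the classifier labels correctly."""
    correct = sum(1 for features, label in examples if classifier(features) == label)
    return correct / len(examples)


# Toy data: one numeric feature; the label is "hi" when the feature exceeds 5.
data = [([x], "hi" if x > 5 else "lo") for x in range(10)]
train, test = holdout_split(data, n_train=7)


def threshold_rule(features):
    """Hypothetical stand-in for a model fitted on the training part."""
    return "hi" if features[0] > 5 else "lo"


train_acc = accuracy(threshold_rule, train)  # training set accuracy
test_acc = accuracy(threshold_rule, test)    # test (generalization) accuracy
```

Note that the first seven items here contain only one "hi" example, which is exactly the kind of unlucky split that motivates shuffling or stratifying before the cut.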

Gotchas...

What if you’re unlucky when you split data into train/test?

E.g., all train data are class A and all test are class B?

No “red” things show up in training data

Best answer: stratification

Try to make sure class (+feature) ratios are same in train/test sets (and same as original data)

Why does this work?

Almost as good: randomization

Shuffle data randomly before split

Why does this work?
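
One way the two answers above could be combined is sketched below: shuffle within each class, then cut every class at the same fraction. This is an assumption-laden illustration, not the course's code; `stratified_split` is a made-up name, and it stratifies on the class label only, not on feature values.

```python
import random
from collections import defaultdict


def stratified_split(examples, train_fraction, seed=0):
    """Shuffle within each class, then cut each class at the same fraction."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in examples:
        by_class[label].append((features, label))

    train, test = [], []
    for label in sorted(by_class):
        group = by_class[label]
        rng.shuffle(group)                        # randomization within the class
        cut = round(len(group) * train_fraction)  # same ratio for every class
        train.extend(group[:cut])
        test.extend(group[cut:])
    rng.shuffle(train)  # avoid leaving the output grouped by class
    rng.shuffle(test)
    return train, test


# Toy data with a 2:1 class ratio (8 of class "A", 4 of class "B").
examples = [([i], "A") for i in range(8)] + [([i], "B") for i in range(4)]
train, test = stratified_split(examples, train_fraction=0.75)
```

With these inputs both the training and the test part keep the original 2:1 ratio of "A" to "B", so neither part can end up containing a single class.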

CV in pix

[Figure: cross-validation pipeline. Original data [X; Y] → random shuffle [X'; Y'] → k-way partition [X1' Y1'], [X2' Y2'], ..., [Xk' Yk'] → k train/test sets → k accuracies (e.g., 53.7%, 85.1%, 73.2%)]
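
The shuffle, partition, and train/test steps of k-fold cross-validation can be sketched roughly as follows. The helper names and the `train_majority` learner are hypothetical placeholders; any function that maps a training list to a classifier could be passed as `train_fn`.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(examples, k, train_fn, seed=0):
    """Return k test accuracies, one per held-out fold."""
    accuracies = []
    for held_out in k_fold_indices(len(examples), k, seed):
        held = set(held_out)
        train = [ex for i, ex in enumerate(examples) if i not in held]
        test = [examples[i] for i in held_out]
        classifier = train_fn(train)  # fit on the other k - 1 folds only
        correct = sum(1 for x, y in test if classifier(x) == y)
        accuracies.append(correct / len(test))
    return accuracies


def train_majority(train):
    """Hypothetical learner: always predict the most common training label."""
    labels = [y for _, y in train]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority


examples = [([i], "A" if i % 3 else "B") for i in range(12)]
scores = cross_validate(examples, k=4, train_fn=train_majority)
```

Each example is tested exactly once, so averaging `scores` uses all of the data for evaluation while never testing a model on examples it was trained on.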

But is it really learning?

Now we know how well our models are performing

But are they really learning?

Maybe any classifier would do as well

E.g., a default classifier (pick the most likely class) or a random classifier

How can we tell if the model is learning anything?

Go back to first definitions

What does it mean to learn something?
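
A small sketch of the comparison against a default classifier, assuming labels are available as plain lists. The function name and the 0.85 model accuracy are invented for illustration and are not results from the slides.

```python
from collections import Counter


def default_classifier_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority, _ = Counter(train_labels).most_common(1)[0]
    return sum(1 for y in test_labels if y == majority) / len(test_labels)


train_labels = ["A"] * 7 + ["B"] * 3
test_labels = ["A"] * 6 + ["B"] * 4
baseline = default_classifier_accuracy(train_labels, test_labels)

# A model shows evidence of learning only if it beats this baseline.
model_accuracy = 0.85  # hypothetical measured test accuracy
beats_baseline = model_accuracy > baseline
```

Here the default classifier already reaches 60% on the test labels, so a reported accuracy near 60% would say little about whether anything was learned.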

Measuring variance

Cross-validation helps you get a better estimate of accuracy for small data sets

Randomization (shuffling the data) helps guard against poor splits/ordering of the data

Learning curves help assess learning rate/asymptotic accuracy

Still one big missing component: variance

Definition: Variance of a classifier is the fraction of error due to the specific data set it’s trained on

Measuring variance

Variance tells you how much you expect your classifier/performance to change when you train it on a new (but similar) data set

E.g., take 5 samplings of a data source; train/test 5 classifiers

Accuracies: 74.2, 90.3, 58.1, 80.6, 90.

Mean accuracy: 78.7%

Std dev of acc: 13.4%

Variance is usually a function of both classifier and data source

High variance classifiers are very susceptible to small changes in data
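
The mean and spread of a handful of accuracies can be computed as in the sketch below. The accuracy values are made up for illustration (they are not the numbers on the slide), and using the sample standard deviation (dividing by n - 1) is an assumption about how such a figure would typically be reported.

```python
import statistics

# Hypothetical test accuracies (%) from 5 classifiers, each trained on a
# different sampling of the same data source.
accuracies = [70.0, 80.0, 90.0, 60.0, 100.0]

mean_acc = statistics.mean(accuracies)
std_acc = statistics.stdev(accuracies)  # sample standard deviation (n - 1)

print(f"mean accuracy: {mean_acc:.1f}%, std dev: {std_acc:.1f}%")
```

A large standard deviation relative to the mean suggests the classifier's performance depends heavily on which sample it happened to be trained on.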

Putting it all together

[Figure: accuracy (axis ticks 40 to 100) vs. % data size (axis ticks 10 to 90) on the “hepatitis” data]

5 minutes of math...

Decision trees are non-metric

Don’t know anything about relations between instances, except sets induced by feature splits

Often, we have well-defined distances between points

Idea of distance encapsulated by a metric

5 minutes of math...

Examples:

Euclidean distance

$$d(X^a, X^b) = \sqrt{(x^a_1 - x^b_1)^2 + \cdots + (x^a_d - x^b_d)^2} = \left((X^a - X^b) \cdot (X^a - X^b)\right)^{1/2} = \sqrt{\sum_{i=1}^{d} (x^a_i - x^b_i)^2}$$

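Note: omitting the square root does not change which point is nearest (the square root is monotonic), so it usually won’t change our results; strictly speaking, though, the squared distance is no longer a metric, because it can violate the triangle inequality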
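
As a rough illustration, Euclidean distance and its squared variant might be written as below. The function names and the sample points are assumptions made for this sketch. It also shows that dropping the square root leaves the choice of nearest point unchanged.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def squared_euclidean(a, b):
    """Same sum without the square root; preserves the nearest-point ordering."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


query = [0.0, 0.0]
points = [[3.0, 4.0], [1.0, 1.0], [6.0, 8.0]]

nearest_by_distance = min(points, key=lambda p: euclidean(query, p))
nearest_by_squared = min(points, key=lambda p: squared_euclidean(query, p))
```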

5 minutes of math...

Examples:

Manhattan (taxicab) distance

Distance travelled along a grid between two points

No diagonals allowed

$$d(X^a, X^b) = |x^a_1 - x^b_1| + \cdots + |x^a_d - x^b_d| = \sum_{i=1}^{d} |x^a_i - x^b_i|$$
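
A short sketch of Manhattan distance, assuming points are equal-length sequences of numbers; the function name is illustrative.

```python
def manhattan(a, b):
    """Sum of absolute per-coordinate differences (grid distance, no diagonals)."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


# Moving from (0, 0) to (3, 4) on a grid takes 3 steps across and 4 steps up.
grid_distance = manhattan([0, 0], [3, 4])
```

For the same pair of points the straight-line (Euclidean) distance is 5, so the grid route is longer, as expected when diagonals are not allowed.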

5 minutes of math...

Examples:

What if some attribute is categorical?

Typical answer is 0/1 distance :

For each attribute, add 1 if the instances differ in that attribute, else 0

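$$d_{0/1}(X^a, X^b) = \sum_{i=1}^{d} \delta(x^a_i \neq x^b_i)$$

where $\delta(\cdot)$ is 1 if its condition holds and 0 otherwise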
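
The 0/1 distance for categorical attributes could be sketched like this, assuming each instance is a tuple of attribute values compared position by position. The function name and the sample attributes are invented for illustration.

```python
def zero_one_distance(a, b):
    """Count the attributes on which two instances differ."""
    return sum(1 for ai, bi in zip(a, b) if ai != bi)


first = ("red", "round", "small")
second = ("red", "square", "large")
mismatches = zero_one_distance(first, second)  # shape and size differ
```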

Distances in classification

Nearest neighbor : find the nearest instance to the query point in feature space, return the class of that instance

Simplest possible distance-based classifier

With more notation:

$$f(X) = \text{Class}\left(\arg\min_{X' \in X_{\text{train}}} d(X, X')\right)$$
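
A minimal nearest-neighbor classifier following the definition above might look like the sketch below. It assumes the training set is a list of (features, label) pairs and takes the distance function as a parameter; all names and the toy data are illustrative. Ties are broken by taking the first nearest instance in list order, which is a choice made here and not something the slides specify.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nearest_neighbor_classify(query, training_set, distance=euclidean):
    """Return the label of the training instance closest to the query."""
    _, nearest_label = min(training_set, key=lambda ex: distance(query, ex[0]))
    return nearest_label


training_set = [
    ([1.0, 1.0], "A"),
    ([1.5, 2.0], "A"),
    ([8.0, 8.0], "B"),
    ([9.0, 7.5], "B"),
]

label_near_origin = nearest_neighbor_classify([2.0, 1.0], training_set)
label_far_corner = nearest_neighbor_classify([7.0, 9.0], training_set)
```

There is no training step beyond storing the data; all of the work happens at query time, when every stored instance is compared against the query.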
