2 Questions for Assignment 4 - Statistical Pattern Recognition | CS 591Q, Assignments of Computer Science

Material Type: Assignment; Professor: Ross; Class: ADTP: Statistical Pattern Recognition; Subject: Computer Science; University: West Virginia University; Term: Spring 2008

Homework 4
CS 591Q/791V - Pattern Recognition
Instructor: Dr. Arun Ross
Due Date: April 24, 2008
Note: You are permitted to discuss the following questions with others in the class.
However, you must write up your own solutions to these questions. Any indication to the
contrary will be considered an act of academic dishonesty. Code developed as part of this
assignment should be placed in a zip file and sent to arun.ross at mail.wvu.edu with the
subject line “CS 591Q/791V : Homework 4”. Also, include a hard-copy of your code when
you submit the homework.
1. [20 points] Consider two class-conditional densities N(10,5) and N(25,5). Generate 100
random training samples from each of the two distributions. Write a program that employs
the Parzen window technique with a Gaussian kernel to estimate the density using all 200
samples. Plot the estimated density function for window widths of 0.01, 0.5, and 10.0.
Repeat the above after generating 1000 training samples from each of the two distributions.
2. [20 points] The iris (flower) dataset consists of 150 4-dimensional patterns belonging to three
classes (setosa, versicolor, and virginica). There are 50 patterns per class. The 4 features
correspond to (a) sepal length in cm, (b) sepal width in cm, (c) petal length in cm, and (d)
petal width in cm. The data can be accessed here. Note that the class labels are indicated
at the end of every pattern.
Design a K-NN classifier for this dataset. Randomly choose half the data (from each class)
for training the classifier (i.e., these are the prototypes) and the remaining half for testing
the classifier. Perform this data-partitioning exercise ten times and report the empirical
error rate each time. Also, compute the average and the variance of the error rates. In order
to study the effect of K on the performance of the classifier, report error rate statistics for
K=1,3,5,7 and 9.
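
For Question 1, the Parzen estimate might be sketched as follows. This is only an illustration under stated assumptions, not the expected solution: the second parameter of N(10,5) is read as a variance, the window width h is used directly as the standard deviation of the Gaussian kernel, and the names parzen_gaussian, plot_estimates, grid, and estimates are hypothetical.

```python
import numpy as np

def parzen_gaussian(x_eval, samples, h):
    """Parzen-window density estimate at x_eval using a Gaussian kernel of width h."""
    # Scaled differences between every evaluation point and every sample,
    # shape (len(x_eval), len(samples)).
    u = (x_eval[:, None] - samples[None, :]) / h
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    # Average the kernel contributions and divide by h so the estimate integrates to 1.
    return kernel.mean(axis=1) / h

def plot_estimates(grid, estimates):
    """Draw one curve per window width (needs matplotlib, imported lazily)."""
    import matplotlib.pyplot as plt
    for h, p in estimates.items():
        plt.plot(grid, p, label=f"h = {h}")
    plt.xlabel("x")
    plt.ylabel("estimated density")
    plt.legend()
    plt.show()

rng = np.random.default_rng(0)
SIGMA = np.sqrt(5.0)   # assumption: N(10,5) means mean 10, variance 5
n = 100                # samples per distribution; set to 1000 for the repeat run
samples = np.concatenate([rng.normal(10.0, SIGMA, n),
                          rng.normal(25.0, SIGMA, n)])

grid = np.linspace(0.0, 35.0, 701)
estimates = {h: parzen_gaussian(grid, samples, h) for h in (0.01, 0.5, 10.0)}
# plot_estimates(grid, estimates)  # uncomment to draw the three curves
```

If the 5 in N(10,5) is meant as a standard deviation instead, SIGMA would simply be 5.0. The grid spacing here (0.05) is coarser than the smallest window width, so a finer grid may be needed to show the h = 0.01 estimate faithfully.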
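
For Question 2, one possible shape for the K-NN experiment is sketched below. It rests on several assumptions: the patterns are already loaded into a float array X of shape (150, 4) and the class labels into an integer array y coded 0, 1, 2 (loading is omitted because the data link is not preserved here), distance is Euclidean, voting ties go to the lowest class index, and the variance is the population variance. All function names are hypothetical.

```python
import numpy as np

def knn_predict(train_X, train_y, test_X, k):
    """Label each test pattern by majority vote among its k nearest prototypes."""
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest]                 # labels of the k nearest prototypes
    n_classes = int(train_y.max()) + 1       # assumes labels are 0..C-1
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)             # ties resolve to the lowest class index

def stratified_half_split(y, rng):
    """Randomly pick half of each class for training; the rest is for testing."""
    train_idx = []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        train_idx.append(idx[: len(idx) // 2])
    train_idx = np.concatenate(train_idx)
    test_mask = np.ones(len(y), dtype=bool)
    test_mask[train_idx] = False
    return train_idx, np.flatnonzero(test_mask)

def error_rate_stats(X, y, k, n_trials=10, seed=0):
    """Return per-trial error rates, their mean, and their variance."""
    rng = np.random.default_rng(seed)
    rates = []
    for _ in range(n_trials):
        tr, te = stratified_half_split(y, rng)
        pred = knn_predict(X[tr], y[tr], X[te], k)
        rates.append(np.mean(pred != y[te]))
    rates = np.array(rates)
    # rates.var() is the population variance; use rates.var(ddof=1) for the sample variance.
    return rates, rates.mean(), rates.var()
```

With X and y in place, calling error_rate_stats(X, y, k) for each k in (1, 3, 5, 7, 9) would give the ten error rates plus their average and variance for that K.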

