



Material Type: Exam; Class: Machine Learning; Subject: (Computer Science); University: University of Houston; Term: Spring 2009;




a) Compare decision trees with kNN to solve classification problems. What are the main differences between these two approaches? [5]
kNN Decision Tree
b) We would like to predict the gender of a person based on two binary attributes: leg-cover (pants or skirts) and beard (beard or bare-faced). We assume we have a data set of 20000 individuals, 16000 of which are male and 4000 of which are female. 80% of the 16000 males are bare-faced. Skirts are present on 50% of the females. All females are bare-faced and no male wears a skirt.
i) Compute the information gain of using the attribute leg-cover for predicting gender! Just giving the formula that computes the information gain is fine; you do not need to compute the exact value of the formula! Use H as the entropy function in your formula (e.g. H(1/3, 2/3) is the entropy of a set in which 1/3 of the examples belong to class 1 and 2/3 of the examples belong to class 2). [2]
ii) Compute the information gain of using the attribute beard to predict gender! [2]
i) Gain(D, leg-cover) = H(1/5, 4/5) - (1/10) H(1, 0) - (9/10) H(1/9, 8/9)
ii) Gain(D, beard) = H(1/5, 4/5) - (4/25) H(0, 1) - (21/25) H(16/21, 5/21) (*)
(*) This question doesn't require you to compute the exact value, but you have to write the formulas in the above forms to get credit.
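Although the exact values are not required, the two formulas in b) can be checked numerically. The following is a minimal sketch, assuming H is Shannon entropy in bits; the function names and the count tables are illustrative and are derived from the percentages stated in the question.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(partitions):
    """Gain of a split; `partitions` maps attribute value -> (male, female) counts."""
    class_totals = [sum(col) for col in zip(*partitions.values())]
    n = sum(class_totals)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(sum(p) / n * entropy(p) for p in partitions.values())
    return entropy(class_totals) - remainder

# Counts derived from the question: 16000 males (80% bare-faced, none in skirts),
# 4000 females (50% in skirts, all bare-faced).
leg_cover = {"pants": (16000, 2000), "skirt": (0, 2000)}
beard = {"beard": (3200, 0), "bare-faced": (12800, 4000)}

gain_leg_cover = information_gain(leg_cover)
gain_beard = information_gain(beard)
print(f"Gain(D, leg-cover) = {gain_leg_cover:.4f}")
print(f"Gain(D, beard)     = {gain_beard:.4f}")
```

Under these assumptions leg-cover comes out at roughly 0.27 bits and beard at roughly 0.06 bits, so leg-cover is the more informative attribute for this data set.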
c) Why do decision tree learning algorithms grow the entire tree and then apply pruning techniques to trim the tree to obtain a tree of smaller size? [3]
a) What role do non-parametric density functions play for the DENCLUE clustering algorithm? Describe how the DENCLUE algorithm clusters a data set. Limit your answer to the second question to at most 6 sentences.
b) What is the main difference between the Gaussian Kernel Density function approach as described on page 157 of the textbook and the k-nearest Neighbor Density Estimator that has been described in Section 8.2.3? [3]
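The two estimators named in b) can be put side by side in code. This is a minimal one-dimensional sketch, assuming the usual forms p(x) = (1/(N h)) Σ K((x - x_i)/h) with a Gaussian kernel K, and p(x) = k / (2 N d_k(x)), where d_k(x) is the distance from x to its k-th nearest sample. Whether these match the textbook's notation exactly is an assumption, and the sample values are made up.

```python
from math import exp, pi, sqrt

def gaussian_kde(x, samples, h):
    """Gaussian kernel estimate: every sample contributes, with fixed bandwidth h."""
    n = len(samples)
    return sum(exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / (n * h * sqrt(2 * pi))

def knn_density(x, samples, k):
    """k-NN estimate: k is fixed and the window half-width d_k(x) adapts to the data.

    Note: d_k(x) is zero if x coincides with at least k samples; this sketch
    does not guard against that case.
    """
    n = len(samples)
    d_k = sorted(abs(x - s) for s in samples)[k - 1]
    return k / (2 * n * d_k)

samples = [1.0, 1.2, 1.9, 2.1, 2.3, 5.0]  # illustrative data only

for x in (0.0, 2.0, 4.0):
    print(x, gaussian_kde(x, samples, h=0.5), knn_density(x, samples, k=2))
```

In this sketch the kernel estimator keeps the width h fixed and lets the number of contributing samples vary, while the k-NN estimator keeps the number of samples k fixed and lets the width vary with x.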
c) What advantages do you see in using a non-parametric density estimation approach compared to parametric density estimation approaches, such as multivariate Gaussians? [3]
Non-parametric density estimation approach:
a) Both editing and condensing are popular in conjunction with kNN classifiers. What is the goal of dataset editing? What is the goal of dataset condensing? [3]
b) Give a sketch of an algorithm that uses Voronoi diagrams (or their dual Delaunay graphs) for condensing a classification dataset! [4]
a) What is the goal of Principal Component Analysis (PCA)? Limit your answer to at most 5 sentences. [4]
b) The eigenvectors chosen to form the transformation w^T that reduces dataset dimensionality in PCA have to be orthonormal. What does this mean? Why is it desirable that the selected eigenvectors are orthonormal? [4]
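The orthonormality property mentioned in b) can be checked on a small example. The following is a minimal two-dimensional sketch, assuming PCA is carried out on the sample covariance matrix; it uses the closed-form eigen-decomposition of a symmetric 2x2 matrix and assumes a non-zero covariance term. The function names and data points are illustrative only.

```python
from math import hypot, sqrt

def pca_2d(points):
    """Eigenvalues and unit eigenvectors of the 2x2 sample covariance matrix,
    ordered by decreasing eigenvalue. Assumes the covariance term b is non-zero."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [mid + radius, mid - radius]
    eigenvectors = []
    for lam in eigenvalues:
        vx, vy = b, lam - a          # solves (a - lam) * vx + b * vy = 0
        norm = hypot(vx, vy)
        eigenvectors.append((vx / norm, vy / norm))  # scale to unit length
    return eigenvalues, eigenvectors

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]  # made-up data
eigenvalues, (w1, w2) = pca_2d(points)

print("w1 . w1 =", dot(w1, w1))  # unit length
print("w2 . w2 =", dot(w2, w2))  # unit length
print("w1 . w2 =", dot(w1, w2))  # mutually perpendicular
```

Up to floating-point error, each eigenvector has length 1 and the two are perpendicular, which is what orthonormal means here.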
a) What are the characteristics of objects that are classified as outliers by DBSCAN? [2]
b) How does DBSCAN form clusters? Limit your answer to at most 5 sentences. [3]
c) DBSCAN does not work well for clustering datasets that have clusters of varying densities. What is the explanation for that? [2]
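The DBSCAN questions above refer to core points, border points and outliers. The following is a minimal sketch of the standard procedure, assuming Euclidean distance and a neighborhood count that includes the point itself; the parameter names eps and min_pts and the sample points are illustrative, and the brute-force neighbor search is chosen for brevity rather than speed.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    def neighbors(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later be relabelled as a border point
            continue
        cluster_id += 1                  # i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id   # border point reached from a core point
            if labels[j] is not None:
                continue                 # already assigned, do not expand again
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In this sketch eps and min_pts are single global values applied to every region of the data set, and the isolated point (5, 5) ends up labelled as noise because it is neither a core point nor within eps of one.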