Pattern Recognition Problem Set 3: Estimating Probabilities and Bayesian Classification

Problem Set 3 for CS 4803A/8803A: Pattern Recognition. It includes five problems on estimating probabilities with the multinomial distribution, maximum likelihood estimation, Bayesian classification, and independence of random variables. Problem 5 revisits a fishing scenario in which two fish of given weights would each be labeled bass if classified independently, and asks for the optimal joint classification.

CS 4803A/8803A: Pattern Recognition
Problem Set 3
Date: Feb 20, 2003
Due: Feb 28th, 2003, which is the Friday before Spring Break,
but will be accepted until Monday, March 10th.
1. It is suspected that a certain four-sided die is unfair. By rolling the die many times, we wish
to estimate the probability of each of the four outcomes. Let $\{P_i\}$, $i = 1, \ldots, 4$, denote the
unknown probabilities.
Suppose the die is rolled $N$ times. If the probabilities to be estimated were known, then the
probability of observing exactly $y_1$ 1's, $y_2$ 2's, ..., and $y_4$ 4's in the $N$ rolls would be:
$$P(y_1, \ldots, y_4; P_1, \ldots, P_4) = \frac{N!}{y_1!\, y_2! \cdots y_4!}\, P_1^{y_1} P_2^{y_2} \cdots P_4^{y_4}.$$
This distribution is called the multinomial distribution.
Suppose exactly $y_i$ occurrences of $i$ are observed in the $N$ rolls, where $i = 1, \ldots, 4$. Find the
maximum likelihood estimates $\{\hat{P}_i\}$ for the probabilities $\{P_i\}$, $i = 1, \ldots, 4$, in terms of the
$y_i$'s.
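As an aid for checking hand calculations, the multinomial probability in Problem 1 can be evaluated numerically. The following is a minimal Python sketch and is not part of the original problem set; the function name `multinomial_pmf` and the fair-die inputs in the usage line are illustrative assumptions. It only evaluates the formula for given probabilities and does not compute the maximum likelihood estimates.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of seeing counts[i] occurrences of outcome i in N = sum(counts) rolls.

    `counts` plays the role of (y_1, ..., y_4) and `probs` the role of (P_1, ..., P_4).
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient N! / (y_1! y_2! ... y_k!), kept exact with integer division.
    coeff = factorial(n)
    for y in counts:
        coeff //= factorial(y)
    # Product of P_i ** y_i over all outcomes.
    return coeff * prod(p ** y for p, y in zip(probs, counts))


# Hypothetical example: a fair four-sided die rolled 4 times, each face seen once.
print(multinomial_pmf([1, 1, 1, 1], [0.25] * 4))  # 0.09375
```

Evaluating this function for a fixed set of counts while varying `probs` is one way to sanity-check a candidate estimate derived by hand.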
2. DHS Problem 3.2
3. DHS Problem 3.15
4. DHS Problem 3.11 (Symbolic math, but cool to see that the obvious Gaussian model is in fact
the best Gaussian model for minimizing KL divergence.)
5. This one can be tricky. We’ll do it in class but first do it yourself ...
We return to the fishing scenario in Problem Set 2, Problem 7. By now you’ve constructed
the optimal Bayesian classifier to decide if a fish is carp or bass, based on the uncertain class
information you received from the two experts. You draw two fish from the pond which weigh
1.5 pounds and 5 pounds, respectively. If you applied your classifier independently to each
fish, they would both be labeled bass. However, this may not seem right, and indeed it isn’t
(remember: only one expert is right about the pond). In this problem we will figure out what
the optimal classification of the two fish should be.
(a) The weight of a fish in the pond can be considered a random variable. If I draw two fish,
the weights correspond to two random variables. Are these two variables independent,
given your limited knowledge of the pond? Motivate your answer mathematically. (Remember:
independence is not a property of the fish but of your knowledge about them.)
(b) There are four possible classifications of the two fish:

                    $c_2$ = carp    $c_2$ = bass
    $c_1$ = carp
    $c_1$ = bass

where $c_1$ is the class of fish 1 (which weighs $w_1 = 1.5$ pounds) and $c_2$ is the class of fish 2 (which weighs $w_2 = 5$ pounds). Make a table like this and fill in the probability of each possible classification given the weights you observed. What is the most probable classification of the fish? What is the least probable classification of the fish?

(c) What does this result imply in general about performing pattern recognition when you are uncertain about the class distributions (e.g. because you estimated them from training data)?
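For part (b), each table entry is a posterior P(c1, c2 | w1, w2), which by Bayes' rule equals the joint value p(w1, w2, c1, c2) divided by the sum of that joint over all four class pairs. The Python sketch below shows only this normalization step and how to read off the most and least probable pairs. It is not part of the original problem set: the function names are hypothetical, and the four joint values must come from your own Problem Set 2 model, which is not reproduced here.

```python
def posterior_table(joint):
    """Normalize joint values p(w1, w2, c1, c2) into posteriors P(c1, c2 | w1, w2).

    `joint` maps each (c1, c2) pair, e.g. ("carp", "bass"), to a non-negative number.
    """
    total = sum(joint.values())
    if total <= 0:
        raise ValueError("joint values must have a positive sum")
    return {pair: value / total for pair, value in joint.items()}


def most_and_least_probable(posterior):
    """Return the (c1, c2) pairs with the largest and smallest posterior."""
    best = max(posterior, key=posterior.get)
    worst = min(posterior, key=posterior.get)
    return best, worst
```

Keeping the normalization separate from the model makes it easy to compare the joint answer with the one obtained by classifying each fish on its own.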