


A step-by-step solution for constructing a decision tree from the given training examples in CS4600 - Introduction to Intelligent Systems, Homework 8. The document also explains how the resulting decision tree classifies a new example. It includes the formulas for entropy and information gain and walks through selecting the best feature to split on at each node.



Assume that you have the following training examples available:

            F1  F2  F3  F4  F5  Class
Example 1   t   t   f   f   f   p
Example 2   f   f   t   t   f   p
Example 3   t   f   f   t   f   p
Example 4   t   f   t   f   t   p
Example 5   f   t   f   f   f   n
Example 6   t   t   f   t   t   n
Example 7   f   t   t   t   t   n

Use all of the training examples to construct a decision tree. In case of ties between features, break ties in favor of features with smaller numbers (for example, favor F1 over F2, F2 over F3, and so on).
How does the resulting decision tree classify the following example:

            F1  F2  F3  F4  F5  Class
Example 8   f   f   f   t   t   ?
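Some formulas:
Pre-compute some values of I(x,y), the information of a set with x examples of one class and y of the other (taking 0 log_2 0 = 0):
$$I(0,x) = I(x,0) = -1\log_2 1 - 0\log_2 0 = 0$$
$$I(x,x) = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1$$
$$I(1,2) = I(2,1) = -\tfrac{2}{3}\log_2\tfrac{2}{3} - \tfrac{1}{3}\log_2\tfrac{1}{3} \approx 0.918$$
$$I(1,3) = I(3,1) = -\tfrac{3}{4}\log_2\tfrac{3}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} \approx 0.811$$
$$I(3,4) = I(4,3) = -\tfrac{4}{7}\log_2\tfrac{4}{7} - \tfrac{3}{7}\log_2\tfrac{3}{7} \approx 0.985$$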
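The pre-computed values can be reproduced with a few lines of Python. This is a minimal sketch; the function name `info` is an illustrative choice, not something taken from the assignment.

```python
from math import log2


def info(p: int, n: int) -> float:
    """Information I(p, n), in bits, of a set with p and n examples per class."""
    total = p + n
    if total == 0:
        return 0.0
    result = 0.0
    for count in (p, n):
        if count > 0:  # convention: 0 * log2(0) is treated as 0
            frac = count / total
            result -= frac * log2(frac)
    return result


# Print the same pairs that are pre-computed in the text.
for p, n in [(0, 4), (3, 3), (1, 2), (1, 3), (3, 4)]:
    print(f"I({p},{n}) = {info(p, n):.3f}")
```

Because the formula is symmetric in its two arguments, `info(1, 2)` and `info(2, 1)` return the same value.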
First, choose from {F1, F2, F3, F4, F5} to become the root.
Gain(F1) = I(4,3) - E(F1) = 0.128
Gain(F2) = I(4,3) - E(F2) = 0.522
Gain(F3) = I(4,3) - E(F3) = 0.020
Gain(F4) = I(4,3) - E(F4) = 0.020
Gain(F5) = I(4,3) - E(F5) = 0.128
Since Gain(F2) is the highest, F2 becomes the root.
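The root selection can be checked with a short script. This is a sketch under stated assumptions: the tuple encoding of the training table and the helper names (`info`, `counts`, `gain`) are illustrative, and gains are rounded before comparison so that floating-point noise cannot disturb the tie-breaking rule.

```python
from math import log2

FEATURES = ["F1", "F2", "F3", "F4", "F5"]
# Each row holds the values of F1..F5 followed by the class label.
EXAMPLES = [
    ("t", "t", "f", "f", "f", "p"),  # Example 1
    ("f", "f", "t", "t", "f", "p"),  # Example 2
    ("t", "f", "f", "t", "f", "p"),  # Example 3
    ("t", "f", "t", "f", "t", "p"),  # Example 4
    ("f", "t", "f", "f", "f", "n"),  # Example 5
    ("t", "t", "f", "t", "t", "n"),  # Example 6
    ("f", "t", "t", "t", "t", "n"),  # Example 7
]


def info(p, n):
    """I(p, n) in bits, with 0 * log2(0) treated as 0."""
    total = p + n
    return -sum((c / total) * log2(c / total) for c in (p, n) if c > 0)


def counts(examples):
    """Return (number of class-p examples, number of class-n examples)."""
    p = sum(1 for ex in examples if ex[-1] == "p")
    return p, len(examples) - p


def gain(examples, idx):
    """Gain(A) = I(p, n) - E(A) for the feature stored at column idx."""
    expected = 0.0
    for value in ("t", "f"):
        branch = [ex for ex in examples if ex[idx] == value]
        expected += len(branch) / len(examples) * info(*counts(branch))
    return info(*counts(examples)) - expected


gains = {name: gain(EXAMPLES, i) for i, name in enumerate(FEATURES)}
# max() returns the first maximal item, so ties favor lower-numbered features.
root = max(FEATURES, key=lambda name: round(gains[name], 9))

for name in FEATURES:
    print(f"Gain({name}) = {gains[name]:.3f}")
print("root:", root)
```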
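Then, choose from {F1, F3, F4, F5} to be F2’s f-child. The examples with F2 = f (Examples 2, 3, and 4) all have class p, so “p” becomes F2’s f-child.
Next, choose from {F1, F3, F4, F5} to be F2’s t-child. The examples with F2 = t are Examples 1 (p), 5 (n), 6 (n), and 7 (n), so the starting information is I(1,3).
Gain(F1) = I(1,3) - E(F1) = 0.311
Gain(F3) = I(1,3) - E(F3) = 0.123
Gain(F4) = I(1,3) - E(F4) = 0.311
Gain(F5) = I(1,3) - E(F5) = 0.311
F1, F4, and F5 tie for the highest Gain(). F1 is favored by the tie-breaking scheme and, thus, becomes F2’s t-child.
The examples with F2 = t and F1 = f (Examples 5 and 7) both have class n, so “n” becomes F1’s f-child.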
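For a feature A that splits a set of p class-p and n class-n examples into a t-branch with p_t and n_t examples and an f-branch with p_f and n_f examples:
$$I(p,n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$
$$E(A) = \frac{p_t+n_t}{p+n}\,I(p_t,n_t) + \frac{p_f+n_f}{p+n}\,I(p_f,n_f)$$
$$\mathrm{Gain}(A) = I(p,n) - E(A)$$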
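As a worked instance of Gain(A) = I(p,n) - E(A), consider F2 at the root. In the training table, F2 = t holds for Examples 1, 5, 6, and 7 (one p, three n), and F2 = f holds for Examples 2, 3, and 4 (three p, no n). Values are rounded to four decimal places.

$$E(F2) = \frac{4}{7}\,I(1,3) + \frac{3}{7}\,I(3,0) \approx \frac{4}{7}(0.8113) + \frac{3}{7}(0) \approx 0.4636$$
$$\mathrm{Gain}(F2) = I(4,3) - E(F2) \approx 0.9852 - 0.4636 = 0.5216$$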
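Next, choose from {F3, F4, F5} to be F1’s t-child. Only Examples 1 (p) and 6 (n) have F2 = t and F1 = t, so the starting information is I(1,1) = 1.
Gain(F3) = I(1,1) - E(F3) = 0
Gain(F4) = I(1,1) - E(F4) = 1
Gain(F5) = I(1,1) - E(F5) = 1
F4 and F5 have the highest Gain(). F4 is favored by the tie-breaking scheme and, thus, becomes F1’s t-child.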
Next, choose either F3 or F5 to be F4’s f-child.
Since the only remaining example (Example 1) has class p, “p” becomes F4’s f-child.
Then, choose either F3 or F5 to be F4’s t-child.
For the same reason, the only remaining example (Example 6) has class n, so “n” becomes F4’s t-child.
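The whole construction can be sketched as a short recursive procedure. This is an illustrative sketch, not the assignment's reference code: the tuple encoding of the data, the helper names, and the nested `(feature, {"t": ..., "f": ...})` tree representation are assumptions. It also does not handle empty subsets or running out of features with mixed classes, neither of which occurs for this training set.

```python
from math import log2

FEATURES = ["F1", "F2", "F3", "F4", "F5"]
# Each row holds the values of F1..F5 followed by the class label.
EXAMPLES = [
    ("t", "t", "f", "f", "f", "p"),  # Example 1
    ("f", "f", "t", "t", "f", "p"),  # Example 2
    ("t", "f", "f", "t", "f", "p"),  # Example 3
    ("t", "f", "t", "f", "t", "p"),  # Example 4
    ("f", "t", "f", "f", "f", "n"),  # Example 5
    ("t", "t", "f", "t", "t", "n"),  # Example 6
    ("f", "t", "t", "t", "t", "n"),  # Example 7
]


def info(p, n):
    """I(p, n) in bits, with 0 * log2(0) treated as 0."""
    total = p + n
    return -sum((c / total) * log2(c / total) for c in (p, n) if c > 0)


def counts(examples):
    """Return (number of class-p examples, number of class-n examples)."""
    p = sum(1 for ex in examples if ex[-1] == "p")
    return p, len(examples) - p


def gain(examples, idx):
    """Gain(A) = I(p, n) - E(A) for the feature stored at column idx."""
    expected = 0.0
    for value in ("t", "f"):
        branch = [ex for ex in examples if ex[idx] == value]
        expected += len(branch) / len(examples) * info(*counts(branch))
    return info(*counts(examples)) - expected


def build(examples, candidates):
    """Return a class label for a pure subset, else (feature, children)."""
    p, n = counts(examples)
    if n == 0:
        return "p"
    if p == 0:
        return "n"
    # Rounding guards against floating-point noise; max() keeps the first
    # maximal candidate, which implements the lower-number tie-break.
    best = max(candidates, key=lambda i: round(gain(examples, i), 9))
    rest = [i for i in candidates if i != best]
    children = {
        value: build([ex for ex in examples if ex[best] == value], rest)
        for value in ("t", "f")
    }
    return (FEATURES[best], children)


tree = build(EXAMPLES, list(range(len(FEATURES))))
print(tree)
```

Passing the candidate features in index order is what makes `max()` respect the tie-breaking scheme, so the order of `FEATURES` matters.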
The final tree below classifies Example 8 (f, f, f, t, t) as class p: because F2 = f, the example goes straight to F2’s f-child, which is the leaf “p”.
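F2
  t: F1
    t: F4
      t: n
      f: p
    f: n
  f: p

Examples reaching F1’s t-child (F2 = t, F1 = t):

            F1  F2  F3  F4  F5  Class
Example 1   t   t   f   f   f   p
Example 6   t   t   f   t   t   n

Example reaching F4’s f-child:

            F1  F2  F3  F4  F5  Class
Example 1   t   t   f   f   f   p

Example reaching F4’s t-child:

            F1  F2  F3  F4  F5  Class
Example 6   t   t   f   t   t   n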
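The classification of Example 8 can be checked by walking the final tree in code. In this minimal sketch the nested-tuple encoding of the tree and the `classify` helper are illustrative assumptions. The leaves for F2 = f and F1 = f are read off the training table, where those subsets are all class p and all class n respectively.

```python
# Internal nodes are (feature, {"t": subtree, "f": subtree}); leaves are labels.
TREE = ("F2", {
    "f": "p",
    "t": ("F1", {
        "f": "n",
        "t": ("F4", {"f": "p", "t": "n"}),
    }),
})


def classify(tree, example):
    """Follow the branch matching each tested feature until a leaf is reached."""
    while not isinstance(tree, str):
        feature, children = tree
        tree = children[example[feature]]
    return tree


example8 = {"F1": "f", "F2": "f", "F3": "f", "F4": "t", "F5": "t"}
print(classify(TREE, example8))  # prints p
```

Only F2 is ever tested for Example 8, so its values for the other four features do not affect the result.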