Decision Trees - Artificial Intelligence - Solved Exam, Exams of Artificial Intelligence

Main points of this past exam are: Decision Trees, Information Gain, Linear Svm, Naïve Bayes, Markov Models, Probability, Boolean Random Variables, Negative Rate

Typology: Exams

2012/2013

Uploaded on 04/08/2013

Final Examination
CS 540-2: Introduction to Artificial Intelligence
December 20, 2010
LAST NAME: SOLUTION
FIRST NAME:
Problem Score Max Score
1 ___________ 6
2 ___________ 8
3 ___________ 6
4 ___________ 12
5 ___________ 15
6 ___________ 6
7 ___________ 9
8 ___________ 8
9 ___________ 15
10 ___________ 15
Total ___________ 100

1. [6] Entropy

Running from You-Know-Who, Harry enters the CS building on the 1st floor. He flips a fair coin; if it is heads, he hides in room 1325; otherwise he climbs to the 2nd floor. In that case he flips the coin again; if it is heads, he hides in the CSL; otherwise he climbs to the 3rd floor. In that case he flips the coin yet again; if it is heads, he hides in 3331; otherwise he hides in the Men’s room. What is the entropy of Harry’s location?

L = (1/2, 1/4, 1/8, 1/8)

H(L) = -[(1/2) log_2(1/2) + (1/4) log_2(1/4) + (1/8) log_2(1/8) + (1/8) log_2(1/8)]

= -[(1/2)(-1) + (1/4)(-2) + (1/8)(-3) + (1/8)(-3)]

= 1.75 bits
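The arithmetic above can be checked with a short script. This is a minimal sketch; the function name `entropy_bits` and the variable names are illustrative choices, not part of the exam.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete distribution given as a list of probabilities."""
    # Zero-probability outcomes contribute nothing, so they are skipped.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Order: room 1325, CSL, room 3331, Men's room
location_probs = [1/2, 1/4, 1/8, 1/8]
harry_entropy = entropy_bits(location_probs)
print(harry_entropy)  # 1.75
```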

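2. [8] Decision Trees

There are 100 parrots. They have either a red beak or a black beak. They can either talk or not. Complete the two cells in the following table so that the mutual information (i.e., information gain) between “Beak” and “Talk” is 0. Show your work that justifies your answer.

Number of parrots   Beak    Talk
10                  Red     Yes
30 (or 15)          Red     No
15 (or 30)          Black   Yes
45                  Black   No

Zero mutual information requires Beak and Talk to be independent. With x and y the two missing counts, x + y = 100 - 10 - 45 = 45, and independence requires x * y = 10 * 45 = 450, so (x, y) = (30, 15) or (15, 30).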
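As a numerical check, the mutual information can be computed directly from the count table for both possible fillings. This is a sketch only; `mutual_information_bits` and `parrot_table` are hypothetical helper names introduced for illustration, and the (20, 25) filling is an arbitrary counterexample.

```python
import math

def mutual_information_bits(counts):
    """Mutual information, in bits, of two variables given joint counts {(x, y): n}."""
    total = sum(counts.values())
    count_x, count_y = {}, {}
    for (x, y), n in counts.items():
        count_x[x] = count_x.get(x, 0) + n
        count_y[y] = count_y.get(y, 0) + n
    mi = 0.0
    for (x, y), n in counts.items():
        if n > 0:
            # p(x, y) / (p(x) p(y)) simplifies to n * total / (count_x * count_y)
            mi += (n / total) * math.log2(n * total / (count_x[x] * count_y[y]))
    return mi

def parrot_table(red_no, black_yes):
    """Build the exam's table with the two missing cells filled in."""
    return {("Red", "Yes"): 10, ("Red", "No"): red_no,
            ("Black", "Yes"): black_yes, ("Black", "No"): 45}

mi_first = mutual_information_bits(parrot_table(30, 15))
mi_second = mutual_information_bits(parrot_table(15, 30))
mi_other = mutual_information_bits(parrot_table(20, 25))  # sums to 100 but is not independent
print(mi_first, mi_second, mi_other > 0)
```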

5. [15] Bayesian Networks

Consider the Bayesian network A → B ← C with three Boolean random variables and their CPTs defined by:

P(A) = 0.2
P(B | A, C) = 0.
P(B | A, ¬C) = 0.5
P(B | ¬A, C) = 0.3
P(B | ¬A, ¬C) = 0.8
P(C) = 0.55

a. Compute P(¬A, B, C)

P(¬A, B, C) = P(B | ¬A, C) P(¬A) P(C) = (.3)(.8)(.55) = 0.132

b. Compute P(¬A | ¬C)

P(¬A | ¬C) = P(¬A) = 0.8, since A and C are independent

c. Compute P(A | B, ¬C)

P(A | B, ¬C) = P(A, B, ¬C) / P(B, ¬C)

P(A, B, ¬C) = P(B | A, ¬C) P(A) P(¬C) = (.5)(.2)(.45) = 0.045

P(B, ¬C) = P(A, B, ¬C) + P(¬A, B, ¬C) = 0.333

So, P(A | B, ¬C) = .045 / .333 = 0.135
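The three answers can be reproduced by multiplying out the factorization P(A, B, C) = P(A) P(C) P(B | A, C). The sketch below uses only the numbers that appear in the worked solutions; the entry for P(B | not A, not C) = 0.8 is an assumption inferred from P(B, not C) = 0.333, and P(B | A, C) is omitted because none of the three queries needs it.

```python
# Priors as used in the worked solutions: P(not A) = 0.8 and P(C) = 0.55.
p_a = 0.2
p_c = 0.55

# P(B = true | A, C), keyed by (a, c). Only the entries the queries need are listed.
p_b_given = {
    (False, True): 0.3,   # P(B | not A, C)
    (True, False): 0.5,   # P(B | A, not C)
    (False, False): 0.8,  # P(B | not A, not C), inferred from P(B, not C) = 0.333
}

def prior(p_true, value):
    """Probability that a Boolean variable with P(true) = p_true takes the given value."""
    return p_true if value else 1.0 - p_true

def joint_with_b_true(a, c):
    """P(A = a, B = true, C = c) under the network A -> B <- C."""
    return p_b_given[(a, c)] * prior(p_a, a) * prior(p_c, c)

part_a = joint_with_b_true(False, True)            # P(not A, B, C)
part_b = prior(p_a, False)                         # P(not A | not C) = P(not A)
numerator = joint_with_b_true(True, False)         # P(A, B, not C)
denominator = numerator + joint_with_b_true(False, False)  # P(B, not C)
part_c = numerator / denominator                   # P(A | B, not C)

print(round(part_a, 3), round(part_b, 3), round(part_c, 3))
```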

6. [6] Naïve Bayes

Which one or more of the following are true statements about the conditional independence properties that are guaranteed true in a Bayesian network that is used to represent a Naïve Bayes classifier in which there are three evidence variables, W, X, and Y, and one classification variable, C?

a. P(C | W, X, Y) = P(C | W) * P(C | X) * P(C | Y)
b. P(C | W, X, Y) = (P(W, X, Y | C) * P(C)) / P(W, X, Y)
c. P(W, X, Y | C) = P(W | C) * P(X | C) * P(Y | C)
d. P(W, X, Y) = P(W) * P(X) * P(Y)
e. P(C | W, X, Y) = P(C | W) * P(W | X) * P(X | Y)
f. P(W, X, Y | C) = P(W | C) * P(X | W) * P(Y | X)

(c) is the only one that relates to conditional independence in a Bayesian network for Naïve Bayes.
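To illustrate how factorization (c) is used in practice, the sketch below computes a Naïve Bayes posterior by multiplying per-variable likelihoods and normalizing. All probabilities here are made-up illustrative numbers, and `naive_bayes_posterior` is a hypothetical helper, not something taken from the exam.

```python
def naive_bayes_posterior(class_prior, likelihoods, evidence):
    """Posterior over classes, assuming evidence variables are independent given the class.

    class_prior: {c: P(C = c)}
    likelihoods: {c: {var: P(var = true | C = c)}}
    evidence:    {var: observed Boolean value}
    """
    unnormalized = {}
    for c, p_c in class_prior.items():
        score = p_c
        for var, value in evidence.items():
            p_true = likelihoods[c][var]
            # Factorization (c): multiply one conditional term per evidence variable.
            score *= p_true if value else 1.0 - p_true
        unnormalized[c] = score
    z = sum(unnormalized.values())  # equals P(W, X, Y) for the observed evidence
    return {c: s / z for c, s in unnormalized.items()}

# Hypothetical numbers for illustration only.
class_prior = {True: 0.5, False: 0.5}
likelihoods = {
    True:  {"W": 0.8, "X": 0.5, "Y": 0.5},
    False: {"W": 0.2, "X": 0.5, "Y": 0.5},
}
posterior = naive_bayes_posterior(class_prior, likelihoods, {"W": True, "X": True, "Y": False})
print(posterior)
```

With these numbers only W carries information about C, so the posterior for C = true works out to 0.1 / (0.1 + 0.025) = 0.8.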

7. [9] Speech Recognition

Traditional speech recognition can be posed as a probabilistic inference problem: given acoustic signal A, the task is to find a sentence (i.e., sequence of words) W such that

W* = argmax_W P(W | A) = argmax_W P(A | W) P(W)    (1)

where P(A | W) is the acoustic model and P(W) is the language model. In light of the McGurk effect, video signal V of the speaker’s face is also helpful in speech recognition. In one line, write down how you would modify equation (1) to incorporate both the acoustic and video signals for speech recognition. In another line, briefly explain the components in English.

W* = argmax_W P(W | A, V) = argmax_W P(A, V | W) P(W) = argmax_W P(A | W) P(V | W) P(W), assuming A and V are conditionally independent given W. This means we have the same acoustic model and language model as before, but now we add P(V | W) as a “video model”.
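The modified decision rule can be sketched as a search over candidate sentences that sums log-probabilities from the three models. The candidate set and every probability below are invented for illustration, chosen so that adding the video model changes the answer; `best_sentence` is a hypothetical helper.

```python
import math

def best_sentence(candidates, acoustic, video, language):
    """Return argmax_W of log P(A | W) + log P(V | W) + log P(W)."""
    def score(w):
        # Logs turn the product of the three models into a sum.
        return math.log(acoustic[w]) + math.log(video[w]) + math.log(language[w])
    return max(candidates, key=score)

# Hypothetical model outputs for three candidate utterances.
candidates = ["ba", "ga", "da"]
acoustic = {"ba": 0.60, "ga": 0.10, "da": 0.30}   # P(A | W)
video = {"ba": 0.05, "ga": 0.60, "da": 0.35}      # P(V | W)
language = {"ba": 1/3, "ga": 1/3, "da": 1/3}      # P(W), uniform here

# A constant video model reduces the rule to equation (1).
uninformative_video = {w: 1.0 for w in candidates}

audio_only_choice = best_sentence(candidates, acoustic, uninformative_video, language)
audio_video_choice = best_sentence(candidates, acoustic, video, language)
print(audio_only_choice, audio_video_choice)
```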

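8. [8] Neural Networks

Fill in the two missing weights below so that the following 2-layer neural network computes A XOR B. Both A and B take values 0 or 1, and the units are Linear Threshold Units (LTUs).

[Figure: 2-layer network of LTUs with inputs A and B, bias inputs of 1, and output A XOR B. Weight labels surviving from the figure: w = -10; w = 1 (three edges); bias weights w = -0., w = -5, and w = -0. Filled-in answers: w < -0. and 5 ≤ w < 15.]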
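For reference, here is one standard way to build XOR from LTUs. The weights in this sketch are a common textbook choice and are not the weights from the exam's figure; the threshold convention (fire when the weighted sum plus bias is at least 0) is also an assumption.

```python
def ltu(weights, bias, inputs):
    """Linear threshold unit: outputs 1 when bias + weighted sum of inputs is >= 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= 0 else 0

def xor_net(a, b):
    """Two hidden LTUs (OR and AND) feeding one output LTU that computes OR and not AND."""
    h_or = ltu([1, 1], -0.5, [a, b])    # fires when at least one input is 1
    h_and = ltu([1, 1], -1.5, [a, b])   # fires only when both inputs are 1
    return ltu([1, -2], -0.5, [h_or, h_and])

truth_table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```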

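10. [15] Hidden Markov Models

You sometimes get colds (C), which make you sneeze (S). You also get allergies (A), which make you sneeze. Sometimes you are well (W), which doesn’t make you sneeze (Q). You decide to model this as an HMM with hidden states C, A, W, and observable states S, Q as follows:

[Figure: HMM state diagram with Start, hidden states W, A, C, and observations S, Q. Parameters used in the solutions: P(W | Start) = 1; transitions P(W | W) = .7, P(A | W) = .2, P(C | W) = .1, P(C | C) = .6, P(W | C) = .2; emissions P(Q | W) = .9, P(S | W) = .1, P(S | A) = .8, P(S | C) = .7.]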

a. What is the probability of the sequence W, C, C, W on days 1 to 4?

P(q1=W, q2=C, q3=C, q4=W) = P(q4=W | q3=C) P(q3=C | q2=C) P(q2=C | q1=W) P(q1=W | Start) = (.2)(.6)(.1)(1) = 0.012

b. What is the probability that on day 1 you observe Q and on day 2 you observe S?

P(o1=Q, o2=S) = P(o1=Q, o2=S | q1=W, q2=W) P(q1=W, q2=W)

  + P(o1=Q, o2=S | q1=W, q2=A) P(q1=W, q2=A)

  + P(o1=Q, o2=S | q1=W, q2=C) P(q1=W, q2=C)

= P(o1=Q | q1=W) P(o2=S | q2=W) P(q2=W | q1=W) P(q1=W) + ...

= (.9)(.1)(.7)(1) + (.9)(.8)(.2)(1) + (.9)(.7)(.1)(1)

= 0.27

c. What is the probability that on day 2 you are Well?

P(q2=W) = P(q2=W | q1=W) P(q1=W) = (.7)(1) = 0.7
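All three HMM answers can be reproduced by brute-force enumeration. The sketch below uses only the parameters that appear in the worked solutions. Transitions out of state A do not appear there and are left out, which is safe here because the chain always starts in W. The Q emissions for A and C are assumptions filled in by normalization, and the function names are illustrative.

```python
# Initial distribution: the solutions use P(q1 = W | Start) = 1.
start = {"W": 1.0, "A": 0.0, "C": 0.0}

# Transition probabilities that appear in the solutions (rows for A are unknown).
trans = {
    "W": {"W": 0.7, "A": 0.2, "C": 0.1},
    "C": {"W": 0.2, "C": 0.6},
}

# Emission probabilities; the Q entries for A and C are assumed, as 1 - P(S | state).
emit = {
    "W": {"Q": 0.9, "S": 0.1},
    "A": {"Q": 0.2, "S": 0.8},
    "C": {"Q": 0.3, "S": 0.7},
}

def path_probability(states):
    """Probability of a specific hidden-state sequence starting on day 1."""
    p = start[states[0]]
    for prev, nxt in zip(states, states[1:]):
        p *= trans[prev][nxt]
    return p

def two_day_observation_probability(o1, o2):
    """P(o1, o2), summing over all hidden-state pairs (q1, q2)."""
    total = 0.0
    for q1, p1 in start.items():
        if p1 == 0.0:
            continue  # unreachable start states contribute nothing
        for q2, p12 in trans[q1].items():
            total += p1 * emit[q1][o1] * p12 * emit[q2][o2]
    return total

def day2_state_probability(state):
    """P(q2 = state), summing over the possible day-1 states."""
    return sum(p1 * trans[q1].get(state, 0.0) for q1, p1 in start.items() if p1 > 0.0)

answer_a = path_probability(["W", "C", "C", "W"])
answer_b = two_day_observation_probability("Q", "S")
answer_c = day2_state_probability("W")
print(round(answer_a, 3), round(answer_b, 3), round(answer_c, 3))
```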