Midterm Exam Question with Solution - Machine Learning | ECS 271, Exams of Computer Science

University of California - Davis Computer Science

Material Type: Exam; Class: Machine Learning; Subject: Engineering Computer Science; University: University of California - Davis; Term: Spring 2004;

Typology: Exams

Pre 2010

Uploaded on 07/31/2009

koofers-user-763 🇺🇸


Your name: __________________

Your ID:_____________________

UC-Davis

ECS 271 Midterm Examination

Closed book

Spring 2004

Show all work clearly and legibly. Remember, you are being tested. So even if an answer is obvious to you, please show all the justification by clearly showing the calculations, or explaining why a calculation is skipped.

1. True or False? (9 points each)

(a) In the PAC learning model, the learner makes no assumptions about the class from which the target concept is drawn. (False)

(b) In PAC learning, the learner outputs the hypothesis from H that has the least error (possibly zero) over the training data. (False)

(c) The number of training examples required for successful learning is strongly influenced by the complexity of the hypothesis space considered by the learner. (True)

2. (15 points) Illustrate your understanding of the back propagation method by explicitly showing all steps of the calculations with respect to a single neuron with a sigmoidal nonlinearity. Assume that you are at the output stage of the network. The objective is for the unit to learn a single input pattern, namely

$i = (i_1, i_2)^T = (1, 4)^T$

The desired output is o = 1. Initially assume $w_1 = w_2 = 0$. Use a learning rate $\eta = 1.0$. Show all the calculations for two iterations. Show the weight values at the end of the first and second iterations. In what direction is the weight vector moving from iteration to iteration?


Solution:

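1st iteration: netinput = 0, output = 1/(1 + exp(0)) = 1/2, error = (1-0.5)**2 = 0.25

delta-w1 = eta * (desired - output) * output * (1 - output) * i1 = 1.0*(0.5)(0.5)(1-0.5) i1 = 0.125 i1 = 0.125

delta-w2 = eta * (desired - output) * output * (1 - output) * i2 = 1.0*(0.5)(0.5)(1-0.5) i2 = 0.125 i2 = 0.5

new weights are 0.125 and 0.5

2nd iteration: netinput = 0.125*1 + 0.5*4 = 2.125, output = 1/(1 + exp(-2.125)) = 0.893, error = (1-0.893)**2 = 0.0114

delta-w1 = eta * (desired - output) * output * (1 - output) * i1 = 1.0*(1-0.893)(0.893)(1-0.893) i1 = 0.0102 i1 = 0.0102

delta-w2 = eta * (desired - output) * output * (1 - output) * i2 = 0.0102 i2 = 0.041

new weights are 0.135 and 0.541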

The weight vector is moving toward the input vector.
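The two iterations can be checked with a short script. This is a minimal sketch, assuming the input pattern is (1, 4), which is what the net input of 2.125 in the second iteration implies, and assuming the usual output-layer delta rule for a sigmoid unit with squared error. The function names `sigmoid` and `train_single_neuron` are illustrative and do not come from the exam.

```python
import math


def sigmoid(x):
    """Logistic sigmoid nonlinearity."""
    return 1.0 / (1.0 + math.exp(-x))


def train_single_neuron(inputs, target, weights, eta, iterations):
    """Apply the delta rule repeatedly; return the weights after each iteration."""
    history = []
    w = list(weights)
    for _ in range(iterations):
        net = sum(wi * xi for wi, xi in zip(w, inputs))
        out = sigmoid(net)
        # Output-layer delta for a sigmoid unit: (target - out) * out * (1 - out)
        delta = (target - out) * out * (1.0 - out)
        w = [wi + eta * delta * xi for wi, xi in zip(w, inputs)]
        history.append(list(w))
    return history


# Assumed problem data: input (1, 4), desired output 1, zero initial weights, eta = 1.0
history = train_single_neuron([1.0, 4.0], 1.0, [0.0, 0.0], 1.0, 2)
for step, w in enumerate(history, start=1):
    print(f"iteration {step}: w1 = {w[0]:.3f}, w2 = {w[1]:.3f}")
```

Because every update is a scalar multiple of the input vector and the weights start at zero, the ratio w2/w1 stays equal to i2/i1 in this sketch, which is one way to see that the weight vector moves along the direction of the input.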

3. (8 points) Suppose H is a set of possible hypotheses and D is a set of training

data. We would like our program to output the most probable hypothesis h

from H, given the data D. Under what conditions does the following hold?

$\arg\max_{h \in H} P(H \mid D) = \arg\max_{h \in H} P(D \mid H)$

Solution: First, there is a typo. The H should be h under arg max. But this is a

minor thing and it did not bother any of you. So let us proceed.

The starting formula is

$\arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \frac{P(D \mid h)\, P(h)}{P(D)}$

P(D) can be dropped because it does not depend on h.

P(h) can be treated as a constant if all the hypotheses in the hypothesis space are equally likely.

Under these conditions, both sides are equal, as stated in the question.
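A small numerical illustration may help here. The sketch below uses made-up likelihoods and priors for three hypothetical hypotheses `h1`, `h2`, `h3` (none of these numbers come from the exam) to show that the MAP and maximum-likelihood choices agree under a uniform prior and can differ otherwise.

```python
def argmax_map(likelihood, prior):
    """Hypothesis maximizing P(D|h) * P(h); P(D) is omitted since it does not depend on h."""
    return max(likelihood, key=lambda h: likelihood[h] * prior[h])


def argmax_ml(likelihood):
    """Hypothesis maximizing P(D|h) alone."""
    return max(likelihood, key=lambda h: likelihood[h])


# Hypothetical values of P(D|h), chosen only for illustration
likelihood = {"h1": 0.2, "h2": 0.5, "h3": 0.3}

uniform_prior = {h: 1.0 / len(likelihood) for h in likelihood}
skewed_prior = {"h1": 0.8, "h2": 0.1, "h3": 0.1}

print(argmax_ml(likelihood))                 # h2
print(argmax_map(likelihood, uniform_prior))  # h2, same as maximum likelihood
print(argmax_map(likelihood, skewed_prior))   # h1, the prior changes the answer
```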

5. (a) (12 points) Build a decision tree to classify the following patterns. Show

all the calculations systematically or explain why certain calculations are

skipped.

Pattern

(x1,x2,x3)

Class

(b) (2 points) What Boolean function is the above tree implementing?

Solution:

A plot of the 8 points along x1, x2 and x3 gives an idea of how to solve this.

The initial uncertainty of all 8 points is

-(6/8) log2 (6/8) – (2/8) log2 (2/8) = 0.81

Suppose we divide the points by drawing a plane perpendicular to the x1 axis (i.e., parallel to the x2-x3 plane). Then the left branch has 4 points all belonging to the same class and the right branch has two of each class. So the uncertainty of the left branch is

-(4/4) log2 (4/4) – (0/4) log2 (0/4) = 0

(taking 0 log2 0 = 0). The uncertainty of the right branch is

-(2/4) log2 (2/4) – (2/4) log2 (2/4) = 1

Average uncertainty after the first test (on x1) is

(4/8)(0) + (4/8)(1) = 0.5

Uncertainty reduction achieved is 0.81 – 0.5 = 0.31

Do a similar thing along x2 and x3 and find out that a test along x3 gives exactly the same uncertainty reduction and a test along x2 gives no improvement at all. So first choose either x1 or x3.

The decision tree really implements f = x1x3.
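The entropy calculations can be reproduced with a few lines of code. The pattern table did not survive in this copy, so the sketch below assumes the eight binary patterns are labeled by f = x1 AND x3, which matches the 6/2 class split used in the solution. The helper names `entropy` and `information_gain` are illustrative.

```python
import math
from itertools import product


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * math.log2(p)
    return result


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting on binary attribute index attr."""
    n = len(rows)
    remainder = 0.0
    for value in (0, 1):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        if subset:
            remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


# Assumed data: all 8 patterns (x1, x2, x3), class = x1 AND x3
rows = list(product((0, 1), repeat=3))
labels = [x1 & x3 for x1, x2, x3 in rows]

gains = [information_gain(rows, labels, attr) for attr in range(3)]
for name, gain in zip(("x1", "x2", "x3"), gains):
    print(f"gain({name}) = {gain:.3f}")
```

Under this assumed labeling, the gains for x1 and x3 are equal and the gain for x2 is zero, so x2 never needs to appear in the tree.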

(c) (5 points) Consider a decision tree built from an arbitrary set of data. If the output is discrete-valued and can take on k different possible values, what is the maximum training set error (expressed as a fraction) that any data set could possibly have?

Suggested Solution: The answer is (k-1)/k. Consider data sets with identical inputs but

the outputs are evenly distributed among k classes. Then we will always get one correct

classification and (k-1) erroneous classifications.
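The worst case can be checked numerically. This is a minimal sketch, assuming all examples share the same input so the tree is a single leaf that predicts the majority class. The function name `majority_leaf_error` and the choice k = 4 are illustrative.

```python
from collections import Counter


def majority_leaf_error(labels):
    """Training error of a single leaf that predicts the most common label."""
    counts = Counter(labels)
    majority_count = counts.most_common(1)[0][1]
    return 1 - majority_count / len(labels)


k = 4
# Identical inputs, outputs evenly spread over k classes (3 examples per class)
labels = list(range(k)) * 3
print(majority_leaf_error(labels))  # 0.75, i.e. (k - 1) / k
```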

6. (12 points) Imagine that you are given the following set of training examples.

All the features are Boolean-valued.

F1 F2 F3 Result

T T F +

F T T +

T F T -

F T F -

F F T -

How would a Naive Bayes approach classify the following test example?

Be sure to show your work.

F1 = T F2 = F F3 = F

Solution: There are only two possible answers, + and -. So you could toss a coin, guess the answer, and be on the correct side 50% of the time. Therefore, it becomes important that you show all calculations, and that they be correct too, to justify your answer.

Furthermore, one of the probability terms is zero. This makes it doubly dangerous because you can get the correct classification despite a horde of calculation errors.

From the historical data given to you, P(+) = 2/5 = 0.4 and P(-) = 3/5 = 0.6

Under the Naive Bayes assumption, you simply have to calculate arg max over vj of P(vj) P(F1=T|vj) P(F2=F|vj) P(F3=F|vj). Note vj can assume only two values, + and -.

P(vj = +)* P(F1| +) P(F2| +) P(F3 | +)

P(vj = -)* P(F1| -) P(F2| -) P(F3 | -)


P(F1|+) = P(F1=T|+) = 1/2 = 0.5

P(F2|+) = P(F2=F|+) = 0/2 = 0

P(F3|+) = P(F3=F|+) = 1/2 = 0.5
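The full Naive Bayes comparison for both classes can be sketched in code. This is a minimal sketch that uses plain relative frequencies from the five training examples, with no smoothing, as the hand calculation above does. The function name `naive_bayes_scores` is illustrative.

```python
# Training examples from the question: (F1, F2, F3) -> Result
data = [
    ((True, True, False), "+"),
    ((False, True, True), "+"),
    ((True, False, True), "-"),
    ((False, True, False), "-"),
    ((False, False, True), "-"),
]


def naive_bayes_scores(data, query):
    """Return P(v) * prod_j P(F_j = query_j | v) for each class v (unsmoothed)."""
    scores = {}
    for label in sorted({lab for _, lab in data}):
        rows = [features for features, lab in data if lab == label]
        score = len(rows) / len(data)  # class prior P(v)
        for j, value in enumerate(query):
            matches = sum(1 for r in rows if r[j] == value)
            score *= matches / len(rows)  # conditional P(F_j = value | v)
        scores[label] = score
    return scores


query = (True, False, False)  # F1 = T, F2 = F, F3 = F
scores = naive_bayes_scores(data, query)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

In this sketch the zero factor P(F2=F|+) forces the + score to zero, so the sign of the answer is decided by whether the - score is positive.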