Classes Part 1-Artificial Intelligence-Quiz, Exercises of Artificial Intelligence

Central University of Jammu and Kashmir Artificial Intelligence

Madam Amrita Ahuja took this quiz in class of Artificial Intelligence at Central University of Jammu and Kashmir. This quiz involves: Learning, Hypothesis, Classes, RealValued, Inputs, Separators, Trees, Linear, Kernel, Gaussian, Svm

Typology: Exercises

2011/2012

Uploaded on 07/31/2012

shaina_44kin 🇮🇳


5 Learning hypothesis classes (16 points)

Consider a classification problem with two real-valued inputs. For each of the following algorithms, specify all of the separators below that it could have generated and explain why. If it could not have generated any of the separators, explain why not.

1. 1-nearest neighbor

2. decision trees on real-valued inputs


[Figure: candidate separators; only the panel labels B, C, D, E and F survive in the extracted text.]


3. standard perceptron algorithm

4. SVM with linear kernel

5. SVM with Gaussian kernel (σ = 0.25)

6. SVM with Gaussian kernel (σ = 1)

7. neural network with no hidden units and one sigmoidal output unit, run until convergence of training error

8. neural network with 4 hidden units and one sigmoidal output unit, run until convergence of training error

7 SVMs (12 points)

Assume that we are using an SVM with a polynomial kernel of degree 2. You are given the following support vectors:

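[Table of support vectors with columns $x_1$, $x_2$, $y$; the row values are garbled in the extracted text ("1 2 + 1 2 1").]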

The α values for each of these support vectors are equal to 0.05.

1. What is the value of $b$? Explain your approach to getting the answer.

2. What value does this SVM compute for the input point (1, 3)?
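As a rough illustration of the mechanics involved, the sketch below evaluates a degree-2 polynomial-kernel SVM. Everything specific in it is an assumption: the kernel form $(1 + u \cdot v)^2$ is one common convention and may not be the one intended here, and the support vectors and α values are hypothetical placeholders, not the ones from the table above. The bias is obtained from the margin condition $y_k f(x_k) = 1$ at a support vector.

```python
def poly_kernel(u, v, degree=2):
    # Assumed kernel form: (1 + u.v)^degree. Other conventions exist.
    return (1.0 + sum(ui * vi for ui, vi in zip(u, v))) ** degree

def weighted_kernel_sum(x, support, alphas):
    # sum_i alpha_i * y_i * K(x_i, x) over the support vectors
    return sum(a * y * poly_kernel(sx, x) for a, (sx, y) in zip(alphas, support))

def solve_bias(support, alphas, k=0):
    # A support vector sits on the margin: y_k * f(x_k) = 1.
    # With y_k in {+1, -1} this gives b = y_k - sum_i alpha_i y_i K(x_i, x_k).
    sx, y = support[k]
    return y - weighted_kernel_sum(sx, support, alphas)

def svm_output(x, support, alphas, b):
    return weighted_kernel_sum(x, support, alphas) + b

# Hypothetical support vectors and alphas, chosen only for illustration.
support = [((0.0, 1.0), +1), ((0.0, -1.0), -1)]
alphas = [0.25, 0.25]

b = solve_bias(support, alphas)
value = svm_output((1.0, 3.0), support, alphas, b)
```

A useful sanity check is that solving for the bias from either support vector (`k=0` or `k=1`) gives the same number; if it does not, the α values and support vectors are not mutually consistent.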

8 Neural networks (18 points)

A physician wants to use a neural network to predict whether patients have a disease, based on the results of a battery of tests. He has assigned a cost of $c_{01}$ to false positives (generating an output of 1 when it ought to have been 0), and a cost of $c_{10}$ to generating an output of 0 when it ought to have been 1. The cost of a correct answer is 0. The neural network is just a single sigmoid unit, which computes the following function:

$g(\bar{x}) = s(\bar{w} \cdot \bar{x})$

with $s(z)$ being the usual sigmoid function.

1. Give an error function for the whole training set, $E(\bar{w})$, that implements this error metric. For example, for a training set of 20 cases, if the network predicts 1 for 5 cases that should have been 0, predicts 0 for 3 cases that should have been 1, and predicts the other 12 correctly, the value of the error function should be $5c_{01} + 3c_{10}$.

2. Would this be an appropriate error criterion to use for a neural network? Why or why not?
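The worked example in the first question can be reproduced with a small counting sketch. It operates on already-thresholded 0/1 predictions; the function name and the numeric costs used at the end are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def cost_weighted_error(predictions, targets, c01, c10):
    """Sum the cost of every misclassified case.

    c01: cost of predicting 1 when the target is 0 (false positive)
    c10: cost of predicting 0 when the target is 1 (false negative)
    """
    total = 0.0
    for predicted, target in zip(predictions, targets):
        if predicted == 1 and target == 0:
            total += c01
        elif predicted == 0 and target == 1:
            total += c10
    return total

# The 20-case example from the text: 5 false positives,
# 3 false negatives, and 12 correct predictions.
targets     = [0] * 5 + [1] * 3 + [0] * 6 + [1] * 6
predictions = [1] * 5 + [0] * 3 + [0] * 6 + [1] * 6

# Hypothetical costs, for illustration only.
example_error = cost_weighted_error(predictions, targets, c01=2.0, c10=5.0)
```

With these placeholder costs the result is 5 × 2 + 3 × 5 = 25, matching the $5c_{01} + 3c_{10}$ pattern in the question.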

4 Machine Learning — Continuous Features (20 points)

In all the parts of this problem we will be dealing with one-dimensional data, that is, a set of points ($x^i$) with only one feature (called simply $x$). The points are in two classes given by the value of $y^i$. We will show you the points on the x axis, labeled by their class values; we also give you a table of values.

4.1 Nearest Neighbors

i    x^i    y^i
1    1      0
2    2      1
3    3      1
4    4      0
5    6      1
6    7      1
7    10     0
8    11     1

In the figure below, draw the output of a 1-Nearest-Neighbor classifier over the range indicated in the figure.

In the figure below, draw the output of a 5-Nearest-Neighbor classifier over the range indicated in the figure.
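A minimal sketch of a k-nearest-neighbor classifier on the eight points from the table can help when checking a hand-drawn answer. The tie-breaking rule (equidistant points are ordered by their position in the table) is an assumption; the problem does not say how ties should be handled, so query points that fall exactly midway between two training points are best avoided.

```python
from collections import Counter

# Training data copied from the table of values.
xs = [1, 2, 3, 4, 6, 7, 10, 11]
ys = [0, 1, 1, 0, 1, 1, 0, 1]

def knn_predict(query, k):
    # Order training indices by distance to the query.
    # Assumption: ties in distance are broken by table order.
    order = sorted(range(len(xs)), key=lambda i: (abs(xs[i] - query), i))
    votes = Counter(ys[i] for i in order[:k])
    # With two classes and odd k, the majority vote is never tied.
    return votes.most_common(1)[0][0]

# Sample a few query points for k = 1 and k = 5.
one_nn = {q: knn_predict(q, 1) for q in (0, 2.4, 8, 9)}
five_nn = {q: knn_predict(q, 5) for q in (0, 3, 9, 12)}
```

Evaluating `knn_predict` on a fine grid of query values and noting where the label changes gives the piecewise-constant curve the questions ask for.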

4.3 Neural Nets

Assume that each of the units of a neural net uses one of the following output functions of the total activation (instead of the usual sigmoid $s(z)$):

Linear: This just outputs the total activation:

$l(z) = z$

NonLinear: This looks like a linearized form of the usual sigmoid function:

$$f(z) = \begin{cases} 0 & \text{if } z < -1 \\ 1 & \text{if } z > 1 \\ 0.5(z + 1) & \text{otherwise} \end{cases}$$

Consider the following output from a neural net made up of units of the types described above.

1. Can this output be produced using only linear units? Explain.

2. Construct the simplest neural net out of these two types of units that would have the output shown above. When possible, use weights that have magnitude of 1. Label each unit as either Linear or NonLinear.
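The two unit types are easy to experiment with in code. The sketch below defines both output functions and also shows that chaining two Linear units collapses to a single affine function of the input. The weights and offsets in the chained example are hypothetical, and whether the output in the figure has that affine form is left to the reader.

```python
def linear(z):
    # Linear unit: outputs the total activation unchanged.
    return z

def nonlinear(z):
    # NonLinear unit: 0 below -1, 1 above 1, 0.5 * (z + 1) in between.
    return min(1.0, max(0.0, 0.5 * (z + 1.0)))

# Hypothetical weights and offsets for two chained Linear units.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5

def two_linear_units(x):
    hidden = linear(w1 * x + b1)
    return linear(w2 * hidden + b2)

def collapsed(x):
    # The same function written as one affine map: (w2*w1) * x + (w2*b1 + b2).
    return (w2 * w1) * x + (w2 * b1 + b2)
```

Trying `nonlinear` on a few inputs (for example -2, 0, 0.5 and 2) shows the flat, sloped and flat regions that a purely affine function cannot produce.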


5 Machine Learning (20 points)

Grady Ent decides to train a single sigmoid unit using the following error function:

$$E(w) = \frac{1}{2} \sum_i \left( y(x^i, w) - y^{i*} \right)^2 + \frac{\beta}{2} \sum_j w_j^2$$

where $y(x^i, w) = s(x^i \cdot w)$ with $s(z) = \frac{1}{1 + e^{-z}}$ being our usual sigmoid function.

1. Write an expression for $\partial E / \partial w_j$. Your answer should not involve derivatives.

2. What update should be made to weight $w_j$ given a single training example $\langle x, y^* \rangle$? Your answer should not involve derivatives.
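A hand-derived gradient can be checked numerically. The sketch below assumes the error function is half the sum of squared errors plus $(\beta/2) \sum_j w_j^2$, with a sigmoid unit applied to $x \cdot w$, and estimates $\partial E / \partial w_j$ by central finite differences. The function names, the step size and the sample data are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def error(w, examples, beta):
    # examples: list of (x, y_star) pairs, x being a list of feature values.
    data_term = 0.0
    for x, y_star in examples:
        y = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
        data_term += (y - y_star) ** 2
    penalty = sum(wj ** 2 for wj in w)
    return 0.5 * data_term + 0.5 * beta * penalty

def numerical_gradient(w, examples, beta, eps=1e-6):
    # Central differences: (E(w + eps*e_j) - E(w - eps*e_j)) / (2 * eps)
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        diff = error(w_plus, examples, beta) - error(w_minus, examples, beta)
        grad.append(diff / (2 * eps))
    return grad

# Hypothetical single training example and starting weights.
examples = [([1.0, 2.0], 1.0)]
grad = numerical_gradient([0.0, 0.0], examples, beta=0.0)
```

Comparing the output of `numerical_gradient` against a closed-form expression at a few random weight vectors is a quick way to catch sign or factor errors in the derivation.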


Here are two graphs of the output of the sigmoid unit as a function of a single feature x. The unit has a weight for x and an offset. The two graphs are made using different values of the magnitude of the weight vector ($\|w\|^2 = \sum_j w_j^2$).

Which of the graphs is produced by the larger $\|w\|^2$? Explain.

Why might penalizing large $\|w\|^2$, as we could do above by choosing a positive β, be desirable?
How might Grady select a good value for β for a particular classification problem?
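For the question about the two graphs, one way to build intuition is to measure how steep a single-feature sigmoid unit is at its midpoint for different weights. The sketch below does this numerically; the parameter names `w` (weight on x) and `b` (offset) and the specific values tried are assumptions made for illustration.

```python
import math

def unit_output(x, w, b):
    # Sigmoid unit with one feature: s(w * x + b)
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def midpoint_slope(w, b, eps=1e-5):
    # The output crosses 0.5 where w * x + b = 0, i.e. at x0 = -b / w.
    x0 = -b / w
    rise = unit_output(x0 + eps, w, b) - unit_output(x0 - eps, w, b)
    return rise / (2 * eps)

# Hypothetical weight settings.
gentle = midpoint_slope(w=1.0, b=0.0)
steep = midpoint_slope(w=10.0, b=3.0)
```

Since the derivative of $s(wx + b)$ with respect to $x$ is $w\,s(1 - s)$, the slope at the midpoint is $w/4$, which the numerical estimates should reproduce.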

The score is the percentage correct of the tree, computed on the training set, minus a constant C times the number of nodes in the tree. C is chosen in advance by running this algorithm (grow a large tree then prune in order to maximize percent correct minus C times number of nodes) for many different values of C, and choosing the value of C that minimizes training-set error.
The score is the percentage correct of the tree, computed on the training set, minus a constant C times the number of nodes in the tree. C is chosen in advance by running cross-validation trials of this algorithm (grow a large tree then prune in order to maximize percent correct minus C times number of nodes) for many different values of C, and choosing the value of C that minimizes cross-validation error.

Problem 4: Learning (25 points)

Part A: (5 Points)

Since the cost of using a nearest neighbor classifier grows with the size of the training set, sometimes one tries to eliminate redundant points from the training set. These are points whose removal does not affect the behavior of the classifier for any possible new point.

In the figure below, sketch the decision boundary for a 1-nearest-neighbor rule and circle the redundant points.

What general condition(s) must hold for a point to be declared redundant for a 1-nearest-neighbor rule? Assume we have only two classes (+, -). Restating the definition of redundant ("removing it does not change anything") is not an acceptable answer. Hint: think about the neighborhood of redundant points.
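The definition of redundancy can be tested by brute force on a toy example. The sketch below is a simplification under stated assumptions: it uses hypothetical one-dimensional points instead of the two-dimensional figure, it only checks query points on a finite grid, and it breaks distance ties in favor of the smaller coordinate. It tests each point individually, not sets of points removed together.

```python
def nn_label(points, query):
    # points: list of (x, label). Ties go to the smaller x (an assumption).
    return min(points, key=lambda p: (abs(p[0] - query), p[0]))[1]

def is_redundant(points, idx, grid):
    # A point is redundant if dropping it leaves every prediction unchanged.
    # Here "every" is approximated by the query values in `grid`.
    reduced = points[:idx] + points[idx + 1:]
    return all(nn_label(points, q) == nn_label(reduced, q) for q in grid)

# Hypothetical 1-D training set.
points = [(0, '+'), (1, '+'), (2, '+'), (3, '-'), (4, '-')]
grid = [i / 100 for i in range(-100, 501)]

redundant_flags = [is_redundant(points, i, grid) for i in range(len(points))]
```

Inspecting which flags come out true, and what the neighbors of those points look like, is a concrete way to follow the hint before stating the general condition.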

Part C: (10 Points)

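[Figure: network diagram with inputs X and Y and units 1 through 5; the diagram is not recoverable from the extracted text.]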

In this network, all the units are sigmoid except unit 5 which is linear (its output is simply the weighted sum of its inputs). All the bias weights are zero. The dashed connections have weights of -1, all the other connections (solid lines) have weights of 1.

1. Given X=0 and Y=0, what are the output values of each of the units?
Unit 1 = ___  Unit 2 = ___  Unit 3 = ___  Unit 4 = ___  Unit 5 = ___

2. What are the δ values for each unit (as computed by backpropagation defined for squared error)? Assume that the desired output for the network is 4.
Unit 1 = ___  Unit 2 = ___  Unit 3 = ___  Unit 4 = ___  Unit 5 = ___

3. What would be the new value of the weight connecting units 2 and 3, assuming that the learning rate for backpropagation is set to 1?

Part D: (10 Points)

Consider the simple one-dimensional classification problem shown below. Imagine attacking this problem with an SVM using a radial-basis function kernel. Assume that we want the classifier to return a positive output for the + points and a negative output for the – points.

Draw a plausible classifier output curve for a trained SVM, indicating the classifier output for every feature value in the range shown. Do this twice, once assuming that the standard deviation (σ) is very small relative to the distance between adjacent training points and again assuming that the standard deviation (σ) is about double the distance between adjacent training points.
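To get a feel for how σ shapes the curve, the sketch below evaluates a sum of Gaussian bumps of the form $\sum_i \alpha_i y_i \exp(-(x - x_i)^2 / (2\sigma^2)) + b$. It is not a trained SVM: the training points, the equal α values and the zero offset are hypothetical choices made only to show the qualitative effect of a small versus a large σ.

```python
import math

def rbf_output(x, train_x, train_y, alphas, sigma, b=0.0):
    # Weighted sum of Gaussian bumps centered on the training points.
    total = b
    for xi, yi, ai in zip(train_x, train_y, alphas):
        total += ai * yi * math.exp(-((x - xi) ** 2) / (2 * sigma ** 2))
    return total

# Hypothetical 1-D data with unit spacing between adjacent points.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [+1, +1, -1, -1]
alphas = [1.0, 1.0, 1.0, 1.0]

# sigma much smaller than the spacing: narrow spikes at the training points.
small_at_point = rbf_output(0.0, train_x, train_y, alphas, sigma=0.1)
small_between = rbf_output(0.5, train_x, train_y, alphas, sigma=0.1)

# sigma about twice the spacing: one smooth curve across the whole range.
large_between = rbf_output(0.5, train_x, train_y, alphas, sigma=2.0)
large_other_side = rbf_output(2.5, train_x, train_y, alphas, sigma=2.0)
```

With the small σ the output is close to ±1 at the training points and close to the offset in between; with the large σ the bumps overlap and the output varies smoothly from positive to negative.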

Small standard deviation (σ): [blank plot: SVM output versus feature value]

Large standard deviation (σ): [blank plot: SVM output versus feature value]
