EXAMPLE Machine Learning (C395) Exam Questions, Lecture notes of Machine Learning

University College Bahrain (UCB), Machine Learning


Typology: Lecture notes

2021/2022

Uploaded on 08/01/2022

fabh_99 🇧🇭



EXAMPLE Machine Learning (C395) Exam Questions

(1) Question: Explain the principle of the gradient descent algorithm. Accompany your explanation with a diagram. Explain the use of all the terms and constants that you introduce and comment on the range of values that they can take.

Solution: Training can be posed as an optimization problem, in which the goal is to optimize a function (usually to minimize a cost function $E$) with respect to a number of free variables, usually the weights $w_i$. The gradient descent algorithm begins from an initialization of the weights (e.g. a random initialization) and, in an iterative procedure, updates each weight $w_i$ by a quantity $\Delta w_i$, where

\[ \Delta w_i = -\alpha \frac{\partial E}{\partial w_i} \]

Here $\partial E / \partial w_i$ is the gradient of the cost function with respect to the weight $w_i$, and $\alpha$ is the learning rate: a positive constant that takes small values in order to keep the updates small and avoid oscillations.
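
The update rule in (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quadratic cost, its gradient `grad_E`, the starting point, and the two values of `alpha` are hypothetical and do not come from the exam text.

```python
def gradient_descent(grad, w0, alpha=0.1, steps=100):
    """Repeatedly apply w_i <- w_i - alpha * dE/dw_i, starting from w0."""
    w = list(w0)
    for _ in range(steps):
        g = grad(w)
        w = [wi - alpha * gi for wi, gi in zip(w, g)]
    return w


# Hypothetical cost E(w) = (w_0 - 3)^2 + (w_1 + 1)^2, minimised at (3, -1).
def grad_E(w):
    return [2 * (w[0] - 3), 2 * (w[1] + 1)]


# A small learning rate moves steadily towards the minimum.
w_small = gradient_descent(grad_E, [0.0, 0.0], alpha=0.1, steps=100)

# A learning rate that is too large overshoots on every step and oscillates
# with growing amplitude instead of converging.
w_large = gradient_descent(grad_E, [0.0, 0.0], alpha=1.1, steps=20)

print(w_small, w_large)
```

With the small learning rate the weights settle near the minimum, while the large one illustrates the oscillation that the solution warns about.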

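(2) Question: Derive the gradient descent training rule assuming that the target function representation is:

\[ o_d = w_0 + w_1 x_1 + \dots + w_n x_n \]

Define explicitly the cost/error function $E$, assuming that a set of training examples $D$ is provided, where each training example $d \in D$ is associated with the target output $t_d$.

Solution: The error function: $E = \sum_{d \in D} (t_d - o_d)^2$

The gradient descent algorithm: $\Delta w_i = -\alpha \, (\partial E / \partial w_i)$

First represent $\partial E / \partial w_i$ in terms of the unit inputs $x_{id}$, outputs $o_d$, and target values $t_d$:

\begin{align*}
\frac{\partial E}{\partial w_i}
  &= \frac{\partial}{\partial w_i} \sum_{d \in D} (t_d - o_d)^2 \\
  &= \sum_{d \in D} 2 (t_d - o_d) \frac{\partial (t_d - o_d)}{\partial w_i} \\
  &= \sum_{d \in D} 2 (t_d - o_d) \left( -\frac{\partial o_d}{\partial w_i} \right) \\
  &= -\sum_{d \in D} 2 (t_d - o_d) \frac{\partial (w_0 + \dots + w_i x_{id} + \dots + w_n x_{nd})}{\partial w_i} \\
  &= -\sum_{d \in D} 2 (t_d - o_d) \, x_{id}
\end{align*}

\[ \Rightarrow \quad \Delta w_i = \alpha \sum_{d \in D} 2 (t_d - o_d) \, x_{id} \]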
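
The rule derived in (2) can be tried out with a short Python sketch. The training data (generated from t = 1 + 2x), the learning rate, and the number of iterations are hypothetical, and treating the bias weight as having a constant input x_0d = 1 is an assumption that the exam text does not state explicitly.

```python
def predict(w, x):
    """Linear unit output o_d = w_0 + w_1 x_1 + ... + w_n x_n."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def squared_error(w, examples):
    """E = sum over d of (t_d - o_d)^2."""
    return sum((t - predict(w, x)) ** 2 for x, t in examples)


def batch_update(w, examples, alpha):
    """One step of delta w_i = alpha * sum_d 2 (t_d - o_d) x_id."""
    delta = [0.0] * len(w)
    for x, t in examples:
        err = t - predict(w, x)
        inputs = [1.0] + list(x)  # assumption: x_0d = 1 for the bias w_0
        for i, xi in enumerate(inputs):
            delta[i] += alpha * 2 * err * xi
    return [wi + di for wi, di in zip(w, delta)]


# Hypothetical training set D generated from t = 1 + 2x.
examples = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]

w = [0.0, 0.0]
errors = []
for _ in range(2000):
    errors.append(squared_error(w, examples))
    w = batch_update(w, examples, alpha=0.01)

print(w, errors[0], errors[-1])
```

Because the update sums over all of D before changing the weights, this is the batch form of the rule; the error recorded in `errors` shrinks as the weights approach the generating values.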


(3) Question: Prove that the LMS training rule performs a gradient descent to minimize the cost/error function E defined in (2).

Solution: Given the target function representation $o_d = w_0 + w_1 x_1 + \dots + w_n x_n$, the LMS training rule is a learning algorithm for choosing the set of weights $w_i$ to best fit the set of training examples $\{\langle d, t_d \rangle\}$, i.e., to minimize the squared error $E \equiv \sum_{d \in D} (t_d - o_d)^2$.

The LMS training rule works as follows:

($\forall \langle d, t_d \rangle$) use the current weights $w_i$ to calculate $o_d$
($\forall w_i$) $w_i \leftarrow w_i + \eta (t_d - o_d) x_{id}$   (*)

From (2), $\partial E / \partial w_i = -\sum_{d \in D} 2 (t_d - o_d) x_{id}$. For the single training example $d$ used in one application of (*), this sum reduces to one term, so $-\frac{1}{2 x_{id}} \frac{\partial E}{\partial w_i} = (t_d - o_d)$. Substituting this in (*) gives

($\forall w_i$) $w_i \leftarrow w_i + \frac{\eta}{2} \left( -\frac{\partial E}{\partial w_i} \right)$

This shows that LMS alters the weights in the same proportion as the gradient descent algorithm does (i.e., in proportion to $-\partial E / \partial w_i$), proving that LMS performs gradient descent.

(4) Question: Consider the following set of training examples:

Instance  Classification  a1  a2
1         +               T   T
2         +               T   T
3         -               T   F
4         +               F   F
5         -               F   T
6         -               F   T

What is the information gain of a2 relative to these training examples? Provide the equation for calculating the information gain as well as the intermediate results.

Solution: Entropy $E(S) = E([3+, 3-]) = -(3/6) \log_2 (3/6) - (3/6) \log_2 (3/6) = 1$.
$E(T) = E([2+, 2-]) = 1$.
$E(F) = E([1+, 1-]) = 1$.
$Gain(S, a2) = E(S) - (4/6) E(T) - (2/6) E(F) = 1 - 4/6 - 2/6 = 0$.

(5) Question: Suppose that we want to build a neural network that classifies two-dimensional data (i.e., X = [x1, x2]) into two classes: diamonds and crosses. We have a set of training data that is plotted as follows:

[Plot of the training data: diamonds and crosses in the (x1, x2) plane]
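
To illustrate the LMS argument in (3), the following Python sketch compares the LMS weight change for one example with (η/2) times the negative gradient of that example's squared error, and then runs the rule over a small data set. The data, the value of `eta`, and the constant input x_0d = 1 for the bias weight are assumptions made for illustration only.

```python
def output(w, inputs):
    """o_d = sum_i w_i x_id, with inputs[0] = 1 standing in for the bias."""
    return sum(wi * xi for wi, xi in zip(w, inputs))


def lms_delta(w, inputs, t, eta):
    """LMS change for one example: eta * (t_d - o_d) * x_id for each i."""
    o = output(w, inputs)
    return [eta * (t - o) * xi for xi in inputs]


def single_example_gradient(w, inputs, t):
    """Gradient of (t_d - o_d)^2 with respect to each w_i: -2 (t_d - o_d) x_id."""
    o = output(w, inputs)
    return [-2 * (t - o) * xi for xi in inputs]


# Hypothetical example and weights.
w = [0.5, -1.0]
inputs, t, eta = [1.0, 2.0], 4.0, 0.05

delta = lms_delta(w, inputs, t, eta)
scaled_gradient = [(eta / 2) * -g for g in single_example_gradient(w, inputs, t)]

# Repeated passes over hypothetical data generated from t = 1 + 2x.
examples = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0), ([1.0, 2.0], 5.0), ([1.0, 3.0], 7.0)]
trained = [0.0, 0.0]
for _ in range(5000):
    for x, target in examples:
        step = lms_delta(trained, x, target, eta)
        trained = [wi + di for wi, di in zip(trained, step)]

print(delta, scaled_gradient, trained)
```

The two lists `delta` and `scaled_gradient` agree up to floating-point rounding, which is the numerical counterpart of the substitution step in the proof.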

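The calculation in (4) can be checked with a short Python sketch. The helper names `entropy` and `information_gain` and the dictionary keys are hypothetical; the rows are the six training examples from the table, with the two attribute columns read as a1 and a2.

```python
from math import log2


def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * log2(p)
    return result


def information_gain(rows, attribute, target="cls"):
    """Gain(S, A) = E(S) - sum over values v of |S_v| / |S| * E(S_v)."""
    gain = entropy([row[target] for row in rows])
    for value in set(row[attribute] for row in rows):
        subset = [row[target] for row in rows if row[attribute] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain


# The six training examples from question (4).
rows = [
    {"cls": "+", "a1": "T", "a2": "T"},
    {"cls": "+", "a1": "T", "a2": "T"},
    {"cls": "-", "a1": "T", "a2": "F"},
    {"cls": "+", "a1": "F", "a2": "F"},
    {"cls": "-", "a1": "F", "a2": "T"},
    {"cls": "-", "a1": "F", "a2": "T"},
]

entropy_s = entropy([row["cls"] for row in rows])
gain_a2 = information_gain(rows, "a2")
print(entropy_s, gain_a2)
```

Both subsets induced by a2 are evenly split between the two classes, so the computed gain is zero up to floating-point rounding, in line with the solution.
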
What is the appropriate chromosome design for the given problem? Which Genetic Algorithm parameters need to be defined? What would be the suitable values of those parameters for the given problem? Provide a short explanation for each. What is the result of applying a single round of the prototypical Genetic Algorithm? Explain your answer in a clear and compact manner by providing the pseudo code of the algorithm.

Solution:

size = {large, mid, small} → 100, 010, 001, 011, …, 111, 000
brand = {Volvo, BMW, SUV} → 100, 010, 001, 011, …, 111, 000
sport = {yes, no} → 10, 01, 11, 00
engine = {F12, V12, V8} → 100, 010, 001, 011, …, 111, 000
GoodCar = {yes, no} → 10, 01, 11, 00

→ chromosome design:

size  brand  sport  engine  GoodCar
100   100    11     111     01

The fitness function for the given problem can be defined as a sigmoid function $f(x) = 1 / (1 + e^{-x})$, where $x$ is the percentage of all training examples correctly classified by a specific solution (chromosome).

Selection method: e.g., the rank selection method can be used.
Crossover technique: 2-point crossover can be used for the given problem with the crossover mask 1111110000011; the reason is that either size + brand or sport + engine define the solution.
Crossover rate: usually k = 60%.
Mutation rate: usually 1%.
Termination condition: e.g., all training examples are correctly classified.

GA pseudo code:
Step 1: Choose initial population.
Step 2: Evaluate the fitness of individuals in the population.
Step 3: Select k individuals to reproduce; breed new generation through crossover and mutation; evaluate the individual fitness of offspring; replace k worse ranked part of population with offspring.
Step 4: Repeat step 3 until the termination condition is reached.

s1/2: 1010101001010, 1011111110101, 0100011111101, 0011111001010, 1011011110101
s3: 1010101110110 (fit 1), 0011111001001 (fit 0), 1010101111110 (fit 2), 1011011001001 (fit 1), 0011111110101 (fit 2), 1011011110101 (fit 3)
result: 1011111110101, 1011011110101, 1011011110101, 0011111110101, 1010101111110
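
The crossover and mutation operators named in the solution can be sketched in Python as follows. This covers only those two operators, not the full algorithm; the function names, the choice of which two parents to pair, and the seeded random generator are assumptions for illustration. The mask and the two parent strings are taken from the text.

```python
import random

MASK = "1111110000011"  # crossover mask given in the solution


def mask_crossover(parent_a, parent_b, mask=MASK):
    """Where the mask is 1 the first child copies parent_a, where it is 0 it
    copies parent_b; the second child does the opposite."""
    child_1 = "".join(
        a if m == "1" else b for a, b, m in zip(parent_a, parent_b, mask)
    )
    child_2 = "".join(
        b if m == "1" else a for a, b, m in zip(parent_a, parent_b, mask)
    )
    return child_1, child_2


def mutate(chromosome, rate=0.01, rng=random):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )


# Pair the first two chromosomes of the listed population (an assumed pairing).
child_1, child_2 = mask_crossover("1010101001010", "1011111110101")

# A seeded generator keeps the mutation step repeatable.
mutated = mutate(child_1, rate=0.01, rng=random.Random(0))

print(child_1, child_2, mutated)
```

With this pairing the first child is 1010101110110, which coincides with the first offspring listed under s3; the remaining offspring in the text would depend on pairings and mutations that the solution does not spell out.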
