Understanding Intelligence: Linear Models & Artificial Neural Networks, Slides of Advanced Algorithms

Part of the PR NPTEL course, covering linear models and artificial neural networks. It discusses the efficiency of learning linear models, the concept of artificial neural networks, the structure of the human brain, and the processing power of neurons. The document also touches on artificial intelligence and its approaches, including symbolic AI and artificial neural networks.

Typology: Slides

2012/2013

Uploaded on 04/20/2013

padmaghira
padmaghira 🇮🇳


We have been discussing some simple ideas from
statistical learning theory.
PR NPTEL course p.1/138

  • The risk minimization framework that we discussed gives us a better perspective on the unifying theme in different learning algorithms.
  • We will now go back to studying pattern classification algorithms.
  • We will first briefly review algorithms for learning linear classifiers and then start looking at methods to learn nonlinear classifiers.

Linear Models

  • In the two-class case, the linear classifier is given by

    $$h(X) = \mathrm{sign}(W^T X + w_0)$$
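The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the slides: the function names `sign` and `linear_classify`, the toy weights, and the convention that a value of exactly zero maps to +1 are all assumptions made here.

```python
def sign(z):
    """Return +1 for z >= 0 and -1 otherwise (the choice at z == 0 is an assumption)."""
    return 1 if z >= 0 else -1


def linear_classify(W, w0, X):
    """Evaluate h(X) = sign(W^T X + w0) for a single feature vector X."""
    activation = sum(w * x for w, x in zip(W, X)) + w0
    return sign(activation)


# Hypothetical 2-D example: the separating line is x1 + x2 - 1 = 0.
W = [1.0, 1.0]
w0 = -1.0
print(linear_classify(W, w0, [2.0, 2.0]))  # a point on the positive side
print(linear_classify(W, w0, [0.0, 0.0]))  # a point on the negative side
```

The sign of $W^T X + w_0$ only says on which side of the hyperplane $X$ lies; the learning algorithms reviewed next differ in how they choose $W$ and $w_0$.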

  • We discussed many algorithms for learning $W$.
  • The Perceptron algorithm is a simple error-correcting method that is guaranteed to find a separating hyperplane if one exists.
  • The perceptron convergence theorem shows that, given any training set of linearly separable patterns, the algorithm will find a separating hyperplane in a finite number of iterations.
  • Our discussion on statistical learning theory gives us an idea of how many iid examples we should have before we can be confident that the hyperplane that separates the examples will also do well on test data.
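The error-correcting idea behind the Perceptron can be sketched as follows. This is one common form of the algorithm, written under assumptions that the slides do not fix: labels are +1 or -1, feature vectors are augmented with a trailing 1 so the bias is part of `W`, the update has unit step size, and `max_epochs` is a safety cap added here for data that is not linearly separable.

```python
def perceptron(samples, labels, max_epochs=1000):
    """Learn W on augmented feature vectors with labels in {+1, -1}.

    Returns (W, converged); converged is True once a full pass over the
    data makes no correction.
    """
    W = [0.0] * len(samples[0])
    for _ in range(max_epochs):
        corrections = 0
        for X, y in zip(samples, labels):
            activation = sum(w * x for w, x in zip(W, X))
            if y * activation <= 0:  # misclassified, or exactly on the hyperplane
                W = [w + y * x for w, x in zip(W, X)]  # error-correcting update
                corrections += 1
        if corrections == 0:
            return W, True
    return W, False


# Toy linearly separable data (logical AND), augmented with a trailing 1.
samples = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
labels = [-1, -1, -1, 1]
W, converged = perceptron(samples, labels)
print(W, converged)
```

Because this toy set is linearly separable, the convergence theorem guarantees the loop stops after finitely many corrections; on non-separable data the cap is what ends the loop.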
  • We have also seen the least-squares method, where we find $W$ to minimize

    $$J(W) = \frac{1}{n} \sum_{i=1}^{n} (W^T X_i - y_i)^2$$

    where, for simplicity of notation, we have assumed augmented feature vectors.
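As a reminder of what augmentation means here, the bias term is absorbed into the weight vector by appending a constant component to every feature vector. The tilde notation and the choice to append the constant as the last component are assumptions made for this illustration; the slides simply reuse $W$ and $X$ for the augmented vectors.

```latex
\[
\tilde{X} = \begin{pmatrix} X \\ 1 \end{pmatrix}, \qquad
\tilde{W} = \begin{pmatrix} W \\ w_0 \end{pmatrix}, \qquad
\tilde{W}^T \tilde{X} = W^T X + w_0 .
\]
```

With this convention the classifier and the least-squares criterion can both be written in terms of a single inner product.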

  • We have seen how to obtain the least-squares solution:

    $$W = (A^T A)^{-1} A^T Y$$

    where the rows of the matrix $A$ are the feature vectors and the components of $Y$ are the $y_i$.
  • The least-squares method can also be used to learn linear regression models.
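The closed-form solution can be sketched in plain Python by solving the normal equations $(A^T A) W = A^T Y$ rather than forming the inverse explicitly. The helper names `solve_linear_system` and `least_squares` and the toy regression data are assumptions introduced for this illustration, and the sketch assumes $A^T A$ is invertible.

```python
def solve_linear_system(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]  # augmented matrix [M | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("A^T A is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def least_squares(A, Y):
    """Return W satisfying the normal equations (A^T A) W = A^T Y."""
    d = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(d)] for i in range(d)]
    AtY = [sum(row[i] * y for row, y in zip(A, Y)) for i in range(d)]
    return solve_linear_system(AtA, AtY)


# Toy regression data generated from y = 2x + 1; each row of A is (x, 1).
A = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
Y = [1.0, 3.0, 5.0, 7.0]
W = least_squares(A, Y)
print(W)  # slope and intercept, up to floating-point rounding
```

Since the toy targets lie exactly on a line, the recovered weights should match the slope and intercept used to generate them, which also illustrates the use of least squares for linear regression.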
  • We have seen that we can also minimize the empirical risk $J(W)$ using gradient descent.
  • We can also run this gradient descent in an incremental fashion by considering one example at a time.
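Both variants can be sketched for the squared-error criterion, taken here as $J(W) = \frac{1}{n}\sum_i (W^T X_i - y_i)^2$ on augmented feature vectors, whose gradient is $\frac{2}{n}\sum_i (W^T X_i - y_i) X_i$. The function names, the constant step size, the iteration counts, and the toy data are assumptions chosen for this illustration only.

```python
def dot(W, X):
    return sum(w * x for w, x in zip(W, X))


def batch_gradient_descent(A, Y, step=0.05, iterations=5000):
    """Each step uses the gradient of J computed over all n examples."""
    n, d = len(A), len(A[0])
    W = [0.0] * d
    for _ in range(iterations):
        grad = [0.0] * d
        for X, y in zip(A, Y):
            err = dot(W, X) - y
            for j in range(d):
                grad[j] += 2.0 * err * X[j] / n
        W = [w - step * g for w, g in zip(W, grad)]
    return W


def incremental_gradient_descent(A, Y, step=0.05, epochs=5000):
    """Each step uses the gradient of the squared error on one example."""
    W = [0.0] * len(A[0])
    for _ in range(epochs):
        for X, y in zip(A, Y):
            err = dot(W, X) - y
            W = [w - step * 2.0 * err * x for w, x in zip(W, X)]
    return W


# Noise-free toy data from y = 2x + 1; each row of A is (x, 1).
A = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
Y = [1.0, 3.0, 5.0, 7.0]
W_batch = batch_gradient_descent(A, Y)
W_incremental = incremental_gradient_descent(A, Y)
print(W_batch, W_incremental)
```

On this noise-free data both variants approach the same weights. With noisy data and a constant step size, the incremental version generally keeps fluctuating around the minimizer, which is why a decreasing step size is often used in practice.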
  • We have also seen that we can use the least-squares idea to learn a model $g(W^T X)$ by redefining $J$ as

    $$J(W) = \frac{1}{n} \sum_{i=1}^{n} (g(W^T X_i) - y_i)^2$$
  • An important example is logistic regression, where we take $g$ as the sigmoid function.
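A minimal sketch of fitting the model $g(W^T X)$ with $g$ the sigmoid, using the squared-error criterion described above and plain gradient descent. Several things here are assumptions made for illustration: targets are taken to be 0 or 1, the criterion is normalized by $1/n$, and the function names, step size, iteration count, and toy data are hypothetical. Logistic regression is often fit by maximum likelihood instead; the code below follows the least-squares framing of the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_sigmoid_least_squares(A, Y, step=0.5, iterations=5000):
    """Gradient descent on J(W) = (1/n) * sum_i (g(W^T X_i) - y_i)^2."""
    n, d = len(A), len(A[0])
    W = [0.0] * d
    for _ in range(iterations):
        grad = [0.0] * d
        for X, y in zip(A, Y):
            out = sigmoid(sum(w * x for w, x in zip(W, X)))
            # Chain rule: d/dz (g(z) - y)^2 = 2 (g(z) - y) g(z) (1 - g(z))
            common = 2.0 * (out - y) * out * (1.0 - out) / n
            for j in range(d):
                grad[j] += common * X[j]
        W = [w - step * g for w, g in zip(W, grad)]
    return W


# Toy 1-D data with targets in {0, 1}; each row of A is (x, 1).
A = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
Y = [0.0, 0.0, 1.0, 1.0]
W = fit_sigmoid_least_squares(A, Y)
predictions = [
    1 if sigmoid(sum(w * x for w, x in zip(W, X))) >= 0.5 else 0 for X in A
]
print(W, predictions)
```

Unlike the linear case, this criterion has no closed-form minimizer, so an iterative method such as gradient descent is needed.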