Neural Networks 2, Lecture Notes - Computer Science, Study notes of Artificial Intelligence

Prof. David C Parkes, Computer Science, Neural-Networks, Multilayer Networks, Minsky and Papert’s Argument, Harvard, Lecture Notes

Typology: Study notes

2010/2011

Uploaded on 10/25/2011

CS181 Lecture 6:
Neural-Networks II: Multilayer
Networks
Prof. David C. Parkes



Minsky and Papert:

“Perceptrons” (1969)

Perceptrons are insufficient to construct

generally intelligent machines

Minsky and Papert’s Argument

Example: Connectedness

Minsky and Papert’s Argument

  • Challenge: local features (e.g., edges at left, edges at right) are insufficient
  • Possible answer: Use high-level features

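[Figure: inputs $x_1, \ldots, x_{1000}$ feed high-level features $\phi_1(x), \ldots, \phi_{100}(x)$, whose values feed the perceptron output]

  • High-level features are predefined
  • They encode non-linear functions of the inputs
  • They represent complex hypotheses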
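To make the idea of a predefined non-linear feature concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not something taken from the slides: the XOR target, the product feature in `phi`, the 0/1 threshold, and the hand-picked weights are all hypothetical choices. XOR cannot be computed by a perceptron on the raw inputs alone, but it can once a single non-linear feature is added.

```python
# Illustrative only: the feature map, weights, and bias are hypothetical choices.

def phi(x):
    """Map raw inputs (x1, x2) to features (x1, x2, x1*x2).

    The product x1*x2 is the hand-coded non-linear 'high-level' feature.
    """
    x1, x2 = x
    return (x1, x2, x1 * x2)


def perceptron(features, weights, bias):
    """Threshold unit: output 1 if the weighted sum is >= 0, else 0 (assumed convention)."""
    total = bias + sum(w * f for w, f in zip(weights, features))
    return 1 if total >= 0 else 0


# Hand-picked weights that realize XOR on top of phi(x).
WEIGHTS = (1.0, 1.0, -2.0)
BIAS = -0.5


def predict(x):
    return perceptron(phi(x), WEIGHTS, BIAS)


if __name__ == "__main__":
    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x, predict(x))
```

The perceptron itself stays linear; all of the non-linearity lives in the feature that someone had to design by hand.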

  • Challenge: the number of high-level features needed for general intelligence (seems) huge
  • Don’t want to have to hand-code features
  • Answer: make high-level features adaptive

Example 1: General Case

Layered Networks

  • Units organized into layers
    • last layer is output units
    • intermediate layers are hidden units
  • Edges ‘feed forward’ (only)

[Figure: layered network with inputs $x_1$, $x_2$, hidden units $H_1$ to $H_4$, and output unit $O_5$]

Weights and Activations

  • Each edge has a weight; the weights are the parameters
  • Each unit $j$ has an “activation level” $a_j$

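[Figure: network with inputs $x_1$, $x_2$, hidden units $H_1$, $H_2$, output unit $O_3$, and edge weights $w_{11}$, $w_{12}$, $w_{21}$, $w_{22}$, $w_{31}$, $w_{32}$]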

[Also add a +1 input and a weight $w_{j0}$ to every unit $j$]

Multiple output units

Forward Propagation

  • Given input $x = (x_1, \ldots, x_m)$, determine the outcome by forward propagation:
  • For each node $j$ in increasing order
    • evaluate $\mathrm{in}_j = \sum_{i \in \mathrm{parents}(j)} w_{ji} a_i$
    • set $a_j = g(\mathrm{in}_j)$
  • Feature value $x_j$ is the “activation level” on input $j$
  • Suppose for the moment that the activation function $g(\cdot)$ is the perceptron activation function
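The forward-propagation steps above can be sketched in Python as follows. This is a minimal sketch under assumptions: the dictionary-based network encoding, the name `forward`, the 0/1 threshold used for the perceptron activation, and the example weights (which happen to compute XOR on a two-input, two-hidden-unit, one-output network) are illustrative and do not come from the lecture.

```python
def step(z):
    """Perceptron-style threshold activation (assumed convention: 1 if z >= 0, else 0)."""
    return 1.0 if z >= 0 else 0.0


def forward(inputs, parents, weights, g=step):
    """Forward propagation through a feed-forward network.

    inputs:  dict mapping input unit name -> feature value (its activation)
    parents: dict mapping each non-input unit j -> list of its parent units,
             with units listed so that parents always come before children
    weights: dict mapping (j, i) -> w_ji, plus (j, "bias") -> w_j0
    Returns a dict with the activation a_j of every unit.
    """
    a = dict(inputs)  # input activations are the feature values
    for j, ps in parents.items():  # dicts preserve insertion order
        in_j = weights[(j, "bias")] + sum(weights[(j, i)] * a[i] for i in ps)
        a[j] = g(in_j)
    return a


# Hypothetical example network: H1 acts as OR, H2 as AND, O3 = H1 AND NOT H2.
PARENTS = {"H1": ["x1", "x2"], "H2": ["x1", "x2"], "O3": ["H1", "H2"]}
WEIGHTS = {
    ("H1", "bias"): -0.5, ("H1", "x1"): 1.0, ("H1", "x2"): 1.0,
    ("H2", "bias"): -1.5, ("H2", "x1"): 1.0, ("H2", "x2"): 1.0,
    ("O3", "bias"): -0.5, ("O3", "H1"): 1.0, ("O3", "H2"): -1.0,
}

if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            out = forward({"x1": x1, "x2": x2}, PARENTS, WEIGHTS)
            print((x1, x2), out["O3"])
```

Passing a different `g` (for example a sigmoid) reuses the same loop unchanged, which is why the activation function is a parameter here.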

Challenge: Learning Weights

  • Want a differentiable activation function on hidden units and output units

  • Recall: perceptron and adaline rules

Properties of Sigmoid

  • Continuous and differentiable
  • Approaches 1 as the input becomes large and positive, and 0 as it becomes large and negative
  • Approximately linear when input is close to 0
  • Decision boundary at “in=0”

$$ g(\mathrm{in}) = \frac{1}{1 + e^{-\mathrm{in}}} $$
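The sigmoid properties listed above can be checked numerically with a short Python sketch. The function names are illustrative, and the derivative identity $g'(\mathrm{in}) = g(\mathrm{in})(1 - g(\mathrm{in}))$ is a standard fact about the logistic sigmoid that is assumed here rather than quoted from these slides. The two-branch form of `sigmoid` is only a numerical-stability choice to avoid overflow for very negative inputs.

```python
import math


def sigmoid(z):
    """Logistic sigmoid g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z < 0, so e is in (0, 1)
    return e / (1.0 + e)


def sigmoid_prime(z):
    """Derivative of the sigmoid via the identity g'(z) = g(z) * (1 - g(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


if __name__ == "__main__":
    for z in (-20.0, -1.0, 0.0, 0.1, 1.0, 20.0):
        # Near zero the sigmoid is close to the line 0.5 + z/4.
        print(z, sigmoid(z), 0.5 + z / 4.0, sigmoid_prime(z))
```

The output at `z = 0` is 0.5, which is the decision boundary, and the slope there is 0.25, which is where the near-linear behaviour around zero comes from.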