









































An overview of the perceptron learning rule and the Winnow algorithm in on-line machine learning. Both are mistake-driven algorithms for binary classification with a linear threshold function over a weight vector: the perceptron updates its weights additively, while Winnow updates them multiplicatively (promotion and demotion). Each changes its weights only when it makes a mistake on a new example.
Typology: Study notes










































On-line Learning
CS446-Spring
Perceptron learning rule
We learn $f: X \to \{-1, +1\}$, represented as $f = \mathrm{sgn}(w \cdot x)$,
where $X = \{0,1\}^n$ or $X = \mathbb{R}^n$, and $w \in \mathbb{R}^n$.
Given labeled examples: $\{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$
1. Initialize $w = 0 \in \mathbb{R}^n$
2. Cycle through all examples
   a. Predict the label of instance $x$ to be $y' = \mathrm{sgn}(w \cdot x)$
   b. If $y' \neq y$, update the weight vector:
      $w = w + r\,y\,x$ ($r$: a constant, the learning rate)
      Otherwise, if $y' = y$, leave the weights unchanged.
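The learning rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: the helper names (`dot`, `sgn`, `perceptron`), the tie-breaking convention sgn(0) = +1, the `max_epochs` cap, and the toy OR data set with a constant third feature are all assumptions made for the example.

```python
def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sgn(z):
    # Assumed convention (not fixed by the slides): sgn(0) = +1.
    return 1 if z >= 0 else -1

def perceptron(examples, r=1.0, max_epochs=100):
    """Mistake-driven perceptron; returns (w, total number of updates)."""
    n = len(examples[0][0])
    w = [0.0] * n                          # initialize w = 0
    mistakes = 0
    for _ in range(max_epochs):            # cycle through all examples
        errors = 0
        for x, y in examples:
            if sgn(dot(w, x)) != y:        # prediction y' differs from y
                w = [wi + r * y * xi for wi, xi in zip(w, x)]  # w = w + r y x
                errors += 1
        mistakes += errors
        if errors == 0:                    # a full pass with no mistakes
            break
    return w, mistakes

# Toy data (hypothetical): Boolean OR, with a constant 1 as third feature.
data = [((0, 0, 1), -1), ((0, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 1), 1)]
w, mistakes = perceptron(data)
predictions = [sgn(dot(w, x)) for x, _ in data]
print(w, mistakes, predictions)
```

Because the update is applied only on mistakes, the loop stops as soon as one full pass over the data produces no errors.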
Footnote About the Threshold
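The perceptron above has no threshold, but no generality is lost: a threshold $\theta$ can be folded into the weight vector by adding a constant feature.
$w \cdot x \ge \theta \;\Leftrightarrow\; w \cdot x - \theta \ge 0 \;\Leftrightarrow\; \langle w, -\theta \rangle \cdot \langle x, 1 \rangle \ge 0$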
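The threshold footnote can be checked numerically: appending $-\theta$ to $w$ and a constant 1 to $x$ should give the same decision as comparing $w \cdot x$ with $\theta$. The sketch below assumes this reading of the slide; the function names, the weights, and the test points are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict_with_threshold(w, theta, x):
    """Decision using an explicit threshold: w . x >= theta."""
    return dot(w, x) >= theta

def predict_augmented(w, theta, x):
    """Same decision with the threshold folded into the weights."""
    w_aug = list(w) + [-theta]     # <w, -theta>
    x_aug = list(x) + [1]          # <x, 1>
    return dot(w_aug, x_aug) >= 0

# Integer weights keep both comparisons exact.
w, theta = [2, -1, 3], 2
points = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0], [1, 1, 1]]
agree = all(
    predict_with_threshold(w, theta, x) == predict_augmented(w, theta, x)
    for x in points
)
print(agree)
```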
Perceptron learning rule
1. Initialize $w = 0 \in \mathbb{R}^n$
2. Cycle through all examples
   a. Predict the label of instance $x$ to be $y' = \mathrm{sgn}(w \cdot x)$
   b. If $y' \neq y$, update the weight vector to
      $w = w + r\,y\,x$ ($r$: a constant, the learning rate)
      Otherwise, if $y' = y$, leave the weights unchanged.
If $x$ is Boolean, only the weights of active features are updated.
$w \cdot x > 0$ is equivalent to $\frac{1}{1 + \exp(-w \cdot x)} > \frac{1}{2}$
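The equivalence between the sign of $w \cdot x$ and a logistic output above 1/2 can be illustrated with a quick check. This is only a sketch: `sigmoid` and the list of sample scores (standing in for values of $w \cdot x$) are assumptions for the example.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values of w . x on either side of zero.
scores = [-5.0, -1.0, -0.01, 0.01, 1.0, 5.0]

# For each score, "z > 0" and "sigmoid(z) > 1/2" should agree.
checks = [(z > 0) == (sigmoid(z) > 0.5) for z in scores]
print(all(checks))
```

The check works because the logistic function is strictly increasing and equals 1/2 exactly at zero.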
Perceptron Learnability
A perceptron can learn only what it can represent (linearly separable functions); it cannot represent symmetry or connectivity.
“What pattern recognition problems can be transformed so as to become linearly separable?”
Perceptron Convergence
Perceptron Convergence Theorem: If there exists a set of weights that is consistent with the data (i.e., the data is linearly separable), the perceptron learning algorithm will converge.
-- How long would it take to converge?
Perceptron Cycling Theorem: If the training data is not linearly separable, the perceptron learning algorithm will eventually repeat the same set of weights and therefore enter an infinite loop.
-- How to provide robustness, more expressivity?
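The cycling behaviour can be observed on XOR, which is not linearly separable. The sketch below is illustrative only: it assumes learning rate r = 1, a constant third feature acting as a bias, the tie-break "predict -1 when $w \cdot x = 0$", and it looks for a repeated weight vector only at the end of each pass over the data.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# XOR with a constant third feature; labels in {-1, +1}.
xor = [((0, 0, 1), -1), ((0, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 1), -1)]

w = (0, 0, 0)
seen = {}                  # weights at the end of a pass -> pass number
mistakes_per_epoch = []
repeat = None              # (earlier pass, later pass) with equal weights
for epoch in range(1, 51):
    errors = 0
    for x, y in xor:
        y_pred = 1 if dot(w, x) > 0 else -1       # assumed tie-break: 0 -> -1
        if y_pred != y:
            w = tuple(wi + y * xi for wi, xi in zip(w, x))   # r = 1
            errors += 1
    mistakes_per_epoch.append(errors)
    if w in seen:          # same weights as after an earlier pass
        repeat = (seen[w], epoch)
        break
    seen[w] = epoch

print(mistakes_per_epoch, repeat)
```

Once the weights at the end of a pass match those of an earlier pass, every later pass replays the same updates, which is the infinite loop the theorem describes.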
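Perceptron: Mistake Bound Theorem
Let $(x_1, y_1), \ldots, (x_t, y_t)$ be a sequence of labeled examples with $x_i \in \mathbb{R}^N$, $\|x_i\| \le R$ and $y_i \in \{-1, +1\}$ for all $i$.
Let $u \in \mathbb{R}^N$ and $\gamma > 0$ be such that $\|u\| = 1$ and $y_i \,(u \cdot x_i) \ge \gamma$ for all $i$.
Then the perceptron, started from $w = 0$, makes at most $R^2 / \gamma^2$ mistakes on this example sequence.
Here $\gamma$ is the margin and $R^2 / \gamma^2$ is the complexity parameter.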
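A small experiment can make the mistake bound concrete: the classical statement is that, on data separable with margin $\gamma$ by a unit vector $u$ and with $\|x_i\| \le R$, the perceptron makes at most $R^2/\gamma^2$ mistakes. The sketch below assumes that statement; the separator `u`, the margin cutoff 0.1, the random seed, and the sample size are hypothetical choices for illustration.

```python
import math
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

random.seed(0)
u = (0.6, 0.8)                         # hypothetical unit-norm separator
points = []
while len(points) < 200:
    x = (random.uniform(-1, 1), random.uniform(-1, 1))
    if abs(dot(u, x)) >= 0.1:          # discard points too close to the boundary
        points.append((x, 1 if dot(u, x) > 0 else -1))

R = max(math.sqrt(dot(x, x)) for x, _ in points)    # largest example norm
gamma = min(y * dot(u, x) for x, y in points)       # margin achieved by u
bound = R ** 2 / gamma ** 2

w = (0.0, 0.0)
mistakes = 0
converged = False
while not converged:                   # repeat passes until one has no mistakes
    converged = True
    for x, y in points:
        if y * dot(w, x) <= 0:         # mistake (ties count as mistakes)
            w = tuple(wi + y * xi for wi, xi in zip(w, x))
            mistakes += 1
            converged = False

print(mistakes, bound)
```

The loop terminates because the data is separable by construction, and the total number of updates across all passes stays below the bound.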
Perceptron for Boolean Functions
How many mistakes will the perceptron algorithm make when learning a k-disjunction?
It can make O(n) mistakes on a k-disjunction over n attributes.
Winnow Algorithm
Initialize: $\theta = n$; $w_i = 1$
Prediction is 1 iff $w \cdot x \ge \theta$
If no mistake: do nothing
If $f(x) = 1$ but $w \cdot x < \theta$: $w_i \leftarrow 2 w_i$ (if $x_i = 1$) (promotion)
If $f(x) = 0$ but $w \cdot x \ge \theta$: $w_i \leftarrow w_i / 2$ (if $x_i = 1$) (demotion)
Instead of demotion we can use elimination.
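A single Winnow step (predict, then promote or demote the active weights on a mistake) can be sketched as follows. The function names and the two small examples are hypothetical; the threshold $\theta = n$ and the factor 2 follow the rule as stated on the slide.

```python
def winnow_predict(w, x, theta):
    """Predict 1 iff w . x >= theta."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

def winnow_update(w, x, label, theta):
    """Return the weights after seeing (x, label); unchanged if no mistake."""
    if winnow_predict(w, x, theta) == label:
        return list(w)                                   # no mistake: do nothing
    if label == 1:                                       # promotion
        return [2 * wi if xi == 1 else wi for wi, xi in zip(w, x)]
    return [wi / 2 if xi == 1 else wi for wi, xi in zip(w, x)]   # demotion

n = 8
theta = n
w = [1] * n

# Positive example predicted 0 (w . x = 1 < 8): the active weight doubles.
promoted = winnow_update(w, [1, 0, 0, 0, 0, 0, 0, 0], 1, theta)

# Negative example predicted 1 (w . x = 9 >= 8): the active weights halve.
demoted = winnow_update([8, 1, 1, 1, 1, 1, 1, 1], [1, 1, 0, 0, 0, 0, 0, 0], 0, theta)

print(promoted, demoted)
```

Only weights of active features ($x_i = 1$) change, so inactive features are untouched by either update.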
Winnow - Example
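Target: $f = x_1 \lor x_2 \lor x_{1023} \lor x_{1024}$
Initialize: $\theta = 1024$; $w = (1, 1, \ldots, 1)$
[Example trace: the individual examples and weight vectors did not survive extraction. Recoverable outline: three examples predicted correctly (ok); three mistakes, each followed by an update; $\log(n/2)$ such mistakes (for each good variable); one more example predicted correctly (ok); one mistake (elimination version); final hypothesis.]
... these variables ($w = (256, 256, 0, \ldots, 32, \ldots, 256, 256)$)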
Winnow - Mistake Bound
Claim: Winnow makes $O(k \log n)$ mistakes on $k$-disjunctions.
Initialize: $\theta = n$; $w_i = 1$
Prediction is 1 iff $w \cdot x \ge \theta$
If no mistake: do nothing
If $f(x) = 1$ but $w \cdot x < \theta$: $w_i \leftarrow 2 w_i$ (if $x_i = 1$) (promotion)
If $f(x) = 0$ but $w \cdot x \ge \theta$: $w_i \leftarrow w_i / 2$ (if $x_i = 1$) (demotion)
$u$: number of mistakes on positive examples (promotions)
$v$: number of mistakes on negative examples (demotions)
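The quantities $u$ and $v$ can be counted empirically by running Winnow on examples labeled by a $k$-disjunction. The sketch below is only an illustration of the claim's scale: the target disjunction, the dimensions `n` and `k`, the example distribution, and the seed are hypothetical, and the printed $k \log_2 n$ is a reference value, not the exact constant hidden by the $O(\cdot)$.

```python
import math
import random

def run_winnow(examples, n):
    """Run Winnow once over the examples; return (u, v) mistake counts."""
    theta = n
    w = [1.0] * n
    u = v = 0
    for x, label in examples:
        y_pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
        if y_pred == label:
            continue                                         # no mistake
        if label == 1:                                       # promotion
            w = [2 * wi if xi else wi for wi, xi in zip(w, x)]
            u += 1
        else:                                                # demotion
            w = [wi / 2 if xi else wi for wi, xi in zip(w, x)]
            v += 1
    return u, v

random.seed(1)
n, k = 64, 3
relevant = [0, 1, 2]       # hypothetical target: disjunction of the first 3 attributes
examples = []
for _ in range(2000):
    x = [1 if random.random() < 0.1 else 0 for _ in range(n)]
    examples.append((x, 1 if any(x[i] for i in relevant) else 0))

u, v = run_winnow(examples, n)
reference = k * math.log2(n)
print(u, v, reference)
```

Relevant attributes are never active in a negative example, so their weights are only ever promoted, which is why the number of promotions stays small relative to $n$.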