

Assignment 3 of CSCE Machine Learning


infinite intervals which are unbounded on the right. Determine VC-dim(H).
VC-dim(H) = 1.
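First, H can shatter the empty set $\emptyset$.
Second, for $X_1 = \{x\}$, H can shatter all subsets: $\emptyset$ and $\{x\}$.
Then, for a two-point set $X_2 = \{x_1, x_2\}$ with $x_1 < x_2$, the subset $\{x_1\}$ cannot be picked out, because any interval that is unbounded on the right and contains $x_1$ also contains $x_2$.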
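The argument above can be checked by brute force. The following is a minimal Python sketch, assuming the hypotheses are closed rays $[a, \infty)$ (the endpoint convention is an assumption, since the problem statement is cut off in this copy). The helper names `pickable_by_ray` and `is_shattered_by_rays` are illustrative only.

```python
from itertools import combinations


def pickable_by_ray(points, subset):
    """Return True if some ray [a, +inf) contains exactly `subset` out of `points`."""
    if not subset:
        return True  # place a to the right of every point
    a = min(subset)  # largest threshold that still covers the whole subset
    return all((p >= a) == (p in subset) for p in points)


def is_shattered_by_rays(points):
    """Return True if every subset of `points` can be picked out by some ray."""
    pts = list(points)
    return all(
        pickable_by_ray(pts, set(sub))
        for size in range(len(pts) + 1)
        for sub in combinations(pts, size)
    )


if __name__ == "__main__":
    print(is_shattered_by_rays([0.0]))       # a single point
    print(is_shattered_by_rays([0.0, 1.0]))  # two points
```

Using the threshold `min(subset)` is enough because every ray that covers the subset has a threshold at most that large, so it contains at least the same points.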
, $a_2$): $a_1$ and $a_2$ are real numbers}. That is, H is the set of bounded intervals. Determine VC-dim(H).
VC-dim(H) = 2.
First, H can shatter the empty set $\emptyset$.
Second, for $X_1 = \{x\}$, H can shatter all subsets: $\emptyset$ and $\{x\}$.
Then, for $X_2 = \{x_1, x_2\}$, H can shatter all subsets: $\emptyset$, $\{x_1\}$, $\{x_2\}$ and $\{x_1, x_2\}$.
However, for three points $x_1 < x_2 < x_3$, the subset $\{x_1, x_3\}$ cannot be picked out, because any interval that contains $x_1$ and $x_3$ also contains $x_2$.
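One way to see the gap between two and three points is to count the subsets that intervals can pick out. The sketch below is a hedged illustration, assuming closed bounded intervals $[a_1, a_2]$ (the bracket convention is an assumption, since the problem statement is cut off in this copy). The function names are hypothetical.

```python
def interval_dichotomies(points):
    """Return every subset of `points` that some bounded interval picks out."""
    pts = sorted(set(points))
    picked = {frozenset()}  # an interval placed away from all points
    for i in range(len(pts)):
        for j in range(i, len(pts)):
            # an interval from pts[i] to pts[j] captures exactly this contiguous run
            picked.add(frozenset(pts[i:j + 1]))
    return picked


def is_shattered_by_intervals(points):
    """A set of k distinct points is shattered iff all 2**k subsets are picked out."""
    distinct = set(points)
    return len(interval_dichotomies(distinct)) == 2 ** len(distinct)


if __name__ == "__main__":
    print(len(interval_dichotomies([0.0, 1.0])))       # subsets reachable on 2 points
    print(len(interval_dichotomies([0.0, 1.0, 2.0])))  # subsets reachable on 3 points
```

On three points only 7 of the 8 subsets are reachable; the missing one is the pair of outer points.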
(i) Let $X = \mathbb{R}^2$, the two-dimensional plane. Let $R_{(a_1, a_2)} := \{(x_1, x_2) : x_1 \ge a_1, x_2 \ge a_2\}$ denote the two-dimensional semi-infinite rectangle with "southwest" corner at the point $(a_1, a_2)$. Let H consist of all such semi-infinite rectangles, i.e., $H = \{R_{(a_1, a_2)} : a_1 \text{ and } a_2 \text{ are real numbers}\}$. Determine VC-dim(H).
(ii) Generalize the above result to the higher-dimensional case where $X = \mathbb{R}^n$, $R_{(a_1, a_2, \ldots, a_n)} := \{(x_1, x_2, \ldots, x_n) : x_1 \ge a_1, x_2 \ge a_2, \ldots, x_n \ge a_n\}$, and $H = \{R_{(a_1, a_2, \ldots, a_n)} : a_1, a_2, \ldots, a_n \text{ are real numbers}\}$.
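First, H can easily shatter the empty set $\emptyset$.
Second, for a single point $X_1 = \{p\}$, H can shatter all subsets: $\emptyset$ and $\{p\}$.
Then, for a suitable pair of points $X_2 = \{p_1, p_2\}$, for example $p_1 = (0, 1)$ and $p_2 = (1, 0)$, H can shatter all subsets: $\emptyset$, $\{p_1\}$, $\{p_2\}$ and $\{p_1, p_2\}$.
However, no set of three points can be shattered. Given three points, let $p_1$ be a point with the smallest first coordinate and $p_2$ a point with the smallest second coordinate. Any $R_{(a_1, a_2)}$ that contains $p_1$ and $p_2$ has $a_1$ and $a_2$ no larger than these two minima, so it also contains the remaining point $p_3$. Hence $\{p_1, p_2\}$ cannot be picked out without $p_3$. If one point attains both minima, the same argument shows that it cannot be picked out alone. Therefore, VC-dim(H) = 2.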
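First, in n-dimensional space, H can shatter a suitable set $X_n$ of n points, one for each coordinate direction. For example, take the points $-e_1, \ldots, -e_n$, where $e_i$ is the i-th standard basis vector. To pick out a subset S, set $a_i = -1$ if $-e_i \in S$ and $a_i = 0$ otherwise. Therefore, VC-dim(H) $\ge$ n.
Then, no set $X_{n+1}$ of n + 1 points can be shattered. For each coordinate i, choose a point with the smallest i-th coordinate. These are at most n points, and any set in H that contains them contains all n + 1 points, so they cannot be picked out without the remaining points. Thus, VC-dim(H) < n + 1.
Hence VC-dim(H) = n: the VC-dimension of the semi-infinite rectangles equals the dimension of the space.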
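Both bounds can be checked numerically on small cases. The following Python sketch tests shattering by the sets $R_{(a_1, \ldots, a_n)}$ by trying the tightest corner for each subset, namely the coordinatewise minimum of the chosen points. The helper names and the witness set `negative_basis` (the points $-e_1, \ldots, -e_n$) are illustrative choices, not part of the assignment.

```python
from itertools import combinations


def pickable_by_orthant(points, subset_idx):
    """True if some R_a = {x : x_i >= a_i for all i} contains exactly
    the points whose indices are in `subset_idx`."""
    if not subset_idx:
        return True  # move the corner beyond every point
    n = len(points[0])
    # tightest corner: coordinatewise minimum over the chosen points
    a = [min(points[k][i] for k in subset_idx) for i in range(n)]
    return all(
        all(p[i] >= a[i] for i in range(n)) == (k in subset_idx)
        for k, p in enumerate(points)
    )


def is_shattered(points):
    """True if every subset of `points` can be picked out by some R_a."""
    return all(
        pickable_by_orthant(points, set(sub))
        for size in range(len(points) + 1)
        for sub in combinations(range(len(points)), size)
    )


def negative_basis(n):
    """The n points -e_1, ..., -e_n in R^n (an illustrative witness set)."""
    return [tuple(-1.0 if i == k else 0.0 for i in range(n)) for k in range(n)]


if __name__ == "__main__":
    for n in (2, 3, 4):
        witness = negative_basis(n)
        extra = witness + [tuple(-2.0 for _ in range(n))]
        print(n, is_shattered(witness), is_shattered(extra))
```

Trying only the tightest corner is sufficient because any other corner that covers the subset lies below it in every coordinate and therefore covers at least the same points. With n = 2 this reproduces the two-dimensional case of part (i).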
numbers for a class H of sets if its VC-dim(H) is finite. The Glivenko-Cantelli Theorem says that empirical distribution functions converge in the $L^\infty$ norm to the distribution function in probability. Here is what it means.
Let F(x) be the distribution function of a random variable X, i.e., $P(X \le x) = F(x)$. We wish to estimate this distribution function. For this purpose, we obtain m i.i.d. samples $\{x_1, x_2, \ldots, x_m\}$ where each $x_i \sim P$. Then we construct the empirical distribution function $G_m(x) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\{x_i \le x\}$. Show that $P(\sup_x |G_m(x) - F(x)| > \epsilon) \to 0$ as $m \to \infty$.
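We already know that there is uniform convergence in the weak law of large numbers for a class H of sets if VC-dim(H) is finite. Here $G_m(x)$ is the empirical frequency of the set $(-\infty, x]$ and $F(x)$ is its probability, and the class of all such half-lines has VC-dimension 1, so the claim follows from that result. A direct argument is as follows.
The empirical distribution function is $G_m(x) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\{x_i \le x\}$.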
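Fix $\epsilon > 0$ and let $t$ be an integer with $1/t < \epsilon$. For a point $z$, write $F(z^-) = F(z) - P(X = z)$ for the left limit of $F$ at $z$, and define $G_m(z^-)$ in the same way.
Set $z_0 = -\infty$ and $z_t = +\infty$, with the conventions $F(z_0) = G_m(z_0) = 0$ and $F(z_t^-) = G_m(z_t^-) = F(z_t) = G_m(z_t) = 1$. For $j = 1, 2, \ldots, t - 1$, $z_j$ should satisfy $F(z_j^-) \le \frac{j}{t} \le F(z_j)$.
Then, for $z_{j-1}$ and $z_j$ with $j = 1, 2, \ldots, t$, $F(z_j^-) - F(z_{j-1}) \le \frac{1}{t}$.
When $m \to \infty$, by the weak law of large numbers we have $G_m(z_j) - F(z_j) \to 0$ and $G_m(z_j^-) - F(z_j^-) \to 0$ in probability for each $j$. Since the maximum below is over finitely many terms,
$\Delta = \max_{j = 1, 2, \ldots, t} \{ |G_m(z_j) - F(z_j)|, |G_m(z_j^-) - F(z_j^-)| \} \to 0$ in probability.
For any $x$ there is a $j$ with $z_{j-1} \le x < z_j$, so $F(z_{j-1}) \le F(x) \le F(z_j^-)$ and $G_m(z_{j-1}) \le G_m(x) \le G_m(z_j^-)$.
So
$G_m(x) - F(x) \le G_m(z_j^-) - F(z_{j-1}) \le G_m(z_j^-) - F(z_j^-) + \frac{1}{t} \le \Delta + \frac{1}{t}$,
$G_m(x) - F(x) \ge G_m(z_{j-1}) - F(z_j^-) \ge G_m(z_{j-1}) - F(z_{j-1}) - \frac{1}{t} \ge -\Delta - \frac{1}{t}$.
Then, $\sup_x |G_m(x) - F(x)| \le \Delta + \frac{1}{t}$, and therefore $P(\sup_x |G_m(x) - F(x)| > \epsilon) \le P(\Delta > \epsilon - \frac{1}{t}) \to 0$ as $m \to \infty$.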
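The convergence can also be observed numerically. The sketch below is a hedged illustration, not part of the proof: it builds the empirical distribution function $G_m$ from a sample and computes $\sup_x |G_m(x) - F(x)|$ exactly for a continuous $F$. The Uniform(0, 1) distribution, the seed, and the function names are assumptions chosen only for the example.

```python
import random
from bisect import bisect_right


def empirical_cdf(sample):
    """Return G_m, where G_m(x) is the fraction of sample points <= x."""
    xs = sorted(sample)
    m = len(xs)

    def G(x):
        return bisect_right(xs, x) / m

    return G


def sup_deviation(sample, F):
    """Return sup_x |G_m(x) - F(x)| for a continuous distribution function F.

    G_m only jumps at sample points, so it suffices to compare F with G_m
    at each sorted sample point and just to its left.
    """
    xs = sorted(sample)
    m = len(xs)
    return max(
        max((i + 1) / m - F(x), F(x) - i / m)
        for i, x in enumerate(xs)
    )


def uniform_cdf(x):
    """Distribution function of Uniform(0, 1), used here only as an example."""
    return min(1.0, max(0.0, x))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, arbitrary choice
    for m in (10, 100, 1000, 10000):
        sample = [rng.random() for _ in range(m)]
        print(m, sup_deviation(sample, uniform_cdf))
```

For a single run the printed deviations need not decrease monotonically in m; the theorem only states that large deviations become unlikely as m grows.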