Concept Learning
Mitchell, Chapter 2
CptS 570 Machine Learning
School of EECS, Washington State University

Outline
Definition
General-to-specific ordering over hypotheses
Version spaces and the candidate elimination algorithm
Inductive bias

Learning Task: EnjoySport
Task T: accurately predict enjoyment
Performance P: predictive accuracy
Experience E: training examples, each with attribute values and a class value (yes or no)

Representing Hypotheses
Many possible representations
Let hypothesis h be a conjunction of constraints on attributes
Hypothesis space H is the set of all possible hypotheses h
Each constraint can be:
  A specific value (e.g., Water = Warm)
  Don't care (e.g., Water = ?)
  No value is acceptable (e.g., Water = Ø)
For example, <Sunny, ?, ?, Strong, ?, Same>
I.e., if (Sky = Sunny) and (Wind = Strong) and (Forecast = Same), then EnjoySport = Yes

Concept Learning Task
Given:
  Instances X: possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
  Target function c: EnjoySport : X → {0, 1}
  Hypotheses H: conjunctions of literals, e.g., <?, Cold, High, ?, ?, ?>
  Training examples D: positive and negative examples of the target function <x1, c(x1)>, …, <xm, c(xm)>
Determine:
  A hypothesis h in H such that h(x) = c(x) for all x in D

Terminology
Inductive learning hypothesis: any hypothesis that approximates the target concept well over a sufficiently large set of training examples will also approximate the target concept well for unobserved examples

Concept Learning as Search
Learning can be viewed as a search through the hypothesis space H for a hypothesis consistent with the training examples
The general-to-specific ordering of hypotheses allows a more directed search of H

General-to-Specific Ordering of Hypotheses
[Figure: instances X and hypotheses H, ordered from specific to general]
x1 = <Sunny, Warm, High, Strong, Cool, Same>
x2 = <Sunny, Warm, High, Light, Warm, Same>
h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>
h2 is more general than both h1 and h3
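
As a concrete illustration, here is a minimal Python sketch of the "more general than or equal to" test between two conjunctive hypotheses; the function names and the convention of writing "?" for don't-care and "0" for Ø are assumptions of this sketch, not notation from the slides.

    # Hypotheses are tuples of attribute constraints: a literal value,
    # "?" (any value acceptable), or "0" (no value acceptable).

    def constraint_subsumes(general, specific):
        """True if every value allowed by `specific` is also allowed by `general`."""
        if specific == "0":        # the empty constraint allows nothing
            return True
        if general == "?":         # don't-care allows everything
            return True
        return general == specific

    def more_general_or_equal(h1, h2):
        """h1 >=g h2: every instance satisfying h2 also satisfies h1."""
        return all(constraint_subsumes(a, b) for a, b in zip(h1, h2))

    h1 = ("Sunny", "?", "?", "Strong", "?", "?")
    h2 = ("Sunny", "?", "?", "?", "?", "?")
    print(more_general_or_equal(h2, h1))  # True: h2 imposes fewer constraints
    print(more_general_or_equal(h1, h2))  # False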
Find-S Algorithm
Initialize h to the most specific hypothesis in H
For each positive training instance x:
  For each attribute constraint ai in h:
    If the constraint ai in h is satisfied by x, then do nothing
    Else replace ai in h by the next more general constraint that is satisfied by x
Output hypothesis h
(A Python sketch follows the worked example below.)

Find-S Example
[Figure: Find-S trace over instances X and hypotheses H, from most specific to most general]
h0 = <Ø, Ø, Ø, Ø, Ø, Ø>
x1 = <Sunny, Warm, Normal, Strong, Warm, Same>, +  →  h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
x2 = <Sunny, Warm, High, Strong, Warm, Same>, +  →  h2 = <Sunny, Warm, ?, Strong, Warm, Same>
x3 = <Rainy, Cold, High, Strong, Warm, Change>, −  →  h3 = <Sunny, Warm, ?, Strong, Warm, Same>
x4 = <Sunny, Warm, High, Strong, Cool, Change>, +  →  h4 = <Sunny, Warm, ?, Strong, ?, ?>
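
Below is a minimal Python sketch of Find-S applied to the trace above. The function name, the "?"/"0" encoding, and the list-of-(instance, label) input format are assumptions of this sketch rather than anything prescribed by the slides.

    def find_s(examples):
        """examples: list of (instance_tuple, label), label True for positive."""
        positives = [x for x, label in examples if label]
        n = len(positives[0])
        h = ["0"] * n                      # most specific hypothesis <0, 0, ..., 0>
        for x in positives:                # Find-S ignores negative examples
            for i, value in enumerate(x):
                if h[i] == "0":
                    h[i] = value           # first positive: copy its attribute values
                elif h[i] != value:
                    h[i] = "?"             # generalize to don't-care on disagreement
        return tuple(h)

    train = [
        (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
        (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
        (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
        (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
    ]
    print(find_s(train))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')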
Find-S Algorithm
Will h ever cover a negative example?
No, provided c ∈ H and the training examples are consistent

Problems with Find-S
Cannot tell whether it has converged on the target concept
Why prefer the most specific hypothesis?
Cannot handle training examples that are inconsistent due to errors or noise
What if there is more than one maximally specific consistent hypothesis?

Version Space Example
The version space is the subset of hypotheses in H consistent with the training examples D
Version space resulting from the previous four EnjoySport examples

Finding the Version Space: List-Then-Eliminate
VS = list of every hypothesis in H
For each training example <x, c(x)> ∈ D:
  Remove from VS any h where h(x) ≠ c(x)
Return VS
Impractical for all but the most trivial H's

Candidate Elimination Algorithm
Initialize G to the set of maximally general hypotheses in H
Initialize S to the set of maximally specific hypotheses in H
For each training example d, do:
  If d is a positive example …
  If d is a negative example …
(A Python sketch follows the worked example below.)

Example
S0: {<Ø, Ø, Ø, Ø, Ø, Ø>}
S1: {<Sunny, Warm, Normal, Strong, Warm, Same>}
S2: {<Sunny, Warm, ?, Strong, Warm, Same>}
G0, G1, G2: {<?, ?, ?, ?, ?, ?>}

Training examples:
1. <Sunny, Warm, Normal, Strong, Warm, Same>, EnjoySport = Yes
2. <Sunny, Warm, High, Strong, Warm, Same>, EnjoySport = Yes
Example (cont.)
S2, S3: {<Sunny, Warm, ?, Strong, Warm, Same>}
G3: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}
G2: {<?, ?, ?, ?, ?, ?>}
Training Example:
3. <Rainy, Cold, High, Strong, Warm, Change>, EnjoySport=No
Example (cont.)
S3: {<Sunny, Warm, ?, Strong, Warm, Same>}
S4: {<Sunny, Warm, ?, Strong, ?, ?>}
G4: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}
G3: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}
Training Example:
4. <Sunny, Warm, High, Strong, Cool, Change>, EnjoySport = Yes
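
Below is a minimal Python sketch of candidate elimination for this conjunctive-hypothesis representation, reproducing the S and G boundaries traced above. The function names, the "?"/"0" encoding, and the step that derives attribute domains from the training data are assumptions of this sketch, not from the slides.

    def covers(h, x):
        """True if hypothesis h classifies instance x as positive."""
        return all(c == "?" or c == v for c, v in zip(h, x))

    def more_general_or_equal(h1, h2):
        """True if h1 covers every instance that h2 covers."""
        return all(a == "?" or b == "0" or (b != "?" and a == b) for a, b in zip(h1, h2))

    def min_generalization(s, x):
        """Minimal generalization of s that covers the positive instance x."""
        return tuple(v if c == "0" else (c if c == v else "?") for c, v in zip(s, x))

    def min_specializations(g, x, domains):
        """All minimal specializations of g that exclude the negative instance x."""
        results = []
        for i, c in enumerate(g):
            if c == "?":
                for value in domains[i]:
                    if value != x[i]:
                        results.append(g[:i] + (value,) + g[i + 1:])
        return results

    def candidate_elimination(examples):
        """examples: list of (instance_tuple, is_positive). Returns the S and G boundary sets."""
        n = len(examples[0][0])
        # Attribute domains derived from the training data (an assumption of this sketch).
        domains = [sorted({x[i] for x, _ in examples}) for i in range(n)]
        S = {("0",) * n}   # boundary of maximally specific hypotheses
        G = {("?",) * n}   # boundary of maximally general hypotheses
        for x, positive in examples:
            if positive:
                G = {g for g in G if covers(g, x)}                       # drop inconsistent g
                for s in [s for s in S if not covers(s, x)]:
                    S.remove(s)
                    h = min_generalization(s, x)
                    if any(more_general_or_equal(g, h) for g in G):      # keep h only if bounded by G
                        S.add(h)
                S = {s for s in S if not any(t != s and more_general_or_equal(s, t) for t in S)}
            else:
                S = {s for s in S if not covers(s, x)}                   # drop inconsistent s
                for g in [g for g in G if covers(g, x)]:
                    G.remove(g)
                    for h in min_specializations(g, x, domains):
                        if any(more_general_or_equal(h, s) for s in S):  # keep h only if bounded by S
                            G.add(h)
                G = {g for g in G if not any(t != g and more_general_or_equal(t, g) for t in G)}
        return S, G

    train = [
        (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
        (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
        (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
        (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
    ]
    S, G = candidate_elimination(train)
    print(sorted(S))  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
    print(sorted(G))  # [('?', 'Warm', '?', '?', '?', '?'), ('Sunny', '?', '?', '?', '?', '?')]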
Version Spaces and the Candidate Elimination Algorithm
Which training example should be requested next?
The learner may query an oracle for an example's classification
Ideally, choose an example that eliminates half of the version space
Then only about ⌈log2 |VS|⌉ examples are needed to converge

Which Training Example Next?
<Sunny, Cold, Normal, Strong, Cool, Change> ?
<Sunny, Warm, High, Light, Cool, Change> ?

Using the Version Space to Classify New Examples
<Sunny, Warm, Normal, Strong, Cool, Change> ?
<Rainy, Cold, Normal, Light, Warm, Same> ?
<Sunny, Warm, Normal, Light, Warm, Same> ?
<Sunny, Cold, Normal, Strong, Warm, Same> ?
(A sketch of this classification scheme follows the summary below.)

Unbiased Learner
H = every teachable concept (the power set of X)
E.g., for EnjoySport, |H| = 2^96 ≈ 10^28 (versus only 973 hypotheses in the previous, biased H)
H' = arbitrary conjunctions, disjunctions, or negations of hypotheses from the previous H
E.g., [Sky = Sunny or Cloudy] ≡ <Sunny, ?, ?, ?, ?, ?> ∨ <Cloudy, ?, ?, ?, ?, ?>

Unbiased Learner
Problems using H':
  S = disjunction of the positive examples
  G = negated disjunction of the negative examples
  Thus, no generalization
  Each unseen instance is covered by exactly half of the version space

Unbiased Learner
Bias-free learning is futile
Fundamental property of inductive learning: learners that make no a priori assumptions about the target concept have no rational basis for classifying unseen instances

Inductive Bias
Permits comparison of learners
Rote learner: stores examples; classifies x iff it matches a previously observed example; no bias
Candidate elimination: c ∈ H
Find-S: c ∈ H, plus c(x) = 0 for all instances not covered by the learned hypothesis

WEKA's ConjunctiveRule Classifier
Learns a rule of the form: If A1 and A2 and … and An, then class = c
The A's are inequality constraints on attributes
The A's are chosen based on an information-gain criterion, i.e., which constraint, when added, best improves classification
Lastly, performs reduced-error pruning: remove A's from the rule as long as this reduces error on a pruning set
If an instance x is not covered by the rule, then c(x) = majority class of the training examples not covered by the rule
Inductive bias?

Summary
Concept learning as search
General-to-specific ordering
Version spaces
Candidate elimination algorithm
S and G boundary sets characterize the learner's uncertainty
The learner can generate useful queries
Inductive leaps are possible only if the learner is biased
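
To tie the version-space ideas together, here is a minimal sketch of classifying new instances with the S and G boundaries computed by the candidate-elimination sketch above; the function name and the "+/−/?" output convention are assumptions of this sketch. An instance is positive when even the most specific boundary covers it, negative when not even the most general boundary covers it, and ambiguous otherwise.

    # Uses covers(), S, and G from the candidate-elimination sketch above.

    def classify_with_version_space(S, G, x):
        """'+' if every hypothesis in the version space covers x,
        '-' if none does, '?' if the version space disagrees."""
        if all(covers(s, x) for s in S):       # covered even by the most specific boundary
            return "+"
        if not any(covers(g, x) for g in G):   # excluded even by the most general boundary
            return "-"
        return "?"

    for x in [
        ("Sunny", "Warm", "Normal", "Strong", "Cool", "Change"),   # +
        ("Rainy", "Cold", "Normal", "Light", "Warm", "Same"),      # -
        ("Sunny", "Warm", "Normal", "Light", "Warm", "Same"),      # ?
        ("Sunny", "Cold", "Normal", "Strong", "Warm", "Same"),     # ?
    ]:
        print(x, classify_with_version_space(S, G, x))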