Concept Learning in Machine Learning (Lecture Notes)

CptS 570 Machine Learning, School of EECS, Washington State University

Concept Learning (Mitchell, Chapter 2)

Outline
- Definition
- General-to-specific ordering over hypotheses
- Version spaces and the candidate elimination algorithm
- Inductive bias

Learning Task: EnjoySport
- Task T: accurately predict enjoyment
- Performance P: predictive accuracy
- Experience E: training examples, each with attribute values and a class value (yes or no)

Representing Hypotheses
- Many possible representations
- Let hypothesis h be a conjunction of constraints on attributes
- Hypothesis space H is the set of all possible hypotheses h
- Each constraint can be:
  - a specific value (e.g., Water = Warm)
  - don't care (e.g., Water = ?)
  - no value is acceptable (e.g., Water = Ø)
- Example: <Sunny, ?, ?, Strong, ?, Same>
  i.e., if (Sky = Sunny) and (Wind = Strong) and (Forecast = Same), then EnjoySport = Yes

Concept Learning Task
- Given:
  - Instances X: possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
  - Target function c: EnjoySport : X → {0, 1}
  - Hypotheses H: conjunctions of literals, e.g., <?, Cold, High, ?, ?, ?>
  - Training examples D: positive and negative examples of the target function, <x1, c(x1)>, ..., <xm, c(xm)>
- Determine: a hypothesis h in H such that h(x) = c(x) for all x in D

Terminology
- Inductive learning hypothesis: any hypothesis that approximates the target concept well over a sufficiently large set of training examples will also approximate the target concept well on unobserved examples

Concept Learning as Search
- Learning can be viewed as a search through hypothesis space H for a hypothesis consistent with the training examples
- The general-to-specific ordering of hypotheses allows a more directed search of H

General-to-Specific Ordering of Hypotheses
[Figure: instances X and hypotheses H arranged from specific to general.
 x1 = <Sunny, Warm, High, Strong, Cool, Same>, x2 = <Sunny, Warm, High, Light, Warm, Same>;
 h1 = <Sunny, ?, ?, Strong, ?, ?>, h2 = <Sunny, ?, ?, ?, ?, ?>, h3 = <Sunny, ?, ?, ?, Cool, ?>;
 h2 is more general than both h1 and h3.]

Find-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x:
   For each attribute constraint ai in h:
     If the constraint ai in h is satisfied by x, then do nothing
     Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h

Find-S Example
- h0 = <Ø, Ø, Ø, Ø, Ø, Ø>
- x1 = <Sunny, Warm, Normal, Strong, Warm, Same>, +  →  h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
- x2 = <Sunny, Warm, High, Strong, Warm, Same>, +  →  h2 = <Sunny, Warm, ?, Strong, Warm, Same>
- x3 = <Rainy, Cold, High, Strong, Warm, Change>, -  →  h3 = <Sunny, Warm, ?, Strong, Warm, Same>
- x4 = <Sunny, Warm, High, Strong, Cool, Change>, +  →  h4 = <Sunny, Warm, ?, Strong, ?, ?>

Find-S Algorithm
- Will h ever cover a negative example?
- No, provided c ∈ H and the training examples are consistent

Problems with Find-S
- Cannot tell whether it has converged on the target concept
- Why prefer the most specific hypothesis?
- Cannot handle inconsistent training examples due to errors or noise
- What if there is more than one maximally specific consistent hypothesis?
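The Find-S loop above is simple enough to state directly in code. Below is a minimal Python sketch for the conjunctive hypothesis representation used in these notes, run on the four training examples from the Find-S example slide; the function name find_s and the tuple encoding of hypotheses are illustrative choices, not part of the original notes.

```python
# Minimal sketch of Find-S for conjunctive hypotheses over the EnjoySport data.
# A hypothesis is a list of constraints: a specific value, "?" (don't care),
# or None (standing in for the empty constraint Ø).

NUM_ATTRS = 6  # Sky, AirTemp, Humidity, Wind, Water, Forecast

def find_s(examples):
    """Return the maximally specific hypothesis consistent with the positive examples."""
    h = [None] * NUM_ATTRS  # most specific hypothesis <Ø, Ø, Ø, Ø, Ø, Ø>
    for x, label in examples:
        if not label:            # Find-S ignores negative examples entirely
            continue
        for i, value in enumerate(x):
            if h[i] is None:     # Ø can only be generalized to a specific value,
                h[i] = value     # so the first positive example fixes all attributes
            elif h[i] != value:  # constraint not satisfied: generalize to don't-care
                h[i] = "?"
    return h

# Training data from the Find-S example slide (True means EnjoySport = Yes).
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]

print(find_s(D))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

Note how the negative example never influences h, which is exactly why Find-S cannot detect noise or tell whether it has converged on the target concept.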
Version Space Example
- The version space is the subset of hypotheses in H consistent with the training examples in D
- Version space resulting from the previous four EnjoySport examples

Finding the Version Space: List-Then-Eliminate
1. VS ← a list of every hypothesis in H
2. For each training example <x, c(x)> ∈ D: remove from VS any hypothesis h for which h(x) ≠ c(x)
3. Return VS
- Impractical for all but the most trivial hypothesis spaces H

Candidate Elimination Algorithm
- Initialize G to the set of maximally general hypotheses in H
- Initialize S to the set of maximally specific hypotheses in H
- For each training example d:
  - If d is a positive example ...
  - If d is a negative example ...

Example
- Training examples:
  1. <Sunny, Warm, Normal, Strong, Warm, Same>, EnjoySport = Yes
  2. <Sunny, Warm, High, Strong, Warm, Same>, EnjoySport = Yes
- S0 = {<Ø, Ø, Ø, Ø, Ø, Ø>}
- S1 = {<Sunny, Warm, Normal, Strong, Warm, Same>}
- S2 = {<Sunny, Warm, ?, Strong, Warm, Same>}
- G0 = G1 = G2 = {<?, ?, ?, ?, ?, ?>}

Example (cont.)
- Training example:
  3. <Rainy, Cold, High, Strong, Warm, Change>, EnjoySport = No
- S2 = S3 = {<Sunny, Warm, ?, Strong, Warm, Same>}
- G2 = {<?, ?, ?, ?, ?, ?>}
- G3 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}

Example (cont.)
- Training example:
  4. <Sunny, Warm, High, Strong, Cool, Change>, EnjoySport = Yes
- S3 = {<Sunny, Warm, ?, Strong, Warm, Same>}
- S4 = {<Sunny, Warm, ?, Strong, ?, ?>}
- G3 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, <?, ?, ?, ?, ?, Same>}
- G4 = {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

Version Spaces and the Candidate Elimination Algorithm
- Which training example should be requested next?
- The learner may query an oracle for an example's classification
- Ideally, choose the example that eliminates half of the version space
- Then only log2 |VS| examples are needed to converge

Which Training Example Next?
- <Sunny, Cold, Normal, Strong, Cool, Change>?
- <Sunny, Warm, High, Light, Cool, Change>?

Using the Version Space to Classify New Examples
- <Sunny, Warm, Normal, Strong, Cool, Change>?
- <Rainy, Cold, Normal, Light, Warm, Same>?
- <Sunny, Warm, Normal, Light, Warm, Same>?
- <Sunny, Cold, Normal, Strong, Warm, Same>?

Unbiased Learner
- H = every teachable concept (the power set of X)
- E.g., for EnjoySport, |H| = 2^96 ≈ 10^28 (the previous, biased H had only 973!)
- H' = arbitrary conjunctions, disjunctions, or negations of hypotheses from the previous H
- E.g., [Sky = Sunny or Cloudy] becomes <Sunny, ?, ?, ?, ?, ?> ∨ <Cloudy, ?, ?, ?, ?, ?>

Unbiased Learner
- Problems using H':
  - S = disjunction of the positive examples
  - G = negated disjunction of the negative examples
  - Thus, no generalization
  - Each unseen instance is covered by exactly half of the version space

Unbiased Learner
- Bias-free learning is futile
- Fundamental property of inductive learning: a learner that makes no a priori assumptions about the target concept has no rational basis for classifying unseen instances

Inductive Bias
- Permits comparison of learners
- Rote learner: stores examples; classifies x iff it matches a previously observed example; no bias
- Candidate Elimination: c ∈ H
- Find-S: c ∈ H, plus c(x) = 0 for all instances not covered

WEKA's ConjunctiveRule Classifier
- Learns a rule of the form: if A1 and A2 and ... and An, then class = c
- The A's are inequality constraints on attributes
- The A's are chosen by an information gain criterion, i.e., which constraint, when added, best improves classification
- Lastly, performs reduced-error pruning: remove A's from the rule as long as doing so reduces error on a pruning set
- If an instance x is not covered by the rule, then c(x) = the majority class of the training examples not covered by the rule
- Inductive bias?

Summary
- Concept learning as search
- General-to-specific ordering
- Version spaces
- Candidate elimination algorithm
- The S and G boundary sets characterize the learner's uncertainty
- The learner can generate useful queries
- Inductive leaps are possible only if the learner is biased
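As a concrete companion to the version-space material above, the sketch below enumerates the conjunctive hypothesis space for EnjoySport, applies List-Then-Eliminate to the four training examples, and classifies a new instance by letting the surviving hypotheses vote. The attribute value sets and all function names are assumptions made for this illustration, not part of the original notes; with these value sets the enumeration yields the 973 semantically distinct hypotheses mentioned on the Unbiased Learner slide.

```python
# Minimal sketch of List-Then-Eliminate over the conjunctive hypothesis space
# for EnjoySport. Enumerating H outright is feasible only because this space
# is tiny, which is exactly why the notes call the algorithm impractical.
from itertools import product

# Assumed attribute value sets: Sky, AirTemp, Humidity, Wind, Water, Forecast.
ATTR_VALUES = [
    ("Sunny", "Cloudy", "Rainy"),
    ("Warm", "Cold"),
    ("Normal", "High"),
    ("Strong", "Light"),
    ("Warm", "Cool"),
    ("Same", "Change"),
]

def hypotheses():
    """Yield every conjunctive hypothesis: each slot a value or '?', plus the all-Ø one."""
    for h in product(*[vals + ("?",) for vals in ATTR_VALUES]):
        yield h
    yield (None,) * len(ATTR_VALUES)  # None stands for Ø: this hypothesis covers nothing

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def list_then_eliminate(examples):
    """Keep only the hypotheses consistent with every training example."""
    vs = list(hypotheses())
    for x, label in examples:
        vs = [h for h in vs if covers(h, x) == label]
    return vs

# The four EnjoySport training examples (True means EnjoySport = Yes).
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]

print(sum(1 for _ in hypotheses()))   # 973 semantically distinct hypotheses
vs = list_then_eliminate(D)
print(len(vs))                        # 6 hypotheses remain in the version space
for h in vs:
    print(h)

# Classify a new instance by a vote over the version space.
x_new = ("Sunny", "Warm", "Normal", "Strong", "Cool", "Change")
votes = sum(covers(h, x_new) for h in vs)
print(f"{votes} of {len(vs)} hypotheses say Yes")  # unanimous: classify as positive
```

The surviving hypotheses are exactly the ones bounded by S4 and G4 in the candidate elimination example; that algorithm represents the same version space compactly using only the S and G boundary sets rather than an explicit list.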