




























































































An overview of tree pruning and classification techniques, including impurity measures for regression and classification trees, class assignment rules, and stop-splitting rules. It also discusses the importance of pruning branches to prevent overfitting and the challenges of finding the optimal tree size.
Typology: Study notes





























































































$S_j$: smooth nonparametric function
Weather Data: Play or not Play?

Outlook   Temperature  Humidity  Windy  Play?
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rain      mild         high      false  Yes
rain      cool         normal    false  Yes
rain      cool         normal    true   No
overcast  cool         normal    true   Yes
sunny     mild         high      false  No
sunny     cool         normal    false  Yes
rain      mild         normal    false  Yes
sunny     mild         normal    true   Yes
overcast  mild         high      true   Yes
overcast  hot          normal    false  Yes
rain      mild         high      true   No
[Figure: Example Tree for "Play?". Root node: Outlook; child nodes: Yes (leaf), Humidity, Windy.]
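The example tree can be checked against the weather data with a short Python sketch. The figure text only names the nodes (Outlook, Yes, Humidity, Windy), so the branch labels used here are an assumption: overcast leads to the leaf Yes, sunny splits on Humidity, and rain splits on Windy. The function name `predict_play` is hypothetical.

```python
# Weather data from the table above: (outlook, temperature, humidity, windy, play)
WEATHER = [
    ("sunny", "hot", "high", False, "No"),
    ("sunny", "hot", "high", True, "No"),
    ("overcast", "hot", "high", False, "Yes"),
    ("rain", "mild", "high", False, "Yes"),
    ("rain", "cool", "normal", False, "Yes"),
    ("rain", "cool", "normal", True, "No"),
    ("overcast", "cool", "normal", True, "Yes"),
    ("sunny", "mild", "high", False, "No"),
    ("sunny", "cool", "normal", False, "Yes"),
    ("rain", "mild", "normal", False, "Yes"),
    ("sunny", "mild", "normal", True, "Yes"),
    ("overcast", "mild", "high", True, "Yes"),
    ("overcast", "hot", "normal", False, "Yes"),
    ("rain", "mild", "high", True, "No"),
]


def predict_play(outlook, humidity, windy):
    """Hypothetical encoding of the example tree (branch labels are assumed)."""
    if outlook == "overcast":
        return "Yes"
    if outlook == "sunny":
        # Assumed: the sunny branch splits on Humidity.
        return "Yes" if humidity == "normal" else "No"
    # Remaining branch (rain), assumed to split on Windy.
    return "No" if windy else "Yes"


# Count the rows of the learning sample that the tree gets wrong.
errors = sum(
    predict_play(outlook, humidity, windy) != play
    for outlook, _temperature, humidity, windy, play in WEATHER
)
print(f"misclassified rows: {errors} of {len(WEATHER)}")
```

Under these assumed branch labels the tree reproduces every row of the table. Temperature is never consulted, which is consistent with its absence from the figure.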
Tree-Based Models
• Models: $f(x) = \sum_{m=1}^{M} c_m I(x \in R_m)$
  • $c_m$: the regression model prediction value corresponding to the region $R_m$
• Two types
  • Regression trees
  • Classification trees
• Fundamental Issues in Tree-Based Models
  • How to decide the splitting point? (Tree growing)
  • How to control the size of the tree? (Tree pruning)

Regression Trees
• $M$ regions: $R_1, R_2, \ldots, R_M$
• $f(x) = \sum_{m=1}^{M} c_m I(x \in R_m)$
• $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$; the output $y_i$ is continuous
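The piecewise-constant form of the regression tree model can be illustrated with a minimal Python sketch. It assumes a single predictor and three regions defined by two made-up cut points (`CUTS`), and it estimates each $c_m$ as the mean response in its region. That is one common choice and is not stated in these notes. All names and data values are hypothetical.

```python
from statistics import mean

# Hypothetical 1-D example: three regions defined by two assumed cut points.
CUTS = [2.0, 5.0]  # R_1: x < 2, R_2: 2 <= x < 5, R_3: x >= 5


def region_index(x, cuts=CUTS):
    """Return the 0-based index m such that x lies in region R_{m+1}."""
    return sum(x >= c for c in cuts)


def fit_constants(xs, ys, cuts=CUTS):
    """Estimate c_m as the mean response in each region (an assumed choice)."""
    groups = [[] for _ in range(len(cuts) + 1)]
    for x, y in zip(xs, ys):
        groups[region_index(x, cuts)].append(y)
    return [mean(g) for g in groups]  # requires every region to be non-empty


def predict(x, constants, cuts=CUTS):
    """f(x) = sum_m c_m * I(x in R_m); exactly one indicator equals 1."""
    return sum(c * (region_index(x, cuts) == m) for m, c in enumerate(constants))


# Made-up training data, two observations per region.
xs = [0.5, 1.5, 2.5, 4.0, 6.0, 7.0]
ys = [1.0, 3.0, 10.0, 12.0, 20.0, 22.0]

constants = fit_constants(xs, ys)
print(constants)                # one constant per region
print(predict(3.0, constants))  # x = 3.0 falls in R_2
```

Because the regions partition the input space, the sum in `predict` always reduces to the single constant of the region that contains `x`.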
Classification and Regression Trees (Fundamentals)
Accuracy Estimation
• Question: How good is a classifier in prediction, i.e., how accurate is a classifier?
• True Misclassification Rate: Given the learning sample $L = \{(x_n, j_n),\ n = 1, \ldots, N\}$, $x_n \in X$, $j_n \in C$, construct $d(x)$. Let $(x, j)$, $x \in X$, $j \in C$, be a new sample from the same population as $L$. The true misclassification rate of $d(x)$, $R^*(d)$, is defined as $R^*(d) = P(d(x) \neq j)$.
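Since $R^*(d)$ is a probability over the whole population, it cannot be computed directly. A minimal Python sketch of one way to approximate it is shown here: take the fraction of misclassified cases in a sample that was not used to construct $d$. The classifier `d`, the sample `test_sample`, and the function name are hypothetical, and the use of a separate test sample is an assumption that this excerpt does not state.

```python
def misclassification_rate(d, sample):
    """Fraction of (x, j) pairs in `sample` for which d(x) != j."""
    if not sample:
        raise ValueError("sample must be non-empty")
    return sum(d(x) != j for x, j in sample) / len(sample)


# Hypothetical classifier: a single threshold on one numeric feature.
def d(x):
    return "A" if x < 0.5 else "B"


# Hypothetical sample of (x, j) pairs assumed to be independent of how d was built.
test_sample = [(0.1, "A"), (0.4, "B"), (0.6, "B"), (0.9, "B"), (0.7, "A")]

print(misclassification_rate(d, test_sample))
```

In this made-up sample the classifier errs on two of the five cases, so the estimate is 2/5.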