classification and regression tree, Lecture notes of Pattern Classification and Recognition

Tree-based methods extend regression (and classification) by splitting the dataset into subsets and fitting a separate, simpler model to each subset. The general framework is called CART — Classification And Regression Trees.

Typology: Lecture notes

2025/2026

Uploaded on 03/10/2026

mado-bukuru
mado-bukuru 🇺🇸

3 documents

1 / 4

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
ISYE 6501 — Introduction to Analytics
Module 10, Lesson 1: Introduction to CART
Classification and Regression Trees
1. Overview
Tree-based methods extend regression (and classification) by splitting the dataset into subsets
and fitting a separate, simpler model to each subset. The general framework is called CART —
Classification And Regression Trees.
What problem does CART solve?
A single global regression model assumes the same coefficients apply to all observations. But
real-world relationships often differ across subgroups. For example, the effect of a marketing
email on customer spending may vary strongly by age. CART lets the model capture these
interactions naturally by partitioning the data.
2. How Trees Work in Regression
Basic Concept
Start with all training data at the root node. Split into branches based on the value of one
predictor variable at a time. Each branch can be split again, creating a tree structure. The
endpoints of the tree are called leaves (or leaf nodes). A separate model is fit to the data within
each leaf.
Example: Marketing Email Study
Suppose we model the impact of a marketing email on website spending using:
Demographic factors: age, sex, number of children, annual income
Purchasing factor: average monthly spend on the website
Treatment indicator: binary variable for whether a marketing email was received
A tree might branch as follows:
Root split: Age ≤ 25 vs. Age ≥ 26
Second split (within Age ≥ 26): Age 26–50 vs. Age ≥ 51
Third split (within Age 26–50): No children vs. At least one child
pf3
pf4

Partial preview of the text

Download classification and regression tree and more Lecture notes Pattern Classification and Recognition in PDF only on Docsity!

ISYE 6501 — Introduction to Analytics

Module 10, Lesson 1: Introduction to CART

Classification and Regression Trees

1. Overview

Tree-based methods extend regression (and classification) by splitting the dataset into subsets and fitting a separate, simpler model to each subset. The general framework is called CART — Classification And Regression Trees.

What problem does CART solve?

A single global regression model assumes the same coefficients apply to all observations. But real-world relationships often differ across subgroups. For example, the effect of a marketing email on customer spending may vary strongly by age. CART lets the model capture these interactions naturally by partitioning the data.

2. How Trees Work in Regression

Basic Concept

Start with all training data at the root node. Split into branches based on the value of one predictor variable at a time. Each branch can be split again, creating a tree structure. The endpoints of the tree are called leaves (or leaf nodes). A separate model is fit to the data within each leaf.

Example: Marketing Email Study

Suppose we model the impact of a marketing email on website spending using:

  • Demographic factors: age, sex, number of children, annual income
  • Purchasing factor: average monthly spend on the website
  • Treatment indicator: binary variable for whether a marketing email was received A tree might branch as follows:
  • Root split: Age ≤ 25 vs. Age ≥ 26
  • Second split (within Age ≥ 26): Age 26–50 vs. Age ≥ 51
  • Third split (within Age 26–50): No children vs. At least one child

This produces four leaves, each with its own regression model and coefficients. A 37-year-old with two children would follow the path: Age ≥ 26 → 26–50 → At least one child, and be scored using that leaf's model.

★ Key Insight: Coefficients may differ across leaves, and some predictors may be

significant in one leaf but not another. This is the power of tree-based partitioning.

3. What's Actually Done in Practice Simple Leaf Models (Not Full Regression) Theory: fit a full regression at every node. Practice: this is computationally expensive, risks overfitting (fewer data points deeper in the tree), and becomes prohibitive when using many trees (e.g., Random Forests). So instead, each leaf uses the simplest possible model: Task Leaf Model Prediction Output Regression Tree Mean of response values in leaf Predicted numeric value Logistic Regression Tree Fraction of data points where response = true Predicted probability Classification Tree Most common class in leaf Predicted class label 4. Uses of CART Descriptive Analytics Each leaf's coefficients explain behavior within that specific subgroup. This helps identify how and why different segments respond differently. Predictive Analytics The tree routes new observations to the correct leaf, which then generates a tailored prediction. This is more precise than a single global model.

Leaf / Leaf node An endpoint of the tree where a model or decision is applied Node Any point in the tree (root, internal branch, or leaf) Decision Tree A tree used for decision-making rather than purely for prediction

9. Summary CART splits training data into subsets using a tree structure and fits a simple model (typically the mean, fraction, or mode) to each leaf. This allows the model to capture interactions and subgroup differences that a single global model would miss. The same tree structure applies to regression trees, logistic regression trees, classification trees, and decision trees.