Machine Learning Performance: Accuracy, Lift, Precision, Recall, ROC, and ROC Area, Slides of Advanced Algorithms

An overview of performance measures used to evaluate the effectiveness of machine learning models. Topics include accuracy, weighted accuracy, lift, precision, recall, ROC, and ROC area, covering the concepts behind each measure, how each is calculated, and where it applies. The slides also discuss the limitations and assumptions of accuracy as a performance measure and the importance of considering other measures for specific use cases.

Typology: Slides

2020/2021

Uploaded on 05/13/2021

SybyllaA

Performance Measures
for Machine Learning

Performance Measures

  • Accuracy
  • Weighted (Cost-Sensitive) Accuracy
  • Lift
  • Precision/Recall
    • F
    • Break Even Point
  • ROC
    • ROC Area

Confusion Matrix

              Predicted 1   Predicted 0
  True 1          a             b
  True 0          c             d

  • correct: a, d
  • incorrect: b, c

accuracy = (a+d) / (a+b+c+d)

The split between Predicted 1 and Predicted 0 is set by a threshold.
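As a minimal sketch of the confusion matrix above, the following Python counts the four cells and computes accuracy. It assumes a case is predicted 1 when its score f(x) is strictly greater than the threshold; the function names and the scores and labels are made up for illustration.

```python
def confusion_counts(scores, labels, threshold):
    """Return (a, b, c, d) using the slide's cell names.

    a: true 1, predicted 1    b: true 1, predicted 0
    c: true 0, predicted 1    d: true 0, predicted 0
    Assumption: a case is predicted 1 when score > threshold.
    """
    a = b = c = d = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score > threshold else 0
        if label == 1 and predicted == 1:
            a += 1
        elif label == 1:
            b += 1
        elif predicted == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d


def accuracy(a, b, c, d):
    """accuracy = (a+d) / (a+b+c+d), as on the slide."""
    return (a + d) / (a + b + c + d)


# Made-up scores and labels, for illustration only.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
counts = confusion_counts(scores, labels, threshold=0.5)
print(counts)  # (2, 1, 1, 2)
print(accuracy(*counts))
```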

Prediction Threshold

              Predicted 1   Predicted 0
  True 1          0             b
  True 0          0             d

  • threshold > MAX(f(x))
  • all cases predicted 0
  • (b+d) = total
  • accuracy = %False = %0’s

              Predicted 1   Predicted 0
  True 1          a             0
  True 0          c             0

  • threshold < MIN(f(x))
  • all cases predicted 1
  • (a+c) = total
  • accuracy = %True = %1’s
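The two extreme thresholds can be checked numerically. This is a small sketch with made-up data, again assuming that a case is predicted 1 when f(x) > threshold; `accuracy_at` is a hypothetical helper name.

```python
def accuracy_at(scores, labels, threshold):
    """Fraction of cases where (score > threshold) agrees with the true label."""
    correct = sum((s > threshold) == (y == 1) for s, y in zip(scores, labels))
    return correct / len(labels)


# Made-up scores and labels, for illustration only.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 0, 0, 1, 0, 0]

frac_ones = sum(labels) / len(labels)

# threshold > MAX(f(x)): every case is predicted 0, so accuracy = %0's.
all_zero = accuracy_at(scores, labels, max(scores) + 1.0)

# threshold < MIN(f(x)): every case is predicted 1, so accuracy = %1's.
all_one = accuracy_at(scores, labels, min(scores) - 1.0)

print(all_zero, 1 - frac_ones)
print(all_one, frac_ones)
```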

threshold demo

Problems with Accuracy

  • Assumes equal cost for both kinds of errors
    • cost(b-type-error) = cost(c-type-error)
  • is 99% accuracy good?
    • can be excellent, good, mediocre, poor, terrible
    • depends on problem
  • is 10% accuracy bad?
    • information retrieval
  • BaseRate = accuracy of predicting predominant class (on most problems obtaining BaseRate accuracy is easy)
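To illustrate why 99% accuracy is not automatically good, here is a minimal sketch of the BaseRate idea: the accuracy reached by always predicting the predominant class. The helper name `base_rate` and the class counts are assumptions chosen for the example.

```python
from collections import Counter


def base_rate(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Made-up, heavily imbalanced labels: 99 negatives and 1 positive.
labels = [0] * 99 + [1]
print(base_rate(labels))  # 0.99
```

On data like this, a model reporting 99% accuracy has not improved on a constant prediction.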

Costs (Error Weights)

              Predicted 1   Predicted 0
  True 1         w_a           w_b
  True 0         w_c           w_d

  • Often w_a = w_d = 0 and w_b ≠ w_c ≠ 0
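The slide gives the weights but no formula for combining them. One plausible reading, shown here purely as an assumption, is to sum each cell count multiplied by its weight. The function name `weighted_cost`, the counts, and the weights are all hypothetical.

```python
def weighted_cost(counts, weights):
    """Total cost = w_a*a + w_b*b + w_c*c + w_d*d (an assumed combination rule)."""
    return sum(weights[cell] * counts[cell] for cell in "abcd")


# Made-up confusion-matrix counts.
counts = {"a": 40, "b": 10, "c": 5, "d": 45}

# Correct predictions cost nothing; a b-type error is assumed to cost
# five times as much as a c-type error.
weights = {"a": 0.0, "b": 5.0, "c": 1.0, "d": 0.0}

print(weighted_cost(counts, weights))  # 55.0
```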

Lift

  • not interested in accuracy on entire dataset
  • want accurate predictions for 5%, 10%, or 20% of dataset
  • don’t care about remaining 95%, 90%, 80%, resp.
  • typical application: marketing
  • how much better than random prediction on the fraction of the dataset predicted true (f(x) > threshold)

$$\text{lift}(\text{threshold}) = \frac{\%\ \text{positives} > \text{threshold}}{\%\ \text{dataset} > \text{threshold}}$$

Lift

              Predicted 1   Predicted 0
  True 1          a             b
  True 0          c             d

$$\text{lift} = \frac{a / (a + b)}{(a + c) / (a + b + c + d)}$$
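The lift formula can be sketched directly from the four confusion-matrix counts. The counts below are made up; the function name `lift` is just a label for this example.

```python
def lift(a, b, c, d):
    """lift = (fraction of positives above threshold) / (fraction of dataset above threshold)."""
    frac_positives_above = a / (a + b)
    frac_dataset_above = (a + c) / (a + b + c + d)
    return frac_positives_above / frac_dataset_above


# Made-up counts: 200 cases, 40 positives, 50 cases predicted 1.
# Half of the positives fall in the top quarter of the dataset.
print(lift(a=20, b=20, c=30, d=130))  # 2.0
```

A lift of 1 means the selected fraction contains positives at the same rate as random selection would.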

Lift and Accuracy do not always correlate well

[Figure: Problem 1 and Problem 2, with thresholds arbitrarily set at 0.5 for both lift and accuracy]

Precision and Recall

  • typically used in document retrieval
  • Precision:
    • how many of the returned documents are correct
    • precision(threshold)
  • Recall:
    • how many of the positives does the model return
    • recall(threshold)
  • Precision/Recall Curve: sweep thresholds
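Sweeping thresholds to trace a precision/recall curve might look like the sketch below. It assumes each distinct score is tried as a threshold, that a case is returned when f(x) > threshold, and that thresholds where precision or recall is undefined are skipped. The name `pr_curve` and the data are illustrative.

```python
def pr_curve(scores, labels):
    """Return a list of (threshold, precision, recall), one per usable threshold."""
    points = []
    for threshold in sorted(set(scores)):
        a = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
        b = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
        c = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
        if a + c == 0 or a + b == 0:
            continue  # nothing returned, or no positives: measure undefined
        points.append((threshold, a / (a + c), a / (a + b)))
    return points


# Made-up scores and labels, for illustration only.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
for threshold, precision, recall in pr_curve(scores, labels):
    print(threshold, precision, recall)
```

Raising the threshold generally trades recall for precision, though precision need not rise at every step, as the printed points show.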

Summary Stats: F & BreakEvenPt

$$\text{PRECISION} = \frac{a}{a + c} \qquad \text{RECALL} = \frac{a}{a + b}$$

$$F = \frac{2 \times (\text{PRECISION} \times \text{RECALL})}{\text{PRECISION} + \text{RECALL}}$$

F is the harmonic average of precision and recall.

BreakEvenPoint = PRECISION = RECALL
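Both summary statistics can be sketched in a few lines. Two choices here are assumptions rather than anything stated on the slide: F is returned as 0 when precision and recall are both 0, and the break-even point is approximated by the curve point where precision and recall are closest, since exact equality may not occur on finite data. The function names and the sample points are hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic average of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the undefined case
    return 2 * precision * recall / (precision + recall)


def break_even_point(points):
    """Return the (precision, recall) pair with the smallest |precision - recall|."""
    return min(points, key=lambda pr: abs(pr[0] - pr[1]))


# Made-up (precision, recall) pairs from a threshold sweep.
points = [(0.6, 1.0), (0.75, 1.0), (2 / 3, 2 / 3), (1.0, 2 / 3), (1.0, 1 / 3)]

print(break_even_point(points))
print([f_measure(p, r) for p, r in points])
```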