Amazon Data Mining Examples - Database Design - Lecture Slides, Slides of Database Management Systems (DBMS)

These lecture slides are easy to understand and help build a conceptual foundation in computing and database design. The key points in these slides are: Amazon Data Mining Examples, Decision Tree, Association Rules, Clustering, Data Mining, Automated Extraction, Internal Node, Traverse Tree, Credit Risk Decision Tree, Decision Tree Construction, Antecedent True

Typology: Slides

2012/2013

Uploaded on 04/27/2013

arunima

Data Mining

Agenda

  • Data Mining As Part of KDD
  • Decision Tree
  • Association Rules
  • Clustering
  • Amazon Data Mining Examples

What is Data Mining?

  • “the automated extraction of hidden predictive information from large databases”
  • Algorithms produce patterns, rules
  • Predict future trends/behavior
  • Used to make business decisions

Classification

  • Items belong to classes
  • Given past items’ classification, predict class of new item
  • Example: Issuing credit cards
    • Use information: income, educational background, age, current debts
    • Credit worthiness: Bad, good, excellent

Credit Risk Decision Tree

Decision Tree Construction

  • Some Definitions
    • Purity: the more of a leaf's instances that belong to a single class, the higher its purity
    • Best Split: the split giving the maximum information gain ratio (info gain / info content); choose the attribute and condition resulting in maximum purity
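The best-split criterion can be sketched as follows, assuming that "info content" means the entropy of the split itself (the split information used in C4.5-style trees) and that attributes are categorical. The rows, labels, and function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits; zero means the set is pure (a single class)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, labels, attribute):
    """Information gain of splitting on `attribute`, divided by the split's entropy."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted impurity left over after the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Assumed meaning of "info content": entropy of the attribute's own values
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical training data
rows = [
    {"income": "high", "debt": "low"},
    {"income": "high", "debt": "high"},
    {"income": "low", "debt": "low"},
    {"income": "low", "debt": "high"},
]
labels = ["good", "good", "bad", "bad"]

best = max(["income", "debt"], key=lambda a: gain_ratio(rows, labels, a))
print(best)  # prints "income"
```

In this toy data, splitting on income yields two pure leaves, while splitting on debt leaves both leaves evenly mixed, so income is chosen.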

Association Rules

  • antecedent → consequent
    • if → then
    • beer → diaper (Walmart)
    • economy bad → higher unemployment
    • higher unemployment → higher unemployment benefits cost
  • Rules associated with population, support, confidence

Association Rules

  • Population: instances such as grocery store purchases
  • Support
    • % of population satisfying antecedent and consequent
  • Confidence
    • % consequent true when antecedent true
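As a small worked sketch, support and confidence for the beer and diaper rule can be computed over a population of purchases. The transactions and the function name `support_and_confidence` are hypothetical.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= t]
    with_both = [t for t in with_antecedent if consequent <= t]
    # Support: share of the whole population satisfying both sides
    support = len(with_both) / len(transactions)
    # Confidence: share of antecedent-true cases where the consequent also holds
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical population of grocery store purchases
transactions = [
    {"beer", "diaper", "chips"},
    {"beer", "diaper"},
    {"beer", "bread"},
    {"milk", "diaper"},
    {"milk", "bread"},
]

support, confidence = support_and_confidence(transactions, {"beer"}, {"diaper"})
print(f"support={support:.0%}, confidence={confidence:.0%}")
# prints "support=40%, confidence=67%"
```

Two of the five purchases contain both items (support 40%), and two of the three purchases containing beer also contain diapers (confidence about 67%).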

Clustering

  • “The process of dividing a dataset into mutually exclusive groups such that the members of each group are as "close" as possible to one another, and different groups are as "far" as possible from one another, where distance is measured with respect to all available variables.”

Clustering

  • BIRCH algorithm
    • points inserted into multidimensional tree
    • items guided to leaf nodes "near" representative internal nodes
    • nearby points clustered into one leaf node

Clustering

    1. cluster people with similar movie preferences
    2. given a new movie goer, find a cluster of similar movie goers
    3. then predict the cluster's new movie preferences
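The steps above can be sketched roughly as follows, assuming the clusters from step 1 already exist and that similarity is Euclidean distance to each cluster's centroid. The ratings, cluster names, and function names are hypothetical.

```python
from math import dist

# Hypothetical 1-5 ratings for three movies; the third is the "new" movie.
# Clusters are assumed to have been formed already (step 1).
clusters = {
    "action_fans": [[5, 4, 1], [4, 5, 2]],
    "romance_fans": [[1, 2, 5], [2, 1, 4]],
}


def centroid(members):
    """Per-movie mean rating across a cluster's members."""
    return [sum(col) / len(members) for col in zip(*members)]


def nearest_cluster(clusters, known_ratings):
    """Step 2: compare only on the movies the new movie goer has rated."""
    k = len(known_ratings)
    return min(
        clusters,
        key=lambda name: dist(centroid(clusters[name])[:k], known_ratings),
    )


new_goer = [5, 5]  # has rated the first two movies only
name = nearest_cluster(clusters, new_goer)
# Step 3: use the cluster's mean rating as the prediction for the unseen movie
predicted = centroid(clusters[name])[len(new_goer):]
print(name, predicted)  # prints "action_fans [1.5]"
```

The prediction is simply the matched cluster's average for the unseen movie; a real recommender would likely weight members by similarity.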

Amazon Examples
