Data Mining
Slides prepared by Irzam Sarfraz
Book
- Data Mining: Concepts, Models, Methods, and Algorithms
- Author: Mehmed Kantardzic
- PDF available at: http://sites.google.com/view/dmgmdc
Data Mining
- Discovering and utilizing patterns in large datasets
Objectives of Data Mining
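- Predictive data mining: builds a model of the system from the data, used to predict unknown or future values
- Descriptive data mining: finds new, nontrivial patterns that describe the data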
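
To make the two objectives concrete, here is a minimal sketch in plain Python. The numbers, the variable names (`ad_spend`, `sales`, `baskets`), and the choice of techniques (a least-squares line for the predictive case, pair counting for the descriptive case) are illustrative assumptions, not taken from the slides or the book.

```python
from collections import Counter
from itertools import combinations


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Predictive: learn a model from known cases, then predict an unseen one.
ad_spend = [1, 2, 3, 4]   # hypothetical inputs
sales = [2, 4, 6, 8]      # hypothetical outputs
slope, intercept = fit_line(ad_spend, sales)
predicted_sales = slope * 5 + intercept  # prediction for an unseen input

# Descriptive: summarize patterns already present in the data.
baskets = [                # hypothetical shopping baskets
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
]
pair_counts = Counter(
    pair
    for basket in baskets
    for pair in combinations(sorted(basket), 2)
)

print("predicted sales:", predicted_sales)
print("item pairs bought together:", dict(pair_counts))
```

The predictive half produces a value for a case that is not in the data; the descriptive half only reports what the data already contains.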
Data Mining Process
- State the problem and formulate the hypothesis
- Collect the data
- Preprocess the data
  - Scaling, encoding, and selecting features
- Estimate the model
- Interpret the model and draw conclusions
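
The preprocessing step can be sketched as follows. This is a minimal illustration, assuming min-max scaling, one-hot encoding, and dropping constant columns as stand-ins for "scaling, encoding, and selecting features"; the column names and values are hypothetical, and real projects often use other variants of each technique.

```python
def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categorical values as 0/1 indicator columns, one per category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


def drop_constant(columns):
    """Crude feature selection: drop columns that never vary."""
    return {name: col for name, col in columns.items() if len(set(col)) > 1}


# Hypothetical raw data, one list per column.
columns = {
    "age": [25, 40, 55],
    "income": [30000, 60000, 90000],
    "region": ["north", "south", "north"],
    "source": ["web", "web", "web"],
}

features = {
    "age": min_max_scale(columns["age"]),
    "income": min_max_scale(columns["income"]),
}
for name in ("region", "source"):
    for category, indicator in one_hot(columns[name]).items():
        features[f"{name}={category}"] = indicator

selected = drop_constant(features)
print(selected)
```

After scaling, `age` and `income` share the same range, so neither dominates a distance-based model merely because of its units. The `source` column carries no information here and is removed.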
Large Datasets
- Problem?
- Size and dimensionality are too large for manual analysis
- Even semi-automatic methods, e.g., working through the data with a calculator, are not feasible
- But conclusions drawn from large datasets can be very valuable
Data in Data Mining
- Structured, e.g., business transactions in databases
- Semi-structured, e.g., webpages
- Unstructured, e.g., videos
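
The three kinds of data differ in how much work is needed before analysis. The sketch below uses only the Python standard library; the table, the HTML snippet, and the byte string are hypothetical stand-ins chosen for illustration.

```python
import sqlite3
from html.parser import HTMLParser

# Structured: a fixed schema, so the data can be queried directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)", [(1, 9.5), (2, 20.0)]
)
total = conn.execute("SELECT SUM(amount) FROM transactions").fetchone()[0]
conn.close()


# Semi-structured: tags give partial structure, but fields must be extracted.
class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


parser = TitleParser()
parser.feed(
    "<html><head><title>Data Mining</title></head>"
    "<body><p>Intro</p></body></html>"
)

# Unstructured: raw bytes with no fields at all (stand-in for video content).
video_bytes = bytes([0, 17, 255, 42])

print(total, parser.title, len(video_bytes))
```

The structured data answers a question with one query, the webpage needs a parser first, and the raw bytes expose nothing but their length until features are extracted from them.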
Unobserved Variables
- Variables that influence the system but are not available or not recorded
- May be missing because measuring them is too costly or complex, or because their relevance was not recognized
- Their absence leads to inaccuracies in the model
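
The effect of an unobserved variable can be illustrated with a small sketch. It assumes a made-up system `y = 2*x + 3*z` in which `z` is never recorded; the data values and the least-squares fit are illustrative choices, and the "full" model simply reuses the true coefficients to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical system: y depends on both x and z.
x = [1, 2, 3, 4]
z = [1, 0, 1, 0]  # influences y but is assumed not to be recorded
y = [2 * xi + 3 * zi for xi, zi in zip(x, z)]

# Model estimated from the observed variable only.
slope, intercept = fit_line(x, y)
pred_x_only = [slope * xi + intercept for xi in x]
mae_x_only = mean_abs_error(pred_x_only, y)

# Model that also has access to z (true coefficients, by assumption).
pred_full = [2 * xi + 3 * zi for xi, zi in zip(x, z)]
mae_full = mean_abs_error(pred_full, y)

print("error without z:", mae_x_only)
print("error with z:", mae_full)
```

Even the best line through `x` alone cannot account for the variation that `z` causes, so its error stays well above zero.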
Data Warehouses
- An organization's repository of data, storing millions or billions of records for strategic decision making
- Responds to queries from decision makers
- Problem: may produce misleading results if the data is not standardized or integrated
- Solution: use transformations to structure the data consistently
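
The standardization problem and its fix can be sketched with two hypothetical source systems that record the same kind of sale under different conventions. The field names, the `COUNTRY_CODES` mapping, and the amounts are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical records from two source systems.
source_a = [
    {"country": "PK", "amount_usd": 10.0},
    {"country": "US", "amount_usd": 20.0},
]
source_b = [
    {"country": "Pakistan", "amount_cents": 1500},
    {"country": "United States", "amount_cents": 500},
]

# Without standardization, the same country appears under two names.
naive_counts = Counter(r["country"] for r in source_a + source_b)

COUNTRY_CODES = {"Pakistan": "PK", "United States": "US"}  # assumed mapping


def standardize_b(record):
    """Transform a source-B record into the source-A conventions."""
    return {
        "country": COUNTRY_CODES[record["country"]],
        "amount_usd": record["amount_cents"] / 100,
    }


def total_by_country(records):
    totals = {}
    for r in records:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["amount_usd"]
    return totals


integrated = source_a + [standardize_b(r) for r in source_b]
totals = total_by_country(integrated)

print("distinct country labels before:", len(naive_counts))
print("totals after standardization:", totals)
```

Before the transformation a query sees four "countries"; afterwards it sees two, with amounts in a single unit.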
Reading Assignment
- Article 1
- Why a data mining project fails?
- Due at the first lecture next week