What is Data Mining-Data Warehouse-Lecture Slides, Slides of Data Warehousing

Topics included in this course are Data Warehousing Concepts, Design and Development, Extraction, Transformation and Loading, OLAP Technology, Data Mining Techniques (Classification, Clustering and Decision Tree), and Advanced Topics. This lecture covers: Data Mining, Process, Interdisciplinary Field, Statistics, Databases, Pattern Recognition, Visualization, Instances.

Typology: Slides

2011/2012

Uploaded on 08/08/2012

sharib_sweet

What is Data Mining?
“…the process of discovering meaningful new
correlations, patterns, and trends by sifting through
large amounts of data…” (Gartner Group)
“…the analysis of observational data sets to find
unsuspected relationships and to summarize data in
novel ways…” (Hand et al.)
“…is an interdisciplinary field bringing together
techniques from machine learning, pattern recognition,
statistics, databases, and visualization…” (Cabena et al.)

What is Data Mining?
• The process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data.

Supervised Learning
• Build a learner model using data instances of known origin.
• Use the model to determine the outcome of new instances of unknown origin.

Supervised Learning

Table 1.1: Hypothetical Training Data for Disease Diagnosis

Patient ID#  Sore Throat  Fever  Swollen Glands  Congestion  Headache  Diagnosis
1            Yes          Yes    Yes             Yes         Yes       Strep throat
2            No           No     No              Yes         Yes       Allergy
3            Yes          Yes    No              Yes         No        Cold
4            Yes          No     Yes             No          No        Strep throat
5            No           Yes    No              Yes         No        Cold
6            No           No     No              Yes         No        Allergy
7            No           No     Yes             No          No        Strep throat
8            Yes          No     No              Yes         Yes       Allergy
9            No           Yes    No              Yes         Yes       Cold
10           Yes          Yes    No              Yes         Yes       Cold

Supervised Learning

Table 1.2: Data Instances with an Unknown Classification

Patient ID#  Sore Throat  Fever  Swollen Glands  Congestion  Headache  Diagnosis
11           No           No     Yes             Yes         Yes       ?
12           Yes          Yes    No              No          Yes       ?
13           No           No     No              No          Yes       ?
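The two supervised learning steps (build a model from instances of known origin, then apply it to instances of unknown origin) can be sketched with the data of Tables 1.1 and 1.2. The slides do not say which learner is used, so the sketch below assumes a small entropy-based decision tree (ID3 style). The names `ATTRIBUTES`, `TRAINING`, `UNKNOWN`, `build_tree` and `classify` are illustrative and do not come from the lecture.

```python
from collections import Counter
from math import log2

# Column order shared by every row below (names are illustrative).
ATTRIBUTES = ["sore_throat", "fever", "swollen_glands", "congestion", "headache"]

# Table 1.1: five Yes/No symptoms followed by the known diagnosis.
TRAINING = [
    ("Yes", "Yes", "Yes", "Yes", "Yes", "Strep throat"),
    ("No",  "No",  "No",  "Yes", "Yes", "Allergy"),
    ("Yes", "Yes", "No",  "Yes", "No",  "Cold"),
    ("Yes", "No",  "Yes", "No",  "No",  "Strep throat"),
    ("No",  "Yes", "No",  "Yes", "No",  "Cold"),
    ("No",  "No",  "No",  "Yes", "No",  "Allergy"),
    ("No",  "No",  "Yes", "No",  "No",  "Strep throat"),
    ("Yes", "No",  "No",  "Yes", "Yes", "Allergy"),
    ("No",  "Yes", "No",  "Yes", "Yes", "Cold"),
    ("Yes", "Yes", "No",  "Yes", "Yes", "Cold"),
]

# Table 1.2: patients 11 to 13, diagnosis unknown.
UNKNOWN = [
    ("No",  "No",  "Yes", "Yes", "Yes"),
    ("Yes", "Yes", "No",  "No",  "Yes"),
    ("No",  "No",  "No",  "No",  "Yes"),
]


def entropy(rows):
    """Shannon entropy of the diagnosis column (last element of each row)."""
    counts = Counter(row[-1] for row in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def split(rows, index):
    """Group rows by the value they hold at the given attribute index."""
    groups = {}
    for row in rows:
        groups.setdefault(row[index], []).append(row)
    return groups


def build_tree(rows, indices):
    """Return a class label (leaf) or a tuple (attribute index, branches)."""
    labels = Counter(row[-1] for row in rows)
    if len(labels) == 1 or not indices:
        return labels.most_common(1)[0][0]

    def remainder(i):
        # Weighted entropy left after splitting on attribute i (lower is better).
        return sum(len(g) / len(rows) * entropy(g) for g in split(rows, i).values())

    best = min(indices, key=remainder)
    rest = [i for i in indices if i != best]
    branches = {value: build_tree(group, rest) for value, group in split(rows, best).items()}
    return (best, branches)


def classify(tree, instance):
    """Follow branches until a leaf label is reached.

    Raises KeyError for an attribute value never seen during training.
    """
    while isinstance(tree, tuple):
        index, branches = tree
        tree = branches[instance[index]]
    return tree


tree = build_tree(TRAINING, list(range(len(ATTRIBUTES))))
for patient_id, instance in zip((11, 12, 13), UNKNOWN):
    print(patient_id, classify(tree, instance))
```

On these ten training rows the tree tests swollen glands first and fever second, which is enough to separate the three diagnoses. A different learner could classify the same three patients differently, so the printed labels are the output of this particular sketch rather than the lecture's answer.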

Unsupervised Clustering
A data mining method that builds models from data without predefined classes.
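As a rough illustration of building a model without predefined classes, the sketch below groups unlabeled points with k-means. The slides do not name a clustering algorithm, so k-means is an assumption made for this example. The points, the choice of k = 2, and the function names `sq_dist` and `kmeans` are also hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain k-means, seeded with the first k points for repeatability."""
    centroids = [points[i] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the model has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels, centroids


# Hypothetical unlabeled instances: no class column is supplied.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The cluster numbers carry no meaning of their own. Unlike the diagnosis column in supervised learning, it is left to the analyst to interpret what each group represents.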

The Acme Investors Dataset

Supervised Learning
1. Can I develop a general profile of an online investor?
2. Can I determine if a new customer is likely to open a margin account?
3. Can I build a model to predict the average number of trades per month for a new investor?
4. What characteristics differentiate female and male investors?

The Acme Investors Dataset & Unsupervised Clustering
1. What attribute similarities group customers of Acme Investors together?
2. What differences in attribute values segment the customer database?

Data Mining Strategies
[Figure: diagram of data mining strategies. Labels in the figure: Supervised Learning, Unsupervised Clustering, Market Basket Analysis, Classification, Estimation, Prediction, Description.]

Data Mining Strategies: Classification
• Learning is supervised.
• The dependent variable is categorical.
• Well-defined classes.
• Current rather than future behavior.