Data Mining - Advanced Database System - Lecture Slides, Slides of Database Management Systems (DBMS)

Damodaram Sanjivayya National Law University Database Management Systems (DBMS)

Some concepts of Advanced Database Systems are: Types Supported, Simple Data Model, Concurrency Control Two, Continuously Adaptive, Cost-Based Optimization, Data Access From Disks, Data Warehousing. Main points of this lecture are: Data Mining, Subsidiary Issues, Data Cleansing, Visualization, Warehousing of Data, Megabyte, Bogus Data, Decision Trees, Clusters, Hidden-Markov.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

dhanapati 🇮🇳

Advanced Database Systems

Data Mining

What is Data Mining?

Discovery of useful, possibly unexpected, patterns in data.
Subsidiary issues:
- Data cleansing: detection of bogus data.
  - E.g., age = 150.
- Visualization: something better than megabyte files of output.
- Warehousing of data (for retrieval).

Example: Clusters

[Figure: scatter plot of points, each marked x, that fall into visible clusters.]

Example: Frequent Itemsets

A common marketing problem: examine what people buy together to discover patterns.

What pairs of items are unusually often found together at Kroger checkout?

Answer: diapers and beer.

What books are likely to be bought by the same Amazon customer?

Rhine Paradox --- (1)

Joseph Rhine was a parapsychologist in the 1950s who hypothesized that some people had Extra-Sensory Perception (ESP).
He devised an experiment where subjects were asked to guess the color, red or blue, of 10 hidden cards.
He discovered that almost 1 in 1000 had ESP:
- they were able to get all 10 right!
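
As a quick check on the "almost 1 in 1000" figure, the sketch below computes how often a subject would get all 10 cards right by luck alone. It assumes each guess is an independent 50/50 choice, which the slide does not state explicitly; the function name is illustrative.

```python
from fractions import Fraction

# Assumption: each red/blue guess is an independent fair coin flip.
p_single = Fraction(1, 2)

# Probability of guessing all 10 hidden cards correctly by chance.
p_all_ten = p_single ** 10


def expected_perfect_scores(num_subjects: int) -> Fraction:
    """Expected number of subjects who get all 10 right by luck alone."""
    return num_subjects * p_all_ten


print(p_all_ten)                      # 1/1024
print(expected_perfect_scores(1024))  # 1
```

Under that assumption, about one subject in every 1024 produces a perfect score without any special ability.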

Rhine Paradox --- (2)

He told these people they had ESP and called them in for another test of the same type.
Alas, he discovered that almost all of them had lost their ESP.
What did he conclude?
- Answer on next slide.

“Association Rules”

- Market Baskets
- Frequent Itemsets
- A-priori Algorithm

The Market-Basket Model

A large set of items, e.g., things sold in a supermarket.
A large set of baskets, each of which is a small set of the items, e.g., the things one customer buys on one day.

Support

Simplest question: find sets of items that appear “frequently” in the baskets.
Support for itemset I = the number of baskets containing all items in I.
Given a support threshold s, sets of items that appear in at least s baskets are called frequent itemsets.
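
The support count defined above can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function name `support` and the toy baskets are made up for the example.

```python
def support(itemset, baskets):
    """Count the baskets that contain every item in `itemset`."""
    required = set(itemset)
    return sum(1 for basket in baskets if required <= set(basket))


# Hypothetical toy data, invented for illustration (not from the slides).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support({"bread", "milk"}, baskets))  # 2
print(support({"bread"}, baskets))          # 3
```

The subset test `required <= set(basket)` is what "contains all items in I" means in code.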

Example

Items = {milk, coke, pepsi, beer, juice}.
Support threshold = 3 baskets.
- B1 = {m, c, b}
- B2 = {m, p, j}
- B3 = {m, b}
- B4 = {c, j}
- B5 = {m, p, b}
- B6 = {m, c, b, j}
- B7 = {c, b, j}
- B8 = {b, c}
Frequent itemsets: {m}, {c}, {b}, {j}, {m, b}, {c, b}, {j, c}.
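
The listed result can be reproduced by brute force. The sketch below enumerates every candidate itemset over the five items and keeps those contained in at least 3 baskets; "at least" is an assumption about how the threshold is applied, chosen because it is what the listed result implies ({j, c} appears in exactly 3 baskets). Exhaustive enumeration only works because the example is tiny, and the function name is illustrative.

```python
from itertools import combinations

# The eight baskets from the example (m=milk, c=coke, p=pepsi, b=beer, j=juice).
baskets = [
    {"m", "c", "b"},       # B1
    {"m", "p", "j"},       # B2
    {"m", "b"},            # B3
    {"c", "j"},            # B4
    {"m", "p", "b"},       # B5
    {"m", "c", "b", "j"},  # B6
    {"c", "b", "j"},       # B7
    {"b", "c"},            # B8
]
items = ["m", "c", "p", "b", "j"]
threshold = 3


def frequent_itemsets(baskets, items, threshold):
    """Return {itemset: support} for every itemset with support >= threshold."""
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(sorted(items), size):
            count = sum(1 for basket in baskets if set(candidate) <= basket)
            if count >= threshold:
                result[frozenset(candidate)] = count
    return result


found = frequent_itemsets(baskets, items, threshold)
for itemset, count in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

No itemset of three or more items reaches the threshold here, which is why the list stops at pairs.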

Applications --- (2)

“Baskets” = documents; “items” = words in those documents.
- Lets us find words that appear together unusually frequently, i.e., linked concepts.
“Baskets” = sentences; “items” = documents containing those sentences.
- Items that appear together too often could represent plagiarism.
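
To illustrate the first mapping (documents as baskets, words as items), here is a minimal sketch that counts how many documents each pair of words shares. The three documents are invented for the example, and splitting on whitespace is a simplifying assumption; real text would need proper tokenization.

```python
from collections import Counter
from itertools import combinations

# Hypothetical documents, invented for illustration.
documents = [
    "data mining finds patterns",
    "mining data in a warehouse",
    "patterns in market baskets",
]

# Each document becomes one basket: the set of distinct words it contains.
baskets = [set(doc.lower().split()) for doc in documents]

# Count, for every pair of words, the number of baskets containing both.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts[("data", "mining")])  # 2
```

Pairs are stored in sorted order so that ("data", "mining") and ("mining", "data") are counted as the same itemset.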

Applications --- (3)

“Baskets” = Web pages; “items” = linked pages.
- Pairs of pages with many common references may be about the same topic.
“Baskets” = Web pages p; “items” = pages that link to p.
- Pages with many of the same links may be mirrors or about the same topic.

Scale of Problem

WalMart sells 100,000 items and can store billions of baskets.
The Web has over 100,000,000 words and billions of pages.
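
A rough back-of-the-envelope calculation shows why these numbers matter: even counting only pairs of items gives billions of candidates. The sketch below uses the 100,000-item figure from the slide; the 4 bytes per counter is an assumption made purely for illustration.

```python
import math

num_items = 100_000                  # figure quoted on the slide
num_pairs = math.comb(num_items, 2)  # number of distinct item pairs

bytes_per_counter = 4                # assumption: one 4-byte integer per pair
memory_gb = num_pairs * bytes_per_counter / 1e9

print(num_pairs)            # 4999950000
print(round(memory_gb, 1))  # 20.0
```

Under that assumption, a naive table with one counter per pair would need roughly 20 GB before any basket is read.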

Association Rules

If-then rules about the contents of baskets.
$\{i_1, i_2, \ldots, i_k\} \to j$ means: “if a basket contains all of $i_1, \ldots, i_k$, then it is likely to contain $j$.”
Confidence of this association rule is the probability of $j$ given $i_1, \ldots, i_k$.
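
One common way to estimate this probability from data is to divide the number of baskets containing all of the left-hand items plus j by the number containing the left-hand items alone. The sketch below applies that estimate to the eight baskets from the earlier example; the function names and the choice of the rule {m, b} → c are illustrative, not taken from the slides shown here.

```python
def support(itemset, baskets):
    """Count the baskets that contain every item in `itemset`."""
    required = set(itemset)
    return sum(1 for basket in baskets if required <= basket)


def confidence(antecedent, consequent, baskets):
    """Estimate P(consequent | antecedent) as a ratio of support counts."""
    base = support(antecedent, baskets)
    if base == 0:
        return None  # rule is undefined when no basket contains the antecedent
    return support(set(antecedent) | {consequent}, baskets) / base


# Baskets B1..B8 from the earlier example (m=milk, c=coke, p=pepsi, b=beer, j=juice).
baskets = [
    {"m", "c", "b"}, {"m", "p", "j"}, {"m", "b"}, {"c", "j"},
    {"m", "p", "b"}, {"m", "c", "b", "j"}, {"c", "b", "j"}, {"b", "c"},
]

print(confidence({"m", "b"}, "c", baskets))  # 0.5
```

Four baskets contain both m and b, and two of those also contain c, so the estimated confidence of {m, b} → c is 2/4.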
