Data Analysis - Database Design - Lecture Slides, Slides of Database Management Systems (DBMS)

These lecture slides are easy to follow and help build a conceptual foundation in computing and database design. The key points in these slides are: Data Analysis, Data Mining, Subsets and Relations, Data Mining Subtypes, Data Dredging, Metadata, Process of Scanning, Data Set for Relations, Multidimensional Dot Product, Component of Data Mining, Clustering.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

arunima


Data Mining vs. Data Analysis

  • In terms of software and the marketing thereof, Data Mining != Data Analysis.
  • Data Mining implies that the software applies some intelligence, beyond simple grouping and partitioning of data, to infer new information.
  • Data Analysis is more in line with standard statistical software (e.g., web stats). Such software usually presents information about subsets and relations within the recorded data set (e.g., browser/search engine usage, average visit time).
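To make the "Data Analysis" side concrete, here is a minimal Python sketch of the kind of summary a web stats package might report. The visit log, the field layout, and the function names are all hypothetical, chosen only for illustration. Note that nothing is inferred here; the code only groups and averages what was recorded.

```python
from collections import Counter
from statistics import mean

# Hypothetical visit log: (browser, visit duration in seconds).
VISITS = [
    ("Firefox", 120),
    ("Chrome", 45),
    ("Firefox", 300),
    ("Safari", 60),
    ("Chrome", 75),
]


def browser_share(visits):
    """Return the fraction of visits made with each browser."""
    counts = Counter(browser for browser, _ in visits)
    total = sum(counts.values())
    return {browser: count / total for browser, count in counts.items()}


def average_visit_time(visits):
    """Return the mean visit duration in seconds."""
    return mean(duration for _, duration in visits)


if __name__ == "__main__":
    print(browser_share(VISITS))
    print(average_visit_time(VISITS))
```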

Data Mining Subtypes

  • Data Dredging: The process of scanning a data set for relations and then coming up with a hypothesis for the existence of those relations.
  • Metadata: Data that describes other data. It can describe an individual element or a collection of elements. Wikipedia example: “In a library, where the data is the content of the titles stocked, metadata about a title would typically include a description of the content, the author, the publication date and the physical location.”
  • Applications of Data Dredging in business include market and risk analysis, as well as trading strategies.
  • Applications in science include disaster prediction.
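As a rough illustration of Data Dredging as defined above, the following Python sketch scans every pair of columns in a small data set and reports the pairs that are strongly correlated. The data, the column names, and the 0.9 threshold are assumptions made up for this example, and Pearson correlation is just one possible measure of a "relation".

```python
from itertools import combinations
from math import sqrt

# Hypothetical data set: each column holds one measurement per store.
DATA = {
    "ad_spend": [10, 20, 30, 40, 50],
    "sales": [12, 24, 33, 46, 52],
    "rainfall": [5, 3, 8, 1, 6],
}


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def dredge(data, threshold=0.9):
    """Scan every pair of columns; return those with |r| >= threshold."""
    found = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            found.append((a, b, round(r, 3)))
    return found


if __name__ == "__main__":
    for a, b, r in dredge(DATA):
        print(f"{a} and {b} appear related (r = {r})")
```

A pair flagged this way is only a candidate hypothesis. With enough columns, some pairs will correlate by chance, so a relation found by dredging should be checked against data that was not used to find it.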

Key Component of Data Mining

  • Whether Knowledge Discovery or Knowledge Prediction, data mining takes information that was once quite difficult to detect and presents it in an easily understandable format (e.g., graphical or statistical).
  • Data mining techniques involve sophisticated algorithms, including Decision Tree Classification, Association Detection, and Clustering.
  • Since data mining is not on the test, I will keep things superficial.
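In keeping with the superficial treatment, here is a minimal sketch of Clustering, one of the techniques named above. It uses k-means (Lloyd's algorithm) on one-dimensional data. The function name, the sample values, and the starting centers are assumptions for illustration; the slides do not say which clustering algorithm they have in mind.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers (Lloyd's algorithm)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Converged: no center moved.
        centers = new_centers
    return centers, clusters


if __name__ == "__main__":
    # Hypothetical measurements that fall into two obvious groups.
    print(kmeans_1d([1, 2, 3, 10, 11, 12], centers=[0, 5]))
```

The groups are not given in advance; the algorithm discovers them from the data, which is what separates this from the simple grouping done in data analysis.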

Uses of Data Mining

  • AI/Machine Learning
    Combinatorial/Game Data Mining: Good for analyzing winning strategies in games, and thus for developing intelligent AI opponents (e.g., chess).
  • Business Strategies
    Market Basket Analysis: Identify customer demographics, preferences, and purchasing patterns.
  • Risk Analysis
    Product Defect Analysis: Analyze product defect rates for given plants and predict possible complications (read: lawsuits) down the line.
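The purchasing-pattern side of Market Basket Analysis can be sketched with simple association detection: count how often pairs of items appear together, then report rules of the form "customers who buy X also buy Y". The baskets, the function name, and the 0.5 support threshold below are hypothetical, and real systems typically use more scalable algorithms than this brute-force pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        # Support: fraction of all baskets containing both items.
        support = count / n
        if support < min_support:
            continue
        # Confidence of "x -> y": fraction of baskets with x that also have y.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)


if __name__ == "__main__":
    for x, y, support, confidence in pair_rules(BASKETS):
        print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, every basket with butter also contains bread, so the rule "butter -> bread" has confidence 1.0, while "bread -> butter" is weaker because bread is also bought without butter.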

Uses of Data Mining (Continued)

  • Health and Science
    Protein Folding: Predicting protein interactions and functionality within biological cells. Applications of this research include determining causes of, and possible cures for, Alzheimer's, Parkinson's, and some cancers (caused by protein "misfolds").
    Extra-Terrestrial Intelligence: Scanning satellite receptions for possible transmissions from other planets.
  • For more information, see Stanford’s Folding@home and UC Berkeley’s SETI@home projects. Both rely on volunteers running a widely distributed computing application.

Sources of Data for Mining

• Databases (most obvious)

• Text Documents

• Computer Simulations

• Social Networks

Prevalence of Data Mining

  • Your data is already being mined, whether you like it or not.
  • Many web services require that you allow access to your information [for data mining] in order to use the service.
  • Google mines email data in Gmail accounts to present account owners with ads.
  • Facebook requires users to allow access to information from non-Facebook pages. Facebook privacy policy: "We may use information about you that we collect from other sources, including but not limited to newspapers and Internet sources such as blogs, instant messaging services and other users of Facebook, to supplement your profile."
  • This allows access to your blog RSS feed (rather innocuous), as well as information obtained through partner sites (worthy of concern).

Data Mining Controversies

  • Latest one: Facebook's Beacon advertising program (it just popped up on Slashdot within the last week).
  • What Beacon does: “when you engage in consumer activity at a [Facebook] partner website, such as Amazon, eBay, or the New York Times, not only will Facebook record that activity, but your Facebook connections will also be informed of your purchases or actions.” [taken from http://trickytrickywhiteboy.blogspot.com/2007/11/beware-of-facebooks-beacon.html]

Bottom Line

• Data obtained through data mining is incredibly valuable.
• Companies are understandably reluctant to give up data they have obtained.
• Expect the prevalence of data mining, and of (possibly subversive) methods of doing it, to increase in the years to come.