Intelligent Data Analysis: Understanding IDA, Tools, and Techniques, Slides of Public Health

Explore intelligent data analysis (IDA), an interdisciplinary field focused on effective data analysis for extracting valuable information and knowledge from large datasets. Learn about IDA concepts, tools like See5, Cubist, ILLM, and Magnum Opus, and techniques for discovering rules hidden in data.

Typology: Slides

2012/2013

Uploaded on 11/21/2013

super-malik

Intelligent Data Analysis (IDA)

Interest and Excitement for Intelligent Data Analysis

  • Decision making requires information and knowledge
  • Data processing can provide them
  • The multidimensionality of problems calls for methods of adequate and deep data processing and analysis

Learning Objectives

  • To understand the concept of IDA
  • To get acquainted with websites and literature on IDA
  • To get acquainted with some tools for IDA
  • To learn how to use IDA tools and to validate the IDA results

Performance Objectives

  • Recognize problems that call for IDA
  • Prepare data and carry out the analysis
  • Validate and interpret the results of IDA

IDA is…

  • … an interdisciplinary study concerned with the effective analysis of data;
  • … used for extracting useful information from large quantities of online data; extracting desirable knowledge or interesting patterns from existing databases;

IDA or …

  • Data mining
  • Knowledge acquisition from data
  • Genetic algorithm-based rule discovery
  • Knowledge discovery
  • Learning classifier system
  • Machine learning
  • etc.

IDA gives knowledge …

Knowledge is …

  • the distillation of information that has been collected, classified, organized, integrated, abstracted and value-added;
  • at a level of abstraction higher than the data and information on which it is based, and can be used to deduce new information and new knowledge;
  • usually in the context of human expertise used in solving problems.

Knowledge acquisition …

The process of eliciting, analyzing, transforming, classifying, organizing and integrating knowledge and representing that knowledge in a form that can be used in a computer system.

Knowledge in a domain can be expressed as a number of rules

Rule is …

A formal way of specifying a recommendation, directive, or strategy, expressed as "IF premise THEN conclusion" or "IF condition THEN action".
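The "IF premise THEN conclusion" form can be sketched in a few lines of Python. This is only an illustration: the `Rule` class, the attribute names and the "high risk" conclusion are hypothetical and do not come from the slides or from any IDA tool.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Case = Dict[str, Any]  # one case: attribute name -> value


@dataclass
class Rule:
    """A rule of the form IF premise THEN conclusion."""
    premise: Callable[[Case], bool]
    conclusion: str

    def apply(self, case: Case) -> Optional[str]:
        # Return the conclusion only when the premise holds for this case.
        return self.conclusion if self.premise(case) else None


# Hypothetical rule, invented for illustration only.
rule = Rule(
    premise=lambda c: c["smoking"] == "Yes" and c["age"] > 60,
    conclusion="high risk",
)

print(rule.apply({"smoking": "Yes", "age": 66}))  # high risk
print(rule.apply({"smoking": "No", "age": 61}))   # None
```

Keeping the premise as a separate predicate mirrors the two-part structure of the rule: a tool that discovers rules only has to produce the premise and the conclusion.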

How to discover rules hidden in the data?

Some tools for IDA …

See5

  • program for analyzing data and generating classifiers in the form of decision trees and/or rule sets.

Some tools for IDA …

Cubist

  • analyzes data and generates rule-based piecewise linear models – collections of rules, each with an associated linear expression for computing a target value.

Some tools for IDA …

ILLM

  • the tool constructs classification models in the form of rules which represent knowledge about relations hidden in data.

Some tools for IDA …

Magnum Opus

  • finds association rules providing competitive advantage by revealing underlying interactions between factors within the data.

Evaluation of IDA results

  • Absolute & relative accuracy
  • Sensitivity & specificity
  • False positive & false negative
  • Error rate
  • Reliability of rules
  • Etc.
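Several of these measures follow directly from the four counts of a two-class confusion matrix. The sketch below uses the standard definitions; the function name `evaluate` and the counts passed to it are made up for illustration and are not results from the slides. It assumes both classes occur among the test cases, so that no denominator is zero.

```python
def evaluate(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Basic measures from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of correctly classified cases
        "error_rate": (fp + fn) / total,   # share of misclassified cases
        "sensitivity": tp / (tp + fn),     # true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
    }


# Made-up counts, for illustration only.
measures = evaluate(tp=40, fp=10, tn=45, fn=5)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Accuracy and error rate always sum to 1, while sensitivity and specificity look at each class separately, which matters when the classes are unbalanced.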

Example of IDA

Illustration of IDA by using See5

See5…application…

application.names

  • lists the classes to which cases may belong and the attributes used to describe each case.

Attributes are of two types:

  • discrete attributes have a value drawn from a set of possibilities, and
  • continuous attributes have numeric values.

See5…application…

application.data

  • provides information on the training cases from which See5 will extract patterns.

The entry for each case consists of one or more lines that give the values for all attributes.

See5…application…

application.test

  • provides information on the test cases (used for evaluation of results).

The entry for each case consists of one or more lines that give the values for all attributes.

See5…application…example…

Epidemiological study (1970-1990)

  • Sample of examinees who died from cardiovascular diseases during the period
  • Question: Did they know they were ill?

  1 – they were healthy
  2 – they were ill (drug treatment, positive clinical and laboratory findings)

See5…application…example…

application.names (example)

Goal.
gender: M,F
activity: 1,2,3
age: continuous
smoking: No,Yes
…
Goal: 1,2
…

See5…application…example…

application.data (example)

M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5
M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6
M,1,61,No,0,0,0,0,130,79,148,86,209,115,21719,12324,10593,11458,1,74,2.5
…
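To check such a file before running the analysis, the cases can be read with a few lines of Python. The sketch below is not part of See5. It assumes one case per line with comma-separated values, as in the three sample cases, and treats "?" as an unknown value; the function name `read_cases` is hypothetical.

```python
import csv
import io

# The three sample cases from the slide.
SAMPLE = """\
M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5
M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6
M,1,61,No,0,0,0,0,130,79,148,86,209,115,21719,12324,10593,11458,1,74,2.5
"""


def read_cases(text: str) -> list:
    """Split each line into values; an unknown value ("?") becomes None."""
    cases = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        cases.append([None if v.strip() == "?" else v.strip() for v in row])
    return cases


cases = read_cases(SAMPLE)
for case in cases:
    missing = sum(v is None for v in case)
    print(f"{len(case)} values, {missing} unknown")
```

A quick count like this shows whether every case has the same number of values and how many unknown values each case carries.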

See5…application…example…

Results – example

Rule 1: (cover 26)
    gender = M
    SBP > 111
    oil_fat > 2.9
    -> class 1 [0.929]
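The value in brackets is the confidence of the rule. The See5 documentation describes it as the Laplace ratio (n - m + 1) / (n + 2), where n is the number of training cases the rule covers and m is the number of those cases that do not belong to the predicted class. The slide gives only the cover (26); m = 1 is an assumption made here because it reproduces the printed value.

```python
def laplace_confidence(n: int, m: int) -> float:
    """Laplace ratio: n cases covered by the rule, m of them misclassified."""
    return (n - m + 1) / (n + 2)


# n = 26 is the cover shown on the slide; m = 1 is an assumption.
print(round(laplace_confidence(26, 1), 3))  # 0.929
```

With n = 26, no other whole value of m gives 0.929: m = 0 yields 0.964 and m = 2 yields 0.893.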