


Abstract—Rapid technological innovation has produced a tremendous amount of
data worldwide in domains such as Pattern Recognition, Machine Learning,
Spatial Data Mining, Image Analysis, Fraud Analysis, and the World Wide Web.
This growth makes it essential to develop tools for data mining
functionalities. The major aim of this paper is to analyze various tools used
to build analytical or descriptive models that handle large amounts of
information efficiently and in a user-friendly way. The survey describes each
tool in terms of its technical paradigm, graphical interface, and built-in
multipath algorithms, which make it useful for handling significant amounts of
data.

Keywords—Classification, Clustering, Data Mining, Machine

learning, Visualization

I. INTRODUCTION

THE domain of data mining and knowledge discovery has grown tremendously over
time in research fields such as Pattern Recognition, Information Retrieval,
Medicine, Image Processing, Spatial Data Extraction, Business, and Education.
Data mining seeks to originate, analyze, extract, and implement fundamental
induction processes that facilitate the mining of meaningful information and
useful patterns from huge volumes of unstructured data. The data mining
paradigm mainly uses complex algorithms and mathematical analysis to derive
the patterns and trends that exist in data. The main aim of data mining
techniques is to build effective predictive and descriptive models of enormous
amounts of data. Several real-world data mining problems involve numerous
conflicting measures of performance, or objectives, that need to be optimized
simultaneously. The most distinctive feature of data mining is that it deals
with huge and complex datasets whose volume ranges from gigabytes to
terabytes. This requires data mining operations and algorithms to be robust,
stable, and scalable, and able to work across different research domains.
Hence the various data mining tasks play a crucial role in every aspect of
information extraction, which in turn has led to the emergence of several data
mining tools. From a pragmatic perspective, tools with a graphical interface
tend to be more efficient, user-friendly, and easier to operate, and are
therefore highly preferred by researchers [1].

Mrs. S. Sarumathi, Associate Professor, is with the Department of
Information Technology, K. S. Rangasamy College of Technology, Tamil
Nadu, India (phone: 9443321692; e-mail: rishi_saru20@rediffmail.com).

Dr. N. Shanthi, Professor and Dean, is with the Department of Computer
Science Engineering, Nandha Engineering College, Tamil Nadu, India
(e-mail: [email protected]).

Fig. 1 A Data Mining Framework

The relationships between the elements of the framework carry data modeling
notation indicating the cardinality, 1 or m, of each relationship. For readers
with only minimal familiarity with data modeling notation, the relationships
can be read as follows:

• A business problem can be studied through more than one class of
modeling approach, and a class of modeling approach is useful for
multiple business problems.

• More than one method is helpful for any class of model, and any
given method can be used for more than one class of model.

• There is normally more than one way of implementing any given
method.

• A data mining tool may support more than one method, and every
method is supported by more than one vendor's products.

• For every given method, a particular product supports a
particular implementation algorithm [2].
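
The cardinalities listed above can be sketched as a small data model. This is only an illustrative sketch, not part of the framework in [2]: the class name `Framework`, its attributes, and every problem, method, product, and algorithm label below are hypothetical placeholders chosen to show the many-to-many links and the single algorithm per (method, product) pair.

```python
from collections import defaultdict


class Framework:
    """Illustrative container for the relationships described in the list."""

    def __init__(self):
        # Many-to-many links are stored as sets keyed by the "left" element.
        self.models_for_problem = defaultdict(set)   # problem -> model classes
        self.methods_for_model = defaultdict(set)    # model class -> methods
        self.products_for_method = defaultdict(set)  # method -> products
        self.algorithms_for_method = defaultdict(set)  # method -> algorithms
        # A given product implements a given method with one algorithm.
        self.algorithm = {}                          # (method, product) -> algorithm

    def link(self, problem, model_class, method, product, algorithm):
        """Record one path from a business problem down to an algorithm."""
        self.models_for_problem[problem].add(model_class)
        self.methods_for_model[model_class].add(method)
        self.products_for_method[method].add(product)
        self.algorithms_for_method[method].add(algorithm)
        self.algorithm[(method, product)] = algorithm


# Hypothetical entries, used only to exercise the structure.
fw = Framework()
fw.link("churn", "classification", "decision tree", "ToolA", "algorithm-1")
fw.link("churn", "clustering", "k-means", "ToolA", "algorithm-2")
fw.link("fraud", "classification", "decision tree", "ToolB", "algorithm-3")
```

In this sketch one problem maps to two model classes, one method is offered by two products, and each (method, product) pair resolves to exactly one algorithm, mirroring the 1 and m cardinalities.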

II. DIFFERENT DATA MINING TOOLS

A. DATABIONIC

The Databionic Emergent Self-Organizing Map (ESOM) tool [3] is a collection
of programs for data mining tasks such as visualization, clustering, and
classification. The training data is a collection of points from a
high-dimensional space known as the data space. A SOM contains a collection
of prototype vectors in the data space plus a topology among these
prototypes. A commonly used topology is a 2-dimensional grid in which every
prototype (neuron) has four direct neighbors; the locations on the grid form
the map space. Additionally, two distance functions are necessary, one for
each space. Euclidean
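
The structure described in this paragraph (prototype vectors, a grid topology, and one distance function per space) can be illustrated with a minimal self-organizing map in plain Python. This is a sketch under stated assumptions, not the Databionic implementation: Euclidean distance is assumed in both the data space and the map space, the grid is bounded (so border cells have fewer than four direct neighbors), and the function names and the values of `lr`, `radius`, and `epochs` are hypothetical choices.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_matching_unit(som, x):
    """Grid position whose prototype is nearest to x in the data space."""
    return min(som, key=lambda pos: euclidean(som[pos], x))


def train(data, rows, cols, epochs=20, lr=0.5, radius=1.0, seed=0):
    """Fit a rows x cols SOM; lr, radius and epochs are illustrative values."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One prototype vector per grid cell; the (row, col) keys are the map space.
    som = {(r, c): [rng.random() for _ in range(dim)]
           for r in range(rows) for c in range(cols)}
    for _ in range(epochs):
        for x in data:
            bmu = best_matching_unit(som, x)
            for pos in som:
                d = euclidean(pos, bmu)  # distance measured on the grid
                if d <= radius:
                    # Update strength: lr at the winner, smaller for neighbours.
                    h = lr * math.exp(-(d * d) / (2 * radius * radius))
                    som[pos] = [p + h * (xi - p) for p, xi in zip(som[pos], x)]
    return som


som = train([[0.0, 0.0], [1.0, 1.0]], rows=3, cols=3)
```

With `radius=1.0`, each training step moves only the winning prototype and its direct grid neighbors toward the input, which is where the grid topology enters the update.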

S. Sarumathi, N. Shanthi

Comprehensive Analysis of Data Mining Tools

World Academy of Science, Engineering and Technology

International Journal of Computer and Information Engineering

Vol:9, No:3, 2015

837 International Scholarly and Scientific Research & Innovation 9(3) 2015 ISNI:0000000091950263

Open Science Index, Computer and Information Engineering Vol:9, No:3, 2015 publications.waset.org/10002609/pdf


B. ELKI

REFERENCES