




Cluster analysis, a fundamental method in data mining, groups similar data points into clusters, revealing hidden patterns within the data. It is crucial for exploratory data analysis. Different algorithms, like k-means, hierarchical clustering, and density-based clustering, can be used based on the analysis requirements and data nature.
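As an illustration of grouping similar data points without labels, the following is a minimal k-means sketch in plain Python. It is not taken from the notes: the `kmeans` function, the farthest-point seeding rule, and the `vehicles` records (weight in kg, number of seats) are all hypothetical choices made for this example.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain k-means (Lloyd's algorithm) with deterministic seeding."""
    # Seeding (an assumption of this sketch): start from the first point,
    # then repeatedly add the point farthest from the centroids chosen so far,
    # so the result is reproducible without random numbers.
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical, unlabeled vehicle records: (weight in kg, number of seats).
vehicles = [
    (12, 1), (15, 1), (10, 1),                 # bicycle-like
    (1300, 5), (1500, 5), (1450, 4),           # car-like
    (12000, 50), (11800, 48), (12200, 52),     # bus-like
]

labels, centroids = kmeans(vehicles, k=3)
print(labels)  # one cluster index per record
```

The cluster indices carry no meaning by themselves; names such as "car" or "bus" would be attached afterwards by inspecting each group. The features here are left unscaled, so weight dominates the distance; in practice features are usually standardized first.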





Aashish R. Dandekar
Department of CSE(DS)

Cluster Analysis

Cluster Analysis: An Overview

Cluster analysis, also referred to as clustering, is a fundamental data mining method for grouping similar data points. Its primary objective is to partition a dataset into clusters such that the data points within a cluster are more similar to each other than to those in other clusters. The technique is pivotal in exploratory data analysis because it helps identify hidden patterns or relationships within the data.

Various algorithms can be employed for cluster analysis, including k-means, hierarchical clustering, and density-based clustering. The choice of algorithm depends on the specific requirements of the analysis and the nature of the data.

Cluster analysis is a form of unsupervised learning: it works on unlabeled data. A cluster is a group of similar data points. For instance, consider a dataset containing information on different types of vehicles, such as cars, buses, and bicycles. Because this is unsupervised learning, the dataset has no predefined class labels. Cluster analysis can organize this unlabeled data into groups, such as a cluster for cars, a cluster for buses, and so on, and each group can then be given a label.

The essence of cluster analysis lies in organizing data points into clusters of similar objects. This makes it particularly useful for converting unlabeled data into labeled data, which facilitates further analysis.

Properties of Clustering
Types of Data:
Grid-Based Method for Distance-Based Outlier Detection

Grid-Based Outlier Detection involves: