K-Means Clustering Machine Learning Algorithm, Lab Reports of Machine Learning

K-Means clustering is a popular unsupervised machine learning algorithm that partitions a dataset into K clusters based on the distance between data points, where K is a user-defined parameter. It is commonly used for data analysis, image segmentation, and recommendation systems. The algorithm starts from K initial centroids and then alternates between two steps, assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points, until the assignments stop changing.
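The two alternating steps of K-Means (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched in plain Python. This is a minimal illustration, not the code used in this lab report: the function name `kmeans`, the toy points, and the choice of starting centroids are all assumptions made for the example.

```python
import math


def kmeans(points, init_centroids, max_iter=100):
    """Plain K-Means iteration on a list of (x, y) tuples (illustrative helper)."""
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical toy data: two well-separated groups of three points each
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids = kmeans(points, init_centroids=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(centroids)
```

The result depends on the starting centroids; library implementations typically run several initialisations and keep the best one.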

Typology: Lab Reports

2023/2024

Available from 10/27/2023

faizanshaik-official

In [35]:
# k means clustering
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [37]:
data = pd.read_excel("C:/Users/kriti/OneDrive/Desktop/machine Learning/experiments/C

In [15]:
data

Out[15]:
    Variable_1  Variable_2
0           12          30
1           20          36
2           28          30
3           18          52
4           29          54
5           33          46
6           24          55
7           45          59
8           45          63
9           52          70
10          51          66
11          52          63
12          55          58
13          53          23
14          55          14
15          61           8
16          64          19
17          69           7
18          72          24


In [17]:
# scatter plot of the raw data, each point annotated with its row index
fig = plt.figure(figsize = (5, 5))
x = data["Variable_1"]
y = data["Variable_2"]
n = range(0, 19)
plt.grid()
plt.scatter(x, y, marker = 'o', c = 'red')
plt.xlabel('Variable_1')
plt.ylabel('Variable_2')
for i, txt in enumerate(n):
    plt.annotate(txt, (x[i], y[i]))

from sklearn.cluster import KMeans

# inertia (within-cluster sum of squared distances) for K = 1, 2, 3
individual_clustering_score = []
for i in range(1, 4):
    kmeans = KMeans(n_clusters = i)
    kmeans.fit(data)
    individual_clustering_score.append(kmeans.inertia_)
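The preview does not show the cell that fits the final model and defines the `labels` and `centroids` variables used by the plotting code that follows. A minimal sketch of what that step might look like is given here, assuming K = 3 (which matches the three centroids printed in Out[22]). The 19 rows of Out[15] are typed in directly so the example runs without the Excel file, and `n_init` and `random_state` are assumptions added for reproducibility, not settings taken from the notebook.

```python
from sklearn.cluster import KMeans

# The 19 (Variable_1, Variable_2) rows shown in Out[15], typed in directly
points = [
    [12, 30], [20, 36], [28, 30], [18, 52], [29, 54], [33, 46], [24, 55],
    [45, 59], [45, 63], [52, 70], [51, 66], [52, 63], [55, 58],
    [53, 23], [55, 14], [61, 8], [64, 19], [69, 7], [72, 24],
]

# Assumed settings: K = 3; n_init and random_state are not from the notebook
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(points)

labels = kmeans.labels_              # cluster index (0, 1 or 2) for each row
centroids = kmeans.cluster_centers_  # array of shape (3, 2), one row per cluster
print(centroids)
```

Cluster numbering is arbitrary, so the rows of `centroids` may appear in a different order from Out[22] even when the clusters themselves are the same.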

In [22]:
# cluster centers
centroids

Out[22]:
array([[23.42857143, 43.28571429],
       [62.33333333, 15.83333333],
       [50.        , 63.16666667]])

In [25]:
fig = plt.figure(figsize = (5, 5))
# dictionary: map cluster numbers to colors
colmap = {1: 'r', 2: 'b', 3: 'g'}
# map assigns a color to each label (labels start at 0, colmap keys at 1)
colors = map(lambda x: colmap[x + 1], labels)
colors1 = list(colors)
plt.scatter(x, y, color = colors1, alpha = 0.5)
# plotting the centroids in the color of their cluster
for idx, centroid in enumerate(centroids):
    plt.scatter(*centroid, color = colmap[idx + 1])
# labeling the points as 0, 1, 2, ...
for i, txt in enumerate(n):
    plt.annotate(txt, (x[i], y[i]))
plt.grid()
