

Using the k-means algorithm, calculate the distance from each data point to every centroid, assign the point to the cluster whose centroid is nearest, and then recalculate the mean value for that cluster center.
Typology: Exercises


Using the K-means Algorithm

Data set: A1(2,10), A2(2,5), A3(8,5), B1(5,8), B2(7,5), B3(6,4), C1(1,2), C2(4,9).
Initial centroids: A1(2,10), B1(5,8), C1(1,2).

Iteration 1

For each data point, calculate its distance to every centroid, assign the point to the cluster whose centroid is nearest, and then update that centroid to the mean of its current value and the newly assigned point.

Distance formulas for two points (x1,y1) and (x2,y2):
Euclidean distance = √((x2-x1)^2 + (y2-y1)^2)
Manhattan distance = |x2-x1| + |y2-y1|
The rows below use the Manhattan form.

Mean formula: ((x1+x2) / 2, (y1+y2) / 2). Mean values are written to one decimal place without rounding up (for example, 6.75 is written as 6.7).

1st row: distances between data point A2 and the centroids A1, B1, C1
Distance between A2(2,5) & A1(2,10) = |2-2| + |5-10| = 0+5 = 5
Distance between A2(2,5) & B1(5,8) = |2-5| + |5-8| = 3+3 = 6
Distance between A2(2,5) & C1(1,2) = |2-1| + |5-2| = 1+3 = 4
The cluster center nearest to A2 is C1. The mean of C1(1,2) and A2(2,5) is (1.5, 3.5), so centroid C1 is updated to (1.5, 3.5).

2nd row: distances between data point A3 and the centroids A1, B1, C1
Distance between A3(8,5) & A1(2,10) = |8-2| + |5-10| = 6+5 = 11
Distance between A3(8,5) & B1(5,8) = |8-5| + |5-8| = 3+3 = 6
Distance between A3(8,5) & C1(1.5,3.5) = |8-1.5| + |5-3.5| = 6.5+1.5 = 8
The cluster center nearest to A3 is B1. The mean of B1(5,8) and A3(8,5) is (6.5, 6.5), so centroid B1 is updated to (6.5, 6.5).

3rd row: distances between data point B2 and the centroids A1, B1, C1
Distance between B2(7,5) & A1(2,10) = |7-2| + |5-10| = 5+5 = 10
Distance between B2(7,5) & B1(6.5,6.5) = |7-6.5| + |5-6.5| = 0.5+1.5 = 2
Distance between B2(7,5) & C1(1.5,3.5) = |7-1.5| + |5-3.5| = 5.5+1.5 = 7
The cluster center nearest to B2 is B1. The mean of B1(6.5,6.5) and B2(7,5) is (6.75, 5.75), written as (6.7, 5.7), so centroid B1 is updated to (6.7, 5.7).

4th row: distances between data point B3 and the centroids A1, B1, C1
Distance between B3(6,4) & A1(2,10) = |6-2| + |4-10| = 4+6 = 10
Distance between B3(6,4) & B1(6.7,5.7) = |6-6.7| + |4-5.7| = 0.7+1.7 = 2.4
Distance between B3(6,4) & C1(1.5,3.5) = |6-1.5| + |4-3.5| = 4.5+0.5 = 5
The cluster center nearest to B3 is B1. The mean of B1(6.7,5.7) and B3(6,4) is (6.35, 4.85), written as (6.3, 4.8), so centroid B1 is updated to (6.3, 4.8).

5th row: distances between data point C2 and the centroids A1, B1, C1
Distance between C2(4,9) & A1(2,10) = |4-2| + |9-10| = 2+1 = 3
Distance between C2(4,9) & B1(6.3,4.8) = |4-6.3| + |9-4.8| = 2.3+4.2 = 6.5
Distance between C2(4,9) & C1(1.5,3.5) = |4-1.5| + |9-3.5| = 2.5+5.5 = 8
The cluster center nearest to C2 is A1. The mean of A1(2,10) and C2(4,9) is (3, 9.5), so centroid A1 is updated to (3, 9.5).

The updated cluster centers are: A1(3, 9.5), B1(6.3, 4.8), C1(1.5, 3.5).

The calculations above are summarized in the following table:

Row   Point      To A1   To B1   To C1   Nearest   Updated centroid
1     A2(2,5)    5       6       4       C1        C1(1.5, 3.5)
2     A3(8,5)    11      6       8       B1        B1(6.5, 6.5)
3     B2(7,5)    10      2       7       B1        B1(6.7, 5.7)
4     B3(6,4)    10      2.4     5       B1        B1(6.3, 4.8)
5     C2(4,9)    3       6.5     8       A1        A1(3, 9.5)

The next iteration starts from these updated cluster centers.
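The first iteration can also be traced in a few lines of Python. This is a minimal sketch that follows the worked rows, not a standard k-means implementation: it assumes the Manhattan distance, updates a centroid immediately after each assignment by averaging it with the newly assigned point, and cuts each mean off after one decimal place. The function names (manhattan, truncate1, first_iteration) are made up for illustration.

```python
import math


def manhattan(p, q):
    """City-block distance |x2 - x1| + |y2 - y1|, as used in the worked rows."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def truncate1(v):
    """Keep one decimal place without rounding up (6.75 -> 6.7)."""
    return math.trunc(v * 10) / 10


def first_iteration(points, centroids, dist=manhattan):
    """Assign each point to its nearest centroid, then average that
    centroid with the point (assumed update rule, taken from the rows)."""
    centroids = dict(centroids)  # work on a copy, leave the input untouched
    log = []
    for name, p in points:
        # Pick the centroid with the smallest distance to this point.
        nearest = min(centroids, key=lambda c: dist(p, centroids[c]))
        cx, cy = centroids[nearest]
        centroids[nearest] = (
            truncate1((cx + p[0]) / 2),
            truncate1((cy + p[1]) / 2),
        )
        log.append((name, nearest, centroids[nearest]))
    return centroids, log


# Data points in the order the rows process them (the three initial
# centroids A1, B1, C1 are not reassigned in the worked example).
points = [
    ("A2", (2, 5)),
    ("A3", (8, 5)),
    ("B2", (7, 5)),
    ("B3", (6, 4)),
    ("C2", (4, 9)),
]
centroids = {"A1": (2, 10), "B1": (5, 8), "C1": (1, 2)}

updated, log = first_iteration(points, centroids)
for name, nearest, value in log:
    print(name, "->", nearest, value)
print("updated centroids:", updated)
```

Run as written, the sketch assigns A2 to C1, then A3, B2 and B3 to B1, and C2 to A1, and ends with the centroids A1(3, 9.5), B1(6.3, 4.8), C1(1.5, 3.5) given above. A conventional batch k-means would instead assign all points first and only then set each centroid to the mean of every point in its cluster, so its centroids can differ from these.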