Data Mining: Clustering and Association Rule Learning | Exams Computer Science

Problem 1:

a) Iteration 1: M1=5 C1={1, 2, 3, 4, 5, 6, 10} M2=20 C2={20, 30, 40, 50, 60}

Iteration 2: M1=4.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=40 C2={30, 40, 50, 60}

Iteration 3: M1=6.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=45 C2={30, 40, 50, 60}

Iteration 4: M1=6.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=45 C2={30, 40, 50, 60}
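As a worked check of the means in this trace, each new mean is the average of the points assigned in the previous iteration. The values 4.4 and 6.4 in the trace appear to be roundings of 31/7 and 51/8. The symbols M_1 and M_2 below simply mirror M1 and M2.

```latex
\begin{align*}
\text{After iteration 1:}\quad
  M_1 &= \frac{1+2+3+4+5+6+10}{7} = \frac{31}{7} \approx 4.43, &
  M_2 &= \frac{20+30+40+50+60}{5} = 40 \\
\text{After iteration 2:}\quad
  M_1 &= \frac{1+2+3+4+5+6+10+20}{8} = \frac{51}{8} = 6.375, &
  M_2 &= \frac{30+40+50+60}{4} = 45
\end{align*}
```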

b) Iteration 1: M1=2 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=50 C2={30, 40, 50, 60}

Iteration 2: M1=6.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=45 C2={30, 40, 50, 60}

Iteration 3: M1=6.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=45 C2={30, 40, 50, 60}

c) Iteration 1: M1=6 C1={1, 2, 3, 4, 5, 6} M2=10 C2={10, 20, 30, 40, 50, 60}

Iteration 2: M1=3.5 C1={1, 2, 3, 4, 5, 6, 10} M2=35 C2={20, 30, 40, 50, 60}

Iteration 3: M1=4.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=40 C2={30, 40, 50, 60}

Iteration 4: M1=6.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=45 C2={30, 40, 50, 60}

Iteration 5: M1=6.4 C1={1, 2, 3, 4, 5, 6, 10, 20} M2=45 C2={30, 40, 50, 60}
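The three k-means traces of Problem 1 can be reproduced with a minimal sketch like the one below. It assumes the initial means are the M1 and M2 values listed under Iteration 1 of each trace, namely (5, 20), (2, 50) and (6, 10). It also assumes the run stops once an iteration repeats the previous row exactly, which is how the traces appear to end. The function name `kmeans_1d` and the `max_iter` guard are illustrative and not from the source.

```python
DATA = [1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60]

def kmeans_1d(points, means, max_iter=100):
    """Return one (means, clusters) row per iteration of 1-D k-means."""
    trace = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest mean.
        # min() returns the first minimum, so a tie would go to the
        # lower-numbered cluster (no ties occur for this data).
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
            clusters[nearest].append(p)
        row = (list(means), clusters)
        trace.append(row)
        # Assumed stopping rule: this row is identical to the previous one.
        if len(trace) > 1 and trace[-2] == row:
            break
        # Update step: recompute each mean (keep the old mean if a cluster is empty).
        means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
    return trace

if __name__ == "__main__":
    for init in ([5, 20], [2, 50], [6, 10]):
        print(f"initial means {init}")
        for n, (means, clusters) in enumerate(kmeans_1d(DATA, init), start=1):
            m1, m2 = (round(m, 1) for m in means)
            print(f"  Iteration {n}: M1={m1} C1={clusters[0]} M2={m2} C2={clusters[1]}")
```

Under these assumptions the three runs take 4, 3 and 5 iterations and all end with C1={1, 2, 3, 4, 5, 6, 10, 20} and C2={30, 40, 50, 60}.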

d) We always break ties in favor of the cluster with the smallest number.

(1) (2) (3) (4) (5) (6) (10) (20) (30) (40) (50) (60)

(1, 2) (3) (4) (5) (6) (10) (20) (30) (40) (50) (60)

((1, 2), 3) (4) (5) (6) (10) (20) (30) (40) (50) (60)

(((1, 2), 3), 4) (5) (6) (10) (20) (30) (40) (50) (60)

((((1, 2), 3), 4), 5) (6) (10) (20) (30) (40) (50) (60)

(((((1, 2), 3), 4), 5), 6) (10) (20) (30) (40) (50) (60)

((((((1, 2), 3), 4), 5), 6), 10) (20) (30) (40) (50) (60)

(((((((1, 2), 3), 4), 5), 6), 10), 20) (30) (40) (50) (60)

((((((((1, 2), 3), 4), 5), 6), 10), 20), 30) (40) (50) (60)

(((((((((1, 2), 3), 4), 5), 6), 10), 20), 30), 40) (50) (60)

((((((((((1, 2), 3), 4), 5), 6), 10), 20), 30), 40), 50) (60)

(((((((((((1, 2), 3), 4), 5), 6), 10), 20), 30), 40), 50), 60)

e) We always break ties in favor of the cluster with the smallest number.

(1) (2) (3) (4) (5) (6) (10) (20) (30) (40) (50) (60)

(1, 2) (3) (4) (5) (6) (10) (20) (30) (40) (50) (60)

(1, 2) (3, 4) (5) (6) (10) (20) (30) (40) (50) (60)

(1, 2) (3, 4) (5, 6) (10) (20) (30) (40) (50) (60)

((1, 2), (3, 4)) (5, 6) (10) (20) (30) (40) (50) (60)

(((1, 2), (3, 4)), (5, 6)) (10) (20) (30) (40) (50) (60)

((((1, 2), (3, 4)), (5, 6)), 10) (20) (30) (40) (50) (60)

((((1, 2), (3, 4)), (5, 6)), 10) (20, 30) (40) (50) (60)

((((1, 2), (3, 4)), (5, 6)), 10) (20, 30) (40, 50) (60)

((((1, 2), (3, 4)), (5, 6)), 10) (20, 30) ((40, 50), 60)

(((((1, 2), (3, 4)), (5, 6)), 10), (20, 30)) ((40, 50), 60)

((((((1, 2), (3, 4)), (5, 6)), 10), (20, 30)), ((40, 50), 60))
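The two merge sequences of Problem 1 can be replayed with a small agglomerative clustering sketch. The source does not name the linkage criteria. As an inference, the first sequence is consistent with single link (minimum pairwise distance) and the second with complete link (maximum pairwise distance). The tie rule is implemented by scanning cluster pairs in ascending order and keeping the first strict minimum. The function name `agglomerate` and the nested-tuple labels are illustrative choices.

```python
from itertools import combinations

DATA = [1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60]

def agglomerate(points, linkage):
    """Return the list of clusterings, one per merge step.

    `linkage` reduces the pairwise distances between two clusters to a
    single number: min for single link, max for complete link.
    """
    # Each cluster is (nested label for display, flat list of members).
    clusters = [(p, [p]) for p in sorted(points)]
    history = [[label for label, _ in clusters]]
    while len(clusters) > 1:
        best = None
        # Pairs come in lexicographic index order, so with a strict "<"
        # a tie is resolved in favor of the lowest-numbered clusters.
        for i, j in combinations(range(len(clusters)), 2):
            d = linkage(abs(a - b) for a in clusters[i][1] for b in clusters[j][1])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = ((clusters[i][0], clusters[j][0]),
                       clusters[i][1] + clusters[j][1])
        del clusters[j]
        history.append([label for label, _ in clusters])
    return history

single = agglomerate(DATA, min)    # assumed linkage for the first sequence
complete = agglomerate(DATA, max)  # assumed linkage for the second sequence

if __name__ == "__main__":
    for name, history in (("single link", single), ("complete link", complete)):
        print(name)
        for step in history:
            print("  ", step)
```

For this data, group-average linkage would produce the same merges as complete link, so the second sequence alone does not settle which criterion was intended.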



Problem 2:

a)
              Entropy   Purity
Cluster #1    2.0558    22/49 = 0.4490
Cluster #2    2.0396    29/71 = 0.4085
Cluster #3    1.4549    45/66 = 0.6818
Cluster #4    1.4885    8/13 = 0.6154
Overall       1.8137    0.

b)
              Precision(“Compilers”)
Cluster #1    2/49 = 0.0408
Cluster #2    3/71 = 0.0423
Cluster #3    1/66 = 0.0152
Cluster #4    8/13 = 0.6154
Overall       0.

c)
              Recall(“Systems”)
Cluster #1    7/45 = 0.1556
Cluster #2    23/45 = 0.5111
Cluster #3    12/45 = 0.2667
Cluster #4    3/45 = 0.0667
Overall       0.3134 (weighted sum) OR 45/45 = 1

Problem 3:

a) Same as shown in Figure 6.32 on page 408 of the textbook. Note that the authors have not used the label L10 for any leaf node.

b) Leaf node L

c) Leaf nodes visited will be L4, L2, L3, L5, L1, L8, L6 and L

d) Candidate itemsets will be {1, 2, 7}, {1, 7, 8} and {2, 7, 8}

Problem 4:

a) b) Both are false. Consider the following counterexample. Let minsup = 0.2 and minconf = 0.

Support(A, B, C) = 4/20
Support(A) = 10/20
Confidence(A → B) = 5/10 >= minconf
Support(A, B) = 5/20
Confidence(A → BC) = 4/10 < minconf
Support(A, C) = 9/20
Confidence(AC → B) = 4/9 < minconf

c) False. Consider the following counterexample. Let minsup = 0.2 and minconf = 0.

Support(A, B, C) = 4/20
Support(A) = 10/20
Confidence(AC → B) = 4/5 >= minconf
Support(A, B) = 9/20
Support(A, C) = 5/20
Confidence(AB → C) = 4/9 < minconf

d) False. Consider the following counterexample. Let minsup = 0.2 and minconf = 0.

Support(A) = 10/20
Support(A, B) = 5/20
Confidence(A → B) = 5/10 >= minconf
Support(B) = 12/20
Confidence(B → A) = 5/12 < minconf
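The confidences in the Problem 4 counterexamples can be recomputed from the stated supports, since Confidence(X → Y) = Support(X ∪ Y) / Support(X). The sketch below does this with exact fractions. The minconf value is cut off in the text, so `MINCONF = 1/2` is an assumption; any threshold greater than 4/9 and at most 1/2 satisfies every inequality listed. The helper `confidence` and the dictionary names are illustrative.

```python
from fractions import Fraction

# ASSUMPTION: the minconf value is truncated in the text; 1/2 is one value
# that is consistent with all of the stated inequalities.
MINCONF = Fraction(1, 2)

def confidence(counts, antecedent, consequent):
    """Confidence(X -> Y) = count(X u Y) / count(X), as an exact fraction."""
    lhs = frozenset(antecedent)
    return Fraction(counts[lhs | frozenset(consequent)], counts[lhs])

# Support counts out of 20 transactions, copied from the three counterexamples.
parts_ab = {frozenset("A"): 10, frozenset("AB"): 5,
            frozenset("AC"): 9, frozenset("ABC"): 4}
part_c = {frozenset("A"): 10, frozenset("AB"): 9,
          frozenset("AC"): 5, frozenset("ABC"): 4}
part_d = {frozenset("A"): 10, frozenset("B"): 12, frozenset("AB"): 5}

checks = [
    ("a) b)", parts_ab, "A", "B"),
    ("a) b)", parts_ab, "A", "BC"),
    ("a) b)", parts_ab, "AC", "B"),
    ("c)", part_c, "AC", "B"),
    ("c)", part_c, "AB", "C"),
    ("d)", part_d, "A", "B"),
    ("d)", part_d, "B", "A"),
]

if __name__ == "__main__":
    for part, counts, lhs, rhs in checks:
        conf = confidence(counts, lhs, rhs)
        verdict = ">= minconf" if conf >= MINCONF else "< minconf"
        print(f"{part} Confidence({lhs} -> {rhs}) = {conf} {verdict}")
```

Using `Fraction` avoids floating-point rounding when a confidence sits exactly on the threshold, as 5/10 does here.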
