4 Problems for Assignment 4 - Computational Bioinformatics | CPSC 485, Assignments of Computer Science

Material Type: Assignment; Professor: Wang; Class: Computational Bioinformatics; Subject: Computer Science; University: California State University - Fullerton; Term: Summer 2009;

Uploaded on 08/16/2009

koofers-user-l4g-1

CPSC 485 Computational Bioinformatics
Homework #4, Due 05/06/09
Problems 1-3. Assume the following is the distance matrix of 9 genes. Cluster the 9 genes
using the algorithms we discussed in class.
      g1  g2  g3  g4  g5  g6  g7  g8  g9
g1     0  38  58  13  15  27  10  19  12
g2    38   0  86  49  67  60  25  43  89
g3    58  86   0  83  37  66  78  95  11
g4    13  49  83   0  67  31  45  82  36
g5    15  67  37  67   0   5  94   2  51
g6    27  60  66  31   5   0  96  28  88
g7    10  25  78  45  94  96   0  85  43
g8    19  43  95  82   2  28  85   0  67
g9    12  89  11  36  51  88  43  67   0
Problem 1. Cluster the genes using the HIERARCHICALCLUSTERING algorithm on page
345. The distance between two clusters is calculated as the average distance between
their elements.
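
As a rough illustration of the average-distance rule in Problem 1, here is a minimal Python sketch of agglomerative clustering. It is not the textbook's HIERARCHICALCLUSTERING pseudocode; the names `average_distance`, `hierarchical_clustering`, and `cluster_names` are made up for this example, and ties are assumed to be broken by taking the first closest pair found.

```python
from itertools import combinations

GENES = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9"]
# Distance matrix copied from the problem statement (row i, column j).
D = [
    [0, 38, 58, 13, 15, 27, 10, 19, 12],
    [38, 0, 86, 49, 67, 60, 25, 43, 89],
    [58, 86, 0, 83, 37, 66, 78, 95, 11],
    [13, 49, 83, 0, 67, 31, 45, 82, 36],
    [15, 67, 37, 67, 0, 5, 94, 2, 51],
    [27, 60, 66, 31, 5, 0, 96, 28, 88],
    [10, 25, 78, 45, 94, 96, 0, 85, 43],
    [19, 43, 95, 82, 2, 28, 85, 0, 67],
    [12, 89, 11, 36, 51, 88, 43, 67, 0],
]


def average_distance(c1, c2, dist):
    """Average of dist[i][j] over all i in c1 and all j in c2."""
    return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))


def hierarchical_clustering(dist):
    """Return the sequence of merges as (cluster_a, cluster_b, distance)."""
    clusters = [(i,) for i in range(len(dist))]  # start with singletons
    merges = []
    while len(clusters) > 1:
        # Closest pair of current clusters (assumed tie-break: first found).
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(pair[0], pair[1], dist))
        merges.append((a, b, average_distance(a, b, dist)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


def cluster_names(cluster):
    """Format a tuple of gene indices as a readable set of gene names."""
    return "{" + ", ".join(GENES[i] for i in cluster) + "}"


merges = hierarchical_clustering(D)
for a, b, d in merges:
    print(f"merge {cluster_names(a)} + {cluster_names(b)} at {d:.2f}")
```

The sketch recomputes every average from the original matrix at each step, which is slow for large inputs but mirrors the definition given in the problem directly.
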
Problem 2. Cluster the genes using the CAST algorithm on page 354. The distance
threshold is set to θ = 33.
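
One possible reading of CAST for Problem 2 is sketched below in Python. It rests on several assumptions that should be checked against the pseudocode on page 354: two genes are joined in the distance graph when their distance is strictly below θ, a gene is "close" to a cluster when its average distance to the cluster members (counting itself, if it is a member) is strictly below θ, ties go to the lowest gene index, and `max_rounds` is a hypothetical safety cap that is not part of the textbook algorithm.

```python
GENES = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9"]
# Distance matrix copied from the problem statement (row i, column j).
D = [
    [0, 38, 58, 13, 15, 27, 10, 19, 12],
    [38, 0, 86, 49, 67, 60, 25, 43, 89],
    [58, 86, 0, 83, 37, 66, 78, 95, 11],
    [13, 49, 83, 0, 67, 31, 45, 82, 36],
    [15, 67, 37, 67, 0, 5, 94, 2, 51],
    [27, 60, 66, 31, 5, 0, 96, 28, 88],
    [10, 25, 78, 45, 94, 96, 0, 85, 43],
    [19, 43, 95, 82, 2, 28, 85, 0, 67],
    [12, 89, 11, 36, 51, 88, 43, 67, 0],
]
THETA = 33


def avg_dist(i, cluster, dist):
    """Average distance from gene i to the members of cluster (assumed d(i, C))."""
    return sum(dist[i][j] for j in cluster) / len(cluster)


def cast(dist, theta, max_rounds=1000):
    """Return a partition of gene indices as a list of sorted lists."""
    remaining = set(range(len(dist)))
    partition = []
    while remaining:
        def degree(v):
            # Degree of v in the distance graph restricted to unclustered genes.
            return sum(1 for u in remaining if u != v and dist[v][u] < theta)

        seed = max(sorted(remaining), key=degree)
        cluster = {seed}
        for _ in range(max_rounds):  # safety cap, an assumption of this sketch
            close = [i for i in sorted(remaining - cluster)
                     if avg_dist(i, cluster, dist) < theta]
            if close:  # add the nearest close gene
                cluster.add(min(close, key=lambda i: avg_dist(i, cluster, dist)))
            distant = [i for i in sorted(cluster)
                       if avg_dist(i, cluster, dist) >= theta]
            if distant:  # remove the farthest distant gene
                cluster.remove(max(distant, key=lambda i: avg_dist(i, cluster, dist)))
            if not close and not distant:
                break
        partition.append(sorted(cluster))
        remaining -= cluster
    return partition


clusters = cast(D, THETA)
for c in clusters:
    print("{" + ", ".join(GENES[i] for i in c) + "}")
```

Different tie-breaking or a different definition of d(i, C) can change the resulting partition, so the output of this sketch should be treated as one candidate rather than the reference answer.
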
Problem 3. Cluster the genes using the UPGMA algorithm on page 366.
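
For Problem 3, the following Python sketch shows the general shape of UPGMA: repeatedly merge the two closest clusters, place the new node at height equal to half their distance, and update distances with a size-weighted average. The function name `upgma`, the Newick output format, and the first-found tie-break are choices made for this illustration and are not taken from page 366.

```python
GENES = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9"]
# Distance matrix copied from the problem statement (row i, column j).
D = [
    [0, 38, 58, 13, 15, 27, 10, 19, 12],
    [38, 0, 86, 49, 67, 60, 25, 43, 89],
    [58, 86, 0, 83, 37, 66, 78, 95, 11],
    [13, 49, 83, 0, 67, 31, 45, 82, 36],
    [15, 67, 37, 67, 0, 5, 94, 2, 51],
    [27, 60, 66, 31, 5, 0, 96, 28, 88],
    [10, 25, 78, 45, 94, 96, 0, 85, 43],
    [19, 43, 95, 82, 2, 28, 85, 0, 67],
    [12, 89, 11, 36, 51, 88, 43, 67, 0],
]


def upgma(dist, names):
    """Return (newick_string, heights of internal nodes in merge order)."""
    n = len(dist)
    # Active clusters: id -> (Newick label, number of leaves, node height).
    info = {i: (names[i], 1, 0.0) for i in range(n)}
    # Pairwise distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): float(dist[i][j]) for i in range(n) for j in range(i + 1, n)}
    heights = []
    next_id = n
    while len(info) > 1:
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        label_a, size_a, h_a = info.pop(a)
        label_b, size_b, h_b = info.pop(b)
        h = dab / 2  # height of the new internal node
        heights.append(h)
        # Edge lengths are the height differences between parent and child.
        label = f"({label_a}:{h - h_a:g},{label_b}:{h - h_b:g})"
        new_d = {}
        for c in info:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # Size-weighted mean equals the average pairwise leaf distance.
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        info[next_id] = (label, size_a + size_b, h)
        next_id += 1
    (root_label, _, _), = info.values()
    return root_label + ";", heights


newick, heights = upgma(D, GENES)
print(newick)
print(heights)
```

Because the 9-gene matrix is not guaranteed to be ultrametric, the tree distances produced this way need not reproduce the input distances exactly.
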
Problem 4. Assume the following is the distance matrix of 6 leaves in a tree. Reconstruct
the tree using the ADDITIVEPHYLOGENY algorithm on page 364.
     A   B   C   D   E   F
A    0   4   7  10  12  11
B    4   0   7  10  12  11
C    7   7   0   9  11  10
D   10  10   9   0  10   9
E   12  12  11  10   0   9
F   11  11  10   9   9   0
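
Before reconstructing the tree for Problem 4, it can be useful to confirm that the matrix is additive. The Python sketch below is a sanity check only, not the ADDITIVEPHYLOGENY procedure from page 364: `is_additive` applies the four-point condition to every quadruple of leaves, and `limb_length` uses the standard formula min over i, k of (d(i,j) + d(j,k) - d(i,k)) / 2 for the length of the edge attached to leaf j. Both function names are made up for this example.

```python
from itertools import combinations

LEAVES = ["A", "B", "C", "D", "E", "F"]
# Distance matrix copied from the problem statement (row i, column j).
D = [
    [0, 4, 7, 10, 12, 11],
    [4, 0, 7, 10, 12, 11],
    [7, 7, 0, 9, 11, 10],
    [10, 10, 9, 0, 10, 9],
    [12, 12, 11, 10, 0, 9],
    [11, 11, 10, 9, 9, 0],
]


def is_additive(dist):
    """Four-point condition: of the three pair sums, the two largest are equal."""
    for i, j, k, l in combinations(range(len(dist)), 4):
        sums = sorted([
            dist[i][j] + dist[k][l],
            dist[i][k] + dist[j][l],
            dist[i][l] + dist[j][k],
        ])
        if sums[1] != sums[2]:
            return False
    return True


def limb_length(dist, j):
    """Length of the edge attached to leaf j, assuming dist is additive."""
    others = [x for x in range(len(dist)) if x != j]
    return min((dist[i][j] + dist[j][k] - dist[i][k]) / 2
               for i, k in combinations(others, 2))


print("additive:", is_additive(D))
for j, name in enumerate(LEAVES):
    print(f"limb length of {name}: {limb_length(D, j):g}")
```

The limb lengths give the leaf edge weights that any correct reconstruction should contain, which makes them a convenient cross-check on a hand-computed tree.
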
