

A biology assignment that involves computing distance matrices for tree inference, analyzing different algorithms for tree reconstruction, and identifying open reading frames and translating them into proteins. Students are expected to submit their answers in hardcopy in class.


For this assignment, all answers are to be submitted in hardcopy in class.
This problem compares different algorithms for tree inference. Note that although you know the correct tree in this case, with real data you would have only the distance matrix, not the tree.
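To make the idea of inferring a tree from a distance matrix alone concrete, here is a minimal sketch of UPGMA, one common distance-based method. It is an assumption that UPGMA is among the algorithms being compared; this excerpt does not name them. The function name `upgma`, the dictionary-of-pairs input format, and the toy distances are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA and return the tree as nested tuples.

    names: list of taxon labels (hypothetical input format).
    dist:  dict mapping frozenset({a, b}) -> distance between taxa a and b.
    """
    # Each current cluster is keyed by its subtree; the value is its leaf count.
    sizes = {name: 1 for name in names}
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Toy distances (made up for illustration, not taken from the assignment).
toy = {
    frozenset(pair): value
    for pair, value in [
        (("A", "B"), 2), (("C", "D"), 4),
        (("A", "C"), 8), (("A", "D"), 8),
        (("B", "C"), 8), (("B", "D"), 8),
    ]
}
tree = upgma(["A", "B", "C", "D"], toy)
```

UPGMA assumes a constant rate of change along all branches, so it can return the wrong topology when that assumption fails. This is one reason why comparing several algorithms on a matrix with a known tree is instructive.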
This problem was inspired by a problem assigned in Shamir’s 2001 Algorithms for Molecular Biology class.
Background: You are a biologist studying a rare human disease called Homework. After years of work, you understand that it is associated with specific malfunctioning cells. You harvest mRNA from such a cell, and get a cDNA, which you sequence. You reverse and complement the result to obtain the coding strand, and get:
(a) It is a coding region
(b) It is an exon
(c) It is an intron
(d) It is a 5’ untranslated region
(e) It is a 3’ untranslated region
(a) You have found a new gene
(b) You have found a mutated version of a known gene (what are the mutations?)
(c) Your sequencing machine made sequencing errors (what are they?)
(d) The gene is not human, but rather contamination from a bacterium, fungus, or virus in the test tube
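The reverse-complement step described in the background, and the search for open reading frames that follows from it, can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `reverse_complement` and `find_orfs` are hypothetical, the standard genetic code is assumed, only the three forward frames of the given strand are scanned, and the toy sequence is made up because the assignment's sequence does not appear in this excerpt.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=1):
    """Yield (start, end, protein) for each ATG...stop ORF in frames 0 to 2.

    start and end are 0-based, end-exclusive, and end includes the stop codon.
    """
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                # Translate from the start codon up to (not including) the stop.
                protein = "".join(
                    CODON_TABLE.get(seq[j:j + 3], "X") for j in range(start, i, 3)
                )
                if len(protein) >= min_codons:
                    yield start, i + 3, protein
                start = None


# Toy input, not the sequence from the assignment.
toy_seq = "CCATGAAATGGTAACC"
orfs = list(find_orfs(toy_seq))  # [(2, 14, 'MKW')]
```

To search both strands, the same function could be run a second time on `reverse_complement(toy_seq)`. A translated protein can then be compared against known sequences to help weigh the hypotheses listed above.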