Phylogenetics - Bioinformatics - Lecture Slides, Slides of Bioinformatics

Main points of this lecture are: Phylogenetics, Distance-Based Methods, Sequences of Nucleic Acids, Phylogenetic Trees, DNA Sequences, Sequences of Cytochrome, Homologous Sequence, Multiple Sequence Alignment, Vaccine Development, Types of Data

Typology: Slides

2012/2013

Uploaded on 04/23/2013

asmita

Phylogenetics - Distance-Based Methods

Phylogenetics

  • Attempts to infer the evolutionary history of a group of organisms or sequences of nucleic acids or proteins

  • Phylogenetic methods can be used for the study of evolutionary relationships between species of organisms as well as genes
  • Attempt to reconstruct evolutionary ancestors
  • Estimate time of divergence from ancestor

Phylogenetic Trees

Phylogenetic Tree for Close Human Relatives

[Figure: tree whose leaves are Humans, Orangutans, Chimpanzees, and Gorillas, with internal nodes labeled "Common Ancestor of Gorillas, Chimps", "Common Ancestor of Gorillas, Chimps, Orangs", and "Common Ancestor of Humans and Apes"]

History

  • Taxonomists used anatomy and physiology to group and classify organisms
  • Morphological features like presence of feathers or number of legs
  • When protein sequencing, and later DNA sequencing, became common, amino acid and DNA sequences became the standard way to construct trees

The Big Picture

  • Determine the species or genes to be studied
  • Acquire homologous sequence data
  • Use multiple sequence alignment software like ClustalW to align
  • Clean up data by hand
  • Use phylogenetic analysis software like Phylip based on techniques we will study
  • Verify experimentally

Phylogenetics

  • Can be used to solve a number of interesting problems

  • Forensics
    • HIV virus mutates rapidly
  • Predicting evolution of influenza viruses
  • Predicting functions of uncharacterized genes - ortholog detection
  • Drug discovery
  • Vaccine development
    • Target inferred common ancestor

Phylogenetic Trees

  • Trees are composed of nodes and branches
    • Terminal or leaf nodes correspond to a gene or organism for which data has been collected
    • Internal nodes usually represent an inferred common ancestor that gave rise to two independent lineages sometime in the past
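The node-and-branch structure described above can be sketched as a small recursive data type. This is only an illustrative sketch: the `Node` class, its field names, and the taxa A, B, and C are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a phylogenetic tree (hypothetical illustration)."""
    name: Optional[str] = None  # label of a leaf; internal nodes may stay unnamed
    children: List["Node"] = field(default_factory=list)  # branches to descendants

    def is_leaf(self) -> bool:
        # A terminal (leaf) node has no descendants.
        return not self.children

    def leaves(self) -> List[str]:
        # Collect the names of all leaves below this node, left to right.
        if self.is_leaf():
            return [self.name]
        names = []
        for child in self.children:
            names.extend(child.leaves())
        return names


# Hypothetical taxa: the unnamed internal node is the inferred
# common ancestor of B and C; the root joins it with A.
ancestor_bc = Node(children=[Node("B"), Node("C")])
root = Node(children=[Node("A"), ancestor_bc])

print(root.leaves())
```

Leaves carry the observed data (a gene or organism), while each internal node exists only as an inference connecting its children.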

Rooted and Unrooted Trees

  • Some trees make an inference about a common ancestor and the direction of evolution and some don’t

  • First type is called a rooted tree and has a single node designated as root which is the common ancestor
  • Second type is called an unrooted tree
    • Specifies only relationship between nodes and says nothing about direction of evolution

Rooted and Unrooted Trees

  • Roots can usually be assigned to unrooted trees using an outgroup
  • An outgroup is a species that unambiguously separated earlier than all the others being studied
  • E.g. baboons in case of humans and gorillas
  • For three species there are 3 possible rooted trees, but only one possible unrooted tree

Rooted and Unrooted Trees

  • In fact, the numbers of rooted ($N_R$) and unrooted ($N_U$) trees for $n$ species are
    • $N_R = \dfrac{(2n-3)!}{2^{n-2}\,(n-2)!}$
    • $N_U = \dfrac{(2n-5)!}{2^{n-3}\,(n-3)!}$

Data Sets (n)   Rooted Trees                     Unrooted Trees
2               1                                1
3               3                                1
4               15                               3
5               105                              15
10              34,459,425                       2,027,025
15              213,458,046,676,875              7,905,853,580,625
20              8,200,794,532,637,891,559,375    221,643,095,476,699,771,875
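The counting formulas for $N_R$ and $N_U$ can be evaluated directly with exact integer arithmetic. The sketch below is a minimal illustration; the function names are hypothetical, and the unrooted formula is assumed to apply only for n of at least 3, since it involves (n - 3)!.

```python
from math import factorial


def num_rooted_trees(n: int) -> int:
    """N_R = (2n - 3)! / (2^(n - 2) * (n - 2)!), for n >= 2."""
    if n < 2:
        raise ValueError("need at least 2 species")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def num_unrooted_trees(n: int) -> int:
    """N_U = (2n - 5)! / (2^(n - 3) * (n - 3)!), for n >= 3."""
    if n < 3:
        raise ValueError("formula assumed valid only for n >= 3")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


for n in (3, 4, 5, 10):
    print(n, num_rooted_trees(n), num_unrooted_trees(n))
```

Integer division is exact here because both expressions always evaluate to whole numbers. The counts grow so quickly that enumerating every tree becomes impractical even for modest numbers of species.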

Rooting a Tree

[Figure: rooting an unrooted tree]

More Tree Terminology

  • Structure of a phylogenetic tree can be represented in Newick format using nested parentheses - (((B, C), (D, E)), A)
  • If we lack data to tell in which order two or more independent lineages arose in the past, the tree may be multifurcating (an interior node has more than two descendants); otherwise, it is bifurcating (exactly two descendants per interior node)
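To make the nested-parentheses notation concrete, here is a minimal sketch of a parser for the simplest Newick strings (leaf names, commas, and parentheses only). It assumes no branch lengths or internal labels, and the function names are hypothetical rather than taken from any phylogenetics package.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested lists of leaf names."""
    text = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def parse_node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1  # consume ")"
            return children
        # Otherwise read a leaf name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        if start == pos:
            raise ValueError("missing leaf name")
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return tree


def is_bifurcating(tree):
    """True if every interior node has exactly two descendants."""
    if isinstance(tree, str):
        return True  # a leaf has no descendants to check
    return len(tree) == 2 and all(is_bifurcating(child) for child in tree)


tree = parse_newick("(((B, C), (D, E)), A)")
print(tree)
print(is_bifurcating(tree))
print(is_bifurcating(parse_newick("(A, B, C)")))
```

Each pair of parentheses becomes one interior node, so the nesting depth of a leaf mirrors how many inferred ancestors separate it from the outermost node.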

Character and Distance Data

  • Distance-based methods must transform the sequence data into a pairwise distance (dissimilarity) matrix for use during tree inference

Species   A   B   C   D
B         2   -   -   -
C         4   5   -   -
D         7   9   5   -
E         3   5   7   8
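One simple way to obtain such a pairwise matrix is to count mismatched positions between each pair of aligned sequences. The sketch below assumes that raw mismatch counts are an acceptable distance, which is only one possible choice; the sequences, species names, and function names are hypothetical toy values, not the data behind the matrix above.

```python
from itertools import combinations


def mismatch_count(seq1, seq2):
    """Number of aligned positions at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


def distance_matrix(aligned):
    """Map each unordered species pair (x, y) to its mismatch count."""
    return {
        (x, y): mismatch_count(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    }


# Hypothetical aligned toy sequences.
aligned = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGAACGA",
}

for pair, d in distance_matrix(aligned).items():
    print(pair, d)
```

Only the lower (or upper) triangle needs to be stored, because the distance from x to y equals the distance from y to x.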

Distance-Based Methods

  • Given such an input matrix, we want to find an edge-weighted tree whose leaves correspond to the species and in which the distance measured between any two leaves (the sum of the edge weights on the path joining them) equals the corresponding matrix value