Sequence Alignment: Methods, Significance, and Optimal Alignment Scoring, Study notes of Cancer Cytogenetics

These notes give an overview of sequence alignment, including pairwise alignments, multiple alignments, and dynamic programming algorithms such as the Smith-Waterman method. They also cover the significance of sequence similarity and the calculation of optimal alignment scores.

Uploaded on 08/09/2009

8/31/07 C.Bystroff 1
BIOL 6961 Fall 2007

• 8/31: Sequence Alignment
  • inferences, methods, significance, editing
• 9/5: Database searching
  • methods, e-values, GenBank files
• 9/7: Fitting data
  • parametric space, correlations, training and testing


Pairwise alignments

Aligning two sequences

Plant: AAAGA GATTCTGC TAGCGGT CGG
Bug:   AGAGATGCTGCAGCGAGTCGGCC

An alignment is a one-to-one association, or a set of one-to-one associations. What does it mean? What kinds of associations are allowed?

Venn diagram: two meanings of "aligned", for proteins

Aligned positions...
(A) have a common ancestor
(B) superimpose in space

• Some positions have a common ancestor but don’t superimpose in space.
• Evolution conserves structure, mostly.
• Some positions that superimpose in space didn’t have a common ancestor (convergent evolution).

Plant: AAAGA GATTCTGC TAGCG.G TCGG..
Bug:   ..AGA GATGCTGC .AGCGAG TCGGCC
Bird:  ..AG. GAAGCTGC AAGCCAG TCGGTC

• Columns denote association.
• Dots or dashes are spacers, or gaps.
• The sequence with a gap character has a deletion.
• The sequence aligned to a gap character has an insertion.
• Characters in the same column are aligned (or matched).
• Matched characters can be identity matches or mutations.
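The column rules above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_columns` and the label strings are hypothetical, and the Plant and Bug rows are copied from the slide with the block-separating spaces removed.

```python
from collections import Counter

# Aligned rows from the slide; the spaces between blocks are dropped.
PLANT = "AAAGA GATTCTGC TAGCG.G TCGG..".replace(" ", "")
BUG = "..AGA GATGCTGC .AGCGAG TCGGCC".replace(" ", "")


def classify_columns(top, bottom, gap_chars=".-"):
    """Label every column of a two-row alignment (hypothetical helper)."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    labels = []
    for a, b in zip(top, bottom):
        a_gap, b_gap = a in gap_chars, b in gap_chars
        if a_gap and b_gap:
            labels.append("both-gap")
        elif a_gap:
            # Gap in the top row: top has a deletion, bottom an insertion.
            labels.append("gap-in-top")
        elif b_gap:
            # Gap in the bottom row: bottom has a deletion, top an insertion.
            labels.append("gap-in-bottom")
        elif a == b:
            labels.append("identity")
        else:
            labels.append("mutation")
    return labels


if __name__ == "__main__":
    for label, count in Counter(classify_columns(PLANT, BUG)).items():
        print(label, count)
```

For the Plant and Bug rows this tallies 19 identity columns, 1 mutation column (T in Plant, G in Bug), and 6 gap columns, three in each row.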

The optimal alignment

• "The alignment" can be any set of one-to-one associations between characters, but it generally means the best (or optimal) alignment.
• The best alignment is the one that assumes the fewest mutations, or smallest evolutionary distance.
• To find the optimal alignment, we must define evolutionary distance.

The optimal alignment

• Simple similarity score:
  • identity match = +1 point
  • mismatch = -1 point
  • gap = -1 point
• Optimal alignment = the highest-scoring alignment given the similarity score.
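As a minimal sketch of this scoring scheme, the Python below scores a given gapped alignment with +1 / -1 / -1 and then searches for the highest achievable score. The search uses a standard global dynamic-programming recurrence, which is an assumption here rather than the method shown on the slide; the names `score_alignment` and `best_global_score` are hypothetical, and every gap character, including end gaps, is charged -1.

```python
MATCH, MISMATCH, GAP = 1, -1, -1  # simple similarity score from the slide


def score_alignment(top, bottom, gap_chars=".-"):
    """Score a given two-row alignment column by column."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    score = 0
    for a, b in zip(top, bottom):
        if a in gap_chars or b in gap_chars:
            score += GAP
        elif a == b:
            score += MATCH
        else:
            score += MISMATCH
    return score


def best_global_score(s, t):
    """Highest score over all global alignments of ungapped s and t.

    Assumed technique: row-by-row dynamic programming where each cell is
    the best of a diagonal step (match/mismatch) or a gap in either row.
    """
    prev = [j * GAP for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * GAP]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (MATCH if s[i - 1] == t[j - 1] else MISMATCH)
            curr.append(max(diag, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    plant = "AAAGA GATTCTGC TAGCG.G TCGG..".replace(" ", "")
    bug = "..AGA GATGCTGC .AGCGAG TCGGCC".replace(" ", "")
    print("slide alignment score:", score_alignment(plant, bug))
    print("best global score:",
          best_global_score(plant.replace(".", ""), bug.replace(".", "")))
```

The Plant/Bug alignment from the earlier slide has 19 identities, 1 mismatch, and 6 gap characters, so it scores 19 - 1 - 6 = 12 under this scheme. The dynamic-programming search can never return less than the score of any particular alignment of the same two sequences.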

Computing pairwise alignments

Dot plot:

[Figure: dot plot with the Plant sequence AAAGAGATTCTGCTAGCGGTCGG along one axis and the Bug sequence AGAGATGCTGCAGCGAGTCGGCC along the other.]
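A dot plot of the two sequences can be sketched as a boolean matrix with a mark wherever the two characters are identical. The layout is an assumption (Plant across the columns, Bug down the rows), and `dot_plot` and `render` are hypothetical helper names.

```python
PLANT = "AAAGAGATTCTGCTAGCGGTCGG"
BUG = "AGAGATGCTGCAGCGAGTCGGCC"


def dot_plot(row_seq, col_seq):
    """Return matrix[i][j] == True where row_seq[i] equals col_seq[j]."""
    return [[r == c for c in col_seq] for r in row_seq]


def render(matrix, row_seq, col_seq):
    """Draw the matrix as text: '*' for an identity, '.' otherwise."""
    lines = ["  " + " ".join(col_seq)]
    for r, row in zip(row_seq, matrix):
        lines.append(r + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(dot_plot(BUG, PLANT), BUG, PLANT))
```

Runs of marks along a diagonal correspond to ungapped blocks of matching characters. For example, the first six Bug characters match Plant positions 3 through 8, which shows up as a diagonal shifted two columns to the right.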

Features labeled on the dot plot:

• blocks
• gaps
• insertion, A
• deletion, T
• mutation, T->G
