RNA Folding – Bioinformatic Algorithms, Databases and Tools - Notes | CMSC 423, Study notes of Computer Science

RNA Folding Material Type: Notes; Class: BIOINFO ALGS, DB, TOOLS; Subject: Computer Science; University: University of Maryland; Term: Unknown 1989;


Uploaded on 02/13/2009

CMSC423: Bioinformatic Algorithms,
Databases and Tools
Lecture 20
RNA folding

RNA folding

  • Function of RNA molecules depends on how they fold, based on nucleotide base-pairing

Types of structures

  • Nested (hairpin)

        ACAUGGAUGU
        ((((..))))

  • Pseudo-knots

        UUCCGAAGCUCAACGGGAAAUGAGCU

        UUCCG--A----
        AGGGCAACUCGA
        -A-A--UGAGCU
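The difference between the two structure types can be checked mechanically. The following is a minimal Python sketch, not part of the lecture: `parse_dot_bracket` and `has_crossing` are hypothetical helper names. It reads a dot-bracket string such as the hairpin above into a list of base pairs, and tests whether any two pairs cross, which is what makes a structure a pseudo-knot.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of 0-based (i, j) pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)                  # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))   # closes the most recent '('
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def has_crossing(pairs):
    """True if two pairs (i, j) and (k, l) interleave as i < k < j < l."""
    return any(i < k < j < l for (i, j) in pairs for (k, l) in pairs)


hairpin = parse_dot_bracket("((((..))))")
print(hairpin, has_crossing(hairpin))
```

A single bracket type can only describe nested structures, so a pseudo-knot has to be given as an explicit pair list (or with a second bracket type) before `has_crossing` can flag it.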

From multiple alignment to structure

  • Find columns in the alignment where mutations are correlated

  • Mutual information - how correlated are the columns?

    GCCUUCGGGC
    GACAUCGGUC
    GGCUUTGGCC
    (......)

$$M_{i,j} = \sum_{x_i, x_j} f_{x_i x_j} \log \frac{f_{x_i x_j}}{f_{x_i} f_{x_j}}$$

  • $M_{i,j}$ = mutual information between columns i and j
  • $f_{x_i x_j}$ = frequency of each of the 16 pairs of nucleotides at columns i and j
  • $f_{x_i}$ = frequency of each of the 4 nucleotides at column i
  • $f_{x_j}$ = frequency of each of the 4 nucleotides at column j
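As a rough illustration of the mutual information score, here is a small Python sketch. The function name `mutual_information`, the 0-based column indices, and the choice of base-2 logarithm are assumptions made for this example; the slide only says "log".

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits, assuming log base 2) between columns i and j."""
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    mi = 0.0
    # Only pairs that actually occur contribute; unseen pairs have frequency 0.
    for (a, b), count in pair_counts.items():
        f_ab = count / n
        mi += f_ab * log2(f_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


alignment = ["GCCUUCGGGC", "GACAUCGGUC", "GGCUUTGGCC"]
print(mutual_information(alignment, 1, 8))  # columns that vary together
print(mutual_information(alignment, 0, 9))  # fully conserved columns
```

Columns that change together score high, while fully conserved columns score 0 even if they are paired, because there is no variation to correlate.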

Nussinov's algorithm

  • Assumes no pseudo-knots
  • Dynamic programming approach – maximize # of pairings
  • S – string of nucleotides representing the RNA molecule
  • Sub-problem – F[i,j] – score of folding just S[i..j]
  • Initial values: F[i,i] = F[i,i-1] = 0 (a single base or an empty interval contains no pairs)

Nussinov's algorithm

F[i,j] is the maximum of:

  I. F[i+1,j] (S[i] unpaired)
  II. F[i,j-1] (S[j] unpaired)
  III. F[i+1,j-1] + 1 if S[i] is complementary to S[j] (S[i] paired with S[j])
  IV. max over i < k < j of F[i,k] + F[k+1,j] (branch)
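The four cases translate directly into a table-filling loop. The sketch below is one possible Python rendering under stated assumptions: only Watson-Crick pairs (A-U, G-C) count as complementary, adjacent bases are allowed to pair, indices are 0-based, and the names `COMPLEMENT` and `nussinov_fill` are made up for this example. Case III pairs S[i] with S[j] and adds the score of the interval between them.

```python
# Assumption: only Watson-Crick pairs are treated as complementary.
COMPLEMENT = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def nussinov_fill(seq):
    """Fill F so that F[i][j] is the maximum number of pairs in seq[i..j]."""
    n = len(seq)
    # Zeros everywhere cover the initial values: F[i][i] and F[i][i-1].
    F = [[0] * n for _ in range(n)]
    for span in range(1, n):              # shorter intervals first
        for i in range(n - span):
            j = i + span
            best = max(F[i + 1][j],       # I.  S[i] unpaired
                       F[i][j - 1])       # II. S[j] unpaired
            if (seq[i], seq[j]) in COMPLEMENT:
                best = max(best, F[i + 1][j - 1] + 1)   # III. S[i] paired with S[j]
            for k in range(i + 1, j):     # IV. branch into two sub-structures
                best = max(best, F[i][k] + F[k + 1][j])
            F[i][j] = best
    return F


F = nussinov_fill("GGGAAAUCC")
print(F[0][len(F) - 1])  # maximum number of pairs for the whole string
```

The loop over k makes each cell cost O(n), so filling the whole table takes O(n^3) time and O(n^2) space.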

Example: S = GGGAAAUCC

        G  G  G  A  A  A  U  C  C
    G   0  0  0  0  0  0  1  2  3
    G   0  0  0  0  0  0  1  2  3
    G      0  0  0  0  0  1  2  2
    A         0  0  0  0  1  1  1
    A            0  0  0  1  1  1
    A               0  0  1  1  1
    U                  0  0  0  0
    C                     0  0  0
    C                        0  0

Two optimal foldings (3 pairs each):

    GGGAAAUCC
    ((.(..)))
    .((..()))
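Once the table is filled, a folding can be recovered by walking back from F[1,n] and asking which of the four cases produced each value. The following Python sketch shows one way to do this for the GGGAAAUCC example. It assumes Watson-Crick pairs only and 0-based indices, the function names are hypothetical, and the fill step is repeated in compact form only so that the snippet runs on its own. The order in which the cases are tested decides which of the equally good foldings is reported.

```python
# Assumption: only Watson-Crick pairs are treated as complementary.
COMPLEMENT = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def nussinov_fill(seq):
    """Fill F so that F[i][j] is the maximum number of pairs in seq[i..j]."""
    n = len(seq)
    F = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(F[i + 1][j], F[i][j - 1])
            if (seq[i], seq[j]) in COMPLEMENT:
                best = max(best, F[i + 1][j - 1] + 1)
            for k in range(i + 1, j):
                best = max(best, F[i][k] + F[k + 1][j])
            F[i][j] = best
    return F


def nussinov_traceback(seq, F):
    """Return one optimal folding of seq as a dot-bracket string."""
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue                          # nothing left to pair
        if F[i][j] == F[i + 1][j]:            # I.  S[i] unpaired
            stack.append((i + 1, j))
        elif F[i][j] == F[i][j - 1]:          # II. S[j] unpaired
            stack.append((i, j - 1))
        elif (seq[i], seq[j]) in COMPLEMENT and F[i][j] == F[i + 1][j - 1] + 1:
            structure[i], structure[j] = "(", ")"   # III. S[i] paired with S[j]
            stack.append((i + 1, j - 1))
        else:                                 # IV. branch
            for k in range(i + 1, j):
                if F[i][j] == F[i][k] + F[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return "".join(structure)


seq = "GGGAAAUCC"
print(nussinov_traceback(seq, nussinov_fill(seq)))  # prints .((..()))
```

Testing the cases in a different order (for example, trying the pairing case first) would report another folding with the same number of pairs.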

Question

How do you change Nussinov's algorithm to allow the computation of the stacking energy? Hint: think affine gap penalties.

Protein folding

  • Protein shape determines protein function
  • Protein sequence determines protein shape (Anfinsen’s experiment)
  • Levinthal’s paradox – space of possible protein conformations is exponentially large, yet proteins fold fast (μsec – minutes).
  • Corollary: proteins must “know” how to fold (i.e. they don’t search the entire space of conformations)
  • Note: much easier to find a protein's sequence than its structure