
ALGORITHMS AND DATA STRUCTURES IN BIOLOGY - GENOMICS, Outlines and concept maps for Sistemi Informatici (Information Systems)

Alma Mater Studiorum – Università di Bologna (UNIBO), Sistemi Informatici

Concepts of algorithm and computational complexity: definition of an algorithm, recursive and iterative algorithms, asymptotic notation. Exhaustive search algorithms: restriction mapping, motif finding. Greedy algorithms: sorting by reversals, approximation algorithms. Dynamic programming: edit distance, Manhattan distance. The divide-and-conquer technique.

Type: Outlines and concept maps

2024/2025

On sale since 25/02/2026

vivi-lerose 🇮🇹

5 documents


ALGORITHMS AND DATA STRUCTURES


algorithm = finite sequence of unambiguous instructions; an algorithm is correct for a combinatorial problem if the steps it dictates solve the problem

pseudocode = a way of specifying algorithms that can easily be translated into concrete programming languages

PROBLEMS AND COMPLEXITY

combinatorial problem = unambiguous and precise problem concerning the production of some outputs from some inputs

how to check correctness =

testing the algorithm = running it on sample inputs and checking the outputs it produces (experimental methodology)
proving the algorithm = mathematical proof that it does what it's supposed to do (analytical methodology)

TYPES OF PROBLEMS

recursive problems = base case (simplest case, which returns directly and ends the recursion) + recursive case (the function calls itself)

how to prove correctness =

identify a property that is helpful for the proof
base of induction = show that the algorithm satisfies the property in the base case
inductive hypothesis = assume that the algorithm satisfies the property for all the recursive cases up to the (n-1)-th one
inductive step = show that the algorithm satisfies the property for the n-th recursive case, based on the assumption that the inductive hypothesis is correct

sorting problems = arranging the elements of an array in order

selection sort = the array is scanned from position 0 to the end; at each position i, the smallest element among those from position i onward is found and swapped with the i-th element of the array
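The selection sort described above can be sketched in Python as follows. This is only an illustrative version; the function name `selection_sort` and the choice to sort a copy are assumptions, not part of the notes.

```python
def selection_sort(values):
    """Return a sorted copy of `values` using selection sort."""
    a = list(values)  # work on a copy so the input is left untouched
    for i in range(len(a)):
        # index of the smallest element in the unsorted remainder a[i:]
        smallest = min(range(i, len(a)), key=a.__getitem__)
        # swap it with the i-th element
        a[i], a[smallest] = a[smallest], a[i]
    return a


print(selection_sort([5, 2, 4, 1, 3]))
```

Each of the n positions needs a scan of the remainder, so the running time grows quadratically with the array length.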

merge sort = the array is divided in 2 and the halves are sorted separately, then merged back together by adding numbers to a new list one by one, always choosing the smallest one (to find the smallest one we only need to check the first number of each sorted list, since it is the smallest of that list)
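A minimal Python sketch of the merge sort idea above. The helper names `merge` and `merge_sort` are illustrative choices, not taken from the notes.

```python
def merge(left, right):
    """Merge two sorted lists by repeatedly taking the smaller first element."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # one of the two lists is exhausted; the rest of the other is already sorted
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Return a sorted copy of `values`."""
    if len(values) <= 1:  # base case: nothing to sort
        return list(values)
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))


print(merge_sort([6, 3, 8, 1, 7, 2]))
```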

EXHAUSTIVE SEARCH ALGORITHMS

exhaustive search algorithms = high complexity (typically applied to NP-hard problems: all possible candidate solutions in a domain are generated and searched one by one to find the solution) but easy to prove correct

properties = the domain must be finite, must contain the solution and must be ordered so that it can be searched through

explore the search space (space of tuples) in a straightforward way → trees = arrangement of tuples

motif finding problem = find frequent subsequences showing little variance (find the pattern of length l that differs least from the l-nucleotide subsequences chosen in the sequences, based on a scoring system that counts, position by position, the number of l-subsequences having in that position a nucleotide matching the one in the consensus string); the output is an array of t starting positions s maximizing the score

complexity = O(t*l*n^t)
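As a rough illustration of the exhaustive approach, the sketch below tries every tuple of starting positions and keeps the one with the highest consensus score. The function names and the 0-based positions are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def score(dna, starts, l):
    """Consensus score: per column, the count of the most frequent nucleotide."""
    motifs = [seq[s:s + l] for seq, s in zip(dna, starts)]
    return sum(Counter(col).most_common(1)[0][1] for col in zip(*motifs))


def brute_force_motif_search(dna, l):
    """Try all (n-l+1)^t tuples of 0-based starting positions."""
    n = len(dna[0])
    best_starts, best_score = None, -1
    for starts in product(range(n - l + 1), repeat=len(dna)):
        current = score(dna, starts, l)
        if current > best_score:
            best_starts, best_score = starts, current
    return best_starts, best_score


dna = ["ACGTAC", "TTACGT", "GACGTT"]
print(brute_force_motif_search(dna, 3))
```

The loop visits (n-l+1)^t tuples and each score costs about t*l operations, which matches the O(t*l*n^t) bound stated above.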

median string (less complex version of the motif finding problem) = find the string of length l with the smallest total distance to the sequences

hamming distance = given two strings u and v of the same length, the number of positions at which they differ

total distance between a string u and a t x n matrix of DNA = the minimum, over all tuples s of starting positions, of the sum of the hamming distances between u and the l-subsequences starting at s

complexity of total distance = O(n*l*t)

all 4^l possible strings of length l are generated and, if the total distance of a string is lower than the best distance, that total distance becomes the new best distance

complexity = O(n*l*t*4^l)
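A possible Python sketch of the median string search. The names are illustrative, and the total distance is computed as the sum of the per-sequence minimum Hamming distances, which equals the minimum over all tuples of starting positions because each sequence can be minimized independently.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(u, v))


def total_distance(word, dna):
    """Sum over the sequences of the best (smallest) distance to `word`."""
    l = len(word)
    return sum(
        min(hamming(word, seq[i:i + l]) for i in range(len(seq) - l + 1))
        for seq in dna
    )


def median_string(dna, l):
    """Try all 4^l words from 'AA...A' to 'TT...T' and keep the closest one."""
    best_word, best_distance = None, float("inf")
    for letters in product("ACGT", repeat=l):
        word = "".join(letters)
        distance = total_distance(word, dna)
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word, best_distance


print(median_string(["ACGTAC", "TTACGT", "GACGTT"], 3))
```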

restriction mapping = searching for the positions of restriction sites

complexity = O(max^(n-2)) (max = maximum element of the list, n = number of points)

build a table with the numbers present on the line as rows and columns (ordered from smallest to biggest), for each box compute the difference (column - row), keep only the positive values and put the results into a list

find the biggest number M in the list: it is the distance of the furthest point from zero (add the points 0 and M)

find the second biggest number d: it can be the distance of the new point either from 0 or from M, so the point is at d or at M - d and we don't know for sure which one (branching point)

repeat the previous step until the list is empty to find all the restriction points
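The branching procedure above can be sketched as a small backtracking search. This is an illustrative version under the assumption that the input is the full list of pairwise distances; the names `pairwise_distances`, `delta` and `partial_digest` are hypothetical.

```python
from collections import Counter


def pairwise_distances(points):
    """All positive differences between pairs of points (the table step)."""
    pts = sorted(points)
    return [b - a for i, a in enumerate(pts) for b in pts[i + 1:]]


def delta(y, points):
    """Multiset of distances between y and the points placed so far."""
    return Counter(abs(y - x) for x in points)


def partial_digest(lengths):
    """Return every point set whose pairwise distances equal `lengths`."""
    remaining = Counter(lengths)
    width = max(remaining)  # furthest point from zero
    remaining[width] -= 1
    solutions = []

    def place(points):
        if sum(remaining.values()) == 0:
            solutions.append(sorted(points))
            return
        y = max(k for k, count in remaining.items() if count > 0)
        # branching point: the new point is at y or at width - y
        for candidate in dict.fromkeys((y, width - y)):
            d = delta(candidate, points)
            if all(remaining[k] >= c for k, c in d.items()):
                remaining.subtract(d)        # use up the explained distances
                place(points + [candidate])
                remaining.update(d)          # undo the choice (backtrack)

    place([0, width])
    return solutions


L = [2, 2, 3, 3, 4, 5, 6, 7, 8, 10]
print(partial_digest(L))
```

On this input the search finds two mirror-image point sets, which is expected because reversing a map leaves all pairwise distances unchanged.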

trees = arrangement of tuples (all leaves have the same height h and every node has either a fixed number of children k (branching factor) or no children at all)

the total number of leaves is k^h

search every leaf

skip from one vertex to another

search all the vertices of the tree (leaves and internal vertices)

greedy algorithms = making locally optimal choices (never reconsidered afterwards) to lower the complexity (they generate a possibly non-optimal solution in polynomial time)

approximation algorithm = gives an approximate (correct but not optimal) solution to an optimization problem

evaluate the quality of the solution = distance of the solution A(input) from the optimal solution; OPT(input) = cost of the optimal solution

approximation ratio (AR) of an algorithm A on instances x of length n =

maximization problem AR(n) ≥ max (OPT(x) / A(x))
minimization problem AR(n) ≥ max (A(x) / OPT(x))

GREEDY ALGORITHMS

complexity of greedy motif search = O(l*n^2 + l*n*t)

sequence alignment = finding the function of newly sequenced genes by comparing their sequence with similar genes of known function

hamming distance = number of mismatches between the two sequences, assuming that the i-th symbol of one sequence is aligned with the i-th symbol of the other

edit distance = minimum number of editing operations (insertions, deletions, substitutions) needed to transform one string into the other

build an alignment grid = matrix with the two sequences as rows and columns
use a scoring function to assign weights to the edges depending on whether they represent matches, mismatches or gaps, in order to evaluate each alignment
generate a path where horizontal edges correspond to insertions, vertical edges correspond to deletions and diagonal edges correspond to substitutions or matches
the best-scoring path is the optimal alignment between the two initial sequences
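The grid construction above can be illustrated with the standard dynamic-programming computation of the edit distance. This sketch assumes unit cost for every insertion, deletion and substitution; the function name is an illustrative choice.

```python
def edit_distance(v, w):
    """Minimum number of insertions, deletions and substitutions turning v into w."""
    rows, cols = len(v) + 1, len(w) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # only vertical edges: delete the first i symbols of v
    for j in range(cols):
        d[0][j] = j  # only horizontal edges: insert the first j symbols of w
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = d[i - 1][j - 1] + (v[i - 1] != w[j - 1])  # match or substitution
            d[i][j] = min(d[i - 1][j] + 1,   # deletion (vertical edge)
                          d[i][j - 1] + 1,   # insertion (horizontal edge)
                          diagonal)
    return d[-1][-1]


print(edit_distance("kitten", "sitting"))
```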

global sequence alignment = similarities between the entire strings

local sequence alignment = similarities between substrings

knapsack problem = there are various objects that we can choose from but the weight that we can carry is limited; the algorithm chooses the maximum number of objects without exceeding the fixed total weight (to be implemented!)
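One way to implement the version described above, assuming the goal is to carry as many objects as possible within a weight limit, is to pick objects greedily from lightest to heaviest. The function name and the return value (indices of the chosen objects) are assumptions of this sketch.

```python
def max_objects(weights, capacity):
    """Greedy choice: take the lightest objects first while they still fit."""
    chosen = []
    remaining = capacity
    for index in sorted(range(len(weights)), key=weights.__getitem__):
        if weights[index] > remaining:
            break  # every later object is at least as heavy
        chosen.append(index)
        remaining -= weights[index]
    return chosen


print(max_objects([4, 2, 7, 1, 3], capacity=7))
```

For this particular objective (number of objects, no values attached) the lightest-first choice is optimal; with per-object values the greedy choice is no longer guaranteed to be.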

binpacking problem = pack items into the minimum number of boxes

takes as input an array of elements and an associated array with the sizes or weights of the elements, and returns an array of arrays containing the partitioning of the elements

firstfit = elements are added to the boxes one after another: if an element fits in one of the previous boxes we add it there, otherwise we add it to a new box
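A short sketch of first fit as described above. The `capacity` parameter (the size of each box) is an assumption, since the notes do not say how the box size is supplied.

```python
def first_fit(items, sizes, capacity):
    """Place each item in the first box with enough free space, else open a new box."""
    boxes = []  # boxes[k] is the list of items in box k
    free = []   # free[k] is the space still available in box k
    for item, size in zip(items, sizes):
        if size > capacity:
            raise ValueError(f"{item!r} does not fit in an empty box")
        for k, space in enumerate(free):
            if size <= space:
                boxes[k].append(item)
                free[k] -= size
                break
        else:
            # no existing box had room: open a new one
            boxes.append([item])
            free.append(capacity - size)
    return boxes


print(first_fit(["a", "b", "c", "d", "e"], [4, 8, 1, 4, 2], capacity=10))
```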

DIVIDE-AND-CONQUER ALGORITHMS

divide-and-conquer algorithm = split the input into two (or more) parts, solve them separately and then combine them

complexity = O(n*log(n))

the algorithm builds the matrix and keeps, for the middle column, an array containing the score of the best path from the initial point to the point of the middle column at position i (index of the array and row of the matrix)

then it computes the same array for the paths from the middle column to the end

at the end the two arrays are summed and the highest score corresponds to the best point i in the middle column

therefore we can split the matrix into two parts at position i and repeat to find the new best middle points and reconstruct the best path
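The middle-point step can be sketched as follows. As a simplifying assumption, the score used here is the length of the longest common subsequence (matches score 1, everything else 0), and the function names are illustrative.

```python
def last_column_scores(v, w):
    """Best score from the start to each row of the last column, keeping one column in memory."""
    prev = [0] * (len(v) + 1)
    for j in range(1, len(w) + 1):
        cur = [0] * (len(v) + 1)
        for i in range(1, len(v) + 1):
            match = prev[i - 1] + (1 if v[i - 1] == w[j - 1] else 0)
            cur[i] = max(cur[i - 1], prev[i], match)
        prev = cur
    return prev


def middle_vertex(v, w):
    """Row i of the middle column crossed by a best path, with the total score."""
    mid = len(w) // 2
    prefix = last_column_scores(v, w[:mid])  # start -> middle column
    # middle column -> end, computed on the reversed strings
    suffix = last_column_scores(v[::-1], w[mid:][::-1])[::-1]
    totals = [p + s for p, s in zip(prefix, suffix)]
    best_i = max(range(len(totals)), key=totals.__getitem__)
    return best_i, mid, totals[best_i]


print(middle_vertex("ATCGT", "ATGT"))
```

Applying the same step recursively to the two sub-rectangles on either side of the returned point rebuilds the whole path while only ever storing a couple of columns.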

questions

formal definition = given a set of DNA sequences, find a set of l-mers, one from each sequence, that maximises the consensus score; Score(s, DNA) is the sum, position by position, of the number of l-subsequences having in that position a nucleotide matching the one in the consensus string

input = t x n matrix of DNA (t = number of sequences, n = length of sequences) and the length l of the pattern

output = array of t starting positions s = (s1, ..., st) maximizing the score

used technique = greedy

complexity analysis = O(n^2*l + n*l*t)

def motif-finding(DNA, t, n, l):
    best-motif <- (1, ..., 1)
    s <- (1, ..., 1)
    for s1 <- 1 to n-l+1:
        for s2 <- 1 to n-l+1:
            if Score(s, 2, DNA) > Score(best-motif, 2, DNA):
                best-motif-1 <- s1
                best-motif-2 <- s2
    s1 <- best-motif-1
    s2 <- best-motif-2
    for i <- 3 to t:
        for si <- 1 to n-l+1:
            if Score(s, i, DNA) > Score(best-motif, i, DNA):
                best-motif-i <- si
        si <- best-motif-i
    return best-motif

hamming distance between two sequences = number of positions at which they differ

input = t x n matrix of DNA (t = number of sequences, n = length of sequences) and the length l of the pattern

output = a string word of l nucleotides that minimizes the total distance

used technique = exhaustive search

complexity analysis = O(n*l*t*4^l)

best-word <- 'AA...A'
best-distance <- ∞
for each l-mer word from 'AA...A' to 'TT...T':
    if total-distance(word, DNA) < best-distance:
        best-distance = total-distance(word, DNA)
        best-word = word
return best-word

def PDP(L, n):
    m <- maximum element in L
    for every set of n-2 integers 0 < x2 < ...

    ...
    if delta(width - y, X) is part of L:
        add width-y to X and remove lengths delta(width - y, X) from L
        place(L, X)
        remove width-y from X and add lengths delta(width - y, X) to L
    return

brute-force-PDP(L, n):
    m = maximum element of L
    for every set of n-2 integers 0 < x2 < ...

    ...
        if Score(s, DNA) > best-score:
            best-score = Score(s, DNA)
            best-motif = (s1, ..., st)
    return best-motif
O(t*l*n^t)

def motif-finding(DNA, t, n, l):
    for s1 <- 1 to n-l+1:
        for s2 <- 1 to n-l+1:
            if Score(s, 2, DNA) > Score(best-motif, 2, DNA):
                best-motif-1 <- s1
                best-motif-2 <- s2
    s1 <- best-motif-1
    s2 <- best-motif-2
    for i <- 3 to t:
        for si <- 1 to n-l+1:
            if Score(s, i, DNA) > Score(best-motif, i, DNA):
                best-motif-i <- si
        si <- best-motif-i
    return best-motif
O(n^2*l + n*l*t)

brute-force-median-string(DNA, t, n, l):
    best-word = 'AA...A'
    best-distance = ∞
    for each l-mer word from 'AA...A' to 'TT...T':
        if total-distance(word, DNA) < best-distance:
            best-distance <- total-distance(word, DNA)
            best-word <- word
    return best-word
O(n*l*t*4^l)
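For comparison with the pseudocode, here is a hedged Python sketch of the greedy motif search: it fixes the best pair of starting positions in the first two sequences, then adds one sequence at a time without revisiting earlier choices. Function names and 0-based positions are assumptions of this example.

```python
from collections import Counter


def score(motifs):
    """Consensus score of a list of equal-length l-mers."""
    return sum(Counter(col).most_common(1)[0][1] for col in zip(*motifs))


def greedy_motif_search(dna, l):
    """Greedy choice of one 0-based starting position per sequence (needs >= 2 sequences)."""
    n = len(dna[0])
    positions = range(n - l + 1)
    # exhaustive search restricted to the first two sequences
    s1, s2 = max(
        ((a, b) for a in positions for b in positions),
        key=lambda p: score([dna[0][p[0]:p[0] + l], dna[1][p[1]:p[1] + l]]),
    )
    starts = [s1, s2]
    motifs = [dna[0][s1:s1 + l], dna[1][s2:s2 + l]]
    # each remaining sequence: keep the position that scores best with the motifs so far
    for seq in dna[2:]:
        best = max(positions, key=lambda s: score(motifs + [seq[s:s + l]]))
        starts.append(best)
        motifs.append(seq[best:best + l])
    return starts


print(greedy_motif_search(["ACGTAC", "TTACGT", "GACGTT"], 3))
```

Because earlier choices are never reconsidered, the result can be worse than the one found by the exhaustive search, in exchange for a polynomial running time.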

Simple-Reversal-sort(π):
    for i <- 1 to n-1:
        j <- position of element i in π
        if j != i:
            π <- π · ρ(i, j)
            output π
        if π is the identity permutation:
            return
O(n^2)

improved-Reversal-sort(π):
    while b(π) > 0:
        if π has a decreasing strip:
            among all reversals choose ρ minimizing b(π · ρ)
        else:
            choose reversal ρ that flips an increasing strip in π
        π <- π · ρ
        output π
    return
O(n^3)

Manhattan-tourist(w↓, w→, n, m):
    s[0,0] <- 0
    for i <- 1 to n:
        s[i,0] <- s[i-1,0] + w↓[i,0]
    for j <- 1 to m:
        s[0,j] <- s[0,j-1] + w→[0,j]
    for i <- 1 to n:
        for j <- 1 to m:
            s[i,j] <- max(s[i-1,j] + w↓[i,j], s[i,j-1] + w→[i,j])
    return s[n,m]
O(n*m)

Longest-path(G):
    vertices-left = V
    edges-left = E
    result = []
    while vertices-left != ∅:
        top = vertex of vertices-left not having entering edges
        result.Append(top)
        vertices-left.Remove(top)
        edges-left.Remove(edges having an endpoint in top)
    return result
O(n + m)