Bio Informatics NCBI Gene Alignment, Exams of Nursing

Bio Informatics NCBI Gene Alignment

Typology: Exams

2025/2026

Uploaded on 03/24/2026

eugine-muhindi
eugine-muhindi 🇺🇸

375 documents

1 / 28

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
The scoring scheme for sequence alignments consist of character
substitution scores (score for each possible character replacement)
and penalties for gaps.
The score is based on matches, mismatches and gaps in the
sequence alignment.
Thus, the alignment score is the sum of substitution scores and gap
penalties and indicates if the alignment is optimal.
There are different methods for scoring an alignment:
1. Dot matrix / Plot methods
2. Dynamic programming methods
Methods used in Scoring an alignment
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c

Partial preview of the text

Download Bio Informatics NCBI Gene Alignment and more Exams Nursing in PDF only on Docsity!

The scoring scheme for sequence alignments consist of character

substitution scores (score for each possible character replacement)

and penalties for gaps.

The score is based on matches, mismatches and gaps in the

sequence alignment.

Thus, the alignment score is the sum of substitution scores and gap

penalties and indicates if the alignment is optimal.

There are different methods for scoring an alignment:

1. Dot matrix / Plot methods

2. Dynamic programming methods

Methods used in Scoring an alignment

Dot matrix method (Dot plot method)

This is a graphical way of comparing two sequences in a two dimensional matrix.

The two sequences to be compared are written in the horizontal and vertical axes

of the matrix.

Comparison is then done by scanning each residue of one sequence for similarity

with all residues in the other sequence.

If a residue match is found, a dot is placed within the graph.

When the two sequences have substantial regions of similarity, many dots line up

to form contiguous diagonal lines, which reveal the sequence alignment.

Interruptions in the middle of a diagonal line indicate insertions or deletions.

Parallel diagonal lines within the matrix represent repetitive regions of the

sequences

Dot matrix /plot analysis Online tools e.g NCBI BLAST

Example

>BAG32276.

MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHG

GGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIH

FGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMMERVV

EQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG

>ABP65297.

MVKSHIGSWLLVLFVATWSDIGFCKKWPKPGGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQ

PHGGGWGQPHGGGWGQPHGGGGWGQGGGSHGQWGKPSKPKTNMKHVAGAAAAGAVVGGLGGYMLGSAMSR

PLIHFGNDYEDRYYRENMYRYPNQVYYKPVDQYSNQNNFVHDCVNITVKQHTVTTTTKGENFTETDMKIM

ERVVEQMCVTQYQRESEAYYQKGASAILFSPPPVILLISLLILLIVG

5

Procedure:

Step 1: Open the home page of NCBI BLAST through a web browser.

Step 2: Select “Nucleotide BLAST” for comparing two nucleotide

sequences, or “Protein BLAST” for comparing two protein sequences.

Step 3: Click the checkbox “Align two or more sequences” to change

the default layout to perform pairwise sequence alignment.

Step 4: Enter accession number(s), gi(s), or FASTA sequence(s) in

the text boxes (or upload FASTA format files) and click “BLAST”

button.

Step 5: Pairwise alignment result is displayed. Click the “Dot

Matrix /Plot View” link to expand the result of dot matrix.

Step 6: Click on MSA Viewer and zoom to sequence to view the

Dynamic programming methods

Dynamic programming methods is used to find the optimal alignment

between two proteins or nucleic acid sequences by comparing all

possible pairs of characters in the sequences.

Dynamic programming can be used to produce both global and local

alignments based on the two different algorithms: Needleman-Wunsch

algorithm and Smith-Waterman algorithm.

Needleman–Wunsch algorithm

This is used for global alignment and focuses on getting a maximum

score for the full-length sequence alignment.

This is however, risks missing the best local similarity thus it is used for

aligning two closely related sequences that are of the same length.

Smith–Waterman algorithm

This is is used for local alignment and it is considered a database search

algorithm.

This is because, the divergence level between the sequences to be aligned is not

easily known and sequence lengths are usually unequal.

Thus, identification of regional sequence similarity is of greater significance

than finding a match that includes all residues.

Positive scores are assigned for matching residues, zero for mismatches and

negative scores are not used.

Gaps are inserted where necessary.

The local alignment path may begin and end internally and gets the highest

score at the expense of the overall score for a full-length alignment.

Thus approach is used for aligning divergent sequences or sequences with

multiple domains of different origins.

Variants of BLAST

Nucleotide BLAST (BLASTN) queries nucleotide sequences with a nucleotide

sequence database.

Protein BLAST (BLASTP) uses protein sequences as queries to search against

a protein sequence database.

BLASTX uses nucleotide sequences as queries and translates them in all six

reading frames to produce translated protein sequences, which are used to query a

protein sequence database

TBLASTN queries protein sequences to a nucleotide sequence

database with the sequences translated in all six reading frames.

TBLASTX uses nucleotide sequences, which are translated in all

six frames, to search against a nucleotide sequence database that

has all the sequences translated in six frames.

Others types of BLAST for protein sequences:

PSI BLAST:‐ Allows the user to build a PSSM (position-specific

scoring matrix) using the results of the first BlastP run.

PHI BLAST:‐ Performs the search but limits alignments to those that

match a pattern in the query.

DELTA BLAST:‐ PSSM using the results of a Conserved Domain

Database search and searches a sequence database.