Bioinformatics: Sequence Alignment and Dynamic Programming, Exams of Bioinformatics

A set of chapters, each with multiple-choice questions on bioinformatics.

Typology: Exams

2022/2023

Uploaded on 06/13/2023

corbyn-jason

Bioinformatics Questions and Answers – Global Sequence Alignment

  1. When did Needleman-Wunsch first describe the algorithm for global alignment? a) 1899 b) 1970 c) 1930 d) 1950
  2. Which of the following does not describe dynamic programming? a) The approach compares every pair of characters in the two sequences and generates an alignment, which is the best or optimal b) Global alignment algorithm is based on this method c) Local alignment algorithm is based on this method d) The method can be useful in aligning protein sequences to protein sequences only
  3. Which of the following is not an advantage of the Needleman-Wunsch algorithm? a) New algorithmic improvements as well as increasing computer capacity make it possible to align a query sequence against a large DB in a few minutes. b) Similar sequence regions are of the same order and orientation. c) This does not help in determining evolutionary relationships d) If you have 2 genes that are already understood as closely related, then this type of algorithm can be used to understand them in further detail
  4. Which of the following is not a disadvantage of Needleman-Wunsch algorithm? a) This method is comparatively slow b) There is a need of intensive memory c) This cannot be applied on genome sized sequences d) This method can be applied to even large sized sequences
  5. Which of the following does not describe global alignment algorithm? a) In initialization step, the first row and first column are subject to gap penalty b) Score can be negative c) In trace back step, beginning is with the cell at the lower right of the matrix and it ends at top left cell d) First row and first column are set to zero
  6. Which of the following does not describe PAM matrices? a) These matrices are used in optimal alignment scoring b) It stands for Point Altered Mutations c) It stands for Point Accepted Mutations d) It was first developed by Margaret Dayhoff
  7. Which of the following is untrue regarding the scoring system used in dynamic programming? a) If the residues are same in both the sequences the match score is assumed as +5 which is added to the diagonally positioned cell of the current cell. b) If the residues are not same, the mismatch score is assumed as -3. c) If the residues are not same, the mismatch score is assumed as 3. d) The score should be added to the diagonally positioned cell of the current cell.
  8. Which of the following does not describe global alignment algorithm? a) Score can be negative in this method. b) It is based on dynamic programming technique. c) For two sequences of length m and n, the matrix to be defined should be of dimensions m+1 and n+1. d) For two sequences of length m and n, the matrix to be defined should be of dimensions m and n.
  9. Which of the following does not describe global alignment algorithm? a) It attempts to align every residue in every sequence b) It is most useful when the aligning sequences are similar and of roughly the same size c) It is useful when the aligning sequences are dissimilar d) It can use Needleman-Wunsch algorithm
  10. Which of the following is wrong in case of substitution matrices? a) They determine likelihood of homology between two sequences b) They use system where substitutions that are more likely should get a higher score c) They use system where substitutions that are less likely should get a lower score d) BLOSUM-X type uses logarithmic identity to find similarity
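
The global alignment procedure these questions refer to can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap -2) are illustrative assumptions. It builds an (m+1) x (n+1) score matrix, fills the first row and column with cumulative gap penalties, and traces back from the lower-right cell to the top-left cell.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment sketch; returns (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not prescribed by the source.
    """
    m, n = len(a), len(b)
    # (m+1) x (n+1) matrix; first row/column hold cumulative gap penalties.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for j in range(1, n + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # diagonal: align two residues
                score[i - 1][j] + gap,            # up: gap in b
                score[i][j - 1] + gap,            # left: gap in a
            )

    # Traceback from the lower-right cell to the top-left cell.
    out_a, out_b = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[m][n], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `needleman_wunsch("ACGT", "AGT")` aligns every residue of both sequences, so the shorter sequence receives a gap. Both the time and the memory cost grow with m x n, which is why the method is considered slow and memory-intensive for very long sequences.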

Bioinformatics Questions and Answers – Local Sequence Alignment

  1. When did Smith–Waterman first describe the algorithm for local alignment? a) 1950 b) 1970 c) 1981 d) 1925
  2. Which of the following does not describe local alignment? a) A local alignment aligns a substring of the query sequence to a substring of the target sequence. b) A local alignment is defined by maximizing the alignment score, so that deleting a column from either end would reduce the score, and adding further columns at either end would also reduce the score. c) Local alignments have terminal gaps. d) The substrings to be examined may be all of one or both sequences; if all of both are included then the local alignment is also global.
  3. Which of the following does not describe local alignment algorithm? a) Score can be negative b) Negative score is set to 0 c) First row and first column are set to 0 in initialization step d) In traceback step, beginning is with the highest score, it ends when 0 is encountered
  4. Local alignments are more used when_____________ a) There are totally similar and equal length sequences b) Dissimilar sequences are suspected to contain regions of similarity c) Similar sequence motif with larger sequence context d) Partially similar, different length and conserved region containing sequences
  5. Which of the following does not describe BLOSUM matrices? a) It stands for BLOcks SUbstitution Matrix b) It was developed by Henikoff and Henikoff c) The year it was developed was 1992 d) These matrices are logarithmic identity values
  6. Which of the following is untrue regarding the gap penalty used in dynamic programming? a) Gap penalty is subtracted for each gap that has been introduced b) Gap penalty is added for each gap that has been introduced c) The gap score defines a penalty given to alignment when we have insertion or deletion d) Gap open and gap extension has been introduced when there are continuous gaps (five or more)
  7. Among the following, which one is not the approach to the local alignment? a) Smith–Waterman algorithm b) K-tuple method c) Words method d) Needleman-Wunsch algorithm
  8. Which of the following does not describe k-tuple methods? a) k-tuple methods are best known for their implementation in the database search tools FASTA and the BLAST family b) They are also known as words methods c) They are basically heuristic methods to find local alignment d) They are useful in small scale databases
  9. Which of the following does not describe BLAST? a) It stands for Basic Local Alignment Search Tool b) It uses word matching like FASTA c) It is one of the tools of the NCBI d) Even if no words are similar, there is an alignment to be considered
  10. Which of the following is untrue regarding BLAST and FASTA? a) FASTA is faster than BLAST b) FASTA is the most accurate c) BLAST has limited choices of databases d) FASTA is more sensitive for DNA-DNA comparisons
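
For comparison with the global case, here is a minimal Python sketch of Smith–Waterman local alignment. The function name `smith_waterman` and the scoring values (match +2, mismatch -1, linear gap -2) are illustrative assumptions. The first row and column are initialized to 0, any negative cell value is replaced by 0, and the traceback starts at the highest-scoring cell and stops when a 0 is reached.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment sketch; returns (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not prescribed by the source.
    """
    m, n = len(a), len(b)
    # First row and first column stay at 0.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Negative values are clamped to 0.
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + sub(i, j),
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the highest-scoring cell until a 0 is encountered.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

With `smith_waterman("TTTACGTTT", "GGGACGGGG")` only the shared `ACG` region is reported, since extending the alignment in either direction would lower the score. Filling the matrix takes time proportional to m x n.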

Bioinformatics Questions and Answers – Dynamic Programming Algorithm for Sequence Alignment

  1. Which of the following is incorrect regarding pairwise sequence alignment? a) The most fundamental process in this type of comparison is sequence alignment. b) It is an important first step toward structural and functional analysis of newly determined sequences. c) This is the process by which sequences are compared by searching for common character patterns and establishing residue–residue correspondence among related sequences. d) It is the process of aligning multiple sequences.
  2. Which of the following is incorrect about evolution? a) The macromolecules can be considered molecular fossils that encode the history of millions of years of evolution. b) The building blocks of these biological macromolecules, nucleotide bases, and amino acids form linear sequences that determine the primary structure of the molecules. c) DNA and proteins are products of evolution d) The molecular sequences barely undergo changes
  3. The presence of evolutionary traces is because some of the residues that perform key functional and structural roles tend to be preserved by natural selection; other residues that may be less crucial for structure and function tend to mutate more frequently. a) True b) False
  4. The degree of sequence variation in the alignment reveals evolutionary relatedness of different sequences, whereas the conservation between sequences reflects the changes that have occurred during evolution in the form of substitutions, insertions, and deletions. a) True b) False
  5. If the two sequences share significant similarity, it is extremely ______ that the extensive similarity between the two sequences has been acquired randomly, meaning that the two sequences must have derived from a common evolutionary origin. a) unlikely b) possible c) likely d) relevant
  6. Sometimes, it is also possible that two sequences have derived from a common ancestor, but may have diverged to such an extent that the common ancestral relationships are not recognizable at the sequence level. a) True b) False
  7. Which of the following is incorrect regarding sequence homology? a) Two sequences can have a homologous relationship even if they do not have a common origin b) It is an important concept in sequence analysis c) When two sequences are descended from a common evolutionary origin, they are said to have a homologous relationship. d) When two sequences are descended from a common evolutionary origin, they are said to share homology.
  8. Sequence similarity can be quantified using ________; homology is a ______ statement. a) percentages, quantitative b) percentages, qualitative c) ratios, qualitative d) ratios, quantitative
  9. Shorter sequences require higher cutoffs for inferring homologous relationships than longer sequences. a) True b) False
  10. Sequence similarity and sequence identity are synonymous for nucleotide sequences and protein sequences as well. a) True b) False
  11. Which of the following is untrue about iterative approach? a) The iterative approach is based on the idea that an optimal solution can be found by repeatedly modifying existing suboptimal solutions. b) Because the order of the sequences used for alignment is different in each iteration. c) This method is also heuristic in nature and does not have guarantees for finding the optimal alignment. d) This method is not based on heuristic methods.

Bioinformatics Questions and Answers – Needleman – Wunsch Algorithm

  1. Which of the following is not the objective to perform sequence comparison? a) To observe patterns of conservation b) To find the common motifs present in both sequences c) To study the physical properties of molecules d) To study evolutionary relationships
  2. A dotplot is a visual and qualitative technique whereas the sequence alignment is an exact and quantitative measure of similarity of alignments. a) True b) False
  3. The global sequence alignment is suitable when the two sequences are of dissimilar length, with a negligible degree of similarity throughout. a) True b) False Explanation: Global sequence alignment is suitable when the two sequences are of similar length, with a significant degree of similarity throughout; it finds the best alignment over the entire length of the two sequences.
  4. The alignment score is the sum of substitution scores and gap penalties in this type of algorithm. a) True b) False
  5. The substitution matrices are rarely used in this type of matching. a) True b) False
  6. Which of the following is untrue about Protein substitution matrices? a) They are significantly more complex than DNA scoring matrices. b) They have the N x N matrices of the amino acids. c) Protein substitution matrices have quite important role in evolutionary studies. d) They are significantly less complex than DNA scoring matrices.
  7. In Needleman-Wunsch algorithm, the gaps are scored -2. a) True b) False
  8. The number of possible global alignments between two sequences of length N is _____ Answer: b
  9. Which of the following is untrue about Needleman-Wunsch algorithm? a) It is an example of dynamic programming b) Basic idea here is to build up the best alignment by using optimal alignments of larger subsequences. c) It was first used by Saul Needleman and Christian Wunsch d) It was first used in 1970
  10. There are two types of matrices involved in the study: score matrices and trace matrices. a) True b) False
  1. A main application of pairwise alignment is retrieving biological sequences in databases based on similarity. a) True b) False
  2. Dynamic programming method is the fastest and most practical method. a) True b) False
  3. Which of the following is not one of the requirements for implementing algorithms for sequence database searching? a) Size of the dataset b) Sensitivity c) Specificity d) Speed
  4. Sensitivity refers to the ability to find as many correct hits as possible. a) True b) False
  5. The specificity refers to the ability to include incorrect hits. a) True b) False
  6. In heuristic methods, speed doesn’t vary with the size of database. a) True b) False
  7. An increase in sensitivity is associated with _______ in selectivity. a) no specific change b) increase c) decrease d) exponential increase
  8. Which of the following is incorrect? a) Smith–Waterman algorithm is the fastest b) Smith–Waterman algorithm is comparatively slower method c) To speed up comparison, heuristic methods are used d) Heuristic algorithms perform faster searches
  9. Currently, there are two major heuristic algorithms for performing database searches: BLAST and FASTA. a) True b) False
  10. Which of the following is incorrect about the ‘word’ method? a) Both BLAST and FASTA use a heuristic word method. b) Word method is used for fast pairwise sequence alignment in BLAST and FASTA. c) The basic assumption is that two related sequences must have at least one word in common. d) The basic assumption is that two related sequences must have zero words in common.

Bioinformatics Questions and Answers – Basic Local Alignment Search Tool (BLAST)

  1. The BLAST program was developed in _______ a) 1992 b) 1995 c) 1990 d) 1991
  2. In sequence alignment by BLAST, each word from query sequence is typically _______ residues for protein sequences and _______ residues for DNA sequences. a) ten, eleven b) three, three c) three, eleven d) three, ten
  3. In sequence alignment by BLAST, the second step is to search a sequence database for the occurrence of these words. a) True b) False
  4. The final step involves pairwise alignment by extending from the words in both directions while counting the alignment score using the same substitution matrix. a) True b) False
  5. A recent improvement in the implementation of BLAST is the ability to provide gapped alignment. a) True b) False
  6. Which of the following is not a variant of BLAST? a) BLASTN b) BLASTP c) BLASTX d) TBLASTNX
  7. BLASTX uses protein sequences as queries to search against a protein sequence database. a) True b) False
  8. TBLASTX queries protein sequences against a nucleotide sequence database with the sequences translated in all six reading frames. a) True b) False
  9. Which of the following is not correct about BLAST? a) The BLAST web server has been designed in such a way as to simplify the task of program selection. b) The programs are organized based on the type of query sequences c) The programs are organized based on the type of nucleotide sequences, or nucleotide sequence to be translated d) BLAST is not based on heuristic searching methods
  10. If one is looking for protein homologs encoded in newly sequenced genomes, one may use TBLASTN, which translates nucleotide database sequences in all six open reading frames. a) True b) False
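
The word-based first steps described in these questions (break the query into short words, then look for those words in a database sequence) can be illustrated with a small Python sketch. The helper names `query_words` and `shared_words`, the example sequences, and the word size passed in are hypothetical choices for illustration; real BLAST also scores neighboring words and extends hits, which this sketch does not attempt.

```python
def query_words(seq, word_size):
    """Return every overlapping word of length word_size in seq."""
    return [seq[i:i + word_size] for i in range(len(seq) - word_size + 1)]


def shared_words(query, subject, word_size):
    """Return the set of exact words that occur in both sequences."""
    return set(query_words(query, word_size)) & set(query_words(subject, word_size))


# Made-up peptide sequences, used only to show the mechanics.
words = query_words("MKTAYIAK", 3)
hits = shared_words("MKTAYIAK", "GGTAYIGG", 3)
```

Here `words` holds the six overlapping three-letter words of the query, and `hits` holds the words found in both sequences. Each such hit is a candidate seed from which an alignment could be extended in both directions.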

Bioinformatics Questions and Answers – Comparison of FASTA and BLAST

  1. The rigorous dynamic programming method is normally not used for database searching, because it is slow and computationally expensive. a) True b) False
  2. FASTA and BLAST are __________ but __________ for larger datasets. a) faster, more sensitive b) faster, less sensitive c) slower, less sensitive d) slower, more sensitive
  3. ScanPS is a web-based program that implements a modified version of the Needleman-Wunsch algorithm. a) True b) False
  4. ParAlign is a web-based server that uses parallel processors to perform exhaustive sequence comparisons using either a parallelized version of the Smith–Waterman algorithm or a heuristic program for further speed gains. a) True b) False
  5. In Smith–Waterman algorithm, in the initialization step, the _________ row and ________ column are subject to gap penalty. a) first, first b) first, second c) second, first d) first, last
  6. Local sequence alignments are necessary for many cases out of which one is repeats. a) True b) False
  7. In SW algorithm, to align two sequences of lengths of m and n, _____ time is required. a) O(mn) b) O(m^2 n) c) O(m^2 n^3) d) O(mn^2)
  8. One of the challenges in SWA is obtaining correct alignments in regions of low similarity between distantly related biological sequences. a) True b) False
  9. Score can be negative in Smith–Waterman algorithm. a) True b) False
  10. The function of the scoring matrix is to conduct one-to-one comparisons between all components in two sequences and record the optimal alignment results. a) True b) False

Bioinformatics Questions and Answers – Comparative Genomics

  1. Which of the following is untrue about comparative genomics? a) It is comparison of whole genomes from different organisms b) It includes comparison of gene number, gene location, and gene content from these genomes c) It provides insights into the mechanism of genome evolution and gene transfer among genomes d) It doesn’t help to reveal the extent of conservation among genomes
  2. Which of the following is untrue about Whole Genome Alignment? a) This helps to reveal the presence of conserved functional elements b) It doesn’t help to understand sequence conservation between genomes c) It can be accomplished through direct genome comparison or genome alignment d) The alignment at the genome level is fundamentally no different from the basic sequence alignment
  3. Which of the following is untrue about LAGAN? a) It stands for Limited Area Global Alignment of Nucleotides. b) It is a web-based program designed for pairwise alignment of small fragments of genomes only. c) It first finds anchors between two genomic sequences using an algorithm that identifies short, exactly matching words. d) Regions that have high density of words are selected as anchors.
  4. A minimal genome is a _____ set of genes required for maintaining a free-living cellular organism. a) maximum b) maximal c) highest number of set of d) minimal
  5. Coregenes is a web-based program that determines a ____ set of genes based on comparison of ____ small genomes. a) vast, four b) core, fifteen c) core, four d) vast, fifteen
  6. Which of the following is untrue about Lateral gene transfer? a) It is also known as vertical gene transfer. b) There is exchange of genetic materials between species. c) It mainly occurs among prokaryotic organisms when foreign genes are acquired through mechanisms. d) One of the examples is transformation.
  7. A way to discern lateral gene transfer is through phylogenetic analysis, referred to as an ‘among-genome’ approach, which can be used to discover __________. a) recent lateral gene transfer events but almost negligible ancient events b) recent lateral gene transfer events c) ancient lateral gene transfer events d) both recent and ancient lateral gene transfer events.
  8. Within-Genome Approach is to identify regions within a genome with unusual compositions. a) True b) False
  9. Which of the following is untrue about Gene Order Comparison? a) When the order of a number of linked genes is conserved between genomes, it is called synteny b) Generally, gene order is much more conserved compared with gene sequences. c) Generally, gene order is much less conserved compared with gene sequences. d) It is in fact rarely observed among divergent species.
  10. Genes involved in the same metabolic pathway tend to be clustered among phylogenetically diverse organisms. a) True b) False

Bioinformatics Questions and Answers – Comparative Genomics – 1

  1. Which of the given statements is incorrect about Grouping Sequences? a) The problem of deciding which sequences to include in the same group or cluster and which to separate into different groups or clusters is a recurring one. b) Divergence is necessary, but the sequences chosen should be clearly related based on inspection of each pair-wise alignment and a statistical analysis. c) The conservative approach is to group distinct sequences. d) The adventurous approach is to choose a set of marginally alignable sequences to pursue the difficult task of making a multiple sequence alignment and then to make profile models that may recognize divergence but will also give false predictions.
  2. Which of the given statements is incorrect about Clusters of orthologous groups? a) Using the protein from one of the organisms to search the proteome of the other for high-scoring matches should identify the ortholog as the highest- scoring match, or best hit. b) When entire proteomes of the two organisms are available, orthologs may be identified. c) a pair of orthologous genes in two organisms share so much sequence similarity that they may be assumed to have arisen from a common ancestor gene. d) each of the orthologs belongs to a family composed of paralogous sequences but irrelevant or not related to each other.
  3. Which of the given statements is incorrect about Clusters of orthologous groups? a) Paralogs may include a best hit or a high-scoring match of one of the sequences by another, but the reciprocal match can have low similarity that does not have to be significant. b) Paralogs defined by sets of three matching sequences in the selected organisms were kept separated from the clusters. c) Orthologous pairs were first defined by the best hits in reciprocal searches. d) To produce COGs, similarity searches were performed among the proteomes of phylogenetically distinct clades of prokaryotes.
  4. Which of the given statements is incorrect about the Comparison of proteomes to EST databases of an organism? a) ESTs are single DNA sequence reads that contain a small fraction of incorrect base assessments, insertions, and deletions. b) Many sequences arise from near the 5’ end of the mRNA, although every effort is usually made to read as far 3’ as possible into the upstream portion of the cDNA. c) EST libraries are useful for preliminary identification of genes by database similarity searches. d) An EST database of an organism can be analyzed for the presence of gene families, orthologs, and paralogs.
  5. Which of the given statements is incorrect about Searching for orthologs to a protein family in an EST database? a) Searches of EST databases for matches to a query sequence routinely produce minimal amounts of output that must be searched manually for significant hits. b) ESTs with a high percent identity with the query sequence, a long alignment with the query sequence, and a very low E value of the alignment score represent groups of paralogous and orthologous genes. c) To identify orthologs as the most closely related sequence, ESTs were aligned using the amino acid alignment as a guide. d) To identify orthologs as the most closely related sequence, a phylogenetic tree was produced by the maximum likelihood method.
  6. Which of the given statements is incorrect about Family and Domain Analysis? a) Gene identification of predicted proteins in the genome is designed to discover the metabolic features of an organism b) In a particular organism or group of organisms, one particular domain can be expanded to perform a particular function. c) Comparison of the domain content of an entire proteome with that of another proteome cannot help in revealing the biological roles of diverse domains in different organisms d) Different proteins are mosaics of domains that occur in different combinations in a given protein.
  7. Which of the given statements is incorrect about Ancient Conserved Regions? a) The method involves database similarity searches of the SwissProt database with human, worm, yeast, or E. coli genes and identification of matches with sequences from a different phylum than the query sequence. b) An analysis of ACRs that predate the radiation of the major animal phyla some 580–540 million years ago suggested that 50–60% of coding sequences are ACRs. c) These ACRs may represent proteins present at the time of the prokaryotic–eukaryotic divergence. d) Phylogenetically diverse groups of organisms have been analyzed for the presence of conserved proteins and protein domains that have been conserved over long periods of evolutionary time, called ancient conserved regions or ACRs.
  8. Which of the given statements is incorrect about Horizontal Gene Transfer? a) The genomes of most organisms are derived by vertical transmission, the inheritance of chromosomes from parents to offspring from one generation to the next b) It is the acquisition of genetic material from a different organism. c) The transferred material becomes a temporary addition to the recipient genome d) An extreme example is the proposed endosymbiont origin of mitochondria in eukaryotic cells and chloroplasts in plants.
  9. Which of the given statements is incorrect about Horizontal Gene Transfer? a) It is a significant source of genome variation in bacteria, allowing them to exploit new environments b) Such transfer is rendered possible by a variety of natural mechanisms in bacteria for transferring DNA from one species to another. c) Detection of HT is made possible by the fact that each genome of each bacterial species has a unique base composition d) The time of transfer of DNA cannot be estimated by the composition of the HT DNA.
  10. Annotation is based on finding significant alignment to sequences of known function in database similarity searches. a) True b) False

Bioinformatics Questions and Answers – Phylogenetics Basics

  1. A gene phylogeny only describes the evolution of a particular gene or encoded protein. a) True b) False
  2. Evolution of a particular sequence _______ correlate with the evolutionary path of the species. a) does not b) always c) does not necessarily d) invariably
  3. The species evolution is the ______ of evolution by _____ in a genome. a) combined result, multiple genes b) result, single genes c) result, sole genes d) distinct results, single gene
  4. To obtain a species phylogeny, phylogenetic trees from a variety of gene families need to be constructed. a) True b) False
  5. It is often desirable to define the root of a tree. There are two ways to define the root of a tree. One is to use an outgroup, which ______ a) is a sequence that is homologous to the sequences under consideration b) is separated from those sequences at an early evolutionary time c) is generally determined from independent sources of information d) is generally determined from similar or related sources of information.
  6. Which of the following is incorrect statement about the Kimura model? a) It is a model to correct evolutionary distances and is a more sophisticated model b) In this, the mutation rates for transitions and transversion are assumed to be different c) According to this model, transitions occur more frequently than transversions d) According to this model, transversions occur more frequently than transitions.
  7. Which of the following is incorrect statement about Choosing Substitution Models? a) There is one substitution at a particular position, in divergent sequences b) The evolutionary divergence is beyond the ability of the statistical models to correct c) The statistical models used to correct homoplasy are called substitution models or evolutionary models. d) For constructing DNA phylogenies, there are nucleotide substitution models available
  8. The second step in phylogenetic analysis is to construct sequence alignment. This is probably the most critical step in the procedure because it establishes positional correspondence in evolution. a) True b) False
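
As an illustration of a substitution model that treats transitions and transversions separately, the following Python sketch computes the standard Kimura two-parameter distance, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the observed fractions of transitional and transversional differences. The helper names are hypothetical, and the sketch assumes two ungapped, equal-length DNA sequences over A, C, G, T.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_fractions(seq1, seq2):
    """Return (P, Q): fractions of sites differing by a transition / transversion.

    Assumes ungapped, equal-length sequences containing only A, C, G, T.
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n


def kimura_distance(p, q):
    """Kimura two-parameter distance from transition (p) and transversion (q) fractions."""
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

For example, `transition_transversion_fractions("AAAA", "AGAC")` counts one transition (A to G) and one transversion (A to C) over four sites, and the resulting fractions can be passed to `kimura_distance`.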

Bioinformatics Questions and Answers – Forms of Tree Representation

  1. Which of the following is incorrect statement? a) In a phylogram, the branch lengths represent the amount of evolutionary divergence b) Trees like cladogram are said to be scaled c) The scaled trees have the advantage of showing both the evolutionary relationships and information about the relative divergence time of the branches. d) In a cladogram, the external taxa line up neatly in a row or column
  2. Which of the following is incorrect statement about Newick Format? a) It was designed to provide information of tree topology to computer programs without having to draw the tree itself b) In this format, trees are represented by taxa excluded in nested parentheses. c) In this linear representation, each internal node is represented by a pair of parentheses d) For a tree with scaled branch lengths, the branch lengths in arbitrary units are placed immediately after the name of the taxon separated by a colon.
  3. Sometimes a tree-building method may result in several equally optimal trees. A consensus tree can be built by showing the commonly resolved bifurcating portions and collapsing the ones that disagree among the trees, which results in a polytomy. a) True b) False
  4. The number of rooted trees ($N_R$) for $n$ taxa is ______ a) $N_R = (2n-3)! / [2^{n+2}(n-2)!]$ b) $N_R = (2n-3)! / [2^{n}(n-2)!]$ c) $N_R = (2n-3)! / [2^{n-2}(n-5)!]$ d) $N_R = (2n-3)! / [2^{n-2}(n-2)!]$
  5. For unrooted trees, the number of unrooted tree topologies ($N_U$) is ________ a) $N_U = (2n-5)! / [2^{n-3}(n-5)!]$ b) $N_U = (2n-5)! / [2^{n-3}(n-3)!]$ c) $N_U = (2n-5)! / [2^{-2}(n-3)!]$ d) $N_U = (2n-5)! / [2^{n}(n-3)!]$
  6. It can be computationally very demanding to find a true phylogenetic tree when the number of sequences is large. a) True b) False
  7. Which of the following is incorrect statement about Molecular Markers? a) For studying very closely related organisms, protein sequences are preferred b) The decision to use nucleotide or protein sequences depends on the purposes of the study. c) For constructing molecular phylogenetic trees, one can use either nucleotide or protein sequence data d) The decision to use nucleotide or protein sequences depends on the properties of the sequences.
  8. For studying the evolution of ________ divergent groups of organisms, one may choose either ______ nucleotide sequences, such as ribosomal RNA or protein sequences. a) less widely, slowly evolving b) more widely, slowly evolving c) more widely, rapidly evolving d) less widely, rapidly evolving
  9. In many cases, ______ sequences are preferable to ______ sequences because they are relatively ____ conserved. a) protein, nucleotide, less b) nucleotide, protein, less c) protein, nucleotide, more d) nucleotide, protein, more.
  10. Protein sequences can remain the same while the corresponding DNA sequences have more room for variation. a) True b) False
  11. DNA sequences are sometimes more biased than protein sequences because of preferential codon usage in different organisms. a) True b) False
  12. In Jukes–Cantor Model to correct evolutionary distances, a formula for deriving evolutionary distances that include hidden changes is introduced by using a logarithmic function. It is ____ a) $d_{AB} = -(3/4) \log[1 - (4/7)p_{AB}]$ b) $d_{AB} = -(3/4) \ln[1 - (5/3)p_{AB}]$ c) $d_{AB} = -(3/4) \log[1 - (4/3)p_{AB}]$ d) $d_{AB} = -(3/4) \ln[1 - (4/3)p_{AB}]$
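
The counting and distance formulas that appear in this section can be checked numerically. The Python sketch below uses the standard expressions for the number of rooted and unrooted bifurcating tree topologies, (2n-3)!/[2^(n-2) (n-2)!] and (2n-5)!/[2^(n-3) (n-3)!], and the Jukes–Cantor correction d = -(3/4) ln[1 - (4/3)p]. The function names are hypothetical, and the input checks are assumptions added for safety.

```python
import math


def num_rooted_trees(n):
    """Number of rooted bifurcating tree topologies for n >= 2 taxa."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    return math.factorial(2 * n - 3) // (2 ** (n - 2) * math.factorial(n - 2))


def num_unrooted_trees(n):
    """Number of unrooted bifurcating tree topologies for n >= 3 taxa."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    return math.factorial(2 * n - 5) // (2 ** (n - 3) * math.factorial(n - 3))


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance from the observed fraction of differing sites p."""
    if not 0 <= p < 0.75:
        # The logarithm is undefined once p reaches 3/4.
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)
```

The tree counts grow very quickly with the number of taxa, which is why an exhaustive search for the best tree becomes computationally demanding for large data sets. The corrected distance is always at least as large as the observed fraction p, because it accounts for hidden multiple substitutions at the same site.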