



This lab covers the steps to perform multiple sequence alignments, calculate distance matrices, and construct phylogenetic trees using the MEGA4 software. It covers creating alignments using ClustalW, generating publishable alignments using BoxShade, exploring alignments, calculating distance matrices, and drawing phylogenetic trees. It also includes instructions on retrieving sequences from GenBank and viewing 3D structures of proteins.
Typology: Lab Reports




Copy 08_Lab3 from Z: to C:, and open the file ‘FT proteins for MEGA.doc’.

Objective: Perform multiple sequence alignments, calculate distance matrices, and construct phylogenetic trees, to understand and interpret relationships between species.

Activities:

A. Creating Multiple Sequence Alignments (MSA)

In this example, we will create a multiple alignment of protein sequences that will be imported into the alignment editor using different methods. Multiple protein sequence alignment is a central tool for inferring protein function, predicting protein secondary structure, and identifying residues important for protein specificity.

A1. Start MEGA4 by using Start\Programs\BioInformatics\MEGA4.

A2. In the MEGA4 window, go to Alignment | Alignment Explorer/CLUSTAL. Select ‘Create a new alignment’, and click on OK. Click on [NO] to build a protein sequence alignment.

A3. Sequences can be entered either from FASTA files or by hand. We will enter the sequences by hand, one by one. In the Alignment Explorer window, go to Edit | Insert Blank Sequence or click on the corresponding toolbar button, and repeat until you have 8 blank sequences. Right-click on each blank sequence name and edit it to match the protein name given in the Word document ‘FT proteins for MEGA.doc’. Copy and paste each sequence.

A4. Go to Edit | Select All to select every site for all the protein sequences in the alignment.

A5. Go to Alignment | Align by ClustalW, or click on the corresponding toolbar button, to align the selected protein sequences using the ClustalW algorithm.

A6. Save the current alignment by selecting Data | Save Session. Save it as ‘FT.mas’. This will allow the current alignment to be restored for future editing. Also export it from Data | Export Alignment as both a FASTA file (‘FT.fas’, using FASTA format) and a MEGA file (‘FT.meg’).

B. Generating a publishable MSA using BoxShade
B1. Using Word, open the previously created FASTA file (‘FT.fas’). Copy the FASTA sequences (including gaps). Paste them into BoxShade: http://www.ch.embnet.org/software/BOX_form.html. In ‘Output format’ select RTF_new, and in ‘Input sequence format’ select other. Click on Run BOXSHADE. Click on ‘here is your output number 1’. The alignment will open in a Word document.

C. Exploring the MSA and identifying patterns

C1. Back in MEGA4, exit the Alignment Explorer window by selecting Data | Exit AlnExplorer. A dialog box will appear asking if you would like to open the data file in MEGA; click on ‘Yes’.

C2. Observe the different coloring schemes by clicking on:
C: conserved residues (the same amino acid at a given site in all the aligned sequences),
V: variable residues (at least 2 different amino acids at a given site),
Pi: parsimony-informative sites (at least 2 different amino acids at a given site, and at least 2 of them occurring with a minimum frequency of 2),
S: singletons (at least 2 different amino acids at a given site, with at most 1 of them occurring multiple times).
(When you have a coding DNA sequence, you can translate it into a protein sequence by clicking on UUC->Phe. Clicking again takes you back to the DNA sequence.)
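The site categories in step C2 can be sketched in a few lines of Python. This is only an illustration of the definitions as stated above, not MEGA4's own implementation: the function names and the toy sequences are made up for the example, and the alignment is assumed to contain no gaps or missing data (which MEGA4 may handle differently).

```python
from collections import Counter


def classify_site(column):
    """Classify one alignment column using the definitions in step C2.

    Assumption: the column contains only residues (no gaps or missing data).
    """
    counts = Counter(column)
    if len(counts) == 1:
        # The same amino acid in every sequence.
        return "conserved"
    # The site is variable; count residues that occur at least twice.
    recurring = sum(1 for n in counts.values() if n >= 2)
    if recurring >= 2:
        return "parsimony-informative"
    # Variable, with at most one residue occurring multiple times.
    return "singleton"


def classify_alignment(sequences):
    """Return one label per site for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [classify_site(column) for column in zip(*sequences)]


# Hypothetical toy alignment, not taken from the lab files.
toy = ["MKVLA", "MKVLS", "MRILA", "MRVLA"]
labels = classify_alignment(toy)
for position, label in enumerate(labels, start=1):
    print(position, label)
```

Under these definitions every variable site is either parsimony-informative or a singleton, so the V category in MEGA4 corresponds to the union of the last two labels.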
G4. Align the protein sequences using ClustalW as before, save the alignment as ‘MADS.mas’, exit, and open the file in MEGA.

G5. Perform a Neighbor-Joining (NJ) analysis. Copy and paste the phylogenetic tree into your Word document.
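Neighbor-Joining builds its tree from a matrix of pairwise distances between the aligned sequences. As a rough illustration of what such a matrix contains, the sketch below computes the p-distance (the proportion of compared sites at which two sequences differ). It assumes that sites with a gap in either sequence are skipped for that pair, and it uses made-up names and toy sequences; the distance model and gap handling you choose in MEGA4 may differ.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence has a gap are ignored for this pair.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no sites left to compare after removing gaps")
    return differing / compared


def distance_matrix(aligned):
    """Return {(name_x, name_y): p-distance} for every pair of named sequences."""
    names = list(aligned)
    return {(x, y): p_distance(aligned[x], aligned[y]) for x in names for y in names}


# Hypothetical toy alignment, not taken from the lab files.
toy = {"seq1": "MKVLA", "seq2": "MKVLS", "seq3": "MR-LA"}
matrix = distance_matrix(toy)
for (x, y), d in matrix.items():
    if x < y:
        print(x, y, d)
```

The matrix is symmetric with zeros on the diagonal, which is the form a distance-based method such as NJ takes as input.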
and cylinders - to represent strands and helices. The colors are green for helices, orange for strands, and blue for coils. Arrows point in the N-to-C direction.

H5. In the Sequence/Alignment Viewer, you can see where in the 3D structure the selected amino acids are located, by simply selecting them with your mouse. The 3D structure will be highlighted in the position where the selected amino acids are located.

I. From Multiple Sequence Alignment to Multiple Sequence Assembly

I.1. Using MEGA4, perform a new ClustalW alignment with the 8 exported sequences used in 08_Lab1 (simply select them all from the Word document called ‘08_Lab1 DNA for MEGA’, copy them (Ctrl+C), and paste them (Ctrl+V) in the MEGA4 Alignment Explorer window).