









Material Type: Assignment; Class: Applied Bioinformatics; Subject: Biotechnology; University: University of California - Davis; Term: Fall 2008;
Typology: Assignments










Tm Triticin genomic DNA ATAATTTTTTGAATATATGGTATTGTTTTCGAAAATATGAGCAAAAGTAAATAAAATCGAATGAATAAACAAATAAAACTGAAGTAACC TGCAGTAGGATACTTGGGCCGGGCGAACCAGACGCTCGCTGTAGGCGAGGAGGGTTTCTATCTCGTGTCGGGCGAGGTATAGCCAATGT TTGGACATCCTTACTCTTAAATAGGGCACTCCATCAGTGTAGAAATGTTGGGTCAACAGCACATAAGATAATTAAGACTGGACCTATTA CTAATACGGTCTCATTTGCGTGGAGATGCCTCTTCTTCTCTCTTAAACGGGAACAAAACAAGTGTGAATCTTTTCCCTCAAAAAAAAAG TTGTGATTAGTTTGCTCCAACAAGTGCCTTTTCTTCTTATCTGAACTTTTCATATCTGCACTCCTTGTATCCAGATGGATAATGCATCA TCGTGGAAAGCTTCTTTTTTTCCTTATCCTGATACGTATTAAGTGTGTGCCTACGTGGATGTGTGTCATCCAATAAACACTACTTTGGC GAAGCAAAATCTCATCGTCACACAAGTGCAAAAGTATCCAAATGTATTATTACGCAGCAACTAAAGTGATGTATTCAGTTAGCGGTGCA GTAAACAAACAAGCATAAAAGCGAGCACTAGACAGAAGTTGGTACAGATATGTTTGAGAAGTTTGTAAGTTACGTGAAAAAGAAACACT TAACCAACGTGTACGAAATCCTTTCTAGAAATAATGATGTAGATGATCGTTAAAGACAACACAGAAAGGATTTTCTATGTTTGCGTGCA GTCATATATAACTTTAACTCATGAGAGTTGGTTAGAGCCACTCAACTTTCCAATTGTCGAAATTATCATATGTTCTACATATTTATCTT AACAAATCTTGTCTCTACATAAGTTCTAGGTAGATTTTGTTGGC TATAAAA GCCACACACATCTCCCAACGCTAACATCAAGAAAACTT CTCTCTCCTCTTCAAACAGCT ATG GCAGCCACTAGTTTCGCTTCGCTCTCTTTTTACTTTTGCATTTTGCTCTTGTGCCATAGCTCCAT GGCACAACTGTTTGGCATGAGCTTTAACCCATGGCAAAGCTCTCGCCAAGGGGGTTTCAGAGAGTGTACATTCAATAGGCTTCAAGCAT CTACACCACTTCGTCAAGTGAGGTCACAAGCAGGCCTGACCGAGTATTTTGATGAGGAAAATGAGCAATTTCGTTGTACTGGTGTATTT GTCATCCGTCGTGTAATCGAACCTCGTGGTTATTTGTTACCGCGATACCACAACACTCACGGATTAGTCTACATCATCCAAG GT TTGTG TAGTAATTTAATTAATATAGTTACCATTTCATATTACTAATAGTTTCTTGAGATAAAGGTTACTATGTTTAGTATTTTATTTATTAACA CGTTTCTAATACTAACATGCAGATATGTTGTCACCC AG GAAGTGGTTTCGCCGGATTGTCTTTTCCTGGATGCCCGGAGACATTCCAAA AACAGTTTCAAAAATATGGGCAAGCACAATCGGTACAGGGACAAAGCCAAAGCCAAAAGTTCAAAGATGAGCACCAAAAAGTTCACCGT TTCAGACAAGGAGATGTCATTGCACTACCGGCAGGCATTGTACATTGGTTCTACAATGACGGTGATGCGCCAATTGTGGCTATCTATGT TTTCGACGTAAACAACTATGCTAATCAACTTGAGCCTAGGCATAAG GT AACAAATGATCTTTGAGACAAATCTATGTGGGGTCAATAAG TCTATTCAACTAACCTGTTGTATTTAATGTAGTTTACAAAGTGACATGTTGTTTAATTTCTTTTCTTGATCAATCTTGT AG GAATTTTT 
GTTCGCTGGCAACTATAGGAGTTCGCAACTTCACTCTAGTCAAAACATATTCAGTGGTTTCGATGTTCGATTGCTTGCTGAGGCCTTGG GTACAAGCGGAAAAATAGCGCAAAGGCTTCAAAGTCAAAATGATGACATAATTCATGTGAATCATACCCTTAAATTTCTGAAGCCTGTT TTTACACAACAACGAGAGCCAGAATCCTACCCACACACTCAATATGAGGAAGGGCAATCTCAGGCAAAACCCTCTCAGGAAGAGCAACC TCAAATGGGGCAGTCACAGGGAGACCAACCTCAAATGGGGCAGTCTCAGGGAGAGCAACCTCAAACGGGGCAGTCTCAGGGAAAGCACA TTCAGGGAGAGCAACCTCAAATGGGGCAGTCTCAGGCAAAACACTATCAAGGAGACCAACCTGAAGAAGGGCAGGGAGGGCAATCTCAA GAAGAACAATCTCAGGCAGGGCCATATCCGGGATGTCAACCTCATGCAGGGCAATCTCATGCATCACAATCAACTTATGGTGGTTGGAA TGGTTTGGAGGAGAACTTTTGTGATCATAAGCTAAGTGTGAACATCGACGATCCCAGTCGTGCTGACATATACAACCCGCGTGCCGGTA CGATAACCCGTCTCAACACCCAAACGTTCCCCATCCTTAACATCGTGCAAATGAGTGCTACAAGAGTACATCTCTACCAG GT AATTGTG ATATTGTGTTTTTTCATACTCTTTTATATTCAAAGCTTCACAATGCAATTCTAACGTTATACCTTACATAATTTATGATCGC AG AATGC CATTATTTCACCATTATGGAACATTAATGCTCATAGTGTGATGTACATGATCCAAGGACATATCTGGGTTCAGGTTGTCAATGACCATG GTCGAAATGTGTTCAATGGCCTTCTTAGCCCGGGGCAACTATTAATCATACCACAGAACTATGTTGTTCTCAAGAAGGCACAACGTGAT GGAAGCAAGTACATTGAATTCAAGACTAACGCAAACTCCATGGTTAGTCACATCGCGGGAAAGAGCTCAATCCTCGGCGCCTTGCCCGT TGATGTCATCGCCAATGCATACGGCATCTCTAGGACAGAAGCTCGAAGCCTCAAATTTAGCAGGGAAGAGGAGCTCGGAGTATTCGCTC CTAAATTCAGTCAAAGTATCTTCCATAGTTCTCCTACCAGCGAAGAAGAGTCATCT TAA GAGCGCATGAGCTAATGTCAAAACTAGCTC ATGGACTAA AATAAA CACATCATTGAGTGTGTAGCACTTTGATGTTTCCATATATGGTCGTCTCAATAAGATATCAACAAAGGTCCATT GTGTTTCATATGTTTACCTTTCTGGAAATTTCATGAACTTTGTTTTGCAAGTTGCATTCGCGAATTCTTCATCTAGATAGTGTGCATAT GCTATCGTATTTGTACTACTATTCTATGTGGTAGTGGTTCTCGTTTCTCATTGTAGCGATACAAATTCTCACCATAGCAATAAACACCA ATGTGTCAAAGCCGGTCTGTATAGTTGGTCGCCGGTGCCGAACCCTGTCTATGTAACCATCAGCTACTCTATGTTTCTTCTTCATCAAT GAAAATCATCTCTAGCTGCTTTTTCGTCAAAAAAATAAAATAAACAGCAATGTGTTTTTCGTTTGTGTTTGCACTGACATAAACATCAC TCACTTGCTAACTAACCCTATTAAACACCAGTGTGAGGTGGCTACGGTCAGCTCAATACATTCTTCTATGTGCCATTGGCCCCATTTCA CTTGTGTTGTCTTCTACAAAAATCATATGGTCGTTGGGTGAGTAATTTCTCATTCGCGCTTAAGACGAATGAACGATGATGTAGATTTG TAGAGTGCCCTTGAGTTCTTCCTTTGCAATTGGGTTCAACTCTTTTTTTATGGGGAAATTAGGTCTGGCCACTATCCAACTTGATATTT 
GATGCACGCCACTTTTCTTTTGAAATTGTGGTTGTGCCCTAACGATTGCGAATAAATTGGGCAGAAGCCCTATGCGTTTGCCACCAAGT TTTCACCTCATTTGACCCATTTTTTTTTCTTTTGTGAGTCTCAATCGTGGCAATAACGGAGGGGAGACTCATATGAAACATCCATTCGA TGTGTTGGCCATCATTTGGCCATGTTGTCAACTATGAATAGGAGAGGTCCGTCCCAAGAGTGACGGGTTGGTTTTCCTCT

Tm Triticin protein
MAATSFASLSFYFCILLLCHSSMAQLFGMSFNPWQSSRQGGFRECTFNRLQASTPLRQVRSQAGLTEYFDEENEQFRCTGVFVIRRVIEP
RGYLLPRYHNTHGLVYIIQGSGFAGLSFPGCPETFQKQFQKYGQAQSVQGQSQSQKFKDEHQKVHRFRQGDVIALPAGIVHWFYNDGDAP
IVAIYVFDVNNYANQLEPRHKEFLFAGNYRSSQLHSSQNIFSGFDVRLLAEALGTSGKIAQRLQSQNDDIIHVNHTLKFLKPVFTQQREP
ESYPHTQYEEGQSQAKPSQEEQPQMGQSQGDQPQMGQSQGEQPQTGQSQGKHIQGEQPQMGQSQAKHYQGDQPEEGQGGQSQEEQSQAGP
YPGCQPHAGQSHASQSTYGGWNGLEENFCDHKLSVNIDDPSRADIYNPRAGTITRLNTQTFPILNIVQMSATRVHLYQNAIISPLWNINA
HSVMYMIQGHIWVQVVNDHGRNVFNGLLSPGQLLIIPQNYVVLKKAQRDGSKYIEFKTNANSMVSHIAGKSSILGALPVDVIANAYGISR
TEARSLKFSREEELGVFAPKFSQSIFHSSPTSEEESS*
Predicted protein(s):
FGENESH: 1 4 exon (s) 1001 - 3085 577 aa, chain +
MAATSFASLSFYFCILLLCHSSMAQLFGMSFNPWQSSRQGGFRECTFNRLQASTPLRQVR
SQAGLTEYFDEENEQFRCTGVFVIRRVIEPRGYLLPRYHNTHGLVYIIQGSGFAGLSFPG
CPETFQKQFQKYGQAQSVQGQSQSQKFKDEHQKVHRFRQGDVIALPAGIVHWFYNDGDAP
IVAIYVFDVNNYANQLEPRHKEFLFAGNYRSSQLHSSQNIFSGFDVRLLAEALGTSGKIA
QRLQSQNDDIIHVNHTLKFLKPVFTQQREPESYPHTQYEEGQSQAKPSQEEQPQMGQSQG
DQPQMGQSQGEQPQTGQSQGKHIQGEQPQMGQSQAKHYQGDQPEEGQGGQSQEEQSQAGP
YPGCQPHAGQSHASQSTYGGWNGLEENFCDHKLSVNIDDPSRADIYNPRAGTITRLNTQT
FPILNIVQMSATRVHLYQNAIISPLWNINAHSVMYMIQGHIWVQVVNDHGRNVFNGLLSP
GQLLIIPQNYVVLKKAQRDGSKYIEFKTNANSMVSHIAGKSSILGALPVDVIANAYGISR
TEARSLKFSREEELGVFAPKFSQSIFHSSPTSEEESS*
Score = 1179 bits (3050), Expect = 0. Identities = 577/577 (100%), Positives = 577/577 (100%), Gaps = 0/577 (0%)
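The FGENESH summary (4 exons, 1001 - 3085, 577 aa) can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes the reported span 1001 - 3085 runs from the first base of the start codon to the last base of the stop codon, so the stop codon is counted as one extra codon.

```python
def coding_and_intron_lengths(cds_start, cds_end, protein_aa, stop_in_span=True):
    """Return (genomic span, coding nt, non-coding nt inside the span).

    cds_start, cds_end: 1-based inclusive coordinates of the predicted gene.
    protein_aa: number of amino acids in the predicted protein.
    stop_in_span: ASSUMPTION that the span includes the stop codon.
    """
    span = cds_end - cds_start + 1
    codons = protein_aa + (1 if stop_in_span else 0)
    coding = 3 * codons
    return span, coding, span - coding


# Values taken from the FGENESH header line above.
span, coding, intronic = coding_and_intron_lengths(1001, 3085, 577)
print(f"genomic span: {span} bp")
print(f"coding bases: {coding} bp")
print(f"intron bases: {intronic} bp")
```

Under that assumption the 2085 bp span holds 1734 coding bases, leaving 351 bp to be shared among the three introns implied by a 4-exon gene.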
GENSCAN 1.0 Date run: 22-Oct-106 Time: 19:59:
Sequence 19:59:38 : 4085 bp : 40.81% C+G : Isochore 1 ( 0 - 43 C+G%)
Parameter matrix: Arabidopsis.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
 1.01 Intr +   1063   1328  266  1  2   77   76   198 0.357    18.
 1.02 Intr +   1463   1737  275  0  2   66   87   135 0.995    12.
 1.03 Intr +   1862   2572  711  1  0   89   91   354 0.997    31.
 1.04 Term +   2666   3085  420  1  0   50   48   396 0.950    31.
 1.05 PlyA +   3125   3130    6                                 1.

Predicted peptide sequence(s):
19:59:38|GENSCAN_predicted_peptide_1|557_aa
XSMAQLFGMSFNPWQSSRQGGFRECTFNRLQASTPLRQVRSQAGLTEYFDEENEQFRCTG
VFVIRRVIEPRGYLLPRYHNTHGLVYIIQGSGFAGLSFPGCPETFQKQFQKYGQAQSVQG
QSQSQKFKDEHQKVHRFRQGDVIALPAGIVHWFYNDGDAPIVAIYVFDVNNYANQLEPRH
KEFLFAGNYRSSQLHSSQNIFSGFDVRLLAEALGTSGKIAQRLQSQNDDIIHVNHTLKFL
KPVFTQQREPESYPHTQYEEGQSQAKPSQEEQPQMGQSQGDQPQMGQSQGEQPQTGQSQG
KHIQGEQPQMGQSQAKHYQGDQPEEGQGGQSQEEQSQAGPYPGCQPHAGQSHASQSTYGG
Score = 1137 bits (2941), Expect = 0. Identities = 556/556 (100%), Positives = 556/556 (100%), Gaps = 0/556 (0%)
Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
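The Len, Ph and Fr definitions above can be checked against the GENSCAN exon coordinates for this gene. The following is a minimal sketch, not GENSCAN's own code: the helper names are hypothetical, and the frame check is applied only to the terminal exon, on the assumption that its last codon (the stop codon) ends exactly at the exon's End coordinate.

```python
# Exon coordinates copied from the GENSCAN table (Gn.Ex, Type, Begin, End).
EXONS = [
    ("1.01", "Intr", 1063, 1328),
    ("1.02", "Intr", 1463, 1737),
    ("1.03", "Intr", 1862, 2572),
    ("1.04", "Term", 2666, 3085),
]


def exon_length(begin, end):
    """Len: exon length in bp for 1-based inclusive coordinates."""
    return end - begin + 1


def exon_phase(begin, end):
    """Ph: net phase of the exon, i.e. exon length modulo 3."""
    return exon_length(begin, end) % 3


def frame_of_codon_ending_at(x):
    """Fr: a forward-strand codon ending at position x has frame x mod 3."""
    return x % 3


for name, kind, begin, end in EXONS:
    print(name, kind, "Len =", exon_length(begin, end), "Ph =", exon_phase(begin, end))

# ASSUMPTION: the stop codon of the terminal exon ends at its End coordinate.
term_end = EXONS[-1][3]
print("Fr of terminal exon =", frame_of_codon_ending_at(term_end))

# Total predicted coding length across the four exons.
print("Total exon length =", sum(exon_length(b, e) for _, _, b, e in EXONS), "bp")
```

The computed lengths and phases can be compared directly with the Len and Ph columns of the table. The total exon length is not a multiple of 3, which is consistent with the first predicted exon being internal (Intr) rather than initial, and with the predicted peptide starting with X.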
CGCTCCGTCTGCGCCGAGCCAGCCATCGccGGAGCGAACGGCTGGTGGAGAAGGCACCGACGGAAGAGCGGGCGC CGACGGGGAGACCCATGGTGCTGCCAGGTCCAAGGACCAGGTTGCACCGCATGTCGGCGGCGCGGTCGCCTCCCA CAACGACCACGCGCTCTCCAAGACCTCGGCGTCCATACACGCACCGCGTCCATCCCAGGAGGCGCGGGACCAGCA GCGTCATGGTACCCACGGTACCATACGTCTGCTGGGCCAAGATCAAGCTGGCGGATCCCGAAGTGCTCATGACTA GCATGCACGCCAGCATGCTGGCGCGAGCGTGGCGCGGTCGCTTGGACGAGATGTTGCTCCTTCTAACGCGGAAGC GGAGGCTGTGCTGCAAGACTTGAAGAGGTACCTGTCCTCCACTCCGACACTTGTCGCGCCTAAACCACAAGAGAA GTTGCTGCTGTACATAGCGGCAACCAATCAAGTGGTTAGTGCTGCGTTAGTAGCGGAGAGGGAGGCAGATGACGA GCCAGCGACCGCGGCAAGCACATCCAGCGACAAGCAGGGGGGCTTTCCCGACAAGCTCTGGTCCCAACAAGCAAG GGTCTGCGCAGATGCAAGAGGAGATACAGAAGAAGATGGTGCAGCGCCCAGTTTACTTTGTCAGTTCCCTTTTGC AGGGGGCTAGGTCAAGGTACTCTGGTGTGCAGAAGCTGCTTTTCGACCTTCTCATGGCCTCGAGAAAGCTGCGCC ATTACTTCCAAGCACATGAGATCAAAGTTGTCACTCGCTTTCCGCTGAAGAGGATATTGCAAAATCCAGAAGCAA TAGGCAGGATTGTCGAGTGGGCACTGGAACTGTCAAGCTTTGGCCTCAAGTTTGAGAGTACATCAACAATCCAGA GCAGAGCATTGGCAGAATTCATAGCAGAGTGGACGCCAACTCCAGACGAAGAAATTCCGGAGACGAGCATCCCCG CCAAGGAAGCAAGCAAAGAGTGGCTCATGTACTTTGACGGTGCTTTCTCGCTGCAAGGCGCCGGTGCTGGTGTAC TGCTTGTCGCACCCACCGGAGAGCACCTCAAGTACATAGTCCAGATGCACTTCCCCAAGGAGCAAGCGACAAACA
[Figure: Gene 2, Exon 1 to Exon 13]