Examen de bio informatique, Exams of Biotechnology

Université Abou Bekr Belkaid, Tlemcen Biotechnology

Prof. Seuhil Bechir Semir Gaouar

This is an exam paper for the bioinformatics module.

Typology: Exams

2023/2024

Uploaded on 03/30/2026

kawther-nkz 🇩🇿

Bioinformatics exam

February 2009

Duration: 2 h (or 2 h 30) - No documents allowed

Part one (4 points)

1) Is the sequence below in FASTA format (justify your answer)? 1 pt

>tr|A0A098|A0A098_CHLRE

MASMAAELRPSDGGSSLHMLDSLLMMGLSSGGGVGGGGSSQSQILDSAGAAELAALLLPQ

HSNDPLHLMSTGDAALGLAGPMAAAEHHQHHPHHQHHSVPATAGFPSQTPPPPLFSNATA

GAAPATRVRAAGSCGSGGVAGGTTSHSSEDGVFHSADPHHHHQQHLQQPQPQQQQ

2) What problem will this sequence cause in a similarity search? 1.5 pt

3) Define the GO database. 1.5 pt
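
Question 2 turns on the composition of the sequence above: its C-terminal end is dominated by a few residues (runs of H, Q and P). As a rough illustration, and not the method of any particular search tool, the sketch below computes the Shannon entropy of the residue composition in a sliding window and lists the windows that fall below a threshold. The function names, the window length of 12 and the threshold of 2.2 bits are illustrative assumptions; dedicated filters such as SEG use their own complexity measures and parameters.

```python
import math
from collections import Counter

# Protein sequence from question 1 (the three sequence lines joined).
SEQ = (
    "MASMAAELRPSDGGSSLHMLDSLLMMGLSSGGGVGGGGSSQSQILDSAGAAELAALLLPQ"
    "HSNDPLHLMSTGDAALGLAGPMAAAEHHQHHPHHQHHSVPATAGFPSQTPPPPLFSNATA"
    "GAAPATRVRAAGSCGSGGVAGGTTSHSSEDGVFHSADPHHHHQQHLQQPQPQQQQ"
)


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    total = len(segment)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_complexity_windows(seq: str, window: int = 12, threshold: float = 2.2):
    """Return (start, entropy) for every window with entropy <= threshold.

    `window` and `threshold` are illustrative values, not tuned parameters.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        entropy = shannon_entropy(seq[start:start + window])
        if entropy <= threshold:
            hits.append((start, entropy))
    return hits


if __name__ == "__main__":
    # Print 1-based start position, the window itself, and its entropy.
    for start, entropy in low_complexity_windows(SEQ):
        print(f"{start + 1:4d} {SEQ[start:start + 12]} {entropy:.2f}")
```

Windows made of a single repeated residue score 0 bits, while a window of 12 different residues would score about 3.58 bits, so low values point to compositionally biased stretches that tend to produce spurious hits in a similarity search.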

Part two (7.5 points)

A multiple alignment of eukaryotic sequences is shown on page 2.

1) Which groups can you distinguish in this alignment? Give 2 discriminating residues for each group. 2 pt

2) Give a probable sequence error in this alignment. 1.5 pt

3) What is the homology relationship between zn143_human and znf76_human? Between zn143_human and q8ci27_mouse? Between zn143_human and znf76_mouse? 1.5 pt

4) A tree was built from this alignment using the neighbor-joining method. In this tree (page 3), 3 sequence identifiers have been replaced by x, y and z. 2.5 pt

a) Does this tree agree with your analysis of the alignment? Justify your answer.

b) In your view, which sequences correspond to x, y and z respectively?

Page 2

ZN143_HUMAN   MEGVSLQAVTLADGSTAYIQHNSK----DAKLIDGQVIQLEDGSAAYVQHVPIPKSTGDSLRLEDGQAVQLEDGTTAFIHHTSKDSYDQSALQAVQLEDGTTAYIHHAVQ
A6QQW0_BOVIN  MEGVSLQAVTLADGSTAYIQHNS-----------------KDGSAAYVQHVPIPKTTGDSLRLEDGQAVQLED------------SYDQSALQAVQLEDGTTAYIHHAVQ
Q8CI27_MOUSE  MEGVSLQAVTLADGSTAYIQHNSK----DGRLIDGQVIQLEDGSAAYVQHVPIPKS----------------------------NSYDQSSLQAVQLEDGTTAYIHHAVQ
Q6VQB0_FUGRU  MDTVSLQAVTLADGSTAYIQHDSKASFSDGQIMDGQVIQLEDGSAAYVQHVSMPKAGGDSLQLEDGQTVQLEDGTTAYIHTP-KETYDQSGLQEVQLEDGSTAYIQHTVH
A0AUQ7_DANRE  MDTVSLQAVTLVDGSTAYIQHSPKVSLTENKIMEGQVIQLEDGSAAYVQHLPMSKTGGEGLRLEDGQAVQLEDGTTAYTHAP-KETYDQGGLQAVQLEDGTTAYIQH---
Q4S173_TETNG  MDTVSLQAVTLADGSTAYIQHDSKASFPDGQIMDGQVIQLEDGSAAYVQHVSMPKAGGESLQLEDGQTVQLEDGTTAYIHAP-KETYDQSGLQEVQLEDGSTAYIQHTVH
Q6GPP5_XENLA  MESMSLQAVTLADGSTAYIQHNTK----DGKLMEGQVIQLEDGSAAYVQHIP----KGDDLSLEDGQAVQLEDGTTAYIHHSSKESYDQSSVQAVQLEDGTTAYIHHAVQ
ZNF76_MOUSE   MESLGLQTVRLSDGTTAYVQQAVK----GEKLLEGQVIQLEDGTTAYIHQVTI---QKESFSFEDGQPVQLEDGSMAYIHHTPKEGCDPSALEAVQLEDGSTAYIHHPVP
ZNF76_HUMAN   MESLGLHTVTLSDGTTAYVQQAVK----GEKLLEGQVIQLEDGTTAYIHQVTV---QKEALSFEDGQPVQLEDGSMAYIHRTPREGYDPSTLEAVQLEDGSTAYIHHPVA

ZN143_HUMAN   VPQSDTILAIQADGTVAGLHT-GDATIDPDTISALEQYAAKVSIDGSESVAGTGMIGENEQEKKMQIVLQGHATRVTAKSQQSGEKAFRCEYDGCG--
A6QQW0_BOVIN  VPQSDTILAIQADGTVAGLHT-GDAAIDPDTISALEQYAAKVSIDGSEGVTGSGIIGENEQEKKMQIVLQGHATRVTAKSQQSGEKAFRCGYDGCG--
Q8CI27_MOUSE  VPQSDTILAIQADGTVAGLHT-GDATIDPDTISALEQYAAKVSIDGSDGVTSTGMIGENEQEKKMQIVLQGHATRVTPKSQQSGEKAFRCKYDGCG--
Q6VQB0_FUGRU  MPQSNTILAIQADGTIADLQA-DATGLNPETISVLEQYATKVESIENQLG--SYSRAEADNGVHMRIVLQDQDNRQS-RSTNVGEKSFRCEYEGCG--
A0AUQ7_DANRE  MPQSNTILAIQADGTVADLQT-EGT-IDAETISVLEQYSTKMEATECGTG--LIGRGDSD-GVHMQIVLQGQDCRSP-RIQHVGEKAFRCEHEGCG--
Q4S173_TETNG  MPQSNTILAIQADGTIADLQA-DAAGLNPETISVLEQYATKVPLVSGLRLRLLWAGGEYRKPVGLLQPAGGGERRPH-ADCFTRSRQQAVAEHQCGRE
Q6GPP5_XENLA  VPQSDTILAIQADGTVAGLHT-GEASIDPDTITALEQYAAKVSIEGGEGAGSNALITESESEKKMQIVLS-HGSRVPVKVPQTNEKAFRCDYEGCG--
ZNF76_MOUSE   VPSDSAILAVQTEAGLEDLAAEDEEGFGTDTVVALEQYASKVLHDS--------------------------PASHNGKGQQVGDRAFRCGYKGCG--
ZNF76_HUMAN   VPSESTILAVQTEVGLEDLAAEDDEGFSADAVVALEQYASKVLHDS--------------------------QIPRNGKGQQVGDRAFRCGYKGCG--
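
Neighbor-joining starts from a matrix of pairwise distances between the aligned sequences. The exam does not say which distance was used, so the sketch below takes the simplest option, the p-distance (fraction of differing residues over the columns where neither sequence has a gap), as an assumption. It uses only the first 30 columns of three rows of the alignment, with rows assigned to identifiers in the order the identifiers are listed, which is also an assumption.

```python
from itertools import combinations

# First 30 alignment columns of three rows (row order assumed to follow
# the order in which the identifiers are listed).
ROWS = {
    "ZN143_HUMAN":  "MEGVSLQAVTLADGSTAYIQHNSK----DA",
    "Q6VQB0_FUGRU": "MDTVSLQAVTLADGSTAYIQHDSKASFSDG",
    "ZNF76_HUMAN":  "MESLGLHTVTLSDGTTAYVQQAVK----GE",
}


def p_distance(row_a: str, row_b: str) -> float:
    """Fraction of mismatches over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    mismatches = sum(a != b for a, b in pairs)
    return mismatches / len(pairs)


if __name__ == "__main__":
    # Print the distance for every pair of rows.
    for (name_a, row_a), (name_b, row_b) in combinations(ROWS.items(), 2):
        print(f"{name_a} vs {name_b}: {p_distance(row_a, row_b):.3f}")
```

On this fragment the human ZN143 row is closer to the Fugu row than to the human ZNF76 row, which is the kind of pattern to look for when deciding which groups the alignment contains.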

DE   Homo sapiens MHC class I antigen (HLA-A) gene, HLA-A01 variant allele,
DE   alternatively spliced.
...
XX
FH   Key             Location/Qualifiers
FT   source          1..
FT                   /db_xref="taxon:9606"
FT                   /organism="Homo sapiens"
FT   gene            <1..>
FT                   /gene="HLA-A"
FT                   /allele="HLA-A01 variant"
FT   mRNA            join(<1..373,504..773,1015..1266,1870..2145,2248..2364,
FT                   2807..2839,2982..3029,3199..>3374)
FT   exon            <1..
FT                   /number=
FT   5'UTR           <1..
FT                   /allele="HLA-A01 variant"
FT   CDS             join(301..373,504..773,1015..1266,1870..2145,2248..2364,
FT                   2807..2839,2982..3029,3199..3203)
FT                   /gene="HLA-A"
FT                   /product="MHC class I antigen"
FT                   /protein_id="AAW30165.1"
FT                   /translation="MAVMAPRTLLLLLSGALALTQTWAGSHSMRYFFTSVSRPGRGEPR
FT                   FIAVGYVDDTQFVRFDSDAASQKMEPRAPWIEQEGPEYWDQETRNMKAHSQTDRANLGT
FT                   LRGYYNQSEDGSHTIQIMYGCDVGPDGRFLRGYRQDAYDGKDYIALNEDLRSWTAADMA
FT                   AQITKRKWEAVHAAEQRRVYLEGRCVDGLRRYLENDPPKTHMTHHPISDHEATLRCWAL
FT                   GFYPAEITLTWQRDGEDQTQDTELVETRPAGDGTFQKWAAVVVPSGEEQRYTCHVQHEG
FT                   LPKPLTLRWELSSQPTIPIVGIIAGLVLLGAVITGAVVAAVMWRRKSSDRKGGSYTQAA
FT                   SSDSAQGSDVSLTACKV"
FT   exon            504..
FT                   /number=
FT   exon            1015..
FT                   /number=
FT   variation       1268
FT                   /note="alternatively spliced compared to HLA-A010101;
FT                   results in altered exon and protein length; no membrane
FT                   expression detected"
FT                   /replace="g"
FT                   /gene="HLA-A"
FT   exon            1870..
FT                   /number=
FT   exon            2248..
FT                   /number=
FT   exon            2807..
FT                   /number=
FT   exon            2982..
FT                   /number=
FT   exon            3199..>
FT                   /number=
FT   3'UTR           3204..>
...
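
The CDS feature of the record above is a `join(...)` of eight segments, and its spliced length can be checked against the protein length quoted for HLA-A (357 aa). The sketch below is a minimal illustration that handles only plain `start..end` segments as they appear in this CDS line; the function names are hypothetical, and it is not a general feature-location parser.

```python
import re

# CDS location copied from the feature table above.
CDS_LOCATION = (
    "join(301..373,504..773,1015..1266,1870..2145,2248..2364,"
    "2807..2839,2982..3029,3199..3203)"
)


def segments(location: str) -> list[tuple[int, int]]:
    """Extract (start, end) pairs from plain 'start..end' segments."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


def spliced_length(location: str) -> int:
    """Total number of bases covered by the joined segments (inclusive ends)."""
    return sum(end - start + 1 for start, end in segments(location))


if __name__ == "__main__":
    length = spliced_length(CDS_LOCATION)
    codons = length // 3
    # One codon is the stop codon, so the protein has codons - 1 residues.
    print(length, codons, codons - 1)
```

The eight segments add up to 1074 bases, that is 358 codons, or 357 amino acids plus a stop codon, in agreement with the length given for the HLA-A protein.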

A blastp search was performed with the HLA-A protein sequence (357 aa). This protein is similar to immunoglobulins, as shown by the alignments with the protein MUCM_RABIT. 3 pt

a) Draw a schematic representation of the 2 proteins, indicating the conserved regions.

b) Draw the result of a dot-plot (dot matrix) comparison of the two proteins.
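
For reference, a dot plot places one sequence on each axis and marks every pair of positions where a short window of the two sequences matches. The sketch below is a minimal text version under simple assumptions: exact residue identity, a fixed window, and a hypothetical `dotplot` function whose name and parameters are not taken from any tool. Real dot-plot programs usually score windows with a substitution matrix instead.

```python
def dotplot(seq_a: str, seq_b: str, window: int = 3, min_matches: int = 3) -> list[str]:
    """Return one text row per window start in seq_a.

    A '*' at row i, column j means that at least `min_matches` of the
    `window` residues starting at seq_a[i] and seq_b[j] are identical.
    """
    rows = []
    for i in range(len(seq_a) - window + 1):
        row = ""
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            row += "*" if matches >= min_matches else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    # A segment present once in one sequence and twice in the other
    # gives two parallel diagonals (toy sequences, not exam data).
    for line in dotplot("GFYPAEITGFYPAEIT", "GFYPAEIT"):
        print(line)
```

A region of one protein that matches several regions of the other shows up as several parallel diagonal segments sharing the same coordinates on one axis, which is the pattern to expect when one domain is similar to a repeated domain.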

>sp|P04221|MUCM_RABIT Ig mu chain C region membrane-bound form
Length = 479

Score = 45.1 bits (105), Expect = 3e-
Identities = 29/94 (30%), Positives = 49/94 (52%), Gaps = 11/94 (11%)

Query: 214 EATLRCWALGFYPAEITLTWQRDGED-----QTQDTELVETRPAGDGTFQKWAAVVVPSG 268
           ++ L C A GF P +I+++W RDG+       T+  E  ET+ AG  TF   + + +
Sbjct: 132 KSRLICQATGFSPKQISVSWLRDGQKVESGVLTKPVE-AETKGAGPATFSISSMLTITES 190

Query: 269 E---EQRYTCHVQHEGL--PKPLTLRWELSSQPT 297
           +   +  YTC V H G+   K +++  E S+ P+
Sbjct: 191 DWLSQSLYTCRVDHRGIFFDKNVSMSSECSTTPS 224

Score = 40.4 bits (93), Expect = 7e-
Identities = 24/81 (29%), Positives = 37/81 (45%), Gaps = 6/81 (7%)

Query: 215 ATLRCWALGFYPAEITLTWQRDGEDQTQD---TELVETRPAGDGTFQKWAAVVVPS---G 268
           AT+ C   GF PA++ + WQ+ G+  + D   T      P   G +   + + V
Sbjct: 352 ATVTCLVKGFSPADVFVQWQQRGQPLSSDKYVTSAPAPEPQAPGLYFTHSTLTVTEEDWN 411

Query: 269 EEQRYTCHVQHEGLPKPLTLR 289
             + +TC V HE LP  +T R
Sbjct: 412 SGETFTCVVGHEALPHMVTER 432

Score = 31.2 bits (69), Expect = 4e-
Identities = 23/85 (27%), Positives = 37/85 (43%), Gaps = 10/85 (11%)

Query: 219 CWALGFYPAEITLTWQRDGEDQTQDTELVETRPA---GDGTFQKWAAVVVPS-----GEE 270
           C A  F P+ +T +W      +   + V T P    GD  +   + V+VPS     G E
Sbjct: 28  CLARDFLPSSVTFSWSFKNNSEIS-SRTVRTFPVVKRGD-KYMATSQVLVPSKDVLQGTE 85

Query: 271 QRYTCHVQHEGLPKPLTLRWELSSQ 295
           +   C VQH    + L + + + S+
Sbjct: 86  EYLVCKVQHSNSNRDLRVSFPVDSE 110

Part two (6 points)

A genomic region (982 bases, accession GQ2293385) of an H1N1 virus strain was compared to a nucleotide sequence database using the fasta and blastn programs. One of the sequences detected is the synthetic sequence CS723756 (4700 bp). The GQ2293385 and CS723756 sequences were also aligned with the optimal alignment program Water. The alignments between the two sequences obtained with the three methods are shown.

1) What can you deduce about the similarity between the 2 sequences? What are the main differences between the three alignments? How do you explain these differences? 3 pts

2) Is Megablast suitable for this search? Why? 1 pt

3) Draw schematically the result of a dot-plot (dot matrix) comparison of these two sequences. 2 pts

BlastN

>emb|CS723756.1| Sequence 14 from Patent WO

Length=

Score = 105 bits (116), Expect = 6e-

Identities = 121/163 (74%), Gaps = 0/163 (0%)

Strand=Plus/Plus

Query  818   GATCGTCtttttttCAAATGTATTTATCGTCGCTTTAAATACGGTTTGAAAAGAGGGCCT  877
             || || || || ||||| || || || ||  |  | || || ||  |||| ||||| |||
Sbjct  1503  GACCGGCTGTTCTTCAAGTGCATCTACCGGAGACTGAAGTATGGACTGAAGAGAGGACCT  1562

Query  878   TCTACGGAAGGAGTGCCTGAGTCCATGAGGGAAGAATATCAACAGGAACAGCAGAGTGCT  937
              | || |  ||||||||||| || ||| |||| || |||  ||||||||||||||| ||
Sbjct  1563  GCCACAGCCGGAGTGCCTGAATCTATGCGGGAGGAGTATAGACAGGAACAGCAGAGCGCC  1622

Query  938   GTGGATGTTGACGATGGTCATTTTGTCAACATAGAGCTAGAGT  980
             |||||||| || ||||| || || || || || ||||| ||||
Sbjct  1623  GTGGATGTGGATGATGGCCACTTCGTGAATATCGAGCTGGAGT  1665

Score = 48.2 bits (52), Expect = 0.
Identities = 32/36 (88%), Gaps = 0/36 (0%)
Strand=Plus/Plus

Query  716   CCTACCAGAAGCGAATGGGAGTGCAGATGCAGCGAT  751
             ||||||||||  || |||||||||||||| ||||||
Sbjct  1464  CCTACCAGAAATGAGTGGGAGTGCAGATGTAGCGAT  1499
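
BLAST-type programs start an alignment from an exact word match, so the longest run of identical bases in an alignment indicates which word sizes could have seeded it. The sketch below measures that run for the second blastn alignment above (query 716-751) and compares it with the default word sizes of the `megablast` and `blastn` tasks in NCBI BLAST+ (28 and 11). The helper name is hypothetical, and the exam's search may have used other settings.

```python
# Second blastn alignment above (query 716-751, subject 1464-1499).
QUERY = "CCTACCAGAAGCGAATGGGAGTGCAGATGCAGCGAT"
SBJCT = "CCTACCAGAAATGAGTGGGAGTGCAGATGTAGCGAT"

# Default word sizes of the NCBI BLAST+ nucleotide tasks.
WORD_SIZE = {"megablast": 28, "blastn": 11}


def longest_exact_run(seq_a: str, seq_b: str) -> int:
    """Length of the longest stretch of identical aligned positions."""
    best = run = 0
    for a, b in zip(seq_a, seq_b):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    identities = sum(a == b for a, b in zip(QUERY, SBJCT))
    run = longest_exact_run(QUERY, SBJCT)
    print(f"identities: {identities}/{len(QUERY)}, longest exact run: {run}")
    for task, word in WORD_SIZE.items():
        verdict = "can" if run >= word else "cannot"
        print(f"{task} (word size {word}) {verdict} seed this alignment")
```

For this alignment the longest exact run is 14 bases: long enough for an 11-base word, too short for a 28-base word.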

Fasta

>>EM_PAT:CS723756; CS723756 Sequence 14 from Patent WO20 (4700 nt)

initn: 378 init1: 238 opt: 549 Z-score: 279.9 bits: 65.5 E(): 7.6e-

58.0% identity (69.2% similar) in 357 nt overlap (631-982:1319-1667)

Sequen GAGGCCAUGGAGGUUGCUAAUCAGACUAGGCAGAUGGUACAUGCAAUGAGAACUAUUGGG

EM_PAT GCUGACAGACUAACAGACUGUUCCUUUCCAUGGGUCUUUUCUGCAGUCACCGUCGUCGAC

Sequen ACUCAU--CCUAGCUCCAGUGCUGGUCU-GAAAGAUGACCUUCUUG-AAAAUUUGCAGGC

EM_PAT ACGUGUGAUCAGAUAUCGCGGCCGCUCUAGAGAUAUCGCCACCAUGCAGUACAUCAAGGC

Sequen CUACCAGAAGCGAAU-GGGAGUGCAGAUGCAGCGAUUCAAGUGAUCCUCUCGUCAUUGCA

EM_PAT CAACAGCAAGUUUAUCGGCAUCACAGAGCUGUCUCUGCUGACAGAAGUGGAGAC-CCCUA

Sequen GCAAAUAUCAUUGGGAUCUUGCACCUGAUAUUGUGGAUUACUGAUCGUCUUUUUUUCAAA

EM_PAT CCAGAAAUGAGUGGGA--GUGCA---GAUGUAG-CGAUAGC-GACCGGCUGUUCUUCAAG

Sequen UGUAUUUAUCGUCGCUUUAAAUACGGUUUGAAAAGAGGGCCUUCUACGGAAGGAGUGCCU

EM_PAT UGCAUCUACCGGAGACUGAAGUAUGGACUGAAGAGAGGACCUGCCACAGCCGGAGUGCCU

Sequen GAGUCCAUGAGGGAAGAAUAUCAACAGGAACAGCAGAGUGCUGUGGAUGUUGACGAUGGU

EM_PAT GAAUCUAUGCGGGAGGAGUAUAGACAGGAACAGCAGAGCGCCGUGGAUGUGGAUGAUGGC

Sequen CAUUUUGUCAACAUAGAGCUAGAGUAA

EM_PAT CACUUCGUGAAUAUCGAGCUGGAGUGAACACGUGGGAUCCAGAUCUGCUGUGCCUUCUAG

c) What can you say about the similarity between the two proteins? 1.5 pts

FIRST iteration

>sp|Q57979.2|SURE_METJA RecName: Full=5'-nucleotidase surE; AltName: Full=Nucleoside 5'-monophosphate phosphohydrolase
Length=

Score = 58.9 bits (141), Expect = 3e-06, Method: Compositional matrix adjust.
Identities = 55/219 (25%), Positives = 97/219 (44%), Gaps = 45/219 (20%)

Query  1    MRVLITNDDGPLSDQFSPYIRPFIQHIKRNYPEWKITVCVPHVQKSWVGKAHLAGKNLTA  60
            M +LI NDDG     +SP +      +K  + +  IT+  P  Q+S +G+A
Sbjct  1    MEILIVNDDG----IYSPSLIALYNALKEKFSDANITIVAPTNQQSGIGRAI--------  48

Query  61   QFIYSKVDAEDNTFWGPFIQPQIRSENSKLPYVLNAEIPKDTIEWILIDGTPASCANIGL  120
                        + + P    +++             + KD + +  + GTP  C  +G+
Sbjct  49   ------------SLFEPLRMTKVK-------------LAKDIVGY-AVSGTPTDCVILGI  82

Query  121  HLLSNEPFDLVLSGPNVGRNTSAAYITSSGTVGGAMESVITGNTKAIAISWAYFN---GL  177
            + +  +  DLV+SG N+G N     I +SGT+G A E+   G  K+IA S    +
Sbjct  83   YQILKKVPDLVISGINIGENLGTE-IMTSGTLGAAFEAAHHG-AKSIASSLQITSDHLKF  140

Query  178  KNVS-PLLMEKASKRSLDVIKHLVKNWDPKTDLYSINIP  215
            K +  P+  E  +K +  + +  + ++D   D+ +INIP
Sbjct  141  KELDIPINFEIPAKITAKIAEKYL-DYDMPCDVLNINIP  178

SECOND iteration

>sp|Q57979.2|SURE_METJA RecName: Full=5'-nucleotidase surE; AltName: Full=Nucleoside 5'-monophosphate phosphohydrolase Length=

Score = 210 bits (534), Expect = 5e-52, Method: Composition-based stats. Identities = 67/318 (21%), Positives = 119/318 (37%), Gaps = 71/318 (22%)

Query 1 MRVLITNDDGPLSDQFSPYIRPFIQHIKRNYPEWKITVCVPHVQKSWVGKAHLAGKNLTA 60 M +LI NDDG +SP + +K + + IT+ P Q+S +G+A + L Sbjct 1 MEILIVNDDGI----YSPSLIALYNALKEKFSDANITIVAPTNQQSGIGRAISLFEPLRM 56

Query 61 QFIYSKVDAEDNTFWGPFIQPQIRSENSKLPYVLNAEIPKDTIEWILIDGTPASCANIGL 120

D I + GTP C +G+ Sbjct 57 TKVKLAKD----------------------------------IVGYAVSGTPTDCVILGI 82

Query 121 HLLSNEPFDLVLSGPNVGRNTSAAYITSSGTVGGAMESVITGN---TKAIAISWAYFNGL 177

- - DLV+SG N+G N I +SGT+G A E+ G ++ I+ + Sbjct 83 YQILKKVPDLVISGINIGENLGTE-IMTSGTLGAAFEAAHHGAKSIASSLQITSDHLKFK 141

Query 178 KNVSPLLMEKASKRSLDVIKHLVKNWDPKTDLYSINIPLVESLSDDTKVYYAPIWENRWI 237

P+ E +K + + + + P D+ +INIP E+ + +T + + + Sbjct 142 ELDIPINFEIPAKITAKIAEKYLDYDMP-CDVLNINIP--ENATLETPIEITRLARKMYT 198

Query 238 PIFNGPHINLENSFAEIEDGNESSSISFNWAPKFGAHKDSIHYMDEYKDRTVLTDAEVI- 296 +E+ + S+ W D +E +D TD V+ Sbjct 199 --------------THVEERIDPRGRSYYW-------IDGYPIFEEEED----TDVYVLR 233

Query 297 ESEMISVTPMKATFKGVN 314

IS+TP+ N Sbjct 234 KKRHISITPLTLDTTIKN 251
