Beispielfragen Bioinformatik 1

The document contains example questions for a Bioinformatics exam from the summer semester of 2011, focusing on DNA and protein sequence alignment, scoring schemes, and alignment methods. It outlines various topics including the Needleman-Wunsch algorithm, BLAST, substitution matrices, and phylogenetic trees. The questions are designed to assess understanding of bioinformatics concepts and practical applications in sequence analysis.

Example questions for Bioinformatics, first half of the semester


Summer semester 2011

Note

• The exam will be written in German.

• The questions will be based on material from the exercises and the lectures.

• These are typical questions, not a complete question catalogue. They are a sample of the possibilities.

Example questions

1. You are given two DNA sequences to align

ACGTCCTTCATT and GTCTCATG

You have a scoring scheme where

• a match gives you +1

• a mismatch gives you 0

• gap opening costs -10

Write down the best alignment of the two sequences.

2. You have a scoring scheme where

• a match gives you +1

• a mismatch gives you -1

• opening a gap costs you -1

Write down the best alignment for the same two DNA sequences.
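
Questions 1 and 2 can be checked with a small global alignment program. The following is a minimal sketch of a Needleman-Wunsch alignment, not a reference solution. It assumes a linear gap penalty, that is, every gapped position costs the stated gap price (the questions only give a gap opening cost and say nothing about extension). The function name and parameter names are made up for this illustration.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=0, gap=-10):
    """Global alignment with a linear gap penalty (an assumption of this sketch)."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in seq_b
                score[i][j - 1] + gap,      # gap in seq_a
            )
    # Traceback from the bottom-right corner; ties are broken in favour
    # of the diagonal step, so only one of the best alignments is returned.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(seq_a[i - 1])
                bottom.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(seq_a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))
```

For question 1 one would call needleman_wunsch("ACGTCCTTCATT", "GTCTCATG", match=1, mismatch=0, gap=-10), and for question 2 the same sequences with mismatch=-1 and gap=-1. Several alignments may share the best score; the traceback reports only one of them.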

Z:\summer_11_teaching\bioinformatic\exercises\example_question_bioinformatics.doc 16.05.2011 [1 / 4]

3. You are aligning protein sequences using a substitution matrix :


   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0
R -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3
N -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3
D -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3
C  0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1
Q -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2
E -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
G  0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3
H -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3
I -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3
L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1
K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2
M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1
F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2
S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2
T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3
Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1
V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4

Gap opening costs -8. Gap extension costs -1.

You are given an alignment

AACDQRST
A-CD-RST

What is the score of this alignment ?

4. Given

AACDQRST
A-CD--ST

What is the score of this alignment ?

5. Given

AACDQRST
A-CD-SST

What is the score of this alignment ?
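
The scores in questions 3 to 5 can be checked by adding up the columns of each alignment. The sketch below is one way to do this. It assumes that the first position of a gap costs the opening penalty (-8) and every further position of the same gap costs the extension penalty (-1); other conventions charge opening plus extension for the first position and would give different totals. Only the substitution scores needed for these three alignments are copied from the matrix above, and all names are made up for this illustration.

```python
# Substitution scores copied from the matrix above (only the pairs needed here).
SCORES = {
    ("A", "A"): 4, ("C", "C"): 9, ("D", "D"): 6, ("R", "R"): 5,
    ("S", "S"): 4, ("T", "T"): 5, ("R", "S"): -1,
}

GAP_OPEN = -8    # cost of the first position of a gap (assumed convention)
GAP_EXTEND = -1  # cost of each further position of the same gap


def pair_score(a, b):
    """Look up a substitution score; the matrix is symmetric."""
    if (a, b) in SCORES:
        return SCORES[(a, b)]
    return SCORES[(b, a)]


def alignment_score(top, bottom):
    """Sum substitution scores and affine gap costs over all columns."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    gap_in = None  # which row holds the gap currently being extended
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            row = "top" if a == "-" else "bottom"
            total += GAP_EXTEND if gap_in == row else GAP_OPEN
            gap_in = row
        else:
            total += pair_score(a, b)
            gap_in = None
    return total


for bottom in ("A-CD-RST", "A-CD--ST", "A-CD-SST"):
    print(bottom, alignment_score("AACDQRST", bottom))
```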

6. You have calculated a score matrix for a pair of DNA sequences. You have performed the traceback
calculation and found a result like:
ACACCTTA

Write down the corresponding sequence alignment with gaps in the correct positions.

7. Outline the steps used to find values for a BLOSUM amino acid similarity matrix.


8. What is the advantage of a Needleman-Wunsch alignment compared to a seeded alignment ?

9. What is the advantage of a seeded method like BLAST compared to a Needleman-Wunsch alignment ?

10. Name an application where you would use a method like BLAST and not a Needleman-Wunsch alignment.

11. Name an application where you would use a slow method like Needleman-Wunsch, rather than a BLAST-like method.

12. You have two proteins with weak, remote similarity. You know their sequences. You also know the sequences of the original DNA. Why would you expect a better alignment using the protein sequences ?

13. In protein alignments, we do not just look at matches/mismatches. We look at the similarities between amino acids. How are these represented ?

14. There are several different substitution matrices used in protein alignments. Why would you prefer one over another ?

15. What is the difference between an iterated BLAST (PSI-BLAST) search and a simple BLAST search ?

16. What is the advantage of an iterated BLAST search compared to a simple BLAST search ?

17. I have written a program that generates random DNA sequences. I expect to see 25 % sequence identity between pairs of random sequences in gapped alignments. When I try the calculation, I usually see more than 25 % sequence identity. Why ?

18. If I take random biological sequences from a data bank, I see even more sequence similarity. Why ?

19. I have two proteins with 20 % sequence identity. I ask you if this is likely to be significant. What other simple piece of information do you need to answer this question properly ?

20. If you use the program "chimera", what representation would you pick in order to see the secondary structure ?

21. I want to calculate a multiple sequence alignment for Nseq sequences. How many pair-wise alignments will I have to calculate ?
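
For question 21, a short sketch can make the counting concrete. It assumes that every unordered pair of sequences is aligned exactly once, which is the usual first step of a progressive multiple alignment; the function names are made up for this illustration.

```python
from itertools import combinations


def all_pairs(names):
    """Every unordered pair of sequence names, each pair listed once."""
    return list(combinations(names, 2))


def count_pairs(n_seq):
    """Number of unordered pairs among n_seq sequences: n_seq * (n_seq - 1) / 2."""
    return n_seq * (n_seq - 1) // 2


names = ["seq1", "seq2", "seq3", "seq4"]
print(all_pairs(names))
print(count_pairs(len(names)))
```

The count grows quadratically with the number of sequences, which is why the pairwise step becomes expensive for large families.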

22. In a multiple sequence alignment, you want to maximise a score,

$score = \sum_{a=1}^{N_{seq}} \sum_{b>a}^{N_{seq}} S_{a,b}$

What is this score in words ?
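
Assuming the score in question 22 is the sum of pairwise alignment scores S_{a,b} over all pairs of sequences with b > a, it could be evaluated for a finished multiple alignment roughly as follows. The match, mismatch and gap values are placeholders chosen for this sketch, and columns in which both sequences have a gap are simply skipped, which is one common convention and not necessarily the one used in the lectures.

```python
def pairwise_score(row_a, row_b, match=1, mismatch=-1, gap=-1):
    """Score two rows of a multiple alignment column by column."""
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue  # gap against gap: ignored in this sketch
        if x == "-" or y == "-":
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


def sum_of_pairs(rows):
    """Sum the pairwise scores over all pairs (a, b) with b > a."""
    n_seq = len(rows)
    return sum(
        pairwise_score(rows[a], rows[b])
        for a in range(n_seq)
        for b in range(a + 1, n_seq)
    )


print(sum_of_pairs(["ACG", "ACG", "A-G"]))
```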

23. In a multiple sequence alignment, I want to build a "guide tree". What determines the order in which I join the nodes together ?

24. I have 3 sequences, A, B, C. The sequences B and C are both related to A, but I cannot get a good alignment score when I align B and C. What could be a reason ? Draw a diagram if it is easier to explain.

25. I have calculated a multiple sequence alignment. I want to find which sites in the alignment are conserved and which vary. I would like to make a plot of variability/conservation as a function of sequence position. You remember a formula

$S = -\sum_{i=1}^{N_{states}} p_i \ln p_i$

What is the meaning of $p_i$ ?
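
Assuming the formula in question 25 is the usual entropy, S = -sum of p_i ln p_i over the states seen in one alignment column, a per-column profile could be computed as in the sketch below. Here p_i is estimated as the observed frequency of state i in the column, and a gap character is counted as just another state; both choices are assumptions of this illustration, as are the function names.

```python
import math
from collections import Counter


def column_entropy(column):
    """Entropy of one alignment column from the observed state frequencies."""
    counts = Counter(column)
    total = len(column)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def entropy_profile(rows):
    """Entropy for every column of a multiple sequence alignment."""
    return [column_entropy(col) for col in zip(*rows)]


alignment = ["ACGT", "ACGA", "ACTC", "ACTG"]
print(entropy_profile(alignment))
```

A fully conserved column gives S = 0, and S grows as more states appear with similar frequencies, so plotting the profile against column number gives the kind of conservation plot the question describes.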


26. From a multiple sequence alignment, I have calculated variability/conservation as a function of sequence position:

[Figure: plot of S (vertical axis, 0 to 3.5) against residue number (horizontal axis, 0 to 100)]

Residues 37 and 43 seem to be very conserved. Why might they be important residues ?
27. In the picture above, some sites are not very conserved. I say these residues cannot be important to the
function of the protein. Why may I be wrong ?
28. I have calculated a sequence alignment of 400 tyrosine kinases and I find that very few sites seem to be
conserved in evolution. How could I change my results, so that more sites seem to be conserved ?
29. We have a family of sequences and all pair-wise alignments. I can count the number of differences (mutations) between any two sequences and calculate the fraction of residues that have changed:

$p_{mut} = \frac{N_{diff}}{N_{length}}$

I would like to estimate evolutionary time by saying $t = k \, p_{mut}$ for some constant k. This is not a good measure. Why not ?
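
The fraction of changed residues in question 29 is easy to compute, and comparing it with a model-based distance shows how the two diverge. The sketch below uses the Jukes-Cantor correction for DNA, d = -(3/4) ln(1 - 4p/3), as one standard example; this model is brought in for illustration and is not necessarily the one used in the lectures. Columns containing a gap are ignored, which is also an assumption.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    n_diff = sum(1 for x, y in pairs if x != y)
    return n_diff / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor distance for DNA; only defined for p < 0.75."""
    if p >= 0.75:
        raise ValueError("p must be below 0.75 for the Jukes-Cantor formula")
    return -0.75 * math.log(1 - 4 * p / 3)


for p in (0.05, 0.25, 0.50, 0.70):
    print(p, round(jukes_cantor(p), 3))
```

For small p the two numbers are nearly equal, but the corrected distance rises much faster as p approaches 0.75, while p itself can never exceed the level expected for unrelated sequences.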


30. I want to use aligned DNA sequences to build a phylogenetic tree. Name two reasons that the branches in
the tree may not be reliable.
31. I have built a phylogenetic tree using a neighbour joining method. Describe a general approach I could use
to see how reliable the tree is.
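
One general approach to question 31 is resampling of the alignment columns (bootstrapping). The sketch below shows only the resampling step: it draws columns with replacement to build replicate alignments of the same size. The tree-building step is not included; each replicate would be passed to whatever neighbour joining program is in use, and one would then count how often each branch of the original tree reappears. Function names and the default number of replicates are choices made for this illustration.

```python
import random


def bootstrap_alignment(rows, rng):
    """Build one replicate by sampling alignment columns with replacement."""
    n_cols = len(rows[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[c] for c in picks) for row in rows]


def bootstrap_replicates(rows, n_replicates=100, seed=0):
    """Generate several replicate alignments from a seeded random generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(rows, rng) for _ in range(n_replicates)]


alignment = ["ACGT", "ACGA", "TCGA"]
for replicate in bootstrap_replicates(alignment, n_replicates=3):
    print(replicate)
```

Sampling whole columns keeps the residues of all sequences at a site together, so each replicate is still a valid alignment of the same sequences.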

32. It is believed that protein sequence evolves and changes faster than protein structure. What could be an
evolutionary explanation for this ?
33. Which graphical representation would you use in order to emphasize the secondary structure content of
a protein: all-atom, chain trace, ribbon, ... ?
34. You have a protein of unknown function from a bacterium. You have made a knock-out mutant, but the
bacteria die immediately without the corresponding gene. You have sequenced the protein. What steps
would you take to guess the function of the protein ? What kind of information would you look for ?
35. How would you use a multiple sequence alignment to identify side chains in the active centre that may be
catalytically important ?
