Notes-Dot Plot Using Threshold and Window Size
Description:
A dot plot is a sensitive, qualitative indicator of sequence similarity. It is better than an alignment
at revealing two things: (a) rearrangements and (b) repeated sequences. A dot plot relies on visual
perception (it is not quantitative) and is very useful for RNA structure similarity analysis.
The dot plot is the simplest method: we put a dot (•) wherever both the base and its position are
identical in the two sequences, which marks the comparison of nucleotides along the sequence (the
main diagonal). This represents direct similarity between the two sequences. We place a circle (o)
when the nucleic bases are identical but located at non-identical positions along the sequence. This
represents sequence repeats, emerging from evolutionary insertions and duplications.
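The marking rule described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name simple_dot_plot, the text symbols "." and "o" used for the dot and the circle, and the sample sequences are all assumptions made for the example.

```python
def simple_dot_plot(seq_a, seq_b):
    """Return a text grid: rows follow seq_a, columns follow seq_b."""
    rows = []
    for i, a in enumerate(seq_a):
        row = []
        for j, b in enumerate(seq_b):
            if a != b:
                row.append(" ")   # bases differ: leave the cell empty
            elif i == j:
                row.append(".")   # same base at the same position (main diagonal)
            else:
                row.append("o")   # same base at a different position
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    # Illustrative sequences only; they are not taken from the notes.
    for line in simple_dot_plot("ACGT", "ACAT"):
        print("|" + line + "|")
```

For the two sample sequences, the A at position 1 of the first sequence matches the A at position 3 of the second, so that cell gets an "o" rather than a ".".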
The advanced dot plot uses two additional parameters: the threshold “n” (also referred to as
stringency) and the window size “w”. We place a dot wherever the residues (nucleic acids) show at
least n consecutive matches, that is, n matches in a row (identity matching at identical positions).
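The idea of counting matches in a row can be sketched as a running counter. The sketch assumes the two stretches are compared position by position, that a mismatch resets the counter to 0, and that a dot is placed once the counter reaches n. The function names run_counters and reaches_threshold and the sample sequences are hypothetical.

```python
def run_counters(seq_a, seq_b):
    """Running count of consecutive matches, position by position."""
    counters = []
    counter = 0
    for a, b in zip(seq_a, seq_b):
        # A match extends the current run; a mismatch resets it to 0.
        counter = counter + 1 if a == b else 0
        counters.append(counter)
    return counters


def reaches_threshold(seq_a, seq_b, n):
    """True if some run of consecutive matches is at least n long."""
    return any(c >= n for c in run_counters(seq_a, seq_b))


if __name__ == "__main__":
    # Illustrative sequences only; they are not taken from the notes.
    print(run_counters("ACGTA", "ACGTC"))          # [1, 2, 3, 4, 0]
    print(reaches_threshold("ACGTA", "ACGTC", 3))  # True
```

With n = 3, the first four positions of the sample pair match in a row, so the counter passes the threshold at the third position and a dot would be placed.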
The window size is the number of residues (nucleic acids) compared each time; it is usually an odd
number and is less than the sequence length “L”. The threshold is the minimum number of residues in
the window that must match; it is always less than the length of the sequence and cannot exceed the
window size. The mismatch limit is the maximum number of residues in the window that may be
non-matching, so for window size w and threshold n the mismatch limit is w − n.
BIO310 Lecture Notes
1. Count the matching residues (nucleic acids) within a window of a given size, for example
window size = 3.
2. Place a dot if the score reaches the threshold (stringency), that is, score ≥ threshold, for
example threshold = 2.
3. Move the window by a given step, for example one position from left to right.
4. Repeat steps 1 to 3 until the end of the sequence.
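The four steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a dot is placed when the number of matches in the window is at least the threshold, the window slides over both sequences with the same step, and each dot is recorded at the start position of its window. The function name window_dot_plot, its parameter names, and the sample sequences are hypothetical.

```python
def window_dot_plot(seq_a, seq_b, window=3, threshold=2, step=1):
    """Return (i, j) window start positions where a dot is placed."""
    dots = []
    # Steps 3 and 4: slide the window along both sequences until the end.
    for i in range(0, len(seq_a) - window + 1, step):
        for j in range(0, len(seq_b) - window + 1, step):
            # Step 1: count matching residues inside the two windows.
            score = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            # Step 2: place a dot if the score reaches the threshold.
            if score >= threshold:
                dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Illustrative sequences only; they are not taken from the notes.
    print(window_dot_plot("ACGT", "ACTT", window=3, threshold=2))
```

Raising the threshold to the full window size (threshold = 3 here) keeps only windows that match exactly, which removes more background noise but also hides imperfect similarities.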
Example 1:
Input Sequences:
Iteration #1: Counter = 1, 2, 3 (n ≤ counter), 4, 0
Iteration #2: Counter = 0, 0, 0, 0, 0
Iteration #3: Counter = 1, 2, 3 (n ≤ counter), 0, 1
Iteration #4: Counter = 0, 0, 0, 0, 0
Iteration #5: Counter = 0, 0, 0, 0, 0
Iteration #6: Counter = 1, 2, 3 (n ≤ counter), 0, 0
Similarly, a few more iterations follow until the end of the sequence. The final result is as follows:
Example 2:
Input Sequences: