Notes-Dot Plot Using Threshold and Window Size

This document contains lecture notes on dot plots from a BIO310 class. Dot plots can detect sequence similarities and rearrangements better than alignments. They rely on visual perception rather than quantitative analysis. Advanced dot plots use threshold and window size parameters to determine if sequences match within a window.

Uploaded by

Huzaifah Khan

BIO310 Lecture Notes

COMSATS University Islamabad, Wah Campus


BIO310 – Lecture Notes

Topic: “DOT PLOTS Using Threshold & Window Size”


Instructor: Dr. Anum Masood

Description:
A dot plot is a sensitive, qualitative indicator of sequence similarity. It is better than an alignment at
revealing two things: (a) rearrangements and (b) repeated sequences. A dot plot relies on visual perception
(it is not quantitative), and it is very useful for RNA structure similarity analysis.

The dot plot is the simplest method: we place a dot (•) wherever both the residues and their positions
are identical, which marks the matching nucleotides along the sequence (Diagonal Sequence) and
represents direct similarity between the two sequences. We place a circle (o) when the nucleic
bases are identical but located at non-identical positions along the sequence. These circles represent
sequence repeats arising from evolutionary insertions and duplications.
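
The marking rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `simple_dot_plot`, the use of "-" for an empty cell, and the short test sequence are assumptions and do not come from the lecture.

```python
def simple_dot_plot(seq_x: str, seq_y: str) -> list[str]:
    """Return the plot as text rows: one row per residue of seq_y,
    one column per residue of seq_x."""
    rows = []
    for i, res_y in enumerate(seq_y):
        row = []
        for j, res_x in enumerate(seq_x):
            if res_x != res_y:
                row.append("-")   # residues differ: empty cell (assumed symbol)
            elif i == j:
                row.append("•")   # same residue at the same position: dot
            else:
                row.append("o")   # same residue at a different position: circle
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    # Hypothetical sequence compared with itself.
    for line in simple_dot_plot("ACA", "ACA"):
        print(line)
```

Comparing a sequence with itself always fills the diagonal with dots; any circles off the diagonal point to residues that repeat within the sequence.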

The Advanced Dot Plot uses two additional parameters: threshold (also referred to as stringency) “n” and
window size “w”. We place a dot wherever the residues (nucleic acids) show at least n consecutive matches,
i.e. n matches in a row (identity matching, identical position).

Window Size is the number of residues (nucleic acids) compared at a time (usually an odd
number, and less than the sequence length “L”). Threshold is the minimum number of residues
in the window that must match; it is always less than the length of the sequence. Mismatch
Limit is the maximum number of residues in the window that may be non-matching:

[Mismatch Limit = Window Size - Threshold]
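
As a quick arithmetic check of this relation, a minimal Python sketch follows. The function name `mismatch_limit` and the validation step are assumptions added for illustration; the values w = 5 and n = 3 are used only as sample inputs.

```python
def mismatch_limit(window_size: int, threshold: int) -> int:
    """Mismatch Limit = Window Size - Threshold."""
    # Assumed sanity check: a threshold larger than the window
    # could never be reached, and would give a negative limit.
    if not 0 < threshold <= window_size:
        raise ValueError("threshold must be between 1 and the window size")
    return window_size - threshold


if __name__ == "__main__":
    # With w = 5 and n = 3, up to 2 residues in the window may mismatch.
    print(mismatch_limit(5, 3))
```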

Dot Plot – What information is visible?


- Similar regions
- Repeated sequences
- Sequence rearrangements
- DNA/RNA structures
- Gene order


Windowed Dot-Plot Algorithm:

1. Count the matching residues (nucleic acids) within a window of a given size, for example
window size = 3.
2. Place a dot if the score reaches the threshold (stringency), for example threshold = 2.
3. Move the window by a given step, for example one position from left to right.
4. Repeat steps 1 to 3 until the end of the sequence.
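
The four steps above can be sketched in Python as follows. This is one possible reading, not the lecture's own implementation: the name `windowed_dot_plot`, the choice to record each dot at the window's starting coordinates, the step of one position, and the "score ≥ threshold" test are all assumptions, and the sequences in the usage example are hypothetical.

```python
def windowed_dot_plot(seq_x: str, seq_y: str,
                      window: int, threshold: int) -> set[tuple[int, int]]:
    """Return the (row, column) start positions where a dot is placed."""
    if not 0 < threshold <= window:
        raise ValueError("threshold must be between 1 and the window size")
    dots = set()
    # Step 3 and 4: slide the window one position at a time over both sequences.
    for i in range(len(seq_y) - window + 1):
        for j in range(len(seq_x) - window + 1):
            # Step 1: count matching residues inside the window.
            score = sum(seq_y[i + k] == seq_x[j + k] for k in range(window))
            # Step 2: place a dot if the score reaches the threshold.
            if score >= threshold:
                dots.add((i, j))
    return dots


if __name__ == "__main__":
    # Hypothetical sequences that differ at one position.
    print(sorted(windowed_dot_plot("ACGT", "ATGT", window=3, threshold=2)))
```

Lowering the threshold tolerates more mismatches per window and so produces more dots; raising it toward the window size leaves only near-exact matches.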

Example 1:
Input Sequences:

Sequence Length “L” = 8; Window Size “w” = 5; Threshold “n” = 3


Iteration #1:

Counter = 1
Counter = 2
Counter = 3 (n ≤ Counter)
Counter = 4
Counter = 0
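
The counter values listed for Iteration #1 are consistent with a counter that increases by one on each match and resets to zero on a mismatch, with a dot placed once the counter reaches n. The Python sketch below illustrates that reading. The function names and the two five-residue windows are hypothetical, since the original input sequences are not reproduced in these notes.

```python
def run_length_counters(win_x: str, win_y: str) -> list[int]:
    """Counter after each compared position: +1 on a match, reset to 0 on a mismatch."""
    counters = []
    counter = 0
    for res_x, res_y in zip(win_x, win_y):
        counter = counter + 1 if res_x == res_y else 0
        counters.append(counter)
    return counters


def reaches_threshold(win_x: str, win_y: str, threshold: int) -> bool:
    """True if the counter reaches the threshold n anywhere in the window."""
    return any(c >= threshold for c in run_length_counters(win_x, win_y))


if __name__ == "__main__":
    # Hypothetical windows: four matches followed by one mismatch.
    print(run_length_counters("ACGTA", "ACGTC"))
    print(reaches_threshold("ACGTA", "ACGTC", threshold=3))
```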

Iteration #2:

Counter = 0
Counter = 0
Counter = 0
Counter = 0
Counter = 0


Iteration #3:

Counter = 1
Counter = 2
Counter = 3 (n ≤ Counter)
Counter = 0
Counter = 1

Iteration #4:

Counter = 0
Counter = 0
Counter = 0
Counter = 0
Counter = 0

Iteration #5:

Counter = 0
Counter = 0
Counter = 0
Counter = 0
Counter = 0


Iteration #6:

Counter = 1
Counter = 2
Counter = 3 (n ≤ Counter)
Counter = 0
Counter = 0

Similarly, a few more iterations follow until the end of the sequence. The final result is as follows:

Example 2:
Input Sequences:

Sequence Length “L” = 9; Window Size “w” = 5; Threshold “n” = 3
