Disclaimer
It is hereby declared that the production of the said content is meant for non-commercial, scholastic and research
purposes only.
We acknowledge that some of the content or images in this channel's videos may have been obtained through
routine Google image searches, and a few of them may be under copyright protection. Any such usage is completely
inadvertent.
We may have overlooked giving full scholarly credit to the copyright owners. We believe that the
non-commercial, educational-only use of the material may allow the video in question to fall under fair use of such
content. However, we honour the copyright holder's rights, and the video shall be deleted from our channel if
any such claim is received by us or reported to us.
Department of Microbiology
Unit no. 3
Subject name and code: Introduction to Bioinformatics & Biostatistics; 02MB0301
Dr. Purvi M. Rakhashiya
Importance of Sequence alignment
Sequence databases
• https://youtu.be/TiKbMw_bKEk
• Convergent Vs. Divergent Evolution
– Divergent evolution occurs when two different species share a
common ancestor but have different characteristics from one
another. Each time one ancestral species diverges into multiple
descendant species it is called speciation. Speciation is an
important result of divergent evolution.
– Convergent evolution is the independent evolution of similar
features in species of different periods or epochs in time.
Convergent evolution creates analogous structures that have
similar form or function but were not present in the last common
ancestor of those groups.
• Homologs
– Sequence homology is the biological homology between DNA,
RNA, or protein sequences, defined in terms of shared ancestry in
the evolutionary history of life.
– Two segments of DNA can have shared ancestry because of three
phenomena: either a speciation event (orthologs), or a duplication
event (paralogs), or else a horizontal (or lateral) gene transfer
event (xenologs).
– Paralogs: Paralogous genes are genes that are related via
duplication events in the last common ancestor (LCA) of the
species being compared. They result from the mutation of
duplicated genes during separate speciation events.
– Orthologs: Homologous sequences are orthologous if they are
inferred to be descended from the same ancestral sequence
separated by a speciation event: when a species diverges into two
separate species, the copies of a single gene in the two resulting
species are said to be orthologous. Orthologs, or orthologous
genes, are genes in different species that originated by vertical
descent from a single gene of the last common ancestor.
Conservation or variation
• Mechanisms of variation
• Mutations (substitutions)
• Insertions
• Deletions
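The three kinds of change listed above can be counted directly once two sequences have been aligned. The following is a minimal sketch, assuming a gapped pairwise alignment in which "-" marks a gap; the function name count_changes and the example sequences are hypothetical and only serve as illustration.

```python
def count_changes(ref: str, query: str) -> dict:
    """Count matches, substitutions, insertions and deletions
    in a gapped pairwise alignment ('-' marks a gap).

    A gap in the reference is counted as an insertion in the query;
    a gap in the query is counted as a deletion from the query.
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")

    counts = {"match": 0, "substitution": 0, "insertion": 0, "deletion": 0}
    for r, q in zip(ref, query):
        if r == "-" and q == "-":
            continue                      # column with gaps only, ignore
        elif r == "-":
            counts["insertion"] += 1      # residue present only in query
        elif q == "-":
            counts["deletion"] += 1       # residue missing from query
        elif r == q:
            counts["match"] += 1          # conserved position
        else:
            counts["substitution"] += 1   # residue replaced by another

    return counts


# Hypothetical aligned pair, for illustration only
aligned_ref = "ACGT-ACGT"
aligned_query = "ACCTTAC-T"
print(count_changes(aligned_ref, aligned_query))
# {'match': 6, 'substitution': 1, 'insertion': 1, 'deletion': 1}
```

Conserved positions show up as matches, while the other three counters correspond to the sources of variation named in the list.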
From unknown to known
Nucleic Acid Sequence characteristics and parameters
Complementarity
5’ ACGTTACG 3’
3’ TGCAATGC 5’
Most cellular processes involving DNA occur in the 5’ to 3’ direction.
For every G on one strand there is a C on the other strand, and for every A on one strand
there is a T on the other strand.
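The base-pairing rule can be expressed as a simple character mapping. The sketch below is a minimal illustration using only the Python standard library; the function names complement and reverse_complement are chosen here for clarity and are not taken from the slides.

```python
# Watson-Crick pairing for DNA: A<->T and G<->C
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the complementary strand, read in the same left-to-right
    order as the input (so it runs 3'->5' if the input is 5'->3')."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5'->3'."""
    return complement(seq)[::-1]


top_strand = "ACGTTACG"                  # 5' -> 3', as on the slide
print(complement(top_strand))            # TGCAATGC (3' -> 5')
print(reverse_complement(top_strand))    # CGTAACGT (5' -> 3')
```

Because sequences are conventionally written 5’ to 3’, the reverse complement is usually the form that is stored or searched for.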
DNA Composition (Rigidity and Flexibility)
• AT content
• GC content
• AT content (%) = (A + T) / (A + T + G + C) x 100
• GC content (%) = (G + C) / (A + T + G + C) x 100
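As a worked illustration of the composition calculation, the sketch below computes AT and GC content as percentages of the total number of A, T, G and C bases. The helper name base_content is hypothetical, and the example sequence is the one from the complementarity slide; ambiguous symbols such as N are simply ignored here, which is an assumption of this sketch.

```python
def base_content(seq: str, bases: str) -> float:
    """Percentage of the given bases (e.g. "GC" or "AT") among all
    A, C, G and T residues in seq. Other symbols are ignored."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    selected = sum(seq.count(b) for b in bases.upper())
    return 100.0 * selected / total


seq = "ACGTTACG"
print(f"GC content: {base_content(seq, 'GC'):.1f}%")   # GC content: 50.0%
print(f"AT content: {base_content(seq, 'AT'):.1f}%")   # AT content: 50.0%
```

For any DNA sequence the two values add up to 100, so reporting one of them is sufficient.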
Types of Alignment
• Goal - To find the best pairing of two sequences, such that there is
maximum correspondence among residues
• How - One sequence is shifted relative to the other to find the position
where maximum matches are found
• Strategies -
1. Global alignment
2. Local alignment
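The "shifting" idea described above can be illustrated with a deliberately simple, gap-free sketch: one sequence is slid along the other and the number of identical positions is counted at each offset. This is only an illustration of the principle and not one of the alignment algorithms used in practice; the function name best_shift and the example sequences are assumptions.

```python
def best_shift(a: str, b: str) -> tuple:
    """Slide b along a (no gaps allowed) and return (offset, matches)
    for the offset that gives the most identical positions.

    offset is the index in a that lines up with b[0]; it can be
    negative when b overhangs the start of a.
    """
    best_offset, best_matches = 0, -1
    for offset in range(-(len(b) - 1), len(a)):
        matches = 0
        for j, residue in enumerate(b):
            i = offset + j
            if 0 <= i < len(a) and a[i] == residue:
                matches += 1
        if matches > best_matches:        # keep the first best offset
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


offset, matches = best_shift("GATTACA", "TTAC")
print(offset, matches)   # 2 4
```

Global and local alignment extend this idea by also allowing gaps and by scoring mismatches, so that insertions and deletions can be accommodated.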
Global Alignment
Global alignments attempt to align every residue in every sequence and
are most useful when the sequences in the query set are similar and of
roughly equal size.
• Assumption – Similarity over the entire length of the sequences
• Sequences – More or less of similar length
• Alignment – Carried out from beginning to end of both sequences
• Aim – Look for the best possible alignment across the entire length
of the two sequences
• The general global alignment technique is the Needleman–Wunsch
algorithm
Local Alignment
Local alignments are more useful for dissimilar sequences that are
suspected to contain regions of similarity or similar sequence motifs
within their larger sequence context.
• Assumption – Similarity confined to local regions instead of the
entire length of the sequences
• Sequences – Could vary largely in length
• Alignment – Carried out considering local regions of high similarity,
without regard for the alignment of the rest of the sequence regions
• Aim – Look for local regions with the highest level of similarity
• The general local alignment technique is the Smith–Waterman
algorithm
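To make the global strategy concrete, the following is a minimal sketch of Needleman–Wunsch dynamic programming with a linear gap penalty. The scoring values (match = 1, mismatch = -1, gap = -1) and the example sequences are illustrative assumptions, not values given in the slides; real tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b. Returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap             # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap             # b[:j] aligned against gaps only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align residues
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right corner to the top-left corner,
    # so that every residue of both sequences is included.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))   # (2, 'ACGT', 'A-GT')
```

The traceback always runs from the end of both sequences to their start, which is what makes the alignment global.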
Global Alignment
• Applications – More applicable for aligning two closely related
sequences of roughly the same length
• Limitation – Not good for divergent sequences and sequences of
variable lengths
Local Alignment
• Applications – Used for aligning more divergent sequences to find
conserved patterns, also known as domains or motifs
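The local strategy can be sketched with Smith–Waterman dynamic programming. Compared with a global alignment, cell scores are never allowed to drop below zero and the traceback starts at the highest-scoring cell and stops at the first zero. The scoring values (match = 2, mismatch = -1, gap = -2) and the example sequences are illustrative assumptions only.

```python
def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -2):
    """Local alignment of a and b. Returns (score, aligned_a, aligned_b)
    for the best-scoring pair of subsequences."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]   # first row/column stay 0
    best, best_pos = 0, (0, 0)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,                            # start afresh
                              score[i - 1][j - 1] + sub,    # align residues
                              score[i - 1][j] + gap,        # gap in b
                              score[i][j - 1] + gap)        # gap in a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached,
    # so only the high-similarity region is reported.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Two otherwise unrelated sequences sharing one conserved region
print(smith_waterman("CCCCGATTACACCCC", "TTGATTACATT"))
# (14, 'GATTACA', 'GATTACA')
```

In this example the flanking regions share no residues, so only the conserved motif is returned, which mirrors the application to divergent sequences described above.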