Lecture 4

The document outlines a course on Multiple Sequence Alignment provided by PhD Tam Tran at USTH, covering the importance, methods, and interpretation of multiple sequence alignments in bioinformatics. Key methods discussed include ClustalW and MUSCLE, along with exercises for practical application. The course also emphasizes the significance of alignments in structure prediction, genetic variation analysis, and understanding evolutionary relationships.

Uploaded by

trieupg.22bi13431

Bioinformatics

-----
Multiple Sequence Alignment
Course Provider: PhD Tam Tran
Department of Life Sciences (LS) – USTH
Email: [email protected]

Master in Medical biotechnology - Plant biotechnology – Pharmacology – Year 2

Outline

• Introduction
• Applications: Why do we do multiple sequence alignments?
• Multiple sequence alignment methods
- ClustalW
- MUSCLE

• Interpreting multiple sequence alignment results

From pairwise to multiple alignment

Pairwise alignment: for 2 sequences
Multiple sequence alignment (MSA): for more than 2 sequences, e.g.

F G K F G K G
- G K Q G K G
- - K F G K G

Alignments help to analyze sequence data: organize and visualize.

Why do we do multiple sequence alignments?

1. Structure prediction (RNA, protein)

Target:      VTISCTGSSSNIGAG-NHVKWYQQLPG
Homologue 1: VTISCTGSSSNIGS--ITVNWYQQLPG
Homologue 2: LRLSCTGSGFIFSS--YAMYWYQQAPG
…            …
Homologue n: LSLTCTGSGTSFDD-QYYSTWYQQPPG

2. Genetic variations (sequence similarity)

Analysis of residue substitutions: mutation or polymorphism?

3. Learn about evolutionary relationships

• If two sequences from different organisms are similar, they may share a common ancestor.
• Multiple sequence alignments are needed to construct phylogenetic trees.
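
Before a tree is built, the similarity between aligned sequences is usually turned into a distance. The sketch below assumes the simplest possible measure, the proportion of differing residues (p-distance), and skips columns where either row has a gap. The function name is illustrative, and real phylogenetic analyses normally apply a substitution model on top of this.

```python
def p_distance(row_a, row_b):
    """Fraction of differing residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # Keep only the columns in which both sequences have a residue.
    compared = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in compared) / len(compared)


# Two rows taken from the structure-prediction example earlier in the lecture.
target = "VTISCTGSSSNIGAG-NHVKWYQQLPG"
homologue_1 = "VTISCTGSSSNIGS--ITVNWYQQLPG"
print(round(p_distance(target, homologue_1), 3))
```

Computing this value for every pair of rows gives the kind of pairwise matrix that tree-building methods take as input.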

Multiple Alignment Methods

ClustalW
• ClustalW is a multiple alignment tool.
• It is one of the most commonly used programs for making multiple sequence alignments.
• 'W' stands for 'weighted': sequences are weighted differently.

BLAST: pairwise comparison
Clustal: global alignment, "all against all"
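ClustalW: Progressive Alignment

Pairwise matrix for sequences A to D:

     A    B    C    D
A    -    -    -    -
B    1    -    -    -
C    7    8    -    -
D   11    5    2    -

[Figure: guide tree with leaves A, B, C and D]

See Thompson et al. (1994) for an explanation of the three stages of progressive alignment implemented in ClustalW (pairwise alignment of all sequences, construction of a guide tree, and progressive alignment following the tree).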
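
To illustrate how a pairwise matrix leads to a guide tree, here is a minimal sketch. It assumes the values in the matrix above are distances (smaller means more similar) and it uses average-linkage clustering (UPGMA) for brevity; ClustalW itself builds its guide tree with neighbour-joining. The function name and the nested-tuple tree format are illustrative choices.

```python
from itertools import combinations


def upgma(labels, distances):
    """Build a guide tree (nested tuples) by repeatedly joining the closest clusters."""
    # Each cluster key maps to the list of original sequences it contains.
    clusters = {label: [label] for label in labels}

    def cluster_distance(x, y):
        # Average distance over all pairs of members (average linkage).
        pairs = [(a, b) for a in clusters[x] for b in clusters[y]]
        return sum(distances[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        x, y = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        clusters[(x, y)] = clusters.pop(x) + clusters.pop(y)
    return next(iter(clusters))


# Values copied from the matrix on the slide, assumed to be distances.
distances = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "C")): 7,
    frozenset(("B", "C")): 8,
    frozenset(("A", "D")): 11,
    frozenset(("B", "D")): 5,
    frozenset(("C", "D")): 2,
}
guide_tree = upgma(["A", "B", "C", "D"], distances)
print(guide_tree)
```

With these values A and B are joined first (distance 1), then C and D (distance 2), and the two pairs are joined last. A progressive aligner would then align sequences and profiles in that order.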
MUSCLE

• MUltiple Sequence Comparison by Log-Expectation (MUSCLE)
• A popular MSA program, more recent than ClustalW
• Considered one of the most accurate MSA programs at the time of its release
• The basic idea: iterative progressive alignment

Edgar, R.C., 2004
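
Iterative refinement needs a score to decide whether a re-aligned version is better than the current one. A common choice is the sum-of-pairs (SP) score. The sketch below assumes a toy scoring scheme (match +1, mismatch -1, residue against gap -2, gap against gap 0); these numbers are illustrative, not MUSCLE's actual parameters, which rely on substitution matrices and affine gap penalties.

```python
from itertools import combinations


def sp_score(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score: score every pair of rows in every column and add them up."""
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap against gap contributes nothing in this toy scheme
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


# Same three sequences, with the third row placed in two different ways.
current = ["FGKFGKG", "-GKQGKG", "K--FGKG"]
candidate = ["FGKFGKG", "-GKQGKG", "--KFGKG"]

# A refinement step keeps the candidate only if the score improves.
if sp_score(candidate) > sp_score(current):
    current = candidate
print(current)
```

Here the candidate alignment scores higher, so it replaces the current one.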

Finding your favorite alignment method

Multiple Sequence Alignment Resources over the Internet

Method    Description                                                           Address
ClustalW  The most commonly used program.                                       https://www.genome.jp/tools-bin/clustalw
ClustalO  The latest addition to the Clustal family                             https://www.ebi.ac.uk/Tools/msa/clustalo/
MUSCLE    A fast and accurate sequence cruncher                                 www.drive5.com/muscle/ (download)
                                                                                https://www.ebi.ac.uk/Tools/msa/muscle/
Tcoffee   Accurate combination of sequences and structures                      www.tcoffee.org
                                                                                https://www.ebi.ac.uk/Tools/msa/tcoffee/
Probcons  A Bayesian version of Tcoffee                                         probcons.stanford.edu/
Kalign    A fast sequence aligner                                               https://msa.sbc.su.se/cgi-bin/msa.cgi
MAFFT     A fast and accurate sequence cruncher using Fast Fourier Transforms   https://mafft.cbrc.jp/alignment/server/
Dialign   Ideal for sequences with local homology                               https://bibiserv.cebitec.uni-bielefeld.de/dialign/
EXERCISE BREAK
Exercise 1: Perform multiple sequence alignments

1. Download the amino acid sequences for accession numbers P20472, P80079, P02626, P02619, P43305, P32930, Q91482, P02620, P02622 and P02586 from the NCBI protein database.

2. Perform multiple sequence alignments with ClustalW/ClustalO and with MUSCLE.

3. What are the differences between the two outputs?

4. What do you think the symbols "*", ":" and "." under the alignment mean?

5. How many stretches of perfectly conserved sequence (of at least, say, 10 amino acids) can you find? Write down the sequence(s) of the perfectly conserved stretch(es).
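
For long alignments, question 5 is easier to answer with a small script than by eye. The sketch below assumes the alignment is available as a list of equal-length strings with "-" for gaps; the function name and the toy rows are made up for illustration.

```python
def conserved_stretches(alignment, min_len=10):
    """Return (start, residues) for each run of at least min_len identical, gap-free columns."""
    stretches, run_start, run = [], None, []
    for i, column in enumerate(zip(*alignment)):
        if column[0] != "-" and len(set(column)) == 1:
            # Perfectly conserved column: extend the current run.
            if run_start is None:
                run_start = i
            run.append(column[0])
        else:
            # Run interrupted: keep it only if it is long enough.
            if len(run) >= min_len:
                stretches.append((run_start + 1, "".join(run)))
            run_start, run = None, []
    if len(run) >= min_len:
        stretches.append((run_start + 1, "".join(run)))
    return stretches


# Toy rows, not real sequences; positions are 1-based alignment columns.
toy = ["AAKFGKGX", "ABKFGKGY"]
print(conserved_stretches(toy, min_len=3))
```

With the default `min_len=10` the same call answers the question as asked in the exercise.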

Interpreting Multiple Sequence Alignment

* = perfectly conserved residue

: = conservative substitution
. = semi-conservative substitution
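
The three symbols can be reproduced column by column. The sketch below assumes the "strong" and "weak" residue groups described in the Clustal documentation; check the lists against the Clustal version in use. A column gets ":" if all of its residues fall within one strong group and "." if they fall within one weak group. The function name is illustrative.

```python
# Residue groups as described in the Clustal documentation (assumed here; verify for your version).
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
        "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def conservation_line(alignment):
    """Return the line of '*', ':', '.' and ' ' symbols for an aligned set of protein rows."""
    symbols = []
    for column in zip(*alignment):
        residues = set(column)
        if "-" in residues:
            symbols.append(" ")   # columns with gaps get no symbol
        elif len(residues) == 1:
            symbols.append("*")   # perfectly conserved residue
        elif any(residues <= set(group) for group in STRONG):
            symbols.append(":")   # conservative substitution
        elif any(residues <= set(group) for group in WEAK):
            symbols.append(".")   # semi-conservative substitution
        else:
            symbols.append(" ")
    return "".join(symbols)


rows = ["IVDECC", "IVEECC", "VVDQCC"]
for row in rows:
    print(row)
print(conservation_line(rows))
```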

How can you tell whether a block is good?

Functionally important amino acids (or nucleotides), such as those in an active site, rarely mutate: changes at these positions are usually selected against.
Editing and Analyzing Multiple Sequence Alignments

BioEdit: https://bioedit.software.informer.com/7.2/

Jalview: https://www.jalview.org/

MEGA: https://www.megasoftware.net/

Sequence logo

IGF1B_HUMAN  APQTGIVDECCFRSCDLRRLEMYCAPLKPAKSAR
IGF1_PIG     APQTGIVDECCFRSCDLRRLEMYCAPLKPAKSAR
IGF1_CANFA   APQTGIVDECCFRSCDLRRLEMYCAPLKPAKSAR
IGF2_HORSE   -RSRGIVEECCFRSCDLALLETYCATPAKSERDV
INS_CHIBR    -----IVDQCCTSICTLYQLENYCN---------
INS_ORNAN    -----IVEECCKGVCSMYQLENYCN---------
INS_AOTTR    MQKRGVVDQCCTSICSLYQLQNYCN---------

• A sequence logo shows the frequency of each amino acid at each position.

• The height of each stack represents the information content at that position; within a stack, the height of each letter is proportional to its frequency.
EXERCISE BREAK
Exercise 2: Visualize consensus sequences with Sequence Logos

1. Identify a conserved block in the alignment from Exercise 1

2. Build its sequence logo

3. Interpret the sequence logo

Tool (WebLogo): https://weblogo.berkeley.edu/logo.cgi

- Input: Multiple Sequence Alignment
- Output: Sequence logo
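
To help interpret the logo, the information content of a column can be computed by hand. The sketch below assumes the standard definition for proteins, log2(20) minus the Shannon entropy of the observed residue frequencies, ignores gaps, and leaves out the small-sample correction that logo tools usually apply. Function names are illustrative.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content in bits of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(residues).values())
    return math.log2(alphabet_size) - entropy


def letter_heights(column, alphabet_size=20):
    """Height of each letter in the stack: its frequency times the column information."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return {}
    info = column_information(column, alphabet_size)
    return {r: (c / len(residues)) * info for r, c in Counter(residues).items()}


# Rows of the insulin / IGF alignment shown in the sequence logo section.
alignment = [
    "APQTGIVDECCFRSCDLRRLEMYCAPLKPAKSAR",
    "APQTGIVDECCFRSCDLRRLEMYCAPLKPAKSAR",
    "APQTGIVDECCFRSCDLRRLEMYCAPLKPAKSAR",
    "-RSRGIVEECCFRSCDLALLETYCATPAKSERDV",
    "-----IVDQCCTSICTLYQLENYCN---------",
    "-----IVEECCKGVCSMYQLENYCN---------",
    "MQKRGVVDQCCTSICSLYQLQNYCN---------",
]
for position, column in enumerate(zip(*alignment), start=1):
    print(position, "".join(column), round(column_information(column), 2))
```

A perfectly conserved column, such as the two adjacent cysteines, reaches the maximum of log2(20), about 4.32 bits. Columns covered by only a few sequences look more informative than they are, which is why the small-sample correction matters in practice.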

HOMEWORK - DAY 4
1. Select 10 homologous proteins with identity < 80% from the BLAST results of Homework Day 3.
- Provide the list of protein sequences with their names and E-values, but NOT the full FASTA format.

2. Perform a multiple alignment with your favorite alignment method.

a. Overview and quality of the multiple alignment
- Can you confirm that these sequences are really homologs? Do they have similar lengths?
- How many positions are identical? How many positions carry conservative substitutions? How many indels (insertions and deletions) are there?
b. Identification of conserved blocks
- Identify conserved blocks and visualize their consensus sequences with sequence logos.
- What do you observe? What does it mean from a biological point of view?
- Bonus: Can you link your observations to what is known about the protein? (Are any of the conserved amino acids known active sites for this protein family? If yes, give their position in the alignment, function and activity.)
c. Write a conclusion.

DEADLINE: 10am Thursday 06 May 2021

END
