Protein Alignment Scoring - PAM and BLOSUM

The document discusses protein sequence alignment scoring methods. It introduces the PAM and BLOSUM matrices which are commonly used to score substitutions between amino acids in protein alignments based on empirical substitution frequencies observed in related proteins. The PAM matrix models substitution probabilities directly observed in very similar proteins, while the BLOSUM matrix averages these probabilities over clusters of more distantly related proteins to avoid issues with low-probability estimates.

Wright State University

CORE Scholar

Computer Science and Engineering Faculty Publications

Computer Science & Engineering

2003

Protein Alignment Scoring - PAM and BLOSUM

Dan E. Krane
Wright State University - Main Campus, [email protected]

Michael L. Raymer
Wright State University - Main Campus, [email protected]

Follow this and additional works at: https://corescholar.libraries.wright.edu/cse

Part of the Computer Sciences Commons, and the Engineering Commons

Repository Citation
Krane, D. E., & Raymer, M. L. (2003). Protein Alignment Scoring - PAM and BLOSUM.
https://corescholar.libraries.wright.edu/cse/388

This Presentation is brought to you for free and open access by Wright State University’s CORE Scholar. It has been
accepted for inclusion in Computer Science and Engineering Faculty Publications by an authorized administrator of
CORE Scholar. For more information, please contact [email protected].
Sequence Alignments Revisited
• Scoring nucleotide sequence alignments was easier
  • Match score
  • Possibly different scores for transitions and transversions
• For amino acids, there are many more possible substitutions
• How do we decide which substitutions are highly penalized and which are moderately penalized?
  • Physical and chemical characteristics
  • Empirical methods
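The nucleotide case can be sketched in a few lines of Python. The score values below are hypothetical (the slides give none); the only fixed fact is the classification of A and G as purines and C and T as pyrimidines, so that a purine-purine or pyrimidine-pyrimidine change is a transition and anything else is a transversion.

```python
# Hypothetical scores chosen for illustration only; the slides give no values.
MATCH = 1
TRANSITION = -1      # purine <-> purine or pyrimidine <-> pyrimidine
TRANSVERSION = -2    # purine <-> pyrimidine

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def nucleotide_score(x: str, y: str) -> int:
    """Score one aligned pair of nucleotides."""
    if x == y:
        return MATCH
    same_class = ({x, y} <= PURINES) or ({x, y} <= PYRIMIDINES)
    return TRANSITION if same_class else TRANSVERSION


def alignment_score(s: str, t: str) -> int:
    """Sum the per-column scores of two equal-length, gap-free sequences."""
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    return sum(nucleotide_score(x, y) for x, y in zip(s, t))


# A-G is a transition, C-C and G-G are matches, T-A is a transversion.
print(alignment_score("ACGT", "GCGA"))  # -1
```

With only four nucleotides, three numbers cover every case; with twenty amino acids the same approach needs a full 20 × 20 table, which is what the rest of the slides build.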
Protein-Related Algorithms Intro to Bioinformatics 1
Scoring Mismatches
• Physical and chemical characteristics
  • V → I – Both small, both hydrophobic, conservative substitution, small penalty
  • V → K – Small → large, hydrophobic → charged, large penalty
  • Requires some expert knowledge and judgement
• Empirical methods
  • How often does the substitution V → I occur in proteins that are known to be related?
  • Scoring matrices: PAM and BLOSUM
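As a rough sketch of the physical-and-chemical approach, the Python below scores a substitution by counting how many listed properties change. The property table covers only the three residues named on the slide and uses the slide's own descriptions; the penalty constants and the function name `substitution_penalty` are hypothetical.

```python
# Properties as described on the slide, for the three residues it mentions.
PROPERTIES = {
    "V": {"size": "small", "character": "hydrophobic"},
    "I": {"size": "small", "character": "hydrophobic"},
    "K": {"size": "large", "character": "charged"},
}

# Hypothetical penalty values, chosen only for illustration.
BASE_PENALTY = -1          # any substitution costs at least this much
PER_PROPERTY_PENALTY = -2  # extra cost for each property that changes


def substitution_penalty(a: str, b: str) -> int:
    """Penalty for replacing residue a with residue b."""
    if a == b:
        return 0
    differing = sum(
        PROPERTIES[a][key] != PROPERTIES[b][key] for key in PROPERTIES[a]
    )
    return BASE_PENALTY + PER_PROPERTY_PENALTY * differing


print(substitution_penalty("V", "I"))  # -1: conservative, small penalty
print(substitution_penalty("V", "K"))  # -5: size and character both change
```

Choosing the properties and their weights is exactly the expert judgement the slide refers to, which is the motivation for the empirical matrices.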


PAM matrices
• PAM = “Point Accepted Mutation”: only mutations that have been “accepted” by natural selection are counted
• Starts with a multiple sequence alignment of very similar (>85% identity) proteins, which are assumed to be homologous
• Compute the relative mutability, $m_i$, of each amino acid
  • e.g. $m_A$ = how many times was alanine substituted with anything else, relative to how often alanine occurs?
Relative mutability
• Example alignment:
  ACGCTAFKI
  GCGCTAFKI
  ACGCTAFKL
  GCGCTGFKI
  GCGCTLFKI
  ASGCTAFKL
  ACACTAFKL
• Across all pairs of sequences, there are 28 A → X substitutions
• There are 10 ALA residues, so $m_A = 28/10 = 2.8$
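The counts on this slide can be reproduced directly from the alignment. The following is a minimal sketch, assuming that "across all pairs" means every unordered pair of sequences compared column by column; the function name `relative_mutability` is hypothetical.

```python
from itertools import combinations

ALIGNMENT = [
    "ACGCTAFKI",
    "GCGCTAFKI",
    "ACGCTAFKL",
    "GCGCTGFKI",
    "GCGCTLFKI",
    "ASGCTAFKL",
    "ACACTAFKL",
]


def relative_mutability(alignment, residue):
    """Return (substitutions, occurrences, ratio) for one residue type."""
    substitutions = 0
    for s, t in combinations(alignment, 2):
        for x, y in zip(s, t):
            # Count columns where the two sequences differ and one of
            # the two residues is the residue of interest.
            if x != y and residue in (x, y):
                substitutions += 1
    occurrences = sum(seq.count(residue) for seq in alignment)
    return substitutions, occurrences, substitutions / occurrences


subs, count, m_A = relative_mutability(ALIGNMENT, "A")
print(subs, count, m_A)  # 28 10 2.8
```

The 28 substitutions come from three columns: 12 from the first (four A against three G), 6 from the third (one A against six G) and 10 from the sixth (five A against one G and one L).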
PAM matrices, cont’d
• Construct a phylogenetic tree for the sequences in the alignment (each branch is labeled with the substitution that occurred along it):
  ACGCTAFKI
    A→G: GCGCTAFKI
      A→G: GCGCTGFKI
      A→L: GCGCTLFKI
    I→L: ACGCTAFKL
      C→S: ASGCTAFKL
      G→A: ACACTAFKL
• Calculate substitution frequencies $F_{X,Y}$
• Substitutions may have occurred either way, so A → G also counts as G → A
  • e.g. $F_{G,A} = 3$ (A → G twice, G → A once)
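The substitution frequencies can be tallied by walking the branches of the tree. This sketch assumes the tree is given as a list of (parent, child) sequence pairs read off the slide, and it stores each count under an unordered pair of residues so that A → G and G → A land in the same bucket; the function name is hypothetical.

```python
from collections import Counter

# (parent, child) branches of the tree shown on the slide.
TREE_EDGES = [
    ("ACGCTAFKI", "GCGCTAFKI"),  # A -> G
    ("ACGCTAFKI", "ACGCTAFKL"),  # I -> L
    ("GCGCTAFKI", "GCGCTGFKI"),  # A -> G
    ("GCGCTAFKI", "GCGCTLFKI"),  # A -> L
    ("ACGCTAFKL", "ASGCTAFKL"),  # C -> S
    ("ACGCTAFKL", "ACACTAFKL"),  # G -> A
]


def substitution_frequencies(edges):
    """Count substitutions along tree branches, ignoring direction."""
    counts = Counter()
    for parent, child in edges:
        for x, y in zip(parent, child):
            if x != y:
                counts[frozenset((x, y))] += 1
    return counts


F = substitution_frequencies(TREE_EDGES)
print(F[frozenset("AG")])  # 3
print(F[frozenset("AL")])  # 1
```

Counting along tree branches rather than across all sequence pairs means each inferred mutation event is counted once, instead of once per pair of descendants that happen to differ.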
Mutation Probabilities
• $M_{i,j}$ represents the probability of a $j \to i$ substitution:

$$M_{i,j} = \frac{m_j F_{i,j}}{\sum_i F_{i,j}}$$

• For the example tree, $F_{G,A} = 3$ and $\sum_i F_{i,A} = 4$ (three A ↔ G substitutions and one A → L), so with $m_A = 2.8$:

$$M_{G,A} = \frac{2.8 \times 3}{4} = 2.1$$

The PAM matrix
• The entries $R_{i,j}$ are the logarithms of the $M_{i,j}$ values divided by the frequency of occurrence, $f_i$, of residue $i$: $R_{i,j} = \log(M_{i,j} / f_i)$
• $f_G$ = 10 GLY / 63 residues = 0.1587
• $R_{G,A} = \log(2.1 / 0.1587) = \log(13.23) = 1.122$
• The log is taken so that we can add, rather than multiply, entries to get compound probabilities
• Log-odds matrix
• Diagonal entries are $1 - m_j$
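The two formulas can be checked numerically. This sketch takes $m_A = 2.8$ from the relative-mutability slide (28 substitutions over 10 alanines) and the substitution counts involving alanine from the tree; it assumes the logarithm is base 10, which matches the slide's arithmetic. The function and variable names are hypothetical.

```python
import math

m_A = 28 / 10               # relative mutability of alanine
F_with_A = {"G": 3, "L": 1}  # tree substitution counts involving alanine
f_G = 10 / 63               # glycine frequency: 10 GLY out of 63 residues


def mutation_probability(m_j, F_ij, F_column_total):
    """M_ij = m_j * F_ij / sum_i F_ij, as defined on the slide."""
    return m_j * F_ij / F_column_total


def log_odds(M_ij, f_i):
    """R_ij = log10(M_ij / f_i)."""
    return math.log10(M_ij / f_i)


M_GA = mutation_probability(m_A, F_with_A["G"], sum(F_with_A.values()))
R_GA = log_odds(M_GA, f_G)
print(round(M_GA, 3), round(R_GA, 3))
```

A positive $R_{G,A}$ means the A → G replacement is seen more often than the background frequency of glycine alone would predict. Because the entries are logarithms, the score of an alignment is the sum of the entries for its columns.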


Interpretation of PAM matrices
• PAM-1 – one substitution per 100 residues (a PAM unit of time)
• Multiply PAM-1 by itself repeatedly to get PAM-100, etc.
• “Suppose I start with a given polypeptide sequence M at time t, and observe the evolutionary changes in the sequence until 1% of all amino acid residues have undergone substitutions at time t+n. Let the new sequence at time t+n be called M’. What is the probability that a residue of type j in M will be replaced by i in M’?”
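Extrapolating from PAM-1 to PAM-n is a matrix power. The sketch below uses a made-up three-residue alphabet with hypothetical probabilities (a real PAM-1 is 20 × 20); each column sums to 1 and each residue has a 1% chance of being replaced per PAM unit. Pure-Python helpers are used so the example has no dependencies.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(m, n):
    """Return the n-th power of m (n >= 1) by repeated squaring."""
    result = None
    base = m
    while n > 0:
        if n & 1:
            result = base if result is None else mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result


# Toy "PAM-1": entry [i][j] is the probability that residue j is replaced
# by residue i after one PAM unit. Values are hypothetical.
pam1 = [
    [0.990, 0.004, 0.006],
    [0.006, 0.990, 0.004],
    [0.004, 0.006, 0.990],
]

pam100 = mat_pow(pam1, 100)
for row in pam100:
    print([round(p, 3) for p in row])
```

Each column of the result still sums to 1, and the diagonal shrinks as n grows: the longer the evolutionary distance, the less likely a residue is to be found unchanged.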
PAM matrix considerations
• If $M_{i,j}$ is very small, we may not have a large enough sample to estimate the real probability. When we multiply the PAM matrices many times, the error is magnified.
• PAM-1 – similar sequences; PAM-1000 – very dissimilar sequences
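The effect of a poorly estimated entry can be seen in a toy two-residue model. All numbers here are hypothetical: the "true" chance of X → Y per PAM unit is taken to be 0.010, and a small sample is assumed to have produced the estimate 0.012. The sketch compares the X → Y entry of the true and estimated matrices after repeated multiplication.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(m, n):
    """Return m multiplied by itself so that the result is m^n (n >= 1)."""
    result = m
    for _ in range(n - 1):
        result = mat_mul(result, m)
    return result


def two_state(a, b):
    """Column-stochastic matrix for residues X, Y: a = P(X->Y), b = P(Y->X)."""
    return [[1 - a, b],
            [a, 1 - b]]


TRUE_PAM1 = two_state(0.010, 0.010)
ESTIMATED_PAM1 = two_state(0.012, 0.010)  # X -> Y over-estimated


def entry_error(n):
    """Absolute error in the X -> Y entry of the n-th power."""
    true_value = mat_pow(TRUE_PAM1, n)[1][0]
    estimate = mat_pow(ESTIMATED_PAM1, n)[1][0]
    return abs(estimate - true_value)


for n in (1, 10, 50):
    print(n, round(entry_error(n), 4))
```

In this toy model an absolute error of 0.002 in the PAM-1 entry grows to several percentage points by the 50th power. The example only illustrates the direction of the effect; it says nothing about the size of the error in real PAM matrices.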


BLOSUM matrix
• Starts by clustering proteins by similarity
• Avoids problems with small probabilities by using averages over clusters
• Numbering works in the opposite direction to PAM: a higher number means more similar sequences
  • BLOSUM-62 is appropriate for sequences of about 62% identity, while BLOSUM-80 is appropriate for more similar sequences
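The clustering step can be sketched as follows. This is an illustration under stated assumptions, not the published BLOSUM procedure: it uses simple single-linkage clustering on percent identity, reuses the seven example sequences from the PAM slides as stand-in data, and gives each sequence a weight of 1 / (cluster size) so that every cluster contributes like a single averaged sequence. The function names are hypothetical.

```python
SEQUENCES = [
    "ACGCTAFKI",
    "GCGCTAFKI",
    "ACGCTAFKL",
    "GCGCTGFKI",
    "GCGCTLFKI",
    "ASGCTAFKL",
    "ACACTAFKL",
]


def percent_identity(s, t):
    """Fraction of aligned positions at which s and t agree."""
    return sum(x == y for x, y in zip(s, t)) / len(s)


def cluster(sequences, threshold):
    """Single-linkage clustering: a sequence joins (and merges) every
    cluster containing a member at least `threshold` identical to it."""
    clusters = []
    for seq in sequences:
        linked = [
            c for c in clusters
            if any(percent_identity(seq, member) >= threshold for member in c)
        ]
        merged = [seq]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(merged)
    return clusters


def sequence_weights(clusters):
    """Weight each sequence so that every cluster has total weight 1."""
    return {seq: 1 / len(c) for c in clusters for seq in c}


print(len(cluster(SEQUENCES, 0.85)))  # 1: neighbours are 8/9 identical
print(len(cluster(SEQUENCES, 0.90)))  # 7: no pair reaches 90% identity
```

Lowering the threshold merges more sequences into each cluster, so close relatives stop dominating the counts and the resulting matrix reflects more distant relationships.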
