04 CAP5510 Fall21

The document discusses substitution patterns in bioinformatics, including how mutations occur, models for predicting the number of mutations, and scoring matrices which are used to derive relationships between nucleotides and amino acids based on observed substitution frequencies in alignments of related sequences. Various substitution models are presented, including Jukes-Cantor, two parameter, and PAM matrices, which are log-odds scores derived from observed accepted point mutations between closely related proteins over time.

CAP5510 – Bioinformatics

Substitution Patterns
Tamer Kahveci
CISE Department
University of Florida

1
Goals
• Understand how mutations occur
• Learn models for predicting the number of
mutations
• Understand why scoring matrices are used
and how they are derived
• Learn major scoring matrices

2
Why Substitution Patterns?
• Mutations happen because of mistakes in DNA
replication and repair.
• Our genetic code changes due to mutations
– Insert, delete, replace
• Three types of mutations
– Advantageous
– Disadvantageous
– Neutral
• We only observe substitutions that passed the
selection process

3
Mutation Rates

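Parent Organism
(diverged T time ago into Organism A and Organism B)

R = K/(2T)

K: number of substitutions between Organism A and Organism B
(the factor 2 accounts for both lineages accumulating substitutions over time T)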

4
Functional Constraints
• Functional sites are less likely to mutate
– Noncoding = 3.33 (subs/10^9 yr)
– Coding = 1.58 (subs/10^9 yr)
• Indels about 10 times less likely than
substitutions

5
Nucleotide Substitutions and Amino
Acids
• Synonymous substitutions do not change amino acids
• Nonsynonymous do change
• Degeneracy
– Fourfold degenerate: gly = {GGG, GGA, GGU, GGC}
– Twofold degenerate: asp = {GAU, GAC}, glu = {GAA, GAG}
– Non-degenerate: phe = UUU, leu = CUU, ile = AUU, val = GUU
• Example substitution rates in human and mouse
– Fourfold degenerate: 2.35
– Twofold degenerate: 1.67
– Non-degenerate: 0.56

6
Predicting Substitutions

How can we count the true number of substitutions?

7
Jukes-Cantor Model
• Each nucleotide can change into another
one with the same probability
– P(A->A’, 1) = x, for each A’ ≠ A
– P(A->A, 1) = 1 – 3x
• Compute P(A->A’, 2) & P(A->A, 2)
– P(A->A, t+1) = 3 P(A->A’, t) P(A’->A, 1) + P(A->A, t) P(A->A, 1)
– P(A->A, t) ~ ¼ + ¾ e^(-4xt)
• K = num. subst. = -¾ ln(1 – (4/3) f), f =
fraction of observed substitutions
• Oversimplification
[Diagram: nucleotides A, C, G, T; every pair is connected with the same probability x]

8
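
The Jukes-Cantor recurrence and the distance correction can be checked numerically. The following is a minimal sketch, not the lecture's own code: the function names and the values of x, t, and f are hypothetical, and the exponent uses the per-step probability x defined on the slide.

```python
import math

def jc_same_prob_exact(x, t):
    """Iterate P(A->A, t+1) = 3 P(A->A', t) x + P(A->A, t) (1 - 3x)."""
    p_same = 1.0  # at t = 0 the nucleotide is unchanged
    for _ in range(t):
        p_other = (1.0 - p_same) / 3.0  # by symmetry each A' is equally likely
        p_same = 3.0 * p_other * x + p_same * (1.0 - 3.0 * x)
    return p_same

def jc_same_prob_approx(x, t):
    """Closed form from the slide: 1/4 + 3/4 e^(-4xt)."""
    return 0.25 + 0.75 * math.exp(-4.0 * x * t)

def jc_distance(f):
    """K = -(3/4) ln(1 - 4f/3), where f is the observed fraction of differences."""
    if not 0.0 <= f < 0.75:
        raise ValueError("f must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * f / 3.0)

# Hypothetical values, chosen only for illustration.
x, t = 0.001, 200
print(jc_same_prob_exact(x, t), jc_same_prob_approx(x, t))
print(jc_distance(0.10))  # slightly larger than the observed fraction 0.10
```

The corrected distance K is always at least as large as f, because some sites have changed more than once and only the last state is observed.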
Two Parameter Model
• Transition:
– purine->purine (A, G),
pyrimidine->pyrimidine (C, T)
• Transversion:
– purine <-> pyrimidine
• Transitions are more
likely than transversions.
• Use different probabilities
for transitions and
transversions.
[Diagram: purines (A, G) and pyrimidines (C, T)]

9
Two Parameter Model
•P(AA,1) = 1-x-2y P(AA,2) = (1-x-2y) P(AA,1) + x P(AG,1) + y
•Compute P(AA,2) P(AC,1) + y P(AT,1)

y P(AA,t) = ¼ + ¼ e-4yt + ½ e-2(x+y)t

A C
y K = ½ ln(1/(1-2P-Q)) + ¼ ln(1/(1-2Q))
x
P,Q: fraction of transitions and transversions
G T observed.

10
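
The two-parameter formulas can be sanity-checked in the same way. This sketch assumes the nucleotide order A, G, C, T and hypothetical values for x, y, P, and Q; it compares a step-by-step Markov computation against the closed form and evaluates the distance K.

```python
import math

def k2p_same_prob_exact(x, y, t):
    """P(A->A, t) obtained by applying the one-step matrix t times."""
    stay = 1.0 - x - 2.0 * y
    # Order A, G, C, T: A<->G and C<->T are transitions (x), the rest transversions (y).
    step = [
        [stay, x, y, y],
        [x, stay, y, y],
        [y, y, stay, x],
        [y, y, x, stay],
    ]
    dist = [1.0, 0.0, 0.0, 0.0]  # start in state A
    for _ in range(t):
        dist = [sum(dist[i] * step[i][j] for i in range(4)) for j in range(4)]
    return dist[0]

def k2p_same_prob_approx(x, y, t):
    """Closed form from the slide: 1/4 + 1/4 e^(-4yt) + 1/2 e^(-2(x+y)t)."""
    return 0.25 + 0.25 * math.exp(-4.0 * y * t) + 0.5 * math.exp(-2.0 * (x + y) * t)

def k2p_distance(p, q):
    """K = 1/2 ln(1/(1 - 2P - Q)) + 1/4 ln(1/(1 - 2Q))."""
    a, b = 1.0 - 2.0 * p - q, 1.0 - 2.0 * q
    if a <= 0.0 or b <= 0.0:
        raise ValueError("P and Q are too large for the correction to be defined")
    return 0.5 * math.log(1.0 / a) + 0.25 * math.log(1.0 / b)

# Hypothetical values, chosen only for illustration.
print(k2p_same_prob_exact(0.002, 0.0005, 100), k2p_same_prob_approx(0.002, 0.0005, 100))
print(k2p_distance(0.10, 0.05))  # larger than the observed fraction P + Q = 0.15
```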
More Parameters ?
• Assign a different probability for each pair
of nucleotides
• Not harder to compute than simpler
models
• Not necessarily better than simpler models

11
Amino Acid substitutions (1)
• Harder to model than nucleotides
– An amino acid can be substituted for another in more
than one way
– The number of nucleotide substitutions needed to
transform one amino acid to another may differ
• Pro = CCC, leu = CUC, ile = AUC
– The likelihood of nucleotide substitutions may differ
• Asp = GAU, asn = AAU, his = CAU
– Amino acid substitutions may have different effects on
the protein function

12
Amino Acid substitutions (2)
• Mutation rates may vary greatly among
genes
– Nonsynonymous substitution may affect
functionality with smaller probability in some
genes
• Molecular clock (Zuckerkandl, Pauling)
– Mutation rates may be different for different
organisms, but they remain almost constant
over time.

13
Scoring Matrices

14
What is it & why ?
• Let alphabet contain N letters
– N = 4 and 20 for nucleotides and amino acids
• N x N matrix
• (i,j) shows the relationship between ith and jth
letters.
– Positive number if letter i is likely to mutate into letter j
– Negative otherwise
– Magnitude shows the degree of proximity
• Symmetric

15
The BLOSUM45 Matrix
A R N D C Q E G H I L K M F P S T W Y V
A 5 -2 -1 -2 -1 -1 -1 0 -2 -1 -1 -1 -1 -2 -1 1 0 -2 -2 0
R -2 7 0 -1 -3 1 0 -2 0 -3 -2 3 -1 -2 -2 -1 -1 -2 -1 -2
N -1 0 6 2 -2 0 0 0 1 -2 -3 0 -2 -2 -2 1 0 -4 -2 -3
D -2 -1 2 7 -3 0 2 -1 0 -4 -3 0 -3 -4 -1 0 -1 -4 -2 -3
C -1 -3 -2 -3 12 -3 -3 -3 -3 -3 -2 -3 -2 -2 -4 -1 -1 -5 -3 -1
Q -1 1 0 0 -3 6 2 -2 1 -2 -2 1 0 -4 -1 0 -1 -2 -1 -3
E -1 0 0 2 -3 2 6 -2 0 -3 -2 1 -2 -3 0 0 -1 -3 -2 -3
G 0 -2 0 -1 -3 -2 -2 7 -2 -4 -3 -2 -2 -3 -2 0 -2 -2 -3 -3
H -2 0 1 0 -3 1 0 -2 10 -3 -2 -1 0 -2 -2 -1 -2 -3 2 -3
I -1 -3 -2 -4 -3 -2 -3 -4 -3 5 2 -3 2 0 -2 -2 -1 -2 0 3
L -1 -2 -3 -3 -2 -2 -2 -3 -2 2 5 -3 2 1 -3 -3 -1 -2 0 1
K -1 3 0 0 -3 1 1 -2 -1 -3 -3 5 -1 -3 -1 -1 -1 -2 -1 -2
M -1 -1 -2 -3 -2 0 -2 -2 0 2 2 -1 6 0 -2 -2 -1 -2 0 1
F -2 -2 -2 -4 -2 -4 -3 -3 -2 0 1 -3 0 8 -3 -2 -1 1 3 0
P -1 -2 -2 -1 -4 -1 0 -2 -2 -2 -3 -1 -2 -3 9 -1 -1 -3 -3 -3
S 1 -1 1 0 -1 0 0 0 -1 -2 -3 -1 -2 -2 -1 4 2 -4 -2 -1
T 0 -1 0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -1 -1 2 5 -3 -1 0
W -2 -2 -4 -4 -5 -2 -3 -2 -3 -2 -2 -2 -2 1 -3 -4 -3 15 3 -3
Y -2 -1 -2 -2 -3 -1 -2 -3 2 0 0 -1 0 3 -3 -2 -1 3 8 -1
V 0 -2 -3 -3 -1 -3 -3 -3 -3 3 1 -2 1 0 -3 -1 0 -3 -1 5

16
Scoring Matrices for DNA

Identity
   A  C  G  T
A  1  0  0  0
C  0  1  0  0
G  0  0  1  0
T  0  0  0  1

BLAST
   A  C  G  T
A  1 -3 -3 -3
C -3  1 -3 -3
G -3 -3  1 -3
T -3 -3 -3  1

Transitions & transversions
   A  C  G  T
A  1 -5 -1 -5
C -5  1 -5 -1
G -1 -5  1 -5
T -5 -1 -5  1

17
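
To see how the three DNA matrices differ in practice, the sketch below rebuilds them from three numbers (match, transition, transversion) and scores one ungapped alignment. The helper names and the two example sequences are hypothetical; the matrix values are the ones on the slide.

```python
def make_matrix(match, transition, transversion):
    """Build a 4x4 DNA scoring matrix as a dict keyed by (letter, letter)."""
    purines = set("AG")
    matrix = {}
    for a in "ACGT":
        for b in "ACGT":
            if a == b:
                score = match
            elif (a in purines) == (b in purines):
                score = transition  # purine<->purine or pyrimidine<->pyrimidine
            else:
                score = transversion
            matrix[a, b] = score
    return matrix

IDENTITY = make_matrix(1, 0, 0)
BLAST = make_matrix(1, -3, -3)
TS_TV = make_matrix(1, -1, -5)

def score_alignment(s1, s2, matrix):
    """Sum the matrix entries over the columns of an ungapped alignment."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have equal length")
    return sum(matrix[a, b] for a, b in zip(s1, s2))

# Hypothetical sequences: four matches and two transitions (C/T and T/C).
s1, s2 = "ACGTAC", "ATGCAC"
for name, m in [("identity", IDENTITY), ("BLAST", BLAST), ("ts/tv", TS_TV)]:
    print(name, score_alignment(s1, s2, m))
```

The same pair of sequences scores differently under each scheme because the BLAST matrix penalizes every mismatch equally, while the third matrix penalizes transitions less than transversions.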
Scoring Matrices for Amino Acids
• Chemical similarities
– Non-polar, Hydrophobic (G, A, V, L, I, M, F, W, P)
– Polar, Hydrophilic (S, T, C, Y, N, Q)
– Electrically charged (D, E, K, R, H)
– Requires expert knowledge
• Genetic code: Nucleotide substitutions
– E: GAA, GAG
– D: GAU, GAC
– F: UUU, UUC
• Actual substitutions
– PAM
– BLOSUM

18
Scoring Matrices: Actual
Substitutions
• Manually align proteins
• Look for amino acid substitutions
• Entry ~ log(freq(observed)/freq(expected))
• Log-odds matrices

19
PAM Matrices

(Dayhoff 1972)

20
PAM
• PAM = “Point Accepted Mutation”: we are
interested only in mutations that have
been “accepted” by natural selection
• An accepted mutation is a mutation that
occurred and was positively selected by
the environment; that is, it did not cause
the demise of the particular organism
where it occurred.

21
Interpretation of PAM matrices
• PAM-1 : one substitution per 100 residues (a
PAM unit of time)
• “Suppose I start with a given polypeptide sequence M at
time t, and observe the evolutionary changes in the
sequence until 1% of all amino acid residues have
undergone substitutions at time t+n. Let the new
sequence at time t+n be called M’. What is the probability
that a residue of type j in M will be replaced by i in M’?”
• PAM-K : K PAM time units

22
PAM Matrices (1)
• Starts with a multiple sequence alignment
of very similar (>85% identity) proteins.
Assumed to be homologous
• Compute the relative mutability, m_i, of
each amino acid
– e.g. m_A = how many times was alanine
substituted with anything else on the
average?

23
Relative Mutability
• ACGCTAFKI
GCGCTAFKI
ACGCTAFKL
GCGCTGFKI
GCGCTLFKI
ASGCTAFKL
ACACTAFKL
• Across all pairs of sequences, there are 28
A -> X substitutions
• There are 10 ALA residues, so m_A = 28/10 = 2.8

24
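
The counts on the Relative Mutability slide can be reproduced directly from the seven sequences. This is a small sketch under the slide's definition (mismatches involving the residue over all sequence pairs, divided by the number of occurrences of the residue); the function name is hypothetical.

```python
from itertools import combinations

SEQS = [
    "ACGCTAFKI",
    "GCGCTAFKI",
    "ACGCTAFKL",
    "GCGCTGFKI",
    "GCGCTLFKI",
    "ASGCTAFKL",
    "ACACTAFKL",
]

def relative_mutability(seqs, residue):
    """Return (substitutions involving residue, residue count, their ratio)."""
    substitutions = 0
    for s, t in combinations(seqs, 2):  # every unordered pair of sequences
        for a, b in zip(s, t):          # every aligned column
            if a != b and residue in (a, b):
                substitutions += 1
    occurrences = sum(seq.count(residue) for seq in seqs)
    return substitutions, occurrences, substitutions / occurrences

print(relative_mutability(SEQS, "A"))  # → (28, 10, 2.8)
```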
PAM Matrices (2)
• Construct a phylogenetic tree for the sequences in the
alignment

ACGCTAFKI
  A->G: GCGCTAFKI
    A->G: GCGCTGFKI
    A->L: GCGCTLFKI
  I->L: ACGCTAFKL
    C->S: ASGCTAFKL
    G->A: ACACTAFKL

• Calculate substitution frequencies F_{X,Y}
– F_{G,A} = 3
• Substitutions may have occurred either way, so A->G
also counts as G->A.

25
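
Counting substitutions along the tree is mechanical once the parent/child edges are listed. The sketch below hard-codes the six edges of the example tree and counts each substitution in both directions by using an unordered pair as the key; the variable and function names are hypothetical.

```python
from collections import Counter

# (parent, child) edges of the example tree.
EDGES = [
    ("ACGCTAFKI", "GCGCTAFKI"),
    ("ACGCTAFKI", "ACGCTAFKL"),
    ("GCGCTAFKI", "GCGCTGFKI"),
    ("GCGCTAFKI", "GCGCTLFKI"),
    ("ACGCTAFKL", "ASGCTAFKL"),
    ("ACGCTAFKL", "ACACTAFKL"),
]

def substitution_counts(edges):
    """Count substitutions per unordered residue pair, so A->G also counts as G->A."""
    counts = Counter()
    for parent, child in edges:
        for a, b in zip(parent, child):
            if a != b:
                counts[frozenset((a, b))] += 1
    return counts

F = substitution_counts(EDGES)
print(F[frozenset("AG")])  # → 3
# Total number of substitutions that involve alanine.
print(sum(n for pair, n in F.items() if "A" in pair))  # → 4
```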
Mutation Probabilities
• M_{i,j} represents the probability of J -> I
substitution.

ACGCTAFKI
  A->G: GCGCTAFKI
    A->G: GCGCTGFKI
    A->L: GCGCTLFKI
  I->L: ACGCTAFKL
    C->S: ASGCTAFKL
    G->A: ACACTAFKL

• M_{i,j} = m_j F_{i,j} / Σ_i F_{i,j}
• M_{G,A} = 2.8 × 3 / 4 = 2.1

26
The PAM Matrix
• The entries of the scoring matrix are the
M_{i,j} values divided by the frequency of
occurrence, f_i, of residue i.
• f_G = 10 GLY / 63 residues = 0.1587
• R_{G,A} = log10(2.1/0.1587) = log10(13.23) = 1.122
• Log-odds matrix
• Diagonal entries are M_{j,j} = 1 – m_j

27
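
The worked PAM entry can be recomputed from the quantities given on the slides (m_A = 2.8, F_{G,A} = 3, four substitutions involving A in total, f_G = 10/63). This is a sketch only: the function names are hypothetical, and a base-10 logarithm is assumed.

```python
import math

m_A = 2.8                    # relative mutability of alanine, from the slides
F_to_A = {"G": 3, "L": 1}    # F_{i,A}: substitution counts involving A, from the tree
f_G = 10 / 63                # frequency of glycine in the alignment

def mutation_probability(m_j, f_column, i):
    """M_{i,j} = m_j F_{i,j} / sum_i F_{i,j}."""
    return m_j * f_column[i] / sum(f_column.values())

def log_odds(m_ij, f_i):
    """R_{i,j} = log10(M_{i,j} / f_i); the base of the logarithm is an assumption."""
    return math.log10(m_ij / f_i)

M_GA = mutation_probability(m_A, F_to_A, "G")
R_GA = log_odds(M_GA, f_G)
print(M_GA, R_GA)
```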
Computation of PAM-K
• Assume that changes at time T+1 are
independent of the changes at time T.
• Markov chain
• P(A->B) = Σ_X P(A->X) P(X->B)
• PAM-K = (PAM-1)^K
• PAM-250 is most commonly used

28
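
Raising the one-step probability matrix to the K-th power is ordinary matrix multiplication. The sketch below uses a hypothetical 3-letter alphabet and made-up probabilities purely to illustrate the Markov-chain step; a real PAM-1 matrix is 20 x 20 and is estimated from data.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, k):
    """Compute m^k by repeated multiplication, starting from the identity."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result

# Hypothetical toy "PAM-1": entry [i][j] is the probability of j -> i,
# so every column sums to 1.
PAM1 = [
    [0.990, 0.004, 0.002],
    [0.006, 0.990, 0.008],
    [0.004, 0.006, 0.990],
]

PAM2 = mat_pow(PAM1, 2)
PAM250 = mat_pow(PAM1, 250)
print(PAM2[0][0])
print([sum(PAM250[i][j] for i in range(3)) for j in range(3)])  # columns still sum to ~1
```

After many steps the columns approach a common stationary distribution, which is why large K suits distantly related sequences.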
PAM - Discussion
• Smaller K, PAM-K is better for closely related
sequences, large K is better for distantly related
sequences
• Biased towards closely related sequences since it starts
from highly similar sequences (BLOSUM solves this)
• If M_{i,j} is very small, we may not have a large enough
sample to estimate the real probability. When we
multiply the PAM matrices many times, the error is
magnified.
• Mutation rate may change from one gene to another

29
BLOSUM Matrices

Henikoff & Henikoff 1992

30
BLOSUM Matrix
• Begin with a set of protein sequences and obtain blocks.
– ~2000 blocks from 500 families of related proteins
– More data than PAM
• A block is the ungapped alignment of a highly conserved region of a
family of proteins.
• MOTIF program is used to find blocks
• Substitutions in these blocks are used to compute BLOSUM matrix

block 1 block 2 block 3

WWYIR CASILRKIYIYGPV GVSRLRTAYGGRKNRG

WFYVR … CASILRHLYHRSPA … GVGSITKIYGGRKRNG
WYYVR AAAVARHIYLRKTV GVGRLRKVHGSTKNRG
WYFIR AASICRHLYIRSPA GIGSFEKIYGGRRRRG
31
Constructing the Matrix
• Count the frequency of occurrence of each amino acid. This gives
the background distribution p_a
• Count the number of times amino acid a is aligned with amino acid
b: f_ab
– A block of width w and depth s contributes ws(s-1)/2 = n_p pairs
• Compute the occurrence probability of each pair
– q_ab = f_ab / n_p
• Compute the probability of occurrence of amino acid a
– p_a = q_aa + Σ_{b≠a} q_ab / 2
• Compute the expected probability of occurrence of each pair
– e_ab = 2 p_a p_b, if a ≠ b
– e_ab = p_a p_b, otherwise
• Compute the log likelihood ratios, normalize, and round.
– 2 * log2(q_ab / e_ab)
32
Constructing the Matrix: Example
Block column (top to bottom): A A A A S A A A A A (9A, 1S)
• f_AA = 36, f_AS = 9
• Observed frequencies of pairs
– q_AA = f_AA/(f_AA + f_AS) = 36/45 = 0.8
– q_AS = 9/45 = 0.2
• Expected frequencies of letters
– p_A = q_AA + q_AS/2 = 0.9
– p_S = q_AS/2 = 0.1
• Expected frequencies of pairs
– e_AA = p_A x p_A = 0.81
– e_AS = 2 x p_A x p_S = 0.18
• Matrix entries
– M_AA = 2 x log2(q_AA/e_AA) = -0.04 ~ 0
– M_AS = 2 x log2(q_AS/e_AS) = 0.3 ~ 0

33
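
The BLOSUM construction steps can be followed end to end on the single-column example (9 A, 1 S). This is a minimal sketch of the counting and log-odds steps only; it omits the clustering by percent identity, the function name is hypothetical, and scores are returned only for pairs that were actually observed.

```python
import math
from collections import Counter
from itertools import combinations

def blosum_scores(rows):
    """rows: equal-length strings forming an ungapped block. Returns (q, p, scores)."""
    pair_counts = Counter()
    for column in zip(*rows):
        for a, b in combinations(column, 2):       # s(s-1)/2 pairs per column
            pair_counts[tuple(sorted((a, b)))] += 1
    n_pairs = sum(pair_counts.values())
    q = {pair: n / n_pairs for pair, n in pair_counts.items()}

    p = Counter()                                   # p_a = q_aa + sum_{b != a} q_ab / 2
    for (a, b), q_ab in q.items():
        if a == b:
            p[a] += q_ab
        else:
            p[a] += q_ab / 2
            p[b] += q_ab / 2

    scores = {}
    for (a, b), q_ab in q.items():
        e_ab = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[a, b] = 2 * math.log2(q_ab / e_ab)   # unrounded score
    return q, dict(p), scores

rows = list("AAAASAAAAA")  # one column, ten sequences: 9 A and 1 S
q, p, scores = blosum_scores(rows)
print(q)
print(p)
print({pair: round(s, 2) for pair, s in scores.items()})
```

Both scores round to 0, matching the slide: with a single, almost fully conserved column the observed pair frequencies are close to what the letter frequencies alone predict.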
Computation of BLOSUM-K
• Different levels of the BLOSUM matrix can be created by
differentially weighting the degree of similarity between
sequences. For example, a BLOSUM62 matrix is
calculated from protein blocks such that if two
sequences are more than 62% identical, then the
contributions of these sequences are weighted to sum to
one. In this way the contributions of multiple entries of
closely related sequences are reduced.
• Larger numbers are used to measure recent divergence;
the default is BLOSUM62

34
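
One way to picture the weighting is to cluster the sequences of a block by percent identity and give each cluster a total weight of one. The sketch below is an illustration under assumptions, not the published procedure: it uses single-linkage clustering, a strict "more than 62%" test as worded on the slide, and block 1 from the earlier example as input.

```python
from collections import Counter

def identity(a, b):
    """Fraction of aligned positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster_weights(seqs, threshold=0.62):
    """Weight each sequence by 1 / (size of its cluster), so each cluster sums to one."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if identity(seqs[i], seqs[j]) > threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    sizes = Counter(find(i) for i in range(len(seqs)))
    return [1 / sizes[find(i)] for i in range(len(seqs))]

BLOCK1 = ["WWYIR", "WFYVR", "WYYVR", "WYFIR"]
print(cluster_weights(BLOCK1))  # → [1.0, 0.5, 0.5, 1.0]
```

Only WFYVR and WYYVR are more than 62% identical (4 of 5 positions), so they share one unit of weight while the other two sequences each count fully.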
BLOSUM 62 Matrix
Check scores for
• MILV: small hydrophobic
• NDEQ: acid, hydrophilic
• HRK: basic
• FYW: aromatic
• STPAG: small hydrophilic
• C: sulphydryl

35
PAM vs. BLOSUM

Approximately equivalent PAM and BLOSUM matrices:

PAM100 ~ BLOSUM90
PAM120 ~ BLOSUM80
PAM160 ~ BLOSUM60
PAM200 ~ BLOSUM52
PAM250 ~ BLOSUM45

BLOSUM62 is the default matrix to use.

36
PAM vs. BLOSUM

PAM | BLOSUM
Built from global alignments | Built from local alignments
Built from a small amount of data | Built from a vast amount of data
Counting is based on minimum replacement or maximum parsimony | Counting is based on groups of related sequences counted as one
Performs better for finding global alignments and remote homologs | Performs better for finding local alignments
Higher PAM series means more divergence | Lower BLOSUM series means more divergence
