Class16-Introduction To Molecular Phylogenetics


30/5/2023

SIJ1004
Class 16
Introduction to Molecular Phylogenetics

[Figure labels: mammals, vertebrates, invertebrates, protozoa]


Molecular sequence evolution


• Molecular sequences evolve by accumulating mutations.
• The amount of difference between two sequences should therefore indicate
their distance from the shared common ancestor.
• The closer the common ancestor (divergence in the recent past)
→ the fewer the differences.

• This means that by comparing three or more related molecular sequences
→ we can infer the evolutionary relationships between them.

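As a minimal sketch of the idea on this slide, the proportion of differing sites between two aligned sequences (often called the p-distance) can be computed as below. The function name `p_distance` and the three toy sequences are hypothetical and not part of the lecture; real analyses usually apply a substitution model to correct for multiple changes at the same site.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned, gap-free sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical toy sequences (invented for illustration only).
human = "ATGCTAGCTA"
chimp = "ATGCTAGCTT"
mouse = "ATGATCGGTT"

# Fewer differences suggest a more recent common ancestor.
print(p_distance(human, chimp))  # 1 difference in 10 sites
print(p_distance(human, mouse))  # 4 differences in 10 sites
```

In this toy data the human and chimp sequences differ at fewer sites than either does from mouse, which is the pattern expected if human and chimp diverged more recently.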

Phylogenetic basics

• Phylogenetics: the study of the evolutionary history of organisms.
• Traditionally based on fossil data, but more recently on molecular data
(evolution at the molecular level).
• Evolution is a process of mutation with selection.
• Multiple sequence alignment (MSA) → reveals similarities and divergence among
related biological sequences.
• Phylogenetic trees rationalize and visualize the details from an MSA.
• Molecular phylogenetics is a fundamental aspect of bioinformatics.
• Advantages of molecular phylogenetics:
  • Molecular data are more numerous than fossils
  • Less prone to the sampling bias of the fossil record
  • More robust phylogenetic trees can be constructed

Which species are the closest living relatives of modern humans?

[Figure: two trees of humans, chimpanzees, bonobos, gorillas and orangutans; the time axes run from 15-30 MYA to 0 (pre-molecular view) and from 14 MYA to 0 (molecular view)]

• Pre-molecular view: the great apes (chimpanzees, gorillas and orangutans) formed a clade separate from humans, and humans diverged from the apes at least 15-30 MYA.
• Molecular view: molecular data show that bonobos and chimpanzees are related more closely to humans than either is to gorillas.


Molecular phylogeny: nomenclature of trees

• Molecular phylogeny uses trees to depict evolutionary relationships
among organisms.

• In molecular phylogeny → trees are based upon DNA and protein sequence
data.

• There are two main kinds of information inherent to any tree:
  • Topology
  • Branch lengths

Types of trees: unrooted vs rooted

• A rooted phylogenetic tree is a tree with a unique root node corresponding to the most
recent common ancestor of all the entities at the leaves (aka tips) of the tree. A fully
resolved rooted tree is a binary tree.
• Unrooted trees illustrate the relatedness of the leaf nodes without making assumptions
about common ancestry. In a fully resolved unrooted tree, every internal node has three
edges and every leaf has one.


Unrooted vs rooted
• Unrooted tree:
  • No knowledge of the common ancestor
  • Shows relative relationships only
  • No evolutionary direction
• The root of the tree is often not known (the common ancestor is already extinct).
• In practice, the root still needs to be defined.
• Ways to define the root of a tree:
  • Outgroup (a distant relative; e.g. a bird for a mammal tree).
  • Midpoint rooting (root at the midpoint between the two most divergent groups) → divergence
  from root to tips is equal for both branches, following the “molecular clock”
  hypothesis.
• Molecular clock:
  • Molecular sequences evolve at constant rates.
  • The amount of accumulated mutations is proportional to evolutionary time.
  • Branch lengths on a tree can be used to estimate divergence time.
  • Assumes uniformity of evolutionary rates (rarely true in reality).
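
The molecular clock bullets can be turned into a small calculation. Under a strict clock, two lineages that split t years ago each accumulate r × t substitutions per site, so their pairwise distance is d = 2 × r × t. The sketch below rearranges this for t; the function name `divergence_time` and the example numbers are hypothetical, and the result is only as good as the constant-rate assumption.

```python
def divergence_time(distance: float, rate: float) -> float:
    """Estimate time since divergence under a strict molecular clock.

    distance: substitutions per site between two sequences
    rate:     substitutions per site per year along ONE lineage (assumed constant)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Both lineages accumulate changes, hence the factor of 2: d = 2 * r * t.
    return distance / (2.0 * rate)


# Hypothetical values: 0.02 substitutions/site, 1e-9 substitutions/site/year.
print(divergence_time(0.02, 1e-9))  # roughly ten million years
```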

Terminology
[Figure: example tree annotated with the terms clade, monophyletic, taxon/taxa (plural), node, branch, dichotomy, polytomy, lineage and root node]

• Taxa (singular: taxon): the units at the tips of the tree, also called Operational Taxonomic Units (OTUs).
• Node: the point where branches join.
• Dichotomy: a node that bifurcates to give two descendants.
• Polytomy: a multifurcating node.
• Lineage: a path depicting an ancestor-descendant relationship.
• Clade: a group of taxa descended from a single common ancestor.
• Monophyletic group: taxa that share a unique common ancestor not shared
by any other taxa (two such taxa are sister taxa to each other).
• Tree topology: the branching pattern.


Examples of clades

Lindblad-Toh et al., Nature 438: 803 (2005), fig. 10

Examples of multifurcation: failure to resolve the branching order of some metazoans and protostomes

Rokas A. et al., Animal Evolution and the Molecular Signature of Radiations Compressed in Time, Science 310:1933 (2005).


Dendrogram, cladogram, phylogram

• Dendrogram is the ‘generic’ term applied to any type of diagrammatic representation of
phylogenetic trees. All four trees depicted here are dendrograms.
• Cladogram (to some biologists) is a tree in which branch lengths DO NOT represent evolutionary
time; clades just represent a hypothesis about actual evolutionary history (not scaled).
TREE1 and TREE2 are cladograms and TREE1 = TREE2.
• Phylogram (to some biologists) is a tree in which branch lengths DO represent the amount of
character change, often read as evolutionary time (scaled).
TREE3 and TREE4 are phylograms and TREE3 ≠ TREE4.

Newick format
The Newick format provides tree topology (and, optionally, branch lengths) to computer programs. A Newick string ends with a semicolon.

Cladogram: (((B,C),A),(D,E));
Phylogram: (((B:1,C:2),A:2),(D:1.2,E:2.4));
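
To illustrate how a program might read such a string, here is a minimal recursive-descent parser sketch. The names `parse_newick` and `leaf_names` are hypothetical helpers, not a library API, and the sketch assumes unquoted labels with optional branch lengths only (no comments or quoted names).

```python
def parse_newick(text: str) -> dict:
    """Parse a simple Newick string into nested dicts.

    Each node is {"name": str, "length": float or None, "children": list}.
    """
    text = text.strip().rstrip(";")
    pos = 0

    def peek() -> str:
        return text[pos] if pos < len(text) else ""

    def parse_node() -> dict:
        nonlocal pos
        children = []
        if peek() == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while peek() == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
        # Optional node label.
        start = pos
        while peek() and peek() not in ",():":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ":".
        length = None
        if peek() == ":":
            pos += 1
            start = pos
            while peek() and peek() not in ",()":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return root


def leaf_names(node: dict) -> list:
    """Return leaf names in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    return [n for child in node["children"] for n in leaf_names(child)]


tree = parse_newick("(((B:1,C:2),A:2),(D:1.2,E:2.4));")
print(leaf_names(tree))
```

The same parser accepts the cladogram string, in which case every `length` is `None`.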


Finding a tree may be difficult

• The search for a correct tree topology can sometimes be extremely difficult
and computationally demanding.
• Number of possible tree topologies is a function of the number of taxa.

• Unrooted trees:
$N_U = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}$

• Rooted trees:
$N_R = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$
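
To get a feel for how quickly these numbers grow, the two formulas can be evaluated directly. This is a minimal sketch; the function names are hypothetical, n is the number of taxa, and the formulas count fully resolved (bifurcating) topologies.

```python
from math import factorial


def num_unrooted_trees(n: int) -> int:
    """Number of unrooted bifurcating topologies for n taxa (n >= 3)."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def num_rooted_trees(n: int) -> int:
    """Number of rooted bifurcating topologies for n taxa (n >= 2)."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


for n in (3, 4, 5, 10):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

For 10 taxa there are already more than two million unrooted and more than 34 million rooted topologies, which is why exhaustive searches become impractical for larger data sets.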

Tree-building methods

• Two main classes of tree-building methods:
• Distance-based methods:
  • Distance metric: such as the number of amino acid changes
  between the sequences, or a distance score.
  • UPGMA
  • Neighbor-joining (NJ)
• Character-based methods:
  • Maximum parsimony: the search for the tree with the fewest
  amino acid (or nucleotide) changes that account for the
  observed differences between taxa.
  • Maximum likelihood
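
The parsimony criterion can be illustrated by counting the minimum number of changes a given tree requires. The sketch below uses Fitch's algorithm, which is one common way to do this counting and is not named in the lecture; the function names, the nested-tuple tree encoding and the toy alignment are all hypothetical. It only scores trees that are supplied to it and does not search tree space.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one alignment column.

    tree:   a leaf name (str) or a 2-tuple of subtrees
    states: dict mapping leaf name -> character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: one change is required on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the minimum number of changes over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Hypothetical two-site alignment for four taxa.
alignment = {"A": "GA", "B": "GA", "C": "TC", "D": "TC"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_length(tree_1, alignment))  # fewer changes: preferred under parsimony
print(parsimony_length(tree_2, alignment))
```

Here the tree grouping A with B and C with D needs fewer changes than the alternative, so it is the more parsimonious of the two.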


Multiple Sequence Alignment

Distance-based tree: calculate the pairwise alignments; if two sequences are closely related, put them next to each other on the tree.


Character-based tree: identify positions that best describe how characters (amino acids) are derived from common ancestors.

Classification of phylogenetic building methods

DATA TYPE     COMPUTATIONAL METHOD
              Optimality criterion               Clustering algorithm
Characters    PARSIMONY, MAXIMUM LIKELIHOOD      -
Distances     MINIMUM EVOLUTION                  UPGMA, NEIGHBOR-JOINING


Tree-building methods

[1] distance-based

[2] character-based: maximum parsimony

[3] character- and model-based: maximum likelihood

Tree-building methods: UPGMA

UPGMA:
• Unweighted pair group method using arithmetic mean.
• Distance-based method.


Step 1: compute the pairwise distances of all the proteins.

Step 2: find the two proteins with the smallest pairwise distance. Cluster them.
[Figure: proteins 1 and 2 joined at new node 6]


Step 3: find the next two proteins with the smallest pairwise distance. Cluster them.
[Figure: proteins 4 and 5 joined at new node 7]

Step 4: keep going. Cluster.
[Figure: node 8 added; leaf order 1 2 4 5 3]


Step 5: last cluster! This is your tree.
[Figure: final rooted tree with internal nodes 6, 7 and 8; leaf order 1 2 4 5 3]

• UPGMA is a simple approach for making trees.
• A UPGMA tree is always rooted.
• An assumption of the algorithm is that the molecular
clock is constant for sequences in the tree.
• If there are unequal substitution rates, the tree may
be wrong.
• While UPGMA is simple, it is generally less accurate than the
neighbor-joining approach.
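
The clustering steps can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `upgma`, the input layout (a dict of pairwise distances keyed by taxon-name pairs) and the four-taxon distance table are all hypothetical. Each merge places the new node at half the distance between the two clusters, which is where the constant-clock assumption enters.

```python
def upgma(dist):
    """Build a rooted tree by UPGMA and return it as a Newick string.

    dist: dict mapping (taxon_a, taxon_b) -> distance, one entry per unordered pair.
    """
    d = {frozenset(pair): float(v) for pair, v in dist.items()}
    if not d:
        raise ValueError("need at least one pairwise distance")
    size = {t: 1 for pair in d for t in pair}   # number of leaves per cluster
    height = {t: 0.0 for t in size}             # node height above the tips
    while len(size) > 1:
        # Pick the closest pair of clusters (ties broken by label for determinism).
        pair = min(d, key=lambda p: (d[p], sorted(p)))
        a, b = sorted(pair)
        h = d[pair] / 2.0                       # new node sits halfway between them
        new = f"({a}:{h - height[a]:g},{b}:{h - height[b]:g})"
        # Distance to every other cluster: average weighted by cluster size.
        for c in size:
            if c != a and c != b:
                d[frozenset((new, c))] = (
                    size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        # Drop all distances that involve the two merged clusters.
        d = {p: v for p, v in d.items() if a not in p and b not in p}
        size[new] = size.pop(a) + size.pop(b)
        height[new] = h
        del height[a], height[b]
    return next(iter(size)) + ";"


# Hypothetical distances that happen to fit a constant clock.
distances = {
    ("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 4,
    ("A", "D"): 6, ("B", "D"): 6, ("C", "D"): 6,
}
print(upgma(distances))
```

Because every tip ends up at the same distance from the root, the output is always rooted and ultrametric; with unequal substitution rates that constraint can force the wrong topology.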
