Unit IV

1. Phylogenetic analysis aims to reconstruct evolutionary relationships between species from genetic sequences. The more similar the sequences, the more closely related the organisms are assumed to be.
2. Different software packages and datasets can produce different phylogenetic trees, so an accurate sequence alignment is important. Gene trees may not match organismal trees, and trees built from different genes may disagree.
3. Key terminology includes nodes, branches, topology, branch lengths, root, and distance scale. Nodes represent taxonomic units (existing taxa or their ancestors), branches define relationships of descent, and topology is the branching pattern. Branch lengths and the distance scale represent the amount of genetic change.

Uploaded by

Dr. R. K. Selvakesavan PSGRKCW


PHYLOGENETIC ANALYSIS

1. Purpose of phylogenetics :
With the aid of sequences, it should be possible to find the genealogical ties between organisms.
Experience shows that closely related organisms have similar sequences, while more distantly related
organisms have more dissimilar sequences. One objective is to reconstruct the evolutionary
relationships between species. Another objective is to estimate the time of divergence between
two organisms since they last shared a common ancestor.

2. Disclaimers :
The theory and practical applications of the different models are not universally accepted. With
one dataset, different software packages can give different results. Changes in the dataset can
also give different results. Therefore it is important to have a good alignment to start with.
Trees based on an alignment of a gene represent the relationship between genes and this is not
necessarily the same relationship as between the whole organisms. If trees are calculated based
on different genes from organisms, it is possible that these trees result in different relationships.

3. Terminology :
node : a node represents a taxonomic unit. This can be a taxon (an existing species) or an
ancestor (an unknown species that represents the ancestor of 2 or more species).
branch : defines the relationship between the taxa in terms of descent and ancestry.
topology : the branching pattern.
branch length : often represents the number of changes that have occurred in that branch.
root : the common ancestor of all taxa.
distance scale : a scale that represents the number of differences between sequences (e.g. 0.1
means 10 % difference between two sequences).

How to construct a Phylogenetic tree?

• A phylogenetic tree is a visual representation of the relationship between different
organisms, showing the path through evolutionary time from a common ancestor to different
descendants.
• Similarities and divergence among related biological sequences revealed by sequence
alignment often have to be rationalized and visualized in the context of phylogenetic trees.
Thus, molecular phylogenetics is a fundamental aspect of bioinformatics.
• Molecular phylogenetics is the branch of phylogeny that analyzes hereditary molecular
differences, predominantly in DNA sequences, to gain information on an organism’s
evolutionary relationships.
• The similarity of biological functions and molecular mechanisms in living organisms
strongly suggests that species descended from a common ancestor. Molecular phylogenetics
uses the structure and function of molecules and how they change over time to infer these
evolutionary relationships.
• From these analyses, it is possible to determine the processes by which diversity among
species has been achieved. The result of a molecular phylogenetic analysis is expressed in a
phylogenetic tree.

Phylogenetic Analysis and the Role of Bioinformatics

Molecular data in the form of DNA or protein sequences can also provide very useful
evolutionary perspectives on existing organisms because, as organisms evolve, their genetic
material accumulates mutations over time, causing phenotypic changes. Because genes are the
medium that records the accumulated mutations, they can serve as molecular fossils. Through
comparative analysis of the molecular fossils from a number of related organisms, the
evolutionary history of the genes and even of the organisms can be revealed. However, phylogeny
inference is a notoriously difficult endeavour because the number of possible solutions increases
explosively with the number of taxa, while many new questions in evolutionary biology can only
be investigated with larger taxon samplings.
With the development of computational methods and an array of bioinformatics tools, it is
becoming possible to analyze large data sets in practical computing times and to obtain optimal
or near-optimal solutions with high probability. In response to this trend, much of
the current research in phyloinformatics (i.e., computational phylogenetics) concentrates on the
development of more efficient heuristic approaches.
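To make the "explosive" growth concrete, the sketch below counts the distinct strictly bifurcating topologies for n labeled taxa, using the standard counting result: (2n-5)!! unrooted trees and (2n-3)!! rooted trees. The function names are illustrative and not taken from any package.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 (1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted bifurcating topologies for n labeled taxa (n >= 3)."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa: int) -> int:
    """Number of rooted bifurcating topologies for n labeled taxa (n >= 2)."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted tree")
    return double_factorial(2 * n_taxa - 3)


for n in (4, 5, 10):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
# 4 3 15
# 5 15 105
# 10 2027025 34459425
```

With only 10 taxa there are already more than two million unrooted topologies, which is why exhaustive search is quickly replaced by heuristics.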
Steps in Phylogenetic Analysis
The basic steps in any phylogenetic analysis include:
1. Assemble and align a dataset
• The first step is to identify a protein or DNA sequence of interest and assemble a dataset
consisting of other related sequences.
• DNA sequences of interest can be retrieved using NCBI BLAST or similar search tools.
• Once sequences are selected and retrieved, a multiple sequence alignment is created.
• This involves arranging a set of sequences in a matrix to identify regions of homology.
• There are many websites and software programs, such as ClustalW, MSA, MAFFT,
and T-Coffee, designed to perform multiple sequence alignment on a given set of molecular data.
2. Build (estimate) phylogenetic trees from sequences using computational methods and
stochastic models
• To build phylogenetic trees, statistical methods are applied to determine the tree
topology and calculate the branch lengths that best describe the phylogenetic relationships of
the aligned sequences in a dataset.
• The most common computational methods applied include distance-matrix methods,
and discrete data methods, such as maximum parsimony and maximum likelihood.
• Several software packages, such as PAUP, PAML, and PHYLIP, implement these
popular methods.

3. Statistically test and assess the estimated trees.

• Tree-estimating algorithms generate one or more optimal trees.
• This set of possible trees is subjected to a series of statistical tests to evaluate whether
one tree is better than another, and whether the proposed phylogeny is reasonable.
• Common methods for assessing trees include the bootstrap and jackknife resampling
methods, and analytical methods, such as parsimony, distance, and likelihood.
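The resampling step of the bootstrap can be sketched as follows. Alignment columns are drawn with replacement to build a pseudo-replicate of the same length. In a full analysis a tree would be re-estimated from each replicate and the support of a branch reported as the fraction of replicates that contain it. The alignment, the taxon names, and the function name are made up for illustration.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment: all sequences must have the same length.
alignment = {"tax1": "ACGTAC", "tax2": "ACGAAC", "tax3": "TCGAAT"}
replicate = bootstrap_alignment(alignment, random.Random(0))
for name, seq in replicate.items():
    print(name, seq)
```

Because whole columns are sampled, every site in the replicate keeps the same base for each taxon that it had in the original alignment.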

Bioinformatics Tools for Phylogenetic Analysis

• There are several bioinformatics tools and databases that can be used for phylogenetic
analysis.
• These include PANTHER, P-Pod, PFam, TreeFam, and the PhyloFacts structural
phylogenomic encyclopedia.
• Each of these databases uses different algorithms and draws on different sources for
sequence information, and therefore the trees estimated by PANTHER, for example, may differ
significantly from those generated by P-Pod or PFam.
• As with all bioinformatics tools of this type, it is important to test different methods,
compare the results, then determine which database works best (according to consensus results)
for studies involving different types of datasets.
Phylogenetic Models
Possible ways of drawing a tree : Trees can be drawn in different ways. There are trees with
unscaled branches and with scaled branches.
Unscaled branches : the length is not proportional to the number of changes. Sometimes, the
number of changes is indicated on the branches with numbers. The nodes represent the
divergence events on a timescale.
Scaled branches : the length of the branch is proportional to the number of changes. The
distance between 2 species is the sum of the lengths of all branches connecting them.
It is also possible to draw these trees with or without a root. For rooted trees, the root is the
common ancestor. For each species, there is a unique path that leads from the root to that
species. The direction of each path corresponds to evolutionary time. An unrooted tree
specifies the relationships among species but does not define the evolutionary path.
Figure 2: Some possibilities for drawing a tree (these are just a few examples; many
variations are possible).
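For a tree with scaled branches, the distance between two species is the sum of the branch lengths on the path that connects them. A minimal sketch, assuming the rooted tree is stored as a hypothetical child -> (parent, branch length) mapping with invented lengths:

```python
def distances_to_ancestors(parent: dict, node: str) -> dict:
    """Map the node and each of its ancestors to the summed branch length from the node."""
    dist = {node: 0.0}
    total = 0.0
    while node in parent:
        node, length = parent[node][0], parent[node][1]
        total += length
        dist[node] = total
    return dist


def leaf_distance(parent: dict, a: str, b: str) -> float:
    """Sum of branch lengths on the path joining a and b via their nearest common ancestor."""
    from_a = distances_to_ancestors(parent, a)
    node, total = b, 0.0
    # Walk up from b until we meet a node that is also on the path from a to the root.
    while node not in from_a:
        up, length = parent[node]
        total += length
        node = up
    return from_a[node] + total


# Hypothetical tree: root -> X (0.1), root -> C (0.4), X -> A (0.2), X -> B (0.3)
parent = {"A": ("X", 0.2), "B": ("X", 0.3), "X": ("root", 0.1), "C": ("root", 0.4)}
print(leaf_distance(parent, "A", "B"))
print(leaf_distance(parent, "A", "C"))
```

Here A and B are joined through X (0.2 + 0.3), while A and C are joined through the root (0.2 + 0.1 + 0.4).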
Tree Building methods

5. Methods of phylogenetic analysis :

There are two major groups of analyses to examine phylogenetic relationships between
sequences :
Phenetic methods : trees are calculated by similarities of sequences and are based on distance
methods. The resulting tree is called a dendrogram and does not necessarily reflect
evolutionary relationships. Distance methods compress all of the individual differences
between pairs of sequences into a single number.
Cladistic methods : trees are calculated by considering the various possible pathways of
evolution and are based on parsimony or likelihood methods. The resulting tree is called a
cladogram. Cladistic methods use each alignment position as evolutionary information to
build a tree.

5.1. Phenetic methods based on distances :

Starting from an alignment, pairwise distances are calculated between DNA sequences as the
sum of all base pair differences between two sequences (the most similar sequences are
assumed to be closely related). This creates a distance matrix. All base changes can be
weighted equally, or a matrix of the possible replacements can be used.
Insertions and deletions are given a larger weight than replacements. Insertions or deletions of
multiple bases at one position are given less weight than multiple independent insertions or
deletions.
It is possible to correct for multiple substitutions at a single site.
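As an illustration of a pairwise distance with a correction for multiple substitutions, the sketch below computes the proportion of differing sites (p-distance) and applies the Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3). The choice of Jukes-Cantor is an assumption made here for simplicity, since the text does not name a specific correction. Gapped positions are simply skipped, which is a simplification of the indel weighting described above.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, ignoring positions with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance; undefined once p reaches 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(alignment: dict) -> dict:
    """Corrected distance for every ordered pair of sequence names."""
    return {
        (x, y): jukes_cantor(p_distance(alignment[x], alignment[y]))
        for x in alignment
        for y in alignment
    }


# Hypothetical aligned sequences.
alignment = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACCTAGGA"}
matrix = distance_matrix(alignment)
print(round(matrix[("s1", "s3")], 4))
```

The corrected distance is always at least as large as the observed proportion of differences, because it estimates changes that were overwritten by later substitutions at the same site.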

From the obtained distance matrix, a phylogenetic tree is calculated with clustering
algorithms. These cluster methods construct a tree by linking the least distant pair of taxa,
followed by successively more distant taxa.
UPGMA clustering (Unweighted Pair Group Method using Arithmetic averages) : this is the
simplest method.
Neighbor Joining : this method tries to correct the UPGMA method for its assumption that
the rate of evolution is the same in all taxa.

5.2. Cladistic methods based on Parsimony :

For each position in the alignment, all possible trees are evaluated and are given a score
based on the number of evolutionary changes needed to produce the observed sequence
changes. The most parsimonious tree is the one with the fewest evolutionary changes for all
sequences to derive from a common ancestor. This is a more time-consuming method than the
distance methods.
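Scoring one candidate tree can be sketched with Fitch's algorithm, a standard way to count the minimum number of changes per site on a fixed bifurcating tree. The text above does not name a specific algorithm, so this choice is an assumption. Trees are written as nested tuples of taxon names, and the alignment is invented for illustration.

```python
def fitch_site(tree, states: dict):
    """Return (possible ancestral states, minimum changes) for one alignment column."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_site(left, states)
    right_set, right_changes = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no extra change is needed here.
        return common, left_changes + right_changes
    # No shared state: at least one change on the branches below this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment: dict) -> int:
    """Sum the minimum number of changes over all alignment positions."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


# Hypothetical alignment and two candidate topologies.
alignment = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "ACT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_1, alignment))  # 2
print(parsimony_score(tree_2, alignment))  # 4
```

The first topology needs fewer changes and would therefore be preferred. A full parsimony search repeats this scoring over many candidate trees.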
5.3. Cladistic methods based on Maximum Likelihood :

This method also uses each position in an alignment, evaluates all possible trees, and
calculates the likelihood for each tree using an explicit model of evolution (in contrast,
parsimony just looks for the fewest evolutionary changes). The likelihoods for each aligned
position are then multiplied to provide a likelihood for each tree. The tree with the maximum
likelihood is taken as the best estimate, i.e. the tree under which the observed sequences
are most probable. This is the slowest method of all but seems to give the best result and
the most information about the tree.
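The per-site likelihood of one fixed tree can be sketched with Felsenstein's pruning algorithm under the Jukes-Cantor (JC69) model. Both the model and the tree encoding are assumptions made for this illustration. Multiplying the site likelihoods is done here by summing their logarithms, which is equivalent and avoids numerical underflow. An internal node is written as a tuple of (child, branch length) pairs, and branch lengths are in expected substitutions per site.

```python
import math

BASES = "ACGT"


def jc69_prob(i: str, j: str, t: float) -> float:
    """Probability that base i is replaced by base j along a branch of length t (JC69)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def partial_likelihoods(node, states: dict) -> dict:
    """For each base x: probability of the observed tips below node, given x at node."""
    if isinstance(node, str):
        return {x: 1.0 if x == states[node] else 0.0 for x in BASES}
    result = {x: 1.0 for x in BASES}
    for child, t in node:
        child_l = partial_likelihoods(child, states)
        for x in BASES:
            result[x] *= sum(jc69_prob(x, y, t) * child_l[y] for y in BASES)
    return result


def log_likelihood(tree, alignment: dict) -> float:
    """Log-likelihood of the tree: sum over sites of log(site likelihood)."""
    length = len(next(iter(alignment.values())))
    total = 0.0
    for i in range(length):
        states = {name: seq[i] for name, seq in alignment.items()}
        # JC69 assumes equal base frequencies (0.25 each) at the root.
        site = sum(0.25 * v for v in partial_likelihoods(tree, states).values())
        total += math.log(site)
    return total


# Hypothetical three-taxon tree with invented branch lengths.
tree = (((("A", 0.1), ("B", 0.1)), 0.05), ("C", 0.2))
alignment = {"A": "ACGT", "B": "ACGA", "C": "ACCA"}
print(log_likelihood(tree, alignment))
```

A maximum likelihood search would evaluate this function for many topologies and branch lengths and keep the combination with the highest value.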

Theoretical problems with evolutionary changes between sequences

Transitions : substitutions between two purines or between two pyrimidines (A to G ; G to A ; C to T ; T to C).
Transversions : substitutions between a purine and a pyrimidine (A to C ; C to A ; A to T ; T to A ; G to C ; C to G ; G to T ; T to G).
Deletion : removal of one or more nucleotides.
Insertion : addition of one or more nucleotides.
Inversion : rotation by 180° of a double-stranded DNA segment comprising 2 or more
base pairs.
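The two substitution classes follow from the chemistry of the bases: A and G are purines, C and T are pyrimidines. A change within one class is a transition, and a change between classes is a transversion. A small sketch (function name chosen for illustration):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a: str, b: str) -> str:
    """Classify a DNA base change as 'none', 'transition', or 'transversion'."""
    a, b = a.upper(), b.upper()
    for base in (a, b):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if a == b:
        return "none"
    # Same chemical class (both purines or both pyrimidines) means a transition.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("A", "C"))  # transversion
```

Of the twelve possible base changes, four are transitions and eight are transversions.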

The next figure shows that many more mutations may have occurred than are visible at a
given time. Even the best evolutionary models cannot fully solve this problem.

Figure 3: Two homologous DNA sequences which descended from an ancestral sequence and
accumulated mutations since their divergence from each other. Note that although 12 mutations
have accumulated, differences can be detected at only three nucleotide sites. (from
Fundamentals of Molecular Evolution, Wen-Hsiung Li and Dan Graur, 1991).

7. Graphical explanation of basic phylogenetic terms

Monophyletic taxon : A group composed of a collection of organisms, including the most
recent common ancestor of all those organisms and all the descendants of that most recent
common ancestor. A monophyletic group is also called a clade.

Examples : Mammalia, Aves (birds), angiosperms, insects, etc.

Paraphyletic taxon : A group composed of a collection of organisms, including the most
recent common ancestor of all those organisms. Unlike a monophyletic group, a
paraphyletic taxon does not include all the descendants of the most recent common ancestor.

Examples : Traditionally defined Dinosauria, fish, gymnosperms, invertebrates, protists, etc.

Polyphyletic taxon : A group composed of a collection of organisms in which the most
recent common ancestor of all the included organisms is not included, usually because the
common ancestor lacks the characteristics of the group.
Polyphyletic taxa are considered "unnatural", and usually are reclassified once they are
discovered to be polyphyletic.

Examples : marine mammals, bipedal mammals, flying vertebrates, trees, algae, etc.
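On a rooted tree, a set of tips is monophyletic exactly when some node has that set, and nothing else, as its descendants. The sketch below tests this on a tree written as nested tuples. The example tree is a simplified illustration rather than a complete phylogeny, and the check does not distinguish paraphyletic from polyphyletic groups.

```python
def leaves(tree) -> frozenset:
    """Set of tip names below a node (a tip is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the tip set of every node in the tree, including the tips themselves."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, group) -> bool:
    """True if the group is exactly the tip set of some node."""
    return frozenset(group) in set(clades(tree))


# Simplified, illustrative vertebrate tree.
tree = ((("lizard", ("crocodile", "bird")), "mammal"), "frog")
print(is_monophyletic(tree, {"crocodile", "bird"}))    # True
print(is_monophyletic(tree, {"lizard", "crocodile"}))  # False
```

The second group leaves out "bird", a descendant of the common ancestor of lizard and crocodile in this tree, so it is not a clade.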

Phylogenetic software

1. MEGA
MEGA is useful software for constructing and visualizing phylogenies, and also for data
conversion. It can easily convert alignment files to other formats such as nexus, paup, phylip,
and fasta. The MEGA tree explorer makes it easy to edit trees, and subtrees
can also be selected and edited separately. Some tree image export options are also available.
The input formats are newick, phylip, mega, and nexus. A phylogenetic tree can also be
exported in newick format, but MEGA falls short in converting it into other formats such as
phylip, which is required in other analyses such as selection analysis.

2. Dendroscope
It is helpful in visualizing large trees and provides several options to export their graphics
using a command line. Several different views are also available, trees can be easily re-rooted,
and node labels and branches can be easily formatted. It can export trees in newick and nexus
format, although users have to register first to use this feature.

3. FigTree
It is designed primarily to visualize trees produced by the BEAST program. Tip labels and
node labels can be easily edited. It can easily export trees in nexus, newick, and JSON format,
with graphics export options such as emf, pdf, svg, png, etc.

4. Phylotree.js
It is a JavaScript-based library to visualize and annotate trees, and it offers some other
customizations. It is widely used in Datamonkey comparative analyses. A user can
upload trees using Phylotree.js, select test and reference branches, and
map any changes to their position on the corresponding structure. It is also good for
comparing trees with links between leaves, known as a tanglegram, where crossings can
represent evolutionary events. It also offers several export options and other built-in features.

5. ggtree
ggtree is an R package for phylogenetic tree visualization and annotation. Apart from
visualizing the tree, it also displays annotation data on it. Users can annotate trees with their
own data and can easily convert trees into a data frame, and many other features are available
(https://guangchuangyu.github.io/software/ggtree/).
