Act 10 EVOLAB 1


Multiple Sequence Alignment using Clustal X, T-Coffee and MUSCLE


GROUP 1
1. Open the Exercise 4 folder, which contains the sample data file primatesAA.fas.
This file contains 22 primate Trim5α amino acid sequences in FASTA format. Install
Clustal X and open the sequence file using File → Load Sequences.
2. Once the file is loaded, the sequences appear in the main window. The graphical display allows the
user to scroll along the unaligned protein sequences.
3. Select Do Complete Alignment from the Alignment menu. ClustalX performs the progressive
alignment (progress can be followed in the lower left corner) and creates an output guide
tree file and an output alignment file in the default Clustal format.
4. It is, however, possible to choose a different format under Output Format Options in the Alignment menu, for
example the Phylip format, which, in contrast to Clustal, can be read by many of the phylogeny packages. The file
with the “.dnd” extension contains the guide tree generated by ClustalX in order to carry out the multiple alignment. Note
that the guide tree is a very rough phylogenetic tree and should not be used to draw conclusions about the
evolutionary relationships of the taxa under investigation! ClustalX also allows the user to change the alignment
parameters (from Alignment Parameters in the Alignment menu). Unfortunately, there are no general rules for choosing
the best set of parameters, such as the gap-opening penalty or the gap-extension penalty. If an alignment shows, for example,
too many large gaps, the user can try to increase the gap-opening penalty and redo the alignment. ClustalX indicates
the degree of conservation at the bottom of the aligned sequences, which can be used to evaluate a given alignment.
5. By selecting Calculate Low-Scoring Segments and Show Low-Scoring Segments from the Quality
menu, it is also possible to visualize particularly unreliable parts of the alignment, which may be better
excluded from subsequent analyses. It should be kept in mind that alignment scores are based on
arbitrary assumptions about the details of the evolutionary process and are not always biologically
plausible.
Save the amino acid alignment using File → Save Sequences As and selecting the FASTA output
format. Save it under the file name primatesAAclustal.fas so as not to overwrite your original file.
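As a quick check on a saved FASTA alignment, the sketch below reads aligned sequences and reports, for each column, the fraction of rows that share the most common residue. This is a simplified stand-in for the conservation display, not the score ClustalX itself computes. The function names and the three toy sequences are hypothetical; to use it on the exercise data, read the text of primatesAAclustal.fas instead of the toy string.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def column_conservation(records):
    """For each column, return the share of rows holding the most common residue.

    Gap characters ("-") never count as the most common residue, but they
    stay in the denominator, so gappy columns score low.
    """
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; is this file aligned?")
    scores = []
    for column in zip(*(seq for _, seq in records)):
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / len(column))
    return scores


# Hypothetical toy alignment standing in for the saved file.
toy = ">seq1\nMAS-GIL\n>seq2\nMASGGIL\n>seq3\nMTS-GLL\n"
records = read_fasta(toy)
scores = column_conservation(records)
print(len(records), "sequences,", len(scores), "columns")
print([round(s, 2) for s in scores])
```

The length check doubles as a sanity test that the file really is an alignment: unaligned input such as the original primatesAA.fas would normally fail it.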
T-Coffee alignment

1. Perform the consistency-based alignment using the T-Coffee web server in this exercise (available at
http://www.tcoffee.org/). Select the regular submission form, which only requires us to upload the
“primatesAA.fas” file or paste the sequences into the submission window. An email notification with a link
to the results page will be sent if the email address is filled in, but this is not required.
2. The server will now run your alignment. Screenshot your results and the gaps. Gaps are the
positions, shown as dashes, that are inserted into a sequence so that matching residues of the protein or DNA sequences line up in the same column.
3. Screenshot your results file.
4. When the alignment procedure is completed, a new page will appear with links to the output
files. Save the alignment in FASTA format to your desktop (e.g. as “primatesAA_tcoffee.fas”).
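To describe the gaps in a downloaded alignment with numbers rather than screenshots alone, a minimal sketch such as the following counts, for each aligned sequence, how many separate gap runs it contains and how many gap characters in total. The function name and the toy sequences are hypothetical, and the sketch assumes gaps are written as "-".

```python
import re


def gap_summary(aligned_seq):
    """Summarise the gaps in one aligned sequence (gaps written as '-')."""
    runs = re.findall(r"-+", aligned_seq)  # each match is one contiguous gap
    return {
        "gap_runs": len(runs),
        "gap_chars": sum(len(r) for r in runs),
        "longest_run": max((len(r) for r in runs), default=0),
    }


# Hypothetical toy rows; replace with sequences read from the saved alignment.
toy_alignment = {
    "seqA": "MK--TLLA-G",
    "seqB": "MKPQTLLAAG",
    "seqC": "M---TLL--G",
}

for name, seq in toy_alignment.items():
    print(name, gap_summary(seq))
```

Counting runs separately from characters matters because alignment programs typically penalise opening a gap differently from extending one.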
Muscle alignment
1. To compute an iteratively refined alignment, we will use the standalone Muscle software, available for
Windows, Unix/Linux, and Mac OS X at http://www.drive5.com/. Muscle is a command-line program and thus
requires the use of a terminal (Unix/Linux/Mac OS X) or a DOS window on a Windows operating system.
2. Install muscle3.8.31_i86win32.exe from the Exercise 4 folder.
3. Create a new folder named Muscle. Copy the input file (“primatesAA.fas”) to the Muscle folder along with
the Muscle executable (you may rename the executable to muscle), then open a terminal/DOS window
and go to the Muscle folder (using “cd”).
To execute the program in a DOS window, the executable must be specified with its extension:
muscle.exe -in primatesAA.fas -out primatesAA_muscle.fas
The program completes the alignment procedure in a few seconds and writes the output file in FASTA
format (“primatesAA_muscle.fas”).
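The same command can also be launched from a script, which helps when the alignment has to be repeated. The sketch below assumes the Muscle 3.8 option syntax used in this exercise (-in and -out) and an executable renamed to muscle.exe that Python can locate. The helper names are hypothetical, and the output file name is written with an underscore so that it contains no space.

```python
import shutil
import subprocess


def build_muscle_command(infile, outfile, exe="muscle.exe"):
    """Return the argument list for a Muscle 3.8 style run."""
    return [exe, "-in", infile, "-out", outfile]


def run_muscle(infile, outfile, exe="muscle.exe"):
    """Run Muscle if the executable can be found; return True if it was run."""
    if shutil.which(exe) is None:
        # Executable not found by the system search; nothing is run.
        return False
    # check=True raises CalledProcessError if Muscle exits with an error.
    subprocess.run(build_muscle_command(infile, outfile, exe), check=True)
    return True


cmd = build_muscle_command("primatesAA.fas", "primatesAA_muscle.fas")
print(" ".join(cmd))
```

Passing the arguments as a list rather than one string avoids quoting problems if a file name ever contains a space.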
GUIDE QUESTIONS:
1. What differences have you noticed between the T-Coffee alignment and the Clustal
progressive alignment? Which of these two alignments gives a faster result?
Describe your experience of using each alignment method.

- Comparing the two, we found Clustal X more convenient to use because it runs locally and does not
require an internet connection to view the multiple sequence alignment. It also returned results faster
than T-Coffee. In terms of accuracy, however, T-Coffee is generally considered better because it uses a
consistency-based approach: it generates a library of pairwise alignments to guide the progressive multiple
sequence alignment. It can also combine multiple sequence alignments obtained previously, and the
latest versions can use structural information from PDB files (3D-Coffee). It has advanced features to
evaluate the quality of alignments and some capacity for identifying the occurrence of motifs (Mocca). It
produces alignments in the aln (Clustal) format by default, but can also produce PIR, MSF, and FASTA
formats. Both programs have their own pros and cons. In our experience, Clustal X is fast
and easy to use, but its alignments appeared less accurate than those
from T-Coffee.
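One way to put a number on how much two alignments of the same sequences differ is to compare which residue pairs each one places in the same column. The sketch below is an illustration under that assumption; the function names and toy alignments are hypothetical. It measures agreement between two alignments only and does not show which of them is more accurate.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    alignment maps sequence name -> aligned sequence (gaps as '-').
    Each residue is identified by (name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    position = {name: 0 for name in names}
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, position[name]))
                position[name] += 1
        pairs.update(combinations(present, 2))
    return pairs


def shared_pair_fraction(first, second):
    """Fraction of residue pairs in `first` that `second` also aligns."""
    pairs_first = aligned_pairs(first)
    pairs_second = aligned_pairs(second)
    if not pairs_first:
        return 0.0
    return len(pairs_first & pairs_second) / len(pairs_first)


# Hypothetical toy alignments of the same three sequences.
clustal_like = {"s1": "MK-TL", "s2": "MKATL", "s3": "M-ATL"}
tcoffee_like = {"s1": "MK-TL", "s2": "MKATL", "s3": "MA-TL"}

fraction = shared_pair_fraction(clustal_like, tcoffee_like)
print(round(fraction, 3))
```

A value of 1.0 would mean the second alignment reproduces every residue pairing of the first; lower values point to regions where the two programs disagree and where a closer look is warranted.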
