
Bioinformatics 2 Abd 3

The document outlines various file formats for next-generation sequencing, including SAM, BAM, and CRAM, highlighting their characteristics and uses in sequence alignment. It also explains PAM matrices, which are used to measure evolutionary changes between amino acids, and discusses alignment tools like LALIGN, PLALIGN, and T-COFFEE for multiple sequence alignments. The document emphasizes the importance of these formats and tools in analyzing biological sequences and inferring evolutionary relationships.

Uploaded by

AJITHKUMAR
Copyright
© All Rights Reserved

File Formats for Next-Generation Sequencing:

• SAM
• BAM
• CRAM

• SAM (Sequence Alignment/Map):
o It has two parts:
▪ A header (optional; header lines start with @)
▪ An alignment section
o It is a text file format.
o It contains the alignment information of various sequences that are mapped against a reference sequence.
o It also contains unmapped sequences.
o It is readable by humans.

• BAM:
o It has the same information as SAM files.
o It is a binary file format.
o It is not readable by humans.
o It is smaller and more efficient for software to work with, saving time and reducing the cost of computation and storage.

• CRAM:
o A newer format that is very similar to BAM.
o It holds the same information as SAM and is compressed.
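
Because SAM is plain text, its two parts can be told apart with very little code. The following is a minimal sketch, not a replacement for a real SAM/BAM library: it assumes tab-separated records with the 11 mandatory SAM fields, ignores optional tags, and uses made-up example reads (`read1`, `read2`, `chr1` are hypothetical). The names `SamRecord`, `parse_sam`, and `is_unmapped` are illustrative only.

```python
from collections import namedtuple

# The 11 mandatory SAM alignment fields, in order.
SamRecord = namedtuple(
    "SamRecord",
    "qname flag rname pos mapq cigar rnext pnext tlen seq qual",
)

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def parse_sam(lines):
    """Split SAM text into header lines (starting with @) and alignment records."""
    header, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            header.append(line)
            continue
        f = line.split("\t")
        # Optional tags (fields 12 and beyond) are ignored in this sketch.
        records.append(SamRecord(
            f[0], int(f[1]), f[2], int(f[3]), int(f[4]),
            f[5], f[6], int(f[7]), int(f[8]), f[9], f[10],
        ))
    return header, records


def is_unmapped(record):
    """True when the FLAG field marks the read as unmapped."""
    return bool(record.flag & UNMAPPED_FLAG)


# Hypothetical example data: two header lines, one mapped and one unmapped read.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGGCCAA\tIIIIIIII",
]
header, records = parse_sam(example)
for rec in records:
    print(rec.qname, "unmapped" if is_unmapped(rec) else f"mapped to {rec.rname}:{rec.pos}")
```

BAM and CRAM carry the same records in binary or compressed form, so they cannot be read line by line like this and need dedicated software.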

PAM: Point Accepted Mutation


• A PAM matrix is a matrix where each column and row represents
one of the twenty standard amino acids.
• PAM matrices are amino acid substitution matrices that encode
the expected evolutionary change at the amino acid level.
• Each PAM matrix is designed to compare two sequences which
are a specific number of PAM units apart.
• The PAM 120 score matrix is designed to compare sequences
that are 120 PAM units apart.
• The score it gives a pair of aligned residues reflects how likely
that substitution is after 120 PAM units of evolution (scores are
log-odds values rather than raw probabilities), and the score of a
pair of sequences is the sum over the aligned positions.
• Different PAM matrices correspond to different lengths of time in
the evolution of the protein sequence.
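
The relation between PAM matrices of different distances can be illustrated numerically. In the Dayhoff model, one PAM unit corresponds to about one accepted change per 100 residues, and a matrix for n PAM units is obtained by multiplying the PAM 1 probability matrix by itself n times; scores are then log-odds of those probabilities against background frequencies. The sketch below assumes a made-up two-letter alphabet with equal background frequencies (`PAM1_TOY` and `FREQS` are hypothetical values, not real amino acid data), purely to keep the arithmetic visible.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated multiplication."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result


def log_odds(m, freqs):
    """Score[i][j] = 10 * log10(P(i -> j) / background frequency of j)."""
    n = len(m)
    return [[10 * math.log10(m[i][j] / freqs[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: a two-letter alphabet where each letter has a
# 1% chance of changing per PAM unit, and both letters are equally common.
PAM1_TOY = [[0.99, 0.01],
            [0.01, 0.99]]
FREQS = [0.5, 0.5]

pam120 = mat_pow(PAM1_TOY, 120)      # probabilities after 120 PAM units
scores120 = log_odds(pam120, FREQS)  # log-odds scores for the same distance

for row in pam120:
    print([round(p, 4) for p in row])
for row in scores120:
    print([round(s, 2) for s in row])
```

In this toy model, conserved pairs get positive scores and substitutions get negative ones, and both shrink toward zero as the PAM distance grows, which is why a matrix should be matched to the expected divergence of the sequences being compared.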
• LALIGN is not used for global multiple sequence alignment.

• LALIGN/PLALIGN find internal duplications by calculating non-intersecting local alignments of protein or DNA sequences.

• LALIGN shows the alignments and similarity scores.

• PLALIGN presents a dot-plot like graph.
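
The idea of an internal duplication showing up in a dot-plot can be sketched with a simple word-match plot of a sequence against itself. This is not the algorithm LALIGN or PLALIGN use (they compute scored local alignments); it is a simplified stand-in that marks exact word matches only. The sequence, the word length of 3, and the function names `dot_plot` and `render` are all assumptions made for illustration.

```python
def dot_plot(seq_a, seq_b, word=3):
    """Return the set of (i, j) where the same `word`-long substring starts in both sequences."""
    hits = set()
    for i in range(len(seq_a) - word + 1):
        for j in range(len(seq_b) - word + 1):
            if seq_a[i:i + word] == seq_b[j:j + word]:
                hits.add((i, j))
    return hits


def render(seq_a, seq_b, hits):
    """Draw the hits as a text grid: '*' where a word match starts, '.' elsewhere."""
    rows = []
    for i in range(len(seq_a)):
        rows.append("".join("*" if (i, j) in hits else "."
                            for j in range(len(seq_b))))
    return "\n".join(rows)


# Hypothetical sequence: the same 10-residue block repeated twice.
seq = "MKTAYIAKQRMKTAYIAKQR"
hits = dot_plot(seq, seq)

# Matches above the main diagonal are the ones that reveal the repeat.
off_diagonal = sorted(h for h in hits if h[0] < h[1])

print(render(seq, seq, hits))
```

The main diagonal is the trivial match of the sequence with itself; the second, shorter diagonal offset by 10 positions is the internal duplication.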

• A multiple sequence alignment is generally the alignment of three or more biological sequences (protein or nucleic acid) of similar length.

• From the alignment, homology can be inferred and the evolutionary relationships between the sequences studied.

• The most widely used programs for global multiple sequence alignment are
from the CLUSTAL series of programs.

• PILEUP: creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments.

• T-COFFEE: compares all sequences two by two, producing a global alignment and a series of local alignments.
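
The "two by two" comparison step shared by progressive methods can be sketched as follows. This is a minimal illustration, not the actual implementation of CLUSTAL, PILEUP, or T-COFFEE: it scores every pair of sequences with a basic Needleman-Wunsch global alignment and picks the most similar pair, which is the kind of pair a progressive method would typically align first. The scoring values (match +1, mismatch -1, gap -2) and the three example sequences are assumptions chosen for the example.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Hypothetical input: three short sequences of similar length.
seqs = {
    "s1": "GATTACA",
    "s2": "GATTTCA",
    "s3": "GCATGCU",
}

# Compare all sequences two by two.
pair_scores = {
    (x, y): nw_score(seqs[x], seqs[y])
    for x, y in combinations(sorted(seqs), 2)
}

# The highest-scoring pair is the natural starting point for a progressive alignment.
closest_pair = max(pair_scores, key=pair_scores.get)

for pair, score in sorted(pair_scores.items()):
    print(pair, score)
print("align first:", closest_pair)
```

Real programs build a guide tree from all the pairwise scores and then merge sequences and partial alignments in tree order; only the first pairwise step is shown here.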
