???-101 ?????????? ??.2
Total Marks: 8
Solution:
a. Pairwise vs. Multiple Sequence Alignment
1. Definition:
• Pairwise Sequence Alignment compares two sequences (DNA, RNA, or proteins) to find regions
of similarity.
• Multiple Sequence Alignment (MSA) aligns three or more sequences simultaneously to identify
conserved regions.
2. Purpose:
• Pairwise Alignment is primarily used to compare two sequences for similarity or evolutionary
relationships.
• MSA helps analyze conserved domains, motifs, and evolutionary trends among a group of
sequences.
3. Complexity:
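• Pairwise alignment is computationally simple and fast: dynamic-programming algorithms such as
Needleman-Wunsch (global) and Smith-Waterman (local) run in time proportional to the product of
the two sequence lengths.
• MSA is computationally intensive because exact alignment becomes impractical as the number of
sequences grows, so tools like ClustalW or MUSCLE rely on heuristic (progressive or iterative)
methods.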
4. Output Information:
• Pairwise Alignment gives information on similarity, identity, and possible homology between just
two sequences.
• MSA shows alignment across many sequences, revealing conserved residues and structural or
functional motifs.
5. Use Cases:
• Pairwise is used in tasks like checking a new gene sequence against a known sequence.
• MSA is used for building phylogenetic trees, identifying conserved sequence patterns, and
understanding protein families.
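The pairwise algorithms named under "Complexity" can be sketched in a few lines. The block below is a minimal illustration, not a reference implementation: the function names and the scoring values (match=1, mismatch=-1, gap=-2) are assumptions chosen for the example, and only the alignment score is computed, not the traceback.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score; fills a (len(a)+1) x (len(b)+1) table."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Local alignment score; cells never drop below 0, best cell wins."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best
```

With these assumed scores, two identical copies of "GATTACA" give a global score of 7, and the local version finds the same 7-residue match even when it is embedded in unrelated flanking sequence. Both functions do work proportional to the product of the two sequence lengths.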
b. Identity vs. Similarity
2. Measurement:
• Identity is expressed as a percentage of exact matches over the alignment length.
• Similarity counts both exact matches and substitutions between functionally similar (but not
identical) residues, as scored by substitution matrices like PAM or BLOSUM.
3. Implication:
• High identity indicates a close evolutionary relationship or a conserved function.
• High similarity can still suggest a functional or structural relationship even if identity is low.
4. Tools Used:
• Identity can be calculated using tools like BLAST or EMBOSS pairwise alignment.
• Similarity considers evolutionary models and is assessed through scoring matrices in tools like
Clustal, BLAST, etc.
5. Use in Bioinformatics:
• Identity is often used in threshold-based sequence filtering (e.g., 90% identity cutoffs).
• Similarity helps in detecting distant homologs and understanding structural/functional
relationships.
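The difference between the two measures can be made concrete with a small calculation over an already-aligned pair. This is a sketch under stated assumptions: the function name is hypothetical, and the `CONSERVATIVE_GROUPS` table is a deliberately simplified stand-in for a real substitution matrix such as BLOSUM62, so the similarity values it produces are illustrative only. Both percentages are taken over the full alignment length, gaps included.

```python
# Assumption: a coarse grouping of amino acids with similar properties,
# used here instead of a real PAM/BLOSUM matrix.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ"),
]


def percent_identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (identity %, similarity %) for two aligned, equal-length strings."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gap columns count toward length but never match
        if x == y:
            identical += 1
            similar += 1  # an exact match is also a similar position
        elif any(x in group and y in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length
```

For the aligned pair "MKLV-DE" and "MRIVADQ", three of seven columns are identical (about 42.9%), while five of seven are identical or conservative (about 71.4%), which shows how similarity can stay high when identity is modest.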
c. BLAST vs. FASTA
3. Algorithm Design:
• BLAST breaks the query into short words (k-mers), finds matching words in the database, then
extends those hits into longer alignments.
• FASTA searches for exact short matches (k-tuples), joins matches that lie on the same diagonal,
and then scores the resulting alignments.
4. Output Format:
• BLAST provides a more interactive and detailed output, including graphical displays, scores,
E-values, and alignment segments.
• FASTA gives a textual output showing alignments, scores, and statistical significance.
5. Use in Bioinformatics:
• BLAST is more commonly used today due to its speed and NCBI integration. It’s widely used for
gene annotation, database searching, and homology detection.
• FASTA is still used for teaching, smaller searches, and specific tasks where its sensitivity is
beneficial.
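The word-based strategy described under "Algorithm Design" can be illustrated with a toy seed-and-extend search. This is a heavily simplified sketch, not how BLAST or FASTA is actually implemented: it uses exact word matches only, ungapped extension, and an assumed drop-off rule (`x_drop`), and all function names and scoring values are hypothetical.

```python
from collections import defaultdict


def index_words(sequence, k):
    """Map every k-letter word in the sequence to its start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        index[sequence[start:start + k]].append(start)
    return index


def extend_ungapped(query, subject, q_pos, s_pos, k, match=1, mismatch=-1, x_drop=2):
    """Extend an exact word hit in both directions without gaps."""
    best = score = k * match
    q_start, q_end = q_pos, q_pos + k

    # Extend to the right of the seed.
    i, j = q_pos + k, s_pos + k
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        i += 1
        j += 1
        if score > best:
            best, q_end = score, i
        elif best - score >= x_drop:
            break  # score fell too far below the best seen so far

    # Extend to the left, continuing from the best score reached.
    score = best
    i, j = q_pos - 1, s_pos - 1
    while i >= 0 and j >= 0:
        score += match if query[i] == subject[j] else mismatch
        if score > best:
            best, q_start = score, i
        elif best - score >= x_drop:
            break
        i -= 1
        j -= 1

    s_start = s_pos - (q_pos - q_start)
    return best, q_start, q_end, s_start


def seed_and_extend(query, subject, k=4):
    """Return distinct hits as (score, q_start, q_end, s_start), best first."""
    subject_index = index_words(subject, k)
    hits = set()  # seeds on the same diagonal often extend to the same hit
    for q_pos in range(len(query) - k + 1):
        for s_pos in subject_index.get(query[q_pos:q_pos + k], []):
            hits.add(extend_ungapped(query, subject, q_pos, s_pos, k))
    return sorted(hits, reverse=True)
```

Searching the query "ACGTTGCA" against the subject "TTACGTTGCATT" with k=4 yields a single hit covering the whole query at subject position 2. Several seeds fall on the same diagonal and collapse into that one hit, which loosely mirrors the diagonal-joining step described for FASTA.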
d. Database vs. Data Warehouse
2. Data Nature:
• Database contains real-time, operational data such as daily transactions, user activities, etc.
• Data Warehouse contains historical, integrated, and often large-scale data optimized for query
and analysis.
3. Usage Purpose:
• Database is used for everyday operations, e.g., booking systems, hospital management, bank
transactions.
• Data Warehouse is used for business intelligence, trend analysis, and decision-making processes.
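The contrast can be sketched with Python's built-in sqlite3 module. This is only an analogy under stated assumptions: the table names, columns, and sample rows are invented for the example, and a real data warehouse is normally a separate system loaded periodically from operational databases rather than a second table in the same file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Operational (database-style) table: one row per individual transaction,
# written as events happen.
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, day TEXT, amount REAL)"
)
sample_rows = [  # hypothetical sample data
    ("A", "2024-01-01", 100.0),
    ("A", "2024-01-01", -40.0),
    ("B", "2024-01-01", 250.0),
    ("A", "2024-01-02", 15.0),
]
conn.executemany(
    "INSERT INTO transactions (account, day, amount) VALUES (?, ?, ?)",
    sample_rows,
)
conn.commit()

# Warehouse-style table: historical data summarized for analysis,
# built in bulk from the operational rows.
conn.execute(
    "CREATE TABLE daily_summary AS "
    "SELECT day, account, COUNT(*) AS n_transactions, SUM(amount) AS net_amount "
    "FROM transactions GROUP BY day, account"
)


def net_by_day(connection):
    """Analytical query: net amount per day across all accounts."""
    return connection.execute(
        "SELECT day, SUM(net_amount) FROM daily_summary GROUP BY day ORDER BY day"
    ).fetchall()
```

The operational table is optimized for many small writes and lookups, whereas the summary table is read-mostly and answers trend questions such as the daily net amount without touching individual transactions.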
Regards:
VU Expert Zoologists