CR Micro
1. ~Describe the steps for BLAST and classify its different types.
The main steps of BLAST are:
Step 1: Create a lookup table, or list of words, from the query sequence. This step is also called seeding. BLAST takes the query sequence and breaks it into short segments called words.
Step 2: Search the database for exact matches to the list of words compiled in Step 1.
Step 3: Score the similarity of the matching words using a given substitution matrix, and extend high-scoring word matches in both directions to form longer alignments.
Step 4: Evaluate the significance of the extended hits from Step 3.
There are five types of BLAST, differentiated by the type of sequence (DNA or protein) of the query and database sequences:
- BLASTN: nucleotide query against a nucleotide database.
- BLASTP: protein query against a protein database.
- BLASTX: translated nucleotide query against a protein database.
- TBLASTN: protein query against a translated nucleotide database.
- TBLASTX: translated nucleotide query against a translated nucleotide database.
2. Estimate the characteristics and the applications of BLAST.
Several key features of BLAST make it a widely used tool in bioinformatics.
- BLAST is fast and efficient, making it possible to handle large databases of sequences.
- It is a flexible and versatile tool, as it can be used to search for similarities in both nucleotide and protein sequences.
- It is highly sensitive, which allows the identification of even small similarities between sequences.
- It aims to identify regions of local similarity between the query sequence and the database sequence, rather than attempting to align the entire sequences.
- It has a user-friendly interface that makes it easy to input query sequences and interpret the results.
Applications of BLAST are:
- BLAST can be used to identify unknown sequences by comparing them with known sequences in a database, which helps in predicting the functions of proteins or genes.
- BLAST can also be used in phylogenetic analysis, which is important for understanding the evolutionary relationships between different species.
- BLAST can also be used to identify functionally conserved domains within proteins, which is important for predicting the functions of proteins.
3. ~Articulate the different types of phylogenetic tree.
- Rooted tree: Makes an inference about the most recent common ancestor of the leaves or branches of the tree.
- Unrooted tree: Illustrates the relationships among the leaves or branches without making any assumption about the most recent common ancestor.
- Bifurcating tree: Phylogenetic trees in which each node gives rise to exactly two branches are referred to as bifurcating trees. They can be further divided into rooted and unrooted bifurcating trees.
- Multifurcating tree: As the name suggests, a single node in a multifurcating tree can give rise to more than two branches. These trees can likewise be divided into rooted and unrooted multifurcating trees.
3. ~Differentiate between primary, secondary and composite databases with examples of each.
Primary databases act as repositories: they store data and make it available to the public. Examples: GenBank, DDBJ.
Secondary databases make use of publicly available sequence data in primary databases to add layers of information to DNA or protein sequence data. Example: UniProt Knowledgebase.
Composite databases keep records of specific datasets meant for specific purposes and applications. Example: OMIM.
4. ~Infer Global alignment.
Global alignment is a method of comparing two sequences that aligns them over their entire length by maximizing the overall similarity. It is best suited to sequences of similar length. Global alignment is based on the Needleman-Wunsch algorithm. The sequences to be aligned are assumed to be generally similar over their entire length. Alignment is carried out from the beginning to the end of both sequences to find the best possible alignment across their entire length. The two sequences are treated as potentially equivalent.
5. ~Describe the primary purpose of the NCBI database in the field of bioinformatics.
The NCBI database, or the National Center for Biotechnology Information database, serves as a central repository for a wide range of biological and genetic information. Its primary purpose is to provide researchers, scientists, and the public with access to data related to genetics, genomics, and other biological sciences. It hosts DNA and protein sequences, genomic data, literature references, and tools for sequence analysis. Researchers use NCBI to study genetic variations, conduct comparative genomics, and access valuable information for various biological research purposes.
6. ~Infer Local alignment and describe its application.
In local alignment, instead of attempting to align the entire length of the sequences, only the regions with the highest density of matches are aligned. This is useful for identifying short, conserved regions in protein or nucleotide sequences. Local alignment programs are based on the Smith-Waterman algorithm. Local alignment does not assume that the two sequences in question are similar over their entire length; rather, it finds only the local regions with the highest level of similarity between the two sequences and aligns these regions without regard for the alignment of the rest of the sequences. There are three primary methods for producing local alignments: the dot matrix method, dynamic programming, and the word (k-tuple) method.
Goal: See whether a substring in one sequence aligns well with a substring in the other.
Application:
1. Searching for local similarities in a large sequence (for example, a newly sequenced genome).
2. Searching for conserved domains or motifs.
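As a rough illustration of the dynamic programming approach to local alignment, the Python sketch below follows the Smith-Waterman idea of finding the best-matching pair of substrings. The function name smith_waterman and the scoring values (match = 2, mismatch = -1, gap = -2) are assumptions chosen for this example and do not come from these notes; real tools score with a substitution matrix and usually apply more elaborate gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    The scoring values are illustrative assumptions, not standard defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Score matrix. Row 0 and column 0 stay at 0, so an alignment
    # is free to start at any position in either sequence.
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # The floor of 0 is what makes the alignment local:
            # a poorly scoring prefix is dropped instead of carried along.
            H[i][j] = max(0, diag, up, left)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a cell with score 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = H[i][j]
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

With these assumed scores, smith_waterman("TTTTACGTACGTTTT", "GGGACGTACGGGG") returns a score of 14 and aligns only the shared region ACGTACG, ignoring the unrelated flanking bases. A global (Needleman-Wunsch) alignment would instead be forced to align the flanks as well.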