Bioinformatics Lab Assignment 2

The document outlines an assignment for a Bioinformatics Lab course focused on Retinol Binding Protein 4 and HIV pol, guiding students through the use of NCBI's website for gene research. It includes specific tasks such as searching for proteins, understanding gene functions, and exploring protein domains and mutations. The assignment emphasizes the importance of precise search terms and provides insights into the biological significance of RBP4 in vitamin A transport and its implications for human health.

Uploaded by

salmanalishba980


NAME: Alishba Salman

CLASS: 3rd year (1st semester)

DEPARTMENT: Biotechnology
COURSE TITLE: BIOINFORMATICS LAB
COURSE CODE: BTH-3051
COURSE INCHARGE: Miss Ayesha Aman
ASSIGNMENT OF BIOINFORMATICS
Lab
You will be looking at the Retinol Binding Protein 4 and HIV pol to learn to navigate
through NCBI’s website and the different linked databases. After performing this assignment
with one gene, you will perform a similar assignment with your gene of interest.

Go to NCBI’s website (http://www.ncbi.nlm.nih.gov/)

When doing database searches, your search terms need to be as specific as possible in order to
eliminate large returns of data, some of it useless. A record is an individual file or NCBI “hit”
obtained from a search.
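The effect of search-term specificity can also be seen when a query is built programmatically. The sketch below is an illustration under stated assumptions, not part of the assignment: it uses the `esearch` endpoint of NCBI's E-utilities (the programmatic counterpart of the web search box) and a hypothetical helper, `esearch_url`, to show how a quoted phrase and an unquoted set of words give different `term` values. It only builds the URLs and does not contact NCBI.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base URL of the E-utilities esearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="protein", phrase=False):
    """Build an esearch URL (hypothetical helper for illustration).

    With phrase=True the term is wrapped in double quotes, which asks
    Entrez to treat it as a phrase rather than as separate words.
    """
    query = f'"{term}"' if phrase else term
    return ESEARCH + "?" + urlencode({"db": db, "term": query, "retmode": "json"})


loose = esearch_url("retinol binding protein")
exact = esearch_url("retinol binding protein", phrase=True)

for url in (loose, exact):
    # Decode the query string again to show the term that would be sent.
    print(parse_qs(urlparse(url).query)["term"][0])
```

The unquoted form generally returns more records because each word may be matched independently, which is why the worksheet asks for the two counts to be compared.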

1. Start at the main NCBI page. Use All Databases on the NCBI home page. To retrieve a large
amount of returns, use “retinol binding protein” as your search term. How many hits did you
find in the Entrez Page?

1. How many of these hits fell into the protein category?

5966 hits fell into the protein category.

1. Would you get a different answer without using the quotation marks around your search
term? Why?
2. Now try “retinol binding protein 4” on the All Databases search. How many proteins do you
find in the Entrez Page?

3. What about “rpb4” on the All Databases search?

The term “rpb4” in the All Databases search likely refers to the gene encoding the fourth-largest
subunit of RNA polymerase II (Pol II), a crucial enzyme responsible for synthesizing messenger
RNA in eukaryotic cells. In Saccharomyces cerevisiae (baker’s yeast), this gene is known as
RPB4, and in humans, it is encoded by the POLR2D gene.

3. How many proteins do you find in the Entrez Page?

There are 1435 proteins in the Entrez Page.

3. To make it even more specific, let’s add “rbp4 homo sapiens.”

3. How many proteins do you find using the All Databases Search?
42 proteins were found in the All Databases Search.
4. In question 3 you are actually looking for the full length rbp4 for Homo sapiens with
accession number NP_006735.

5. What about searching with this tool do you think still makes you get other hits when you type in
“rbp4 homo sapiens”?
6. What is the full name of this gene’s protein product?
7. Give a brief description of what the protein does. If you quote a record, give me the link you
used?

Retinol binding protein 4 (RBP4) is a specialized transport protein. It belongs to the lipocalin
family and is the specific carrier protein for retinol (vitamin A alcohol) in the bloodstream.
Synthesized mainly in the liver and adipose tissue, RBP4 binds retinol in a 1:1 ratio, forming a
complex that solubilizes the hydrophobic vitamin, protects it from oxidative damage, and
facilitates its transport to various target tissues, including the retina, skin, lungs, and
gonads. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the
RBP-retinol complex interacts with transthyretin, which prevents its loss by filtration through
the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein
post-translationally and results in defective delivery and supply to the epidermal cells.

https://en.wikipedia.org/wiki/Retinol_binding_protein_4

8. How many amino acids are in this protein?

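The number of amino acids depends on which retinol-binding protein (RBP) is meant and on which
form of it is counted.

1. Human Serum Retinol-Binding Protein (RBP4): The RefSeq record NP_006735 describes the
full-length precursor of 201 amino acid residues, including the signal peptide. The figure of
182 residues quoted for the serum protein refers to the mature form after signal peptide
removal (about 182 to 183 residues, depending on the source).
2. Human Cellular Retinol-Binding Protein 1 (CRBP1): This protein comprises 135
amino acid residues.

These differences reflect the distinct structures and functions of each RBP type.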

9. Are there functional protein domains described for this protein? You will find this in the
conserved domain database. This is either in RefSeq or can be linked from Domains through
the record. List them.

Yes, the RBP4 (Retinol Binding Protein 4) protein contains functional domains that are well-
characterized in the Conserved Domain Database (CDD) and other resources like UniProt and
InterPro.

Functional Domains in RBP4

1. Lipocalin_RBP_like (CDD: cd00743)

o Location: Amino acids 22–192
o Description: This domain is characteristic of the lipocalin family, which includes
proteins involved in the transport of small hydrophobic molecules. In RBP4, this
domain is crucial for binding retinol (vitamin A) and its transport in the
bloodstream. NCBIlipidmaps.org+1NCBI+1
2. Lipocalin (Pfam: PF00061)
o Location: Amino acids 37–175
o Description: This domain is also associated with the lipocalin family and is
involved in binding and transporting small hydrophobic molecules, such as
retinol. NCBI
3. Retinol Binding Protein/Purpurin (InterPro: IPR002449)
o Description: This domain is specific to retinol-binding proteins and purpurin,
highlighting the protein's role in binding and transporting retinol. lipidmaps.org

Accessing Domain Information

• NCBI Conserved Domain Database (CDD): Provides detailed information on
conserved domains within protein sequences.
• UniProt: Offers comprehensive protein sequence and functional information, including
domain annotations.
• InterPro: Integrates multiple protein signature databases to provide functional
annotations.

10. How many amino acids are in the sig_peptide____________? What is the sig peptide? How
many are in the mat_peptide__________? What is the mat peptide?

Signal peptide (sig_peptide)

The signal peptide is a short amino acid sequence at the beginning (N-terminus) of a newly
synthesized protein. It typically consists of 15–30 amino acids.

• Its function is to direct the protein to the endoplasmic reticulum (ER) for secretion or
membrane insertion in eukaryotic cells.
• After reaching the ER, the signal peptide is usually cleaved off by a signal peptidase.
• It is not part of the mature protein.

Mature peptide (mat_peptide)

The mature peptide is the final, functional form of the protein after all processing steps (like
signal peptide removal, cleavage, and folding).

• It is the protein that performs the biological function.
• It typically lacks the signal peptide.

Number of amino acids in the sig_peptide

• Signal peptides typically range from 15 to 30 amino acids.

The exact number for RBP4 is given by the sig_peptide feature in the FEATURES table of the
protein record (e.g., NP_006735 at NCBI, or the corresponding UniProt entry).
Number of amino acids in the mat_peptide

• This is calculated by subtracting the number of amino acids in the signal peptide
from the full precursor protein length (if there are no other propeptides or cleavages).
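The subtraction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the feature locations are written in the GenBank `start..end` style (1-based, inclusive). The helper names and the coordinates `1..18` and `19..201` are illustrative assumptions and should be checked against the actual sig_peptide and mat_peptide entries of the record.

```python
def feature_length(location):
    """Length of a simple 'start..end' location (1-based, inclusive)."""
    start, end = (int(part) for part in location.split(".."))
    if end < start:
        raise ValueError(f"end before start in {location!r}")
    return end - start + 1


def mature_length(precursor_len, signal_len):
    """Mature length when the signal peptide is the only part removed."""
    return precursor_len - signal_len


# Assumed, illustrative coordinates (verify against the real record).
sig_len = feature_length("1..18")
mat_len = feature_length("19..201")

# The two features should add up to the precursor length.
precursor_len = sig_len + mat_len
print(sig_len, mat_len, precursor_len)
print(mature_length(precursor_len, sig_len) == mat_len)
```

Reading both features from the record and checking that they sum to the precursor length is a quick way to catch an off-by-one error, since both coordinates of each range are counted.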

Example: preproinsulin

• Signal peptide = 24 amino acids

• Mature insulin = 51 amino acids (after proinsulin processing, which also removes the
C-peptide, so here the mature length is not simply the precursor length minus the signal
peptide)

11. What does CDS stand for and how many nucleotides are in the CDS for this gene?

CDS stands for coding sequence (CoDing Sequence). It refers to the portion of a gene's DNA or
RNA that codes for protein: specifically, the region that is translated into amino acids. The
number of nucleotides in the CDS for a particular gene is read from the CDS feature of its
nucleotide record in a database such as NCBI or Ensembl, or from a genome browser. It can be
obtained from:

• The gene sequence: count the nucleotides between the first and last position of the CDS.
• The gene name: look up the RefSeq mRNA record and read the CDS coordinates.
• A FASTA or GenBank file: extract the CDS feature and take its length.

As a cross-check, a complete CDS has three nucleotides per amino acid plus a three-nucleotide
stop codon.
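The relationship between protein length and CDS length can be checked with a short calculation. The sketch below assumes a complete CDS whose coordinates include the stop codon, as in GenBank-style records. The function names are hypothetical, and the 201-residue input is the precursor length assumed for RBP4, to be confirmed against the record.

```python
def cds_length_nt(protein_len_aa, include_stop=True):
    """Nucleotides in a complete CDS encoding a protein of the given length."""
    codons = protein_len_aa + (1 if include_stop else 0)
    return codons * 3


def protein_length_from_cds(start, end):
    """Amino acids encoded by a CDS at start..end (1-based, inclusive).

    Assumes the range includes the stop codon, so one codon is subtracted.
    """
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_nt // 3 - 1


# Assumed precursor length; verify against the RefSeq protein record.
print(cds_length_nt(201))
print(protein_length_from_cds(1, cds_length_nt(201)))
```

If the CDS length read from the record is not a multiple of three, the feature is either partial or the coordinates were copied incorrectly.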

12. Can you find any PubMed references for this gene? Give me the link(s) of 3 of these.
13. What does it mean when the record states that it has been “curated by NCBI staff?”

When a record states that it has been “curated by NCBI staff,” it means that the information in
that record has been reviewed, verified, and possibly edited by experts at the National Center
for Biotechnology Information (NCBI). This manual curation ensures greater accuracy,
consistency, and reliability of the data, compared to automatically generated or unreviewed
records.

Specifically, curation by NCBI staff may involve:

• Checking for correct gene/protein annotations
• Resolving discrepancies in the data
• Adding cross-references to related databases
• Improving functional descriptions or metadata
• Making sure the data follows NCBI’s quality standards

14. Read the section on RefSeq. http://www.ncbi.nlm.nih.gov/RefSeq/RSfaq.html Based on
your earlier searching, explain in your own words why RefSeq is useful in bioinformatics.

The NCBI Reference Sequence (RefSeq) database is a curated, non-redundant collection of
genomic, transcript, and protein sequences. It serves as a foundational resource in
bioinformatics by providing standardized and well-annotated sequences that are essential for
various analyses, including gene identification, mutation detection, and functional
annotation.

For instance, in the study of retinol-binding proteins, which are crucial for vitamin A
transport and metabolism, RefSeq offers a comprehensive and accurate reference for the
RBP4 gene and its associated protein. This enables researchers to confidently interpret
sequence variations, understand gene function, and explore potential implications in health
and disease.

RefSeq's utility extends beyond individual gene studies. Its integration with other NCBI
resources facilitates comparative genomics, evolutionary studies, and the development of
diagnostic tools. By providing a stable and consistent coordinate system, RefSeq supports the
accurate reporting of clinical variations and enhances the reproducibility of bioinformatics
analyses.

In summary, RefSeq is indispensable in bioinformatics for its role in providing high-quality,
curated sequence data that underpin a wide range of genomic analyses and applications.
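One practical consequence of RefSeq's standardization is that the accession prefix already tells you what kind of molecule a record describes. The sketch below is an illustration rather than an NCBI tool: `describe_accession` and the short descriptions in the table are assumptions made for this example, and only a subset of the RefSeq prefixes is listed.

```python
import re

# Subset of RefSeq accession prefixes with short, informal descriptions.
REFSEQ_PREFIXES = {
    "NC": "complete genomic molecule (e.g. a chromosome)",
    "NG": "genomic region",
    "NT": "genomic contig",
    "NW": "genomic contig (whole-genome shotgun)",
    "NM": "mRNA",
    "NR": "non-coding RNA",
    "NP": "protein",
    "XM": "predicted mRNA model",
    "XR": "predicted non-coding RNA model",
    "XP": "predicted protein model",
}


def describe_accession(accession):
    """Return (record type, version or None), or None if the pattern fails."""
    match = re.fullmatch(r"([A-Z]{2})_(\d+)(?:\.(\d+))?", accession)
    if match is None:
        return None
    prefix, _number, version = match.groups()
    kind = REFSEQ_PREFIXES.get(prefix, "other RefSeq record type")
    return kind, (int(version) if version else None)


for acc in ("NP_006735", "NC_000010.11", "P02753"):
    print(acc, describe_accession(acc))
```

The last accession in the loop is a UniProt-style identifier, included to show that non-RefSeq accessions do not match the pattern.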
15. Click on the link associated with Conserved Domains under Entrez Gene (it is in the list to
the right). What is a conserved domain of this protein called? What is its function?
16. What chromosome is this gene on? Which chromosome arm is it on? How many nucleotides
are listed in this entire chromosome? You will find this information in Entrez GENE
database or in the Map viewer links to the right on the page.

The RBP4 gene, which encodes retinol binding protein 4, is located on chromosome 10 at
cytoband 10q23.33. Its precise genomic coordinates are 93591694 to 93601744 on the reverse
strand of chromosome 10, according to the GRCh38.p14 human genome assembly.

This gene is situated on the long arm (q arm) of chromosome 10, specifically in the 10q23.33
region. The "q" designation indicates the long arm of the chromosome, distinguishing it from the
short arm, labeled "p".
1. Entire Chromosome 10

• In the human genome (GRCh38), chromosome 10 contains:

o 133,797,422 base pairs (nucleotides)

2. RBP4 Gene Length

• The RBP4 gene spans:

o From position 93,591,694 to 93,601,744 on chromosome 10 (GRCh38)
o That's a total of 10,051 nucleotides (genomic span, counting both end positions)
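The span quoted above follows from the coordinates by inclusive counting. A minimal sketch of that arithmetic, using the figures given in this answer (the function name is a hypothetical helper):

```python
def genomic_span(start, end):
    """Number of nucleotides from start to end, counting both positions."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates and chromosome length as quoted in the answer (GRCh38).
rbp4_span = genomic_span(93_591_694, 93_601_744)
chr10_length = 133_797_422

print(rbp4_span)
# Share of chromosome 10 occupied by the gene, as a percentage.
print(f"{100 * rbp4_span / chr10_length:.4f}%")
```

The `+ 1` matters because genomic coordinates of this kind are 1-based and inclusive; a plain subtraction would give 10,050.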

17. Click on Map viewer. What is the accession number of the genomic contig for
RBP4_____________? How many nucleotides does it contain___________? What is a
genomic contig?
• A genomic contig (short for contiguous sequence) is a continuous stretch of DNA
sequence that has been assembled from shorter sequence reads during genome
sequencing.
18. Click on the annotation links labeled sv, pr, dl, ev, mm, hm, sts in Map viewer in the pink
box. What is each of these links abbreviations for?

In NCBI's Map Viewer, the annotation links labeled sv, pr, dl, ev, mm, hm, and sts correspond
to specific tools and resources that provide detailed information about genomic regions. Here's
what each abbreviation stands for:

Annotation Links in Map Viewer

1. sv – Sequence Viewer
o Displays a graphical representation of the nucleotide sequence for the selected
region, allowing users to view gene structures, exons, and other genomic features.
2. pr – Protein
o Links to the protein sequence(s) associated with the gene or genomic region of
interest, providing insights into the translated product.
3. dl – Download
o Offers options to download sequence data from the specified chromosomal region
in various formats for further analysis.
4. ev – Evidence Viewer
o Shows alignments of RefSeq and GenBank transcript sequences (such as mRNAs
and ESTs) to the genomic contig, highlighting supporting evidence for gene
models.
5. mm – Model Maker
o Provides tools to construct or refine gene models based on available transcript
data and genomic sequence, aiding in the prediction of gene structure.
6. hm – HomoloGene
o Links to the HomoloGene database, which identifies homologous genes across
different species, facilitating comparative genomics studies.
7. sts – Sequence Tagged Site
o Directs to information about Sequence Tagged Sites, which are short, unique
DNA sequences used as landmarks in genetic mapping and marker-assisted
studies.

These tools collectively enhance the functionality of Map Viewer by providing comprehensive
resources for viewing, analyzing, and interpreting genomic data.
19. Does this gene contain introns? If so, how many and where are the splice junctions? Which
link did you use to discover this? There are several, including looking for the gene name in
the genomic contig sequence, or looking in the whole chromosome sequence.

Yes, the RBP4 gene (retinol binding protein 4) contains introns. According to the NCBI Gene
database, the RBP4 gene comprises 8 exons. Introns lie between consecutive exons (a transcript
with n exons has n − 1 introns), and the gene is transcribed into precursor mRNA (pre-mRNA) that
includes both exons and introns.

How to Determine the Number of Introns and Splice Junctions

To identify the exact number and locations of introns, as well as the splice junctions, you can
utilize the NCBI Genome Data Viewer or the NCBI Gene database. Here's how:

1. NCBI Gene Database:

o Visit the RBP4 Gene page on NCBI.
o Under the "Genomic regions, transcripts, and products" section, click on the
"See RBP4 in Genome Data Viewer" link.
o This will open a graphical representation of the gene's structure, displaying the
exons and introns along the chromosome.
2. NCBI Genome Data Viewer:
o In the Genome Data Viewer, you can zoom in on the specific region of
chromosome 10 where RBP4 is located (10q23.33).
o The viewer will show the gene's exons and introns, along with the splice
junctions, which are typically located at the boundaries between exons and
introns.

By examining the graphical representation, you can determine the number of introns and the
precise locations of the splice junctions.
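Once exon coordinates have been read from the viewer, the introns and splice junctions follow mechanically. The sketch below assumes exons are given as (start, end) pairs in 1-based inclusive coordinates on one strand. The function name and the three example exons are hypothetical and are not RBP4 coordinates.

```python
def introns_from_exons(exons):
    """Return intron (start, end) intervals lying between consecutive exons."""
    ordered = sorted(exons)
    introns = []
    for (_s1, e1), (s2, _e2) in zip(ordered, ordered[1:]):
        if s2 <= e1:
            raise ValueError("exons overlap")
        if s2 - e1 > 1:  # directly adjacent exons leave no intron between them
            introns.append((e1 + 1, s2 - 1))
    return introns


# Hypothetical exon coordinates, for illustration only.
example_exons = [(100, 250), (400, 520), (900, 1000)]
introns = introns_from_exons(example_exons)

for start, end in introns:
    # The first and last intron bases mark the two splice junctions.
    print(f"intron {start}..{end}: donor site at {start}, acceptor site at {end}")
```

For a gene on the reverse strand the same intervals apply, but the donor and acceptor ends are swapped relative to the chromosome coordinates.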
19. Click on the OMIM link. What is a biological consequence of a mutation in this protein for
humans?

A mutation in the RBP4 (Retinol Binding Protein 4) gene can have several biological
consequences in humans, primarily due to its essential role in transporting vitamin A (retinol)
from the liver to peripheral tissues.

Biological Function of RBP4

RBP4 binds retinol (vitamin A) in the blood and delivers it to cells via interaction with a receptor
called STRA6. Vitamin A is crucial for:

• Vision (especially night vision)
• Immune function
• Embryonic development
• Cellular growth and differentiation

Consequences of Mutations in RBP4

1. Eye and Vision Disorders

o Night blindness and other retinal dysfunctions can occur due to impaired
vitamin A transport.
o Rare inherited retinal dystrophies have been linked to RBP4 mutations.
2. Congenital Malformations
o RBP4 mutations have been reported in congenital eye malformations, including:
• Microphthalmia
• Coloboma
3. Metabolic Disorders
o Elevated or altered RBP4 levels (due to mutations or regulation issues) have been
associated with:
• Insulin resistance
• Type 2 diabetes
• Obesity
o Though these are not typically caused by coding mutations, RBP4 is being studied
as a biomarker for these conditions.

Case Example:

A loss-of-function mutation in RBP4 can lead to vitamin A deficiency, even if dietary intake is
adequate, because the body cannot transport retinol efficiently. This can result in symptoms like:

• Dry eyes
• Impaired immunity
• Skin issues
• Growth retardation in children

20. Can you find your gene in SwissProt (http://us.expasy.org/sprot/) database? Give me the
accession number in SwissProt.
21. What is the advantage of SWISS-PROT vs. NCBI?

Both SWISS-PROT (now part of UniProtKB/Swiss-Prot) and NCBI provide protein sequence
data, but they serve different purposes and offer different strengths.

Here’s a direct comparison:

SWISS-PROT (UniProtKB/Swiss-Prot)

Advantages:

1. Manual Curation
o Every entry is reviewed by experts.
o Errors are corrected, and information is added based on experimental evidence.
2. High-Quality Functional Annotations
o Includes detailed info on:
• Protein function
• Domain structure
• Post-translational modifications
• Variants and disease links
3. Non-redundant
o Only one entry per protein per species (no duplicated submissions).
4. Stable and Consistent Format
o Better suited for reliable data mining, modeling, and pathway analysis.

NCBI (GenPept / RefSeq / GenBank)

Advantages:

1. Broad Coverage
o Includes both curated (RefSeq) and unreviewed submissions (GenBank).
o Contains more recent and raw data, including novel or predicted proteins.
2. Integrated with Genomic Data
o Easily connects with gene locations, mRNA, genomic contigs, and other NCBI
tools (BLAST, Gene, Genome Data Viewer).
3. Rapid Updates
o New sequences are submitted and published quickly—useful for cutting-edge
research.
Feature             | SWISS-PROT (UniProtKB)      | NCBI (RefSeq/GenPept)
Curation            | Manual (high quality)       | Mixed: curated + automated
Annotation Depth    | Rich, functional            | Basic to moderate
Redundancy          | Non-redundant               | May include redundant entries
Update Speed        | Slower, carefully curated   | Fast, includes raw data
Genomic Integration | Limited                     | Strong (linked to genome)
Best For            | Trusted annotations, models | Broad discovery, genome analysis
