Bioinformatic Database Record
NO: 1 DATE:
AIM:
To search the NCBI PubMed bibliographic database using different options: author name,
keyword in title, abstract, or title and/or abstract; related articles; and the different display options.
DESCRIPTION:
The National Center for Biotechnology Information (NCBI) is part of the United States
National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH).
The NCBI is located in Bethesda, Maryland and was founded in 1988 through legislation
sponsored by Senator Claude Pepper. The NCBI houses a series of databases relevant to
biotechnology and biomedicine and is an important resource for bioinformatics tools and
services. Major databases include GenBank for DNA sequences and PubMed, a
bibliographic database for the biomedical literature. Other databases include the NCBI
Epigenomics database. All these databases are available online through the Entrez search
engine. PubMed Health provides information for consumers and clinicians on prevention and
treatment of diseases and conditions.
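The search options named in the aim (author name, keyword in title, keyword in title and/or abstract) correspond to PubMed field tags. A minimal sketch of how such a query could be assembled for the NCBI E-utilities ESearch endpoint is given below. The helper names, the author "Smith J" and the keyword "casein" are illustrative assumptions, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_term(author=None, title=None, title_abstract=None):
    """Combine the given options into one PubMed query using field tags."""
    parts = []
    if author:
        parts.append(f"{author}[Author]")
    if title:
        parts.append(f"{title}[Title]")
    if title_abstract:
        parts.append(f"{title_abstract}[Title/Abstract]")
    return " AND ".join(parts)


def esearch_url(term, retmax=20):
    """Build (but do not fetch) an ESearch URL for the PubMed database."""
    query = urlencode({"db": "pubmed", "term": term, "retmax": retmax})
    return f"{ESEARCH}?{query}"


# Hypothetical example values, not taken from the record.
term = pubmed_term(author="Smith J", title="casein")
url = esearch_url(term)
```

The same term string can also be pasted directly into the PubMed search box, which is what the manual procedure does.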
PROCEDURE:
OUTPUT:
RESULT:
Searched NCBI PubMed and retrieved information about the protein casein using the different
search options (author name, keyword in title, abstract and related articles) and the different
display options.
EX.NO:02 DATE:
DESCRIPTION:
PROCEDURE:
1. Enter the URL: www.ebi.ac.uk.
2. Enter the Organism name in the Search Box.
3. Retrieve sequence of the Organism in EMBL.
INPUT:
Organism Name: Homo sapiens.
HOME PAGE:
WORK SPACE:
OUTPUT:
>ENA|EAL24519|EAL24519.1 Homo sapiens (human) LOC401434
ATGGCCCTTCGGGGAATGCCCTGGGCGCCCCGAATACTCAGTGGGGCCTGTTACTTGGCT
GTTTCTCAACATGGAGGAGCGTGGCCCAGACGGCTTTCCCACGCAGGGGAGCATGGTCCA
GATGGCTTTCCCACGTGGGGCAGCCTGGCTCAGACGGCTTTCCCACGCAGGGGAGCATGG
TCCAGACGGCTTTCCCACACCGGGGAGCGTGGCCCAGACGGCTTTCCCACGCTGGGGAGC
CTGGTCCAGACGGCTTTCCCACACCGGGGAGCGTGGCCCAGACGGCTTTCCCACACGGGC
AGCCTGGCTCAGACGGCTTTCCCAGCCTCGCAGAGCTCCCTCTTCTGTTTTCCTGCACTG
CTAAAGCTATGGTCACTCCTTCTGCCAATGCTTGGCTTCACTTCCCTCTACTTCTCCAAG
CTGTGTCCTTTTTCTTTATTCTTATTCACTTACTACTGTTTCTCTATTATCCCTGTCTTG
CTCAATTTTGATTCCACTCCCTGGCAGTTTCATCAGTTCAAAGGAACTAGAAGTCTTCAT
CCCCTAAGCCCTCCCTCCCCCAGGGACCCCTGCCGCTGCCTAGTGCTGGAGAGGCAGACG
CCCCCGCAGTGTTTGCTGCACTGA
RESULT:
Searched and retrieved the nucleic acid sequence of the organism Homo sapiens (ID No: MN006677)
from EMBL.
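A retrieved FASTA record such as the one in the OUTPUT above can be checked programmatically. The sketch below is a minimal FASTA reader with a GC-content helper; the function names are illustrative and not part of any EMBL tool.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping header (without '>') to sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {h: "".join(chunks) for h, chunks in records.items()}


def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)
```

For example, passing the full text of the ENA record to read_fasta returns one entry whose key is the header line and whose value is the sequence joined into a single string.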
EX NO:3 DATE:
DESCRIPTION:
READSEQ(EMBOSS SEQRET):
EMBOSS Seqret reads and writes (returns) sequences. It is useful for a variety of tasks such as
extracting sequences from databases, displaying sequences, reformatting sequences, producing the reverse
complement of a sequence, extracting fragments of a sequence, sequence case conversion or any
combination of the above functions.
TRANSEQ:
PROCEDURE:
1. Go to the URLs:
https://www.ebi.ac.uk/Tools/sfc/readseq/
https://www.ebi.ac.uk/Tools/st/emboss_transeq/
2. Enter the input sequence in FASTA format.
3. Click Submit to view the result.
4. Note down the study sequence.
HOMEPAGE(WORKSPACE):
OUTPUT:
RESULT:
EMBOSS Seqret was used to read and display the sequence (sequence name & sequence ID).
EMBOSS Transeq translated the nucleic acid sequence (sequence name & sequence
ID) to the corresponding peptide sequences in three forward and three reverse frames
of translation.
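The reverse complement and the translation in three forward and three reverse frames can be illustrated with a short sketch. It assumes the standard genetic code and treats the reverse frames simply as the three frames of the reverse complement, so the frame numbering may not match Transeq's own convention. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame=0):
    """Translate from offset `frame` (0, 1 or 2); unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(seq):
    """Translate the three forward frames and the three reverse-complement frames."""
    rc = reverse_complement(seq)
    frames = {f"+{f + 1}": translate(seq, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(rc, f) for f in range(3)})
    return frames
```

Applied to the first fifteen bases of the Homo sapiens sequence from the EMBL exercise (ATGGCCCTTCGGGGA), frame +1 gives the peptide MALRG.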
EX.NO:3 DATE:
Aim:
To perform a similarity search of PIR database for the given protein sequence.
Description:
Procedure:
Input:
Workspace:
OUTPUT:
RESULT:
DESCRIPTION:
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and
annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt
Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt consortium and host
institutions EMBL-EBI, SIB and PIR are committed to the long-term preservation of the UniProt
databases.
PROCEDURE:
1. Enter the URL: https://www.uniprot.org
INPUT:
Organism: Cannabis sativa
Protein Name: Edestin
HOMEPAGE:
WORKSPACE:
OUTPUT:
RESULT:
Performed a similarity search of the UniProt database using the protein name, ID and organism
name.
EX.NO.:6 DATE:
PROTPARAM TOOL
AIM:
To retrieve various physico-chemical properties for a given protein sequence using the ProtParam
tool.
DESCRIPTION:
ProtParam computes various physical and chemical parameters for a given protein
stored in Swiss-Prot or TrEMBL or for a user-entered protein sequence. The computed
parameters include the molecular weight, theoretical pI, amino acid composition, atomic
composition, extinction coefficient, estimated half-life, instability index, aliphatic index and
grand average of hydropathicity.
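Three of the listed parameters can be reproduced by hand from the sequence alone. The sketch below computes amino acid composition (mole percent), the grand average of hydropathicity (GRAVY, using the Kyte-Doolittle scale) and the aliphatic index (Ikai's formula). It is an illustration under those assumptions, not ProtParam's own code, and it expects only the 20 standard one-letter residue codes.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the mole percent of each residue present in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))."""
    pct = composition(seq)
    return (pct.get("A", 0.0)
            + 2.9 * pct.get("V", 0.0)
            + 3.9 * (pct.get("I", 0.0) + pct.get("L", 0.0)))
```

A positive GRAVY value indicates an overall hydrophobic sequence and a negative value a hydrophilic one.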
PROCEDURE:
1. Go to URL: http://web.expasy.org/protparam/
INPUT:
Protein name: Actin
HOMEPAGE:
OUTPUT:
RESULT:
Retrieved various physico-chemical parameters for the protein (name & ID) by
using the ProtParam tool.
EX.NO:8 DATE:
Aim:
To perform pairwise sequence alignment for a pair of proteins. Organism name 1: Homo sapiens,
Organism name 2: Mus musculus; Protein name 1: keratin, Protein name 2: myosin.
Description:
Procedure:
1. Enter the URL for EMBOSS Needle: https://www.ebi.ac.uk/Tools/psa/emboss_needle/
2. Paste the sequences of the two organisms into box one and box two respectively.
3. View the alignment of the sequences.
Input:
Organism name 1: Homo sapiens, Organism name 2: Mus musculus Protein name 1: keratin,
Protein name 2: myosin
Homepage:
Output:
Emboss Needle:
Result:
Performed pairwise sequence alignment for a set of analogous proteins.
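EMBOSS Needle performs global alignment with the Needleman-Wunsch algorithm. The dynamic programming idea can be sketched as below. This is a simplified illustration that assumes a flat match/mismatch score and a linear gap penalty, whereas the web tool uses a substitution matrix and separate gap open and extend penalties, so its scores will differ.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

Only two rows of the matrix are kept because the sketch returns the score alone; recovering the alignment itself would require storing the full matrix for a traceback.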
EX.NO:9 DATE:
PRINTS DATABASE
AIM:
To perform motif searching in the derived databases PRINTS and BLOCKS.
DESCRIPTION:
PROCEDURE:
1. Go to https://www.uniprot.org/uniprot/P42526
2. Enter the protein sequence of actin
3. Run the sequence and retrieve the results
4. Note down the results
INPUT:
OUTPUT
RESULT:
AIM:
DESCRIPTION:
The Protein Data Bank (PDB) is a database for the three-dimensional structural data of large
biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray
crystallography, NMR spectroscopy, or, increasingly, cryo-electron microscopy, and submitted
by biologists and biochemists from around the world, are freely accessible on the Internet via
the websites of its member organisations (PDBe, PDBj, and RCSB). The PDB is overseen by an
organization called the Worldwide Protein Data Bank (wwPDB). The PDB is a key resource in
areas of structural biology, such as structural genomics. Most major scientific journals, and
some funding agencies, now require scientists to submit their structure data to the PDB. Many
other databases use protein structures deposited in the PDB.
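The three-dimensional data in a legacy PDB-format file are stored in fixed-width ATOM and HETATM records. A minimal sketch of reading the atom name, residue, chain and coordinates from such records follows. The column positions follow the PDB file format, the function names are illustrative, and the sketch does not handle mmCIF files.

```python
def parse_atom(line):
    """Extract selected fixed-width fields from one ATOM/HETATM record."""
    return {
        "name": line[12:16].strip(),      # atom name, columns 13-16
        "res_name": line[17:20].strip(),  # residue name, columns 18-20
        "chain": line[21],                # chain identifier, column 22
        "res_seq": int(line[22:26]),      # residue sequence number, columns 23-26
        "xyz": (
            float(line[30:38]),           # x, columns 31-38
            float(line[38:46]),           # y, columns 39-46
            float(line[46:54]),           # z, columns 47-54
        ),
    }


def parse_atoms(text):
    """Parse every ATOM and HETATM record found in PDB-format text."""
    return [
        parse_atom(line)
        for line in text.splitlines()
        if line.startswith(("ATOM  ", "HETATM"))
    ]
```

Counting the records whose atom name is CA gives a quick estimate of the number of residues with coordinates in a protein entry.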
PROCEDURE:
INPUT:
WORK SPACE:
OUTPUT:
RESULT:
Retrieved the structure and details of 7ccc (actin) by exploring the PDB database.
EX.NO: 11 DATE:
AIM:
To list SCOP lineage and CATH architecture description for a set of proteins
DESCRIPTION:
CATH- The CATH Protein Structure Classification database is a free, publicly available
online resource that provides information on the evolutionary relationships of protein
domains.
PROCEDURE 1:
2. Click the superfamily option, then click the sequence search.
PROCEDURE 2:
HOME PAGE:
SCOP
OUTPUT:
HOME PAGE:
CATH
OUTPUT:
RESULT:
Listed SCOP lineages and CATH architecture description for a set of proteins.
EX.NO: 12 DATE:
Aim:
To perform visualization of a protein structure using rasmol software.
Description:
RasMol is an important scientific tool for the visualisation of proteins, nucleic acids and
small molecules and for preparing publication-quality images. It was created by Roger Sayle in
1992. It is widely used: simple operations are available through the menus, while more
complex or more controlled operations require the command-line interface.
Procedure:
1. Open the RasMol software. The main menu has ‘File’, ‘Display’, ‘Colours’, ‘Export’,
‘Options’, ‘Settings’.
2. To load a molecule, click File > Open… and select the file from your computer for
visualization.
3. By default the program displays the content as a ‘wireframe’ model.
4. The display can be shown in different colours by selecting the appropriate option
under ‘Colours’ in the menu bar.
5. Display the distance, angle and torsion measurements.
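The distance and angle measurements in the last step are simple geometry on atomic coordinates. The sketch below shows the calculation for points given as (x, y, z) tuples; the function names are illustrative, the coordinates would come from the loaded structure file, and the torsion measurement is omitted for brevity.

```python
import math


def distance(p, q):
    """Euclidean distance between two points, in the units of the input."""
    return math.dist(p, q)


def angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b-a and b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    cosine = dot / (norm1 * norm2)
    # Clamp to [-1, 1] to guard against rounding just outside the valid range.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
```

For coordinates taken from a PDB file the distance is in angstroms.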
Input:
OUTPUT:
DISPLAY ( WIREFRAME ):
COLOUR (PINK):
PICK IDENT:
PICK DISTANCE:
PICK ANGLE:
RESULT:
The structure of the given protein 1DWD was analyzed using RasMol software with these
commands: display wireframe, pick ident, colour pink, pick coord.
Ex.No: 13 Date:
PYMOL
Aim:
Description:
Procedure:
Input:
Output:
Superimposition of Protein:
Result
Protein structures 1qlx and 5yj5 have been viewed and interpreted using PyMOL.
Ex. No: 14 Date:
AIM:
To perform pairwise sequence alignment using the LALIGN tool.
DESCRIPTION:
LALIGN compares two sequences looking for local sequence similarities. The tool reports a number of
non-overlapping alignments between sequences. LALIGN is a “linear-space algorithm” in the
sense that it needs only space proportional to the sum of the input size and the output size. lalign
is part of the Fasta3 package; this version replaces that from the Fasta2 package. While
programs such as fasta and ssearch report only the best alignment between the query sequence and
the library sequence, lalign reports a number of non-overlapping alignments between sequences.
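Local alignment scoring can be illustrated with the Smith-Waterman recurrence. The sketch below returns only the single best local score under an assumed flat match/mismatch score and linear gap penalty. It is not the LALIGN algorithm itself, which additionally reports several non-overlapping alignments and works in linear space.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            cell = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

The zero floor is what distinguishes this from a global alignment: a poorly matching region does not penalise a well-matching segment elsewhere.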
PROCEDURE:
INPUT:
HOMEPAGE:
WORKSPACE:
OUTPUT:
RESULT:
Pairwise alignment was performed between Q9N764 (Homo sapiens) and Q9JJZ
(Rattus norvegicus) using the LALIGN tool.
EX NO:15 DATE:
NCBI-BLAST TOOL
AIM:
DESCRIPTION:
The most common local alignment tool is BLAST (Basic Local Alignment Search Tool),
developed by Altschul et al. The operative phrase in the name is local alignment. BLAST is
a set of algorithms that attempt to find a short fragment of a query sequence that aligns perfectly
with a fragment of a subject sequence found in a database. The first step of the BLAST
algorithm is to break the query into short words of a specific length. The initial word alignment
must score above a neighborhood score threshold (T). In the original BLAST algorithm, the
matching fragment is then used as a seed, and the alignment is extended in both directions
until the score for the aligned segment no longer increases.
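The word, seed and extend steps described above can be sketched in a few lines. This is a simplified illustration, not NCBI's implementation: it assumes exact word matches as seeds (no neighborhood words scored against T), ungapped extension, and a hypothetical x_drop parameter that stops extension once the running score falls too far below the best score seen.

```python
def query_words(query, w):
    """Break the query into overlapping words of length w with their offsets."""
    return [(i, query[i:i + w]) for i in range(len(query) - w + 1)]


def _extend(pairs, match, mismatch, x_drop):
    """Extend over residue pairs; return (best score gain, residues kept)."""
    score = best = best_len = 0
    for n, (q, s) in enumerate(pairs, start=1):
        score += match if q == s else mismatch
        if score > best:
            best, best_len = score, n
        elif best - score > x_drop:
            break
    return best, best_len


def extend_seed(query, subject, qi, si, w, match=1, mismatch=-1, x_drop=2):
    """Extend an exact word hit in both directions without gaps.

    Returns (score, query_start, query_end) for the extended segment.
    """
    right = zip(query[qi + w:], subject[si + w:])
    left = zip(reversed(query[:qi]), reversed(subject[:si]))
    right_gain, right_len = _extend(right, match, mismatch, x_drop)
    left_gain, left_len = _extend(left, match, mismatch, x_drop)
    score = w * match + left_gain + right_gain
    return score, qi - left_len, qi + w + right_len


def find_hsps(query, subject, w=3, **scoring):
    """Seed on every exact word hit in the subject and extend each seed."""
    hits = []
    for qi, word in query_words(query, w):
        si = subject.find(word)
        while si != -1:
            hits.append(extend_seed(query, subject, qi, si, w, **scoring))
            si = subject.find(word, si + 1)
    return hits
```

Several seeds usually extend to the same segment, so a real search would also merge or de-duplicate the hits before reporting them.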
PROCEDURE:
1. Go to URL: https://blast.ncbi.nlm.nih.gov/Blast.cgi
2. Enter the nucleotide sequence for the bovine organism.
3. Click the submit button.
INPUT:
AIM:
To retrieve amino acid sequences (in FASTA format) of Bowman-Birk inhibitors from
different species (monocots and dicots) and perform multiple alignment with ClustalW to
evaluate their homology. To compare and comment on the conservation of the disulfide bridge
pattern between monocots and dicots.
DESCRIPTION:
Clustal Omega is a new multiple sequence alignment program that uses seeded guide trees
and HMM profile-profile techniques to generate alignments between three or more
sequences. For the alignment of two sequences please instead use our pairwise sequence
alignment tools.
Monocot seeds are defined as seeds that consist of a single (mono) embryonic leaf or
cotyledon. The structure of the seed and the number of cotyledons present in the seed are the
most important characteristics that allow the differentiation of monocots and dicots. Dicot
seeds are defined as seeds that consist of two embryonic leaves or cotyledons. Dicot seeds
contain a single embryo with an embryo axis and two cotyledons around it. Initially, all
angiosperms or flowering plants were grouped under dicots.
Cysteine is a sulfur-containing amino acid. In proteins it usually exists as cystine, formed by a
disulfide bond between two cysteine residues, which is essential for the tertiary structure and
stability of a protein.
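Once the sequences are aligned, conservation of the cysteines that can form disulfide bridges can be checked column by column. The sketch below lists the alignment columns in which every sequence carries a cysteine. The function name and the toy alignment in the usage note are made up for illustration and are not real Bowman-Birk inhibitor sequences.

```python
def conserved_columns(aligned, residue="C"):
    """Return 0-based alignment columns where every sequence has `residue`."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        col for col in range(length)
        if all(seq[col] == residue for seq in aligned)
    ]
```

For the toy alignment ["AC-CD", "GCACD", "TC-CE"] the function reports columns 1 and 3. A conserved cysteine column shows only that the residue is present in every sequence; which cysteines pair into a bridge has to come from structural or annotation data.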
PROCEDURE:
1. Go to UNIPROT.
2. Retrieve the protein sequences of monocot and dicot seeds.
3. Enter the URL: https://www.ebi.ac.uk/Tools/msa/clustalo/
4. Run MSA using CLUSTALW.
5. Note down the result.
HOMEPAGE
RESULT:
Homology was compared, and the conservation of the disulfide bridge pattern
between monocots and dicots was examined. The sequences share the common pattern CCD.
EX.NO.:17 DATE:
AIM:
To search for metabolic pathway information using the KEGG and METACYC databases.
DESCRIPTION:
KEGG:
KEGG is a database resource for understanding high-level functions and utilities of the
biological system, such as the cell, the organism and the ecosystem, from genomic and
molecular-level information. It is a computer representation of the biological system, consisting
of molecular building blocks of genes and proteins (genomic information) and chemical
substances (chemical information) that are integrated with the knowledge on molecular wiring
diagrams of interaction, reaction and relation networks (systems information). It also contains
disease and drug information (health information) as perturbations to the biological system.
METACYC:
MetaCyc is a curated database of experimentally elucidated metabolic pathways from all
domains of life. MetaCyc contains 2937 pathways from 3295 different organisms. MetaCyc
contains pathways involved in both primary and secondary metabolism, as well as associated
metabolites, reactions, enzymes, and genes. The goal of MetaCyc is to catalog the universe of
metabolism by storing a representative sample of each experimentally elucidated pathway.
PROCEDURE:
KEGG:
METACYC:
INPUT: KEGG
METACYC:
WORKSPACE:
METACYC:
HOMEPAGE:
WORKSPACE:
OUTPUT:
KEGG:
METACYC:
Result
EX.NO:18 DATE:
MEGA
AIM:
1. To perform phylogenetic analysis by the neighbour-joining method using the Kimura
two-parameter model for a set of nucleotide sequences.
2. To perform phylogenetic analysis by the neighbour-joining method using the Dayhoff PAM
matrix for a set of amino acid sequences (ribonucleases).
DESCRIPTION:
PROCEDURE:
1. Collect and download similar sequences for a particular organism using BLAST.
3. Click the Alignment menu, choose Edit/Build Alignment; a sub-window opens.
We constructed the phylogenetic tree and assessed genetic diversity using the neighbour-joining
algorithm.
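The Kimura two-parameter model named in the aim estimates the evolutionary distance between two aligned nucleotide sequences from the proportion of transitions (P) and transversions (Q): d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The sketch below computes this distance for one pair of sequences. It is an illustration, not MEGA's code, and it assumes that sites containing a gap or an ambiguous base are skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the input that the neighbour-joining method clusters into a tree.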