Module 1_Session 3_Part 1
(Session 3)
Gene identification
First, decide which information you want:
- Nucleic acid (NA) sequences or protein sequences
- If NA: genomic sequences or RNA-derived sequences
- All sequences that exist, or only curated ones
- Retrieving the annotated sequence
- Finding and interpreting the annotated information in the sequence features table (the different kinds of features, e.g., gene, mRNA, coding region, tRNA)
• 1. SeqID:
Initially, the Entry Name in the LOCUS line was the only key to a GenBank entry.
The name attempted to reflect the organism and the function of the encoded gene.
Problem: this is impossible to do systematically and uniquely as new knowledge accumulates ...
As a result, Entry Names now change over time ...
• 2. Accession Numbers:
The Accession Number was then introduced as the primary key for referencing an entry in the
database. It always stays with the entry, even when the entry is updated.
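a. GenBank accession number: either one letter followed by five digits (e.g., X79797) or two letters followed by six digits (e.g., AF028831)
b. RefSeq accession number: the newer format, with a two-letter prefix and an underscore (e.g., NC_001140)
The letter prefix reflects which of the three databases (GenBank, EMBL, DDBJ) is the primary database for the entry.
Because the databases use so many different IDs, accession numbers have to be mapped in order to move between them.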
Use the EBI tool to convert a GenBank accession number to an EBI accession number:
https://www.ebi.ac.uk/ebisearch/search?db=biotools&query=GenBank
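The accession number formats described above can be told apart with a few pattern checks. The sketch below is only an illustration: the function name classify_accession and the three regular expressions are assumptions made for this example, they cover only the formats named in this section, and newer or longer accession formats would fall through to "unknown".

```python
import re

# Assumed patterns, based only on the formats described in this section.
GENBANK_1_5 = re.compile(r"^[A-Z]\d{5}$")      # e.g. X79797
GENBANK_2_6 = re.compile(r"^[A-Z]{2}\d{6}$")   # e.g. AF028831
REFSEQ = re.compile(r"^[A-Z]{2}_\d+$")         # e.g. NC_001140


def classify_accession(accession: str) -> str:
    """Return a rough label for an accession number (hypothetical helper)."""
    acc = accession.strip().upper()
    # Drop a trailing version suffix such as ".1" if one is present.
    acc = acc.split(".", 1)[0]
    if REFSEQ.match(acc):
        return "RefSeq"
    if GENBANK_1_5.match(acc):
        return "GenBank (1 letter + 5 digits)"
    if GENBANK_2_6.match(acc):
        return "GenBank (2 letters + 6 digits)"
    return "unknown"


for example in ["X79797", "AF028831", "NC_001140", "NM_000646.1"]:
    print(example, "->", classify_accession(example))
```

A check like this only inspects the shape of the identifier; it does not confirm that the entry actually exists in any database.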
Entrez and BLAST results both present the following formatted text as part of the returned result:
gi|4557284|ref|NM_000646.1|AGLf| [4557284]
GI (GenInfo Identifier): 4557284
Accession Number: NM_000646
Version: NM_000646.1
LOCUS name: AGLf

Introduction to Bioinformatics online course: IBT
Bioinformatics Resources & Databases: Abeir Shalaby
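
The pipe-delimited identifier line returned by Entrez and BLAST can be split into its parts programmatically. The following is a minimal sketch, assuming the line always has the layout gi|number|database|accession.version|locus| shown in this section; the names SeqHeader and parse_header are hypothetical and not part of any NCBI library.

```python
from typing import NamedTuple


class SeqHeader(NamedTuple):
    """Fields of a gi|...|ref|...| identifier line (hypothetical container)."""
    gi: str         # GI number
    db: str         # database tag, e.g. "ref" for RefSeq
    accession: str  # accession number without version
    version: str    # accession.version
    locus: str      # LOCUS name


def parse_header(line: str) -> SeqHeader:
    """Split an identifier line into its fields (assumes the layout above)."""
    fields = line.strip().split("|")
    if len(fields) < 5 or fields[0] != "gi":
        raise ValueError(f"unexpected identifier line: {line!r}")
    gi, db, versioned, locus = fields[1], fields[2], fields[3], fields[4]
    # The accession number is the part before the first "." of the version.
    accession = versioned.split(".", 1)[0]
    return SeqHeader(gi, db, accession, versioned, locus)


header = parse_header("gi|4557284|ref|NM_000646.1|AGLf| [4557284]")
print(header.gi)         # 4557284
print(header.accession)  # NM_000646
print(header.version)    # NM_000646.1
print(header.locus)      # AGLf
```

Anything after the fifth field, such as the bracketed number at the end of the example line, is ignored by this sketch.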